Maintaining Historical Dialogue
In a real-time digital human system, the character's behavior, tone, knowledge structure, and multi-turn dialogue "memory" are controlled through two mechanisms:
1. Conversation Item Creation Message (Unidirectional User Memory)
The system supports building context by sending historical user messages one by one, but this method has significant limitations:
conversationHistory.forEach((msg) => {
  if (msg.role === "user") {
    const messageConfig = {
      type: "conversation.item.create",
      item: {
        type: "message",
        role: "user",
        content: [{ type: "input_text", text: msg.content }]
      }
    };
    socket.send(JSON.stringify(messageConfig));
  }
});
⚠️ Limitations:
Historical messages generated by the AI cannot be embedded this way.
The system does not automatically link these items to the surrounding dialogue context (e.g., whether the AI has already answered a given turn).
Use this method together with prompt injection; on its own it only supplements user-side context.
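Because assistant turns cannot be replayed as conversation items, one workaround is to split the history: send user turns as items and collect every turn into a text block for prompt injection. The sketch below illustrates this under that assumption; the `splitHistory` helper and the `{ role, content }` message shape are illustrative, not part of the protocol:

```javascript
// Split history: user turns become conversation.item.create events,
// while all turns (user and AI) are serialized into a text block
// suitable for embedding in the session instructions.
function splitHistory(conversationHistory) {
  const userItems = [];
  const promptLines = [];
  for (const msg of conversationHistory) {
    promptLines.push(`${msg.role === "user" ? "User" : "AI"}: ${msg.content}`);
    if (msg.role === "user") {
      userItems.push({
        type: "conversation.item.create",
        item: {
          type: "message",
          role: "user",
          content: [{ type: "input_text", text: msg.content }]
        }
      });
    }
  }
  return { userItems, historyText: promptLines.join("\n") };
}
```

Each entry in `userItems` would then be sent via `socket.send(JSON.stringify(item))`, and `historyText` embedded in the instructions described in the next section.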
2. System Instructions Injection for Context
By using the instructions field in session.update, role definitions and historical context descriptions can be injected. Unlike item creation, the instructions string can retain the full dialogue history between the user and the AI, which helps to:
Maintain semantic coherence.
Support the AI's continuous understanding and memory.
Support ongoing control of the AI's tone and behavior.
const instructions = `
You are a gentle and patient emotional companion assistant, skilled at listening and guiding.
Please respond in ${userLanguage}.
Welcome message: "${activeCharacter.greeting}"
Historical dialogue:
User: I wasn't feeling very well yesterday.
AI: What happened? Can you tell me about it?
User: I had an argument with my friend...
When the user requests services such as image generation or real-time weather queries, trigger the corresponding function call automatically.
Current context: ${messageConfig}
`;
const sessionConfig = {
  type: "session.update",
  session: {
    instructions,
    /* other session configuration */
  }
};
socket.send(JSON.stringify(sessionConfig));

⚠️ Note:
Keep the instructions short (recommended: within 2000 tokens).
For large amounts of context, summarize and consolidate the history before injecting it.
Prompt length directly affects cost: longer system prompts result in higher per-minute conversation costs.
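One simple way to honor the token budget is to estimate prompt length and keep only the most recent turns. The sketch below uses a rough 4-characters-per-token heuristic as an assumption, not an exact tokenizer; the helper names are illustrative:

```javascript
// Rough token estimate (~4 characters per token). Replace with a real
// tokenizer for accurate budgeting.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Keep the most recent turns that fit within maxTokens, walking
// backwards from the newest turn. Older turns that fall outside the
// budget should be summarized separately rather than silently dropped.
function trimHistory(turns, maxTokens) {
  const kept = [];
  let used = 0;
  for (let i = turns.length - 1; i >= 0; i--) {
    const cost = estimateTokens(turns[i]);
    if (used + cost > maxTokens) break;
    kept.unshift(turns[i]);
    used += cost;
  }
  return kept;
}
```

The trimmed history can then be joined into the `Historical dialogue:` section of the instructions, keeping the overall prompt within the recommended limit.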