Case-study part 2: Explore basics of context engineering
Objectives
Understand how memory behaves in agentic systems
Explore how to apply context engineering techniques in a real-world agent
What Is Context?
Context is the complete information the agent needs to make decisions. It includes:
Short-term context:
Immediate context: Current conversation, active scenario
Session context: User’s state within one session
Long-term context: User’s profile across sessions
Session Management
What Is a Session?
A session represents one continuous interaction between user and agent:
session_service = SqliteSessionService("sessions.db")
Sessions are:
Persistent: Stored in a database
Isolated: One user, one conversation thread
Stateful: Maintain variables that evolve during the conversation
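To make these three properties concrete, here is a minimal sketch of what a SQLite-backed session store could look like. The class name `SqliteSessionStore` and its `load`/`save` methods are illustrative, not the framework's actual API; the point is that state round-trips through a database keyed by session id.

```python
import json
import sqlite3

class SqliteSessionStore:
    """Minimal sketch: persistent, isolated, stateful sessions (hypothetical API)."""

    def __init__(self, db_path: str):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY, state TEXT)"
        )

    def load(self, session_id: str) -> dict:
        # Isolated: each session id gets its own state dictionary
        row = self.conn.execute(
            "SELECT state FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        return json.loads(row[0]) if row else {}

    def save(self, session_id: str, state: dict) -> None:
        # Persistent: state survives process restarts in the database
        self.conn.execute(
            "INSERT OR REPLACE INTO sessions (id, state) VALUES (?, ?)",
            (session_id, json.dumps(state)),
        )
        self.conn.commit()

store = SqliteSessionStore(":memory:")
store.save("user-1", {"exchange_count": 3})
print(store.load("user-1"))  # {'exchange_count': 3}
```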
How Sessions Store State
Python implementation
# In tools.py
def select_scenario(tool_context: Context, scenario_id: str) -> dict:
    # These go into session state
    tool_context.state["active_scenario_id"] = scenario_id
    tool_context.state["words_practiced"] = []
    tool_context.state["exchange_count"] = 0
    return {"status": "ok", "scenario_id": scenario_id}
# In agent.py - on_after_agent callback
async def on_after_agent(callback_context: CallbackContext):
    count = callback_context.state.get("exchange_count", 0)
    callback_context.state["exchange_count"] = count + 1  # Persisted
Session State Schema
Session State = {
"active_scenario_id": "cafe", # Current practice scenario
"words_practiced": ["en kaffe", "å betale"], # Words learned this session
"exchange_count": 5, # Number of turns in conversation
# Cross-session profile
"user:completed_scenarios": 3,
"user:weak_words": ["kort eller kontant?", "heisen"],
# Additional context
"user:preferred_difficulty": "intermediate"
}
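The schema above mixes two scopes in one dictionary: plain keys live only in the current session, while `user:`-prefixed keys belong to the cross-session profile. A small helper (hypothetical, not part of the framework) makes the split explicit:

```python
def split_state(state: dict) -> tuple[dict, dict]:
    """Separate per-session keys from cross-session 'user:' profile keys."""
    session = {k: v for k, v in state.items() if not k.startswith("user:")}
    profile = {k: v for k, v in state.items() if k.startswith("user:")}
    return session, profile

state = {
    "active_scenario_id": "cafe",
    "exchange_count": 5,
    "user:preferred_difficulty": "intermediate",
}
session, profile = split_state(state)
print(session)  # {'active_scenario_id': 'cafe', 'exchange_count': 5}
print(profile)  # {'user:preferred_difficulty': 'intermediate'}
```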
Retrieving Session State
Python implementation
# In on_before_agent
state = callback_context.state
scenario_id = state.get("active_scenario_id")
words_practiced = state.get("words_practiced", [])
# Used to build dynamic instruction
if scenario_id and scenario_id in SCENARIOS:
    scenario = SCENARIOS[scenario_id]
    new_instruction = build_instruction(scenario, state)
Memory Architectures
Our agent uses two-tier memory:
Tier 1: Short-Term Memory (Session State)
Scope: Current conversation only
Capacity: Small (dictionary)
Retrieval: Instant (in-memory access)
Content: Active scenario, current words practiced, exchange count
Python implementation
# Accessed every turn
state.get("active_scenario_id")
state.get("words_practiced")
Tier 2: Long-Term Memory (Persistent Storage)
Scope: Across all sessions
Capacity: Large (database)
Retrieval: On-demand (memory search)
Content: Completed scenarios, weak words, interaction history
Python implementation
# From on_before_agent
memories = await callback_context.search_memory("scenario fullført")
if memories and memories.memories:
    # "Kontekst fra tidligere samtaler" = "Context from previous conversations"
    memory_info = "\n\nKontekst fra tidligere samtaler:\n"
    for mem in memories.memories[-3:]:  # Last 3 memories
        memory_info += f"- {mem.content}\n"
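The semantics of `search_memory` can be sketched with a toy keyword matcher over stored memory records. The `Memory` dataclass and `search_memory` function here are simplified stand-ins for the framework's memory service, shown only to illustrate query-based retrieval:

```python
from dataclasses import dataclass

@dataclass
class Memory:
    content: str

def search_memory(memories: list[Memory], query: str) -> list[Memory]:
    """Toy keyword search: keep memories mentioning any word of the query."""
    words = query.lower().split()
    return [m for m in memories if any(w in m.content.lower() for w in words)]

store = [
    Memory("User completed scenario:cafe, learned 8 words"),
    Memory("User asked about the weather"),
    Memory("scenario fullført: hotel, weak_words: ['heisen']"),
]
hits = search_memory(store, "scenario fullført")
print([m.content for m in hits])  # the two scenario-related memories
```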
Memory Lifecycle
Turn 1 of Session A:
├─ Short-term: empty
├─ Long-term: search for past completions
└─ Add personalization to instruction
Turn 2 of Session A:
├─ Short-term: words_practiced = ["en kaffe"]
└─ Update persists in session
Turn 3 of Session A:
├─ on_after_agent: serialize session to memory
├─ Long-term: "User practiced scenario:cafe, learned 1 word"
└─ Memory serves future sessions
Session B (next day):
├─ Short-term: reset (new session)
├─ Long-term: "User completed cafe scenario before"
└─ Memory retrieval enables personalization ("Glad to see you back!")
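The lifecycle above can be simulated with plain data structures: short-term state resets each session, while an append-only long-term list carries context across sessions. Everything here (`run_session`, the string formats) is an illustrative sketch, not the framework's actual mechanics:

```python
long_term: list[str] = []  # survives across sessions

def run_session(scenario: str, practiced: list[str]) -> dict:
    # Short-term state resets on every new session
    state = {"active_scenario_id": scenario, "words_practiced": []}
    # Long-term memory search enables personalization
    past = [m for m in long_term if "completed" in m]
    greeting = "Glad to see you back!" if past else "Welcome!"
    state["words_practiced"].extend(practiced)
    # on_after_agent: serialize the session into long-term memory
    long_term.append(f"completed scenario:{scenario}, learned {len(practiced)} words")
    return {"greeting": greeting, "state": state}

session_a = run_session("cafe", ["en kaffe"])
session_b = run_session("cafe", ["å betale"])  # next day: memory recalls session A
```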
How Memory Informs Behavior
Dynamic Instruction Building Using Memory:
Python implementation
# From prompts.py (excerpt: `instruction` is initialized earlier in the function)
def build_instruction(scenario: dict, state: dict) -> str:
    weak_words = state.get("user:weak_words", [])
    # Memory informs instruction
    if weak_words:
        instruction += f"VIKTIG: In previous conversations, the user struggled with: {weak_words}.\n"
        instruction += "Try to incorporate them into the conversation.\n"
The instruction evolves based on:
Immediate session state (active scenario)
Long-term memory (weak words from past attempts)
Example:
Session 1:
User struggles with: "kort eller kontant?" (card or cash?)
→ Stored in memory as weak_word
Session 2 (next day):
on_before_agent loads memory
build_instruction includes: "Try to use 'kort eller kontant?' in conversation"
→ Agent naturally steers conversation to practice this phrase
Memory Search vs. Full Retrieval
Rather than loading the entire memory store every turn, the agent searches for only what is relevant. Benefits:
Efficiency: Only relevant memories retrieved
Relevance: Recent memories prioritized
Scalability: System works with large memory stores
Python implementation
# Selective retrieval - only get relevant memories
memories = await callback_context.search_memory("scenario fullført")
# Instead of: all_memories = memory_service.get_all()

# Filter by recency
for mem in memories.memories[-3:]:  # Last 3 only
    memory_info += f"- {mem.content}\n"
Practical Memory Example: The Learning Journey
Let’s trace how memory enables adaptive learning across sessions:
DAY 1 - Session 1:
┌─ User selects "café" scenario
├─ on_before_agent: no memory (first time)
│ └─ Instruction: basic introduction
├─ Conversation: user practices 8/12 words
│ └─ Weak: "kort eller kontant?", "kvittering"
├─ on_after_agent: session → memory
│ └─ Memory: "completed scenario:cafe, weak_words: [...]"
└─ Session ends
DAY 2 - Session 2:
┌─ User starts fresh (new session)
├─ on_before_agent: search memory "scenario fullført"
│ ├─ Found: "User completed café scenario"
│ └─ Instruction += "Context: User has practiced café before.\n"
│ └─ Instruction += "Try to use: 'kort eller kontant?', 'kvittering'\n"
├─ Conversation: agent naturally steers toward weak words
│ └─ Agent: "Hvordan vil du betale – kort eller kontant?"
│ └─ User: "Kort, takk!"
│ └─ weak_words now practiced!
├─ on_after_agent: updated memory
│ └─ Memory: weak_words reduced, user profile updated
└─ Session ends
DAY 7 - Session 3:
┌─ User tries "hotel" scenario
├─ on_before_agent: search memory for context
│ ├─ Found: "User completed café, weak words practiced"
│ └─ Instruction += "User is advancing to new scenario\n"
│ └─ Instruction += "Build on café vocabulary where it applies\n"
├─ Conversation: new scenario with continuity
└─ Session ends
Context Integration Patterns
Pattern 1: Dynamic Instruction Generation
Pattern 2: Callback-Driven State Management
Pattern 3: Tool-Based State Mutation
Pattern 1: Dynamic Instruction Generation
The agent doesn’t have a static system prompt. Instead, the instruction is constructed from state on every turn:
# on_before_agent
if scenario_id and scenario_id in SCENARIOS:
    scenario = SCENARIOS[scenario_id]
    new_instruction = build_instruction(scenario, state)
    root_agent.instruction = new_instruction  # ← Dynamic assignment
else:
    root_agent.instruction = SYSTEM_INSTRUCTION + memory_info
Why dynamic?
Reduces context window usage (only relevant info included)
Enables precision (instruction tailored to current scenario)
Supports learning (instruction references weak words, completed scenarios)
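A minimal version of dynamic instruction generation is shown below. The `SCENARIOS` shape and the exact prompt wording are assumptions for illustration; the pattern is simply assembling the instruction from scenario data plus state at runtime:

```python
SCENARIOS = {"cafe": {"title": "At the café", "vocab": ["en kaffe", "å betale"]}}

def build_instruction(scenario: dict, state: dict) -> str:
    # Only information relevant to the active scenario enters the prompt
    parts = [f"Role-play the scenario: {scenario['title']}."]
    parts.append("Target vocabulary: " + ", ".join(scenario["vocab"]))
    # Long-term memory (weak words) tailors the instruction further
    weak = state.get("user:weak_words", [])
    if weak:
        parts.append(f"In previous conversations, the user struggled with: {weak}. "
                     "Try to incorporate these phrases.")
    return "\n".join(parts)

instr = build_instruction(SCENARIOS["cafe"], {"user:weak_words": ["kort eller kontant?"]})
print(instr)
```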
Pattern 2: Callback-Driven State Management
The agent doesn’t directly manage state. Instead, callbacks hook into the agent’s lifecycle:
User Input
↓
on_before_agent() ← Prepare context
↓
Agent Reasoning Loop
↓
on_after_agent() ← Persist state
↓
Response
Why callbacks?
Separation of concerns: Agent reasoning separate from state management
Testability: Callbacks can be mocked
Extensibility: New concerns added without modifying agent core
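The callback lifecycle can be sketched as a wrapper around a plain agent function: hooks run before and after reasoning, keeping state management out of the agent's core. The names `with_callbacks` and `echo_agent` are hypothetical; real frameworks register callbacks rather than wrap functions, but the separation of concerns is the same:

```python
from typing import Callable

def with_callbacks(agent: Callable[[str, dict], str],
                   before: Callable[[dict], None],
                   after: Callable[[dict], None]) -> Callable[[str, dict], str]:
    """Run before/after hooks around the agent, keeping state logic outside it."""
    def run(user_input: str, state: dict) -> str:
        before(state)                      # on_before_agent: prepare context
        reply = agent(user_input, state)   # agent reasoning loop
        after(state)                       # on_after_agent: persist state
        return reply
    return run

def echo_agent(text: str, state: dict) -> str:
    return f"echo: {text}"                 # stand-in for the reasoning loop

def before(state: dict) -> None:
    state.setdefault("exchange_count", 0)

def after(state: dict) -> None:
    state["exchange_count"] += 1

run = with_callbacks(echo_agent, before, after)
state: dict = {}
run("hei", state)
print(state)  # {'exchange_count': 1}
```

Because `echo_agent` never touches `exchange_count`, it can be tested or swapped independently of the bookkeeping hooks.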
Pattern 3: Tool-Based State Mutation
Learning-related state changes happen only through tools, never directly in callbacks (callbacks are reserved for bookkeeping such as the exchange counter):
# ✓ GOOD: Tool modifies state
def mark_word_practiced(tool_context: Context, word: str, correct: bool) -> str:
    practiced = tool_context.state.get("words_practiced", [])
    tool_context.state["words_practiced"] = practiced + [word]

# ✗ BAD: Callback modifies state
# async def on_after_agent(callback_context):
#     callback_context.state["words_practiced"].append(word)  # Side effect!
Why?
Auditability: All state changes are tool calls (logged)
Consistency: State mutations always go through validation (error checking in tool)
Understandability: Reading tools shows what state changes are possible
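These three benefits can be demonstrated with a self-contained version of the tool, using a plain dict for state and a hypothetical `audit_log` list standing in for real call logging:

```python
audit_log: list[str] = []

def mark_word_practiced(state: dict, word: str) -> str:
    """Tool: the only code path allowed to mutate words_practiced."""
    if not word.strip():
        return "error: empty word"  # Consistency: validation lives in the tool
    practiced = state.get("words_practiced", [])
    state["words_practiced"] = practiced + [word]
    audit_log.append(f"mark_word_practiced({word!r})")  # Auditability: every change logged
    return f"recorded {word!r}"

state: dict = {}
mark_word_practiced(state, "en kaffe")
mark_word_practiced(state, "")  # rejected by validation, state untouched
print(state)      # {'words_practiced': ['en kaffe']}
print(audit_log)  # ["mark_word_practiced('en kaffe')"]
```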