Mastering Context Engineering (2): Theoretical Foundations and Core Concepts
"Context engineering is the delicate art and science of filling the context window with just the right information for the next step." — Andrej Karpathy, June 2025
Introduction
In the previous article, we examined why Vibe Coding and Spec-Driven Development fail. The core issue in both was missing context.
This article explores the theoretical foundations of Context Engineering — the solution — in depth. Not simply "provide more context," but the science and art of deciding what context to provide, and how and when to provide it.
Part 1: Defining Context Engineering
1.1 Origin and Evolution of the Term
The Era of Prompt Engineering (2022-2024)
After ChatGPT's emergence, "Prompt Engineering" rose as a key skill. But there was a problem. Simon Willison noted:
"The term prompt engineering makes people think it's 'typing something into a chatbot.'"
The Emergence of Context Engineering (2025)
In mid-2025, industry leaders began proposing a new term.
Shopify CEO Tobi Lutke:
"I really like the term context engineering over prompt engineering. It describes the core skill better: the art of providing all the context for the task to be plausibly solvable by the LLM."
Andrej Karpathy:
"Context engineering is the delicate art and science of filling the context window with just the right information for the next step."
1.2 Deep Analysis of Karpathy's Definition
Let's break down Karpathy's definition:
"delicate art and science"
- Art: Requires intuition, experience, creativity
- Science: Systematic methodology, measurable results
- Delicate: Small differences create large outcome differences
"filling the context window"
- Context Window: Maximum tokens an LLM can process at once
- Filling: Not just filling, but strategically composing
"just the right information"
- Too little: Insufficient basis for AI reasoning
- Too much: Noise, "Lost in the Middle" problem
- Right information: Quality over quantity, relevance is key
"for the next step"
- Dynamic, not static context
- Provide context appropriate for each step
- Connects to Agentic AI's ReAct pattern
1.3 Prompt Engineering vs Context Engineering
| Aspect | Prompt Engineering | Context Engineering |
|---|---|---|
| Focus | Input text optimization | Entire information environment design |
| Nature | Static (written once) | Dynamic (changes with situation) |
| Scope | Single prompt | Entire system |
| Time | Request moment | Continuous (session, project) |
| Components | Instructions, examples | Instructions, examples, RAG, state, tools, history |
| Analogy | Asking good questions | Building collaboration environment |
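The contrast in the table can be made concrete. Below is a minimal sketch (all function and field names are illustrative, not taken from any particular framework): a prompt-engineering artifact is a static string, while a context-engineering artifact is a function that assembles the information environment fresh for each step.

```python
# Prompt engineering: a static string, written once.
STATIC_PROMPT = "Summarize the following bug report in two sentences: {report}"

# Context engineering: the input is assembled per step from several
# sources — task, retrieved documents, history, tools.
def build_context(task: str, retrieved_docs: list[str],
                  history: list[str], tools: list[str]) -> str:
    """Compose the full information environment for one model call."""
    sections = [
        "## Task\n" + task,
        "## Retrieved references\n" + "\n".join(retrieved_docs),
        "## Conversation so far (summarized)\n" + "\n".join(history),
        "## Available tools\n" + ", ".join(tools),
    ]
    return "\n\n".join(sections)

context = build_context(
    task="Fix the failing login test",
    retrieved_docs=["auth.py handles sessions via JWT"],
    history=["Decided to keep the existing session store"],
    tools=["run_tests", "read_file"],
)
```

The key difference is structural: the static prompt never changes, while `build_context` is called again with different inputs at every step of the session.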
Part 2: Understanding LLM Context Windows
2.1 What is a Context Window?
Definition: The Context Window is the maximum number of tokens an LLM can process in one input-output cycle.
About Tokens:
- Encoded form of words, symbols, characters
- English: approximately 4 characters = 1 token
- 128,000 tokens ≈ 100,000 words ≈ 300 pages
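The ~4-characters-per-token rule of thumb above is enough for rough budgeting before a model call. A minimal sketch (real tokenizers such as tiktoken will give different counts; the `reserved_for_output` figure is an illustrative assumption):

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate using the ~4-characters-per-token rule of
    thumb for English text. Real tokenizers will differ."""
    return max(1, round(len(text) / chars_per_token))

def fits_in_window(text: str, window: int = 128_000,
                   reserved_for_output: int = 4_096) -> bool:
    """Check whether the text plausibly fits in the context window
    alongside a reserved output budget."""
    return estimate_tokens(text) <= window - reserved_for_output
```

For precise budgeting in production, use the tokenizer that matches your model rather than a character heuristic.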
2.2 Context Window Sizes by Model (2025)
| Model | Context Window | Notes |
|---|---|---|
| GPT-5 | 400K tokens | 128K output window |
| GPT-4.1 | 1M tokens (API) | ChatGPT is limited |
| Claude 3.5 Sonnet | 200K tokens | |
| Gemini 2.5 | 1M tokens | |
| Llama 4 | 10M tokens | Released April 2025 |
2.3 The "Lost in the Middle" Phenomenon
The Discovery:
Research shows LLMs struggle to utilize information in the middle of long contexts.
"Performance can degrade by more than 30% when relevant information shifts from the start or end positions to the middle of the context window." — GetMaxim.ai, 2025
Explanation:
- LLMs remember information at the beginning and end well
- Information in the middle gets "lost"
- Called "Lost in the Middle" or "U-shaped attention"
NeurIPS 2024 Research:
"While many contemporary large language models (LLMs) can process lengthy input, they still struggle to fully utilize information within the long context, known as the lost-in-the-middle challenge."
2.4 Effective Context Window Strategies
1. Place Important Information at Start and End
```
[System instructions - most important]  ← Start
[Background information]
[Reference documents]                   ← Middle (Lost in the Middle risk)
[Current task context]
[Specific request - most important]     ← End
```
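The placement strategy above can be encoded directly in the assembly order. A small sketch (section names are illustrative): the most important material goes first and last, pushing bulk reference material toward the middle where degradation matters least.

```python
def place_context(system: str, background: list[str],
                  references: list[str], request: str) -> str:
    """Order sections so the most important content sits at the start
    (system instructions) and the end (the specific request), with
    bulk reference material in the middle."""
    parts = [system, *background, *references, request]
    return "\n\n".join(parts)

prompt = place_context(
    system="You are reviewing a Python service.",
    background=["The service handles payment webhooks."],
    references=["(long API docs would go here)"],
    request="List the three riskiest functions and why.",
)
```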
2. Context Compression and Summarization
- Extract only relevant sections instead of full documents
- Provide summarized previous conversations
- Keep only key decisions
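A common compression tactic is to keep the opening turn and the most recent turns verbatim and collapse everything in between. A toy sketch (a real system would use an LLM-generated summary rather than a placeholder string):

```python
def compress_history(turns: list[str], keep_recent: int = 3) -> list[str]:
    """Keep the first turn (often system/setup) and the most recent
    turns verbatim; collapse the middle into a one-line placeholder.
    In practice the placeholder would be an LLM-written summary."""
    if len(turns) <= keep_recent + 1:
        return turns
    middle = turns[1:-keep_recent]
    summary = f"[summary of {len(middle)} earlier turns omitted]"
    return [turns[0], summary, *turns[-keep_recent:]]
```

This preserves exactly the positions the "Lost in the Middle" research says models use best: the start and the end.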
3. Hierarchical Context Structure
```
Level 1: Always include (project overview, key constraints)
Level 2: Task-specific (related modules, APIs)
Level 3: As needed (detailed implementation, history)
```
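The three levels map naturally onto a budget-filling loop: include Level 1 first, then Level 2, then Level 3, stopping when the token budget runs out. A sketch using the same rough character heuristic as earlier (the level numbers and budget are illustrative):

```python
def assemble_by_level(levels: dict[int, list[str]], budget_tokens: int,
                      chars_per_token: int = 4) -> list[str]:
    """Fill the context level by level (1 = always include,
    2 = task-specific, 3 = as needed), stopping when the rough
    token budget is exhausted."""
    included: list[str] = []
    used = 0
    for level in sorted(levels):
        for item in levels[level]:
            cost = len(item) // chars_per_token + 1
            if used + cost > budget_tokens:
                return included  # budget exhausted; drop lower levels
            included.append(item)
            used += cost
    return included
```

Because lower-numbered levels are filled first, the project overview always makes it in, while detailed history is the first thing dropped under pressure.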
Part 3: The Five Components of Context
Synthesizing perspectives from Karpathy and industry experts, effective context consists of five key components.
3.1 Task Description
Definition: Clear description of the task AI should perform
What to Include:
- Goal: What are we trying to achieve
- Background: Why is this task needed
- Success Criteria: What defines success
- Scope: What's included and excluded
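The four fields above translate directly into a structured record that can be rendered into a prompt. A sketch (the class and field names are illustrative, not a standard schema):

```python
from dataclasses import dataclass, field

@dataclass
class TaskDescription:
    """Structured task description covering goal, background,
    success criteria, and scope."""
    goal: str
    background: str
    success_criteria: list[str]
    in_scope: list[str] = field(default_factory=list)
    out_of_scope: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the task description as prompt-ready text."""
        return "\n".join([
            f"Goal: {self.goal}",
            f"Background: {self.background}",
            "Success criteria: " + "; ".join(self.success_criteria),
            "In scope: " + ", ".join(self.in_scope),
            "Out of scope: " + ", ".join(self.out_of_scope),
        ])
```

Keeping the description structured (rather than free text) makes it easy to validate that no field was forgotten before the prompt is sent.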
3.2 Few-shot Examples
Definition: Concrete examples of desired output
Principles of Few-shot Learning:
- LLMs learn patterns from examples
- Examples often more effective than explicit rules
- 2-5 examples typically optimal
Effective Example Selection:
- Representativeness: Cover common cases
- Diversity: Include different input/output types
- Edge Cases: Include boundary cases
- Quality: Output at desired level
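In chat-style APIs, few-shot examples are typically supplied as alternating user/assistant turns before the real query. A minimal sketch (the message-dict shape mirrors common chat APIs but is illustrative here):

```python
def build_few_shot(instruction: str,
                   examples: list[tuple[str, str]],
                   query: str) -> list[dict]:
    """Build a chat-style message list: instruction first, then the
    input/output example pairs (2-5 is typically enough), then the
    real query as the final user turn."""
    messages = [{"role": "system", "content": instruction}]
    for user_input, ideal_output in examples:
        messages.append({"role": "user", "content": user_input})
        messages.append({"role": "assistant", "content": ideal_output})
    messages.append({"role": "user", "content": query})
    return messages

msgs = build_few_shot(
    instruction="Classify the sentiment as positive or negative.",
    examples=[("Great product!", "positive"), ("Broke on day one.", "negative")],
    query="Not bad at all.",
)
```

Presenting examples as prior turns lets the model infer the pattern exactly as the principles above describe, without any explicit rules.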
3.3 RAG (Retrieval Augmented Generation)
Definition: Retrieving relevant information from external knowledge sources to provide as context
Why RAG is Needed:
- Latest Information: Information after LLM training cutoff
- Domain Knowledge: Company/project-specific information
- Accuracy: Reduce hallucination
- Context Efficiency: Select only necessary information
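The RAG pattern reduces to two steps: retrieve the most relevant passages, then prepend them to the question. A deliberately toy sketch using word overlap as the relevance score (production RAG uses embeddings and a vector index instead; all names are illustrative):

```python
def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Toy retrieval: rank documents by word overlap with the query.
    Real systems score by embedding similarity instead."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def augment_prompt(query: str, documents: list[str]) -> str:
    """Prepend the retrieved passages to the question (the
    'augmented generation' half of RAG)."""
    passages = retrieve(query, documents)
    return "Context:\n" + "\n".join(passages) + f"\n\nQuestion: {query}"
```

The payoff for context efficiency is visible in the signature: only the retrieved passages enter the window, not the full document store.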
3.4 State & History
Definition: Record of previous interactions and decisions
What to Include:
- Conversation history (summarized)
- Decision log with reasoning
- Feedback history
- Session continuity information
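A decision log with reasoning is one of the cheapest forms of state to maintain and the most valuable to replay into later prompts. A sketch of one possible design (the class and its methods are illustrative):

```python
from datetime import datetime, timezone

class DecisionLog:
    """Append-only record of decisions with reasoning, rendered
    compactly for inclusion in later prompts."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def record(self, decision: str, reasoning: str) -> None:
        """Store a decision together with why it was made."""
        self.entries.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "decision": decision,
            "reasoning": reasoning,
        })

    def render(self, last_n: int = 5) -> str:
        """Render the most recent decisions as prompt-ready text."""
        lines = [f"- {e['decision']} (why: {e['reasoning']})"
                 for e in self.entries[-last_n:]]
        return "Key decisions:\n" + "\n".join(lines)
```

Replaying the `render()` output at the start of each session gives the model continuity without dragging the full conversation history along.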
3.5 Tools & Constraints
Definition: Tools AI can use and constraints to follow
Tool Specification:
- Available tools and their capabilities
- Unavailable tools
Constraints:
- Technical constraints (language, framework)
- Business constraints (privacy, dependencies)
- Style constraints (conventions, patterns)
- Guardrails (prohibited actions)
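Tools and constraints are easiest to keep honest when they live in one structured spec that is rendered into every prompt. A hypothetical sketch (the schema, tool names, and constraint wording below are all illustrative):

```python
# Hypothetical tool-and-constraint specification; every name and rule
# here is an example, not a standard schema.
SPEC = {
    "tools": {
        "available": ["read_file", "run_tests"],
        "unavailable": ["deploy", "delete_branch"],
    },
    "constraints": {
        "technical": "Python 3.12, FastAPI only",
        "business": "never log personal data",
        "style": "follow PEP 8; match existing module layout",
        "guardrails": "do not modify files under migrations/",
    },
}

def render_spec(spec: dict) -> str:
    """Render the spec as prompt-ready text."""
    lines = [
        "Available tools: " + ", ".join(spec["tools"]["available"]),
        "Forbidden tools: " + ", ".join(spec["tools"]["unavailable"]),
    ]
    lines += [f"{kind.capitalize()} constraint: {rule}"
              for kind, rule in spec["constraints"].items()]
    return "\n".join(lines)
```

Listing *unavailable* tools explicitly is as important as listing available ones: it stops the model from hallucinating capabilities it does not have.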
Part 4: Agentic AI and Context
4.1 Core Patterns of Agentic AI
1. ReAct (Reasoning + Acting)
- Context enables reasoning for next action
- Context guides tool selection and execution
2. Reflection
- Previous output and feedback as context
- AI evaluates and improves its own output
3. Planning
- Goals and constraints as context
- AI creates step-by-step plans
4. Multi-Agent
- Shared context between agents
- Role-specific specialized context
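The ReAct pattern in particular shows why context must be rebuilt "for the next step": every thought, action, and observation is appended to the context that drives the following decision. A minimal sketch of the loop, where `llm` is a stand-in callable and every name is illustrative:

```python
def react_loop(goal: str, tools: dict, llm, max_steps: int = 5) -> str:
    """Minimal ReAct-style loop: at each step the model reasons over
    the accumulated context, picks an action, and the observation is
    fed back into the context for the next step. `llm` is a stand-in
    callable returning a dict with either a final answer or an action."""
    context = [f"Goal: {goal}"]
    for _ in range(max_steps):
        step = llm("\n".join(context))            # Reasoning over context
        if step["final"] is not None:
            return step["final"]
        observation = tools[step["tool"]](step["arg"])  # Acting
        context.append(f"Thought: {step['thought']}")
        context.append(f"Action: {step['tool']}({step['arg']})")
        context.append(f"Observation: {observation}")
    return "stopped: step limit reached"
```

The growing `context` list is the whole mechanism: Reflection, Planning, and Multi-Agent patterns differ mainly in what gets appended to it and who gets to read it.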
4.2 Impact of Context on Agent Performance
Reported Results (figures vary by study and task type; treat them as directional rather than precise):
- High-quality context: 85%+ task success rate
- Low-quality context: below 40% success rate
- No context: only simple tasks succeed
4.3 Principles of Context Design
- Relevance: Only information directly related to current task
- Specificity: Concrete examples over general descriptions
- Structure: Clear sections, consistent format
- Recency: Remove or update outdated information
- Verifiability: Enable AI to verify context accuracy
Conclusion: From Theory to Practice
What We Learned
1. Definition of Context Engineering
   - Evolution from Prompt Engineering
   - "Right information at the right time in the right format"
   - Dynamic system, not static template
2. Understanding Context Windows
   - Size isn't everything
   - The "Lost in the Middle" phenomenon
   - Importance of strategic placement
3. Five Components of Context
   - Task Description, Few-shot Examples, RAG, State & History, Tools & Constraints
4. Relationship with Agentic AI
   - Role of context in each pattern
   - Context quality determines performance
Next Article Preview
Article 3: Context Engineering in Practice - Project Setup will cover:
- CONTEXT.md, agents.md writing guide
- Complete Cursor Rules guide
- MCP (Model Context Protocol) usage
- Actual templates and checklists
References
- Karpathy, A. (2025, June). "Context Engineering" - X/Twitter post.
- Willison, S. (2025, June 27). "Context Engineering." simonwillison.net.
- Lutke, T. (2025). X/Twitter post on Context Engineering.
- GetMaxim.ai. (2025). "Advanced RAG Techniques for Long-Context LLMs."
- NeurIPS. (2024). "Make Your LLM Fully Utilize the Context."
- Thoughtworks. (2025, November). "Technology Radar Vol. 33."
The next article covers applying theory to real projects. How have you started with Context Engineering? Share in the comments.
