## For coding agents
Prioritize tool calling, repository context handling, and retry behavior over raw context size.
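Retry behavior is the kind of thing worth testing directly. As a minimal sketch (the helper name and parameters are illustrative, not from any specific framework), a tool call wrapped in exponential backoff with jitter looks like this:

```python
import random
import time

def call_with_retries(tool_fn, *args, max_attempts=4, base_delay=0.5):
    """Retry a flaky tool call with exponential backoff and jitter.

    Hypothetical helper for illustration: tool_fn stands in for any
    agent tool call that can fail transiently (network, rate limits).
    """
    for attempt in range(max_attempts):
        try:
            return tool_fn(*args)
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the agent loop
            # Wait base_delay * 2^attempt plus jitter before retrying.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

When comparing models for agent use, probing how each behaves when a wrapped call fails once or twice tells you more than a context-window number.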
## Model comparison
Use context length as one signal, not the whole decision. Treat this comparison as a static snapshot and verify details against official model docs and release notes before relying on it.
| Model family | Best early use | Context need | Watch out for |
|---|---|---|---|
| GPT-class frontier models | Coding, agents, complex workflows | Medium to high | Cost and changing model names |
| Claude-class long context models | Documents, analysis, long prompts | High | Latency on very long tasks |
| Gemini-class multimodal models | Long context, video, multimodal apps | High | Behavior varies by workflow; test per use case |
| Open-weight models | Privacy, local workflows, cost control | Low to medium | Hosting, evals, quality variance |
Long context helps, but retrieval, chunking, and citations still need careful design.
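Chunking with citations in mind mostly means giving every chunk a stable, citable ID before it enters retrieval. A minimal sketch (the function, chunk sizes, and ID scheme are assumptions for illustration, not a specific library's API):

```python
def chunk_with_ids(text, doc_id, chunk_size=500, overlap=50):
    """Split text into overlapping chunks, each tagged with a citable ID.

    Illustrative helper: IDs like "doc_id#3" let a model's answer cite
    the exact chunk it drew from, which the app can resolve back to the
    source document.
    """
    chunks = []
    start = 0
    idx = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append({"id": f"{doc_id}#{idx}", "text": text[start:end]})
        if end == len(text):
            break
        start = end - overlap  # overlap so sentences aren't cut off at boundaries
        idx += 1
    return chunks
```

The design point is that citation quality is decided here, at ingestion time, long before any model sees a prompt; a longer context window does not substitute for it.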
Cost and reliability often matter more than frontier model capability.