Model comparison

LLM Context Window Comparison

Treat context length as one signal, not the whole decision. This comparison is a static snapshot and should be refreshed from official model docs and release notes.

Data source plan: official model documentation, pricing pages, provider release notes, and manual verification before each public update.

| Model family | Best early use | Context need | Watch out for |
| --- | --- | --- | --- |
| GPT-class frontier models | Coding, agents, complex workflows | Medium to high | Cost and changing model names |
| Claude-class long context models | Documents, analysis, long prompts | High | Latency on very long tasks |
| Gemini-class multimodal models | Long context, video, multimodal apps | High | Workflow-specific behavior testing |
| Open-weight models | Privacy, local workflows, cost control | Low to medium | Hosting, evals, quality variance |

For coding agents

Prioritize tool calling, repository context handling, and retry behavior over raw context size.
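Retry behavior in particular is easy to get wrong. As a minimal sketch (not any provider's SDK; `call_with_retries` and its parameters are hypothetical names), an agent loop can wrap each flaky model or tool call in exponential backoff with jitter:

```python
import random
import time

def call_with_retries(call, max_attempts=4, base_delay=1.0):
    """Retry a flaky model/tool call with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the agent
            # Backoff doubles each attempt; jitter avoids synchronized retries.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
```

When you evaluate models for agent work, it is this kind of failure handling, plus tool-call accuracy on your own repository, that tends to separate them, not the headline context size.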

For document apps

Long context helps, but retrieval, chunking, and citations still need careful design.
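For example, even with a long-context model, documents are usually split into overlapping chunks so retrieval can return focused, citable passages. A minimal sketch (character-based for simplicity; `chunk_text` is a hypothetical helper, and real systems often split on tokens or sentence boundaries instead):

```python
def chunk_text(text, chunk_size=800, overlap=100):
    """Split text into overlapping chunks; overlap preserves context across boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping `overlap` chars shared
    return chunks
```

Chunk size and overlap are tuning knobs: larger chunks give more context per citation, smaller ones give more precise retrieval.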

For support bots

Cost and reliability often matter more than frontier model capability.
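A rough spend estimate makes the trade-off concrete. The sketch below uses a hypothetical helper and illustrative prices (check your provider's pricing page for real per-million-token rates):

```python
def monthly_cost_usd(requests_per_day, in_tokens, out_tokens,
                     price_in_per_m, price_out_per_m, days=30):
    """Estimate monthly spend; prices are USD per million tokens."""
    per_request = (in_tokens * price_in_per_m
                   + out_tokens * price_out_per_m) / 1_000_000
    return requests_per_day * days * per_request
```

Running the numbers for a busy support bot often shows that a cheaper, reliable mid-tier model fits the budget where a frontier model would not.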