Hey everyone — I went through all 42 models on Pickaxe and cross-referenced the pricing and specifications against the official provider documentation (OpenAI, Anthropic, Google, xAI, Perplexity, Mistral). Wanted to share what I found in case it’s useful for the team.
TL;DR: The overwhelming majority of Pickaxe pricing is a direct pass-through from provider-published rates — no markup detected. But I did find a handful of discrepancies worth flagging, mostly around a couple of pricing entries and some context window / max output specs.
My Clean Reference Table (Published Provider Rates)
| Model | Provider | Input $/1M | Output $/1M | Context Window | Max Output |
|---|---|---|---|---|---|
| GPT-5 nano | OpenAI | $0.050 | $0.40 | 400k | 128k |
| Gemini 2.0 Flash Lite | Google | $0.075 | $0.30 | 1.0M | 8k |
| GPT-4.1 nano | OpenAI | $0.100 | $0.40 | 1.0M | 32k |
| Gemini 2.0 Flash | Google | $0.100 | $0.40 | 1.0M | 8k |
| Grok 4 Fast | xAI | $0.200 | $0.50 | 2.0M | 16k |
| Grok 4.1 Fast | xAI | $0.200 | $0.50 | 2.0M | 16k |
| Grok 4 Fast Reasoning | xAI | $0.200 | $0.50 | 2.0M | 16k |
| Grok 4.1 Fast Reasoning | xAI | $0.200 | $0.50 | 2.0M | 16k |
| GPT-5 mini | OpenAI | $0.250 | $2.00 | 400k | 128k |
| Gemini 2.5 Flash | Google | $0.300 | $2.50 | 1.0M | 65k |
| Mistral Medium 3.1 | Mistral | $0.400 | $2.00 | 128k | ~8k |
| GPT-4.1 mini | OpenAI | $0.400 | $1.60 | 1.0M | 32k |
| Gemini 3 Flash | Google | $0.500 | $3.00 | 1.0M | 64k |
| Sonar | Perplexity | $1.000 | $1.00 | 127k | ~4k |
| Claude 4.5 Haiku | Anthropic | $1.000 | $5.00 | 200k | 64k |
| OpenAI o3-mini | OpenAI | $1.100 | $4.40 | 200k | 100k |
| OpenAI o4-mini | OpenAI | $1.100 | $4.40 | 200k | 100k |
| GPT-5-Chat | OpenAI | $1.250 | $10.00 | 128k | 16k |
| GPT-5 | OpenAI | $1.250 | $10.00 | 400k | 128k |
| GPT-5.1 | OpenAI | $1.250 | $10.00 | 400k | 128k |
| Gemini 2.5 Pro | Google | $1.250 | $10.00 | 1.0M | 65k |
| GPT-5.2 | OpenAI | $1.750 | $14.00 | 400k | 128k |
| Magistral Medium 1.2 | Mistral | $2.000 | $5.00 | 40k | 40k |
| Sonar Reasoning Pro | Perplexity | $2.000 | $8.00 | 128k | 128k |
| Sonar Deep Research | Perplexity | $2.000 | $8.00 | 128k | 128k |
| GPT-4.1 | OpenAI | $2.000 | $8.00 | 1.0M | 32k |
| OpenAI o3 | OpenAI | $2.000 | $8.00 | 200k | 100k |
| Gemini 3 Pro | Google | $2.000 | $12.00 | 1.0M | 64k |
| GPT-4o | OpenAI | $2.500 | $10.00 | 128k | 16k |
| Claude 3.7 Sonnet | Anthropic | $3.000 | $15.00 | 200k | 64k |
| Claude 4 Sonnet | Anthropic | $3.000 | $15.00 | 200k (1M beta) | 64k |
| Claude 4.5 Sonnet | Anthropic | $3.000 | $15.00 | 200k (1M beta) | 64k |
| Claude 4.6 Sonnet | Anthropic | $3.000 | $15.00 | 200k (1M beta) | 64k |
| Sonar Pro | Perplexity | $3.000 | $15.00 | 200k | 8k |
| Grok 3 | xAI | $3.000 | $15.00 | 131k | 16k |
| Grok 4 | xAI | $3.000 | $15.00 | 256k | 16k |
| Claude 4.5 Opus | Anthropic | $5.000 | $25.00 | 200k | 64k |
| Claude 4.6 Opus | Anthropic | $5.000 | $25.00 | 200k (1M beta) | 128k |
| OpenAI o1 | OpenAI | $15.000 | $60.00 | 200k | 100k |
| Claude 4 Opus | Anthropic | $15.000 | $75.00 | 200k | 32k |
| Claude 4.1 Opus | Anthropic | $15.000 | $75.00 | 200k | 32k |
| GPT-5.2 Pro | OpenAI | $21.000 | $168.00 | 400k | 128k |
Item-by-Item Differences
Pricing Discrepancies
| Model | Field | Pickaxe Value | Published Rate | Difference | Notes |
|---|---|---|---|---|---|
| Sonar Reasoning Pro | Input $/1M | $1.00 | $2.00 | -$1.00 | Pickaxe is lower than Perplexity’s published rate |
| Sonar Reasoning Pro | Output $/1M | $5.00 | $8.00 | -$3.00 | Pickaxe is lower than Perplexity’s published rate |
| Mistral Medium 3.1 | Input $/1M | $2.00 | $0.40 | +$1.60 | Possible confusion with the older “Mistral Medium” model? Medium 3.1 is significantly cheaper |
| Mistral Medium 3.1 | Output $/1M | $6.00 | $2.00 | +$4.00 | Same — the older Mistral Medium was priced around $2/$6, so this may be a naming mixup |
| Gemini 2.0 Flash Lite | Input $/1M | $0.070 | $0.075 | -$0.005 | Very minor — half a cent per million tokens |
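If anyone wants to re-run this check, the Difference column is just the Pickaxe value minus the published rate. Here's a minimal sketch of that comparison with the flagged entries hard-coded as illustrative data (not pulled from any API):

```python
# Minimal sketch of the cross-check: compare a Pickaxe-listed rate against the
# provider-published rate and report the per-million-token difference.
# Values below are just the discrepant entries from the table above.
pickaxe_vs_published = [
    # (model, field, pickaxe $/1M, published $/1M)
    ("Sonar Reasoning Pro",   "input",  1.00,  2.00),
    ("Sonar Reasoning Pro",   "output", 5.00,  8.00),
    ("Mistral Medium 3.1",    "input",  2.00,  0.40),
    ("Mistral Medium 3.1",    "output", 6.00,  2.00),
    ("Gemini 2.0 Flash Lite", "input",  0.070, 0.075),
]

for model, field, pickaxe, published in pickaxe_vs_published:
    diff = pickaxe - published
    direction = "above" if diff > 0 else "below"
    print(f"{model} ({field}): {diff:+.3f} $/1M ({direction} published rate)")
```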
Question for the team: Could the Sonar Reasoning Pro and Mistral Medium 3.1 entries be typos or pulled from an older version of those models’ pricing? The Sonar RP numbers look like they might be from an earlier tier, and the Mistral numbers match the old “Mistral Medium” (pre-3.1) pricing almost exactly.
Context Window Discrepancies
| Model | Pickaxe | Published | Notes |
|---|---|---|---|
| Gemini 3 Flash | (missing) | 1.0M | Was blank on site — should be 1.0M |
| Magistral Medium 1.2 | 127k | 40k | Published spec is 40k context, not 127k |
| Claude 4/4.5/4.6 Sonnet | 999k | 200k (1M in beta) | See note below on aspirational vs. default |
| Claude 4.6 Opus | 999k | 200k (1M in beta) | Same — see note below |
Max Output Token Discrepancies
| Model | Pickaxe | Published | Notes |
|---|---|---|---|
| GPT-4o | 4k | 16k | Published max output is 16,384 tokens |
| Sonar | 126k | ~4k | Looks like the context window was accidentally copied into the max output field |
| Gemini 3 Flash | (missing) | 64k | Was blank — should be 64k |
| Grok 4 Fast | 30k | 16k | Published playground/API cap is 16k |
| Grok 4.1 Fast | 30k | 16k | Same |
| Grok 4 Fast Reasoning | 30k | 16k | Same |
| Grok 4.1 Fast Reasoning | 30k | 16k | Same |
| Magistral Medium 1.2 | 127k | 40k | Published spec is 40k, not 127k |
Minor Rounding Differences (Cosmetic Only)
These are close enough that I’d call them non-issues — just different ways of expressing the same number:
| Model | Pickaxe | Published | What’s happening |
|---|---|---|---|
| OpenAI GPT-5 family (nano, mini, 5, 5.1, 5.2, Pro) | 399k | 400k | Rounding |
| All Anthropic models | 199k | 200k | Rounding |
| Perplexity Sonar/Sonar RP/Deep Research | 126k–127k | 127k–128k | Rounding |
| GPT-4.1 family (nano, mini, 4.1) | 33k max out | 32k max out | Rounding |
| Gemini 2.5 Flash/Pro, Gemini 3 Pro | 66k max out | 64k–65k max out | Rounding |
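My guess is these are all just abbreviation conventions. Many provider limits are powers of two (32,768; 65,536; 131,072), so the same number can display as 32k, 33k, 64k, 65k, or 66k depending on whether a UI floors, rounds, or divides by 1024. A quick illustration of assumed conventions — nothing confirmed with Pickaxe:

```python
# Why the same limit can show up as 32k vs 33k, or 64k vs 65k vs 66k:
# different display conventions for abbreviating the same token count.
limits = [32_768, 65_536, 131_072, 200_000, 400_000, 1_000_000]

for n in limits:
    print(f"{n:>9,}  floor: {n // 1000}k   round: {round(n / 1000)}k   binary: {n // 1024}k")

# A "minus one" convention (largest allowed value rather than the count)
# would similarly explain 400k showing as 399k, 200k as 199k, 1M as 999k:
for n in [200_000, 400_000, 1_000_000]:
    print(f"{n:,} stored as {n - 1:,} -> displays as {(n - 1) // 1000}k")
```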
A Note on Claude “999k” Context Windows
A few of the Claude models (Sonnet 4, 4.5, 4.6 and Opus 4.6) show “999k” as the context window. This is worth a quick clarification for anyone looking at these numbers:
Anthropic does offer a 1M token context window for these models, but it’s currently in beta: limited to organizations on usage tier 4 or with custom rate limits, and it requires a specific beta header (anthropic-beta: context-1m-2025-08-07) to enable. The default context window for all of these models is 200k tokens.
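For anyone who wants to verify the beta themselves, this is roughly what enabling it looks like at the API level. A minimal sketch assuming tier-4 access, a key in ANTHROPIC_API_KEY, and an illustrative model id — check Anthropic’s docs for the current header and model names:

```python
import os
import requests

# Minimal sketch: a Messages API call with the 1M-context beta header set.
# Only works for orgs on usage tier 4 (or custom rate limits); without the
# header, the default 200k context window applies.
resp = requests.post(
    "https://api.anthropic.com/v1/messages",
    headers={
        "x-api-key": os.environ["ANTHROPIC_API_KEY"],
        "anthropic-version": "2023-06-01",
        "anthropic-beta": "context-1m-2025-08-07",  # 1M-context beta flag
        "content-type": "application/json",
    },
    json={
        "model": "claude-sonnet-4-20250514",  # example model id
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": "Summarize this very long document..."}],
    },
)
print(resp.json())
```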
Additionally, when a request exceeds 200k input tokens in the 1M beta window, Anthropic charges premium long-context rates (2x input, 1.5x output pricing). So even if someone does have access, the cost profile changes significantly past the 200k threshold.
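To make that cost cliff concrete, here’s a rough sketch assuming Sonnet’s $3/$15 base rates and that the whole request bills at the premium rate once input crosses 200k (my reading of the docs — not a billing-accurate calculator):

```python
# Rough cost sketch for Claude Sonnet under the 1M-context beta:
# requests whose input exceeds 200k tokens are billed at the long-context
# premium (2x input, 1.5x output). Base rates assumed: $3 / $15 per 1M tokens.
BASE_INPUT, BASE_OUTPUT = 3.00, 15.00   # $/1M tokens
LONG_CONTEXT_THRESHOLD = 200_000        # input tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    premium = input_tokens > LONG_CONTEXT_THRESHOLD
    input_rate = BASE_INPUT * (2.0 if premium else 1.0)
    output_rate = BASE_OUTPUT * (1.5 if premium else 1.0)
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Same 4k of output, very different bills once the input crosses 200k:
print(request_cost(180_000, 4_000))   # ~$0.60 at standard rates
print(request_cost(500_000, 4_000))   # ~$3.09 at long-context rates
```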
Suggested language if you want to keep 999k/1M listed: Something like “1M (beta, requires tier 4; default 200k)” would give users the full picture without being misleading. That way it’s aspirational and accurate — people know what’s available and what it takes to get there.
What Matched Perfectly
Just to be clear: the vast majority of this is solid. Every OpenAI and Anthropic model’s pricing was exact, xAI matched across the board, and Google was exact apart from the half-cent Flash Lite difference noted above; the only real pricing outliers are the Sonar Reasoning Pro and Mistral Medium 3.1 entries flagged earlier. No markup was detected anywhere. Pickaxe appears to be doing straight pass-through pricing from every provider, which is great transparency.
Sources: OpenAI API Pricing, Anthropic Pricing Documentation, Google AI for Developers, xAI Developer Docs, Perplexity API Pricing, Mistral AI Pricing. All checked February 2026.