🧠
Reasoning
Logic puzzles, math, planning
9 models · Weekly updates
Example Tasks in This Category
Number Sequence (Easy): Find the pattern and the next number in a sequence.
Causal Reasoning (Hard): Identify cause-and-effect relationships.
Constraint Satisfaction (Hard): Find a solution that satisfies all constraints.
Model Rankings

| Rank | Model | Score | Price/1M | Tasks |
|---|---|---|---|---|
| 🥇 | Qwen3 235B | 98.3 | $0.60 | 6 |
| 🥈 | GPT-4o | 97.8 | $10.00 | 6 |
| 🥉 | Claude 3.5 Sonnet | 97.8 | $15.00 | 6 |
| 4 | Qwen3 Max | 97.7 | $1.60 | 6 |
| 5 | GPT-4o Mini | 95.5 | $0.60 | 6 |
| 6 | DeepSeek R1 | 92.8 | $2.19 | 6 |
| 7 | Gemini 2.0 Flash | 88.5 | $0.40 | 6 |
| 8 | Llama 3.3 70B | 83.5 | $0.40 | 6 |
| 9 | Claude 3.5 Haiku | 76.5 | $4.00 | 6 |
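
The rankings list score and price side by side, so one rough way to read them together is score per dollar. The sketch below is only an illustration, not part of the benchmark's methodology: it assumes "Price/1M" means US dollars per one million tokens, and the `score_per_dollar` helper is a hypothetical name introduced here. The scores and prices are copied from the table.

```python
# Scores and prices copied from the rankings table.
# Assumption: "Price/1M" is USD per one million tokens.
RANKINGS = [
    ("Qwen3 235B", 98.3, 0.60),
    ("GPT-4o", 97.8, 10.00),
    ("Claude 3.5 Sonnet", 97.8, 15.00),
    ("Qwen3 Max", 97.7, 1.60),
    ("GPT-4o Mini", 95.5, 0.60),
    ("DeepSeek R1", 92.8, 2.19),
    ("Gemini 2.0 Flash", 88.5, 0.40),
    ("Llama 3.3 70B", 83.5, 0.40),
    ("Claude 3.5 Haiku", 76.5, 4.00),
]


def score_per_dollar(rankings):
    """Return (model, score / price) pairs sorted from best to worst value.

    Hypothetical helper: the ratio is an illustrative metric, not one the
    leaderboard itself reports.
    """
    ratios = [(model, score / price) for model, score, price in rankings]
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for model, ratio in score_per_dollar(RANKINGS):
        print(f"{model:<20} {ratio:8.1f}")
```

A ratio like this favors the cheapest models regardless of absolute score, so it is best treated as a companion to the score ranking rather than a replacement for it.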