🧠

Reasoning

Logic puzzles, math, planning

9 models, weekly updates

Example Tasks in This Category

Easy

Number Sequence

Find the pattern and next number in a sequence.

Hard

Causal Reasoning

Identify cause and effect relationships.

Hard

Constraint Satisfaction

Find a solution that satisfies all constraints.
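
To make the task types above more concrete, here is a minimal Python sketch of what a Number Sequence task and a Constraint Satisfaction task might look like. The function names, the seating puzzle, and the brute-force approach are illustrative assumptions; they are not the benchmark's actual tasks or grading code.

```python
from itertools import permutations


def next_in_sequence(seq):
    """Return the next term if seq is arithmetic or geometric, else None.

    Hypothetical helper: real sequence tasks may use other patterns.
    """
    # Arithmetic pattern: every consecutive difference is the same.
    diffs = [b - a for a, b in zip(seq, seq[1:])]
    if len(set(diffs)) == 1:
        return seq[-1] + diffs[0]
    # Geometric pattern: every consecutive ratio is the same.
    if all(x != 0 for x in seq[:-1]):
        ratios = [b / a for a, b in zip(seq, seq[1:])]
        if len(set(ratios)) == 1:
            return seq[-1] * ratios[0]
    return None


def solve_ordering(items, constraints):
    """Brute-force search: return the first ordering that satisfies
    every constraint, or None if no ordering does."""
    for order in permutations(items):
        if all(check(order) for check in constraints):
            return order
    return None


# Hypothetical puzzle: seat three people in a row so that
# Ana is not first and Ben sits immediately left of Cy.
constraints = [
    lambda order: order[0] != "Ana",
    lambda order: order.index("Ben") + 1 == order.index("Cy"),
]
seating = solve_ordering(("Ana", "Ben", "Cy"), constraints)
```

Brute force is only practical for very small puzzles like this one; it is used here because it keeps the definition of "satisfies all constraints" explicit.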

Model Rankings

Rank  Model              Score  Price/1M  Tasks
🥇    Qwen3 235B         98.3   $0.60     6
🥈    GPT-4o             97.8   $10.00    6
🥉    Claude 3.5 Sonnet  97.8   $15.00    6
4     Qwen3 Max          97.7   $1.60     6
5     GPT-4o Mini        95.5   $0.60     6
6     DeepSeek R1        92.8   $2.19     6
7     Gemini 2.0 Flash   88.5   $0.40     6
8     Llama 3.3 70B      83.5   $0.40     6
9     Claude 3.5 Haiku   76.5   $4.00     6