Reasoning

Logic, math, planning

9 models, updated weekly

Task examples

Example tasks in this category

Easy

Number Sequence

Find the pattern and next number in a sequence.

Hard

Causal Reasoning

Identify cause and effect relationships.

Hard

Constraint Satisfaction

Find a solution that satisfies all constraints.
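
To make the constraint-satisfaction task type concrete, here is a minimal Python sketch. The seating puzzle, the names, and the `solve_seating` helper are all made up for illustration; they are not taken from the benchmark's task set, and the benchmark may pose and grade such problems quite differently.

```python
from itertools import permutations

def solve_seating(people, constraints):
    """Return the first seating order (a tuple) that passes every constraint, or None."""
    for order in permutations(people):
        if all(check(order) for check in constraints):
            return order
    return None

# Hypothetical puzzle: seat three people in a row, left to right.
people = ["Ana", "Ben", "Cy"]
constraints = [
    lambda o: o.index("Ana") < o.index("Ben"),            # Ana sits left of Ben
    lambda o: o[0] != "Ana",                              # Ana is not in the first seat
    lambda o: abs(o.index("Ben") - o.index("Cy")) != 1,   # Ben and Cy are not adjacent
]

solution = solve_seating(people, constraints)
print(solution)
```

Brute force over all orderings is only workable for tiny puzzles like this one, but it shows the shape of the task: a candidate counts as a solution only if every constraint holds at once.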

Rank  Model              Score  Price/1M  Tasks
🥇    Qwen3 235B         98.3   $0.60     6
🥈    GPT-4o             97.8   $10.00    6
🥉    Claude 3.5 Sonnet  97.8   $15.00    6
4     Qwen3 Max          97.7   $1.60     6
5     GPT-4o Mini        95.5   $0.60     6
6     DeepSeek R1        92.8   $2.19     6
7     Gemini 2.0 Flash   88.5   $0.40     6
8     Llama 3.3 70B      83.5   $0.40     6
9     Claude 3.5 Haiku   76.5   $4.00     6