
Analysis

Data analysis, summarization

8 models · Weekly updates

Task Examples

Example Tasks in This Category

Easy

Sentiment Classification

Classify sentiment of customer reviews.

Hard

Compare Two Documents

Compare two product descriptions and highlight differences.

Medium

Data Summary

Analyze data and provide insights.

Model Rankings

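Rank | Model | Score | Price/1M | Tasks
🥇 | Qwen3 235B | 93.0 | $0.60 | 1
🥈 | GPT-4o Mini | 93.0 | $0.60 | 1
🥉 | DeepSeek R1 | 93.0 | $2.19 | 1
4 | Qwen3 Max | 93.0 | $1.60 | 1
5 | GPT-4o | 90.0 | $10.00 | 1
6 | Claude 3.5 Haiku | 87.0 | $4.00 | 1
7 | Llama 3.3 70B | 87.0 | $0.40 | 1
8 | Gemini 2.0 Flash | 83.0 | $0.40 | 1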