Maestro Reasoning is Arcee's flagship analysis model, a 32B-parameter derivative of Qwen 2.5-32B tuned with DPO and chain-of-thought RL for step-by-step logic and complex problem-solving. This production release widens the context window to 128k tokens (131,072), doubles the pass rate on the MATH and GSM-8K benchmarks, and improves code completion accuracy. Its instruction style promotes structured "thought → answer" traces, which provide the transparency that audit-focused industries such as finance and healthcare require. Maximum output is 4,096 tokens per response. Pricing is $0.90 per 1M input tokens and $3.30 per 1M output tokens, which positions it as a PRO tier choice for demanding applications. In Arcee Conductor, Maestro is selected automatically for complex, multi-constraint queries that smaller SLMs cannot handle. The model is best suited to chat applications that need deep analysis and clear reasoning paths, and it supports streaming.
✅ Best For
🚀 Capabilities
❌ Limitations
Specifications
| Field | Value |
| --- | --- |
| Provider | arcee-ai |
| Context Window | 131,072 tokens |
| Max Output | 4,096 tokens |
| Minimum Plan | Premium |
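
The two token limits above interact when sizing a request. The following is a minimal sketch of a budget check, assuming that the prompt and the generated reply share the same 131,072-token context window (a common convention, but not stated on this page). The function names `max_prompt_tokens` and `fits` are hypothetical helpers, not part of any Arcee API.

```python
CONTEXT_WINDOW = 131_072  # tokens, from the Specifications table
MAX_OUTPUT = 4_096        # tokens, from the Specifications table


def max_prompt_tokens(requested_output: int = MAX_OUTPUT) -> int:
    """Return the prompt budget left after reserving room for the reply.

    Assumption: prompt and completion count against the same context window.
    """
    if not 0 < requested_output <= MAX_OUTPUT:
        raise ValueError(f"requested_output must be between 1 and {MAX_OUTPUT}")
    return CONTEXT_WINDOW - requested_output


def fits(prompt_tokens: int, requested_output: int = MAX_OUTPUT) -> bool:
    """Check whether a prompt of the given size leaves room for the reply."""
    return 0 <= prompt_tokens <= max_prompt_tokens(requested_output)
```

Under that assumption, reserving the full 4,096-token output leaves 126,976 tokens for the prompt. Token counts depend on the model's tokenizer, so any count produced by a different tokenizer is only an estimate.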
Pricing
| Item | Price |
| --- | --- |
| Input Price | $0.90 / 1M tokens |
| Output Price | $3.30 / 1M tokens |
💡 With a PRO subscription, cost is reduced by 20%.
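
The per-request cost follows directly from the listed prices. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, and it assumes the 20% PRO discount applies equally to input and output tokens, which this page does not spell out.

```python
from decimal import Decimal

INPUT_PRICE_PER_M = Decimal("0.90")   # USD per 1M input tokens (listed price)
OUTPUT_PRICE_PER_M = Decimal("3.30")  # USD per 1M output tokens (listed price)
PRO_DISCOUNT = Decimal("0.20")        # 20% reduction with a PRO subscription


def estimate_cost(input_tokens: int, output_tokens: int, pro: bool = False) -> Decimal:
    """Estimate the USD cost of one request from its token counts.

    Assumption: the PRO discount applies uniformly to input and output.
    """
    million = Decimal(1_000_000)
    cost = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    ) / million
    if pro:
        cost *= 1 - PRO_DISCOUNT
    return cost


# Example: a 100,000-token prompt with a 4,000-token reply
print(estimate_cost(100_000, 4_000))            # list price
print(estimate_cost(100_000, 4_000, pro=True))  # with the PRO discount
```

For that example the list price works out to $0.1032 (0.09 for input plus 0.0132 for output), and $0.08256 with the discount. `Decimal` is used so that the arithmetic on prices stays exact.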