Arcee AI: Spotlight is a 7-billion-parameter vision-language model derived from Qwen 2.5-VL and fine-tuned by Arcee AI for image-text grounding tasks. Its description cites a 32k-token context window (the specifications on this page list 131,072 tokens), enough for multimodal conversations that combine lengthy documents with one or more images. Training prioritized fast inference on consumer GPUs while preserving accuracy in captioning, visual question answering (VQA), and diagram analysis. Spotlight suits agent workflows that need on-the-fly interpretation of screenshots, charts, or UI mock-ups. Early benchmarks show it matching or surpassing larger VLMs such as LLaVA-1.6 13B on popular VQA and POPE alignment tests. Spotlight is available on Multi AI at $0.18 per 1M tokens for both input and output.
Specifications
| Specification | Value |
| --- | --- |
| Provider | arcee-ai |
| Context Window | 131,072 tokens |
| Max Output | 65,537 tokens |
| Minimum Plan | Economy |
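The context window and max output limits interact when sizing a request. The sketch below is a minimal illustration, assuming the prompt and the generated output share the same context window (a common convention, not confirmed on this page). The helper name `max_completion_tokens` is hypothetical, and `prompt_tokens` is assumed to already include whatever tokens the images consume.

```python
# Limits copied from the Specifications table.
CONTEXT_WINDOW = 131_072  # tokens
MAX_OUTPUT = 65_537       # tokens


def max_completion_tokens(prompt_tokens: int) -> int:
    """Return the largest output budget available for a given prompt size.

    Assumption: prompt and output tokens count against one shared
    context window. This helper is illustrative, not a provider API.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # The output can never exceed MAX_OUTPUT, nor the space left in the window.
    return max(0, min(MAX_OUTPUT, remaining))
```

Under these assumptions, a 10,000-token prompt leaves the full 65,537-token output budget, while a 100,000-token prompt leaves 31,072 tokens.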
Pricing
| Item | Price |
| --- | --- |
| Input Price | $0.1800 / 1M tokens |
| Output Price | $0.1800 / 1M tokens |
💡 With a PRO subscription, the cost is reduced by 20%.
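The pricing above reduces to simple arithmetic. The following is a rough sketch of a per-request cost estimate. The function name `estimate_cost` is hypothetical, and it assumes the 20% PRO reduction applies uniformly to both input and output charges, which this page does not spell out.

```python
from decimal import Decimal

# Prices copied from the Pricing table (USD per 1M tokens).
INPUT_PRICE_PER_M = Decimal("0.18")
OUTPUT_PRICE_PER_M = Decimal("0.18")
PRO_DISCOUNT = Decimal("0.20")  # 20% reduction with a PRO subscription


def estimate_cost(input_tokens: int, output_tokens: int, pro: bool = False) -> Decimal:
    """Estimate the USD cost of one request (illustrative helper)."""
    million = Decimal(1_000_000)
    cost = (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / million
    if pro:
        # Assumption: the discount applies equally to input and output charges.
        cost *= 1 - PRO_DISCOUNT
    return cost
```

For example, a request with 50,000 input tokens and 2,000 output tokens would come to $0.00936 at the listed rate, or $0.007488 with the PRO discount under the assumption above.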