Google Gemini 2.5 Flash Lite is a lightweight reasoning model in the Gemini 2.5 family, engineered for speed and cost-effectiveness. It delivers ultra-low latency, higher throughput, and faster token generation than its predecessors, along with improved scores on common benchmarks, making it a strong choice for applications where speed is paramount. By default, its 'thinking' (multi-pass reasoning) feature is disabled to maximize speed, but developers can enable it via the [Reasoning API parameter](https://openrouter.ai/docs/use-cases/reasoning-tokens) to trade some latency for deeper analysis.

The model offers a 1,048,576-token (1M) context window and an 8,192-token maximum output, and supports streaming, vision, audio input, video input, function calling, and structured outputs. Pricing is competitive at $0.10/$0.40 per 1M tokens (input/output), keeping it accessible for a wide range of projects. Best suited for chat, code generation, data analysis, and document processing, it's available on Multi AI's STARTER access tier.
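As a sketch of how enabling reasoning might look, the snippet below builds a chat-completions request payload using OpenRouter's `reasoning` parameter. The model slug and the exact parameter shape are assumptions drawn from OpenRouter's documentation; adjust them to the gateway you actually use.

```python
import json

# Hypothetical request payload for an OpenRouter-style chat completions
# endpoint. The model slug "google/gemini-2.5-flash-lite" and the
# `reasoning` field shape are assumptions; verify against your provider.
payload = {
    "model": "google/gemini-2.5-flash-lite",
    "messages": [
        {"role": "user", "content": "Summarize the key risks in this contract."}
    ],
    # 'Thinking' is off by default for this model; opt in explicitly by
    # granting a token budget for multi-pass reasoning.
    "reasoning": {
        "max_tokens": 1024,
    },
}

print(json.dumps(payload, indent=2))
```

Omitting the `reasoning` block keeps the default fast path; including it spends extra tokens (billed as output) on deeper analysis.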
Specifications
| Specification | Value |
| --- | --- |
| Provider | Google |
| Context Window | 1,048,576 tokens |
| Max Output | 8,192 tokens |
| Minimum Plan | Balance |
Pricing
| Metric | Rate |
| --- | --- |
| Input Price | $0.10 / 1M tokens |
| Output Price | $0.40 / 1M tokens |
💡 With PRO subscription, cost is reduced by 20%
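To make the pricing concrete, here is a small cost estimator using the listed per-1M-token rates and the 20% PRO discount. The token counts in the example are illustrative, not from the source.

```python
# Listed rates for Gemini 2.5 Flash Lite (USD per 1M tokens).
INPUT_PRICE = 0.10
OUTPUT_PRICE = 0.40
PRO_DISCOUNT = 0.20  # 20% reduction with a PRO subscription


def estimate_cost(input_tokens: int, output_tokens: int, pro: bool = False) -> float:
    """Estimate the USD cost of one request from its token counts."""
    cost = (input_tokens / 1_000_000) * INPUT_PRICE \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE
    return cost * (1 - PRO_DISCOUNT) if pro else cost


# Example: 500K input tokens plus the 8,192-token maximum output.
print(f"standard: ${estimate_cost(500_000, 8_192):.6f}")
print(f"with PRO: ${estimate_cost(500_000, 8_192, pro=True):.6f}")
```

Even a near-maximal context window fill stays in the fraction-of-a-cent range per request at these rates.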