G2
Balance

Google: Gemini 2.5 Flash Lite

by google

Google Gemini 2.5 Flash Lite is a cutting-edge, lightweight reasoning model within the acclaimed Gemini 2.5 family. Engineered for unparalleled speed and cost-effectiveness, this model delivers ultra-low latency and significantly improved throughput. It boasts faster token generation and superior performance across common benchmarks compared to its predecessors, making it an excellent choice for applications where speed is paramount. By default, its 'thinking' (multi-pass reasoning) feature is disabled to prioritize maximum speed. However, developers can easily enable this advanced reasoning via the [Reasoning API parameter](https://openrouter.ai/docs/use-cases/reasoning-tokens) to selectively balance speed with deeper analytical capabilities. With a generous 1048K token context window and an 8K token max output, Gemini 2.5 Flash Lite supports streaming, vision, audio_in, video_in, functions, and structured outputs. Pricing is highly competitive at $0.10/$0.40 per 1M tokens (input/output), making it accessible for a wide range of projects. Best suited for chat, code generation, data analysis, and document processing, it's available on Multi AI's STARTER access tier.

Lightweight AIFast AICost-efficientVideo AIGoogle Gemini
72%Quality
1049KContext Window
85%Speed
Category
Economy
API access
Unified context
RAG + Knowledge Base
24/7 Support
Try This ModelCompare models

Best For

Chat
Code Generation
Data Analysis
Document Processing

🚀 Capabilities

Streaming
Vision
Audio Input
Video Input
Functions
Structured Output

Limitations

No image generation

Specifications

Providergoogle
Context Window1,048,576 tokens
Max Output8,192 tokens
Minimum PlanBalance

Pricing

Input Price$0.1000 / 1M tokens
Output Price$0.4000 / 1M tokens

💡 With PRO subscription, cost is reduced by 20%

Ready to try Google: Gemini 2.5 Flash Lite?

Get 1,000 tokens free on signup

Start for free