Google: Gemini 2.5 Flash Lite Preview 09-2025

by google

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, built for very low latency and low cost. It offers higher throughput and faster token generation than its predecessors, which suits applications where speed matters most. 'Thinking' (multi-pass reasoning) is disabled by default to maximize speed; developers can enable it via the Reasoning API parameter, trading higher cost for more intelligence. The model has a context window of 1,048,576 tokens (about 1M) and a maximum output of 65,536 tokens. It supports vision, functions, code, and streaming. Pricing is $0.10 per 1M input tokens and $0.40 per 1M output tokens, and the model is available from the Balance plan on Multi AI. It is well suited to chat, code generation, data analysis, and document processing.

Vision AI, Low Latency, Cost-Efficient, Google AI, Flash Model
Quality: 70%
Context Window: 1049K
Speed: 85%
Category: Economy
API access
Unified context
RAG + Knowledge Base
24/7 Support

Best For

Chat
Code
Analysis
Documents

🚀 Capabilities

Long context
Vision
Structured output
JSON mode
Voice understanding
Functions
Code
Streaming
Video understanding

Limitations

No image generation

Specifications

Provider: google
Context Window: 1,048,576 tokens
Max Output: 65,536 tokens
Minimum Plan: Balance
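
As a rough illustration of how these limits might be checked before sending a request, here is a minimal sketch. The function name `check_request` is hypothetical and not part of any published API. The page does not say whether output tokens count against the context window, so this sketch assumes the two limits are independent and checks each separately.

```python
# Limits taken from the Specifications list above.
CONTEXT_WINDOW = 1_048_576  # maximum tokens in the context
MAX_OUTPUT = 65_536         # maximum tokens the model may generate


def check_request(prompt_tokens: int, max_output_tokens: int) -> list[str]:
    """Return a list of problems; an empty list means the request fits.

    Assumption: the prompt and output limits are enforced independently.
    """
    problems = []
    if prompt_tokens > CONTEXT_WINDOW:
        problems.append(
            f"prompt has {prompt_tokens} tokens, limit is {CONTEXT_WINDOW}"
        )
    if max_output_tokens > MAX_OUTPUT:
        problems.append(
            f"requested {max_output_tokens} output tokens, limit is {MAX_OUTPUT}"
        )
    return problems


if __name__ == "__main__":
    # A 900K-token document with an 8K-token answer fits both limits.
    print(check_request(900_000, 8_000))
```

Token counts depend on the tokenizer, so the numbers passed in should come from an actual token count rather than a character-based estimate.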

Pricing

Input Price: $0.1000 / 1M tokens
Output Price: $0.4000 / 1M tokens

💡 With a PRO subscription, the cost is reduced by 20%
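
To make the pricing concrete, the sketch below estimates the cost of a single request from the listed rates. The function name `estimate_cost` is illustrative only. It assumes the 20% PRO reduction applies equally to input and output tokens, which the page does not state explicitly.

```python
# Rates from the Pricing section, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.40
PRO_DISCOUNT = 0.20  # assumed to apply uniformly to input and output


def estimate_cost(input_tokens: int, output_tokens: int, pro: bool = False) -> float:
    """Estimate the USD cost of one request at the listed rates."""
    cost = (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000
    if pro:
        cost *= 1 - PRO_DISCOUNT
    return cost


if __name__ == "__main__":
    # 200K input tokens and 10K output tokens.
    print(f"{estimate_cost(200_000, 10_000):.4f}")            # prints 0.0240
    print(f"{estimate_cost(200_000, 10_000, pro=True):.4f}")  # prints 0.0192
```

At these rates output tokens cost four times as much as input tokens, so long generated answers affect the bill more than long prompts of the same size.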
