
Inception: Mercury

by inception

Inception: Mercury is a breakthrough in large language models: the first built on a discrete diffusion approach. This architecture gives Mercury exceptional speed, running 5-10 times faster than even highly optimized models such as GPT-4.1 Nano and Claude 3.5 Haiku while maintaining comparable quality. That speed makes Mercury a strong fit for developers building highly responsive user experiences, and it excels in applications that demand rapid interaction, including voice agents, dynamic search interfaces, and real-time chatbots. With a 128K-token context window and a 4K-token maximum output, Mercury supports long conversations and detailed responses, and it offers capabilities like functions, code generation, and streaming. Pricing is competitive at $0.25/$1.00 per 1M tokens (input/output), available on the STARTER access tier.
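Since Mercury supports streaming and function calling, it is presumably reached through a chat-completions-style API. Below is a minimal sketch of what a streaming request body might look like; the model identifier "mercury" and every field name here are assumptions for illustration, not documented values, so check the provider's API reference for the real schema.

```python
import json

# Hypothetical payload for a chat-completions-style endpoint.
# "mercury" and all field names are illustrative assumptions.
payload = {
    "model": "mercury",
    "messages": [
        {"role": "system", "content": "You are a concise real-time assistant."},
        {"role": "user", "content": "Suggest three search filters for a shoe store."},
    ],
    "stream": True,      # stream tokens incrementally for responsive UIs
    "max_tokens": 4096,  # Mercury's listed output ceiling
}

print(json.dumps(payload, indent=2))
```

Keeping `max_tokens` at or below the 4,096-token output limit avoids truncated or rejected requests, and `stream: true` is what makes the model's speed visible in chatbots and voice agents.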

Tags: text AI, fast LLM, diffusion model, chatbot AI, developer tools
Quality: 70%
Context Window: 128K
Speed: 70%
Category: Standard

API access
Unified context
RAG + Knowledge Base
24/7 Support

Best For

Chatbots
Voice Agents
Search Interfaces
Responsive UIs

🚀 Capabilities

Functions
Code Generation
Streaming

Limitations

No image generation

Specifications

Provider: inception
Context Window: 128,000 tokens
Max Output: 4,096 tokens
Minimum Plan: Balance

Pricing

Input Price: $0.25 / 1M tokens
Output Price: $1.00 / 1M tokens

💡 With PRO subscription, cost is reduced by 20%
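The listed rates make per-request costs easy to estimate. Here is a small sketch applying the $0.25 / $1.00 per-million-token rates and the 20% PRO discount; the token counts in the example are purely illustrative.

```python
INPUT_RATE = 0.25 / 1_000_000   # USD per input token
OUTPUT_RATE = 1.00 / 1_000_000  # USD per output token
PRO_DISCOUNT = 0.20             # 20% off with a PRO subscription

def request_cost(input_tokens: int, output_tokens: int, pro: bool = False) -> float:
    """Estimate the USD cost of one Mercury request at the listed rates."""
    cost = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
    if pro:
        cost *= 1 - PRO_DISCOUNT
    return cost

# Example: a 2,000-token prompt with a 500-token reply.
print(f"standard: ${request_cost(2000, 500):.6f}")            # → standard: $0.001000
print(f"with PRO: ${request_cost(2000, 500, pro=True):.6f}")  # → with PRO: $0.000800
```

At these rates, one million tokens in and one million out costs $1.25 on the standard plan and $1.00 with the PRO discount applied.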

Ready to try Inception: Mercury?

Get 1,000 tokens free on signup

Start for free