Meituan: LongCat Flash Chat is a large-scale Mixture-of-Experts (MoE) model with 560 billion total parameters, of which it dynamically activates 18.6B to 31.3B per token to stay efficient. It introduces a shortcut-connected MoE design that reduces communication overhead and raises throughput, and it maintains training stability through scaling strategies such as hyperparameter transfer and multi-stage optimization.

Optimized as a non-thinking foundation model, LongCat-Flash-Chat is designed for conversational and agentic tasks. It delivers competitive performance across benchmarks covering reasoning, coding, and instruction following, with particular strengths in tool use and complex multi-step interactions.

Key specifications: a context window of 131,072 tokens (128K) and a maximum output of 4,096 tokens (4K). Pricing is $0.20 per 1M input tokens and $0.80 per 1M output tokens, making it an accessible STARTER tier model on Multi AI. Streaming is supported.
✅ Best For
🚀 Capabilities
❌ Limitations
Specifications
| Field | Value |
| --- | --- |
| Provider | meituan |
| Context Window | 131,072 tokens |
| Max Output | 4,096 tokens |
| Minimum Plan | Balance |
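
As a rough illustration of how the context window and output cap interact, the sketch below computes how many output tokens can be requested for a prompt of a given size. The helper name `max_output_budget` is hypothetical, and the sketch assumes that input and output tokens share the same 131,072-token window, which the listing does not state explicitly.

```python
CONTEXT_WINDOW = 131_072  # tokens, from the Specifications table
MAX_OUTPUT = 4_096        # tokens, from the Specifications table


def max_output_budget(prompt_tokens: int) -> int:
    """Return how many output tokens can be requested for a prompt of this size.

    Hypothetical helper. Assumption: input and output tokens share the
    same context window, so a long prompt can shrink the output budget.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # Never exceed the model's output cap, and never go below zero.
    return max(0, min(MAX_OUTPUT, remaining))


if __name__ == "__main__":
    for size in (1_000, 130_000, 131_072):
        print(size, max_output_budget(size))
```

Under that assumption, only prompts longer than 126,976 tokens reduce the available output below the 4,096-token cap.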
Pricing
| Type | Price |
| --- | --- |
| Input Price | $0.20 / 1M tokens |
| Output Price | $0.80 / 1M tokens |
💡 With a PRO subscription, the cost is reduced by 20%.
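
To make the pricing concrete, here is a minimal sketch that estimates the cost of a single request from the listed per-token prices. The function `estimate_cost` is a hypothetical helper, not part of any Multi AI API, and it assumes the 20% PRO reduction applies equally to input and output charges.

```python
from decimal import Decimal

# Prices from the Pricing section (USD per 1M tokens).
INPUT_PRICE_PER_M = Decimal("0.20")
OUTPUT_PRICE_PER_M = Decimal("0.80")
PRO_DISCOUNT = Decimal("0.20")  # 20% reduction with a PRO subscription


def estimate_cost(input_tokens: int, output_tokens: int, pro: bool = False) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    million = Decimal(1_000_000)
    cost = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    ) / million
    if pro:
        # Assumption: the discount applies to the whole request cost.
        cost *= Decimal(1) - PRO_DISCOUNT
    return cost


if __name__ == "__main__":
    print(estimate_cost(100_000, 4_096))
    print(estimate_cost(100_000, 4_096, pro=True))
```

For a 100,000-token prompt with a full 4,096-token reply, this works out to roughly $0.023 at the standard rate and roughly $0.019 with the PRO reduction. `Decimal` is used instead of floats so that small per-request amounts do not pick up binary rounding noise.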