Nous: Hermes 4 405B is a cutting-edge large-scale reasoning model developed by Nous Research, leveraging the powerful Meta-Llama-3.1-405B architecture. This model introduces an innovative hybrid reasoning mode, allowing it to internally deliberate with <think>...</think> traces or respond directly, balancing speed and depth. Users can precisely control this behavior using the `reasoning` `enabled` boolean. It's instruction-tuned with an expanded post-training corpus of approximately 60 billion tokens, specifically emphasizing reasoning traces to significantly boost performance in mathematics, coding, STEM fields, and general logical reasoning, all while maintaining broad utility as an assistant. Beyond its reasoning prowess, Hermes 4 supports a variety of structured outputs, including JSON mode, schema adherence, function calling, and tool use, making it highly versatile for integration into diverse applications. The model is trained for enhanced steerability, reduced refusal rates, and alignment towards neutral, user-directed behavior. With a substantial context window of 131K tokens and a maximum output of 4K tokens, it can handle extensive conversations and generate detailed responses. Pricing is competitive at $1.00 per 1M input tokens and $3.00 per 1M output tokens, available on our PRO Access Tier. Its capabilities include functions, code generation, streaming, and search integration.
✅ Best For
🚀 Capabilities
❌ Limitations
Specifications
| Provider | nousresearch |
| Context Window | 131,072 tokens |
| Max Output | 4,096 tokens |
| Minimum Plan | Premium |
Pricing
| Input Price | $1.0000 / 1M tokens |
| Output Price | $3.0000 / 1M tokens |
💡 With PRO subscription, cost is reduced by 20%