Hermes 4 70B is a hybrid-mode reasoning model from Nous Research, based on Llama-3.1-70B. Compared to Hermes 3, it delivers stronger mathematical and scientific reasoning, better instruction following, and precise schema-adherent outputs, along with nuanced roleplay and creative writing capabilities.

The model supports a context length of 131k tokens.

What’s new vs Hermes 3

Custom Fields

Special features defined by the model author

Enable Thinking: boolean (default=false)

Controls whether the model will think before replying

Keep CoT: boolean (default=false)

Include Chain of Thought in subsequent requests
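The effect of these two fields can be pictured on the client side. The following is a minimal sketch, assuming the reasoning arrives inline as a <think>…</think> segment in the reply text. The helper names `split_reasoning` and `prepare_history` are hypothetical and are not part of any LM Studio or Nous Research API; how the application itself handles these fields is not documented here.

```python
import re

# Matches one <think>...</think> segment, including across newlines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer) for a reply that may contain <think> segments."""
    reasoning = "\n".join(m.strip() for m in THINK_RE.findall(text))
    answer = THINK_RE.sub("", text).strip()
    return reasoning, answer


def prepare_history(messages, keep_cot=False):
    """Copy a chat history, dropping reasoning from assistant turns unless keep_cot is set."""
    prepared = []
    for msg in messages:
        content = msg["content"]
        if msg["role"] == "assistant" and not keep_cot:
            _, content = split_reasoning(content)
        prepared.append({"role": msg["role"], "content": content})
    return prepared
```

With `keep_cot=False`, earlier assistant turns are resent without their reasoning, which keeps the context shorter; with `keep_cot=True`, the history is passed through unchanged.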

Parameters

Custom configuration options included with this model

Temperature: 0.6

Top K Sampling

Top P Sampling: 0.95
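To make the sampling parameters concrete, here is a minimal sketch of the general temperature and top-p (nucleus) technique in plain Python. It illustrates what the two values control and is not the inference engine's actual implementation; the function names are hypothetical, and the defaults simply mirror the values listed on this page.

```python
import math


def apply_temperature(logits, temperature=0.6):
    """Turn raw logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=0.95):
    """Keep the smallest set of most likely tokens whose mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Map token index -> renormalized probability over the kept set.
    return {i: probs[i] / mass for i in kept}
```

A temperature below 1 sharpens the distribution toward the most likely tokens, and the top-p filter then discards the low-probability tail before a token is drawn.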

Sources

The underlying model files this model uses

testus

Public

Forked from nousresearch/hermes-4-70b


Post-training corpus: dataset size increased from 1M samples and 1.2B tokens to ~5M samples and ~60B tokens, blended across reasoning and non-reasoning data.

Hybrid reasoning mode with explicit <think>…</think> segments when the model decides to deliberate, and the option to skip thinking when you want faster responses.

Reasoning: high-quality, expressive reasoning that improves math, code, STEM, logic, and even creative writing and subjective responses.

Schema adherence & structured outputs: trained to produce valid JSON for given schemas and to repair malformed objects.

Much easier to steer and align: large improvements in steerability, especially reduced refusal rates.
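For the schema-adherence point above, a caller will usually still want to verify the model's output before using it. The following is a minimal sketch using only the Python standard library; the `SCHEMA` mapping and the `check_output` helper are hypothetical illustrations, not a real JSON Schema validator and not part of the model or any library API.

```python
import json

# Hypothetical schema for illustration: field name -> expected Python type.
SCHEMA = {"name": str, "age": int}


def check_output(raw, schema=SCHEMA):
    """Parse model output as JSON and return a list of problems (empty if it conforms)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    problems = []
    for key, expected in schema.items():
        if key not in data:
            problems.append(f"missing field: {key}")
        elif not isinstance(data[key], expected):
            problems.append(f"wrong type for {key}: expected {expected.__name__}")
    return problems
```

One possible use is to send any reported problems back to the model in a follow-up message together with the malformed object and ask it to repair the output.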

Based on

🤗 lmstudio-community/Hermes-4-70B-GGUF (GGUF)

🤗 lmstudio-community/Hermes-4-70B-MLX-8bit (MLX)

🤗 lmstudio-community/Hermes-4-70B-MLX-6bit (MLX)

🤗 lmstudio-community/Hermes-4-70B-MLX-5bit (MLX)

🤗 lmstudio-community/Hermes-4-70B-MLX-4bit (MLX)