
Imagine an AI that says everything it thinks, without holding back half the truth. That’s exactly what Hermes 4 from Nous Research sets out to deliver. The new model challenges the giants with top results in math, creativity, and full user control, without the usual restrictions that make other AIs sound like tired bureaucrats.
Nous Research has released Hermes 4, a series of open source language models that match or outperform closed systems, but with a twist: no annoying content restrictions. It is a clear stance in the battle between corporate-controlled AI and open source – who should really decide what machines are allowed to say?
The previous version, Hermes 3, was trained on about 1.2 billion tokens. Hermes 4 was instead trained on approximately 70 billion tokens. This gives the model a much broader and deeper foundation, which should make its answers more accurate and reliable.
Hermes 4 offers a new “reasoning mode”. When it is activated, the model first works through an internal thought process, marked with <think> tags, before delivering the answer. This lets it give more carefully considered answers to difficult questions without showing the user all of its internal reflections.
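As a rough illustration of how an application might keep the reasoning away from the user, the sketch below splits a raw completion into its reasoning and its visible answer. It assumes the reasoning arrives inline, wrapped in <think> tags. The function name `split_reasoning` and the sample text are hypothetical, and a hosted API may already do this separation for you.

```python
import re

# Matches one <think>...</think> block, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(raw: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a raw completion.

    Assumption: the reasoning, if present, is wrapped in <think> tags.
    """
    reasoning = "\n".join(m.strip() for m in THINK_BLOCK.findall(raw))
    answer = THINK_BLOCK.sub("", raw).strip()
    return reasoning, answer


# Hand-written sample completion, for illustration only.
sample = "<think>17 has no divisors besides 1 and itself.</think>\nYes, 17 is prime."
reasoning, answer = split_reasoning(sample)
print(answer)
```

Only `answer` would be shown to the user, while `reasoning` can be logged or discarded.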
The model is particularly good at mathematics, logic, and programming. It also follows structured formats like JSON better than before and can handle tool calls. This makes it useful both for developers and for more complex tasks where the answers need to be precise.
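Even when a model follows JSON well, calling code usually still checks the reply before acting on it. The sketch below shows one way to do that for a tool call. The reply layout (a JSON object with `name` and `arguments`), the helper `parse_tool_call`, and the `get_weather` tool are assumptions made for illustration, not a description of the exact format Hermes 4 emits.

```python
import json


def parse_tool_call(reply: str, allowed_tools: set[str]) -> dict:
    """Parse a JSON tool call and check it against the allowed tools.

    Assumed layout: {"name": <tool name>, "arguments": {...}}.
    """
    call = json.loads(reply)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    name = call.get("name")
    if name not in allowed_tools:
        raise ValueError(f"unknown tool: {name!r}")
    if not isinstance(call.get("arguments"), dict):
        raise ValueError("'arguments' must be a JSON object")
    return call


# Hand-written sample reply, for illustration only.
reply = '{"name": "get_weather", "arguments": {"city": "Stockholm"}}'
call = parse_tool_call(reply, {"get_weather"})
print(call["name"], call["arguments"])
```

Rejecting unknown tool names up front means a malformed or unexpected reply fails loudly instead of triggering the wrong action.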
In tests (for example RefusalBench), Hermes 4 has shown that it answers more questions than competitors like GPT-4 and Claude Sonnet. When reasoning mode is activated, the response rate is close to 60%, making it more helpful in situations where other models often refuse to answer.
Even though the model is large, a full 70 billion parameters, it is optimized to be both fast and cost-effective. Through the Nous Portal API it costs about 0.70 USD per million tokens, which is cheaper than the older Hermes 3. On a regular personal computer, however, you will run into problems even with a good graphics card.
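To get a feel for what that price means in practice, here is a small back-of-the-envelope calculation. It assumes a single flat rate of 0.70 USD per million tokens, as quoted above. Real pricing may differ between input and output tokens and can change over time, and the monthly volume used here is purely hypothetical.

```python
PRICE_PER_MILLION_USD = 0.70  # figure quoted in the article; treat as approximate


def estimate_cost_usd(total_tokens: int,
                      price_per_million: float = PRICE_PER_MILLION_USD) -> float:
    """Estimate API cost, assuming one flat price for all tokens."""
    return total_tokens / 1_000_000 * price_per_million


monthly_tokens = 25_000_000  # hypothetical usage
print(f"{estimate_cost_usd(monthly_tokens):.2f} USD per month")
```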
Running Hermes 4 70B entirely on the GPU requires a graphics card with about 80 GB of VRAM, for example an NVIDIA A100 or H100. However, it is also possible to use quantized versions that take up less space. In that case, a card in the RTX 4090 class (24 GB), or a system with plenty of RAM, is sufficient if part of the work is offloaded to the CPU. Quantization makes the model more accessible to regular users.
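The memory figures follow from simple arithmetic: the number of parameters times the bytes stored per parameter. The sketch below estimates the weights alone and ignores the extra memory needed for the context cache and activations, so real requirements are somewhat higher. The helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    Ignores the context cache, activations, and runtime overhead.
    """
    return params_billion * bits_per_weight / 8


for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(70, bits):.0f} GB")
```

Under these assumptions, a 70-billion-parameter model needs roughly 140 GB at 16-bit precision, 70 GB at 8-bit, and 35 GB at 4-bit. That is why an 80 GB card corresponds to about 8-bit weights, and why even a 4-bit version does not fit entirely on a 24 GB card without moving some of the work to the CPU and system RAM.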






