Google Gemma 4 released as open source

While the tech world has watched Chinese AI models take over the open source throne one position at a time, Google now responds with its strongest open card yet. Google Gemma 4 is here: it is free, and it can run on your Raspberry Pi. Yes, you read that right.

What exactly is Google Gemma 4?

Unlike Gemini, which is a subscription-based closed product, Gemma is an open model that can be downloaded and run locally completely free of charge. This means you don’t have to send a single byte of data to Google’s servers if you don’t want to.

Gemma 4 is launched in four sizes: Effective 2B, Effective 4B, 26B Mixture of Experts, and 31B Dense. The entire family is built to handle complex logic and agent workflows, not just simple chat.

The models are explicitly designed for use on personal hardware, from smartphones and IoT devices to laptop GPUs and professional developer workstations.

The license is the real news

The most remarkable thing about Google Gemma 4 is not the model itself but how it is licensed.

Google’s previous Gemma versions used a custom license that created legal uncertainty for commercial products. Gemma 4 instead ships under Apache 2.0, which removes that friction completely: developers can modify, redistribute, and commercialize without worrying about Google changing the terms later.

Hugging Face founder Clement Delangue commented on the launch, saying that local AI is having its moment and is the future of the AI industry. Google DeepMind CEO Demis Hassabis went even further, calling the Gemma 4 models the best open models in the world for their respective sizes.

Performance in numbers

Google currently places the 31B model 3rd in Arena AI’s text ranking among all open models globally, and the 26B model 6th. By that measure, according to Google, Gemma 4 surpasses models with up to twenty times more parameters.

Still, the celebration comes with a caveat. A direct comparison with the strongest Chinese open source models, such as those from the DeepSeek family, shows that Gemma 4 does not quite keep up there yet. The Chinese models still set the standard in the heavyweight category.

The model shows strong results in mathematics with 89.2 percent on AIME 2026 for the 31B model, scientific knowledge with 84.3 percent on GPQA Diamond, and competitive coding with 80.0 percent on LiveCodeBench v6.

What it can do besides impressing on paper

All models handle images and videos natively. The smaller E2B and E4B models also support audio input for speech recognition. The edge models offer a context window of 128,000 tokens, while the larger models support up to 256,000 tokens. The model is trained in more than 140 languages.

Google’s LiteRT-LM runtime allows the Effective 2B model to run with under 1.5 gigabytes of memory on compatible devices. On a Raspberry Pi 5, it processes 4,000 input tokens across two distinct tasks in under three seconds. That is quite impressive for a box that costs a few hundred crowns.
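Those figures are easy to sanity-check with back-of-envelope arithmetic. The sketch below assumes the "Effective 2B" model stores roughly 2 billion weights and tries a few quantization widths; the bit widths are illustrative assumptions, not published LiteRT-LM details.

```python
# Rough memory estimate for a ~2B-parameter model at different
# quantization widths. Bit widths here are assumptions for
# illustration, not confirmed LiteRT-LM internals.

def weight_memory_gb(params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in gigabytes (weights only,
    ignoring activations and KV cache)."""
    return params * bits_per_weight / 8 / 1e9

params = 2e9  # "Effective 2B"

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gb(params, bits):.2f} GB")
# 16-bit weights: 4.00 GB
# 8-bit weights: 2.00 GB
# 4-bit weights: 1.00 GB

# Prefill throughput implied by the Raspberry Pi 5 figure:
# 4,000 input tokens in under 3 seconds.
tokens, seconds = 4_000, 3
print(f"~{tokens / seconds:.0f} tokens/s prefill")
# ~1333 tokens/s prefill
```

At 4-bit quantization the weights alone land around 1 GB, which is consistent with the stated under-1.5 GB footprint once runtime overhead like activations and the KV cache is added on top.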

The background no one talks about enough

Over the past year, the open source AI league has largely been a Chinese affair. DeepSeek, Minimax, GLM, and Qwen have dominated the top spots, leaving American alternatives struggling for relevance. Chinese open models went from just under 1.2 percent of global open source usage at the end of 2024 to about 30 percent by the end of 2025.

Meta’s Llama used to be the first choice for developers seeking a capable and locally runnable model. That reputation has eroded, and Llama’s Meta-controlled license has raised questions about its actual open source status.

Google Gemma 4 is therefore not just a new model. It is a response to an ongoing geopolitical tug-of-war over who will own the future of open AI infrastructure.

Where to download and test Google Gemma 4

Google Gemma 4 is now available in Google AI Studio for the 31B and 26B versions, and in Google AI Edge Gallery for E2B and E4B. The model weights are also available on Hugging Face, Kaggle, and Ollama.

Gemma 4 runs on Android and iOS with support for both CPU and GPU, as well as on Windows, Linux, and macOS. Running in the browser via WebGPU is also supported.

For those who want to get started immediately, Google has also launched a new Python package and a command-line tool for experimenting with the models directly.

Sources

https://deepmind.google/models/gemma/gemma-4
