Four Models, Fully Open-Source
Google DeepMind released Gemma 4 on April 2, 2026, marking the most significant update yet to the company's open-weight AI model family. The release includes four variants: E2B (2.3B effective parameters), E4B (4.5B effective parameters), a 26B Mixture-of-Experts model (4B active parameters), and a 31B dense model. All four ship under the Apache 2.0 license, a departure from the more restrictive terms Google has typically attached to its open models.
Multimodal and On-Device Ready
Beyond text, Gemma 4 processes images and audio (on the E2B and E4B variants), supporting variable aspect ratios for visual content, document parsing, chart recognition, and handwriting OCR. Context windows reach 128K tokens on the smaller variants and 256K on the 26B and 31B models, with support for more than 140 languages.
Google positioned Gemma 4 specifically for edge deployment: phones, consumer GPUs such as the RTX series, and single-board devices like the Jetson Nano. The company released accompanying tools, including AI Edge Gallery and LiteRT-LM, to help developers build on-device agents without fine-tuning.
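Whether a given variant fits on a phone or a consumer GPU comes down mostly to weight memory. The sketch below estimates VRAM needs for the four variants at common quantization levels; the parameter counts come from the release details above, but the per-parameter byte costs and the flat 10% overhead factor (for activations and KV cache) are simplifying assumptions, not official figures.

```python
# Rough weight-memory estimate for the Gemma 4 variants.
# Byte costs per parameter and the overhead factor are assumptions.

VARIANTS = {            # parameters that must be resident, in billions
    "E2B": 2.3,         # effective parameters
    "E4B": 4.5,
    "26B MoE": 26.0,    # all experts loaded; only ~4B active per token
    "31B dense": 31.0,
}

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def estimate_gib(params_b: float, quant: str, overhead: float = 1.10) -> float:
    """Weights-only estimate in GiB, padded by a flat overhead factor."""
    return params_b * 1e9 * BYTES_PER_PARAM[quant] * overhead / 2**30

for name, size in VARIANTS.items():
    row = ", ".join(f"{q}: {estimate_gib(size, q):.1f} GiB" for q in BYTES_PER_PARAM)
    print(f"{name:10s} {row}")
```

By this estimate the E2B and E4B variants fit comfortably in phone-class memory at 4-bit, while the 26B MoE still needs all experts resident even though only ~4B parameters are active per token, which is why its memory footprint tracks the full 26B.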
Benchmark Results
The flagship 31B instruction-tuned model ranks #3 on Arena AI's text leaderboard at 1452 Elo, outperforming models twenty times its size. The 26B MoE variant ranks #6 at 1441 Elo. Compared to Gemma 3, the improvements are dramatic: AIME 2026 math benchmark jumps from 20.8% to 89.2%, LiveCodeBench coding from 29.1% to 80.0%, and GPQA science from 42.4% to 84.3%.
The multilingual MMLU benchmark shows 85.2% for the 31B model versus 67.6% for Gemma 3 27B, while the multimodal MMMU Pro benchmark reaches 76.9% compared to 49.7%. These gains suggest significant advances in reasoning and agentic capability: the models now support multi-step planning, function calling, structured JSON output, and native system prompts.
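In practice, function calling and structured JSON output combine into a simple dispatch loop: the model emits a JSON tool call, and the host application parses it and invokes the matching function. The sketch below assumes a `{"name": ..., "arguments": {...}}` reply shape, which is a common convention; Gemma 4's exact wire format is not specified here, so treat this as illustrative rather than the official API.

```python
import json

# Minimal function-calling dispatch loop. The tool registry and the
# assumed reply shape ({"name": ..., "arguments": {...}}) are a common
# convention, not Gemma 4's documented schema.

TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",  # stand-in implementation
}

def dispatch(model_output: str) -> str:
    """Parse a structured-JSON tool call and invoke the matching function."""
    call = json.loads(model_output)
    fn = TOOLS.get(call["name"])
    if fn is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return fn(**call["arguments"])

# Example: a reply the model might produce when asked about the weather.
reply = '{"name": "get_weather", "arguments": {"city": "Berlin"}}'
print(dispatch(reply))  # Sunny in Berlin
```

Structured output makes this loop reliable: because the model commits to valid JSON, the host can parse and validate the call instead of scraping free-form text.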
What This Means for Developers
Gemma 4's Apache 2.0 licensing removes previous restrictions on commercial use and modification. Combined with the on-device optimization, this makes the family viable for enterprises that want local AI deployment without vendor lock-in. The benchmark improvements in math, coding, and reasoning position the 31B model as a capable alternative to larger closed models for many practical applications.