Google Gemma 4 Open-Weight Models: Apache 2.0 and Enhanced [[concepts/capabilities|AI Capabilities]]

Clip title: Gemma 4 Has Landed!
Author / channel: Sam Witteveen
URL: https://www.youtube.com/watch?v=5aqF1HVpjdc
Summary
Google has launched Gemma 4, a new suite of open-weight models that significantly advance their Gemma series, primarily by adopting a developer-friendly Apache 2.0 license. This license is a major highlight, allowing users unprecedented freedom to use, modify, distribute, and commercially deploy Google’s best open models without restrictive clauses. Gemma 4 comprises four distinct models with enhanced capabilities across multimodality, thinking (reasoning), native audio processing, and robust function calling. This move is seen as Google’s direct response to previous criticisms regarding restrictive licensing on earlier Gemma versions, aiming to foster broader adoption and innovation within the open-source AI community.
The Gemma 4 models are split into two tiers: “Workstation Models” and “Edge Models.” The Workstation tier includes a 31-billion-parameter fully dense model (31B Dense) and a 26-billion-parameter Mixture-of-Experts model (26B-A4B MoE), in which only about 4 billion parameters are active per token, routed across 128 small experts. These are designed for high-performance inference. The Edge tier features smaller, highly efficient models (E2B and E4B) with roughly 2 billion and 4 billion effective parameters, respectively. These compact models are optimized to run on resource-constrained devices such as phones, Raspberry Pis, and Jetson Nanos, making them well suited to on-device AI assistants and applications.
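To make the dense-versus-MoE trade-off concrete, here is a quick back-of-envelope comparison that uses only the parameter counts quoted above; the 8-bit weight assumption and the resulting memory figures are illustrative, not measured numbers.

```python
# Back-of-envelope comparison of the two Workstation-tier variants.
# Parameter counts come from the video; the precision choice is an assumption.

DENSE_TOTAL_PARAMS = 31e9   # 31B Dense: every parameter is used for every token
MOE_TOTAL_PARAMS   = 26e9   # 26B-A4B MoE: total parameters that must be stored
MOE_ACTIVE_PARAMS  = 4e9    # ...but only ~4B are active per token (128 experts)
BYTES_PER_PARAM    = 1      # assume 8-bit quantized weights (illustrative only)

def weight_gb(n_params: float, bytes_per_param: float = BYTES_PER_PARAM) -> float:
    """Rough weight-memory footprint in GB (ignores KV cache and activations)."""
    return n_params * bytes_per_param / 1e9

print(f"Dense 31B  : ~{weight_gb(DENSE_TOTAL_PARAMS):.0f} GB of weights, "
      f"31B params touched per token")
print(f"MoE 26B-A4B: ~{weight_gb(MOE_TOTAL_PARAMS):.0f} GB of weights, "
      f"but only {MOE_ACTIVE_PARAMS / 1e9:.0f}B params touched per token")

# Per-token compute scales with *active* parameters, so the MoE model needs
# roughly 31/4 ~ 8x less compute per decoded token than the dense model,
# while still requiring memory for all 26B parameters.
print(f"Approx. per-token compute ratio (dense / MoE): "
      f"{DENSE_TOTAL_PARAMS / MOE_ACTIVE_PARAMS:.1f}x")
```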
A key architectural advancement in Gemma 4, drawing from Google’s Gemini 3 research, is the native integration of multimodality and enhanced reasoning. Unlike previous models that often required external tools for capabilities beyond text or text-plus-vision, Gemma 4 natively supports vision, audio, and function calling within a single model family. The new “thinking” capability allows models to perform internal chain-of-thought reasoning before generating an output, significantly improving performance on complex benchmarks and enabling reasoning across modalities, including audio for the first time. The integrated function calling leverages FunctionGemma research, optimizing models for multi-turn agentic workflows and allowing them to maintain context and utilize external tools effectively.
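The multi-turn agentic loop described here can be sketched in a few lines. The snippet below is a minimal, library-free illustration: `call_model`, the JSON tool-call format, and the `get_weather` tool are all placeholders for this sketch, not Gemma 4's actual chat template or function-calling schema.

```python
import json

def get_weather(city: str) -> str:
    """A toy tool the model is allowed to call."""
    return f"22C and sunny in {city}"

TOOLS = {"get_weather": get_weather}

def call_model(messages: list[dict]) -> str:
    """Placeholder for real inference (e.g. a local Gemma 4 Edge model).
    Here we fake one tool call, then a final answer once a tool result exists."""
    if not any(m["role"] == "tool" for m in messages):
        return json.dumps({"tool": "get_weather", "arguments": {"city": "Berlin"}})
    return "It's 22C and sunny in Berlin right now."

def run_agent(user_prompt: str, max_turns: int = 4) -> str:
    """Multi-turn loop: the model may request tools several times before answering."""
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_turns):
        reply = call_model(messages)
        try:
            call = json.loads(reply)      # model asked to use a tool
        except json.JSONDecodeError:
            return reply                  # plain text means a final answer
        result = TOOLS[call["tool"]](**call["arguments"])
        # Feed the tool result back so the model keeps full conversational context.
        messages.append({"role": "assistant", "content": reply})
        messages.append({"role": "tool", "content": result})
    return "Gave up after max_turns."

print(run_agent("What's the weather in Berlin?"))
```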
Specifically, the Edge models (E2B & E4B) boast significantly better native audio support compared to their predecessors. They feature a conformer-layer ASR encoder for improved audio recognition accuracy, built-in speech recognition, and speech-to-translated-text capabilities. The audio encoder is also 50% smaller and offers faster processing, crucial for low-latency edge deployments. For vision, Gemma 4 handles images at their native aspect ratio and various resolutions, supporting interleaved multi-image inputs. This enhances capabilities for Optical Character Recognition (OCR), object recognition, document understanding, and improved video understanding with temporal reasoning. Gemma 4 is available on Hugging Face and Google Cloud, with Cloud Run now supporting NVIDIA RTX Pro 6000 (Blackwell) GPUs for serverless deployment of even the larger models.
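As a rough sketch of how the audio and vision capabilities might be exercised from Python, the snippet below uses standard Hugging Face `transformers` pipelines; the checkpoint id is a hypothetical placeholder, and whether the released checkpoints expose these exact pipeline tasks is an assumption, so check the actual model cards on Hugging Face before relying on it.

```python
# Sketch of exercising Gemma 4 Edge audio + vision via Hugging Face transformers.
# The checkpoint id is a placeholder and the pipeline tasks are assumptions --
# consult the real model cards on the Hub for supported tasks and input formats.
from transformers import pipeline

GEMMA4_EDGE = "google/gemma-4-e2b"  # hypothetical id, for illustration only

# 1) Speech recognition (the Edge models ship a conformer-layer ASR encoder).
asr = pipeline("automatic-speech-recognition", model=GEMMA4_EDGE)
print(asr("meeting_clip.wav")["text"])

# 2) Interleaved image + text prompt (native aspect ratio, OCR / document QA).
vlm = pipeline("image-text-to-text", model=GEMMA4_EDGE)
messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "invoice_page.png"},
        {"type": "text", "text": "Extract the total amount due from this invoice."},
    ],
}]
result = vlm(text=messages, max_new_tokens=64, return_full_text=False)
print(result[0]["generated_text"])
```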
Related Concepts
- Open-weight models — Wikipedia
- Apache 2.0 license — Wikipedia
- Native audio processing — Wikipedia
- Function calling — Wikipedia
- Multimodality — Wikipedia
- Chain-of-thought reasoning — Wikipedia
- Mixture-of-Experts (MoE) — Wikipedia
- Dense architecture — Wikipedia
- Agentic workflows — Wikipedia
- ASR (Automatic Speech Recognition) — Wikipedia
- Conformer-layer architecture — Wikipedia
- Temporal reasoning — Wikipedia
- OCR (Optical Character Recognition) — Wikipedia
- Serverless deployment — Wikipedia
- On-device AI — Wikipedia
- Edge computing — Wikipedia
- Multimodal inference — Wikipedia
- Speech-to-text — Wikipedia
- Computer vision — Wikipedia
- FunctionGemma — Wikipedia