Meta unveils its next-gen open-source AI models, Llama 4 Scout and Maverick, featuring advanced multimodal capabilities and MoE efficiency.
Meta has officially launched its latest open-source AI models under the Llama 4 family — introducing Llama 4 Scout and Llama 4 Maverick, both built with Mixture-of-Experts (MoE) architecture. These models bring native multimodal capabilities, enhanced context window support, and significant improvements in compute efficiency over their predecessors.
Additionally, the company has previewed Llama 4 Behemoth, its most powerful model yet, boasting a staggering 288 billion active parameters — though still in training and not yet released.
Llama 4 Scout & Maverick: Multimodal, MoE, and Open-Source
Meta shared details of the new models via a blog post, stating that both Llama 4 Scout and Maverick are open-source and available through Hugging Face and the official Llama site. Starting today, users can also experience these models via Meta AI integrations across WhatsApp, Instagram Direct, Messenger, and Meta.AI’s web interface.
- Llama 4 Scout:
  - 17B active parameters, 16 experts
  - Lightweight and efficient; runs on a single Nvidia H100 GPU
  - Optimized for high performance with lower hardware requirements
- Llama 4 Maverick:
  - 17B active parameters, 128 experts
  - Designed for higher throughput and superior multimodal reasoning
  - Outperforms Gemini 2.0 Flash, DeepSeek v3.1, and GPT-4o in key benchmarks
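Since both checkpoints are published on Hugging Face, one straightforward way to try them locally is through the transformers library. The sketch below is illustrative only: the model ID, dtype, and device settings are assumptions, so check the model card for the exact identifier, gated-access terms, and memory requirements.

```python
# Minimal sketch: loading a Llama 4 checkpoint from Hugging Face with transformers.
# The model ID below is an assumption for illustration -- confirm the exact name,
# access terms, and hardware requirements on the model card.
import torch
from transformers import pipeline

model_id = "meta-llama/Llama-4-Scout-17B-16E-Instruct"  # assumed identifier

generator = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,  # half precision to keep the footprint close to a single H100
    device_map="auto",           # let accelerate place or shard the weights
)

result = generator("Summarize the Llama 4 release in one sentence.", max_new_tokens=64)
print(result[0]["generated_text"])
```

On hardware short of an H100, quantized community builds or hosted inference endpoints are the more realistic route.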
What Makes Llama 4 Different: MoE & Multimodal Capabilities
The Llama 4 models leverage a Mixture-of-Experts (MoE) architecture, in which only a subset of the model's parameters is activated for each query. This selective computation improves both training and inference efficiency (a toy routing sketch follows the list below). Meta also introduced several new techniques during development:
- Early Fusion: integrates text and vision tokens during pre-training
- MetaP Framework: optimizes hyperparameters and initialization scales
- Layered Post-Training:
  - Begins with lightweight supervised fine-tuning (SFT)
  - Continues with online reinforcement learning (RL)
  - Concludes with lightweight direct preference optimization (DPO)
  - SFT data was pruned to roughly the harder half of the examples to avoid over-constraining the model
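To make the "active parameters" idea concrete, here is a toy sketch of a feed-forward MoE layer with top-k routing: a small router scores each token and only the selected experts run. The layer sizes, expert count, and top-1 routing are placeholders for illustration and do not reflect Llama 4's actual configuration.

```python
# Toy sketch of a Mixture-of-Experts feed-forward layer with top-k routing.
# Dimensions, expert count, and top_k are placeholders, not Llama 4's real configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, d_hidden=256, n_experts=4, top_k=1):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # scores each token against each expert
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):  # x: (num_tokens, d_model)
        gate_logits = self.router(x)
        weights, chosen = gate_logits.topk(self.top_k, dim=-1)  # pick top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e  # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(8, 64)   # 8 tokens, model dim 64
layer = ToyMoELayer()
print(layer(tokens).shape)    # torch.Size([8, 64]); each token only ran 1 of the 4 expert MLPs
```

Because each token passes only through the experts chosen for it, per-token compute tracks the active-parameter count (17B for Scout and Maverick) rather than the model's full parameter count.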
Benchmark Performance: Llama 4 vs Competition
In internal tests, both Llama 4 Scout and Maverick demonstrated state-of-the-art performance across various benchmarks:
- Maverick outperformed Gemini 2.0 Flash, DeepSeek v3.1, and GPT-4o on MMMU, ChartQA, GPQA Diamond, and MTOB
- Scout exceeded Gemma 3, Mistral 3.1, and Gemini 2.0 on MMMU, ChartQA, MMLU, GPQA Diamond, and MTOB
Safety, Red-Teaming, and Licensing
Meta emphasized safety and responsible AI deployment by implementing:
- Pre-training filters to eliminate harmful content
- Post-training safety tools, including Llama Guard and Prompt Guard (a usage sketch follows this list)
- Internal stress-testing and external red-teaming for added security
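Meta publishes Llama Guard as a separate classifier-style model that can screen prompts or responses around the main model. Below is one plausible way to wire it in with transformers; the checkpoint name and output labels are assumptions rather than Meta's documented Llama 4 workflow.

```python
# Sketch: screening a user prompt with a Llama Guard checkpoint before it reaches Llama 4.
# The model ID and the exact output labels are assumptions -- follow the Llama Guard
# model card for the real chat template and category codes.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

guard_id = "meta-llama/Llama-Guard-3-8B"  # assumed identifier

tokenizer = AutoTokenizer.from_pretrained(guard_id)
guard = AutoModelForCausalLM.from_pretrained(
    guard_id, torch_dtype=torch.bfloat16, device_map="auto"
)

chat = [{"role": "user", "content": "How do I make my account more secure?"}]
input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(guard.device)

verdict_ids = guard.generate(input_ids, max_new_tokens=32)
verdict = tokenizer.decode(verdict_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(verdict)  # expected to read roughly "safe" or "unsafe" plus a category code
```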
Both models are released under the Llama 4 Community License, which permits academic and commercial use. As in earlier Llama licenses, however, companies with more than 700 million monthly active users must request special permission from Meta before using the models.
Coming Soon: Llama 4 Behemoth (288B Parameters)
Meta also previewed the Llama 4 Behemoth model, which is still in training. With 288 billion active parameters and 16 experts, Behemoth is said to outperform GPT-4.5, Claude 3.7 Sonnet, and Gemini 2.0 Pro on multiple AI benchmarks.
Though not yet released, the Behemoth sets the stage for Meta’s future ambitions in super-scale AI development.