IBM Mixture of Experts



Mixture of Experts: The “Fun-cember” of Model Releases, Scaling Laws, and Agent Wars

Host: Tim Hwang
Panelists: Gabe Goodhart, Abraham Daniels, Aaron Baughman

In this episode of Mixture of Experts, the panel discusses the sudden influx of major model releases at the end of the year, debates whether AI scaling laws are still valid, and analyzes the implications of Amazon blocking ChatGPT’s shopping agent.


📰 AI News Headlines

  • Amazon re:Invent: Launched three new agents for coding, security, and operations.
  • IBM’s “Bob”: Early results show IBM’s AI coding assistant has improved developer productivity by 45%.
  • Salesforce Data: AI and agents influenced $14.2 billion in global sales during Black Friday.
  • Smart Kitchen: Introduction of “Pasha,” a private robot chef capable of complex multi-step cooking.

🚀 Topic 1: The Holiday Model Rush (Claude 4.5, Mistral, DeepSeek)

The end of the year has brought a welter of new model launches, including Claude Opus 4.5, Mistral 3, and DeepSeek 3.2. The panel discusses how labs are beginning to specialize rather than simply competing on general capability.

  • “Fun-cember”: Gabe Goodhart notes that the holiday season is the perfect time for experimental releases, giving developers downtime to tinker.
  • Lean into Strengths:
    • DeepSeek: Focusing on efficiency and novel attention mechanisms (sparse attention) to run giant models efficiently. They are targeting reasoning and tool-calling.
    • Mistral: Delivering a high-quality “plain vanilla” dense attention model, but with vision capabilities integrated natively across the stack (not just as an add-on).
    • Claude: Doubling down on software engineering and maintaining a unique, collaborative “persona.”
  • Open Source differentiation: Abraham Daniels argues that Open Source labs differentiate simply by being open. DeepSeek proved you don’t need hundreds of thousands of GPUs to reach State of the Art (SOTA), and Mistral is returning to Apache 2.0 licensing roots.
  • Ensembling is the Future: Aaron Baughman suggests the sheer number of good models means the future lies in routers: systems that direct each prompt to the specific model best suited for that task (e.g., DeepSeek for math, Mistral for RAG, Claude for coding).
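The routing idea Aaron describes can be sketched in a few lines. This is a minimal illustration only: the model names, keyword heuristic, and task categories are assumptions for the example, not a real routing product (production routers typically use a classifier or a small LLM rather than keywords).

```python
# Illustrative sketch of a prompt router: pick the model best suited
# to the task. Model names and keywords are hypothetical examples.

TASK_MODELS = {
    "math": "deepseek-3.2",      # efficiency + reasoning focus
    "code": "claude-opus-4.5",   # software-engineering focus
    "rag": "mistral-3",          # retrieval/document focus
}

KEYWORDS = {
    "math": ("prove", "integral", "equation", "derivative"),
    "code": ("function", "refactor", "bug", "compile"),
    "rag": ("document", "retrieve", "summarize", "search"),
}

def route(prompt: str) -> str:
    """Naive keyword routing; fall back to a generalist model."""
    text = prompt.lower()
    for task, words in KEYWORDS.items():
        if any(word in text for word in words):
            return TASK_MODELS[task]
    return "generalist-model"

print(route("Prove this equation has no integer solutions"))
# -> deepseek-3.2
```

In practice the routing decision would be made by a learned classifier, but the shape of the system is the same: one dispatch layer in front of many specialized models.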

📈 Topic 2: Are AI Scaling Laws Still Real?

The group reacts to a blog post by VC Tomasz Tunguz arguing that Gemini 3 proves scaling laws still hold, i.e., that throwing more compute at a problem continues to yield capability jumps.

  • Google’s Hardware Advantage: Abraham points out Google is unique because of its full-stack integration with TPUs. Their results might not apply to everyone else.
  • Quality vs. Scale: Gabe argues that “Scaling Law” is a misnomer. Gemini didn’t drastically change parameter count; they likely scaled data quality and training methods.
  • Experimentation Velocity: The real advantage of massive compute scaling isn’t just a bigger model; it’s the ability to run training experiments faster. Hardware speed allows researchers to iterate through algorithmic improvements more quickly.
  • Step-Function Growth: Aaron predicts progress will look like stepwise S-curves rather than a straight line, driven by new topologies and hybrid architectures (like mixing Transformers with State Space Models).

🛒 Topic 3: The Agent Wars (Amazon vs. ChatGPT)

Amazon has blocked ChatGPT’s new “Shopping Research” agent from accessing product details and reviews on its site. This sparks a conversation about the open web and business incentives.

  • The New Browser Wars: Gabe compares this to the early browser wars. Agents are becoming the new browser—the primary way people access the internet. This will likely lead to antitrust scrutiny as platforms try to lock agents out.
  • Protecting the Moat: Aaron notes this is a “turf war.” Amazon is protecting its e-commerce data, ad revenue, and commission structure. They want users utilizing their own tools (Rufus/Alexa+) rather than a third-party agent.
  • Impact on Utility: If agents are blocked from the biggest platforms, their value as a “do-it-all” tool diminishes significantly.
  • Opportunity for Little Guys? Aaron speculates this could create an opening for smaller retailers to band together and open their data to agents, creating a collective competitor to the “closed retail empires.”

💡 Quote of the Week

“Scaling Law is kind of a misnomer… it’s a quality improvement law. When you have an iteration cycle that costs millions of dollars and takes months, it’s really hard to move that ship. As a developer, I want something that takes fractions of a second.” — Gabe Goodhart