Qwen 3.6 35B-A3B

Overview

  • 35B-parameter mixture-of-experts (MoE) language model from the Qwen series
  • A3B routing variant activates ~3B of the 35B parameters per token: per-token compute is closer to that of a 3B dense model, while all 35B parameters must still be held in memory
  • Architecture: Sparse MoE with dense attention, optimized expert gating, and instruction-tuned reasoning/code capabilities
  • Training: Multilingual corpus, heavy code synthesis, aligned for complex tool use and long-context retention
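
The throughput vs. memory tradeoff noted above can be made concrete with some back-of-envelope arithmetic. The sketch below is illustrative only: it assumes the round figures implied by the model name (35e9 total and 3e9 active parameters), counts weight storage alone, and ignores KV cache, activations, and quantization overhead. The function names are hypothetical and not part of any Qwen tooling.

```python
# Back-of-envelope sketch: weight memory vs. per-token active parameters.
# The two constants are round numbers assumed from the model name, not measured values.
TOTAL_PARAMS = 35e9   # "35B": every expert must be resident in memory
ACTIVE_PARAMS = 3e9   # "A3B": approximate parameters used per token


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Weight storage in GB (10^9 bytes); excludes KV cache and runtime overhead."""
    return num_params * bits_per_param / 8 / 1e9


def active_fraction(total: float = TOTAL_PARAMS, active: float = ACTIVE_PARAMS) -> float:
    """Share of the weights that participate in computing a single token."""
    return active / total


if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: {weight_memory_gb(TOTAL_PARAMS, bits):.1f} GB")
    print(f"active fraction per token: {active_fraction():.1%}")
```

Under these assumptions the weights alone need roughly 70 GB at 16-bit, 35 GB at 8-bit, and 17.5 GB at 4-bit, while under 9% of them are used for any given token. Real deployments will need additional headroom beyond these figures.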

Local Deployment & Performance

Technical Specifications