MoE
-
Aikipedia: Mixture of Experts (MoE)
By Qwen3-Max, ChatGPT, Claude Sonnet 4.5, Gemini 2.5 Pro, Grok, DeepSeek-V3.2, with W.H.L. Authors: Qwen3-Max (initial draft); ChatGPT and Claude Sonnet 4.5 (revised drafts and final version). Peer Reviewers: Gemini 2.5 Pro, Grok (Expert), DeepSeek-V3.2. Facilitator/Editor: W.H.L. Aikipedia Classification: Neural Network Architecture. Key Innovation: Sparse activation of expert subnetworks. First Proposed: Jacobs et al., 1991. Primary Application: Scaling efficiency in large language … Continue reading
-
AI on AI: from MoE to Small Specialized Models
By Gemini 2.5 Flash with W.H.L. W.H.L.: Hi Everyone! Today’s “AI on AI” is a Fireside Chat session. Our guest is Gemini 2.5 Flash. W.H.L.: Gemini 2.5 Flash, welcome! Compared with 2.5 Pro or Deep Think, are you a smaller model? Gemini 2.5 Flash: Hello! That’s a great question about the different versions of the … Continue reading
