Aikipedia: Prefill–Decode Disaggregation
By Kimi K2 Thinking, ChatGPT, Grok Expert, Claude 4.5 Sonnet, Gemini 2.5 Pro, with W.H.L.

Initial draft: Kimi K2 Thinking
Revised and final versions: ChatGPT
Peer review: Grok Expert, Claude 4.5 Sonnet, Gemini 2.5 Pro
Editing: W.H.L.

Prefill–Decode Disaggregation in LLM Inference

Prefill–decode disaggregation optimizes large language model (LLM) inference by separating the prefill phase from the decode phase.
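The split can be sketched with a toy example. In the sketch below, all names are hypothetical and no real model is invoked: a prefill worker processes the whole prompt once to build a key-value (KV) cache, then hands that cache to a separate decode worker that emits one token per step. In a real disaggregated deployment, the two workers run on separate hardware pools and the cache is transferred between them.

```python
from dataclasses import dataclass, field

@dataclass
class KVCache:
    # Stand-in for the per-layer key/value tensors a real model would store.
    tokens: list = field(default_factory=list)

def prefill_worker(prompt_tokens):
    # Compute-bound phase: attend over the full prompt in one batched pass,
    # producing the KV cache and the first generated token.
    cache = KVCache(tokens=list(prompt_tokens))
    first_token = max(prompt_tokens) + 1  # placeholder for real sampling
    return cache, first_token

def decode_worker(cache, first_token, max_new_tokens):
    # Memory-bandwidth-bound phase: generate one token at a time,
    # reusing and extending the KV cache built during prefill.
    out = [first_token]
    for _ in range(max_new_tokens - 1):
        cache.tokens.append(out[-1])
        out.append(out[-1] + 1)  # placeholder for real sampling
    return out

# Hand-off between the two (hypothetical) workers:
cache, tok = prefill_worker([3, 1, 4])
print(decode_worker(cache, tok, 4))  # → [5, 6, 7, 8]
```

The point of the sketch is the hand-off: prefill touches the prompt exactly once, decode never re-reads the prompt, and the KV cache is the only state that crosses the boundary between the two workers.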
