Onepager
Parallel Training, Low-Rank Structure, and Positional Encoding
This page keeps the onepager reading experience, but its body content is maintained in three separate Markdown sources. The sections appear in a fixed order: DeepSeekMoE and V3 parallel training, LoRA and MLA, and positional encoding.