Lifan Sun's blog

Welcome to my blog.

Reading Notes: “Training Compute-Optimal Large Language Models”

Mar 1, 2025

Reading Notes: “GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints”

Feb 28, 2025

Reading Notes: GPT Series

Feb 27, 2025

Reading Notes: “Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning”

Feb 23, 2025

Reading Notes: “GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism”

Feb 22, 2025

Distributed Training Basics

Feb 15, 2025

Reading Note: Megatron-LM v1

Feb 15, 2025


© Lifan Sun 2023 - 2025