Y Combinator Cast
02/06/25
@ Y Combinator
A crucial enhancement in DeepSeek's models is the FP8 accumulation fix, which keeps small numerical errors from compounding during calculations and makes low-precision training across thousands of GPUs more efficient. DeepSeek also had to optimize GPU usage because of hardware constraints and export controls on GPU sales to China: its GPUs were often idle, achieving only about 35% utilization.
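To make the accumulation problem concrete, here is a minimal sketch of how rounding errors compound in a low-precision running sum, and how periodically promoting partial sums to a wider accumulator limits the damage. It is an illustration under stated assumptions, not DeepSeek's implementation: NumPy has no FP8 type, so `float16` stands in for the low-precision format, and the function names and the chunk size of 128 are hypothetical choices made for this example.

```python
import numpy as np


def naive_sum(values):
    """Accumulate every element in float16 (stand-in for a low-precision accumulator)."""
    acc = np.float16(0.0)
    for v in values:
        acc = np.float16(acc + v)  # result is rounded to float16 after every add
    return float(acc)


def chunked_sum(values, chunk=128):
    """Accumulate short chunks in float16, then promote each partial sum to float32.

    The chunk size is an illustrative assumption, not a value taken from the source.
    """
    total = np.float32(0.0)
    for start in range(0, len(values), chunk):
        partial = np.float16(0.0)
        for v in values[start:start + chunk]:
            partial = np.float16(partial + v)
        total = np.float32(total + np.float32(partial))  # wide accumulator
    return float(total)


# Many small, equal contributions: the case where a narrow accumulator struggles.
values = np.full(10240, 0.01, dtype=np.float16)
exact = float(values.astype(np.float64).sum())  # high-precision reference

naive = naive_sum(values)
chunked = chunked_sum(values)

print(f"reference: {exact:.3f}")
print(f"naive float16 accumulation: {naive:.3f}")
print(f"chunked with float32 promotion: {chunked:.3f}")
```

In the naive version the running sum eventually grows so large that each new small value rounds away to nothing, so the total stalls well below the reference. The chunked version keeps each low-precision partial sum small, which is the general idea behind an accumulation fix of this kind.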