Research indicated that previous models like GPT-3 were undertrained: despite their size, they hadn't been trained on enough text to reach their full potential. DeepMind's Chinchilla demonstrated this with a 70-billion-parameter model that outperformed much larger models such as GPT-3 (175 billion parameters) and Gopher (280 billion parameters) by being trained on roughly four times more data, about 1.4 trillion tokens versus their roughly 300 billion.
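A minimal sketch of the underlying arithmetic, assuming the widely cited Chinchilla rule of thumb of roughly 20 training tokens per parameter and the standard training-compute estimate C ≈ 6·N·D FLOPs (the model sizes and token counts below are approximate public figures used only for illustration):

```python
def chinchilla_optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Approximate compute-optimal token count under the ~20 tokens/parameter heuristic."""
    return n_params * tokens_per_param

def training_flops(n_params: float, n_tokens: float) -> float:
    """Common rough estimate of training compute: C ~ 6 * N * D FLOPs."""
    return 6 * n_params * n_tokens

# name: (parameters, tokens actually trained on) -- approximate public figures
models = {
    "GPT-3":      (175e9, 300e9),
    "Gopher":     (280e9, 300e9),
    "Chinchilla": (70e9,  1.4e12),
}

for name, (n, d) in models.items():
    optimal = chinchilla_optimal_tokens(n)
    print(f"{name:11s} trained on {d/1e9:6.0f}B tokens; "
          f"rule of thumb suggests ~{optimal/1e9:5.0f}B "
          f"({d/optimal:.2f}x of optimal); compute ~{training_flops(n, d):.1e} FLOPs")
```

Running this shows GPT-3 at under a tenth of its heuristically optimal token count (undertrained), while Chinchilla sits almost exactly at its optimum despite using a similar overall compute budget to Gopher.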