Julio's dev
📄 Paper

[Review] DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining ('23.05)