🔴 💻 **Will we run out of data? Limits of LLM scaling based on human-generated data**
_“Our findings indicate that if current LLM development trends continue, models will be trained on datasets roughly equal in size to the available stock of public human text data between 2026 and 2032, or slightly earlier if models are overtrained.”_
Villalobos, P. et al. (2022) Will we run out of data? Limits of LLM scaling based on human-generated data. https://arxiv.org/abs/2211.04325v2.
#AI #ArtificialIntelligence #LLM #LLMS #Data #ComputerScience #MachineLearning @ai