If an LLM is going to run on your phone, the model is going to have to be small.
> Paradoxically, smaller models require more training to reach the same level of performance. So the downward pressure on model size is putting upward pressure on training compute.
"AI scaling myths" | Arvind Narayanan + Sayish Kapoor | AI Snake Oil | 2024-06-27 https://www.aisnakeoil.com/p/ai-scaling-myths