The excitement about DeepSeek R1 is palpable, and for many of the right reasons. It is a great model, it is refreshing to get access to such a powerful model on local compute, and it seems to imply that the road ahead towards AGI and beyond is far less resource-constrained than we had thought. I have already written a whole blog post about my first impressions, but after a week of thinking about it I want to emphasize a few more points.
1. Everyone in the AI community has always believed that the path ahead will be a combination of algorithmic improvements AND increased computational capacity. The latter has been the dominant driving force for the past couple of years, but now we are getting major algorithmic improvements within the span of a few months - first test-time (inference) compute, and now better training from scratch on existing and synthetic data.
2. That last point is actually a really, really big deal. There has been a consensus that we had hit a wall with AI training because we had already used up ALL of the human-generated data, but a new path now seems to have opened up, with no hard upper bound in sight. We will increasingly spend compute on creating even more data. Scaling laws are now fully back - with a vengeance (I sketch the basic math below).
3. One of the most deeply rooted and intractable human cognitive biases is the belief that making something more accessible will lead to better and more egalitarian outcomes. The reality is quite the opposite: lowering barriers to entry produces much more skewed, heavier-tailed power-law distributions of outcomes (see the back-of-the-envelope sketch below). This is actually a very, very fundamental scientific principle. In physics we refer to it as a "scaling law", not to be confused with the AI scaling laws. I am planning to write a much longer and more in-depth post about this at some point.
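To put the scaling-law point in (2) on slightly firmer footing, here is one common formulation, the Chinchilla-style loss scaling law - used purely as an illustration, not as a claim about the exact law governing R1-style training. It models loss as a function of parameter count N and training-token count D:

$$L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}$$

The point is that D appears explicitly: as long as we can keep growing the number of useful training tokens - now including synthetic ones - the data term keeps shrinking, and "we have run out of human data" stops being a hard ceiling.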
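And a back-of-the-envelope sketch of the "power-law outcomes" claim in (3), assuming a simple Pareto-type distribution - an illustration of the general principle, not a precise model of any particular market:

$$P(X > x) \propto x^{-\alpha}, \qquad \text{share of the top fraction } p = p^{\,1 - 1/\alpha} \quad (\alpha > 1)$$

Dropping the exponent from 1.5 to 1.2 (a heavier tail) lifts the share captured by the top 1% from roughly 22% to roughly 46%: a small change in the exponent, a dramatic change in how concentrated the outcomes are.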