There's a lot more to a model than just benchmarks.
Explore Andrej Karpathy’s Autoresearch project, how it automates model experiments on a single GPU, why program.md matters, ...