2012
AlexNet — the deep learning revolution
AlexNet by Krizhevsky, Sutskever, and Hinton wins the ImageNet competition by an unprecedented margin and ushers in the deep learning revolution.
The moment deep learning took over
In September 2012, a University of Toronto team of Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton submitted AlexNet to the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). The result was striking: AlexNet achieved a top-5 error rate of 15.3%, compared with 26.2% for the second-place entry. That gap of 10.9 percentage points was unprecedented in the competition's history, and much of the AI research community quickly came to see it as a turning point.
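The top-5 error rate quoted above counts an image as correctly classified when the true label appears anywhere among the model's five highest-scoring classes. The following is a minimal sketch of that metric, not the official challenge evaluation code; the function name `top5_error` and the toy scores are hypothetical and chosen only for illustration.

```python
def top5_error(scores_per_image, true_labels, k=5):
    """Fraction of images whose true label is NOT among the k highest-scoring classes.

    scores_per_image: list of per-class score lists, one list per image.
    true_labels: list of integer class indices, one per image.
    """
    misses = 0
    for scores, label in zip(scores_per_image, true_labels):
        # Class indices sorted from highest to lowest score, truncated to k.
        top_k = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(true_labels)


# Toy example with 6 classes (hypothetical scores, not ImageNet data).
toy_scores = [
    [0.9, 0.8, 0.7, 0.6, 0.5, 0.1],  # class 5 has the lowest score
    [0.9, 0.8, 0.7, 0.6, 0.5, 0.1],
]
toy_labels = [0, 5]  # first image is a top-5 hit, second is a miss
toy_error = top5_error(toy_scores, toy_labels)

# The winning margin reported in the text, in percentage points.
margin_points = round(26.2 - 15.3, 1)
```

On the toy data, one of the two images is a miss, so the error is one half. The same function applied to a model's scores over a full test set would give the kind of percentage reported in the competition results.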
What made AlexNet different
AlexNet was a deep convolutional neural network with eight learned layers: five convolutional and three fully connected. Three design choices made it practical. It was trained on GPUs (two NVIDIA GTX 580s), which made training a network of this size tractable. It used ReLU activation functions instead of tanh or sigmoid, which trained much faster. And it used dropout regularization to reduce overfitting. None of these ideas was new in isolation, but AlexNet combined them at a scale that produced a qualitative leap in performance.
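Two of these ingredients, ReLU and dropout, are simple enough to sketch in a few lines of plain Python. The code below is an illustrative sketch, not AlexNet's implementation: the function names are hypothetical, and the dropout shown is the "inverted" variant common in modern libraries (survivors are rescaled during training), which is an assumption made here for simplicity rather than the exact procedure described in the paper.

```python
import math
import random


def relu(x):
    """Rectified linear unit: passes positive inputs through, zeroes the rest."""
    return max(0.0, x)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def tanh_grad(x):
    """Derivative of tanh: shrinks toward 0 as |x| grows (saturation)."""
    return 1.0 - math.tanh(x) ** 2


def dropout(activations, p_drop=0.5, training=True, rng=random):
    """Inverted dropout (assumed variant, see lead-in).

    During training each activation is zeroed with probability p_drop and
    the survivors are scaled by 1 / (1 - p_drop) so the expected value is
    unchanged. At inference time the activations pass through untouched.
    """
    if not training:
        return list(activations)
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# For a large positive input, tanh's gradient has almost vanished
# while ReLU's gradient is still 1.
grad_relu_at_5 = relu_grad(5.0)
grad_tanh_at_5 = tanh_grad(5.0)

# Apply dropout to a layer of 1000 unit activations with a seeded generator.
dropped = dropout([1.0] * 1000, p_drop=0.5, rng=random.Random(0))
```

The gradient comparison hints at why ReLU networks train faster: a saturated tanh unit passes back almost no gradient, while an active ReLU unit passes it back unchanged. With dropout, roughly half of the activations are zeroed on each training pass, so the network cannot rely on any single unit being present.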
Ripple effects
AlexNet triggered a cascade of consequences. Within roughly a year, major technology companies including Google, Facebook, Microsoft, and Baidu had begun large investments in deep learning. The 2013 and 2014 ImageNet winners were also deep convolutional networks, and by 2014 the leading entries were substantially deeper than AlexNet. In 2015, ResNet achieved a 3.57% top-5 error rate, below the estimated human error of approximately 5%. The paper describing AlexNet has been cited more than 100,000 times and is one of the most influential in computer science history. It vindicated Hinton's long-held conviction that deep networks trained on large datasets could outperform the competing approaches of the day.
Sources
- Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems 25 (NIPS 2012, now NeurIPS).
- Wikipedia — AlexNet