End to End Learning for Self-Driving Cars

Bojarski et al. (NVIDIA) (2016)

Why It Matters

A CNN maps raw camera pixels directly to steering commands. The paper demonstrated that end-to-end deep learning is viable for autonomous navigation from visual input.

A convolutional network can learn to map raw camera pixels directly to steering commands without an explicit hand-engineered perception stack.
End-to-end driving is feasible when large quantities of human driving data provide the supervisory signal.
The result is impressive, but it depends heavily on how well the dataset covers the driving conditions, on whether the system learns to recover from its own mistakes, and on how distribution shift is handled.
The paper matters because it challenged the assumption that autonomous driving must always be decomposed into many explicit modules.
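
The supervised setup described above (camera frames in, human steering as the label) can be illustrated with a toy sketch. This is not the paper's network or data: a linear model stands in for the CNN, each "frame" is a hypothetical 9-pixel row with one bright pixel marking the lane center, and the names `WIDTH`, `make_frame`, and `train` are invented for illustration. The convention that positive steering means "right" is also an assumption.

```python
import random

WIDTH = 9  # hypothetical width of a 1-D "camera row" (assumption, not from the paper)


def make_frame(rng):
    """Return (pixels, steering) for one synthetic frame.

    A single bright pixel marks the lane center. The stand-in "human"
    label steers toward it, scaled to [-1, 1] (positive = right, by assumption).
    """
    pos = rng.randrange(WIDTH)
    pixels = [1.0 if i == pos else 0.0 for i in range(WIDTH)]
    center = (WIDTH - 1) / 2
    steering = (pos - center) / center
    return pixels, steering


def predict(weights, bias, pixels):
    """Linear stand-in for the CNN: raw pixels in, one steering value out."""
    return sum(w * x for w, x in zip(weights, pixels)) + bias


def mean_squared_error(weights, bias, data):
    """Average squared gap between predicted and recorded steering."""
    return sum((predict(weights, bias, x) - y) ** 2 for x, y in data) / len(data)


def train(data, epochs=200, lr=0.1):
    """Fit the model to the recorded steering with plain SGD on squared error."""
    weights = [0.0] * WIDTH
    bias = 0.0
    for _ in range(epochs):
        for pixels, target in data:
            err = predict(weights, bias, pixels) - target
            for i, x in enumerate(pixels):
                weights[i] -= lr * err * x
            bias -= lr * err
    return weights, bias


rng = random.Random(0)
data = [make_frame(rng) for _ in range(200)]

initial_loss = mean_squared_error([0.0] * WIDTH, 0.0, data)
weights, bias = train(data)
final_loss = mean_squared_error(weights, bias, data)

# Frames with the lane marker at the far left and far right of the row.
left_frame = [1.0] + [0.0] * (WIDTH - 1)
right_frame = [0.0] * (WIDTH - 1) + [1.0]
left_steer = predict(weights, bias, left_frame)
right_steer = predict(weights, bias, right_frame)
```

Nothing in the sketch tells the model where the lane is. The only training signal is the recorded steering value, which is the sense in which the approach is end-to-end. The model also never sees a frame outside the positions present in `data`, which is a small-scale version of the dataset coverage concern noted above.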

This is a landmark “can a learned policy replace an engineered pipeline?” paper.
The deeper lesson is not that modular pipelines are dead, but that representation learning can absorb far more structure than earlier systems assumed.