Computer vision is a rapidly growing field, and many high-profile applications have been developed in recent years. Its uses are diverse, and cutting-edge research in the field has already been put to work in self-driving cars, facial recognition, 3D reconstruction, photo search and augmented reality. Artificial intelligence has become a fundamental component of everyday technology, and visual recognition is a key part of it. Visual recognition is a valuable tool for interpreting the wealth of visual data that surrounds us, at a scale impossible with natural vision.
This course covers the tasks and systems at the core of visual recognition, with a detailed exploration of deep learning architectures. While there will be a brief introduction to computer vision and to frameworks such as Caffe, Torch, Theano and TensorFlow, the focus will be on learning end-to-end models, particularly for image classification. Students will learn to implement, train and debug their own neural networks and will gain a detailed understanding of cutting-edge research in computer vision.
The final assignment will involve training a multi-million-parameter convolutional neural network and applying it to the largest image classification dataset (ImageNet).
- Justin Johnson, Instructor, Computer Science
- Fei-Fei Li, Assistant Professor, Computer Science
- Serena Yeung, Instructor, Computer Science
- End-to-end models
- Image classification, localization and detection
- Implementation, training and debugging
- Learning algorithms, such as backpropagation
- Long Short-Term Memory (LSTM)
- Recurrent Neural Networks (RNN)
- Supervised and unsupervised learning
3.0 - 4.0 units
Students enrolling under the non-degree option are required to take the course for 4.0 units.
Proficiency in Python; familiarity with C/C++; CS131 and CS229 or equivalents; Math21 or equivalent; linear algebra.