Neural
networks have enjoyed several waves of popularity over the past half
century. Each time they become popular, they promise to provide a
general-purpose artificial intelligence--a computer that can learn to
do any task that you could program it to do. The first wave of
popularity, in the late 1950s, was crushed by theoreticians who proved
serious limitations to the techniques of the time. These limitations
were overcome by advances that allowed neural networks to discover
distributed representations,
leading to another wave of enthusiasm in the late 1980s. The second
wave died out as more elegant, mathematically principled algorithms
were developed (e.g., support-vector machines, Bayesian
models). Around 2010, neural nets had a third resurgence. What
happened over the past 20 years? Basically, computers got much faster
and data sets got much larger, and the algorithms from the 1980s--with
a few critical tweaks and improvements--appear to once again be state
of the art, consistently winning competitions in computer vision,
speech recognition, and natural language processing. Below is a comic
strip from around 1990, when neural nets first reached public awareness. You
might expect to see the same comic today, touting neural nets as the
hot new thing, except that now the field has been rechristened
deep learning
to emphasize the
architecture of neural nets that leads to discovery of task-relevant
representations.
In this course, we'll examine the history of neural networks and
state-of-the-art approaches to deep learning.
Students will learn to design
neural network
architectures and training procedures via hands-on assignments.
Students will read current research articles to appreciate
state-of-the-art approaches as well as to question some of the hype
that comes with the resurgence of popularity. We will use
Geoff
Hinton's Coursera lectures as
background, since nobody in the field
can
explain ideas as well as Geoff, and class time will be devoted to
discussing the lectures and delving into more detail about the methods.
The course is open to any student who has some background in
cognitive science or artificial intelligence and who has taken
introductory probability/statistics and linear algebra.
We will rely primarily on current research articles, which should be
easy to follow after suitable introductory lectures. If you want
additional reading, I recommend the following:
The research articles we'll cover in class are contained in links below
on the class-by-class syllabus.
We will use Piazza for class discussion. Rather than emailing me,
please post your questions on Piazza. The
Piazza signup page is
here.
Once you've signed up, the class page is
here.
Readings
In the style of graduate seminars, I will expect you to have read the
required readings and watched the required videos prior to class. (At
present, we'll do most of the videos in class, but that plan may
change.) Come to class prepared to discuss the material
(asking
clarification questions, working through the math,
relating papers to each other, critiquing the
papers, presenting
original ideas related to the paper).
Homework Assignments
We can all delude ourselves into believing we understand some math or
algorithm just by reading about it, but implementing and experimenting
with the algorithm is both fun and valuable for obtaining a true
understanding. Students will implement small-scale
versions of as many of the models
we discuss as possible. I will give about half a dozen
homework
assignments
that involve implementation over the semester; details are to be
determined. My preference is for you to work in MATLAB, both because
you can leverage existing software and because MATLAB has become the
de facto workhorse in machine learning. One or
more of the assignments may involve writing a commentary on a research
article or presenting the article to the class.
Semester Grades
Semester
grades will be based 20% on class
attendance and participation and 80% on the homework assignments.
I will weight the assignments in proportion to their difficulty, with
each counting for 10-20% of the course grade. Students with
backgrounds in the area and specific
expertise may wish to do in-class presentations for extra credit.
When you see a "<" beside a video, it's a video you should watch
before class. When you see a ">", it's a video you
should
watch after class. Other videos we'll watch in class.