Q&A: The Network Effect

Deep learning might be a booming field these days, but few people remember its time in the intellectual wilderness better than Yann LeCun, director of Facebook Artificial Intelligence Research (FAIR) and a part-time professor at New York University. LeCun developed convolutional neural networks while a researcher at Bell Laboratories in the late 1980s. Now, the group he leads at Facebook is using them to improve computer vision, to make predictions in the face of uncertainty, and even to understand natural language.

Your work at FAIR ranges from long-term theoretical research to applications that have real product impact.

We were founded with the idea of making scientific and technological progress, but I don’t think the Facebook leadership expected quick results. In fact, many things have had a fairly continuous product impact. In the application domain, our group works on things like text understanding, translation, computer vision, image understanding, video understanding, and speech recognition. There are also more esoteric things that have had an impact, like large-scale embedding.

This is the idea of associating every object with a vector.

Yes. You describe every object on Facebook with a list of numbers, whether it’s a post, news item, photo, comment, or user. Then, you use operations between vectors to see if, say, two images are similar, or if a person is likely to be interested in a certain piece of content, or if two people are likely to be friends with one another.
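As a rough illustration of "operations between vectors", the sketch below compares items by cosine similarity. The three-dimensional vectors, the item names, and the choice of cosine similarity are all assumptions made for illustration; the interview does not say which operations or dimensions Facebook uses.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query, candidates):
    """Return the name of the candidate whose vector is closest to the query."""
    return max(candidates, key=lambda name: cosine_similarity(query, candidates[name]))

# Hypothetical embeddings; real ones would be learned and much longer.
photo_beach = [0.9, 0.1, 0.3]
candidates = {
    "photo_coast": [0.8, 0.2, 0.4],
    "post_finance": [-0.1, 0.9, -0.5],
}

closest = most_similar(photo_beach, candidates)
```

Here `closest` is the coastal photo, because its vector points in nearly the same direction as the query vector. The same comparison could in principle be applied to a user vector and a content vector to estimate interest.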

What are some of the things going on at FAIR that most interest or excite you?

It’s all interesting! But I’m personally interested in a few things.

One is marrying reasoning with learning. A lot of learning has to do with perceptions, which are relatively simple things that people can do without thinking too much. But we haven’t yet found good recipes for training systems to do tasks that require a little bit of reasoning. There is some work in that direction, but it’s not where we want it.

Another area that interests me is unsupervised learning—teaching machines to learn by observing the world, say by watching videos or looking at images without being told what objects are in these images.

And the last thing would be autonomous AI systems whose behavior is not directly controlled by a person. In other words, they are designed not just to do one particular task, but to make decisions and adapt to different circumstances on their own.

How does the interplay work between research and product?

There’s a group called Applied Machine Learning, or AML, that works closely with FAIR and is a bit more on the application side of things. That group did not exist when I joined Facebook, but I pushed for its creation, because I saw this kind of relationship work very well at AT&T. Then AML became a victim of its own success. There was so much demand within the company for the platforms they were developing, which basically enabled all kinds of groups within Facebook to use machine learning in their products, that they ended up moving away from FAIR.

Recently we reorganized this a little bit. A lot of the AI capability is now being moved to the product groups, and AML is refocusing on the advanced development of things that are close to research. In certain areas like computer vision, there is a very, very tight collaboration, and things go back and forth really quickly. In other areas that are more disruptive or for which there is no obvious product, it’s more like, ‘let us work on it for a few years first’.

Let’s talk about unsupervised learning, which, as you point out elsewhere, is much closer to the way that humans actually learn.

In one sense, it’s very practical. Deep learning has been successful not just because it works well, but also because it automates part of the process of building and designing intelligent systems. In the old days, everything was manual; you had to find a way to express all of human knowledge in a set of rules, which turns out to be extremely complicated. Even in the more traditional realm of machine learning, part of the system was trained, but most of it was still done by hand. For classical computer vision systems, for example, you had to design a way to pre-process the image to get it into a form that your learning algorithm could digest.

With deep learning, on the other hand, you can train an entire system more or less from end to end.

Yes, but you need a lot of labeled data to do it, which limits the number of applications and the power of the system, because it can only learn whatever knowledge is present within your labeled datasets. The more long-term reason for trying to train or pre-train a learning system on unlabeled data is that, as you said, animals and humans build models of the world mostly by observation, and we’d like machines to do that as well, because accumulating massive amounts of knowledge about the world is the only way they will eventually acquire a certain level of common sense.

What about adversarial training, in which a set of machines learn together by pursuing competing goals?

This is an idea that popped up a few years ago in Yoshua Bengio’s lab with Ian Goodfellow, one of his students at the time. One important application is predictions. If you build a self-driving car or any other kind of system, you’d like that system to be able to predict what’s going to happen next—to simulate the world and see what a particular sequence of actions will produce without actually doing it. That would allow it to anticipate things and act accordingly, perhaps to correct something or plan in advance.

How does adversarial training address the problem of prediction in the presence of uncertainty?

When I show you a segment of a video and I ask what happens next, you might be able to predict to some extent, but not exactly; there are probably several different outcomes that are possible. So when you train a system to predict the future, and there are several possible futures, the system takes an average of all the possibilities, and that’s not a good prediction.
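The averaging effect can be seen in a toy example. The sketch below assumes the predictor is scored by mean squared error, a common choice that the interview does not name, and that the future is a single number that is either -1 or +1 (say, an object moving left or right) with equal probability.

```python
def mean_squared_error(prediction, outcomes):
    """Average squared distance between one prediction and the observed outcomes."""
    return sum((prediction - o) ** 2 for o in outcomes) / len(outcomes)

# Hypothetical data: two equally likely futures, encoded as -1 and +1.
outcomes = [-1.0, 1.0, -1.0, 1.0]

# Search a grid of candidate predictions from -1.0 to 1.0 in steps of 0.1.
candidates = [i / 10 for i in range(-10, 11)]
best = min(candidates, key=lambda p: mean_squared_error(p, outcomes))
```

The error-minimizing prediction is `0.0`, the average of the two futures, even though that value never actually occurs. For video, the analogous result would be a blend of all plausible next frames.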

Adversarial training allows us to train a system where there are multiple correct outputs by asking it to make a prediction, then telling it what should have been predicted. One of the central ideas behind this is that you train two neural networks simultaneously; there is one neural net that does the prediction and there’s a second neural net that essentially assesses whether the prediction of the first neural net looks probable or not.
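The two-network arrangement can be sketched through its loss functions alone. The code below follows the commonly used generative adversarial network formulation with the non-saturating generator loss, which is an assumption here; the interview does not specify which objective FAIR uses. The function names are hypothetical, and each score is assumed to be the second network's estimate, strictly between 0 and 1, that an input is a real outcome rather than a prediction.

```python
import math

def discriminator_loss(score_real, score_fake):
    """Loss for the assessing network: low when real outcomes score
    near 1 and the predictor's outputs score near 0."""
    return -(math.log(score_real) + math.log(1.0 - score_fake))

def generator_loss(score_fake):
    """Loss for the predicting network: low when the assessing network
    scores its prediction near 1, i.e. finds it plausible."""
    return -math.log(score_fake)

# Hypothetical scores from the assessing network.
confident_assessor = discriminator_loss(score_real=0.9, score_fake=0.1)
undecided_assessor = discriminator_loss(score_real=0.5, score_fake=0.5)
plausible_prediction = generator_loss(score_fake=0.9)
implausible_prediction = generator_loss(score_fake=0.1)
```

In a full system, each network would be updated by gradient descent on its own loss in alternation. Because any single plausible future can earn a high score, the predictor is not pushed toward the average of all futures.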

You recently helped found the Partnership on AI, which aims to develop and share best practices and provide a platform for public discussion.

There are questions related to the deployment and perception of AI within the public and government, questions about the ethics of testing, reliability, and many other things that we thought went beyond a single company.

Thanks to rapid advances in the field, many of these questions are coming up very quickly. It seems like there’s a lot of excitement, but also a lot of apprehension in the public about where AI is headed.

Humans make decisions under what’s called bounded rationality. We are very limited in the time and effort we can spend on any decision. We are biased, and we have to use our bias because that makes us more efficient, though it also makes us less accurate. To reduce bias in decisions, it’s better to use machines. That said, you need to apply AI in ways that are not biased, and there are techniques being developed that will allow people to make sure that the decisions made by AI systems have as little bias as possible.

March 2018 Issue

Communications of the ACM (CACM) is now a fully Open Access publication.
