In his tiny corner of a seemingly endless expanse of workspace, the artificial-intelligence research scientist Alexis Conneau tapped his keyboard for a few seconds and then, suddenly, there it all was: hundreds of billions of words, an immense torrent of human knowledge, raining down in a window on his MacBook Pro screen.
For years, automated "crawlers" had been vacuuming up the Internet — its old poems and angry comments and dessert recipes and everything else — into this gargantuan database spanning 100 languages: Arabic, Malagasy, Urdu and dozens more. Conneau couldn't read it all himself. But his creation, XLM-RoBERTa, had read it many times over: This was its brain matter, the raw material from which the machine could, in some way, learn to emulate how people speak.
It was a quiet afternoon last year inside Facebook's secretive AI lab in Menlo Park, Calif., where the trillion-dollar company had assembled a 60-building compound of glass edifices with giant monitors and catered lunch. Around Conneau's desk, in the edifice known as MPK 21, the tech company's young and extraordinarily well-paid workforce walked beneath snaking rivers of blue Ethernet cables and past walls of corporately acquired pop art: "You Already Know What You Need," "Don't Be Afraid," "It Is Possible."
As he looked at his screen, Conneau couldn't help but feel that the inspirational company posters were right. The machine-learning system he had helped build could understand dozens of languages better than the best systems of yesteryear, even those trained in a single tongue.
From The Washington Post