Twitter generates a lot of noise. One hundred sixty million users send upward of 90 million messages per day, 140-character musingsstudded with misspellings, slang, and abbreviationson what they had for lunch, the current episode of "Glee," or a video of a monkey petting a porcupine that you just have to watch.
Individually, these tweets range from the inane to the arresting. But taken together, they open a surprising window onto the moods, thoughts, and activities of society at large. Researchers are finding they can measure public sentiment, follow political activity, even spot earthquakes and flu outbreaks, just by running the chatter through algorithms that search for particular words and pinpoint message origins.
"Social media give us an opportunity we didn't have until now to track what everybody is saying about everything," says Filippo Menczer, associate director of the Center for Complex Networks and Systems Research at Indiana University. "It's amazing."
The results can be surprisingly accurate. Aron Culotta, assistant professor of computer science at Southeastern Louisiana University, found that tracking a few flu-related keywords allowed him to predict future flu outbreaks. He used a simple keyword search to look at 500 million messages sent from September 2009 to May 2010. Just finding the word "flu" produced an 84% correlation with statistics collected by the U.S. Centers for Disease Control and Prevention (CDC). Adding a few other words, like "have" and "headache" increased the agreement to 95%.
The CDC's counts of what it terms influenza-like illness are based on doctors' reports of specific symptoms in their patients, so they're probably a more accurate measure of actual illness than somebody tweeting "home sick with flu." But it can take a week or two for the CDC to collect the data and disseminate the information, by which time the disease has almost certainly spread. Twitter reports, though less precise, are available in real time, and cost a lot less to collect. They could draw health officials' attention to an outbreak in its earlier stages. "We're certainly not recommending that the CDC stop tracking the flu the way they do it now," Culotta says. "It would be nice to use this as a first-pass alarm."
Twitter data may help answer sociological questions that are otherwise hard to approach, because polling enough people is too expensive and time consuming.
Google Flu Trends does something similar. One potential point in Twitter's favor is that a tweet contains more words, and therefore more clues to meaning, than the three or four words of a typical search engine query. And training algorithms to classify messagesfiltering out the tweets that talk about flu shots or Bieber Feverimproves the accuracy further.
There are other physical phenomena where Twitter can be an add-on to existing monitoring methods. Air Twitter, a project at Washington University in St. Louis, looks for comments and photos about events like fires and dust storms as a way to get early indications about air quality. And the U.S. Geological Survey (USGS) has explored using Twitter messages as a supplement to its network of seismographic monitors that alert the federal agency when an earthquake occurs. Paul Earle, a seismologist at the USGS, is responsible for quickly getting out information about seismic activity. He has searched for spikes in keywords"OMG, earthquake!" is a popular tweetfor a quick alert of an event. "It's another piece of information in the seconds and minutes when things are just unfolding," Earle says. "It comes in earlier. Some or most of its value is replaced as we get more detailed or science-derived information."
Earle says Twitter might help weed out the occasional false alarm from automated equipment, when no tweets follow an alert. The content of tweets might also supplement Web-based forms that collect people's experiences of an earthquake and are used to map the event's intensity, a more subjective measure of impact that includes factors such as building damage. A recent earthquake in Indonesia, for instance, produced a spike of tweetsin Indonesian. There's no Web form in that language for intensity, but Earle says Indonesian tweets might help fill in the blanks.
Many researchers are doing sentiment analysis of tweets. Using tools from psychology, such as the Affective Norms for English Words, which rates the emotional value of many words, Alan Mislove tracked national moods, and found that Americans tend to be a lot happier on Sunday morning than Thursday evening, and that West Coast residents seem happier than those on the East Coast.
"I think this is going to be one of the most important datasets of this era, because we are looking at what people are talking about in real time at the scale of an entire society," says Mislove, an assistant professor of computer science at Northeastern University. He says there's no easy way to validate those results, but as a proof-of-concept it shows the sorts of information that might be derived from the Twitter data stream, even without taking additional steps to filter out false positives. "There's just simply so much data that you can do pretty decently, even by taking naïve approaches," says Mislove.
"If you think about our society as being a big organism," says Noah Smith, "this is just another tool to look inside of it."
This type of inquiry, of course, has limitations. Researchers readily admit that Twitter data is noisy, and it's not always simple to know what a word meansin some parlances, "sick" is a good thing. But with hundreds of millions of messages, the errors tend to shrink. Another worry is hysteria; people worried about swine flu might tweet about it more, leading others to worry and tweet (or retweet), so there's a spike in mentions without any increase in actual cases.
There's also sample bias; certain segments of the population use Twitter more than others. But researchers seeking to glean insights from tweets can apply corrections to the sample, just as traditional pollsters do. And as a wider variety of people send more tweets, the bias is reduced. "The more data you have, the closer you get to a true representation of what the underlying population is," says Noah Smith, an assistant professor of computer science at Carnegie Mellon University.
Smith is examining how Twitter can supplement more familiar polling. One advantage is that pollsters can influence the answers they get by the way they phrase a question; people are fairly consistent, for example, in being more supportive of "gay marriage" than of "homosexual marriage," just because of the word choice. Studying tweets, which people send out of their own accord, removes that problem. "We're not actually talking to anyone. We're not asking a question. We're just taking found data," Smith says. "We can get a much larger population of people participating passively."
Indeed, he says, Twitter data may help researchers answer all sorts of sociological questions that are otherwise hard to approach, because polling enough subjects is too expensive and time-consuming using traditional methods. For instance, Smith says, a researcher might study how linguistic patterns correlate to socioeconomic status, and perhaps learn something about communication patterns among people in different demographic groups. That, in turn, could reveal something about their access to information, jobs, or government services.
Of course, the power of widespread, unfiltered information invites the possibility of abuse. Two researchers at Wellesley University, Panagiotis Metaxas and Eni Mustafaraj, found that during a special election in Massachusetts for U.S. Senate, the Democratic candidate, Martha Coakley, was the subject of a "Twitter bomb" attack. A conservative group in Iowa, the American Future Fund, sent out 929 tweets in just over two hours with a link to a Web site that attacked Coakley. The researchers estimate the messages could have been seen by more than 60,000 people before being shut down as spam.
Indiana's Menczer developed a tool to distinguish between organized partisan spamming and grass-roots activism. He calls it Truthy, from comedian Stephen Colbert's coinage describing a statement that sounds factual but isn't. Starting with a list of keywords that includes all candidates, parties, and campaign names, the system detects what he calls memes, messages about a specific topic or candidate. It then displays a graphic representation of how each meme propagates, with retweets in blue and mentions of the topic in orange. If someone sets up two accounts and repeatedly sends the same tweets back and forth between theman effort to show up in Twitter's popular "trending topics" listit appears as a thick blue bar. Networks of automated tweets pushing a particular meme show up as regular, orange starbursts. More natural propagations look like fuzzy dandelions. In some cases, the tweets carry links to Web sites with questionable claims or even strident propaganda. Others turn out to be pitching a product.
The patterns, along with information about when each Twitter account was created and whether the owner is known, allow voters to distinguish actual political dialogue from organized attacks. Menczer hopes to add sentiment analysis to analyze the content of the messages as well as their dispersal patterns.
At Xerox Palo Alto Research Center, Research Manager Ed H. Chi is also looking at message propagation. "Twitter is kind of this perfect laboratory for understanding how information spreads," Chi says. Such a study can improve theoretical models of information dispersal, and also give people and businesses better strategies for delivering their messages or managing their reputations.
Much Twitter-based research is still preliminary. Some findings can be validated through other sources, such as CDC statistics or public opinion polls, but others remain unproven. Still, the scientists are excited at the prospects of what they might find by mining such a large, raw stream of data. "As Twitter and other social media grow, you'll be able to ask much more fine-grained questions," says Smith. "If you think about our society as being a big organism, this is just another tool to look inside of it."
Chen, J., Nairn, R., Nelson, L., and Chi, E. H.
Short and tweet: experiments on recommending content from information streams, ACM Conference on Human Factors in Computing Systems, Atlanta, GA, April 1015, 2010.
Detecting influenza outbreaks by analyzing Twitter messages, KDD Workshop on Social Media Analytics, Washington, D.C., July 25, 2010.
Earle, P., Guy, M., Buckmaster, R., Ostrum, C., Horvath, S., and Vaughan, A.
OMG earthquake! Can Twitter improve earthquake response? Seismological Research Letters 81, 2, March/April 2010.
Metaxas, P.T. and Mustafaraj, E.
From obscurity to prominence in minutes: political speech and real-time search, Web Science Conference, Raleigh, NC, April 2627, 2010.
O'Connor, B., Balasubramanyan, R., Routledge, B., and Smith, N.
From Tweets to polls: linking text sentiment to public opinion time series, Proceedings of the International AAAI Conference on Weblogs and Social Media, Washington, D.C., May 2326, 2010.
©2011 ACM 0001-0782/11/0300 $10.00
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from [email protected] or fax (212) 869-0481.
The Digital Library is published by the Association for Computing Machinery. Copyright © 2011 ACM, Inc.