Keynotes at SIGIR 2017

こんにちは。ようこそ！Hello and welcome!

The 40th ACM SIGIR Conference featured two keynote speeches in the main conference. The first keynote, “Forward to the Past: Notes towards a Pre-history of Web Search,” by Stephen Robertson (Professor Emeritus of City University of London, Life Fellow at Girton College Cambridge, and 15-year veteran of Microsoft Research Cambridge), walked us through the evolution of information retrieval as a discipline, beginning with the “long and venerable history entirely outside the domain of computers.” The talk highlighted how many pre-Internet Information Retrieval (IR) concepts and methods contributed to the development and success of Web search engines:

“One of the real achievements of the web search engine world was to marry the basic technologies of large-scale inverted files and free-text indexing … with the ideas of natural language queries and search output ranking. These last had up to that point been largely confined to research systems. … search engines did not invent (nor even re-invent) search output ranking; they adapted and developed it, and added new evidence.”

Stephen Robertson

However, conversion of basic, pre-Internet IR technologies for use in commercial web search engines was not straightforward. Finding workable, production-level adaptations involved stumbling along inefficient, circuitous routes, some of which bore no fruit. Professor Robertson’s talk was particularly valuable for young researchers who have not yet experienced frustrations and failures at work. In addition, the anecdotes recounted may help them avoid repeating some mistakes of the past. Many students I talked with found the talk fascinating since they cannot imagine a world without the Internet; to them, wireless connectivity and high-quality search engines are givens, an inexpensive resource like potable water or electricity.

Robertson’s keynote dovetailed nicely with Yoelle Maarek’s “Mail Search: It’s Getting Personal!” Yoelle’s talk began with her move from Google (where her team led the launch of Google Suggest, the query auto-completion feature) to Yahoo, where she has been serving as vice president of research, leading, among other efforts, Yahoo mail search. (She is also a member of the Technion Board of Governors and the Technion Management Council and was inducted as an ACM Fellow in 2013.) Yoelle demonstrated that the worlds of Web and mail search are far more different than one might imagine. The aim of Web search is to find at least one, but not necessarily all, of the documents relevant to a user query. The user does not know whether all relevant documents will be displayed, but will be satisfied if the search engine returns at least one highly relevant document that meets his/her needs. For example, one high-quality result on the first page may satisfy a user, because the user has no clue that the search engine missed many other equally relevant results.

Yoelle Maarek

In contrast, users know their own e-mail boxes, so the target is clear: Find ‘that’ e-mail which I read several days, weeks or months ago…even though I can’t remember the exact time, date, and details that can pinpoint it in my inbox. (Side note: older folks will recall that when e-mail appeared and began its proliferation into the workplace and our daily lives, it provided endless fodder for late-night comedians and cartoonists at The New Yorker: “In this world nothing can be said to be certain, except death, taxes, and e-mail.” Dealing with e-mail has moved from the humorous to the essential; unfortunately, many of us seem to be losing the battle.)

A second significant difference is the sense of ownership that users have about their e-mail inboxes. While the Web is regarded as a sort of Wild West, from which search engines help us find useful nuggets of information, e-mail inboxes are regarded as very personal, private spaces, much like one’s own bedroom (as a child) or home (as an adult). Interestingly, this sense of ownership has impeded the adoption of smart e-mail search. Use of smart search tools to help with query formulation and refinement, combined with display of results in order of relevance (starting with the most relevant at the top), can significantly reduce the time required to locate a specific e-mail. In spite of this, most mail services (like Gmail and Outlook.com) still list results in chronological order because users believe they know the best way to locate a specific e-mail, similar to finding a misplaced item in their own room or home, and there is a sense of comfort in using the same search methods and seeing the same screen format. E-mail providers do not want to frustrate and lose customers; unlike academia, business must accommodate customer preferences. Recall the adage: The customer is royalty; the customer is always right.

As a compromise, Google’s Inbox by Gmail and Yahoo Mail introduced a hybrid display that shows a few results ranked by relevance on top of the traditional chronological results, so as to keep users content. Developing and introducing new GUIs has always been a challenge. In recent years, the adoption of mobile devices by a large portion of the connected population has forced developers to consider whether a single display style will work for desktops as well as cellphones and tablets. Fortunately, cellphone users are accustomed to scrolling, so the compromise solution for e-mail search works well.
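The hybrid result list described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Gmail or Yahoo Mail implement it: the `Hit` class, the `relevance` score, the `top_k` cutoff, and the choice to repeat the top hits in the chronological block are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Hit:
    subject: str
    received: datetime
    relevance: float  # hypothetical score from some relevance ranker


def hybrid_order(hits, top_k=3):
    """Return (relevance_block, chronological_block) for display."""
    # Top block: the few hits that the (assumed) ranker scores highest.
    by_relevance = sorted(hits, key=lambda h: h.relevance, reverse=True)
    top = by_relevance[:top_k]
    # Main block: every hit, newest first, as users traditionally expect.
    # Assumption: top hits are repeated here so the familiar list stays complete.
    chronological = sorted(hits, key=lambda h: h.received, reverse=True)
    return top, chronological


hits = [
    Hit("Flight itinerary", datetime(2017, 3, 2), 0.91),
    Hit("Lunch on Friday?", datetime(2017, 8, 1), 0.12),
    Hit("Updated itinerary", datetime(2017, 6, 15), 0.78),
    Hit("Newsletter", datetime(2017, 8, 5), 0.05),
]
top, rest = hybrid_order(hits, top_k=2)
for hit in top:
    print("TOP ", hit.subject)
for hit in rest:
    print("DATE", hit.subject)
```

Keeping the chronological block intact means a user who ignores the top block sees exactly the list they are used to.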

Before SIGIR, I was excited at the prospect of seeing Yoelle Maarek live, in person. Her talk was even more enjoyable and informative than I imagined. Yoelle is a very charismatic speaker who connects well with the audience. She began by asking people to raise their hands if they were satisfied with their e-mail inbox: storage management, search, etc. No one raised their hands…for a few seconds…until one wise guy raised his just to give the speaker a hard time. Then, the thoroughly engaged audience was asked about its adoption and regular use of e-mail tools such as folders. Unsurprisingly, the audience was an unrepresentative sample of the general public. While 30%-40% of the audience raised their hands, only ~10% of the general public are active “filers” who use folders in order to organize their messages. After declaring audience members to be “hopeless” (maybe hopeless geeks and nerds who spend too much time in front of their screens), she began her talk; by then, we were hooked. Just when the technical content was getting to be a bit thick, Yoelle re-invigorated us by introducing the concept of emotional, personal ownership of one’s e-mail inbox through the use of an analogy from The Lord of the Rings: an image of Frodo as he (like Gollum) feels drawn to his “precious,” the One Ring.

The Q&A held an interesting surprise. What to do with the e-mails of the deceased is a growing issue. Some institutions erase the files, while others archive them for a short period of time, in case they may serve as legal evidence and/or be subpoenaed. Following a completely new path, the U.S. government is embarking on an experiment to archive e-mails of volunteers as part of an effort to archive historical documents of distinguished people. Searching the e-mails of a deceased person would be quite different from personal e-mail search, since the people examining them may not necessarily know what they are looking for. The search would also be quite different from a general Web search.

I had the good fortune of being an audience member of two more outstanding keynote talks on conversational search systems that aim to interact with humans in a more natural and friendly manner. The workshop on Conversational Approaches to Information Retrieval on the final day of the conference featured work in progress by researchers in industry and academia. It was organized by Jaime Arguello of the University of North Carolina at Chapel Hill, Lawrence Cavedon of RMIT University in Australia, Hideo Joho of the University of Tsukuba in Japan, Filip Radlinski of Google, and Milad Shokouhi of Microsoft.

In his morning keynote, “Search Failed? Let’s talk,” Ron Kaplan (vice president at Amazon.com, chief scientist at Amazon Search, and adjunct professor of Linguistics at Stanford University) walked us through some scenarios that demonstrated the difficulty of building e-commerce search systems. To show the need for interactive, conversational systems, he presented some simple examples that fail when input into a traditional search engine. For example, among the top hits on eBay for the input query “shoes with red laces” were: regular-looking shoes (not red), a red dress, and a package of a dozen or so shoelaces in all colors of the rainbow. On a Web search engine, the query “James Bond movies without Roger Moore” was also unsuccessful; it retrieved the Wikipedia page for James Bond, and various pages about James Bond and/or Roger Moore. The queries “Who acquired PeopleSoft?” and “Who did PeopleSoft acquire?” yielded similar results about PeopleSoft, but neither made the proper distinction; the latter did focus on acquisitions, just not the right ones.
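One plausible reason the two PeopleSoft queries look alike to a keyword-based engine is that, once word order is discarded, they reduce to the same set of terms. The sketch below illustrates that effect only; it is not a description of any engine mentioned in the talk, and the stop-word list and the crude suffix-stripping stemmer are assumptions made for the example.

```python
import re

# Hypothetical stop-word list; real engines use their own.
STOP_WORDS = {"who", "did"}


def crude_stem(token):
    """Strip a couple of suffixes; a stand-in for a real stemmer."""
    for suffix in ("ed", "e"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def bag_of_words(query):
    """Lowercase, tokenize, drop stop words, stem, and forget word order."""
    tokens = re.findall(r"[a-z]+", query.lower())
    return {crude_stem(t) for t in tokens if t not in STOP_WORDS}


q1 = bag_of_words("Who acquired PeopleSoft?")
q2 = bag_of_words("Who did PeopleSoft acquire?")
print(q1 == q2)  # prints True: subject and object are indistinguishable
```

Who acquired whom is carried entirely by word order and syntax, which this representation throws away.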

Ron Kaplan

Ron moved to a more complex search in which he looked for a replacement for an item that most people do not purchase on a regular basis. Consider a garbage disposal, the kind that lives under our kitchen sinks. Ron input “garbage disposals” into the Amazon.com search site, and retrieved a massive selection of garbage disposals. The problem then became selecting a garbage disposal that would be appropriate for his needs; for instance, it should be for a household. But even among ordinary households, some may grind a lot of hard, fibrous vegetable scraps, while others may use the grinding mechanism minimally. Fortunately, the Amazon.com site helps customers by providing categories of disposers on a menu on the left-hand side of the screen: power, brand, … However, unlike traditional hardware stores, e-commerce sites cannot provide quick answers to straightforward questions, such as: Does brand or power matter? What are these numbers associated with the power? How do I choose the best one for my use case; for example, to grind peels from vegetables and fruits on a regular basis? He took a guess and bought a garbage disposer that worked really well and was comparatively quiet.

It turned out that the noise reduction was made possible by a soft, rubber lid with flaps that close once the garbage goes into the disposer. However, the lid with flaps made it very difficult to push the garbage into the disposer to grind up the mess. The logical solution was to buy something to push food down the disposal. So Ron went back to Amazon.com, tried the query “something to push food down the disposal,” and retrieved many food items. Frustrated, Ron had to drive to a nearby hardware store to ask a human salesperson for the name of the item he needed. It turned out to be a garbage stuffer. Indeed, the query “garbage stuffer” at Amazon.com yielded several that were appropriate for Ron’s needs. The Walmart site retrieved garbage disposals for the same query.

The conclusion from this search experience is: Conversation can be an effective, efficient, and pleasant means to resolve questions that arise when making a purchase. How might such a conversation work in practice on an e-commerce site? (1) The user inputs a query. (2) The user gets good answers to an input query, but there are too many. (3) The virtual agent detects that the user needs help. (4) The virtual agent initiates a “recovery conversation” to guide the user in refining the query and/or to provide information (e.g., power needed for intended use(s); information on brands, warranty, place of manufacture, etc.).
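The four steps above can be sketched as a small loop. This is a toy illustration under stated assumptions, not Amazon's system: the catalog, the `MAX_RESULTS` threshold used to decide that the user "needs help," the facet names, and the scripted user answers are all hypothetical.

```python
# Hypothetical catalog; product names, power ratings, and brands are invented.
CATALOG = [
    {"name": "Disposer A", "category": "garbage disposal", "power_hp": 0.5, "brand": "Acme"},
    {"name": "Disposer B", "category": "garbage disposal", "power_hp": 0.75, "brand": "Acme"},
    {"name": "Disposer C", "category": "garbage disposal", "power_hp": 0.75, "brand": "Bolt"},
    {"name": "Disposer D", "category": "garbage disposal", "power_hp": 1.0, "brand": "Bolt"},
    {"name": "Kettle E", "category": "kettle", "power_hp": 2.0, "brand": "Acme"},
]
MAX_RESULTS = 2  # assumed threshold for "good answers, but too many"
FACETS = ("power_hp", "brand")


def search(query, filters=None):
    """Steps 1-2: toy category match plus optional exact-match facet filters."""
    filters = filters or {}
    return [
        item for item in CATALOG
        if query.lower() in item["category"]
        and all(item[facet] == value for facet, value in filters.items())
    ]


def recovery_question(results, asked):
    """Step 4: pick the next unasked facet that still splits the results."""
    for facet in FACETS:
        options = sorted({item[facet] for item in results})
        if facet not in asked and len(options) > 1:
            return facet, options
    return None


def converse(query, answer_for):
    filters = {}
    results = search(query)
    while len(results) > MAX_RESULTS:  # step 3: agent detects the user needs help
        question = recovery_question(results, filters)
        if question is None:
            break  # nothing left to ask; show what we have
        facet, options = question
        filters[facet] = answer_for(facet, options)
        results = search(query, filters)
    return results


# Scripted stand-in for a real user answering the agent's questions.
scripted = {"power_hp": 0.75, "brand": "Acme"}
final = converse("garbage disposal", lambda facet, options: scripted[facet])
print([item["name"] for item in final])
```

In this toy run a single question about power narrows four disposers down to two, so the agent stops asking; a real agent would also need to explain what the options mean, which is the harder part Ron highlighted.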

Ron ended with an insightful question: Providing conversational interfaces on e-commerce sites seems like a great idea, but are they the future for all sites? Probably not, and hopefully not. For example, purchasing tickets for a flight and reserving hotel rooms can be carried out more efficiently using the interfaces provided on most travel sites, with search engines, dialog boxes, and pull-down menus. One-shot search is the ideal, with dissatisfaction growing as the number of exchanges (Q&A) required to complete the task increases. Balancing and managing user expectations is important for customer satisfaction.

Ron’s highly entertaining and informative talk set the stage for the afternoon keynote by Jason Williams, “End-to-end learning for task-oriented conversational systems,” on his work as manager of the Conversational Systems Group at Microsoft Research AI. The ultimate goal for the researchers is to design and implement an intelligent conversational system that does not require programming responses to all possible conversations. To see what would happen with current technologies, Jason showed an example—implemented by his colleagues at Microsoft Research—of a neural network trained on a large data set of regular, everyday conversations.

Jason Williams

After Robertson’s talk, it comes as little surprise that this approach would fail for task-oriented dialogs. Although the replies of conversational systems were grammatically correct and semantically plausible, they were clearly inappropriate for task-oriented dialog systems. Jason presented two examples of conversations that might be acceptable in chit-chat between close friends, but not for a (virtual) customer service agent of a large corporation:

Q1: What’s the weather in Seattle?

A1: It’s not that bad

Q2: I need to reset my password

A2: I am sure you do.

Jason went on to show how a task-oriented dialog system could be created from a very small number of domain-specific training dialogs, and integrated with back-end API calls for reading and writing data. Full details are in a paper from ACL.

The issue of gathering appropriate conversational data is clearly central to building a conversational IR system. This realization led a related team at Microsoft to collect information-seeking dialogs using pairs of humans. One person acted as an information seeker, while the other acted as a customer service agent. The goal was to compile a massive data set of natural conversations between two humans in order to identify good and bad behaviors exhibited by the agent; behaviors by the seeker that demonstrated satisfaction or dissatisfaction; conversational structures that promote or impede progress in the task; acceptable conversational norms; and other useful properties that would help in the design of a virtual agent. I highly recommend the paper from the workshop, which has more details about conversational modeling, experimental set-ups, and pointers to related work on conversational systems at other institutions.

This presentation and many of the other works presented at the workshop mentioned the difficulties in creating suitable datasets for training virtual agents. Currently, applications of neural networks, supervised learning, and machine learning methods are hot topics for research. It seems scientists are beginning to realize there is no one-size-fits-all algorithm that will work well for most business scenarios. Machines may need to learn to mimic the human experience of improving professional skills through training and on-the-job trial and error. Customized training data for virtual agents carrying out highly specific tasks will be an essential part of the solution.

It was a great privilege to have Jason Williams as an invited speaker from an academic community with deep insights into human dialog, conversation, and sentiment. I believe the workshop inspired participants to learn more about the state of the art in disciplines outside of IR that are also vital contributors to building next-generation search systems. I also believe friendships for future collaborations were established during the meal and coffee breaks.

The keynotes in the main session and the workshops fit together very nicely, beginning with the history of IR leading to search engines, then moving to deploying a product for the masses at Yahoo!, and ending with an eye toward the future – conversational information retrieval through virtual agents. To the organizers who successfully brought distinguished speakers all the way to Tokyo, and to the speakers who took the time and effort to travel to Japan, Cheers!

Please stay tuned. Coming up next: Blog #5 – SIGIR 2017 wrap-up & closing blog …では、また! (See you then!)

This is blog #4 on ACM SIGIR 2017, Tokyo. Previous blogs are:

Blog #3 SIGIR2017: Diversity and Inclusion

Blog #2 Neural Networks in IR: full-day tutorial

Blog #1 Welcome to SIGIR 2017

Mei Kobayashi is manager, Data Science/Text Analysis at NTT Communications.