'Natural' Search User Interfaces

Recent Issue - November 2011

Credit: Gary Marsh

Marti A. Hearst

November 1, 2011

What does the future hold for search interfaces for users? Today's familiar Web search interface works well for tens of millions of people and billions of queries a year, but few innovations in search interfaces gain wide-enough acceptance to replace the standard type-keywords-in-entry-form/view-results-in-a-vertical-results-list interface. This is partly because search is a means toward another end, and reading text is a mentally demanding task. The fewer distractions while reading, the more usable the interface. Additionally, search, like email, is used by nearly everyone using the Web, so its features and functions must be understandable to an enormous and diverse population.¹³

Future trends in search interfaces will most likely reflect trends in the use of IT generally. Today, there is a notable trend toward more "natural" user interfaces: pointing with fingers rather than mice, speaking rather than typing, viewing videos rather than reading text, and writing full sentences rather than artificial keywords. (The term "natural interface" is promoted by researchers at Microsoft, among others.) Not surprisingly, people are drawn to interfaces that let them think and move as they do in their non-computing lives, but only recently has technology been able to support such interfaces.

There is also a trend toward social rather than solo use of IT, with these multi-person interactions often recorded, stored, and indexed for later viewing. Again, many people would have preferred non-isolated computer use from the start, but technology and user-interface design did not support it well until recently.

Technology is advancing toward integration of massive quantities of user behavior and large-scale human-generated knowledge bases. Search today benefits from the tracking of search behavior over hundreds of millions of queries to improve ranking, offer accurate spelling suggestions, auto-suggest query terms in real time as the user types, and suggest concepts related to a query. Integration with databases and more sophisticated processing place search at the cusp of being able to support smarter, data-driven, focused interfaces for advanced search.

These trends are, or will be, interweaving in various ways, with interesting ramifications for search interfaces and suggesting promising directions for research.

Speech Input

Speech-based user interfaces generally, and speech for search input in particular, are likely to gain a much stronger presence in the coming years. At least three technological trends support the move toward spoken queries: First, phone-based mobile devices provide a natural way to capture speech, since phones are used in large part for spoken conversations. Second, the technology for speech recognition, after years of only incremental progress, is improving by leaps and bounds, thanks to huge data repositories being generated through the use of mobile phones. (To assemble a large training set of spoken corrected data for its speech-recognition system, Google hosted, from 2007 to 2010, a free 411 information service for phones.²⁸) And third, touch-screen interfaces are increasingly popular, especially when paired with mobile devices. Neither small devices nor touch screens lend themselves well to typing, making spoken input more attractive, though clever finger-swipe-based input methods (such as ShapeWriter for entering text³⁹ and Gesture Search for menu navigation¹⁹) provide compelling alternatives to typing.

These trends suggest voice-activated queries and commands are likely to increase rapidly in the next few years as response time and accuracy continue to improve.

The next likely development following on voice-based input is a dialogue-like give and take. Though not yet a reality, recent advances are bringing closer the dream of an intelligent interactive agent. For example, the Siri system provides an interface combining local information, speech recognition, easy editing of voice recognition, and visual display of search results. Siri, which was acquired by Apple in April 2010, originated from a Defense Advanced Research Projects Agency research project called CALO (www.ai.sri.com/project/CALO), in which dozens of computer-science researchers developed machine learning, reasoning, knowledge bases, and other technology to create an intelligent personal assistant.^{4, 35}

Though the user's ability to accurately follow up one request with another is limited in Siri, good interface design helps bridge the gap in the back end, since the user sees alternatives and is able to make corrections (see Figures 1, 2, and 3). Note that Siri also attempts to use searchers' contextual information, including current location. Enormous research interest^{5, 20} and commercial development focus on using time, location, and other contextual cues for search and related applications; such cues will continue to increase in importance, especially for mobile platforms.

Voice input also has drawbacks, the most significant being that speaking makes noise and can disturb people around the speaker. An exciting research advance would be a microphone that picks up the speaker's words but somehow prevents those nearby from hearing them, like a science fiction "cone of silence." Such an invention would have wide-ranging utility for mobile phones.

Social Search: Collaboration

Though observational studies have found that people often search collaboratively, tools have only recently been developed to explicitly support people searching together. Such support reflects a broader research renaissance in tools for real-time shared activity (such as shared online whiteboards and document-editing tools).

One exciting development in collaborative search, from Pickens et al.,^{11, 29} assumes the ranking algorithm should allow users to work at their own pace but be influenced in real time by their teammates' search activities. The searchers should not step on one another's proverbial toes; if one person issues a new query, others' thoughts should not be interrupted.

Pickens et al.^{11, 29} addressed this issue by developing an algorithm that combines multiple rounds of queries from multiple searchers during a single search session (see Figure 4), using two criteria for weighting results, both functions of the ranked list of documents returned for a given query. The first criterion, "freshness," is higher for documents not yet viewed; the second, "relevance," is higher for documents closely matching the query. These two factors are combined and continuously updated based on new queries and searcher-specified relevance judgments.
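A minimal sketch of how freshness and relevance might be combined across several searchers' result lists follows. It is not the algorithm of Pickens et al.; the reciprocal-rank relevance proxy, the `seen_penalty` multiplier, and the function name `combine_rankings` are all assumptions made for illustration.

```python
from collections import defaultdict


def combine_rankings(ranked_lists, viewed, seen_penalty=0.2):
    """Merge ranked lists from several searchers into one ordered queue.

    ranked_lists: list of lists of document ids, best match first.
    viewed: set of document ids a team member has already looked at.
    seen_penalty: hypothetical multiplier (0..1) applied to viewed documents.
    """
    scores = defaultdict(float)

    # Assumed relevance proxy: reciprocal rank, summed over every query
    # whose result list contains the document.
    for ranked in ranked_lists:
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] += 1.0 / rank

    # Assumed freshness factor: unseen documents keep their full score,
    # documents already viewed are down-weighted.
    for doc in scores:
        freshness = seen_penalty if doc in viewed else 1.0
        scores[doc] *= freshness

    # Highest combined score first; ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


queue = combine_rankings([["a", "b", "c"], ["b", "d"]], viewed={"b"})
print(queue)
```

Calling the function again whenever a new query arrives or a document is viewed gives the continuous updating described above, at the cost of recomputing every score.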

In addition, Pickens et al.^{11, 29} assigned different roles to the members of a team. For example, the "Prospector" is in charge of creating new queries to explore new parts of the information space, and the "Miner" looks at the retrieved results to determine which are relevant. Documents not yet looked at are queued up for the Miner interface according to freshness/relevance weighted scores. The Prospector is shown new query-term suggestions based on how they differ from queries already issued, as well as on the relevance judgments made by the Miner. Each role has its own interface; a third view is used to show continually updating information about the queries that have been issued, the documents that have been marked as relevant, and the system-suggested query terms based on the actions of the users.

Another approach to supporting real-time search collaboration, described by Jetter et al.,¹⁶ used a large work surface and input devices combining physical manual manipulation with virtual markings. The interface was evaluated on a complex collaborative search task, that of a group of people selecting a product, where each member of the group has different preferences that act as constraints (such as when choosing a hotel, one member needs a heated pool, another wants one that received at least four stars of recommendation, and a third wants the price below a certain amount). Jetter et al.'s solution used a combination of faceted navigation³⁷ and filter-flow visualization,³⁸ showing how many constraints are met by a set of items, given certain constraints. The visualization was displayed on a shared horizontal workspace, where the controls were manipulated through physical selectors (see Figure 5). Collaboration was facilitated by allowing each user to work privately on a corner of the workspace, then letting the results from each piece of the query flow into the rest of the group's query specification. A careful usability study by Jetter et al. found this approach produced results as good as those using a standard Web-based faceted navigation interface but with more bonhomie among the collaborators.
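The hotel example can be sketched in code as a set of per-person constraints applied to a list of items. This is only an illustration of constraint counting and sequential filtering, not Jetter et al.'s implementation; the hotel data, the price limit of 150, and all function names are hypothetical.

```python
# Hypothetical hotel records for illustration.
hotels = [
    {"name": "A", "heated_pool": True, "stars": 4, "price": 180},
    {"name": "B", "heated_pool": False, "stars": 5, "price": 120},
    {"name": "C", "heated_pool": True, "stars": 4, "price": 140},
]

# One constraint per group member (the price limit is an assumed value).
constraints = {
    "heated pool": lambda h: h["heated_pool"],
    "at least four stars": lambda h: h["stars"] >= 4,
    "price below 150": lambda h: h["price"] < 150,
}


def constraints_met(item, constraints):
    """Count how many of the group's constraints an item satisfies."""
    return sum(1 for check in constraints.values() if check(item))


def rank_by_constraints_met(items, constraints):
    """Order items by number of constraints met, most first (stable sort)."""
    return sorted(items, key=lambda i: -constraints_met(i, constraints))


def filter_flow(items, constraints):
    """Apply the constraints one after another, recording how many items
    remain after each step, as in a filter-flow style display."""
    remaining = list(items)
    flow = []
    for label, check in constraints.items():
        remaining = [item for item in remaining if check(item)]
        flow.append((label, len(remaining)))
    return flow


print([h["name"] for h in rank_by_constraints_met(hotels, constraints)])
print(filter_flow(hotels, constraints))
```

Ranking by the number of constraints met keeps near-misses visible to the group, whereas the sequential filter shows where candidates drop out.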

Social Search: Asking Other People

Research suggests that much online interaction on social sites is for the social experience of the interaction, rather than for problem-centric information seeking.¹² Reflecting this, a study by Morris et al.²⁴ found the questions asked of others via social networks do not necessarily involve the kinds of information found on static Web pages. Morris et al. asked survey respondents to supply questions they had posed to their social networks on Twitter and Facebook, manually classifying the 249 examples and finding only 17% were for factual information one would typically seek from a Web page (such as how to, say, put an Excel file into LaTeX). The most common categories were requests for recommendations (29%), opinions (22%), rhetorical questions (14%), requests for others to join social events (9%), favors (4%), and social connections, including job openings (3%) and offers of various kinds (1%).

A study of the Aardvark expert social-question-answering system (www.vark.com) found similar results, with 65% of a random sample of 1,000 queries reflecting a subjective attitude.¹⁵ The questions asked on the social-question-answering site Quora also tend to be subjective and opinion-based; for instance, "What does Dustin Moskovitz think of the new Facebook movie?" was answered by the subject of the question himself.

Unclear is what the best user interfaces are for representing this more social kind of search. Freyne et al.¹⁰ conducted a small study in which different kinds of social cues were shown via icons alongside search-results listings. Subjective results showed a positive preference toward cues showing which articles were read frequently or annotated by others. Yahoo experimented (2005–2009) with the MyWeb system in which search results were augmented with an avatar of the person in the user's social network who had recommended the page, along with the recommendation. In March 2011, Google introduced a social-search tool called "+1" with a similar interface. Significant experimentation on incorporating social information into search results listings is likely over the next few years.

Research is ongoing into how best to distribute questions posed to a social network, especially in a work situation, among experts, either within an organization or across the Internet generally.^{18, 21} Recent work by Richardson and White³⁴ deployed and studied an instant-messaging-based question-answering service that matched the asker's questions against predefined expertise profiles of more than 2,000 potential answerers, also taking their availability into account. The system contacted three experts at a time, in descending order of how well their profiles matched the content of the question. If an offer to answer was not received within a fixed time limit, the request was sent to a wider circle of experts. If an answerer accepted a request, the other outstanding requests were cancelled. The tool then mediated the conversation between questioner and answerer, asking questioners to rate their satisfaction with the answer.
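The batch-and-widen routing described above can be sketched as follows. This is a simplified simulation, not Richardson and White's system: profile matching is reduced to counting shared terms, availability is omitted, the time limit is modeled by a hypothetical `accepts` callback, and all names are assumptions.

```python
def route_question(question_terms, profiles, accepts, batch_size=3):
    """Contact experts in batches, best profile match first.

    question_terms: set of terms describing the question.
    profiles: dict mapping expert name -> set of expertise terms.
    accepts: callable(expert) -> bool, standing in for "offered to answer
             within the time limit" (hypothetical simulation hook).
    Returns (answerer or None, list of everyone contacted).
    """
    # Assumed match score: number of terms shared with the question.
    ranked = sorted(
        profiles,
        key=lambda e: (-len(question_terms & profiles[e]), e),
    )

    contacted = []
    for start in range(0, len(ranked), batch_size):
        batch = ranked[start:start + batch_size]
        contacted.extend(batch)  # the whole batch is asked at once
        for expert in batch:
            if accepts(expert):
                # Requests still outstanding would be cancelled here.
                return expert, contacted
        # Nobody accepted in time: widen the circle to the next batch.
    return None, contacted


profiles = {
    "ann": {"excel", "latex"},
    "bob": {"latex"},
    "cat": {"excel"},
    "dan": {"python"},
    "eve": {"sql"},
}
answerer, contacted = route_question(
    {"excel", "latex"}, profiles, accepts=lambda e: e == "dan"
)
print(answerer, contacted)
```

Smaller batches interrupt fewer people per round but lengthen the wait for an answer, which is the trade-off an interruption cost model would need to weigh.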

Richardson and White³⁴ examined log data for this system to form an interruption cost model, covering how many people should be sent a question in order to minimize disruption while maximizing the likelihood of receiving an informed answer, whether a question will be answered, and how satisfied the asker will be with the answer received.

Expert solicitation systems that are sophisticated about targeting people with the right expertise and state of mind to address a request are likely to become a fixture in knowledge-centric workplaces, as well as in volunteer causes (such as the Peer2Patent project for community input of patent prior art²⁶).

Social Search: Crowdsourcing

The word "collaboration" as it is used here refers to a set of people working together closely, usually synchronously, to achieve a goal. "Crowdsourcing" refers to large groups of people not necessarily working together knowingly but each contributing in small ways, leading to a greater whole, as seen in, for example, Wikipedia editing.

Crowdsourcing in information seeking is seen in Web sites in which communities curate and rate information and share it with others, including question-answering sites, and in product-reviewing sites, bookmark-sharing sites like Delicious, and news-ranking and aggregation sites like Digg. The more-explicitly networked social tools (such as Twitter and Facebook) also function as real-time socially targeted information sources.

Multiple efforts have sought to use explicit user input to improve search-results ranking, though few survive; for instance, Google's SearchWiki, which allowed users to explicitly reorder search results and share this re-ranking information with others, was shut down in 2010. The Blekko Web search engine, launched October 2010, is an attempt to use sophisticated algorithms combined with community curation to improve results rankings; its founder also started the Open Directory Project, a crowdsourced yellow pages for the Web. With Blekko, users can create "vertical," or subject-specific, search by labeling Web pages with a category label preceded by a slash; they can also mark pages as spam. These two operations together impose crowdsourced quality control over retrieved Web pages. Blekko also provides a social feature allowing users to see if their friends have marked particular pages with a "/like" slashtag. It remains to be seen if explicit crowdsourcing will scale for search results ranking.

Crowdsourcing usually refers to people explicitly contributing to an effort, but Web search engines have used a form of implicit crowdsourcing for years, by modifying ranking algorithms based on huge quantities of user clickthrough data¹⁷ or predicting which vertical subject area (such as music, news, and travel) to use to augment a query.⁷ Richer user behavior data (such as mouse movements, page dwell time, and searchers' click paths many steps from the search results page—even across domains—to their destination page) has helped produce useful suggestions of pages not related to the original page through close keyword matches.³⁶

Natural Language-like Queries

Though keyword querying remains standard practice on the Web, savvy users have been typing more detailed queries for years, and Web search engines have greatly improved their ability to handle long queries. Research has shown that people prefer natural expression of queries over keywords,^{3, 30} and Web search engine query length continues to increase. Experian Hitwise,²² a global online competitive intelligence service, compared queries over a four-week period (August–September 2010) with the same four-week period in 2009 and found that searches of five to eight words were up 10%, while searches of one to four words were down 2%. The growth of query length suggests a desire to express one's information needs more thoroughl

Communications of the ACM
