Semantic Components: A Model for Enhancing Retrieval of Domain-Specific Information
Published at: 26 Jan 2021
Retrieving information from online resources is an increasingly prevalent task, supporting many work-related and personal activities. Yet, despite the success of modern search engines, finding specific information can be difficult and time-consuming. When users who are experienced in a particular domain search for information, they often know what types of information can be found in particular types of documents. We are investigating a model, called semantic components, that seeks to leverage searchers' knowledge by introducing a schema specific to a particular document collection. A semantic component schema consists of a two-level hierarchy: document classes and semantic components. Semantic components are types of information that commonly appear in documents of a given class. For example, documents about a disease (a document class) often contain information about treatment and diagnosis (two semantic components). Semantic component indexing identifies the location and extent of semantic component instances within a document and can supplement traditional full-text and keyword indexing techniques. Semantic component searching allows a user to refine a topical search by indicating a preference for documents containing specific semantic components or by indicating terms that should appear in specific semantic components. In this talk I will discuss two user studies: an interactive searching study demonstrating the ability of semantic components to enhance search results, and an indexing study comparing manual semantic component indexing to manual keyword indexing. I will also discuss metrics for evaluating semantic component indexing and illustrate the use of a session-based metric for evaluating multiple-query search sessions. In addition, I will describe ways the model can be adapted to a variety of search tasks and applications, such as Internet search (Live Search), enterprise-level search (SharePoint), and personal information (Desktop Search, Outlook).
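To make the model concrete, the following is a minimal Python sketch of a semantic component schema, an index of component instances stored as character spans, and a search that reorders topical matches by a component preference. It is an illustration under stated assumptions, not the system described in the talk: the names `SCHEMA`, `Instance`, `Document`, `annotate`, and `search`, the additive scoring, and the two sample documents are all hypothetical, and the `annotate` helper merely stands in for manual indexing.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical two-level schema: document class -> its semantic components.
SCHEMA = {"disease": {"treatment", "diagnosis"}}


@dataclass
class Instance:
    component: str  # semantic component name, e.g. "treatment"
    start: int      # character offset where the instance begins
    end: int        # character offset where the instance ends (exclusive)


@dataclass
class Document:
    doc_id: str
    doc_class: str
    text: str
    instances: List[Instance] = field(default_factory=list)

    def annotate(self, component: str, snippet: str) -> None:
        """Record the location and extent of one component instance.

        Stand-in for manual indexing: the span is located by exact match.
        """
        if component not in SCHEMA.get(self.doc_class, set()):
            raise ValueError(f"{component!r} is not defined for class {self.doc_class!r}")
        start = self.text.index(snippet)
        self.instances.append(Instance(component, start, start + len(snippet)))

    def component_text(self, component: str) -> str:
        """Return the text covered by all instances of the given component."""
        return " ".join(
            self.text[i.start:i.end] for i in self.instances if i.component == component
        )


def search(docs: List[Document], topic: str,
           component: Optional[str] = None, term: Optional[str] = None) -> List[str]:
    """Topical substring search, reordered by an assumed component preference.

    Scoring (an assumption for illustration): 1 for a topical match,
    +1 if the document has an instance of `component`,
    +1 more if `term` occurs inside that component's text.
    """
    scored = []
    for doc in docs:
        if topic.lower() not in doc.text.lower():
            continue
        score = 1
        if component is not None:
            comp_text = doc.component_text(component).lower()
            if comp_text:
                score += 1
                if term is not None and term.lower() in comp_text:
                    score += 1
        scored.append((score, doc.doc_id))
    # Python's sort is stable, so ties keep their original order.
    scored.sort(key=lambda pair: -pair[0])
    return [doc_id for _, doc_id in scored]


# Invented sample documents, for illustration only.
d1 = Document("d1", "disease",
              "Asthma overview. Treatment: inhaled corticosteroids. Diagnosis: spirometry.")
d1.annotate("treatment", "Treatment: inhaled corticosteroids.")
d1.annotate("diagnosis", "Diagnosis: spirometry.")
d2 = Document("d2", "disease",
              "Asthma statistics by region. Corticosteroids market report.")
docs = [d2, d1]

plain = search(docs, "asthma")
refined = search(docs, "asthma", component="treatment", term="corticosteroids")
```

In this sketch, both documents match the topical query and both mention corticosteroids, but only `d1` has the term inside an indexed treatment component, so the refined search moves `d1` ahead of `d2`. Treating the component as a ranking preference rather than a hard filter is a design choice made here for simplicity; the abstract does not specify how preferences are applied.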