LCC Labs Demo: "Set Xpander"

Categories: Press Releases
      Date: Nov 26, 2008
     Title: LCC Labs Demo: "Set Xpander"

Language Computer today launches a new LCC Labs demo, Set Xpander. Xpander demonstrates the kind of rich semantic knowledge that can be automatically extracted from semi-structured resources like Wikipedia for lexicon generation and other purposes. Wikipedia is a goldmine of semantic knowledge, which is vital to high-performance Information Extraction (IE) tasks such as entity, relation, and event extraction.

In this tool, the system first disambiguates examples provided by the user, mapping them to associated Wikipedia entries. Those entries are then "xpanded" to related entries ranked by semantic similarity.

For example, given the input "Avalanche", "Civic" and "Ram", the system must first identify the concepts referred to by the user. In the case of "Avalanche", Xpander selects "Chevrolet Avalanche" from the almost 50 different entries for this term. Next, it processes structured and unstructured content in Wikipedia as well as other resources to find highly related articles, such as "Ford Ranger", "Chevrolet Silverado", and numerous other car models.
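
The two steps described above, disambiguation followed by expansion, can be illustrated with a small Python sketch. This is not LCC's implementation: the catalog, its category labels, the candidate lists, and the function names are all hypothetical, and Jaccard overlap of category sets stands in for whatever semantic similarity measure Xpander actually uses.

```python
from itertools import combinations, product

# Hypothetical toy catalog: entry title -> set of category labels.
# Titles and labels are invented for illustration, not taken from Wikipedia.
CATALOG = {
    "Chevrolet Avalanche": {"vehicle", "pickup truck", "chevrolet"},
    "Avalanche (snow)": {"natural hazard", "snow"},
    "Colorado Avalanche": {"ice hockey", "sports team"},
    "Honda Civic": {"vehicle", "compact car", "honda"},
    "Civic virtue": {"ethics", "politics"},
    "Dodge Ram": {"vehicle", "pickup truck", "dodge"},
    "Ram (sheep)": {"animal", "sheep"},
    "Ford Ranger": {"vehicle", "pickup truck", "ford"},
    "Chevrolet Silverado": {"vehicle", "pickup truck", "chevrolet"},
    "Snowboard": {"snow", "sports equipment"},
}

# Hypothetical candidate entries for each ambiguous user term.
CANDIDATES = {
    "Avalanche": ["Chevrolet Avalanche", "Avalanche (snow)", "Colorado Avalanche"],
    "Civic": ["Honda Civic", "Civic virtue"],
    "Ram": ["Dodge Ram", "Ram (sheep)"],
}


def jaccard(a, b):
    """Overlap of two label sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def coherence(titles):
    """Sum of pairwise similarities among a group of entries."""
    return sum(jaccard(CATALOG[a], CATALOG[b]) for a, b in combinations(titles, 2))


def disambiguate(terms):
    """Pick one candidate per term so the chosen entries are most alike."""
    best = max(product(*(CANDIDATES[t] for t in terms)), key=coherence)
    return list(best)


def expand(seeds, top_n=5):
    """Rank non-seed entries by their mean similarity to the seed entries."""
    scored = []
    for title, labels in CATALOG.items():
        if title in seeds:
            continue
        score = sum(jaccard(labels, CATALOG[s]) for s in seeds) / len(seeds)
        if score > 0:
            scored.append((title, score))
    # Highest score first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


seeds = disambiguate(["Avalanche", "Civic", "Ram"])
related = expand(seeds)
print(seeds)
print([title for title, _ in related])
```

Choosing the candidates jointly, rather than one term at a time, is what lets the other examples steer "Avalanche" toward the vehicle sense. Enumerating every combination is only practical for a handful of terms and candidates; a real system would need a cheaper search.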

Leveraging resources in this manner allows LCC extraction products to achieve higher precision and recall across an increasingly broad set of semantic types. This type of automatic learning reduces training times for LCC's customizable content extractors to mere minutes, with a process that is easy enough for anyone to learn.