Organizing and searching the world wide web of facts -- step two

8 May 2007

conference paper
Published by Association for Computing Machinery (ACM)

p. 101-110
https://doi.org/10.1145/1242572.1242587

Abstract

As part of a large effort to acquire large repositories of facts from unstructured text on the Web, a seed-based framework for textual information extraction allows for weakly supervised extraction of class attributes (e.g., side effects and generic equivalent for drugs) from anonymized query logs. The extraction is guided by a small set of seed attributes, without any need for handcrafted extraction patterns or further domain-specific knowledge. The attributes of classes pertaining to various domains of interest to Web search users have accuracy levels significantly exceeding current state of the art. Inherently noisy search queries are shown to be a highly valuable, albeit unexplored, resource for Web-based information extraction, in particular for the task of class attribute extraction.

Cited by 64 articles