Term and concept relevance

An important task in text processing is extracting the main topical words and phrases from a document, a task often called automatic keyphrase extraction. Good keyphrases are useful for many applications, in particular information retrieval, document categorization and clustering, and document summarization.

We developed a tool called Keyterm that ranks extracted terms and concepts by a relevance score. The score estimates how well each term captures the most important and discriminant subject matter of a document, given a background collection of technical and scientific documents. The tool derives from an earlier prototype that ranked first out of 19 participants at the SemEval-2010 shared task on keyphrase extraction from scientific articles, and has since been improved in both performance and coverage. It provides ranked keyphrases and, at a higher level, ranked concepts and entities, which are better suited to semantic applications.
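
To make the idea of scoring terms against a background collection concrete, here is a minimal sketch in Python. It is not Keyterm's actual scoring method, which is not described here. It assumes a plain TF-IDF-style score as a stand-in: a term scores highly when it is frequent in the document and rare in the background collection. The function name `rank_terms`, its parameters, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def rank_terms(document_tokens, background_docs, top_k=5):
    """Rank the terms of one document by a TF-IDF-style relevance score.

    Illustrative stand-in only, not Keyterm's scoring function.
    document_tokens: list of tokens of the document to analyse.
    background_docs: list of token lists forming the background collection.
    """
    n_docs = len(background_docs)

    # Number of background documents in which each term occurs.
    doc_freq = Counter()
    for doc in background_docs:
        doc_freq.update(set(doc))

    term_freq = Counter(document_tokens)
    total = len(document_tokens)

    scores = {}
    for term, count in term_freq.items():
        tf = count / total
        # Smoothed inverse document frequency: rare background terms weigh more.
        idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
        scores[term] = tf * idf

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Toy data (made up for illustration).
background = [
    ["the", "model", "is", "trained"],
    ["the", "data", "is", "large"],
    ["the", "method", "is", "fast"],
]
document = ["the", "keyphrase", "extraction", "keyphrase", "is", "the", "task"]

for term, score in rank_terms(document, background):
    print(f"{term}\t{score:.3f}")
```

In this toy run, "keyphrase" ranks first because it is repeated in the document and absent from the background, while "the" and "is" fall to the bottom because they occur in every background document. A real system would typically score multi-word candidate phrases and combine several features rather than a single frequency statistic.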