In this thesis, we developed a novel unsupervised algorithm for terminology extraction (TE). TE consists of detecting and ranking candidate terms in a given document, where a term is a sequence of words that refers to a particular concept in a given domain. The thesis also makes two ancillary contributions: a new relevancy measure for term ranking, which combines a termhood measure, a unithood measure, and a noise measure to produce a reliable score; and an abbreviation extractor, which discovers and extracts the expanded form of abbreviated terms using a simple heuristic. Many term extraction algorithms already exist, but they have limitations. In particular, we found that no current method could reliably extract long and complex terminology. The proposed algorithm was therefore designed to handle this task.
term extraction --- terminology extraction --- financial text --- information extraction --- abbreviation extraction --- long term --- complex terminology --- multi word term --- termhood --- unithood --- unsupervised --- Engineering, computing & technology > Computer science
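
The abstract mentions an abbreviation extractor based on a simple heuristic but does not say what that heuristic is. As an illustration only, the sketch below uses a common initial-letter rule: when an uppercase short form appears in parentheses, the words just before it are accepted as the expanded form if their initials spell the short form. The function name `extract_abbreviations` and the matching rule are assumptions made for this example, not the method described in the thesis.

```python
import re


def extract_abbreviations(text):
    """Return {abbreviation: expanded form} for 'expanded form (ABBR)' patterns.

    Hypothetical initial-letter heuristic, not the thesis's actual method.
    """
    pairs = {}
    # Candidate short forms: 2 to 10 uppercase letters inside parentheses.
    for match in re.finditer(r"\(([A-Z]{2,10})\)", text):
        abbr = match.group(1)
        # Alphabetic words that appear before the opening parenthesis.
        preceding = re.findall(r"[A-Za-z]+", text[:match.start()])
        # Take as many preceding words as the abbreviation has letters.
        candidate = preceding[-len(abbr):]
        if len(candidate) < len(abbr):
            continue
        # Accept only if the word initials spell the abbreviation.
        initials = "".join(word[0].upper() for word in candidate)
        if initials == abbr:
            pairs[abbr] = " ".join(candidate)
    return pairs


if __name__ == "__main__":
    sample = "We study terminology extraction (TE) in financial text."
    print(extract_abbreviations(sample))
```

A rule this strict misses expanded forms that contain stop words or that contribute several letters from one word, so a practical extractor would likely need a looser alignment.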