Snip!t from collection of Alan Dix

Acromine
http://www.chokkan.org/research/acromine/

Categories

/Channels/text mining

Acronyms result from a highly productive type of term variation that replaces fully expanded terms (e.g., retinoic acid receptor alpha) with shortened term forms (e.g., RARA). Even though no generic rules or exact patterns have been established for dealing with acronym creation, acronyms often appear in documents without the expanded form explicitly stated. Thus, an acronym dictionary is necessary for advanced text-mining tasks to establish associations between acronyms and their expanded forms.

Acromine is a system for building a good-quality acronym dictionary from running text. Assuming that a word sequence co-occurring frequently with a parenthetical expression is a potential expanded form, Acromine identifies acronym definitions in a manner similar to statistical term recognition. Applied to the whole of MEDLINE (7,811,582 abstracts) as of March 2006, Acromine extracted 920,425 acronym candidates and recognized 157,803 expanded forms in reasonable time (ca. 12 hours on a personal computer). The system achieves 99% precision and 82–95% recall on our evaluation corpus, which roughly emulates the whole of MEDLINE.
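
The co-occurrence idea described above can be illustrated with a small Python sketch. This is not Acromine's actual scoring method, which the text does not spell out. It is a simplified stand-in that counts how often each word sequence precedes a parenthesized acronym and keeps the longest sequence among the most frequent ones. The function names, the `max_words` and `min_count` parameters, and the sample sentences are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Assumption: an acronym is a short alphanumeric token alone inside parentheses.
PAREN = re.compile(r"\(([A-Za-z][A-Za-z0-9-]{1,9})\)")
WORD = re.compile(r"[A-Za-z0-9-]+")


def candidate_expansions(text, max_words=6):
    """Yield (acronym, candidate) pairs for each parenthetical in the text.

    Every suffix of the words just before "(ACRONYM)" is a candidate,
    up to max_words words long (a hypothetical cut-off).
    """
    for match in PAREN.finditer(text):
        acronym = match.group(1)
        words = WORD.findall(text[:match.start()])[-max_words:]
        for n in range(1, len(words) + 1):
            yield acronym, " ".join(words[-n:]).lower()


def build_dictionary(docs, min_count=2):
    """Map each acronym to one expanded form using simple frequency counts.

    Simplified heuristic (not Acromine's scoring): take the candidates
    with the highest count and keep the longest of them, so that a full
    term wins over its own equally frequent suffixes.
    """
    counts = defaultdict(Counter)
    for doc in docs:
        for acronym, candidate in candidate_expansions(doc):
            counts[acronym][candidate] += 1

    dictionary = {}
    for acronym, counter in counts.items():
        top = max(counter.values())
        if top < min_count:
            continue  # seen too rarely to trust
        most_frequent = [c for c, n in counter.items() if n == top]
        dictionary[acronym] = max(most_frequent, key=lambda c: len(c.split()))
    return dictionary


# Made-up sample sentences, for illustration only.
docs = [
    "We studied the retinoic acid receptor alpha (RARA) gene.",
    "Expression of retinoic acid receptor alpha (RARA) was reduced.",
    "A mutant retinoic acid receptor alpha (RARA) protein was found.",
    "Levels of tumor necrosis factor (TNF) rose.",
]
dictionary = build_dictionary(docs)
print(dictionary)  # {'RARA': 'retinoic acid receptor alpha'}
```

In this toy run, TNF is dropped because it appears only once, which is below the assumed `min_count` threshold. A real system would need a more careful score than raw counts, since short suffixes such as "alpha" always occur at least as often as the full term.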
