Snip!t from collection of Alan Dix

The Stanford NLP (Natural Language Processing) Group
http://nlp.stanford.edu/software/CRF-NER.shtml

Categories

/Channels/text mining

CRFClassifier is a Java implementation of a Named Entity Recognizer. Named Entity Recognition (NER) labels sequences of words in a text that are the names of things, such as person and company names, or gene and protein names. The software provides a general (arbitrary order) implementation of linear chain Conditional Random Field (CRF) sequence models, coupled with well-engineered feature extractors for Named Entity Recognition. (CRF models were pioneered by Lafferty, McCallum, and Pereira (2001); see Sutton and McCallum (2006) for a better introduction.) Included with the download are good 3-class (PERSON, ORGANIZATION, LOCATION) named entity recognizers for English (in versions with and without additional distributional similarity features) and another pair of models trained on the CoNLL 2003 English training data. The distributional similarity features improve performance, but the models that use them require considerably more memory.
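To make the idea of a linear chain sequence model concrete, here is a minimal sketch of Viterbi decoding over the three entity classes plus an "outside" label O. It is not the CRFClassifier API and does not use the Stanford code: the class name CrfViterbiSketch, the example tokens, and every score in the tables are hypothetical, chosen only for illustration. It is also first order only (each label depends on the previous one), whereas the software described above supports arbitrary order. In a trained CRF the emission scores would come from weighted feature functions rather than hand-written numbers.

```java
import java.util.Arrays;

/** Toy first-order linear-chain decoder. All scores are invented for illustration. */
class CrfViterbiSketch {
    static final String[] LABELS = {"O", "PERSON", "ORGANIZATION", "LOCATION"};

    /**
     * Returns the highest-scoring label index sequence.
     * emission[t][y]   : score of label y at token position t
     * transition[p][y] : score of moving from label p to label y
     */
    static int[] decode(double[][] emission, double[][] transition) {
        int n = emission.length;
        if (n == 0) return new int[0];
        int k = transition.length;
        double[][] best = new double[n][k]; // best[t][y]: best score of any path ending in y at t
        int[][] back = new int[n][k];       // back[t][y]: previous label on that best path
        best[0] = emission[0].clone();
        for (int t = 1; t < n; t++) {
            for (int y = 0; y < k; y++) {
                int argmax = 0;
                double max = Double.NEGATIVE_INFINITY;
                for (int p = 0; p < k; p++) {
                    double s = best[t - 1][p] + transition[p][y];
                    if (s > max) { max = s; argmax = p; }
                }
                best[t][y] = max + emission[t][y];
                back[t][y] = argmax;
            }
        }
        // Pick the best final label, then follow back-pointers to recover the path.
        int[] path = new int[n];
        int last = 0;
        for (int y = 1; y < k; y++) {
            if (best[n - 1][y] > best[n - 1][last]) last = y;
        }
        path[n - 1] = last;
        for (int t = n - 1; t > 0; t--) path[t - 1] = back[t][path[t]];
        return path;
    }

    /** Decodes a hypothetical four-token sentence: "Alan Dix visited Stanford". */
    static String[] demo() {
        double[][] emission = {
            // O,  PERSON, ORGANIZATION, LOCATION
            {0.5, 2.0, 0.2, 0.1}, // "Alan"
            {1.0, 0.8, 0.3, 0.2}, // "Dix": on its own, O scores slightly higher
            {3.0, 0.0, 0.0, 0.0}, // "visited"
            {0.2, 0.1, 1.5, 1.4}, // "Stanford"
        };
        double[][] transition = {
            // to: O, PERSON, ORGANIZATION, LOCATION
            {0.5,  0.0,  0.0,  0.0}, // from O
            {0.0,  1.0, -1.0, -1.0}, // from PERSON
            {0.0, -1.0,  1.0, -1.0}, // from ORGANIZATION
            {0.0, -1.0, -1.0,  1.0}, // from LOCATION
        };
        int[] path = decode(emission, transition);
        String[] labels = new String[path.length];
        for (int i = 0; i < path.length; i++) labels[i] = LABELS[path[i]];
        return labels;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```

With these made-up scores the decoder labels the tokens PERSON, PERSON, O, ORGANIZATION. The second token is the interesting one: taken in isolation it scores higher as O, but the PERSON-to-PERSON transition bonus tips the joint decision, which is the kind of context a sequence model adds over a per-word classifier.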

HTML

CRFClassifier is a Java implementation of a Named Entity Recognizer. Named Entity Recognition (NER) labels sequences of words in a text that are the names of things, such as person and company names, or gene and protein names. The software provides a general (arbitrary order) implementation of linear chain Conditional Random Field (CRF) sequence models, coupled with well-engineered feature extractors for Named Entity Recognition. (CRF models were pioneered by <a href="http://www.cis.upenn.edu/%7Epereira/papers/crf.pdf">Lafferty, McCallum, and Pereira (2001)</a>; see <a href="http://www.cs.umass.edu/%7Emccallum/papers/crf-tutorial.pdf">Sutton and McCallum (2006)</a> for a better introduction.) Included with the download are good 3-class (PERSON, ORGANIZATION, LOCATION) named entity recognizers for English (in versions with and without additional distributional similarity features) and another pair of models trained on the <a href="http://www.cnts.ua.ac.be/conll2003/ner/">CoNLL 2003</a> English training data. The distributional similarity features improve performance, but the models that use them require considerably more memory.