Difference between revisions of "Tagger"
Revision as of 17:42, 6 July 2007
Definition
A tagger is a device which assigns symbolic labels (tags) to linguistic units. The labels are taken from a predefined set of symbols (tag-set).
Comments
In most cases, a tagger assigns tags representing morpho-syntactic information to single word-forms or tokens. But there are also taggers which have been designed to identify the semantic roles of noun phrases or prepositional phrases (sense tagging), and sometimes identifying the discourse structure of a text is considered a kind of tagging.
Conceptually, tagging can be considered a three-step process: (i) identification of the relevant units, (ii) assignment of all possible labels to the units (e.g. by lexical look-up, applying heuristics, etc.), and (iii) disambiguation.
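The three steps can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the lexicon, the tag-set (DET, NOUN, VERB, AUX, PUNCT), the function names and the two contextual rules are all hypothetical and are not taken from any particular tagger.

```python
import re

# Hypothetical toy lexicon: each word form maps to every tag it may carry.
LEXICON = {
    "the": ["DET"],
    "can": ["AUX", "NOUN", "VERB"],
    "rusts": ["NOUN", "VERB"],
}


def identify_units(text):
    """Step (i): split the text into word and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def candidate_tags(token):
    """Step (ii): lexical look-up, with a crude heuristic for unknown tokens."""
    entry = LEXICON.get(token.lower())
    if entry is not None:
        return list(entry)
    if re.fullmatch(r"\w+", token):
        return ["NOUN"]  # heuristic: guess that an unknown word is a noun
    return ["PUNCT"]


def disambiguate(tokens, candidates):
    """Step (iii): choose one tag per token with two simple contextual rules."""
    tagged = []
    for token, tags in zip(tokens, candidates):
        previous = tagged[-1][1] if tagged else None
        if previous == "DET":
            wanted = "NOUN"  # rule: a determiner is followed by a noun
        elif previous == "NOUN":
            wanted = "VERB"  # rule: a noun is followed by a verb
        else:
            wanted = None
        # Fall back to the first candidate when no rule applies.
        tagged.append((token, wanted if wanted in tags else tags[0]))
    return tagged


def tag(text):
    tokens = identify_units(text)
    candidates = [candidate_tags(token) for token in tokens]
    return disambiguate(tokens, candidates)


print(tag("The can rusts."))
```

In this sketch the ambiguous forms "can" and "rusts" each receive several candidate tags in step (ii), and only the contextual rules in step (iii) reduce them to a single tag. A stochastic tagger would replace those hand-written rules with probabilities estimated from a corpus.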
It is common practice to distinguish between rule-based and stochastic taggers, though in some cases it is not easy to decide to which class a given tagger belongs.
Depending on the text type, taggers achieve an accuracy of 90-97%.
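A figure of this kind is typically obtained by comparing a tagger's output with a manually tagged reference text. The sketch below assumes that accuracy means the fraction of tokens whose predicted tag equals the reference tag; the function name and the example tags are hypothetical.

```python
def tagging_accuracy(predicted, gold):
    """Return the fraction of tokens whose predicted tag matches the gold tag."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty sequence")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical example: one of four tokens is tagged incorrectly.
gold_tags = ["DET", "NOUN", "VERB", "PUNCT"]
predicted_tags = ["DET", "NOUN", "NOUN", "PUNCT"]
print(tagging_accuracy(predicted_tags, gold_tags))
```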
Subtypes
- HMM tagger
- Brill tagger
- Memory-based tagger
- Tree tagger
Other Languages
- German Tagger (de)