Chapter 4: Tagging protocol

From Dialectsyntax

Revision as of 14:48, 2 November 2011

The tagset used in the Edisyn search engine can be viewed at http://www.dialectsyntax.org/files/Edisyn-databases%284%29.xls (note that this is work in progress). This tagset is used to label the parts of speech in (dialect) databases. The document shows how the tags of the various databases map onto those of the Edisyn search engine. The column 'Edisyn search engine' lists the tags used in the search engine; the other columns show the tags used in each individual database. Each row shows the correspondence between a database tag and the matching search engine tag.
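The row-per-correspondence layout described above can be pictured as a small lookup table. The following is only a sketch under stated assumptions: the database names (`database_a`, `database_b`), the local tag spellings, and the function `to_edisyn` are all invented for illustration and are not taken from the actual spreadsheet.

```python
# Hypothetical correspondence table: one row per Edisyn tag,
# one column per database. All names and tag spellings are invented.
CORRESPONDENCE = [
    {"edisyn": "V", "database_a": "VERB", "database_b": "WW"},
    {"edisyn": "N", "database_a": "NOUN", "database_b": "ZNW"},
]


def to_edisyn(database, local_tag):
    """Return the Edisyn tag for a database-specific tag, or None if no row matches."""
    for row in CORRESPONDENCE:
        if row.get(database) == local_tag:
            return row["edisyn"]
    return None


print(to_edisyn("database_b", "WW"))
```

Keeping one row per correspondence, as the spreadsheet does, means a new database can be added as a new column without touching the existing ones.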
The tags of the Edisyn search engine consist of two parts: a linguistic category (e.g. V, verb), which may be modified by one or more features (e.g. 1,s, first person singular). In the search engine one can search by category, by feature, or by both. To make many databases interoperable, the categories and features are kept fairly general. A justification of the tagset can be opened here.
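As a rough illustration of the two-part tag structure and of searching by category, by feature, or by both, consider the sketch below. It makes several assumptions: the written notation `CATEGORY(feature,feature)`, the names `Tag`, `parse_tag` and `matches`, the abbreviations `Pron` and `N`, and the sample tokens are all hypothetical. Only the category V and the features 1 and s come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    category: str                      # linguistic category, e.g. "V" for verb
    features: frozenset = frozenset()  # optional features, e.g. {"1", "s"}


def parse_tag(text):
    """Parse the hypothetical notation 'CATEGORY(feature,feature,...)'."""
    text = text.strip()
    if "(" not in text:
        return Tag(category=text)
    category, _, rest = text.partition("(")
    parts = rest.rstrip(")").split(",")
    features = frozenset(p.strip() for p in parts if p.strip())
    return Tag(category=category.strip(), features=features)


def matches(tag, category=None, features=()):
    """True if the tag has the requested category (if any) and all requested features."""
    if category is not None and tag.category != category:
        return False
    return set(features) <= tag.features


# Invented sample data: (word, tag) pairs.
tokens = [
    ("ik", parse_tag("Pron(1,s)")),
    ("loop", parse_tag("V(1,s)")),
    ("huis", parse_tag("N(s)")),
]

# Search by category only, by features only, or by both.
verbs = [w for w, t in tokens if matches(t, category="V")]
first_person = [w for w, t in tokens if matches(t, features=["1"])]
first_person_verbs = [w for w, t in tokens if matches(t, category="V", features=["1", "s"])]
```

Treating the features as an unordered set means that a query for a single feature also finds tags that carry additional features, which fits the idea of deliberately general categories and features.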

The protocol below is a manual for performing part-of-speech tagging. It was developed by Sjef Barbiers and Guido Vanden Wyngaerd for the SAND project (Syntactic Atlas of Dutch Dialects), but it can also be useful for other dialect research groups and projects.

This protocol is also available in PDF format: http://www.meertens.knaw.nl/pdf/variatielinguistiek/dialectsyntax/Tagging-protocol.pdf

Sections: Introduction, Noun, Adjective, Verb, Pronouns, Adpositions (prepositions and postpositions), Complementizers, Adverbs
