DFKI-LT - Exploring Deployment of Linguistic Features in Classification of Polish Texts

Jakub Piskorski, Marcin Sydow
Exploring Deployment of Linguistic Features in Classification of Polish Texts
in: Z. Vetulani (ed.):
2nd Language and Technology Conference, pages 81-84, Poznan, Poland, 4/2005
 
This paper reports on preliminary experiments in deploying linguistic features for the classification of Polish texts. In particular, we explore the impact of lemmatization and of various term-selection strategies that rely on including or excluding certain named-entity classes. A slight improvement over the bag-of-words approach can be observed, but there is still considerable room for improvement.
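
The two kinds of linguistic features named in the abstract, lemmatization and term selection by named-entity class, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `LEMMAS` and `ENTITY_CLASSES` tables, the class labels, the `extract_terms` function and the sample sentence are all hypothetical stand-ins for a real Polish morphological analyser and named-entity recogniser.

```python
from collections import Counter

# Hypothetical toy resources (assumptions for illustration only); a real
# system would use a Polish morphological analyser and an NE recogniser.
LEMMAS = {
    "banki": "bank",
    "warszawie": "warszawa",
    "kredyty": "kredyt",
    "kredytu": "kredyt",
}
ENTITY_CLASSES = {"warszawa": "LOCATION", "pko": "ORGANIZATION"}


def extract_terms(text, lemmatize=True, excluded_classes=frozenset()):
    """Return bag-of-words counts, optionally lemmatized and NE-filtered."""
    terms = Counter()
    for raw in text.lower().split():
        token = raw.strip(".,;:!?()\"'")
        if not token:
            continue
        lemma = LEMMAS.get(token, token)
        # Look up the entity class on the lemma so inflected forms match too.
        if ENTITY_CLASSES.get(lemma) in excluded_classes:
            continue
        terms[lemma if lemmatize else token] += 1
    return terms


sample = "Banki w Warszawie oferują kredyty, a PKO obniża koszty kredytu."

# Plain bag-of-words baseline: surface forms, no filtering.
baseline = extract_terms(sample, lemmatize=False)

# Lemmatized terms with one named-entity class excluded.
linguistic = extract_terms(sample, lemmatize=True, excluded_classes={"LOCATION"})
```

In this toy setting the inflected forms "kredyty" and "kredytu" collapse into the single term "kredyt", and the location name is dropped from the feature set, whereas the baseline keeps every surface form as a separate term. Either `Counter` could then be passed to any standard text classifier as a feature vector.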
 