Publication

Exploring Deployment of Linguistic Features in Classification of Polish Texts

Jakub Piskorski; Marcin Sydow
In: Z. Vetulani (Ed.). 2nd Language and Technology Conference. International Language Technologies Conference (IS-LTC), Poznan, Poland, Pages 81-84, 4/2005.

Abstract

This paper reports on preliminary experiments in deploying linguistic features for the classification of Polish texts. In particular, we explore the impact of lemmatization and of various term-selection strategies that rely on including or excluding certain named-entity classes. A slight improvement over the bag-of-words approach can be observed, but there is still considerable room for improvement.