Doug Appelt's "Introduction to Information Extraction Technology". This is a very good introduction to information extraction technology. In my course I will follow Doug's structure for presenting IE.
Günter Neumann "Informationsextraktion." This is a short overview of current IE technology, written in German. It is published in Klabunde et al. (eds.): Computerlinguistik und Sprachtechnologie - Eine Einführung. Spektrum Akademischer Verlag, Heidelberg, 2001, which I also use as course material.
Ricardo Baeza-Yates & Berthier Ribeiro-Neto: Modern Information Retrieval, Addison Wesley Longman Publishing Co. Inc., 1999.
Maria Pazienza (Ed.) "Information Extraction: Towards Scalable, Adaptable Systems", Lecture Notes in Artificial Intelligence, 1714, 1999.
Eugene Charniak "Statistical Language Learning", MIT-Press, 1993.
A compact introduction to the major aspects of statistical methods used for NLP.
Christopher Manning & Heinrich Schütze "Foundations of Statistical Natural Language Processing", MIT-Press, 1999.
An extensive introduction to the major aspects of statistical methods used for NLP.
Answer extraction: TREC Conference series, in particular TREC-9
Eugene Charniak "Statistical Language Learning", MIT-Press, 1993, chapters 3 and 4.
Eric Brill "Transformation-Based Error-Driven Learning and Natural Language Processing: A Case Study in Part of Speech Tagging", Computational Linguistics, Volume 21, Number 4, 1995.
Adwait Ratnaparkhi "A Maximum Entropy Part-Of-Speech Tagger", In Proceedings of the Empirical Methods in Natural Language Processing Conference, May 17-18, 1996, University of Pennsylvania.
A. Borthwick, "A Maximum Entropy Approach to Named Entity Recognition", Ph.D. thesis, New York University, Department of Computer Science, Courant Institute, 1999.
Valentin Tablan, Cristian Ursu, Hamish Cunningham et al., Software Architecture for Language Engineering.
Slides; they offer a nice view of NE, which I will also make use of in my course.
Scientific papers about SMES (all papers are downloadable from my publication list)
T. Declerck and G. Neumann: A Cascaded Shallow Approach to Reference
In Proceedings of EuroConference on Recent Advances in NLP, RANLP-2001, Tzigov Chark, Bulgaria, 5-7 September 2001.
G. Neumann, C. Braun and J. Piskorski: A Divide-and-Conquer Strategy for Shallow Parsing of German Free Texts. In Proceedings of ANLP-2000, Seattle, Washington, pages 239-246.
G. Neumann and G. Mazzini: Domain adaptive information extraction. Technical Report, 1999.
G. Neumann, R. Backofen, J. Baur, M. Becker, C. Braun: An Information Extraction Core System for Real World German Text Processing. In Proceedings of 5th ANLP, Washington, March, 1997.
G. Neumann: Methoden zur intelligenten Informationsextraktion im Internet. In Proceedings of 20th European Congress Fair for Technical Communication, ONLINE '97, Hamburg, 1997.
M. Craven, D. DiPasquo, D. Freitag, A. McCallum, T. Mitchell, K. Nigam and S. Slattery. "Learning to Extract Symbolic Knowledge from the WWW", AAAI-98. (check also: CMU World Wide Knowledge Base (Web->KB) project).
Finkelstein-Landau and Morin, Extracting Semantic Relationships between Terms: Supervised vs. Unsupervised Methods. In Actes, International Workshop on Ontological Engineering on the Global Information Infrastructure, pages 71-80, Dagstuhl Castle, Germany, 1999.
M. Califf and R. Mooney, "Relational Learning of Pattern-Match Rules for Information Extraction", Proceedings of the AAAI Spring Symposium on Applying Machine Learning to Discourse Processing, 1998.
S. Soderland, "Learning Text Analysis Rules for Domain Specific Natural Language Processing", PhD Thesis, University of Massachusetts Amherst, 1997.