Data-Oriented Parsing and Generation

Advanced seminar (Hauptseminar): Computational Linguistics, second stage of studies

Course leader: Günter Neumann
Where: Building 17.2, conference room 2.11
When: Tuesdays, 14:00-16:00
Initial meeting: 19.04.2005

Abstract

Data-Oriented Parsing (DOP) models embody the assumption that humans produce and interpret natural language utterances by invoking representations of their concrete past language experience, rather than the rules of a consistent and non-redundant competence grammar. DOP models therefore maintain large corpora of sentences annotated with syntactic structures. They analyze new input sentences by combining partial structures from the corpus, and use the occurrence frequencies of these structures to estimate which of the resulting analyses is the most probable. In this seminar we will take a closer look at the computational and linguistic aspects of DOP. Recently, the first data-oriented methods for natural language generation have been proposed; we will discuss these at the end of the seminar.
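
To make the idea of combining corpus fragments and scoring them by frequency concrete, here is a minimal sketch in the spirit of the basic DOP model. It is an illustration under stated assumptions, not the seminar's reference implementation: the two-sentence treebank, the tuple encoding of trees, and all function names are hypothetical. A fragment's probability is taken to be its count divided by the total count of fragments with the same root label, and a derivation's probability is the product of its fragments' probabilities. The sketch scores a single derivation only; a full model would sum over all derivations that yield the same tree.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import prod

# Hypothetical toy treebank. A tree is (label, child, ...); a word is a string.
CORPUS = [
    ("S", ("NP", "John"), ("VP", ("V", "likes"), ("NP", "Mary"))),
    ("S", ("NP", "Peter"), ("VP", ("V", "hates"), ("NP", "Susan"))),
]

def fragments_at(node):
    """Return all fragments rooted at `node`."""
    label, *children = node
    options = []
    for child in children:
        if isinstance(child, str):
            # Words are always kept inside the fragment.
            options.append([child])
        else:
            # Either cut the child off, leaving an open substitution site
            # (label,), or continue with any fragment rooted at the child.
            options.append([(child[0],)] + fragments_at(child))
    return [(label, *combo) for combo in product(*options)]

def all_fragments(tree):
    """Return the fragments rooted at every nonterminal node of `tree`."""
    frags = fragments_at(tree)
    for child in tree[1:]:
        if not isinstance(child, str):
            frags += all_fragments(child)
    return frags

# Occurrence counts of each fragment, and totals per root label.
counts = Counter(f for tree in CORPUS for f in all_fragments(tree))
root_totals = Counter()
for frag, n in counts.items():
    root_totals[frag[0]] += n

def fragment_prob(frag):
    """Relative frequency of `frag` among fragments with the same root."""
    return Fraction(counts[frag], root_totals[frag[0]])

def derivation_prob(frags):
    """Probability of one derivation: product of its fragment probabilities."""
    return prod(fragment_prob(f) for f in frags)

# One derivation of the unseen sentence "Mary likes Susan":
# an S fragment with two open NP sites, filled by NP fragments.
derivation = [
    ("S", ("NP",), ("VP", ("V", "likes"), ("NP",))),
    ("NP", "Mary"),
    ("NP", "Susan"),
]
print(derivation_prob(derivation))  # 1/320
```

In this toy corpus there are 20 fragment occurrences rooted in S and 4 rooted in NP, so the derivation above receives probability 1/20 · 1/4 · 1/4 = 1/320. Exact fractions are used instead of floats so that the arithmetic can be checked by hand.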
Main references
Rens Bod, Remko Scha and Khalil Sima'an (eds.): Data-Oriented Parsing. Stanford: CSLI Publications, 2003. 410 pp;
(the DOP-book)
as well as additional copies of relevant conference and journal papers

Seminar language: English

Certificates
Talk (2 LP/4 LP) or talk + term paper (4 LP/9 LP)

Concerning the Master programme:
M.Sc. Program: Specialization Course L (linguistics)

Credit points:
Diplom/M.Sc.: talk only 2 LP/4 LP; talk + paper 4 LP/9 LP

Schedule (Topics)

Date | Session / Topic | Main reference | Speaker | Presentation
19.4.2005 | Organisational matters / References / Theme discussion | | |
03.5.2005 | Session 01: Basic DOP Model | DOP-book: ch 2 & papers | Günter Neumann | dop/ebl.ppt
10.5.2005 | Session 02: Tree-Gram Parsing | DOP-book: ch 11 | Joel Wagner | tree-gram.pdf
17.5.2005 | Session 03: Supertagging | DOP-book: ch 15 | Dafydd Jones | supertagging.pdf
24.5.2005 | Session 04: Statistical Parsing with TAG | DOP-book: ch 16 | Alejandro Figueroa | ltag-dop.ppt
31.5.2005 | no session | NO TOPIC | |
07.6.2005 | no session | NO TOPIC | |
14.6.2005 | Session 05: DOP-Model for LFG | DOP-book: ch 12 | Kathrin Spreyer | lfg-dop.pdf
21.6.2005 | Session 06: DOP-Model for HPSG | DOP-book: ch 13 | Xiwen Cheng | hpsg-dop.ppt
28.6.2005 | Session 07: Data-oriented Generation: Explanation-based Learning | papers | Svenia Meyer | ebl-nlg.pdf
05.7.2005 | Session 08: Data-oriented Generation: Statistical approaches | papers | Bettina Fromkorth |
12.7.2005 | Final discussion: interactive discussion of pros and cons | the previous talks | all |


Paper Report

For the written report, we will use a standard conference style, which is available for LaTeX and MS Word. The ZIP file contains the corresponding versions of the formatting instructions. Important:

References

Session 01:

Session 02:

Session 03:

Session 04:

Session 05:

Session 06:

Session 07:

Session 08:

Links

Remko Scha's background material on Data-Oriented Parsing (DOP)

Probabilistic Grammars and Data-Oriented Parsing  (course by Detlef Prescher, Remko Scha, and Khalil Sima'an)

What is DOP? by Rens Bod

SuperTagging without Tears

Statistical natural language processing and corpus-based computational linguistics: An annotated list of resources