Open Information Extraction


Computational Linguistics
Master Programme
Winter Semester 2015



NOTE: The first round of oral exams will take place on 11th Feb at 11 am in my office, room 1.11, DFKI building!


General Information

Moderator: Günter Neumann


Open Information Extraction (OIE) is the task of extracting assertions from massive corpora without requiring a pre-specified vocabulary. The main goal is to extract new facts for a potentially unbounded set of relations from various sources such as knowledge bases or natural language text. In this seminar, we will study state-of-the-art methods and technology for open information extraction.
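To make the task concrete, the following minimal Python sketch shows the typical output shape of an OIE system: (argument, relation phrase, argument) assertions whose relation phrase is read off the sentence rather than chosen from a fixed inventory. The `Assertion` type, the `extract` function, and the capitalisation-based pattern are hypothetical illustrations only; they do not reproduce any of the systems discussed in the seminar.

```python
import re
from typing import List, NamedTuple


class Assertion(NamedTuple):
    """One extracted assertion: (arg1, relation phrase, arg2)."""
    arg1: str
    relation: str
    arg2: str


# Toy assumption: arguments are runs of capitalised tokens, and the
# relation phrase is the run of lowercase tokens between two arguments.
_NAME = r"[A-Z][\w-]*(?: [A-Z][\w-]*)*"
_PATTERN = re.compile(rf"({_NAME}) ((?:[a-z]+ )+)({_NAME})")


def extract(sentence: str) -> List[Assertion]:
    """Return all assertions matched by the toy pattern in one sentence."""
    return [
        Assertion(m.group(1), m.group(2).strip(), m.group(3))
        for m in _PATTERN.finditer(sentence)
    ]


if __name__ == "__main__":
    for s in ["Albert Einstein was born in Ulm.", "Google acquired YouTube."]:
        print(extract(s))
```

Note that neither "was born in" nor "acquired" is listed anywhere in the code: the relation vocabulary is open. The systems covered in the sessions below replace this toy pattern with learned extractors, part-of-speech constraints, or dependency and clause structure.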

Seminar Language: English

Available Certificate Modalities:

Placement in Study Programme:



Session Number
Organisational meeting
Günter Neumann
Günter Neumann
Saskia Reifers
Yauhen Klimovich
Gareth Dwyer
Hannah Seitz
Sylvette Loda
Natalia Skachkova
Jana Ott
Tyler Clement
Dai Quoc Nguyen
Stalin Varanasi
Julia Masloh
Yi-Ling Chung

Please click on the session number to jump to the corresponding references. If available, the topics of the presentations will be linked to the slides of the presentations.



Session 2: Overview

Session 3: TextRunner and WOE

Banko et al. (2007) Open Information Extraction from the Web, IJCAI, 2007.

Fei Wu, Daniel S. Weld (2010) Open Information Extraction using Wikipedia, ACL, 2010.

Session 4: ReVerb and OLLIE

Fader et al. (2011) Identifying Relations for Open Information Extraction, EMNLP, 2011. Cf. also

Mausam et al. (2012) Open Language Learning for Information Extraction, EMNLP, 2012. Cf. also

Session 5: Dependency-based OIE

Gamallo et al. (2012) Dependency-Based Open Information Extraction, Proceedings of ROBUS-UNSUP, 2012.

Ying Xu et al. (2013) Open Information Extraction with Tree Kernels, NAACL-HLT (2013)

Session 6: Clause-Based Open Information Extraction

Del Corro and Gemulla (2013) ClausIE: Clause-Based Open Information Extraction, WWW (2013). Cf. also

Session 7: Effectiveness and Efficiency of Open Relation Extraction

Mesquita et al. (2013) Effectiveness and Efficiency of Open Relation Extraction, EMNLP (2013). Cf. also

Schmidek and Barbosa (2014) Improving Open Relation Extraction via Sentence Re-Structuring, LREC (2014)

Session 8: Open Information Extraction via Contextual Sentence Decomposition

Bast and Haussmann (2013) Open Information Extraction via Contextual Sentence Decomposition, ICSC (2013)

Bast and Haussmann (2014) More Informative Open Information Extraction via Simple Inference, ECIR (2014)

Session 9: Leveraging Linguistic Structure For Open Domain Information Extraction

Angeli et al. (2015) Leveraging Linguistic Structure For Open Domain Information Extraction, ACL (2015)

Session 10: Multilingual OIE

Daniel Gerber and Axel-Cyrille Ngonga Ngomo (2012) Extracting Multilingual Natural-Language Patterns for RDF Predicates, KDIR (2012)

Lewis and Steedman (2013) Unsupervised Induction of Cross-lingual Semantic Relations, EMNLP (2013)

Faruqui and Kumar (2015) Multilingual Open Relation Extraction Using Cross-lingual Projection, NAACL (2015)

Session 11: Clustering Relations

Mesquita (2012) Clustering Techniques for Open Relation Extraction, SIGMOD, PhD workshop (2012)

Mohamed et al. (2011) Discovering Relations between Noun Categories, EMNLP (2011)

Session 12: Matrix Factorization

Riedel et al. (2013) Relation Extraction with Matrix Factorization and Universal Schemas, HLT-NAACL (2013)

Petroni et al. (2015) CORE: Context-Aware Open Relation Extraction with Factorization Machines, EMNLP (2015)

Session 13: Classifying Relations and Deep Learning

Zeng et al. (2014) Relation Classification via Convolutional Deep Neural Network, COLING (2014)

Yan Xu et al. (2015) Classifying Relations via Long Short Term Memory Networks along Shortest Dependency Paths, EMNLP (2015)

Session 14: Connecting Language and Knowledge Bases

Weston et al. (2013) Connecting Language and Knowledge Bases with Embedding Models for Relation Extraction, EMNLP (2013)

Wang et al. (2014) Knowledge Graph and Text Jointly Embedding, EMNLP (2014)


Written Report

Students enrolled in the Master's programme can choose to submit a written report (see available certificate modalities). The report is limited to eight pages, excluding the bibliography, and should follow the style of conference proceedings; please use the linked conference-style template (available for LaTeX and MS Word). The submission deadline is 31st March 2015. We expect you to digest the material related to your topic and perform further research. In your report, you should add value to the available information by comparing and critically assessing the approaches and by highlighting their strengths. We want to encourage you to think and develop your own opinion, and will disapprove of copy-pasting. If you have questions about the written report, we will be happy to help you.

You can turn in your report in electronic form as a PDF file. Electronic copies should be submitted via e-mail to the following addresses: