Machine Learning Reading Group (MLRG):
Machine Reading - Rule Learning, Coreference Resolution, and Learning
from Incomplete Examples
Every Thursday, 11:30 PM, in KEC 2057.
- Some basic ILP and SRL papers
1. FOIL algorithm
: Notes from Alan's CS532 course are here
2. Logan-H : Learning Horn expressions with LOGAN-H (PDF)
3. Overview of SRL models
: Sriraam's qualifier paper is here
(We will read specific papers from SRL if needed )
4. Probabilistic modeling paper : Analysis of multinomial models with unknown index using data augmentation (PDF)
- Learning Inference Rules (papers suggested by Prasad)
Discovery of Inference Rules for Question Answering (Lin, D. and
Pantel, P.), Natural Language Engineering, 7(4), 343-360, 2001. -- The
rules are generated using similarities between templates of paths.
The similarities are calculated based on a version of mutual
information. High-ranking similarities between paths are used to
generate inference rules. As a rule, recall is good but precision is
low. Moreover, the inference rules are symmetric here: X eats Y <=> X likes Y.
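A toy sketch of the path-similarity idea; the filler counts are invented, and raw frequencies stand in for the paper's mutual-information weights:

```python
from collections import Counter

# Invented slot-filler counts for the X slot of two dependency paths,
# "X eats Y" and "X likes Y". In Lin & Pantel's system these counts come
# from a parsed corpus and are weighted by mutual information.
eats_X = Counter({"John": 3, "Mary": 2, "dog": 1})
likes_X = Counter({"John": 2, "Mary": 2, "cat": 1})

def slot_similarity(c1, c2):
    """Lin-style similarity: weight of shared fillers over total weight."""
    shared = set(c1) & set(c2)
    num = sum(c1[w] + c2[w] for w in shared)
    den = sum(c1.values()) + sum(c2.values())
    return num / den if den else 0.0

# A high similarity for both slots would yield the symmetric rule
# "X eats Y <=> X likes Y".
print(slot_similarity(eats_X, likes_X))  # → 0.8181...
```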
LEDIR: An Unsupervised Algorithm for Learning Directionality
of Inference Rules
(Bhagat, R., Pantel, P., and Hovy, E.),
Proceedings of the 2007 Joint Conference on
EMNLP-CoNLL, pp. 161-170, Prague, June 2007.
-- Learned directional inference rules based on
the frequencies of occurrence of each side of the inference
rule; learns that X eats Y => X likes Y. The
directionality of the learned rules improved, but the ability to
recognize valid vs. invalid inferences did not, so precision still
suffers. For example, x likes y <=> x hates y might be
learned as a rule. The problem, it seems to me, is that x and
y are abstracted to "person" before the inference rule is learned.
I.e., the learner has not seen any evidence for (x likes
y) and (x hates y) for the same x and y! It has only seen
someone liking someone and someone else hating someone else. So in
fact, there is only evidence for believing someone likes someone
<=> someone hates someone. This seems reasonable enough, but
it is much weaker than the
inference rule that is actually learned from it! Another issue:
inference was not used during the learning process
to learn additional constraints.
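The directionality intuition can be sketched roughly as follows; LEDIR actually compares distributions over semantic classes, so the raw instance sets and the threshold below are illustrative stand-ins:

```python
# Invented instance sets for two paths p1 = "X eats Y", p2 = "X likes Y".
eats_pairs = {("John", "apples"), ("Mary", "rice")}
likes_pairs = {("John", "apples"), ("Mary", "rice"),
               ("Ann", "jazz"), ("Bob", "chess")}

def direction(p1, p2, threshold=0.8):
    # If p2 covers almost all of p1's instances but not vice versa,
    # prefer the directional rule p1 => p2.
    cover12 = len(p1 & p2) / len(p1)
    cover21 = len(p1 & p2) / len(p2)
    if cover12 >= threshold and cover21 < threshold:
        return "p1 => p2"
    if cover21 >= threshold and cover12 < threshold:
        return "p2 => p1"
    return "p1 <=> p2"

print(direction(eats_pairs, likes_pairs))  # → p1 => p2
```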
Harabagiu, S. and Hickl, A. Methods for Using Textual Entailment
in Open-Domain Question Answering. In Proceedings of ACL 2006, pp.
905-912, Sydney, Australia. -- Have not read this. Apparently it
showed that directional textual entailment alone can improve
question answering without other inference mechanisms
(according to Bhagat et al.).
4. Szpektor, I.; Tanev,
H.; Dagan, I.; and Coppola, B. 2004. Scaling Web-Based Acquisition
of Entailment Relations. In Proceedings of EMNLP 2004, pp. 41-48.
5. Chklovski, T. and Pantel, P.
2004. VerbOCEAN: Mining the Web for Fine-Grained Semantic Verb
Relations. In Proceedings of EMNLP 2004, Barcelona, Spain.
Rodrigo de Salvo Braz, Roxana Girju, Vasin Punyakanok, Dan
Roth, Mark Sammons: An Inference Model for Semantic
Entailment in Natural Language. Lecture Notes in Computer Science,
Springer Berlin / Heidelberg Volume 3944/2006, Book: Machine
Learning Challenges. -- This paper treats inference as
optimization and does not discuss learning inference rules.
Claire Nedellec: Corpus-Based Learning of Semantic Relations by the ILP
System Asium. Learning Language in Logic, 1999.
A paper by Ritter, Etzioni et al. on learning functional relations,
e.g., employeeOf(person, Company) is a function but colleagueOf(x, y) is not.
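A minimal version of such a functionality test, run over invented extractions (the threshold and relation names are assumptions for illustration):

```python
from collections import defaultdict

# Invented extracted tuples: employeeOf maps each person to one company,
# colleagueOf does not.
extractions = {
    "employeeOf": [("ann", "acme"), ("ann", "acme"), ("bob", "initech")],
    "colleagueOf": [("ann", "bob"), ("ann", "carl"), ("bob", "dave")],
}

def looks_functional(pairs, threshold=0.9):
    ys = defaultdict(set)
    for x, y in pairs:
        ys[x].add(y)
    # Fraction of first arguments mapped to a single distinct second argument.
    return sum(len(s) == 1 for s in ys.values()) / len(ys) >= threshold

print(looks_functional(extractions["employeeOf"]))   # → True
print(looks_functional(extractions["colleagueOf"]))  # → False
```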
Subgroup discovery: Gamberger, D. and Lavrac, N. 2002. Descriptive
Induction through Subgroup Discovery: A Case Study in a Medical
Domain. In Proceedings of the Nineteenth International Conference
on Machine Learning (July 08-12, 2002). C. Sammut and A. G.
Hoffmann, Eds. Morgan Kaufmann Publishers, San Francisco, CA.
10. Markov Logic Networks paper by Richardson and Domingos. MLNs are
schematized versions of undirected graphical models over relational
atoms. There is a lot of current work on using these in lifted
inference and in comparisons to directed relational models like
probabilistic relational models. This is the basic MLN paper.
http://www.springerlink.com/content/w55p98p426l6405q/fulltext.pdf
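The core of the MLN semantics fits in a few lines: a world's unnormalized probability is exp of the weighted count of satisfied clause groundings. The clause and weight below follow the standard smoking example from the MLN literature; the world itself is invented:

```python
import math

people = ["ann", "bob"]
world = {("Smokes", "ann"), ("Cancer", "ann")}  # the true ground atoms

def clause_holds(p):
    # Grounding of the weighted clause Smokes(p) => Cancer(p).
    return ("Smokes", p) not in world or ("Cancer", p) in world

w = 1.5                                   # clause weight
n = sum(clause_holds(p) for p in people)  # count of satisfied groundings
weight = math.exp(w * n)                  # unnormalized probability of world
print(n, weight)
```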
11. Claudien paper - learning from interpretations
An interpretation is an assignment of truth values to all
ground atoms, e.g., author(paper23, JohnDoe). Given a theory, a
positive interpretation satisfies the theory; a negative
interpretation does not. Claudien learns a clausal theory
(a conjunction of Horn clauses) from a set of positive and negative
interpretations.
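A minimal sketch of the learning-from-interpretations setting; the clause and interpretations are toy examples, and Claudien's actual search over candidate clauses is not shown:

```python
# An interpretation is a set of true ground atoms; a clausal theory accepts
# it iff every clause (body => head) is satisfied.
def satisfies(interp, clauses):
    return all(head in interp or any(b not in interp for b in body)
               for body, head in clauses)

# Toy ground clause: author(paper23, JohnDoe) => person(JohnDoe).
clause = ((("author", "paper23", "JohnDoe"),), ("person", "JohnDoe"))
theory = [clause]

pos = {("author", "paper23", "JohnDoe"), ("person", "JohnDoe")}
neg = {("author", "paper23", "JohnDoe")}  # head atom missing

print(satisfies(pos, theory))  # → True  (positive interpretation)
print(satisfies(neg, theory))  # → False (negative interpretation)
```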
12. Natural Logic for Textual Inference describes the NatLog system.
-- The system applies a set of inference rules directly to natural
language sentences to derive natural inferences, e.g., John does not
work in the US. => John does not work in New York.
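The cited inference hinges on monotonicity: negation creates a downward-monotone context, so a general location can be replaced by a more specific one. A toy sketch, where the hypernym table stands in for WordNet-style knowledge:

```python
hypernyms = {"New York": "US"}  # New York specializes US

def entails(loc_premise, loc_hypothesis, negated):
    if negated:
        # Downward monotone: "not ... in US" entails "not ... in New York".
        return hypernyms.get(loc_hypothesis) == loc_premise
    # Upward monotone: "... in New York" entails "... in US".
    return hypernyms.get(loc_premise) == loc_hypothesis

print(entails("US", "New York", negated=True))   # → True
print(entails("US", "New York", negated=False))  # → False
```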
13. MacCartney and Manning's COLING 2008 paper on extending NatLog
14. Bill MacCartney's Stanford thesis on natural language inference
- On Learning Temporal Structure of Events
- Logical Hidden Markov Models (JAIR paper)
Learning Partially Observable Deterministic Action Models (JAIR paper)
- First-Order Logical Filtering (AIJ paper)
- Reasoning about Deterministic Actions with Probabilistic Prior and Application to Stochastic Filtering (KR conference paper)
- Kai-Wei Chang and Rajhans Samdani and Alla Rozovskaya and Nick Rizzolo and Mark Sammons and Dan Roth, Inference Protocols for Coreference Resolution. Proceedings of the Fifteenth Conference on Computational Natural Language Learning: Shared Task (2011) pp. 40--44
- Understanding the Value of Features for Coreference Resolution, E. Bengtson and D. Roth, EMNLP - 2008
- Constraint-Based Entity Matching , W. Shen, X. Li and A. Doan , Proceedings of the National Conference on Artificial Intelligence (AAAI) - 2005
- Syntactic Parsing for Ranking-Based Coreference Resolution. Altaf Rahman and Vincent Ng. Proceedings of the 5th International Joint Conference on Natural Language Processing (IJCNLP-11), 2011.
- Ensemble-Based Coreference Resolution. Altaf Rahman and Vincent Ng. Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI-11), 2011.
- Coreference Resolution with World Knowledge. Altaf Rahman and Vincent Ng. Main Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT), 2011.
- Narrowing the Modeling Gap: A Cluster-Ranking Approach to Coreference Resolution. Altaf Rahman and Vincent Ng.
Journal of Artificial Intelligence Research 40, pages 469-521, 2011. (This is an expanded version of the Rahman & Ng EMNLP 2009
paper. It proposes the cluster-ranking model, which solidly advances the
state of the art in coreference modeling.)
- Supervised Noun Phrase Coreference Research: The First Fifteen Years. Vincent Ng. Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL-10), 2010.
- Supervised Models for Coreference Resolution. Altaf Rahman and Vincent Ng. Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (EMNLP-09), 2009.
- Semantic Class Induction and Coreference Resolution. Vincent Ng. Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL-07), 2007.
- Shallow Semantics for Coreference Resolution. Vincent Ng. Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (IJCAI-07), 2007.
- Machine Learning for Coreference Resolution: From Local Classification to Global Ranking. Vincent Ng. Proceedings of the 43rd Annual Meeting of the Association for Computational
Linguistics (ACL-05), 2005.
- Bootstrapping Coreference Classifiers with Multiple Machine Learning Algorithms. Vincent Ng and Claire Cardie. Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing (EMNLP-03), 2003.
- Weakly Supervised Natural
Language Learning Without Redundant Views. Vincent Ng and Claire Cardie. Proceedings of the Human Language Technology Conference of the North American
Chapter of the Association for Computational Linguistics (HLT-NAACL), 2003.
- Machine Learning for Coreference Resolution: Recent Successes and Future Directions. Vincent Ng. Cornell University Technical Report CUL.CIS/TR2003-1918, 2003.
- Knowledge Base Population: Successful Approaches and Challenges. Heng Ji and Ralph Grishman. Proceedings of ACL, 2011.
- Coreference Resolution in a Modular, Entity-Centered Model, Aria Haghighi and Dan Klein, Proceedings of NAACL 2010.
- Simple Coreference Resolution with Rich Syntactic and Semantic Features, Aria Haghighi and Dan Klein, Proceedings of EMNLP 2009.
- Large-Scale Cross-Document Coreference Using Distributed Inference and Hierarchical Models. Sameer Singh, Amarnag Subramanya, Fernando
Pereira, Andrew McCallum. Association for Computational Linguistics: Human Language Technologies (ACL HLT), 2011
- SampleRank: Training Factor Graphs with Atomic Gradients.
Michael Wick, Khashayar Rohanimanesh, Kedar Bellare, Aron Culotta,
Andrew McCallum. Proceedings of the International Conference on Machine
Learning (ICML), 2011.
- Learning and Inference for Partition-wise Models of Coreference
Resolution. Michael Wick and Andrew McCallum. University of
Massachusetts Technical Report #UM-CS-2009-028 (TR), 2009.
- An Entity Based Model for Coreference Resolution.
Michael Wick, Aron Culotta, Khashayar Rohanimanesh, Andrew McCallum.
Proceedings of the SIAM International Conference on Data Mining (SDM),
Reno, Nevada, 2009
- Joint Unsupervised Coreference Resolution with Markov Logic. Hoifung
Poon and Pedro Domingos. Proceedings of the 2008 Conference on Empirical
Methods in Natural Language Processing (pp. 649-658), 2008. Honolulu, HI: ACL.
- T. Finley, Supervised Clustering with Structural SVMs, PhD Thesis, Cornell University, Department of Computer Science, 2008.
- Dan Goldwasser and Dan Roth, Learning from Natural Instructions. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) (2011)
- Abductive Plan Recognition by Extending Bayesian Logic Programs. To Appear In Proceedings
of the European Conference on Machine Learning/Principles and Practice
of Knowledge Discovery in Databases (ECML-PKDD 2011), September 2011.
- Extending Bayesian Logic Programs for Plan Recognition and Machine Reading. Technical Report, PhD proposal, Department of Computer Science, The University of Texas at Austin, May 2011.
- Learning to Interpret Natural Language Navigation Instructions from Observations. To Appear In Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI-2011), 2011.
- Abductive Markov Logic for Plan Recognition. To Appear In Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI-2011), 2011.
- Which Clustering Do You Want? Inducing Your Ideal Clustering with Minimal Feedback. Sajib Dasgupta and Vincent Ng. Journal of Artificial Intelligence Research 39, 2010.
- S.R.K. Branavan, David Silver, and Regina Barzilay "Learning to Win by Reading Manuals in a Monte-Carlo Framework",
Proceedings of ACL, 2011.
- S.R.K. Branavan, David Silver, and Regina Barzilay "Non-Linear Monte-Carlo Search in Civilization II",
Proceedings of IJCAI, 2011.
- S.R.K. Branavan, Luke Zettlemoyer and Regina Barzilay "Reading Between the Lines: Learning to Map High-level Instructions to Commands",
Proceedings of ACL, 2010
- S.R.K. Branavan, Harr Chen, Luke Zettlemoyer and Regina Barzilay "Reinforcement Learning for Mapping Instructions to Actions", Proceedings of ACL, 2009. Best Paper Award
(Below papers are from Nimar Arora's NLP reading list)
Jerry R. Hobbs (1986):
Overview of the TACITUS Project
gives a brief glimpse of the knowledge representation scheme, which uses
predicates for each word of the sentence and treats a derivation as
the interpretation. It also hints at issues in temporal reasoning.
John Bear and Jerry R. Hobbs (1988):
Localizing Expression of Ambiguity discusses how to capture
attachment and other ambiguities in the logical form of a sentence instead of
creating multiple logical forms. The ambiguities are captured as a
disjunction of possible entity or action variables that special 'y'
variables could be identical to.
Jerry R. Hobbs, Mark Stickel, Paul Martin, Douglas Edwards (1990):
Interpretation as Abduction describes how abductive reasoning
(an unsound logical inference process) can be used to understand natural
language.
Patrick Blackburn, Johan Bos, Michael Kohlhase (1998):
Automated Theorem Proving for Natural Language Understanding
shows how to transform sentences in Discourse Representation Theory to
first order logic.
Ricardo Santos (2000):
Donald Davidson On The Logical Form of Action Sentences
describes and justifies the Davidsonian view on logical forms. The
logical form of a sentence must capture the entailment relation between
the sentence and other sentences. Actions (and entities) should be
represented by variables and their descriptions by predicates in the
logical form - because a single action can have multiple
descriptions. Also, prepositions should have their own predicates which
modify the action.
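Davidson's own example makes this concrete: in "Jones buttered the toast slowly with a knife", the action becomes an event variable and each modifier a conjunct, so dropping a conjunct yields the entailed "Jones buttered the toast". (The notation below is the common neo-Davidsonian rendering, not a quotation from the paper.)

```latex
\exists e\,[\mathit{buttering}(e) \wedge \mathit{agent}(e, \mathit{Jones})
  \wedge \mathit{patient}(e, \mathit{toast}) \wedge \mathit{slow}(e)
  \wedge \mathit{with}(e, \mathit{knife})]
```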
Dan I. Moldovan, Vasile Rus (2001):
Logic Form Transformation of WordNet and its Applicability to Question Answering
takes glosses found in WordNet, parses them, and converts them to a logic form.
Some papers from the Question Answering literature:
Lynette Hirschman, Marc Light, Eric Breck, and John D. Burger (1999):
Deep Read: A Reading Comprehension System
uses a bag of words to find the answer sentence which has the best
intersection with the question. The bag of words consists of the stemmed
words in the sentence along with semantic labels like :PERSON and
:LOCATION, and personal pronouns replaced by the last :PERSON named
entity. Some other heuristics include preferring longer matching words
and preferring sentences which appear earlier in the document.
Performs at 33% (HumSentAcc) on the Remedia corpus. (36% with perfect
name and stem resolution)
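The matching step can be sketched in a few lines; the "stemmer", story, and question are crude illustrative stand-ins for Deep Read's actual components:

```python
def bag(text):
    # Crude stand-in for stemming + normalization.
    return {w.lower().rstrip("s.?") for w in text.split()}

def best_sentence(question, sentences):
    q = bag(question)
    # Ties go to earlier sentences, matching the paper's preference for
    # sentences that appear earlier in the document.
    return max(sentences,
               key=lambda s: (len(bag(s) & q), -sentences.index(s)))

story = ["Maria lives in Boston.", "She works at a school.", "Boston is cold."]
print(best_sentence("Where does Maria live?", story))  # → Maria lives in Boston.
```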
Eugene Charniak, Yasemin Altun, Rodrigo de Salvo Braz, Benjamin
Garrett, Margaret Kosmala, Tomer Moscovich, Lixin Pang, Changbee Pyo,
Ye Sun, Wei Wu (2000):
Reading Comprehension Programs in a Statistical-Language-Processing Class
The same bag-of-words approach with a few tweaks to push up the numbers a
little bit: specifically, a bag-of-verbs, tf-idf based matching
instead of set intersection, and special rules for each question
type. This work shows that down-weighting stop words is better than
removing them altogether. Also, many times the correct answer is not
in the sentence with the best match but in the preceding or the
following sentence.
Performs at 41% (HumSentAcc) on the Remedia corpus.
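The tf-idf tweak can be sketched like this; the corpus is a toy, and the real system's weighting details differ:

```python
import math
from collections import Counter

docs = ["the dog ate the bone", "the cat slept", "a dog barked"]

def idf(word):
    df = sum(word in d.split() for d in docs)  # document frequency
    return math.log(len(docs) / df) if df else 0.0

def score(question, sentence):
    # tf-idf-weighted overlap instead of plain set intersection:
    # stop words like "the" get a small but nonzero weight.
    q, s = Counter(question.split()), set(sentence.split())
    return sum(tf * idf(w) for w, tf in q.items() if w in s)

# Content words ("ate", "bone") dominate the match against docs[0].
print(score("who ate the bone", docs[0]))
```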
Ellen Riloff and Michael Thelen (2000):
A Rule-based Question Answering System for Reading Comprehension Tests
is also a bag-of-words approach augmented with semantic classes (HUMAN,
LOCATION, MONTH, TIME). Specific score rules are hand constructed for
different question types.
Performs at 40% (HumSentAcc) on the Remedia corpus.
Sanda M. Harabagiu, Steven J. Maiorano, and Marius A. Pasca (2003):
Open-Domain Textual Question Answering
- question stem analysis and disambiguation
- uses 24 named-entity categories
- answer type detection
- mapping of named-entity to answer type
Performs at 65.3% (HumSentAcc) on the Remedia Corpus (76.4% with perfect
named entity resolution and coreference resolution). There are no results
provided for their system's named entity resolution and coreference
resolution -- the first number has named entity resolution only.
Eugene Grois and David C. Wilkins (2005):
Learning Strategies for Story Comprehension:
A Reinforcement Learning Approach
Performs at 48% (HumSentAcc) on the Remedia Corpus.
Ben Wellner, Lisa Ferro,
Warren Greiff and Lynette Hirschman (2006):
Reading Comprehension Tests for Computer-Based Understanding Evaluation
creates a logical form of the question and answer, and uses abductive
reasoning to match the two.
Performs at 46% (inexact) on the Remedia Corpus.