Machine Learning

Coreference Resolution

Linking mentions such as "she" and "the doctor" that refer to the same entity.

The problem

Coreference resolution groups together all the mentions in a text that refer to the same real-world entity. In "Maria called her sister because she was late", a system must decide whether "she" is Maria or the sister.

Mentions and clusters

  • A mention is a span that can refer to an entity, such as a name, a noun phrase, or a pronoun.
  • A cluster is the set of mentions that corefer.
  • The goal is to partition all mentions into clusters.
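
The definitions above can be made concrete with a small sketch. It assumes mentions are stored as token spans and clusters as sets of mentions; the class name Mention, the helper is_partition, and the chosen reading of the sentence are all illustrative, not a standard format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    start: int  # index of the first token (inclusive)
    end: int    # index of the last token (inclusive)
    text: str


tokens = "Maria called her sister because she was late".split()

maria = Mention(0, 0, "Maria")
her = Mention(2, 2, "her")
her_sister = Mention(2, 3, "her sister")
she = Mention(5, 5, "she")
mentions = [maria, her, her_sister, she]

# One possible reading (an assumption): "she" is the sister, "her" is Maria.
clusters = [{maria, her}, {her_sister, she}]


def is_partition(mentions, clusters):
    """True if every mention appears in exactly one cluster."""
    seen = [m for cluster in clusters for m in cluster]
    no_overlap = len(seen) == len(set(seen))
    covers_all = set(seen) == set(mentions)
    return no_overlap and covers_all


print(is_partition(mentions, clusters))
```

Note that a mention can be nested inside another ("her" inside "her sister") and still belong to a different cluster.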

How models resolve it

  • For each mention, score candidate antecedents that appeared earlier.
  • Link a mention to its highest scoring antecedent, or to a dummy if it starts a new cluster.
  • Modern systems do this end-to-end, jointly detecting mentions and scoring links from contextual embeddings.
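
The linking step can be sketched as a greedy loop. This is a minimal illustration, not a real system: the function resolve, the fixed dummy score of 0.0, and the hand-written scores in toy_scores are assumptions made for the example, whereas a trained model would compute scores from contextual embeddings.

```python
def resolve(mentions, score, dummy_score=0.0):
    """Link each mention to its best earlier antecedent, or start a new cluster."""
    cluster_of = {}  # mention index -> cluster id
    clusters = []    # each cluster is a list of mention indices
    for i in range(len(mentions)):
        best_j, best = None, dummy_score
        for j in range(i):  # only mentions that appeared earlier
            s = score(mentions[j], mentions[i])
            if s > best:
                best_j, best = j, s
        if best_j is None:
            # The dummy antecedent won: this mention opens a new cluster.
            cluster_of[i] = len(clusters)
            clusters.append([i])
        else:
            # Join the cluster of the highest-scoring antecedent.
            cluster_of[i] = cluster_of[best_j]
            clusters[cluster_of[i]].append(i)
    return clusters


mentions = ["Maria", "her sister", "she"]

# Made-up scores for illustration only.
toy_scores = {
    ("Maria", "her sister"): -1.0,
    ("Maria", "she"): 0.4,
    ("her sister", "she"): 0.9,
}

clusters = resolve(mentions, lambda a, m: toy_scores[(a, m)])
print(clusters)  # [[0], [1, 2]]
```

Because every mention links to at most one antecedent, following the links always yields a valid partition into clusters.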

Signals that help

  • Gender and number agreement between a pronoun and its antecedent.
  • Distance, since a nearer mention is more likely to be the antecedent than a distant one.
  • World knowledge, for cases that grammar alone cannot settle.
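
The first two signals can be sketched as a hand-written scoring rule. Everything here is an assumption for illustration: the AnnotatedMention fields, the heuristic_score function, and the distance weight of 0.1 are hypothetical, and learned systems typically fold such cues into their scores rather than applying hard rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedMention:
    text: str
    position: int  # token index of the mention's head word
    gender: str    # "f", "m", "n", or "unknown"
    number: str    # "sg" or "pl"


def agrees(a, b):
    """Gender and number must match; an unknown gender matches anything."""
    gender_ok = "unknown" in (a.gender, b.gender) or a.gender == b.gender
    return gender_ok and a.number == b.number


def heuristic_score(antecedent, pronoun, distance_weight=0.1):
    """Rule out disagreeing candidates, then prefer the nearer one."""
    if not agrees(antecedent, pronoun):
        return float("-inf")
    distance = pronoun.position - antecedent.position
    return -distance_weight * distance


# "The doctors told Maria that she could go home."
doctors = AnnotatedMention("The doctors", 1, "unknown", "pl")
maria = AnnotatedMention("Maria", 3, "f", "sg")
she = AnnotatedMention("she", 5, "f", "sg")

best = max([doctors, maria], key=lambda c: heuristic_score(c, she))
print(best.text)  # Maria
```

In "Maria called her sister because she was late", both candidates agree with "she", so this rule would simply pick the nearer one; that is the kind of case where world knowledge is needed.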

Why it matters

Resolving coreference lets summarizers and question answering systems track an entity across many sentences instead of treating each pronoun as new.

Key idea

Coreference resolution partitions all mentions in a text into clusters, one per entity, by linking each mention to its best earlier antecedent using agreement, distance, and context.

Check yourself

1. What is the output of coreference resolution?

2. What is an antecedent?

3. Which signal commonly helps resolve a pronoun?