GloVe Embeddings

Word vectors from global co-occurrence statistics.

GloVe, short for Global Vectors, learns word embeddings from co-occurrence counts gathered across the entire corpus. Where skip-gram slides a window over the text and predicts local neighbors, GloVe first builds a large table of how often each pair of words appears together within a context window, then fits vectors to that table.
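The table-building step can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the function name cooccurrence_counts, the window size of 2, and the toy sentence are all assumptions made for the example, and every pair inside the window is counted equally.

```python
from collections import Counter

def cooccurrence_counts(tokens, window=2):
    """Count how often each ordered (word, context) pair falls within `window` tokens."""
    counts = Counter()
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                counts[(word, tokens[j])] += 1
    return counts

# Toy corpus, invented for illustration.
tokens = "ice is solid and steam is gas".split()
counts = cooccurrence_counts(tokens, window=2)
```

The counts are gathered once, in a single pass over the text. Everything GloVe learns afterwards comes from this table rather than from the raw sentences.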

The central insight is about ratios: the ratio of two co-occurrence probabilities carries meaning. Ice co-occurs with solid far more than steam does, while steam co-occurs with gas far more than ice does. GloVe trains vectors so that their dot products match the logarithm of the co-occurrence counts. Because a ratio becomes a difference under the logarithm, differences between word vectors encode these ratios.
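To make the ice and steam example concrete, here is a small sketch that computes the ratios. The counts and totals are invented for illustration and were not measured from any corpus; the helper names prob and ratio are likewise hypothetical.

```python
# Toy co-occurrence counts, invented for illustration (not measured from any corpus).
counts = {
    "ice":   {"solid": 80, "gas": 5,  "water": 300},
    "steam": {"solid": 8,  "gas": 70, "water": 280},
}
# Hypothetical total number of context tokens observed around each word.
totals = {"ice": 10_000, "steam": 10_000}

def prob(context, word):
    """Estimate P(context | word) from the toy counts."""
    return counts[word][context] / totals[word]

def ratio(context):
    """P(context | ice) / P(context | steam)."""
    return prob(context, "ice") / prob(context, "steam")

ratios = {k: ratio(k) for k in ("solid", "gas", "water")}
```

With these toy numbers the ratio is well above 1 for solid, well below 1 for gas, and close to 1 for water, which relates to both words about equally. A ratio far from 1 signals a context word that discriminates between ice and steam.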

This blends two traditions:

  • Count-based methods that summarize global statistics in one pass
  • Prediction-based methods, like word2vec, that learn from local objectives

A weighting function keeps very frequent pairs from dominating the objective and rare pairs from adding noise. The resulting embeddings have the same useful geometry as word2vec's, with analogies appearing as vector offsets.
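The weighting idea can be sketched as follows. The form of weight and its defaults (x_max of 100, alpha of 0.75) follow the values reported in the original GloVe paper, as do the per-word bias terms in pair_loss, which this lesson does not cover. The function names themselves are assumptions made for this example, and the code covers a single pair rather than a full training loop.

```python
import math

def weight(x, x_max=100.0, alpha=0.75):
    """Down-weight rare pairs and cap the influence of very frequent ones."""
    return (x / x_max) ** alpha if x < x_max else 1.0

def pair_loss(w, w_ctx, b, b_ctx, count):
    """Weighted squared error for one (word, context) pair with a nonzero count."""
    dot = sum(a * c for a, c in zip(w, w_ctx))
    # The model tries to make dot product plus biases equal log(count).
    error = dot + b + b_ctx - math.log(count)
    return weight(count) * error ** 2
```

A pair seen once gets a small weight, a pair seen 100 times or more gets weight 1, and nothing ever counts for more than that. Summing pair_loss over all nonzero entries of the co-occurrence table gives the quantity that training would minimize in this sketch.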

In practice, GloVe and skip-gram perform similarly. GloVe can be appealing because it uses corpus-wide statistics efficiently and trains on a compact co-occurrence matrix rather than streaming over the text repeatedly. Like word2vec, it still assigns one static vector per word, regardless of context.

Key idea

GloVe fits word vectors so that their dot products match log co-occurrence counts, learning embeddings from global corpus statistics.

Check yourself

1. What does GloVe fit its vectors to?

2. According to GloVe's central insight, what carries meaning?

3. Why does GloVe use a weighting function?