The CBOW Model
Continuous bag of words, or CBOW, is the mirror image of skip-gram. Instead of predicting the context from a center word, it predicts the center word from the surrounding context words.
How it works
- Take the context words inside a window around a target position.
- Average or sum their embeddings into a single context vector.
- Use that vector to predict the missing center word.
Because the context embeddings are combined into one vector, word order within the window is lost. CBOW treats the surroundings as an unordered bag, which is where the name comes from. Each training update adjusts the embeddings so that the combined context vector better predicts the true center word.
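The three steps above can be sketched as a forward pass in plain Python. This is a minimal illustration, not a reference implementation: the toy vocabulary, the names `cbow_forward`, `in_emb`, and `out_emb`, the random initial weights, and the use of a full softmax over the vocabulary are all assumptions made for the example. Practical toolkits usually replace the full softmax with a cheaper approximation such as negative sampling or hierarchical softmax.

```python
import math
import random

def cbow_forward(context_ids, in_emb, out_emb):
    """Return a probability for every vocabulary word being the center word."""
    dim = len(in_emb[0])
    # Step 2: average the context embeddings into a single context vector.
    h = [sum(in_emb[i][d] for i in context_ids) / len(context_ids)
         for d in range(dim)]
    # Step 3: score each vocabulary word against the context vector.
    scores = [sum(h[d] * row[d] for d in range(dim)) for row in out_emb]
    # Softmax, shifted by the max score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy setup: 5-word vocabulary, 4-dimensional embeddings.
vocab = ["the", "cat", "sat", "on", "mat"]
word_to_id = {w: i for i, w in enumerate(vocab)}
rng = random.Random(0)
dim = 4
in_emb = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in vocab]
out_emb = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in vocab]

# Step 1: context words around the missing center word "sat".
context = [word_to_id[w] for w in ["the", "cat", "on", "mat"]]
probs = cbow_forward(context, in_emb, out_emb)

# Reversing the context order gives the same prediction: the "bag" property.
shuffled = cbow_forward(context[::-1], in_emb, out_emb)
```

With untrained random weights the probabilities are close to uniform. Training would raise the probability assigned to "sat" by adjusting both embedding tables.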
When to choose it
CBOW trains faster because it makes a single prediction per target position, and it works well on frequent words because many context examples are averaged together. Skip-gram, by contrast, tends to do better on rare words and small corpora, since each (center, context) pair gives a separate training signal. The two are often offered as alternatives in the same toolkit.
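The difference in training signal can be made concrete by counting the examples each model draws from the same text. The sketch below is illustrative only: the helper names `context_window`, `cbow_examples`, and `skipgram_examples` are hypothetical, and it assumes a fixed symmetric window with no subsampling or dynamic window sizing.

```python
def context_window(tokens, pos, window):
    """Return the words within `window` positions of `pos`, excluding `pos`."""
    left = tokens[max(0, pos - window):pos]
    right = tokens[pos + 1:pos + 1 + window]
    return left + right

def cbow_examples(tokens, window):
    # One example per position: (all context words, center word).
    return [(context_window(tokens, i, window), tokens[i])
            for i in range(len(tokens))]

def skipgram_examples(tokens, window):
    # One example per (center word, single context word) pair.
    return [(tokens[i], c)
            for i in range(len(tokens))
            for c in context_window(tokens, i, window)]

tokens = "the cat sat on the mat".split()
cbow = cbow_examples(tokens, window=2)
sg = skipgram_examples(tokens, window=2)
```

For this six-token sentence with a window of 2, `cbow` holds 6 examples and `sg` holds 18 pairs. CBOW makes fewer predictions per pass, while skip-gram gives every word a separate update for each of its neighbors.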
Key idea
CBOW predicts a center word from averaged context embeddings. It trains faster than skip-gram, while skip-gram handles rare words better.