Beyond the dot product
Classic matrix factorization scores a user item pair with a simple dot product of their vectors. Neural collaborative filtering, or NCF, asks whether a network can learn a richer interaction function that captures nonlinear taste patterns the dot product misses.
The architecture
- Look up a user embedding and an item embedding from learned tables.
- Combine them and feed the result through a stack of fully connected layers with nonlinear activations.
- Output a single score, often a probability of interaction.
A popular variant fuses two towers: a generalized matrix factorization branch that keeps the elementwise product and a multilayer perceptron branch, then concatenates both before the final layer.
Training
- Treat observed interactions as positives and sample unobserved pairs as negatives.
- Minimize a binary cross entropy loss on the predicted interaction probability.
Trade offs
- The network can model interactions that a plain dot product cannot.
- It costs more compute and can overfit on sparse data, so embeddings still dominate the parameter count.
Key idea
NCF swaps the fixed dot product for learned neural layers over user and item embeddings, capturing nonlinear interactions at the cost of extra compute and overfitting risk.