Two notions of close
When you compare vectors, you need a single number that says how close they are. The two common choices are cosine similarity, which measures the angle between two vectors (higher means closer), and Euclidean distance, which measures the straight-line distance between their tips (lower means closer).
How they differ
- Cosine ignores magnitude and looks only at direction, so a long vector and a short vector pointing the same way get a similarity of 1, exactly as if they were identical.
- Euclidean is sensitive to magnitude, so two vectors pointing the same way can still be far apart when their lengths differ.
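The contrast can be seen on a toy pair of vectors. This is a minimal sketch in plain Python; the helper names `cosine_similarity` and `euclidean_distance` and the two example vectors are made up for illustration and do not come from any particular library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between the tips of a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Illustrative vectors: same direction, but one is three times longer.
short_vec = [1.0, 2.0]
long_vec = [3.0, 6.0]

cos = cosine_similarity(short_vec, long_vec)
dist = euclidean_distance(short_vec, long_vec)

print(cos)   # approximately 1: the direction is identical
print(dist)  # clearly non-zero: the lengths differ
```

Cosine reports the pair as a perfect match, while Euclidean reports a gap of roughly 4.47 that comes entirely from the difference in length.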
Which to use
Many text embedding models are trained with cosine in mind, since document length should not dominate meaning. If vectors are normalized to unit length, cosine and Euclidean produce the same ranking: for unit vectors a and b, ||a - b||^2 = 2 - 2 cos(theta), so the distance shrinks exactly as the cosine similarity grows.
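The agreement after normalization can be checked numerically. The sketch below assumes a small set of made-up 3-dimensional vectors and hypothetical helpers (`normalize`, `dot`, `squared_distance`); it verifies that, for unit vectors, the squared distance equals 2 minus twice the dot product, and that both metrics order the documents the same way.

```python
import math

def normalize(v):
    """Scale v to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    """Dot product; equals cosine similarity when a and b are unit vectors."""
    return sum(x * y for x, y in zip(a, b))

def squared_distance(a, b):
    """Squared Euclidean distance (same ordering as the distance itself)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Illustrative data, not real embeddings.
query = normalize([1.0, 0.5, 0.0])
docs = [normalize(d) for d in ([2.0, 1.0, 0.1], [0.0, 1.0, 1.0], [5.0, 0.0, 1.0])]

# Unit-vector identity: ||a - b||^2 == 2 - 2 * (a . b)
for d in docs:
    assert math.isclose(squared_distance(query, d), 2 - 2 * dot(query, d), abs_tol=1e-12)

# Rank document indices: highest cosine first, smallest distance first.
by_cosine = sorted(range(len(docs)), key=lambda i: -dot(query, docs[i]))
by_euclidean = sorted(range(len(docs)), key=lambda i: squared_distance(query, docs[i]))

print(by_cosine == by_euclidean)  # True
```

Ranking by squared distance avoids a square root and gives the same order as ranking by distance, because the square root is monotonic.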
A practical note
Always match the metric your embedding model was trained with. Using Euclidean distance on unnormalized vectors meant for cosine can quietly degrade recall, since vector length then influences the ranking in a way the training objective never accounted for.
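A toy case shows how the mismatch changes results. In this sketch the query and the two candidates are invented vectors with hypothetical labels, left unnormalized on purpose; the nearest neighbor differs depending on which metric is used.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def euclidean_distance(a, b):
    """Straight-line distance between the tips of a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Illustrative, unnormalized vectors (not real embeddings).
query = [1.0, 1.0]
candidates = {
    "same_direction_long": [10.0, 10.0],     # same direction as query, much longer
    "different_direction_near": [1.0, 0.0],  # 45 degrees away, but close by
}

# Best match is the highest similarity or the lowest distance.
best_by_cosine = max(candidates, key=lambda k: cosine_similarity(query, candidates[k]))
best_by_euclidean = min(candidates, key=lambda k: euclidean_distance(query, candidates[k]))

print(best_by_cosine)     # same_direction_long
print(best_by_euclidean)  # different_direction_near
```

If the model was trained so that direction carries the meaning, the Euclidean answer here is the wrong neighbor, and nothing in the output signals that anything went wrong.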
Key idea
Cosine compares direction, while Euclidean compares position and therefore magnitude as well. After normalization to unit length the two produce the same ranking; without it they can disagree, so the right metric is the one the embedding model expects.