A Linear Algebra Demo on Word Similarity
1. Word Embeddings (Vectors): Large Language Models (LLMs) learn a "word embedding," a mapping from each word to a high-dimensional vector. Here, we use simple 3D vectors for demonstration. Words with similar meanings are placed near each other in this vector space.
2. Cosine Similarity: To measure the similarity between two word vectors, we calculate the cosine of the angle ($ \theta $) between them. $$ \cos(\theta) = \frac{\vec{a} \cdot \vec{b}}{\|\vec{a}\| \|\vec{b}\|} $$ A smaller angle gives a cosine closer to 1 (highly similar), orthogonal vectors give 0, and opposite directions give -1. The sketch after this list computes this for a few toy vectors.
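The following Python snippet is a minimal sketch of both steps. The words and their 3D vectors are illustrative assumptions chosen by hand, not values from any trained model; the cosine similarity function implements the formula above with NumPy.

```python
import numpy as np

# Toy "word embedding": each word maps to a 3D vector.
# Hand-picked for illustration; real embeddings are learned and
# have hundreds or thousands of dimensions.
embeddings = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.85, 0.75, 0.2]),
    "apple": np.array([0.1, 0.2, 0.9]),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """cos(theta) = (a . b) / (||a|| * ||b||)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Similar words -> small angle -> cosine close to 1.
print(cosine_similarity(embeddings["king"], embeddings["queen"]))
# Unrelated words -> larger angle -> noticeably smaller cosine.
print(cosine_similarity(embeddings["king"], embeddings["apple"]))
```

With vectors from a trained model, the hand-written dictionary would be replaced by a learned embedding table, but the similarity computation stays exactly the same.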