
Workshop 4: Train Your Own Embeddings

See how Word2Vec learns from text

What is this?

In previous workshops, we used pre-trained embedding models to analyze and compare word similarities. Now you'll train your own embedding model from scratch using a small toy model: Word2Vec.

The Tool

Go to remykarem.github.io/word2vec-demo

This is a browser-based implementation of the Word2Vec algorithm. It is limited in scope, since your browser and computer are not meant for training models, but it demonstrates the core concepts.

Step by Step

  1. Paste some text: song lyrics, a paragraph, anything short enough not to crash your browser
  2. Choose a model: CBOW (predict a word from its context) or Skip-gram (predict the context from a word)
  3. Set the window size: how many words on either side of each word count as its context
  4. Click Generate dataset to see the training examples
  5. Set the embedding size, learning rate, and epochs (or leave them as is)
  6. Click Train model and wait for training to complete
  7. Click Run t-SNE to visualize the learned embeddings in 2D
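The dataset-generation and training steps above can be sketched in plain Python. This is a toy skip-gram trainer, not the demo's actual implementation; the function names and the `window`, `dim`, `lr`, and `epochs` parameters are illustrative stand-ins for the settings you adjust in the tool:

```python
import math
import random

def build_pairs(text, window=2):
    """Slide a window over the tokens and emit (target, context) pairs,
    as the 'Generate dataset' step does for a skip-gram model."""
    tokens = text.lower().split()
    pairs = []
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

def train(pairs, dim=8, lr=0.05, epochs=50, seed=0):
    """Minimal skip-gram training: SGD on a sigmoid of the dot product,
    with one randomly drawn negative word per pair (a sketch, so the
    'negative' may occasionally be a true context word)."""
    rng = random.Random(seed)
    vocab = sorted({w for p in pairs for w in p})
    # Two vectors per word: one as target, one as context.
    vec = {w: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for w in vocab}
    ctx = {w: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for w in vocab}
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, x))))
    for _ in range(epochs):
        for target, context in pairs:
            # Push the true pair together (label 1), a random pair apart (label 0).
            for other, label in ((context, 1.0), (rng.choice(vocab), 0.0)):
                v, c = vec[target], ctx[other]
                g = lr * (sigmoid(sum(a * b for a, b in zip(v, c))) - label)
                for k in range(dim):
                    v[k], c[k] = v[k] - g * c[k], c[k] - g * v[k]
    return vec

def cosine(u, v):
    """Cosine similarity, the usual way to compare learned embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))
```

After training, comparing `cosine(vecs["cat"], vecs["dog"])` for words that appear in similar contexts is the 1-dimensional version of what the t-SNE plot shows in 2D: words with similar contexts end up with similar vectors.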

Questions to Consider