Represent Word as Vector from Word2Vec with GenSim

This short tutorial shows how to get the vector representation of a word from the pre-trained word2vec model (published by Tomas Mikolov).

Follow the steps below:

  1. Make sure a Python environment is installed on your system. We recommend installing Anaconda for working with Python.
  2. Open a Terminal and install GenSim by typing pip install --upgrade gensim.
  3. Download the pre-trained word2vec model (GoogleNews-vectors-negative300.bin) here.
  4. Now create a blank Python source file, for example test.py, with the contents below. Then, in the Terminal, run it by typing python test.py.

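import gensim

# Load the pre-trained vectors (binary word2vec format); adjust the path to where you saved the file.
model = gensim.models.KeyedVectors.load_word2vec_format(r'\word2vec\GoogleNews-vectors-negative300.bin', binary=True)
# Print the vector stored for the word 'computer'.
print(model['computer'])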

If everything works, the Terminal prints the vector for the word 'computer'. The output is a 300-dimensional vector.
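
The lookup above raises a KeyError when the word is not in the model's vocabulary. Below is a minimal sketch of a safer lookup. The helper name get_vector and the toy_model dictionary are hypothetical and are not part of GenSim; the dictionary only stands in for the loaded model so the sketch runs without the large download.

```python
def get_vector(model, word):
    """Return the vector for `word`, or None if it is not in the vocabulary."""
    try:
        return model[word]
    except KeyError:
        # Indexing raises KeyError for words the model does not contain.
        return None


# Hypothetical stand-in for a loaded model (assumption: a real KeyedVectors
# object can be passed to get_vector in the same way).
toy_model = {'computer': [0.1, -0.2, 0.3]}

print(get_vector(toy_model, 'computer'))
print(get_vector(toy_model, 'not-a-real-word'))
```

With the real model, calling get_vector(model, 'computer') should return the same vector as model['computer'].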

The sample program is available here.
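
If loading fails, a quick check of the downloaded file may help. A file in word2vec binary format starts with a text line holding the vocabulary size and the vector size. The sketch below reads that line. The function name read_word2vec_header is hypothetical, and the demonstration uses a tiny made-up file rather than the real download.

```python
import os
import tempfile


def read_word2vec_header(path):
    """Return (vocab_size, vector_size) from the first line of a word2vec file."""
    with open(path, 'rb') as f:
        header = f.readline().decode('utf-8')
    vocab_size, vector_size = (int(x) for x in header.split())
    return vocab_size, vector_size


# Demonstration on a tiny made-up file: the header says 2 words, 3 dimensions.
# The bytes after the header are placeholder data, not real vectors.
with tempfile.NamedTemporaryFile(delete=False, suffix='.bin') as tmp:
    tmp.write(b'2 3\n' + b'\x00' * 16)
    demo_path = tmp.name

demo_header = read_word2vec_header(demo_path)
print(demo_header)
os.remove(demo_path)
```

For the file used in this tutorial, the second number should be 300, matching the vector dimension mentioned above.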

About Phuc Duong [Admin]

Lecturer, Scientist of Faculty of Information Technology, Ton Duc Thang University (Vietnam).
