pytorch/text

Vocab vectors using complete pretrained-embedding?

Open

#446 opened on Oct 12, 2018

enhancement, help wanted

Description

I am new to PyTorch and NLP, and I ran into a question while building a model.

Since my training dataset is fairly small, its vocab is also relatively small (around 5000 words). However, I want to handle arbitrary user input, which may contain words outside this vocabulary.

The problem is that, in the model I trained, the embedding layer's weights are built from the vectors of the field's vocab, not from the complete set of pretrained word2vec embeddings, so I cannot modify the layer after training is done.

Is there a better approach to this? Thanks in advance!
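
To make the question concrete, here is a minimal, library-agnostic sketch of one possible direction: keep the indices the model was trained on and append rows for pretrained words that the training vocab lacks. It assumes the embedding layer was kept frozen during training, so the appended pretrained rows live in the same vector space as the existing ones. The helper names `extend_vocab` and `numericalize` and the toy data are hypothetical illustrations, not part of torchtext.

```python
# Toy stand-ins (hypothetical data): a small training vocab and a larger
# pretrained table. Real word2vec vectors would have far more dimensions.
train_itos = ["<unk>", "<pad>", "cat", "dog"]
trained_rows = [
    [0.0, 0.0],  # <unk>
    [0.0, 0.0],  # <pad>
    [0.1, 0.2],  # cat
    [0.3, 0.4],  # dog
]
pretrained = {
    "cat": [0.1, 0.2],
    "dog": [0.3, 0.4],
    "bird": [0.5, 0.6],
    "fish": [0.7, 0.8],
}


def extend_vocab(itos, rows, pretrained_vectors):
    """Append pretrained words that are missing from the training vocab.

    Existing indices are left unchanged, so the rest of the trained model
    still sees the same ids for the words it was trained on.
    """
    new_itos = list(itos)
    new_rows = [list(r) for r in rows]
    known = set(itos)
    for word, vec in pretrained_vectors.items():
        if word not in known:
            new_itos.append(word)
            new_rows.append(list(vec))
            known.add(word)
    stoi = {w: i for i, w in enumerate(new_itos)}
    return new_itos, stoi, new_rows


def numericalize(tokens, stoi, unk_index=0):
    """Map tokens to ids, falling back to <unk> for unknown words."""
    return [stoi.get(t, unk_index) for t in tokens]


itos, stoi, rows = extend_vocab(train_itos, trained_rows, pretrained)
ids = numericalize(["cat", "bird", "zebra"], stoi)
```

In PyTorch, the extended `rows` could then be loaded into a new layer with `torch.nn.Embedding.from_pretrained(torch.tensor(rows), freeze=True)` and swapped in for the original embedding. Words that appear in neither the training vocab nor the pretrained table still map to `<unk>`.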
