This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

utility improvements to seq2seq #845

Merged
merged 4 commits into from Jun 12, 2018

Conversation

alexholdenmiller
Member

  • still testing the OOM code to see if it helps reduce memory spikes; it's based on fairseq's code and was suggested by myle
  • changed the vector caches to reference the ParlAI data path instead of parlai_home

@alexholdenmiller
Member Author

looks like the OOM code is working! It clears out PyTorch's GPU memory cache whenever an OOM occurs during the forward or backward pass during training (not while the network weights are being updated, and not during validation) and just moves on to the next batch, logging the OOM to the metrics and printing a warning.
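For readers who want the gist, here is a minimal sketch of that pattern, assuming a standard PyTorch training step; the `metrics` dict and surrounding names are hypothetical, not ParlAI's actual implementation:

```python
import torch

def train_step(model, optimizer, criterion, batch, metrics):
    """One training step that skips the batch if CUDA runs out of memory."""
    try:
        optimizer.zero_grad()
        output = model(batch['input'])
        loss = criterion(output, batch['target'])
        loss.backward()
    except RuntimeError as e:
        if 'out of memory' in str(e):
            # Log the OOM, free cached blocks, and move on to the next batch.
            print('WARNING: ran out of memory, skipping batch')
            metrics['num_ooms'] = metrics.get('num_ooms', 0) + 1
            optimizer.zero_grad()        # drop any partial gradients
            torch.cuda.empty_cache()     # release PyTorch's cached GPU memory
            return
        raise  # any other runtime error is re-raised as usual
    optimizer.step()  # the weight update only happens if forward/backward succeeded
```

Note that only the forward and backward passes sit inside the `try`, matching the behavior described above: the optimizer step is never attempted for a batch that hit an OOM.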

@alexholdenmiller
Member Author

data change: #844
oom: #843 and others

@alexholdenmiller
Member Author

trained with batch size 350 for a bit and was able to catch some OOM spikes during training and continue

@@ -248,23 +249,15 @@ def __init__(self, opt, shared=None):
embs = vocab.GloVe(
    name='840B',
    dim=300,
    cache=os.path.join(
Contributor

just checking: is this the same as
https://github.com/facebookresearch/ParlAI/blob/master/parlai/zoo/glove_vectors/build.py ?

opt = {'datapath': datapath}
fnames = ['glove.840B.300d.zip']
download_models(opt, fnames, 'glove_vectors', use_model_type=False,
                path='http://nlp.stanford.edu/data')

not clear it is..

Member Author

we should probably just remove parlai/zoo/glove_vectors, right? torchtext has its own code for downloading its vectors.
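For context, torchtext fetches the GloVe files itself when the cache directory doesn't already contain them; a minimal sketch of that behavior, where the datapath value and the cache layout are assumptions for illustration only:

```python
import os
from torchtext import vocab

# Hypothetical datapath; ParlAI would supply this via opt['datapath'].
datapath = '/tmp/ParlAI/data'

# torchtext downloads glove.840B.300d into `cache` if it is not already there,
# which is why a separate parlai/zoo/glove_vectors build step may be redundant.
embs = vocab.GloVe(
    name='840B',
    dim=300,
    cache=os.path.join(datapath, 'models', 'glove_vectors'),
)
print(embs.vectors.size())  # roughly 2.2M x 300 for the 840B vectors
```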

Contributor

it is being used by drqa; if you can make drqa work with the other, then yes, fine! it'd be good to get drqa to work with fastText as well, anyway..

Member Author

but yes, it is the same: download_models uses os.path.join(opt['datapath'], 'models', model_folder), where model_folder here is the 'glove_vectors' string in that call
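A small sketch of the equivalence being described, with a hypothetical datapath purely for illustration:

```python
import os

# Hypothetical datapath; in ParlAI this comes from opt['datapath'].
datapath = '/tmp/ParlAI/data'

# Where download_models(opt, fnames, 'glove_vectors', ...) puts the files:
zoo_dir = os.path.join(datapath, 'models', 'glove_vectors')

# Where the torchtext cache in this PR is said to point (same layout assumed):
torchtext_cache = os.path.join(datapath, 'models', 'glove_vectors')

assert zoo_dir == torchtext_cache  # both resolve to the same directory
```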

Contributor

i thought one might be a binary file and one a text file or something? i guess just check that drqa still works, please

@jaseweston
Contributor

see comment
