This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

DrQA agent #437

Closed
jenniferzhu opened this issue Dec 7, 2017 · 22 comments

@jenniferzhu

Hi, I was redirected to ParlAI from the DrQA GitHub repository and noticed that there is a drqa agent in ParlAI. However, I cannot find any documentation on that. What would be a starting point to use the DrQA system with the ParlAI framework?

@jaseweston
Contributor

jaseweston commented Dec 7, 2017 via email

@vishnumenon

I tried using the line in the README to evaluate the pre-trained DrQA model, and I got terrible accuracy -- the final line of my output was {'total': 10570, 'accuracy': 0.01239, 'f1': 0.03498, 'hits@k': {1: 0.0124, 5: 0.0124, 10: 0.0124, 100: 0.0124}, 'train_loss': 0}. Any idea what might be causing this? I'm working on a Google VM running Ubuntu with Python 3.6 (through Anaconda).

@vishnumenon

Using the Interactive Mode gives me the answer "." for any questions/contexts I try

@alexholdenmiller
Member

Hi @vishnumenon, can you confirm which steps you ran?

@vishnumenon

vishnumenon commented Dec 11, 2017

Hi! I just ran

git clone https://github.com/facebookresearch/ParlAI.git ~/ParlAI
cd ~/ParlAI; python setup.py develop

installed spaCy (along with the 'en' language pack) because it showed up as an unsatisfied requirement, and then ran

wget https://s3.amazonaws.com/fair-data/parlai/_models/drqa/squad.mdl
python eval_model.py -m drqa -t squad -mf squad.mdl -dt valid

Am I missing any steps?

@alexholdenmiller
Member

Ah, great--just wanted to make sure. I'll debug this--it should have gotten a good accuracy (I believe I ran this successfully just two weeks ago).

@vishnumenon

Awesome, thanks!

@vishnumenon

Oh! Also another change that I made -- initially, when I tried to run eval_model, I got this error:

Traceback (most recent call last):
  File "eval_model.py", line 46, in <module>
    main()
  File "eval_model.py", line 30, in main
    agent = create_agent(opt)
  File "/home/me/ParlAI/parlai/core/agents.py", line 319, in create_agent
    return model_class(opt)
  File "/home/me/ParlAI/parlai/agents/drqa/drqa.py", line 105, in __init__
    word_dict = DrqaAgent.dictionary_class()(opt)
  File "/home/me/ParlAI/parlai/agents/drqa/drqa.py", line 53, in __init__
    super().__init__(*args, **kwargs)
  File "/home/me/ParlAI/parlai/core/dict.py", line 187, in __init__
    import spacy
  File "/home/me/anaconda3/lib/python3.6/site-packages/spacy/__init__.py", line 4, in <module>
    from .cli.info import info as cli_info
  File "/home/me/anaconda3/lib/python3.6/site-packages/spacy/cli/__init__.py", line 5, in <module>
    from .profile import profile
  File "/home/me/anaconda3/lib/python3.6/site-packages/spacy/cli/profile.py", line 7, in <module>
    import cProfile
  File "/home/me/anaconda3/lib/python3.6/cProfile.py", line 22, in <module>
    run.__doc__ = _pyprofile.run.__doc__
AttributeError: module 'profile' has no attribute 'run'

Some googling indicated that it was because of the file named 'profile' in examples, which shadows Python's standard-library profile module when scripts are run from that directory. So I changed the file's name (I couldn't find any references to it that needed updating, though I'm not sure whether I missed one) and then it seemed to run without issue.
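
For reference, a minimal sketch of the shadowing that likely produced this traceback: when Python starts in a directory that contains its own profile.py, `import profile` (which cProfile does internally) picks up that local file instead of the standard-library module. The temporary directory and the helper names below are made up for illustration and are not part of ParlAI.

```python
import os
import subprocess
import sys
import tempfile


def resolve_module_file(module_name, cwd):
    """Ask a fresh interpreter started in `cwd` which file it imports for `module_name`."""
    code = (
        f"import os, {module_name}; "
        f"print(os.path.abspath({module_name}.__file__))"
    )
    # Drop PYTHONSAFEPATH so the child keeps the default behaviour of
    # putting the current directory first on sys.path.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    result = subprocess.run(
        [sys.executable, "-c", code],
        cwd=cwd, env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def local_file_shadows_stdlib():
    """Return True if a local profile.py wins over the standard-library module."""
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical stand-in for the script named "profile" in examples/.
        local = os.path.join(workdir, "profile.py")
        with open(local, "w") as handle:
            handle.write("# local script that defines no run() function\n")
        resolved = resolve_module_file("profile", workdir)
        return os.path.realpath(resolved) == os.path.realpath(local)


if __name__ == "__main__":
    print(local_file_shadows_stdlib())
```

Renaming the local file, as described above, removes the name collision, so the standard-library module is found again.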

@vishnumenon

Update: I tried deleting everything and starting from scratch, and I now get this (seemingly unrelated) error:

[ Loading model squad.mdl ]
[ Using CUDA (GPU -1) ]
[creating task(s): squad]
loading: /home/me/ParlAI/data/SQuAD/dev-v1.1.json
Traceback (most recent call last):
  File "examples/eval_model.py", line 46, in <module>
    main()
  File "examples/eval_model.py", line 35, in main
    world.parley()
  File "/home/me/ParlAI/parlai/core/worlds.py", line 278, in parley
    acts[1] = agents[1].act()
  File "/home/me/ParlAI/parlai/agents/drqa/drqa.py", line 175, in act
    ex = self._build_ex(self.observation)
  File "/home/me/ParlAI/parlai/agents/drqa/drqa.py", line 257, in _build_ex
    inputs['document'], doc_spans = self.word_dict.span_tokenize(document)
  File "/home/me/ParlAI/parlai/core/dict.py", line 288, in span_tokenize
    if self.tokenizer == 'spacy':
AttributeError: 'SimpleDictionaryAgent' object has no attribute 'tokenizer'

@alexholdenmiller
Member

@jaseweston I'm renaming profile.py => profile_train.py

@vishnumenon I found that bug myself, fixing it now and then you should get full training accuracy again.

@vishnumenon

Sounds good, thanks!

@alexholdenmiller
Member

pushed fix for profile, PR out for dictionary loading fix: #443

@alexholdenmiller
Member

fix in! running the eval script after pulling should give you:

{'total': 10570, 'accuracy': 0.6634, 'f1': 0.7685, 'hits@k': {1: 0.663, 5: 0.663, 10: 0.663, 100: 0.663}, 'train_loss': 0}
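
To see why f1 sits above accuracy in this output, here is a minimal sketch of SQuAD-style scoring. It assumes that 'accuracy' means an exact match after normalization and that 'f1' measures token overlap between the predicted and reference answers. The normalization rules follow the common SQuAD convention and are an assumption; ParlAI's own metric code may differ in detail.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, answer):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(answer))


def token_f1(prediction, answer):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(answer).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("dark blue", "blue"), token_f1("dark blue", "blue"))
    print(exact_match(".", "blue"), token_f1(".", "blue"))
```

Under these assumptions a partially correct span earns partial f1 credit but no exact-match credit, while a prediction of "." scores zero on both.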

@vishnumenon

Yup, it's working great now, thanks!

@alexholdenmiller
Member

@jenniferzhu I'm going to close this for now but feel free to reopen if you have any more questions!

@jenniferzhu
Author

jenniferzhu commented Feb 14, 2018

@alexholdenmiller Hi, I ran most of those examples. They worked well, but my question is: how can I add my own text data so that a pre-trained model can answer questions about it? That is, if we don't have a good QA training dataset, how can we use the model on our own content?

@jaseweston
Contributor

@alexholdenmiller or @klshuster ?

@jenniferzhu
Author

jenniferzhu commented Feb 22, 2018

@alexholdenmiller I just re-pulled and then ran the same commands as @vishnumenon, but I still have the low-accuracy issue.

wget https://s3.amazonaws.com/fair-data/parlai/_models/drqa/squad.mdl
python eval_model.py -m drqa -t squad -mf squad.mdl -dt valid

Here is the output (including warnings) from interactive.py:

Bob is Blue.\nWhat is Bob?
/Users/xuanzhu/ParlAI/parlai/agents/drqa/layers.py:177: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
alpha_flat = F.softmax(scores.view(-1, y.size(1)))
/Users/xuanzhu/ParlAI/parlai/agents/drqa/layers.py:232: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
alpha = F.softmax(scores)
/Users/xuanzhu/ParlAI/parlai/agents/drqa/layers.py:212: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
alpha = F.softmax(xWy)
[DrqaAgent]: .
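
The UserWarning lines above are PyTorch deprecation notices: softmax was called without saying which dimension to normalize over, and the warning itself names the fix (pass dim=X). The pure-Python sketch below deliberately avoids PyTorch and uses a made-up helper, softmax_2d, only to illustrate why the dimension matters. Which dim each call in layers.py should use is not stated in this thread.

```python
import math


def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_2d(rows, dim):
    """Softmax over a list-of-lists matrix along an explicit dimension.

    dim=1 (or -1) normalizes each row; dim=0 normalizes each column.
    """
    if dim in (1, -1):
        return [softmax(row) for row in rows]
    if dim == 0:
        # Normalize each column, then transpose back to row-major layout.
        columns = [softmax(list(col)) for col in zip(*rows)]
        return [list(row) for row in zip(*columns)]
    raise ValueError("dim must be 0, 1, or -1 for a 2-D input")


if __name__ == "__main__":
    scores = [[1.0, 2.0], [3.0, 5.0]]
    print(softmax_2d(scores, dim=1))  # each row sums to 1
    print(softmax_2d(scores, dim=0))  # each column sums to 1
```

The two calls give different matrices for the same scores, which is why an implicit choice of dimension is ambiguous and the warning asks for it to be spelled out.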

@jenniferzhu
Author

jenniferzhu commented Feb 23, 2018

@alexholdenmiller Update: I re-installed ParlAI on a new laptop, and I still get the same problem...

@alexholdenmiller
Member

I'll check it out

@alexholdenmiller
Member

Sorry I missed the previous question: you can load the pretrained model with a different task (e.g. a custom one built from your other data) and continue training it from there.

For example...

python examples/train_model.py -t babi:task10k:3 -m drqa -mf squad.mdl -bs 32 -vtim 5

...the pretrained model gets ~46% accuracy on the first validation, but it steadily goes up from there (I got up to ~68% accuracy in only two minutes before I stopped it).

@alexholdenmiller
Member

Feel free to reopen if you have any questions! #591 fixes the low accuracy; it was an issue with loading the dictionary.

4 participants