Lda difference visualization #1374

Merged. 2 commits merged into piskvorky:develop on May 30, 2017
Conversation

@menshikh-iv (Contributor) commented May 29, 2017

You can see the result via the nbviewer link, with interactive graphs.

Continuing #1243
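
For readers without the notebook at hand: the visualization is built on top of LdaModel.diff. A minimal sketch of the kind of call the notebook demonstrates (the toy corpus is illustrative, and the exact return shape of diff() is an assumption to verify against your gensim version):

```python
from gensim.corpora import Dictionary
from gensim.models import LdaModel

# Toy corpus for illustration; the notebook uses a real dataset.
texts = [["human", "computer", "interface"],
         ["graph", "trees", "system"],
         ["human", "system", "computer"],
         ["graph", "minors", "trees"]]
dictionary = Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]

# Two models whose topics we want to compare.
lda1 = LdaModel(corpus, id2word=dictionary, num_topics=2, passes=10, random_state=1)
lda2 = LdaModel(corpus, id2word=dictionary, num_topics=2, passes=10, random_state=2)

# diff() returns a topic-by-topic distance matrix plus per-cell word annotations
# (assumed two-value return; check it against the gensim version you run).
mdiff, annotation = lda1.diff(lda2, distance="hellinger", num_words=50)
print(mdiff)  # num_topics x num_topics matrix of topic distances
```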

@tmylk (Contributor) commented May 30, 2017

Please explain the annotation with '++' and '--' a little bit more. It's explained already; you just need to mention the + and - symbols explicitly.
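
For context, the '+++'/'---' strings in the hover text appear to be built from the per-cell annotation returned by diff(), where each cell is assumed to hold the words two topics share and the words in which they differ. A hypothetical sketch of that formatting (the annotation layout is an assumption; verify against the notebook):

```python
# Hypothetical helper: annotation[i][j] is assumed to hold two word lists,
# the tokens the two topics share and the tokens that differ between them.
def cell_hover_text(annotation, i, j):
    shared, differing = annotation[i][j]
    return "+++ {}<br>--- {}".format(", ".join(shared), ", ".join(differing))
```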

@menshikh-iv (Contributor, Author) commented May 30, 2017

@tmylk Done!

@tmylk (Contributor) commented May 30, 2017

It is better than before, but please add a very simple visualization with just two topics, in order to explain the idea more clearly.
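
One way to read this request is a plain heatmap of a 2x2 diff matrix, annotated with the distance values, shown before the full interactive plot. A rough matplotlib sketch (it reuses the two 2-topic models from the earlier sketch; not the notebook's actual code):

```python
import numpy as np
import matplotlib.pyplot as plt

# 2-topic vs 2-topic comparison, so mdiff_small is a 2x2 matrix.
mdiff_small, _ = lda1.diff(lda2, distance="hellinger")
mdiff_small = np.asarray(mdiff_small)

fig, ax = plt.subplots()
im = ax.imshow(mdiff_small, cmap="RdBu_r", vmin=0.0, vmax=1.0)
ax.set_xlabel("topics of lda2")
ax.set_ylabel("topics of lda1")
# Print the distance value inside each cell so the idea is readable at a glance.
for i in range(mdiff_small.shape[0]):
    for j in range(mdiff_small.shape[1]):
        ax.text(j, i, "{:.2f}".format(mdiff_small[i, j]), ha="center", va="center")
fig.colorbar(im)
plt.show()
```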

@tmylk merged commit 55997f8 into piskvorky:develop on May 30, 2017
@piskvorky (Owner) commented May 30, 2017

@menshikh-iv There's no way to review a notebook on GitHub, so I'll comment here instead:

  1. add full stops (.) at the end of sentences (many places)
  2. prefer list/set/dict comprehensions in place of filter/map (see the sketch after this list)
  3. use hanging indent
  4. use a normal numpy array instead of a list-of-lists (mdiff)
  5. add some intro, e.g. who is "I"?
  6. missing articles etc. -- do we have any students who are native speakers who could go over this and proofread?
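
To make points 2 and 4 concrete, a hypothetical before/after (the actual notebook cells may look different):

```python
import numpy as np

topics = ["cat dog", "fish bird", "cat fish"]

# Point 2: filter/map style ...
selected = list(filter(lambda t: "cat" in t, topics))
lengths = list(map(len, topics))

# ... rewritten as comprehensions.
selected = [t for t in topics if "cat" in t]
lengths = [len(t) for t in topics]

# Point 4: build mdiff as a numpy array rather than a list-of-lists,
# so slicing, transposing and plotting work directly.
mdiff = np.array([[0.0, 0.7],
                  [0.7, 0.0]])
```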

@parulsethi (Contributor) commented Jun 6, 2017

@menshikh-iv An example that I think could be helpful for users:

After fitting the LDA models, print an example topic using show_topics() and then explain that the distance we are calculating is basically between the topics' probability distributions/bags of words shown above.

Maybe also illustrate the process by taking an example cell from the matrix: print the two topic distributions for that cell, then apply the distance function to these two distributions/word sets and show that the result matches the value in the matrix cell.
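
A sketch of what that walkthrough could look like (get_topics() availability is an assumption about the gensim version; the hand-computed value should agree with the matrix cell up to whatever normalization diff() applies):

```python
from gensim.matutils import hellinger

# What a topic looks like as a word/probability distribution.
print(lda1.show_topics(num_topics=2, num_words=5))

# Recompute the distance for one cell of mdiff by hand.
i, j = 0, 1
topic_i = lda1.get_topics()[i]  # probability over the vocabulary (assumed API)
topic_j = lda2.get_topics()[j]
print(hellinger(topic_i, topic_j))  # should agree with the matrix cell...
print(mdiff[i][j])                  # ...up to any normalization diff() applies
```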
