
add docs TypicalLogitsWarper #25140

Closed
wants to merge 14 commits

Conversation

@akshayamadhuri commented Jul 27, 2023

What does this PR do?

Added a docstring to TypicalLogitsWarper, with some examples as well.

Related to #24783

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [ ] Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@akshayamadhuri (Author):

@gante, let me know if any changes are needed.

@akshayamadhuri changed the title from "Update logits_process.py" to "add docs TypicalLogitsWarper" on Jul 27, 2023
@gante (Member) left a comment:

Thank you for opening the PR 🙌

A few nits to improve information density in the docs, and to improve the example.

src/transformers/generation/logits_process.py: 5 inline review comments (outdated, resolved)
@gante (Member) commented Jul 28, 2023:

You also need to run make fixup before your next commit, so that our CI becomes happy :D

@gante (Member) left a comment:

See the note in the example -- and don't forget to run make fixup on your end before the commit, otherwise our tests will block your PR! 🤗

src/transformers/generation/logits_process.py: 1 inline review comment (outdated, resolved)
@gante (Member) left a comment:

Thank you for iterating :)

You still need to run make fixup and commit the resulting changes; CI is complaining about formatting.

Comment on lines 425 to 427

>>> # Set up the warper with desired parameters
>>> warper = TypicalLogitsWarper(tikohn_n=3, pi=0.95)
@gante (Member) commented Aug 3, 2023:

Suggested change (remove these lines):
>>> # Set up the warper with desired parameters
>>> warper = TypicalLogitsWarper(tikohn_n=3, pi=0.95)

(this is not needed, since we set typical_p in generate)
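
For reference, a minimal sketch of the pattern the note describes, in which typical_p is passed straight to generate; the gpt2 checkpoint and prompt here are illustrative assumptions, not part of the PR:

>>> from transformers import AutoModelForCausalLM, AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("gpt2")
>>> model = AutoModelForCausalLM.from_pretrained("gpt2")
>>> input_ids = tokenizer("Today is a nice day and", return_tensors="pt").input_ids

>>> # Passing typical_p to generate() enables typical sampling internally,
>>> # so the warper never needs to be instantiated by hand.
>>> output = model.generate(input_ids, do_sample=True, max_length=50, typical_p=0.95)
>>> print(tokenizer.decode(output[0], skip_special_tokens=True))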

@gante requested a review from amyeroberts on August 3, 2023 at 14:51.
@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@amyeroberts (Collaborator) left a comment:

Thanks for adding this! 🤗

Just some nits on the formatting

Comment on lines 406 to 408
The proportion of probability mass to retain while warping the logits. The value should be between 0 and 1.
Higher values (close to 1.0) retain more probability mass, leading to more typical sampling, whereas lower
values (close to 0.0) retain less probability mass, leading to more diverse sampling. The default is 0.9.
@amyeroberts (Collaborator):

nit: We don't use the extra indent elsewhere for continued paragraphs

Suggested change
The proportion of probability mass to retain while warping the logits. The value should be between 0 and 1.
Higher values (close to 1.0) retain more probability mass, leading to more typical sampling, whereas lower
values (close to 0.0) retain less probability mass, leading to more diverse sampling. The default is 0.9.
The proportion of probability mass to retain while warping the logits. The value should be between 0 and 1.
Higher values (close to 1.0) retain more probability mass, leading to more typical sampling, whereas lower
values (close to 0.0) retain less probability mass, leading to more diverse sampling.

Comment on lines 410 to 412
The value used to filter out logits that fall below this threshold. Any logits less than this value will be
set to -infinity before applying the softmax function. This helps in excluding unlikely tokens during sampling.
Default is -infinity.
@amyeroberts (Collaborator):

nit: same as above re: indent. We don't need to say the default in the description, as it's already in the type info as ", defaults to ..." above.

Suggested change
The value used to filter out logits that fall below this threshold. Any logits less than this value will be
set to -infinity before applying the softmax function. This helps in excluding unlikely tokens during sampling.
Default is -infinity.
The value used to filter out logits that fall below this threshold. Any logits less than this value will be
set to -infinity before applying the softmax function. This helps in excluding unlikely tokens during sampling.
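
To make the described masking concrete, a standalone sketch; the logits and the 0.0 cutoff below are made-up values for illustration (the actual warper derives its cutoff from the typicality mass, not a fixed threshold):

>>> import torch

>>> logits = torch.tensor([[2.0, 1.0, 0.5, -1.0, -3.0]])
>>> # Positions below the cutoff are filled with filter_value (-inf),
>>> # so they receive exactly zero probability after the softmax.
>>> filter_value = -float("inf")
>>> masked = logits.masked_fill(logits < 0.0, filter_value)
>>> torch.softmax(masked, dim=-1)
tensor([[0.6285, 0.2312, 0.1402, 0.0000, 0.0000]])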

min_tokens_to_keep (`int`, *optional*, defaults to 1):
Minimum number of tokens that cannot be filtered.
The minimum number of tokens to always keep during sampling. The default is 1.
@amyeroberts (Collaborator):

Suggested change
The minimum number of tokens to always keep during sampling. The default is 1.
The minimum number of tokens to always keep during sampling.

Comment on lines 435 to 437
>>> # Generate text using the model and warper
>>> typical_p = 0.9
>>> output = model.generate(input_ids, do_sample=True, max_length=50, typical_p=typical_p)
@amyeroberts (Collaborator):

It would be nice (but not necessary) to have two examples here with high and low typical_p values to demonstrate its effects
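
Along those lines, a hedged sketch of what the two examples could look like; the model, prompt, and the particular typical_p values are illustrative choices, and sampled outputs will vary:

>>> from transformers import AutoModelForCausalLM, AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("gpt2")
>>> model = AutoModelForCausalLM.from_pretrained("gpt2")
>>> input_ids = tokenizer("The secret to a good life is", return_tensors="pt").input_ids

>>> # A low typical_p retains little probability mass (few candidate tokens),
>>> # while a high typical_p retains most of it; comparing both shows the effect.
>>> for typical_p in (0.2, 0.95):
...     output = model.generate(input_ids, do_sample=True, max_length=50, typical_p=typical_p)
...     print(f"typical_p={typical_p}:", tokenizer.decode(output[0], skip_special_tokens=True))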

github-actions bot commented Sep 2, 2023:

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@akshayamadhuri closed this by deleting the head repository on Sep 6, 2023.