Pointwise ABC, MonoT5, New Unit Tests #128

Merged (24 commits) on Aug 17, 2024

Conversation

@xpbowler (Collaborator) commented Aug 3, 2024

Summary of Changes

  • Bug fix in the extract_kwargs function
  • Bug fix where IdentityReranker returned only the top_k_retrieve candidates instead of truncating to top_k_rerank (see the sketch after this list)
  • Added unit tests for the API and retrieve_and_rerank
  • Added pointwise and pairwise ABCs
  • Bug fix in the Rank_LLM ABC parameters (missing kwargs)
    ...more changes below (@IR3KT4FUNZ)

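A minimal sketch of the IdentityReranker fix described above, using hypothetical Candidate/Result types and a hypothetical rerank_batch signature (the repository's actual names may differ); the point is only that the pass-through reranker must truncate each result to the requested top_k before returning:

from dataclasses import dataclass
from typing import List

@dataclass
class Candidate:
    # hypothetical stand-in for the repository's candidate type
    docid: str
    score: float

@dataclass
class Result:
    # hypothetical stand-in for the repository's result type
    query: str
    candidates: List[Candidate]

class IdentityReranker:
    """Pass-through reranker: keeps retrieval order but honors top_k."""

    def rerank_batch(self, requests: List[Result], top_k: int = 100) -> List[Result]:
        # Before the fix, all retrieved candidates were returned;
        # the fix truncates each result to the requested top_k.
        return [
            Result(query=r.query, candidates=r.candidates[:top_k])
            for r in requests
        ]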
@xpbowler xpbowler marked this pull request as draft August 4, 2024 00:31
@xpbowler xpbowler marked this pull request as ready for review August 4, 2024 03:59
@xpbowler xpbowler marked this pull request as draft August 4, 2024 03:59
@IR3KT4FUNZ

Changes:

  • Add monot5 as a prompt mode, and a MonoT5 subclass of PointwiseRankLLM (a hedged sketch of the overall shape follows below)
  • Implement the abstract functions for the MonoT5 and PointwiseRankLLM classes
  • Update the Reranker class to add support for the MonoT5 agent

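The sketch below shows one plausible shape for this abstraction; the method names, signatures, and prompt template are illustrative assumptions, not the repository's exact interface. The idea is that a pointwise ABC scores each (query, document) pair independently, and MonoT5 derives the score from the logits of the "true" and "false" tokens.

import math
from abc import ABC, abstractmethod
from typing import List, Tuple

class PointwiseRankLLM(ABC):
    """Scores each (query, document) pair independently, then sorts by score."""

    def __init__(self, model: str, context_size: int = 512, prompt_mode: str = "monot5"):
        self._model = model
        self._context_size = context_size
        self._prompt_mode = prompt_mode

    @abstractmethod
    def create_prompt(self, query: str, document: str) -> str:
        """Build the per-document prompt fed to the model."""

    @abstractmethod
    def run_llm(self, prompt: str) -> float:
        """Return a relevance score for a single prompt."""

    def rerank(self, query: str, documents: List[str]) -> List[int]:
        scores = [self.run_llm(self.create_prompt(query, doc)) for doc in documents]
        # Higher score = more relevant; return document indices in reranked order.
        return sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)

class MonoT5(PointwiseRankLLM):
    """Illustrative subclass using the 'Query: ... Document: ... Relevant:' template."""

    def create_prompt(self, query: str, document: str) -> str:
        return f"Query: {query} Document: {document} Relevant:"

    def run_llm(self, prompt: str) -> float:
        # A real implementation would run T5 and read the logits of the
        # "true"/"false" tokens at the first decoded position.
        true_logit, false_logit = self._true_false_logits(prompt)
        return math.exp(true_logit) / (math.exp(true_logit) + math.exp(false_logit))

    def _true_false_logits(self, prompt: str) -> Tuple[float, float]:
        raise NotImplementedError("model inference omitted in this sketch")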
@xpbowler xpbowler marked this pull request as ready for review August 5, 2024 21:58
@xpbowler xpbowler changed the title Pairwise/Pointwise Pointwise ABC, MonoT5, New Unit Tests Aug 5, 2024
test/api/output.txt (review thread: outdated, resolved)
@xpbowler xpbowler mentioned this pull request Aug 7, 2024
XKTZ added a commit to XKTZ/rank_llm that referenced this pull request Aug 7, 2024
IR3KT4FUNZ and others added 6 commits August 7, 2024 20:05
…rting docs to prompts in pointwise by using the listwise functions, fix sorting order of monot5, add batching for run_llm_batched in pointwise_rankllm
…g to pointwise_rankllm class instead of individual pointwise model classes
@ronakice (Member) left a comment

LGTM!

Comment on lines 16 to 22
try:
    from vllm import LLM, SamplingParams
except ImportError:
    # vllm is optional at import time; fall back to None when it is unavailable
    LLM = None
    SamplingParams = None

logger = logging.getLogger(__name__)
Member:

not needed?

self,
model: str,
prompt_mode: str = "monot5",
context_size: int = 512,
Member:

Create a new issue so that we eventually support longer sequences; nothing limits T5 usage with longer contexts (we've done it before).

Comment on lines +73 to +75
truth_logit = logit_tensor[1176]
false_logit = logit_tensor[6136]
score = math.exp(truth_logit) / (
Member:

These token ids are hardcoded and won't scale to other T5 base models in the future; note this as a T5 TODO.
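One hedged way to avoid hardcoding the indices (assuming a Hugging Face T5 tokenizer; the checkpoint name below is only an example) is to resolve the "true"/"false" token ids from the tokenizer at initialization:

# Sketch: resolve the "true"/"false" token ids instead of hardcoding 1176/6136,
# so other T5 variants keep working.
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("castorini/monot5-base-msmarco")
token_true_id = tokenizer.encode("true", add_special_tokens=False)[0]
token_false_id = tokenizer.encode("false", add_special_tokens=False)[0]

# Then index the logits with the resolved ids, e.g.:
#   truth_logit = logit_tensor[token_true_id]
#   false_logit = logit_tensor[token_false_id]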

outputs = self._tokenizer.decode(
output_ids, skip_special_tokens=True, spaces_between_special_tokens=False
)
truth_logit = logits[0][0][1176]
Member:

same here

@ronakice ronakice merged commit 5fe9343 into castorini:main Aug 17, 2024
4 checks passed