added two new embedding model's encoding #247

Merged
2 commits merged on Feb 9, 2024

Conversation

Praneet460
Contributor

Problem
The library has no model-to-encoding mapping for two new embedding models:

  • text-embedding-3-small
  • text-embedding-3-large

tiktoken.encoding_for_model("text-embedding-3-small") raises a KeyError
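Until a release includes the mapping, the KeyError can be caught and an encoding requested by name. The following is a minimal sketch, not part of this PR: the helper name `encoding_for_model_or_default` is hypothetical, and the default of `cl100k_base` is an assumption that the new models share the encoding of `text-embedding-ada-002`.

```python
def encoding_for_model_or_default(model_name, default="cl100k_base"):
    """Return the encoding for model_name, falling back to a named encoding.

    Hypothetical helper. The default encoding name is an assumption and
    should be checked against the mapping this PR adds.
    """
    # Imported lazily so the helper can be defined without tiktoken installed.
    import tiktoken

    try:
        # Raises KeyError when the model name is missing from the mapping.
        return tiktoken.encoding_for_model(model_name)
    except KeyError:
        # Ask for the encoding explicitly by name instead.
        return tiktoken.get_encoding(default)


# Example call (requires tiktoken to be installed):
# enc = encoding_for_model_or_default("text-embedding-3-small")
# token_count = len(enc.encode("hello world"))
```

Once the mapping is available, the `try` branch succeeds and the fallback is never used, so the helper can stay in place across upgrades.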

Solution
Added the encoding mapping for the two new embedding models. The mapping is taken from here
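The idea behind the change can be sketched as a dictionary lookup from model name to encoding name. This is a simplified stand-in rather than the library's actual source: the dictionary below lists only a few entries, and the `cl100k_base` values for the two new models are an assumption based on the encoding used by `text-embedding-ada-002`.

```python
# Simplified stand-in for a model-name to encoding-name table.
# The entries for the two new models are assumed, not copied from the diff.
MODEL_TO_ENCODING = {
    "text-embedding-ada-002": "cl100k_base",
    "text-embedding-3-small": "cl100k_base",  # assumed value
    "text-embedding-3-large": "cl100k_base",  # assumed value
}


def encoding_name_for_model(model_name: str) -> str:
    """Look up the encoding name, raising KeyError for unmapped models."""
    try:
        return MODEL_TO_ENCODING[model_name]
    except KeyError:
        # Mirrors the failure described in the Problem section.
        raise KeyError(
            f"Could not map {model_name!r} to an encoding"
        ) from None


print(encoding_name_for_model("text-embedding-3-small"))
```

Without the two new entries, the lookup for `text-embedding-3-small` falls into the `except` branch, which is the failure reported above.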

@hoonlight

@hauntsaninja Hi, can you check this PR?

@stevieflyer

Really looking forward to this merge

@tisserasuneth tisserasuneth left a comment

looks good!

@chrispy-snps

Thanks for adding these!

@kawada711 kawada711 left a comment

thanks!

@usamasaleem1

Can we merge this to main so I can start using the new models!

@Vaibhav2001

Can we merge this to main so I can start using the new models!

+1

@itarutomy97

+1

@jnance314

+1

@emsi

emsi commented Feb 6, 2024

In the meantime, you can install directly from the PR branch:
pip install -U git+https://github.com/Praneet460/tiktoken@Add-New-Embedding-Models

@will-mako-ai

+1

@ByeongUkChoi

+1

Collaborator

@hauntsaninja hauntsaninja left a comment

Thank you!

@hauntsaninja hauntsaninja merged commit 55c8d83 into openai:main Feb 9, 2024
@hauntsaninja
Collaborator

This has been released in tiktoken 0.6
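To confirm that an environment has picked up the release, the installed version can be compared against 0.6. The sketch below uses only the standard library. The helper names are hypothetical, and the comparison looks at the major and minor components only.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def parse_major_minor(version_string: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '0.6.0'."""
    parts = version_string.split(".")
    return int(parts[0]), int(parts[1])


def supports_new_embedding_models(version_string: str) -> bool:
    """True if the version is at least 0.6, the release named in this thread."""
    return parse_major_minor(version_string) >= (0, 6)


def installed_tiktoken_version() -> Optional[str]:
    """Return the installed tiktoken version, or None if it is not installed."""
    try:
        return version("tiktoken")
    except PackageNotFoundError:
        return None


installed = installed_tiktoken_version()
if installed is None:
    print("tiktoken is not installed")
elif supports_new_embedding_models(installed):
    print(f"tiktoken {installed} includes the new embedding model mapping")
else:
    print(f"tiktoken {installed} predates 0.6; upgrade with: pip install -U tiktoken")
```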
