From db5bda9fc93b3171db6c4afea329394e6b6d31ca Mon Sep 17 00:00:00 2001
From: Logan Kilpatrick <23kilpatrick23@gmail.com>
Date: Mon, 29 Jan 2024 17:55:58 -0700
Subject: [PATCH] Clarify language models in README (#203)

---
 README.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 1a76a2c0..748578b6 100644
--- a/README.md
+++ b/README.md
@@ -42,7 +42,7 @@ If you work at OpenAI, make sure to check the internal documentation or feel fre
 
 ## What is BPE anyway?
 
-Models don't see text like you and I, instead they see a sequence of numbers (known as tokens).
+Language models don't see text like you and I, instead they see a sequence of numbers (known as tokens).
 Byte pair encoding (BPE) is a way of converting text into tokens. It has a couple desirable properties:
 1) It's reversible and lossless, so you can convert tokens back into the original text
@@ -128,4 +128,3 @@ setup(
 
 Then simply `pip install ./my_tiktoken_extension` and you should be able to use your custom encodings!
 Make sure **not** to use an editable install.
-