core: Add encoding config to json for load_prompt() #26935
@hwchase17 @efriis Could you please review this?
@zakki I don't think we should be supporting multiple encodings. I'd rather just specify the encoding explicitly as UTF-8 and remove all the OS dependence.
I think we should hard-code the encoding as UTF-8 in open().
We just need to figure out how to do this, as it could be a breaking change for folks on Windows.
I don't think hard-coding UTF-8 is a bad idea, and in the long term the problem will be solved by Python 3.15 with PEP 686. In the short term, however, the problem is the environment dependency when distributing applications.
@zakki Also, I wanted to confirm: the current PR only handles the decoding path? What code path takes care of the encoding path, so that users on Windows who may use cp1252 or some other encoding end up creating appropriate JSON files?
Sorry, could you rephrase? Are you arguing for your solution over hard-coding UTF-8 everywhere?
For my use case, I create the output file in my own Python code, not within LangChain framework code. For completeness in other use cases, it may be necessary to explicitly specify the encoding of the output file.
I think it would be reasonable to enforce UTF-8 as a breaking change in version 0.4.x.
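To illustrate the point about the output file, here is a minimal sketch of writing and reading a prompt JSON file with an explicit encoding in user code. The helper names `write_prompt_json` and `read_prompt_json` are hypothetical and not part of LangChain; only the standard library is used.

```python
import json


def write_prompt_json(path, prompt_config):
    """Write a prompt config as JSON, pinned to UTF-8 (hypothetical helper)."""
    # ensure_ascii=False keeps non-ASCII text as-is in the file, so the
    # explicit encoding argument is what decides the bytes on disk.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(prompt_config, f, ensure_ascii=False, indent=2)


def read_prompt_json(path):
    """Read the file back with the same explicit encoding (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Because both sides name the encoding, the round trip should not depend on the platform default.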
This PR lets you specify the encoding of the files referenced by "*_path" keys in the config.
If an encoding is set in the config, it is used when opening the file.
Otherwise, the default encoding is applied.
The default encoding in Python differs depending on the environment.
On Windows, it is the ANSI code page (e.g., "cp932").
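The behavior described above can be sketched roughly as follows. This is an illustration only, not the PR's actual code: the function name `read_text_with_config_encoding` and the config key `"encoding"` are assumptions made for the example.

```python
from typing import Optional


def read_text_with_config_encoding(config: dict, path_key: str) -> str:
    """Read the file named by config[path_key], honoring an optional encoding.

    Both this function and the "encoding" key are hypothetical names.
    """
    encoding: Optional[str] = config.get("encoding")
    # With encoding=None, open() falls back to the platform default,
    # which is the environment-dependent behavior discussed above.
    with open(config[path_key], encoding=encoding) as f:
        return f.read()
```

For instance, a config such as `{"template_path": "template.txt", "encoding": "cp932"}` would decode the template as cp932 on any platform, while omitting `"encoding"` would leave the choice to the environment.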
Issue: load_prompt Unable to set encoding for JSON files #6900
Twitter handle: https://x.com/k_matsuzaki
Add tests and docs: If you're adding a new integration, please include
docs/docs/integrations
directory.
Lint and test: Run make format, make lint, and make test from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/