charset: incorrect encoding for latin1 character set #18955
Comments
I think we have 3 options for this issue:
|
latin
character setlatin1
character set
We decided to keep option 1 and close this issue for now, since no serious problem has been reported yet. A warning has been added to the official website: https://docs.pingcap.com/tidb/stable/character-set-and-collation#character-sets-and-collations-supported-by-tidb
Please edit this comment or add a new comment to complete the following information
Bug
Note: Make sure that 'component' and 'severity' labels are added
1. Root Cause Analysis (RCA) (optional)
As is stated in the document: https://docs.pingcap.com/tidb/stable/character-set-and-collation#character-sets-and-collations-supported-by-tidb
4. Workaround (optional)
Use
5. Affected versions
All existing versions.
6. Fixed versions
NA (we'll not fix it for now).
Please note that |
Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
case 1:
case 2:
2. What did you expect to see? (Required)
case 2:
3. What did you see instead (Required)
case 1:
The encoding of ¥ in latin1 should be A5.
case 2:
4. Affected version (Required)
All versions of TiDB
5. Root Cause Analysis
In TiDB, we treat latin1 as a subset of utf8/utf8mb4 and encode its characters as UTF-8, just like we do for ascii.
But it is NOT: latin1 is a single-byte character set, so every character, including those in the 0x80-0xFF range, is encoded as exactly one byte.
More details can be found here: https://en.wikipedia.org/wiki/ISO/IEC_8859-1
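To make the mismatch concrete, here is a minimal Python sketch (not TiDB code; the helper name `encoded_hex` is made up for this example). It uses Python's built-in `latin-1` and `utf-8` codecs to show the bytes each encoding produces for ¥ (U+00A5), and shows that the two only agree in the ASCII range.

```python
def encoded_hex(text: str, charset: str) -> str:
    """Return the bytes of `text` in `charset` as an uppercase hex string.

    Hypothetical helper for illustration only; it is not part of TiDB.
    """
    return text.encode(charset).hex().upper()


yen = "\u00a5"  # the yen sign

# A real latin1 (ISO/IEC 8859-1) encoding uses exactly one byte.
print(encoded_hex(yen, "latin-1"))  # A5

# UTF-8 needs two bytes for the same code point.
print(encoded_hex(yen, "utf-8"))    # C2A5

# For ASCII characters the two encodings are byte-for-byte identical,
# which is why treating ascii as a subset of utf8 is safe.
print(encoded_hex("a", "latin-1") == encoded_hex("a", "utf-8"))  # True
```

So a value stored as UTF-8 bytes under a latin1 label differs from genuine latin1 data for any character above 0x7F.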