Charset conversion does not work properly #19016

Open
cyliu0 opened this issue Aug 5, 2020 · 3 comments
Labels
sig/sql-infra (SIG: SQL Infra), type/feature-request (Categorizes issue or PR as related to a new feature.)

Comments

@cyliu0
Contributor

cyliu0 commented Aug 5, 2020

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

CREATE TABLE t1 (
  a ENUM('ä','ö','ü') character set latin1 default 'ü'
);
insert into t1 values ('ä'),('ö'),('ü');
select * from t1;
set charset latin1;
select * from t1;

2. What did you expect to see? (Required)

On MySQL:

mysql root@127.0.0.1:test> select * from t1
+----+
| a  |
+----+
| ä |
| ö |
| ü |
+----+
3 rows in set
Time: 0.016s
mysql root@127.0.0.1:test> set charset latin1;
Query OK, 0 rows affected
Time: 0.002s
mysql root@127.0.0.1:test> select * from t1
+---+
| a |
+---+
| ä |
| ö |
| ü |
+---+
3 rows in set
Time: 0.015s

3. What did you see instead (Required)

On TiDB:

mysql root@127.0.0.1:test> select * from t1
+----+
| a  |
+----+
| ä |
| ö |
| ü |
+----+
3 rows in set
Time: 0.014s
mysql root@127.0.0.1:test> set charset latin1
Query OK, 0 rows affected
Time: 0.001s
mysql root@127.0.0.1:test> select * from t1
+----+
| a  |
+----+
| ä |
| ö |
| ü |
+----+
3 rows in set
Time: 0.014s

4. Affected version (Required)

mysql root@127.0.0.1:test> select tidb_version()
+-------------------------------------------------------------------+
| tidb_version()                                                    |
+-------------------------------------------------------------------+
| Release Version: v4.0.0-beta.2-893-g4e829aaee                     |
| Edition: Community                                                |
| Git Commit Hash: 4e829aaee7b656aa807814708ae05af5233302af         |
| Git Branch: master                                                |
| UTC Build Time: 2020-08-05 02:23:17                               |
| GoVersion: go1.14.4                                               |
| Race Enabled: false                                               |
| TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306 |
| Check Table Before Drop: false                                    |
+-------------------------------------------------------------------+
1 row in set
Time: 0.014s

5. Root Cause Analysis

cyliu0 added the type/bug label (The issue is confirmed as a bug.) on Aug 5, 2020
@ghost

ghost commented Aug 5, 2020

I believe this relates to #18955

TiDB does not support latin1 correctly - it treats it as a subset of utf8. This is only safe for the lower 128 characters of latin1.
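
To illustrate the point about the lower 128 characters, here is a minimal Python sketch (not TiDB code) that compares the latin1 and UTF-8 byte encodings of each character. It uses Python's `latin-1` codec (ISO-8859-1) as a stand-in for MySQL's latin1, which agrees for the three characters in this report; the helper name `same_bytes_in_both` is made up for the example.

```python
# Illustration only: compare latin1 and UTF-8 encodings byte by byte.
# Assumption: Python's "latin-1" codec stands in for MySQL's latin1.

def same_bytes_in_both(ch: str) -> bool:
    """Return True if ch has identical latin1 and UTF-8 encodings."""
    return ch.encode("latin-1") == ch.encode("utf-8")

# Code points 0..127 (ASCII): both encodings produce the same single byte.
ascii_safe = all(same_bytes_in_both(chr(cp)) for cp in range(128))

# Code points 128..255: latin1 uses one byte, UTF-8 uses two, so they differ.
upper_safe = any(same_bytes_in_both(chr(cp)) for cp in range(128, 256))

# The three ENUM values from the reproduce step.
for ch in "äöü":
    print(ch, ch.encode("latin-1").hex(), ch.encode("utf-8").hex())
# prints:
# ä e4 c3a4
# ö f6 c3b6
# ü fc c3bc
```

So any value containing ä, ö or ü has a different byte representation in the two character sets, and treating latin1 data as if it were utf8 is only lossless for plain ASCII.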

wjhuang2016 removed the type/bug label (The issue is confirmed as a bug.) on Nov 4, 2020
@wjhuang2016
Member

I think this problem can be solved after we implement GBK support.

wjhuang2016 added the type/feature-request label (Categorizes issue or PR as related to a new feature.) and removed the severity/major label on Nov 10, 2020
@wjhuang2016
Member

Presently, this is the "expected behavior", because we don't truly support "character_set_results". So I think it should be a feature request: support "character_set_results".
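
As a rough sketch of what honoring "character_set_results" would mean at the byte level: the server encodes each result value in the character set the client asked for. The Python below is an assumption-laden illustration, not TiDB or MySQL code; `encode_for_client` and its charset-to-codec table are hypothetical names invented for the example.

```python
# Hypothetical sketch: encode a result value according to character_set_results.
# Assumption: Python's "latin-1" codec (ISO-8859-1) stands in for MySQL's latin1;
# the two agree for ä, ö and ü.

CODECS = {"utf8": "utf-8", "utf8mb4": "utf-8", "latin1": "latin-1"}

def encode_for_client(value: str, character_set_results: str) -> bytes:
    """Return the bytes a server would send for value in the requested charset."""
    return value.encode(CODECS[character_set_results])

row = "ä"
utf8_bytes = encode_for_client(row, "utf8mb4")   # two bytes: c3 a4
latin1_bytes = encode_for_client(row, "latin1")  # one byte: e4

# If UTF-8 bytes are read as latin1, "ä" turns into the two-character
# string seen in the outputs above.
garbled = utf8_bytes.decode("latin-1")
print(garbled)  # prints: ä
```

A server that ignores the setting sends the same bytes before and after `set charset latin1`, which would be consistent with the two identical result sets shown for TiDB.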
