Emojis are converted to question-marks in repository description #2711

Closed
2 tasks done
jonasfranz opened this issue Oct 15, 2017 · 18 comments
Labels
  • issue/confirmed: Issue has been reviewed and confirmed to be present or accepted to be implemented
  • topic/ui: Change the appearance of the Gitea UI
  • type/enhancement: An improvement of existing functionality

Comments

@jonasfranz
Member

  • Gitea version (or commit ref): f3833b7
  • Operating system: GNU/Linux
  • Database (use [x]):
    • MySQL
  • Can you reproduce the bug at https://try.gitea.io:
    • No
  • Log gist: Not relevant

Description

Adding an emoji to a repository description, such as "📱App for iOS", results in "????App for iOS".

Emojis in the description can be useful, as seen in the ownCloud GitHub project (https://github.com/owncloud).

This might be caused by the MySQL database in use.

@lunny lunny added the type/bug and topic/ui labels Oct 16, 2017
@lunny lunny added this to the 1.x.x milestone Oct 16, 2017
@lunny
Member

lunny commented Dec 5, 2017

So maybe the MySQL database should use utf8mb4?
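
For context on the utf8mb4 suggestion: MySQL's `utf8` character set (an alias for `utf8mb3`) stores at most three bytes per character, while most emoji lie outside the Basic Multilingual Plane and need four bytes in UTF-8. A minimal Python sketch of that check follows. The helper names `utf8_len` and `fits_utf8mb3` are made up for illustration and are not part of Gitea or MySQL.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_utf8mb3(text: str) -> bool:
    """Hypothetical helper: True if every character needs at most 3 UTF-8
    bytes, the limit of MySQL's legacy `utf8` (utf8mb3) character set."""
    return all(utf8_len(ch) <= 3 for ch in text)


if __name__ == "__main__":
    for sample in ("App for iOS", "📱App for iOS"):
        print(repr(sample), "fits utf8mb3:", fits_utf8mb3(sample))
```

The phone emoji from the original report is one of the four-byte characters, so it cannot be stored in a three-byte `utf8` column.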

@kolaente
Member

I had a similar issue, but in my case the description was completely deleted when I added an emoji to the repo description (v1.4-rc-2). It seems to work fine on master with SQLite, though.

@lunny
Member

lunny commented Dec 9, 2018

This should be fixed by #5168. Please feel free to reopen this issue if it is not.

@lunny lunny closed this as completed Dec 9, 2018
@lunny lunny removed this from the 1.x.x milestone Dec 9, 2018
@lunny lunny reopened this Dec 9, 2018
@lunny lunny added the type/enhancement label and removed the type/bug label Dec 9, 2018
@lunny
Member

lunny commented Dec 9, 2018

If you type :smile: in the repo description, it works correctly, but if you paste an emoji from your clipboard, it fails.

@immanuelfodor

I can confirm this on v1.6.2: any emoji copy-pasted from e.g. https://emojipedia.org becomes ????. Only a manually typed :emojicode: works fine. The issue is also present in e.g. org descriptions, where copy-pasted emojis become question marks. Screenshot from an issue comment:

[Screenshot: screenshot_20181231_175649]

@lunny
Member

lunny commented Jan 3, 2019

So should we convert copy-pasted emojis to :emojicode: before saving them?

@immanuelfodor

Great idea. It should then work without utf8mb4 (on plain utf8 databases/tables).
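
The conversion idea discussed here could look roughly like the sketch below. It is only an illustration in Python, not Gitea's implementation: the small `EMOJI_TO_CODE` table and the function name `emoji_to_shortcode` are assumptions, and a real version would need a complete emoji table.

```python
# Hypothetical lookup table; a real one would cover the full emoji set.
EMOJI_TO_CODE = {
    "\U0001F600": ":grinning:",  # 😀
    "\U0001F604": ":smile:",     # 😄
}


def emoji_to_shortcode(text: str) -> str:
    """Replace known single-code-point emoji with their :shortcode: form.

    Characters that are not in the table are passed through unchanged.
    """
    return "".join(EMOJI_TO_CODE.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(emoji_to_shortcode("Hello 😀"))  # prints "Hello :grinning:"
```

Because the result contains only ASCII shortcodes, it can be stored in a three-byte utf8 column. This character-by-character lookup does not handle emoji built from several code points (for example skin-tone or joined sequences), and unknown emoji would still reach the database unchanged.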

@stale

stale bot commented Mar 4, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions.

@stale stale bot added the issue/stale label Mar 4, 2019
@immanuelfodor

Is there any news on whether somebody plans to implement the suggested conversion, which could solve the original issue? :)

@stale stale bot removed the issue/stale label Mar 4, 2019
@helmut72

Or just add MySQL utf8mb4 support.

@lunny lunny added the issue/confirmed label Mar 16, 2019
@lunny lunny closed this as completed May 24, 2019
@immanuelfodor

Wow, thank you, @lunny! Will there be a migration guide before the next release on how to upgrade an existing database, or does that depend on someone in the community publishing one? I think I did this once for a Nextcloud install by following these steps: https://docs.nextcloud.com/server/16/admin_manual/configuration_database/mysql_4byte_support.html Should these steps work in theory for Gitea as well (after replacing the DB name, of course)?

@lunny
Member

lunny commented May 24, 2019

@immanuelfodor Converting a utf8 database to utf8mb4 is possible. I found an article about how to do it; see https://mathiasbynens.be/notes/mysql-utf8mb4
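
As a rough sketch of what such a conversion involves, the Python snippet below only builds the SQL statements as strings so they can be reviewed before anyone runs them by hand. `ALTER DATABASE ... CHARACTER SET` and `ALTER TABLE ... CONVERT TO CHARACTER SET` are standard MySQL/MariaDB statements; the function name `build_conversion_sql`, the table list, and the default `utf8mb4_general_ci` collation are assumptions for illustration. Take a backup before trying anything like this on a real database.

```python
def build_conversion_sql(database, tables, collation="utf8mb4_general_ci"):
    """Return SQL statements that switch a database and its tables to utf8mb4.

    The statements are only generated here, never executed.
    """
    statements = [
        f"ALTER DATABASE `{database}` CHARACTER SET utf8mb4 COLLATE {collation};"
    ]
    for table in tables:
        statements.append(
            f"ALTER TABLE `{database}`.`{table}` "
            f"CONVERT TO CHARACTER SET utf8mb4 COLLATE {collation};"
        )
    return statements


if __name__ == "__main__":
    # "gitea" and "issue" are the database and table names seen in this thread.
    for stmt in build_conversion_sql("gitea", ["issue"]):
        print(stmt)
```

Generating the statements first, instead of executing them directly, leaves room to check index lengths and row formats table by table, which the linked article covers in more detail.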

@immanuelfodor

immanuelfodor commented Jul 31, 2019

The new PRs #7144 and #6992 took care of the conversion successfully with the new gitea convert command, but I still get four question marks in comments when commenting with an emoji. All my tables are Barracuda, utf8mb4, row format dynamic, etc. Gitea was newly built and restarted, and I used a new login session.

gitea -v
# Gitea version 1.9.0 built with GNU Make 4.1, go1.12.7 : bindata
mysql -V
# mysql  Ver 15.1 Distrib 10.1.40-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
cat /etc/os-release | grep -i pretty_name
# PRETTY_NAME="Ubuntu 18.04.2 LTS"

@immanuelfodor

I'm not sure if this is a DB issue because the CLI shows ???? as if it was saved to the DB this way.

MariaDB [gitea]> select id, name, content from issue where id=2;
+----+-----------------+---------------------+
| id | name            | content             |
+----+-----------------+---------------------+
|  2 | Testing utf8mb4 | :grinning: 

???? |
+----+-----------------+---------------------+
1 row in set (0.00 sec)

MariaDB [information_schema]> select column_name, character_set_name, collation_name from columns where table_schema = "gitea" and table_name = "issue" and character_set_name is not null;
+-------------+--------------------+--------------------+
| column_name | character_set_name | collation_name     |
+-------------+--------------------+--------------------+
| name        | utf8mb4            | utf8mb4_general_ci |
| content     | utf8mb4            | utf8mb4_general_ci |
| ref         | utf8mb4            | utf8mb4_general_ci |
+-------------+--------------------+--------------------+
3 rows in set (0.00 sec)

select table_name, table_collation, engine, row_format, create_options from tables where table_schema = "gitea" and table_name = "issue";
+------------+--------------------+--------+------------+--------------------+
| table_name | table_collation    | engine | row_format | create_options     |
+------------+--------------------+--------+------------+--------------------+
| issue      | utf8mb4_general_ci | InnoDB | Dynamic    | row_format=DYNAMIC |
+------------+--------------------+--------+------------+--------------------+
1 row in set (0.00 sec)

MariaDB [information_schema]> select * from innodb_sys_tables where name = "gitea/issue";
+----------+-------------+------+--------+-------+-------------+------------+---------------+
| TABLE_ID | NAME        | FLAG | N_COLS | SPACE | FILE_FORMAT | ROW_FORMAT | ZIP_PAGE_SIZE |
+----------+-------------+------+--------+-------+-------------+------------+---------------+
|     1005 | gitea/issue |   33 |     20 |   991 | Barracuda   | Dynamic    |             0 |
+----------+-------------+------+--------+-------+-------------+------------+---------------+
1 row in set (0.00 sec)

MariaDB [information_schema]> show variables like 'innodb_file_%';
+--------------------------+-----------+
| Variable_name            | Value     |
+--------------------------+-----------+
| innodb_file_format       | Barracuda |
| innodb_file_format_check | ON        |
| innodb_file_format_max   | Barracuda |
| innodb_file_per_table    | ON        |
+--------------------------+-----------+
4 rows in set (0.00 sec)

MariaDB [information_schema]> show variables like 'innodb_large_%';
+---------------------+-------+
| Variable_name       | Value |
+---------------------+-------+
| innodb_large_prefix | ON    |
+---------------------+-------+
1 row in set (0.00 sec)

@lunny
Member

lunny commented Aug 1, 2019

@immanuelfodor Could you paste the content here so that I can test it locally?

@immanuelfodor

immanuelfodor commented Aug 1, 2019

Just two grinning faces: the first line is :grinning:, and the second is the same face copied from emojipedia (copy button): https://emojipedia.org/grinning-face/
The funny thing is that in the meantime I received an email from Gitea, and it contains the emoji correctly on the second line. Maybe the email is sent before the multibyte character is converted?
Another idea is the DB connection. In PHP, you would need to run SET NAMES utf8mb4 before anything else. I don't know whether that is true for Go as well, or whether Gitea already does it: https://stackoverflow.com/questions/16893035/using-utf8mb4-with-php-and-mysql

The same MariaDB server also hosts a Nextcloud and a TT-RSS database, and both handle emojis fine with utf8mb4.

@lunny
Member

lunny commented Aug 1, 2019

@immanuelfodor You should change the charset in app.ini to utf8mb4. Go to https://docs.gitea.io/en-us/config-cheat-sheet/ and search for CHARSET. I think your problem may be that you haven't set it.

@immanuelfodor

Aaand YES! I looked through my app.ini before, but I did not have the charset option there, it must be newer than my file. Added it, restarted Gitea, new comment, and it works! Thank you very much.
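
For anyone hitting the same symptom, a quick way to check whether an existing app.ini already carries the setting is sketched below in Python. It assumes that `CHARSET` lives in the `[database]` section, as described in the config cheat sheet linked above; the sample file contents and the function name `database_charset` are made up for illustration.

```python
import configparser

# Made-up sample contents; a real app.ini has many more settings.
SAMPLE_APP_INI = """\
APP_NAME = Gitea

[database]
DB_TYPE = mysql
NAME    = gitea
CHARSET = utf8mb4
"""


def database_charset(ini_text: str):
    """Return the CHARSET value from the [database] section, or None if unset."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    # app.ini may have keys before the first section header; configparser
    # rejects that, so park them under a placeholder section.
    parser.read_string("[__top__]\n" + ini_text)
    return parser.get("database", "CHARSET", fallback=None)


if __name__ == "__main__":
    print(database_charset(SAMPLE_APP_INI))  # prints "utf8mb4"
```

A result of `None` would match the situation described in this comment: the tables were already utf8mb4, but the option was missing from an older app.ini.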

@go-gitea go-gitea locked and limited conversation to collaborators Nov 24, 2020
5 participants