Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File: Re-add support to skip CR (\r) in File::get_as_text #63733

Merged
merged 1 commit into from
Aug 1, 2022

Conversation

akien-mga
Copy link
Member

@akien-mga akien-mga commented Jul 31, 2022

This was removed in #63481, and we confirmed that it's better like this,
but we add back the possibility to strip CR as an option, to optionally
restore the previous behavior.

For performance this is done directly in String::parse_utf8.

Also fixes Android FileAccess::get_line() as this one should strip CR.

Supersedes #63717.

Will backport for 3.x with the get_as_text() option defaulting to true to fix the compat breakage.

@bruvzg My changes to String::parse_utf8() might be a huge hack, please review.


Speed comparison on a big file with a debug build on Linux, as described in #63717 (comment):

Unix file: 3.107707 s
DOS file:  3.214315 s
Unix file: 0.136303 s
DOS file:  0.162586 s
Unix file: 0.180824 s
DOS file:  1.624483 s
  • This PR
Unix file: 0.129991 s
DOS file, keep CR: 0.148745 s
DOS file, skip CR: 0.190492 s

Copy link
Member

@bruvzg bruvzg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse_utf8 changes seems fine.

@akien-mga akien-mga force-pushed the file-get_as_text-skip-CR branch from ed25a6d to 438db2a Compare July 31, 2022 19:20
@akien-mga akien-mga requested a review from a team as a code owner July 31, 2022 19:20
@akien-mga akien-mga marked this pull request as draft July 31, 2022 20:01
@akien-mga akien-mga force-pushed the file-get_as_text-skip-CR branch from 438db2a to 8b7f4b0 Compare July 31, 2022 20:25
@akien-mga
Copy link
Member Author

I still had a doubt about my parse_utf8 changes so I added some tests, and indeed they fail :P

@bruvzg
Copy link
Member

bruvzg commented Jul 31, 2022

Test files seem to be wrong, CRLF one it only has LF.

@akien-mga akien-mga force-pushed the file-get_as_text-skip-CR branch 2 times, most recently from 06e509f to 619e99e Compare July 31, 2022 20:55
@akien-mga
Copy link
Member Author

akien-mga commented Jul 31, 2022

Test files seem to be wrong, CRLF one it only has LF.

Indeed, my .gitattributes override doesn't seem to work as intended. The Git documentation is horrible to read :|

Edit: I finally got it to work.


With your advice I could fix the FileAccess test but the String one still fails.

./tests/core/string/test_string.h:155:
TEST CASE:  [String] UTF8 with CR

./tests/core/string/test_string.h:166: ERROR: CHECK( no_cr == base.replace("\r", "") ) is NOT correct!
  values: CHECK( H == Hello darkness
My old friend
I've come to talkWith you again )

@akien-mga akien-mga force-pushed the file-get_as_text-skip-CR branch 2 times, most recently from dca94be to f13f181 Compare July 31, 2022 21:05
@akien-mga akien-mga marked this pull request as ready for review July 31, 2022 21:07
@akien-mga akien-mga requested review from a team as code owners July 31, 2022 21:07
@DarkKilauea
Copy link
Contributor

Should an enumeration that selects the desired line ending be used instead? I understand this is a quick fix, but I'm concerned about adding a parameter that will need to be changed later on, resulting in a breaking change down the line.

enum LineEnding {
    LineEnding_Unix, // lf
    LineEnding_Mac // cr
    LineEnding_Windows // crlf
}

@akien-mga
Copy link
Member Author

akien-mga commented Jul 31, 2022

Probably overkill, no modern platform uses CR only anymore. It was used on Mac OS prior to Mac OS X.
Since Mac OS X (2001), it's using LF like other Unixes. Windows uses CRLF.

@akien-mga akien-mga force-pushed the file-get_as_text-skip-CR branch from f13f181 to 961e9ac Compare July 31, 2022 22:32
This was removed in godotengine#63481, and we confirmed that it's better like this,
but we add back the possibility to strip CR as an option, to optionally
restore the previous behavior.

For performance this is done directly in `String::parse_utf8`.

Also fixes Android `FileAccess::get_line()` as this one _should_ strip CR.

Supersedes godotengine#63717.
@akien-mga akien-mga force-pushed the file-get_as_text-skip-CR branch from 961e9ac to 1418f97 Compare July 31, 2022 22:40
@akien-mga akien-mga merged commit 44f1e54 into godotengine:master Aug 1, 2022
@akien-mga akien-mga deleted the file-get_as_text-skip-CR branch August 1, 2022 05:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants