Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Javascript option to preserve string newline or char codes #717

Closed
naturedamends opened this issue Jun 19, 2024 · 8 comments
Closed

Javascript option to preserve string newline or char codes #717

naturedamends opened this issue Jun 19, 2024 · 8 comments

Comments

@naturedamends
Copy link

naturedamends commented Jun 19, 2024

I would like to preserve newline chars when in strings, such that I can write programs that operate on them. Currently it changes the output of the program. \r\n is being replaced with a single \n. As a tagged literal with a new line char. I need the \r

for example:

const something = '\r\n';

is replaced with

const a = `
`;

which is completely different. Its char code is not 13, 10. Its simply 10

@naturedamends naturedamends changed the title Javascript option to preserve newline or char codes Javascript option to preserve strung newline or char codes Jun 19, 2024
@naturedamends naturedamends changed the title Javascript option to preserve strung newline or char codes Javascript option to preserve string newline or char codes Jun 19, 2024
@tdewolff
Copy link
Owner

This was fixed in the latest release, a sequence of \r\n (or 0x5C 0x72 0x5C 0x6E) was previously converted to 0x0D 0x0A (but not just 0x0A as you mention...), but it now becomes 0x5C 0x72 0x0A. That is, \r is now kept as-is (also when alone). Please let me know if the latest release works for you.

@naturedamends
Copy link
Author

naturedamends commented Jun 21, 2024

This was fixed in the latest release, a sequence of \r\n (or 0x5C 0x72 0x5C 0x6E) was previously converted to 0x0D 0x0A (but not just 0x0A as you mention...), but it now becomes 0x5C 0x72 0x0A. That is, \r is now kept as-is (also when alone). Please let me know if the latest release works for you.

I have just disabled it for the moment. Why are you thinking that \r\n is not 0x0D 0x0A, at least in ascii hex. Feel free to close the issue.

result from plain javascript minified

00000060  70 3a 60 0d 0a 60 2c 74  65 6d 70 6c 61 74 65 73  |p:`..`,templates|

Actually my hex dump of the minified data output looks correct. The issue may elsewhere, but I am still seeing \r characters get replaced when turning on the minify . I'm using gohugo, so this might take a little more investigating.

I'm importing a library and it is definitely getting changed wrong then.

n.EOL.split("")
['\n']

I use typescript and other things so might be that. But only happens when minified is turned on. Using whatever version of that included in gohugo.

Might not be a problem here, i can't reproduce.


top level javascript minified
00000000  28 28 29 3d 3e 7b 76 61  72 20 65 3d 60 0d 0a 60  |(()=>{var e=`..`|
00000010  3b 63 6f 6e 73 6f 6c 65  2e 6c 6f 67 28 65 29 7d  |;console.log(e)}|
00000020  29 28 29                                          |)()|
00000023

top level module minified
00000000  28 28 29 3d 3e 7b 76 61  72 20 65 3d 60 0d 0a 60  |(()=>{var e=`..`|
00000010  3b 63 6f 6e 73 6f 6c 65  2e 6c 6f 67 28 65 29 7d  |;console.log(e)}|
00000020  29 28 29                                          |)()|
00000023

ts import from local ts module minified
000000e0  74 73 22 28 29 7b 76 61  72 20 65 3d 60 0d 0a 60  |ts"(){var e=`..`|
000000f0  3b 63 6f 6e 73 6f 6c 65  2e 6c 6f 67 28 65 29 7d  |;console.log(e)}|
00000100  7d 29 3b 6e 28 29 7d 29  28 29                    |});n()})()|
0000010a

However the module i'm importing is very complex. and includes lots of regexes that are not included in the test i've preformed.

@naturedamends
Copy link
Author

naturedamends commented Jun 21, 2024

So when it transforms it within a javascript string.

it needs to retain the \r characters? Your new version keeps the chars.

i see. Ill try and run it

@tdewolff
Copy link
Owner

tdewolff commented Jun 21, 2024

Not sure if I understand the problem, please try the last version and tell me if you still have a problem.

There is a difference between escape sequence (the characters '\' 'r' and '\' 'n') and literal (0x0D and 0x0A respectively) line terminators, but in some cases we may convert escape sequences into literals:

  • Inside single and double quoted strings: literal line terminators are not allowed, except when preceded by a backslash in which case it is an empty string (used to continue writing a string to the next line without inserting a newline in the string)
  • Inside backtick quoted string (template literals): literal line terminators, including 0x2028 and 0x2028, are allowed but noting that 0x0D and the sequence 0x0D0A are both interpreted as 0x0A. This means we may convert the escape sequence \n to its literal, but \r not (this was fixed in the commit).

This is why this minifier tries to convert strings to template literals, since it may convert the line terminator sequences and save a byte for each one.

@naturedamends
Copy link
Author

naturedamends commented Jun 21, 2024

Looks like it works. Using the latest version of your package. Node bindings. Are you assuming that all browsers support tagged literals. Thats fine for me, but thats a killer for many people. Might just be me having flash backs to supporting ie . Not sure

{
  "type": "module",
  "dependencies": {
    "@tdewolff/minify": "^2.20.34"
  }
}
(() => {
  // <stdin>
  function obj(h) {
    obj.EOL = "\r\n";
  }
  var a = new obj();
  console.log(obj.EOL.split("").map((a2) => a2.charCodeAt(0)));
})();
(()=>{function e(){e.EOL=`\r
`}var t=new e;console.log(e.EOL.split("").map(e=>e.charCodeAt(0)))})()
node one.min.js 
[ 13, 10 ]

I have just tried with the latest hugo (0.127-extended) i can easily get. And i get the same result. There will be no issue if you replace with 0x5C 0x72 0x5C 0x6E like you said.

I'm using hugo v0.121.1-00b46fed8e47f7bb0a85d7cfc2d9f1356379b740+extended linux/amd64 BuildDate=2023-12-08T08:47:45Z VendorInfo=gohugoio during the builds above.

Iin the hex dumps. It transforms to incorrect code. Since it replaces the literal chars with whitespace characters.

@tdewolff
Copy link
Owner

That looks correct. You can disable template literals by setting the Version to below 2015 (template literals are defined in ECMAScript 2015).

Iin the hex dumps. It transforms to incorrect code. Since it replaces the literal chars with whitespace characters.

What do you mean, it works in your example above but not with Hugo? Perhaps they haven't updated the dependency?

@naturedamends
Copy link
Author

naturedamends commented Jun 21, 2024

That looks correct. You can disable template literals by setting the Version to below 2015 (template literals are defined in ECMAScript 2015).

Iin the hex dumps. It transforms to incorrect code. Since it replaces the literal chars with whitespace characters.

What do you mean, it works in your example above but not with Hugo? Perhaps they haven't updated the dependency?

Yeah. So many versions and test cases linked to versions, it’s hard to say words. Lol. Sorry.

  • Using hugo <= 0.127 , with minify, the output of my program changes.
  • It has always worked using hugo without minify
  • My test programs work using your standalone program, from npm. version above
  • They may have updated. But it hasn’t reached downstream where I am downloading from.

If i remember right. For one reason or another. You can no longer transform all the code into es5 properly using jsbuild in hugo. Cant remember why. Seems like everyone is moving away from it. This must be one of them.

@naturedamends
Copy link
Author

naturedamends commented Jun 30, 2024

That looks correct. You can disable template literals by setting the Version to below 2015 (template literals are defined in ECMAScript 2015).

Iin the hex dumps. It transforms to incorrect code. Since it replaces the literal chars with whitespace characters.

What do you mean, it works in your example above but not with Hugo? Perhaps they haven't updated the dependency?

Just updated hugo and it seem to work. naturedamends/hugo@6f48198

(() => {
    function e() {
        e.EOL = `\r
`
    }
    console.log(e.EOL),
    console.log(e())
})()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants