Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Node is not using the system's locale #4689

Closed
jmealo opened this issue Jan 30, 2013 · 35 comments
Closed

Node is not using the system's locale #4689

jmealo opened this issue Jan 30, 2013 · 35 comments
Labels

Comments

@jmealo
Copy link

jmealo commented Jan 30, 2013

Calling .toLocaleString() on Number and Date objects behaves differently than Chrome.

This behavior is present in Ubuntu 12.04 and OSX 10.8 using Node v0.8.17.

On Chrome:
1000.toLocaleString();

"1,000"

On Node:
1000.toLocaleString();

"1000"

Additional info:

process.env.lang is set to the the expected value en_US.UTF-8

locale on the terminal returns the following (OSX 10.8):
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=

locale on the terminal returns the following (Ubuntu 12.04):
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Related bugs: #966

@bnoordhuis
Copy link
Member

This is arguably a V8 bug. It ignores locale settings. In fact, all date and number formatting logic is hard-coded.

The reason it works in Chrome and Chromium is that those projects use v8-i18n on top of V8. I don't think that's a direction we want to take. It depends on libicu and that's a massive library. We would have to bundle it and that would increase our already large source tree by another 85 MB and ~500,000 LoC.

@andreioprisan
Copy link

Is there a better solution than including the v8-i18n package?

@bnoordhuis
Copy link
Member

Not really. Localization is a complex affair (the people who don't think it is are usually native English speakers) so you end up using something like libicu anyway.

@jcollum-hcg
Copy link

I think the docs should be fixed, currently they imply that toLocaleString will use locales. This is from the docs:

var number = 3500;
console.log(number.toLocaleString()); // Displays "3,500" in English locale

http://nodemanual.org/latest/js_doc/Number.html#Number.toLocaleString

Unless nodemanual.org isn't the official doc.

@bnoordhuis
Copy link
Member

Unless nodemanual.org isn't the official doc.

nodemanual.org isn't the official doc. :-)

@jcollum-hcg
Copy link

Dammit. What would you recommend using as the official doc? I've looked around a lot on this and haven't found anything.

@TooTallNate
Copy link

The latest official docs can always be found here:
http://nodejs.org/docs/latest/api/

On Tuesday, July 30, 2013, jcollum-hcg wrote:

Dammit. What would you recommend using as the official doc? I've looked
around a lot on this and haven't found anything.


Reply to this email directly or view it on GitHubhttps://github.com//issues/4689#issuecomment-21830342
.

@TooTallNate
Copy link

The Number class is part of JavaScript itself, which the NodeJS docs don't re-iterate. You probably want to read up on Mozilla's MDN docs for JavaScript, or perhaps read the ECMAScript 5 reference itself.

@jcollum-hcg
Copy link

Right but the MDN docs don't match with V8! MDN says you can pass in a BCP 47 locale string, which works in browsers but doesn't work in Node. I've looked for a V8 API reference but haven't found any that discuss Number::toLocaleString. It's frustrating -- since this is native code I'd rather not have to read C++ to figure out what a valid signature for my function call looks like.

Passing in a BCP 47 locale string clearly doesn't work in node, according to this issue ticket and my tests.

@bnoordhuis
Copy link
Member

V8 master supports linking against ICU directly now so that's an option we'll probably offer in the future. I don't think we're going to bundle ICU so you'll have to install that separately and enable it with a ./configure switch.

@jcollum-hcg
Copy link

I just worked around it with some regex stuff, since I didn't need full i18n. I just wish I hadn't spent so much time on this issue -- MDN says you can do X but V8 doesn't do X and there's no docs that explain that, just github bug reports. As a developer I'd like to see better docs for V8 if I'm going to continue to use it in a production environment.

@jmealo
Copy link
Author

jmealo commented Jul 31, 2013

I raised this issue last year and was told that browser support for i18n is
what allows V8 to detect system locale. It was indicated that it was not
technically feasible to integrate i18n support into Node. Can someone
please confirm?

On Wed, Jul 31, 2013 at 1:34 PM, Ben Noordhuis notifications@gh.neting.ccwrote:

V8 master supports linking against ICU directly now so that's an option
we'll probably offer in the future. I don't think we're going to bundle ICU
so you'll have to install that separately and enable it with a ./configure
switch.


Reply to this email directly or view it on GitHubhttps://github.com//issues/4689#issuecomment-21880490
.

@bnoordhuis
Copy link
Member

Not sure what you want me to confirm but here is a quick recap.

Previously, the only way to get native i18n in V8 was through the external v8-i18 module.

The V8 team recently pulled v8-i18n into V8 core (conceptually if not literally) but it still has a dependency on libicu and that's a library we're unlikely to bundle because of its massive size.

We'll probably add a configure option that lets you link against an external libicu but that's something you'll have to explicitly enable at build time.

tl;dr You can have native i18n but not by default.

Hope that clears things up.

@ericf
Copy link

ericf commented Jul 31, 2013

We'll probably add a configure option that lets you link against an external libicu but that's something you'll have to explicitly enable at build time.

tl;dr You can have native i18n but not by default.

@bnoordhuis could you keep this thread updated with the details (or links to them) of the approach for how people can use the v8 i18n support in Node once they're figured out?


Running the latest nightly I saw there's now an --enable_i18n (enable i18n extension) v8 option. When I ran with this on, it didn't do anything different for toLocaleString() functions. This now makes sense now that I know the libicu library is missing/not bundled.

@bnoordhuis
Copy link
Member

I'll try to remember to update this issue but in case I forget, keep an eye on the ChangeLog of upcoming releases, it will be mentioned there.

@srl295
Copy link
Member

srl295 commented Aug 6, 2013

A better way to see if i18n is there is to just type "Intl" or even "new Intl.Collator();"

v8 trunk, though, if enablei18n=on is on the arguments to make, has 'make dependencies" go off and download ICU (although a several year old version 4.6). So v8 doesn't include ICU source, not sure that node would need to. But it would be better to depend on a system ICU to get the latest and greatest.

As far as being too big, you can slice and dice ICU lots of ways, both in code and data. Don't need 16MB of legacy mappings that aren't even used by v8? Then don't include them. We've always heard "ICU is too big!" and ".. but could you please support X?" - it takes a lot of code and data to get things right. See http://userguide.icu-project.org under packaging for more details.

Anyways, v8's include of icu.gyp didn't work inside the node build, but I really don't know what I'm doing and the following only seems to result in more spectacular failure:

diff --git a/deps/v8/tools/gyp/v8.gyp b/deps/v8/tools/gyp/v8.gyp
index 0c1ad68..afb3321 100644
--- a/deps/v8/tools/gyp/v8.gyp
+++ b/deps/v8/tools/gyp/v8.gyp
@@ -838,13 +838,13 @@
             '../../src/extensions/i18n/number-format.h',
           ],
           'dependencies': [
-            '<(DEPTH)/third_party/icu/icu.gyp:icui18n',
-            '<(DEPTH)/third_party/icu/icu.gyp:icuuc',
+            '<../../../../third_party/icu/icu.gyp:icui18n',
+            '<../../../../third_party/icu/icu.gyp:icuuc',

@srl295
Copy link
Member

srl295 commented Aug 6, 2013

so, got node+i18n working. Steps:

  • above patch
  • also needed to change '-fno-rtti' to '-frtti' in common.gypi
  • needed to make sure v8_enable_i18n_support=1 was set for v8

Before building, needed:
make -C deps/v8 enablei18n=on dependencies

not too bad.

Little test:

new Date().toLocaleString(["he"],{weekday:"long"});
'יום שלישי'

@medikoo
Copy link

medikoo commented Sep 12, 2013

@srl295

I followed your steps on v0.11.7 and it didn't work. Below is exactly what I've did:

  • your patch
  • changed '-fno-rtti' to '-frtti' in common.gypi
  • make -C deps/v8 enablei18n=on dependencies
  • ./configure
  • make
  • make install
  • node --enable_i18n

And in REPL Intl is not available, do you know what's missing?

@srl295
Copy link
Member

srl295 commented Sep 12, 2013

@medikoo i'm working on getting the full patch to post

@medikoo
Copy link

medikoo commented Sep 13, 2013

@srl295 thanks! Let us know

@srl295
Copy link
Member

srl295 commented Sep 13, 2013

@medikoo will do. hope to have some pull req's here soon.

@srl295
Copy link
Member

srl295 commented Sep 17, 2013

@medikoo put some patches here https://github.com/srl295/node/compare - still a work in progress.

@medikoo
Copy link

medikoo commented Sep 18, 2013

@srl295 great thanks, I'll watch that

@isaacs
Copy link

isaacs commented Oct 16, 2013

I would be ok wiht a pull request to add this functionality. Requirements:

  1. Must not bundle libicu. As @bnoordhuis points out, it's huge, and we probably don't want to statically include it.
  2. Node must still work properly out of the box, as it does now, and the default should be the same whether you have icu on your machine or not.
  3. --with-intl must to be an opt-in configure flag.

@trevnorris
Copy link

@isaacs Agreed. As of v8 3.22 icu is enabled by default (using a freaking annoying gyp flag I can't figure out how to get around during configuration), but we definitely shouldn't be bundling that massive 300MB lib w/ core. behind a build flag seems appropriate.

@ericf
Copy link

ericf commented Oct 17, 2013

Could this be approached the same way that Chrome and Firefox are providing their ECMAScript Internationalization API implementations and related bundles? They are not adding anything close to 300MBs to their download sizes. When I talked with @NorbertLindenberg about this, I think it added something around 4MB to Firefox's download size.

@NorbertLindenberg
Copy link

The increase in download size for Firefox caused by adding ICU and the ECMAScript Internationalization API varies between platforms, from 2 MB for Windows to 6 MB for Mac – the latter includes separate 32-bit and 64-bit versions of all binaries. So, as Eric said, nowhere near 300 MB. I did use a number of ICU configuration options to remove code and data that the Internationalization API doesn't need – see
http://hg.mozilla.org/mozilla-central/file/423b9c30c73d/js/src/configure.in#l4185
http://hg.mozilla.org/mozilla-central/file/423b9c30c73d/js/src/configure.in#l4238
http://hg.mozilla.org/mozilla-central/file/423b9c30c73d/js/src/Makefile.in#l263
http://userguide.icu-project.org/packaging
http://userguide.icu-project.org/icudata#TOC-Customizing-ICU-s-Data-Library
Mozilla currently uses ICU 50.

@nciric
Copy link

nciric commented Oct 17, 2013

Hi,
I am the author of the v8-i18n library, and I'll try to clarify some of the issues:

  1. v8 did integrate v8-i18n to enable snapshotting (much faster load times). It also helped with fixing conformance test failures.
  2. v8 and Chrome use old ICU 4.6 but heavily pruned, so the size should be much smaller than the regular checkout.
  3. We (Google) are working on ICU upgrade to 52 (should be done in couple of weeks).
  4. You can use system ICU (I think it's a gyp flag - use_system_icu)
  5. GYP flags could be overriden using -Duse_system_icu=value for example.
  6. When you load ICU it will detect a default locale for the system and we (i18n) use that as default, we don't look for navigator.language value.

Regards,
Nebojša Ćirić

@srl295
Copy link
Member

srl295 commented Oct 31, 2013

@nciric good to see you last week - let me know if you need help w/ the 52 upgrade. Also the pruning should be available upstream as ICU options.

Note, To put the size in perspective, see the data customizer at http://apps.icu-project.org/datacustom/ - Don't need codepage conversion? Drop 4.4MB. Don't care about Chinese collation? That's 1.2MB, etc. It takes data, and code, to get things right.

@srl295
Copy link
Member

srl295 commented Feb 14, 2014

@ericf @nciric I haven't tried the new #6371 build option, but will give it a try and at least post some numbers here.

@srl295
Copy link
Member

srl295 commented Feb 17, 2014

so, 12M to 24M ( 64 bit intel linux)

924K    out.000/Release/libcares.a
188K    out.000/Release/libchrome_zlib.a
28K     out.000/Release/libhttp_parser.a
5.7M    out.000/Release/libopenssl.a
1.4M    out.000/Release/libuv.a
16M     out.000/Release/libv8_base.x64.a
308K    out.000/Release/libv8_nosnapshot.x64.a
936K    out.000/Release/libv8_snapshot.a
12M     out.000/Release/node
924K    out.icu/Release/libcares.a
188K    out.icu/Release/libchrome_zlib.a
28K     out.icu/Release/libhttp_parser.a
9.5M    out.icu/Release/libicudata.a
4.7M    out.icu/Release/libicui18n.a
3.0M    out.icu/Release/libicuuc.a
5.7M    out.icu/Release/libopenssl.a
1.4M    out.icu/Release/libuv.a
16M     out.icu/Release/libv8_base.x64.a
340K    out.icu/Release/libv8_nosnapshot.x64.a
1.1M    out.icu/Release/libv8_snapshot.a
24M     out.icu/Release/node

@kenkopelson
Copy link

Hi Guys! Please don't take this wrong, but the above conversation needs some perspective. Could you please provide a quick simple solution to this? We have a serious commercial SaaS product written in Node that is getting ready for initial release, and we desperately need Node to honour locales, as it should. Saying you can't because some file is too big is really letting the cart lead the horse and tail wag the dog, since the purpose of Node is to accomplish business objectives. Saying that an important function like proper locale support for Date objects can't be delivered to a major foundational software platform because some file is too large...well that just seems a bit silly. Node.js exists so that companies like ours can build business solutions with it. We are literally talking hundreds of millions of dollars in revenue here, and not being able to achieve that because some file is too large seems like priorities need to be adjusted. Do you really think we care if a file is 4 meg, 12 meg, or 300 meg? Disk is cheap and so is transfer cost. What isn't cheap is development time or forgoing serious business product release because of relatively small technical issues. I'm trying to express a real concern here, that a major function of the Node platform (namely supporting software for a global market) is being delayed even 1 day because of back and forth discussions about megabytes of disk/transfer. This was a year ago. What has happened since? Come on guys, please get this fixed ASAP so that we and others out there who banked on Node.js can get on with it. Eternally grateful!

@dmportella
Copy link

+1 well put @kenkopelson

@Albert-IV
Copy link

It seems like you can compile Node.js to include ICU support:

https://github.com/joyent/node/wiki/Intl

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests