This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

fs.readdir/open broken for non-UTF8 filenames #2387

Closed
4poc opened this issue Dec 20, 2011 · 8 comments


@4poc

4poc commented Dec 20, 2011

I have problems with fs.readdir(Sync) and directory or file names that include ISO-8859-1 characters. The following example uses the special character ß (ISO-8859-1: 223 / 0xDF / 0337 in decimal/hex/octal; Unicode: U+00DF, encoded in UTF-8 as 0xC3 0x9F):

$ locale
LANG=en_US.ISO-8859-1
$ mkdir isotest; cd isotest
$ touch Gro$'\xDF'
$ ls
Groß
$ node
> var fs = require('fs');
> var file = fs.readdirSync('.')[0];
> file
'Gro�' # already broken encoding
> new Buffer(file, 'binary')
<Buffer 47 72 6f fd> 
> fs.openSync(file, 'r');
Error: ENOENT, no such file or directory 'Gro�'
    at Object.openSync (fs.js:230:18)
    at repl:1:4
    at REPLServer.eval (repl.js:80:21)
    at repl.js:190:20
    at REPLServer.eval (repl.js:87:5)
    at Interface.<anonymous> (repl.js:182:12)
    at Interface.emit (events.js:67:17)
    at Interface._onLine (readline.js:162:10)
    at Interface._line (readline.js:426:8)
    at Interface._ttyWrite (readline.js:603:14)
> # other tests try to open the file manually:
> fs.openSync('Gro\xDF', 'r')
Error: ENOENT, no such file or directory 'Gro�'
> fs.openSync('Gro\u00DF', 'r')
Error: ENOENT, no such file or directory 'Gro�'

I'm using node v0.6.4 on Linux. Related issues: #1971, #1842 and #1785.

@bnoordhuis
Member

Yes, it's a known problem with no good workaround currently. File names on Unices are simple byte strings with no particular encoding. V8 on the other hand requires all strings to be either UTF-8 or UCS-2.
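The loss described here can be reproduced without touching the file system. The following is a minimal sketch, not part of the original report, and it assumes a current Node.js with `Buffer.from` rather than the v0.6 `new Buffer` used above. It decodes the four raw bytes of the ISO-8859-1 name as UTF-8 and re-encodes the result, which shows why the string handed back by `fs.readdirSync` can no longer name the file.

```javascript
// Raw bytes of the file name as stored on disk: "Gro" followed by 0xDF (ISO-8859-1 "ß").
const rawName = Buffer.from([0x47, 0x72, 0x6f, 0xdf]);

// Decoding as UTF-8: the lone byte 0xDF is not a complete UTF-8 sequence,
// so it is replaced by U+FFFD (the replacement character).
const decoded = rawName.toString('utf8'); // 'Gro\uFFFD'

// Re-encoding the string as UTF-8 gives 47 72 6f ef bf bd, not the original bytes,
// so a path built from `decoded` points at a name that does not exist (ENOENT).
const reencoded = Buffer.from(decoded, 'utf8');

// Encoding U+FFFD as 'latin1' (alias 'binary') keeps only the low byte, 0xFD,
// which matches the <Buffer 47 72 6f fd> shown in the report.
const truncated = Buffer.from(decoded, 'latin1');

console.log(decoded === 'Gro\uFFFD');      // true
console.log(reencoded);                    // <Buffer 47 72 6f ef bf bd>
console.log(truncated);                    // <Buffer 47 72 6f fd>
console.log(rawName.equals(reencoded));    // false
```

Decoding the same bytes as `'latin1'` would give `'Gro\u00DF'`, but that only helps if the path is later passed to the file system as those same single bytes, which a string-only API cannot guarantee.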

I think that we'll add support for file names as buffers eventually (as opposed to file names as strings) but that's a moderately large undertaking.
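For readers on a later Node.js release than the versions discussed in this thread: `fs` functions there accept `Buffer` paths, and `fs.readdirSync` takes an `encoding: 'buffer'` option. The sketch below shows what "file names as buffers" can look like under those assumptions. `openAllRaw` is a hypothetical helper name, and the hard-coded `/` separator assumes a POSIX system.

```javascript
const fs = require('fs');

// Hypothetical helper: open every entry of `dir` by its raw byte name and
// report its size, without ever decoding the name into a string.
function openAllRaw(dir) {
  const dirBuf = Buffer.from(dir);
  const sep = Buffer.from('/'); // assumption: POSIX path separator

  // encoding: 'buffer' makes readdirSync return Buffer objects instead of strings.
  return fs.readdirSync(dirBuf, { encoding: 'buffer' }).map((name) => {
    // Build the full path as bytes so non-UTF-8 names survive unchanged.
    const fd = fs.openSync(Buffer.concat([dirBuf, sep, name]), 'r');
    try {
      return { name, size: fs.fstatSync(fd).size };
    } finally {
      fs.closeSync(fd);
    }
  });
}
```

Calling `openAllRaw('.')` in the `isotest` directory from the report would be expected to return one entry whose `name` is the four original bytes. Converting that `name` to a string for display is still lossy unless the encoding is known.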

@felipeaf

felipeaf commented Jan 8, 2015

At least, is there some workaround for this? I think the severity of this is being underestimated. This is not about files with bad or invalid filename encodings. It is about filenames that are valid in other very common encodings that aren't UTF-8, such as ISO 8859-1.

I use node.js to handle files on an FTP/SFTP server. I need to get stats for all files that arrive in a directory. I just call fs.readdir and pass the results to fs.stat, and I get ENOENT errors for paths that are not valid UTF-8 but are valid in encodings my users use, such as ISO 8859-1, so I can't prevent my users from uploading files like this. I can't open or stat any file whose name is valid on my system but isn't UTF-8.

Tested on node 0.10.33 on Linux.

@hnsr

hnsr commented Mar 13, 2015

Is there any progress on this? I'm a bit surprised this still appears to be an issue with no apparent workaround. Surely felipeaf and I aren't the only people having to deal with file names that aren't valid UTF-8 (especially on *nix)? :)

EDIT: This isn't going to be a practical workaround for all cases (not mine at least, since I have to deal with user-supplied file hierarchies), but the 'convmv' utility available at https://www.j3e.de/linux/convmv/ can be used to rename files so that they conform to a given character set.
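The renaming idea can also be sketched in Node itself, with the caveat that this relies on `Buffer` paths and the `encoding: 'buffer'` option of later Node.js releases, not the versions discussed in this thread. `renameLatin1ToUtf8` is a hypothetical helper, not convmv's algorithm. It assumes, rather than detects, that any name that is not valid UTF-8 is ISO-8859-1, and it assumes a POSIX `/` separator.

```javascript
const fs = require('fs');

// Hypothetical helper: rename entries of `dir` whose names are not valid UTF-8,
// treating their bytes as ISO-8859-1 (an assumption, not a detection).
function renameLatin1ToUtf8(dir) {
  const dirBuf = Buffer.from(dir);
  const sep = Buffer.from('/'); // assumption: POSIX path separator
  const renamed = [];

  for (const raw of fs.readdirSync(dirBuf, { encoding: 'buffer' })) {
    // A name that survives a UTF-8 decode/encode round trip is already valid UTF-8.
    if (Buffer.from(raw.toString('utf8'), 'utf8').equals(raw)) continue;

    // Interpret the bytes as ISO-8859-1, then encode the same characters as UTF-8.
    const utf8Name = Buffer.from(raw.toString('latin1'), 'utf8');
    fs.renameSync(
      Buffer.concat([dirBuf, sep, raw]),
      Buffer.concat([dirBuf, sep, utf8Name])
    );
    renamed.push(utf8Name.toString('utf8'));
  }
  return renamed;
}
```

Like convmv, this changes the names on disk, so it is only suitable when renaming user files is acceptable. The sketch does not check whether the target name already exists.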

@mk-pmb

mk-pmb commented Mar 13, 2015

I don't have a solution but here's a gist for how to reproduce the problem.

@utensil

utensil commented Mar 15, 2015

I've encountered the same problem.

> I think that we'll add support for file names as buffers eventually (as opposed to file names as strings) but that's a moderately large undertaking.

I can understand that this is a major move, but are there any plans or updates on this? It's been 4 years now.

@jasnell
Member

jasnell commented May 20, 2015

As previous comments indicate, this is still an issue. Unfortunately, there does not appear to be any workaround that will work without introducing an API change (allowing the filename to be a buffer, as indicated in previous comments).

@joyent/node-coreteam ... is this something we'd want to tackle here or defer to the converged stream?

@jasnell
Member

jasnell commented Aug 26, 2015

@srl295 @orangemocha .. would either of you have an opportunity to look at this one?

@sheepa

sheepa commented Nov 11, 2015

Can confirm bug.
