fs.readdir/open broken for non-UTF8 filenames #2387
Comments
Yes, it's a known problem with no good workaround currently. File names on Unices are simple byte strings with no particular encoding. V8 on the other hand requires all strings to be either UTF-8 or UCS-2. I think that we'll add support for file names as buffers eventually (as opposed to file names as strings) but that's a moderately large undertaking.
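For illustration, here is a minimal sketch of what "file names as buffers" can look like. It assumes a Node.js release much newer than the ones discussed in this thread, where `fs.readdirSync` accepts `{ encoding: 'buffer' }` and the `fs` functions accept a `Buffer` as a path. The helper name `statAllRaw` is hypothetical, not part of any API.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: stat every entry in `dir` without ever decoding
// the entry names into JavaScript strings.
// Assumption: a Node.js version whose fs API supports Buffer paths and
// the { encoding: 'buffer' } option; the versions in this thread do not.
function statAllRaw(dir) {
  // The directory itself is assumed to have a name that is valid UTF-8.
  const dirBytes = Buffer.from(dir + path.sep);

  // Each name comes back as the raw bytes stored by the file system.
  const names = fs.readdirSync(dir, { encoding: 'buffer' });

  return names.map((name) => {
    // Build the full path as bytes so no lossy decoding step occurs.
    const fullPath = Buffer.concat([dirBytes, name]);
    return { name, size: fs.statSync(fullPath).size };
  });
}
```

Because the names stay as bytes from `readdirSync` through to `statSync`, a name such as the single byte 0xDF should reach the file system unchanged. Whether such a name can be created at all depends on the platform and file system.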
At least, is there some workaround for this? I think the severity of this is being underestimated. This is not about files with bad or invalid filename encodings. It is about filenames that are valid in other very common encodings that aren't UTF-8, like ISO 8859-1. I use node.js to handle files on an FTP/SFTP server. I need to get stats for all files that arrive in a directory. I just use fs.readdir and pass the results to fs.stat, and I get ENOENT errors for paths that are not valid UTF-8 but are valid in the encodings my users use, like ISO 8859-1, and I can't prevent my users from uploading files like this. I can't open or stat any file whose name is valid for my system but isn't UTF-8. Tested on node 0.10.33 on Linux.
Is there any progress on this? I'm a bit surprised this still appears to be an issue with no apparent work-around, surely me and felipeaf aren't the only people having to deal with file names that aren't valid UTF-8 (especially on *nix)? :) EDIT: This isn't going to be a practical work-around for all cases (not mine at least, since I have to deal with user-supplied file hierarchies), but the 'convmv' utility available at https://www.j3e.de/linux/convmv/ can be used to rename files so that they conform to a given character set.
I don't have a solution but here's a gist for how to reproduce the problem.
I've encountered the same problem.
I can understand that this is a major move, but are there any plans or updates on this? It's been 4 years now.
As previous comments indicate, this is still an issue. Unfortunately there does not appear to be any workaround that will work without introducing an API change (allowing the filename to be a buffer, as indicated in previous comments). @joyent/node-coreteam ... is this something we'd want to tackle here or defer to the converged stream?
@srl295 @orangemocha .. would either of you have an opportunity to look at this one?
Can confirm bug.
I have problems with fs.readdir(Sync) and directories/filenames that include ISO-8859-1 characters. The following is an example of this using the special character ß (ISO-8859-1: 223 / 0xDF / 0337 in dec/hex/oct; UTF-8: U+00DF, encoded as 0xC3 0x9F):
I'm using node v0.6.4 on Linux. Related issues: #1971, #1842 and #1785
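The byte values quoted above can be checked with a short sketch using only `Buffer`, with no file system access. It shows why a name stored as the ISO-8859-1 byte 0xDF does not survive being decoded as UTF-8 and encoded again. That `fs.readdir` followed by `fs.stat` performs exactly this round trip internally is an assumption made for illustration; the variable names are hypothetical.

```javascript
// "ß" as stored on disk by an ISO-8859-1 system: the single byte 0xDF.
const latin1Name = Buffer.from([0xdf]);

// The same character encoded as UTF-8: the two bytes 0xC3 0x9F.
const utf8Name = Buffer.from('\u00df', 'utf8');

// Decoding the ISO-8859-1 byte as UTF-8 fails: 0xDF starts a two-byte
// sequence that never completes, so the decoder substitutes U+FFFD.
const decoded = latin1Name.toString('utf8');

// Encoding that string again yields the bytes of U+FFFD, not the
// original byte, so a lookup by this name targets a different file.
const reencoded = Buffer.from(decoded, 'utf8');

console.log(latin1Name, utf8Name, reencoded);
console.log('round trip preserved name:', reencoded.equals(latin1Name));
```

If the path handed back to the file system has been through this conversion, it names a file that does not exist, which is consistent with the ENOENT errors reported in this thread.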