Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Amend RFC 517: Add material on std::fs #739

Merged
merged 2 commits into from
Feb 5, 2015
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
51 changes: 25 additions & 26 deletions text/0517-io-os-reform.md
Original file line number Diff line number Diff line change
Expand Up @@ -728,13 +728,14 @@ passing in Rust strings and literals directly, for example).
* `rename`. Take `AsPath` bound.
* `remove_file` (renamed from `unlink`). Take `AsPath` bound.

* `file_attr` (renamed from `stat`). Take `AsPath` bound. Yield a new
struct, `FileAttr`, with no public fields, but `size`, `kind` and
`permissions` accessors. The various `os::platform` modules will offer
extension methods on this structure.

* `set_permissions` (renamed from `chmod`). Take `AsPath` bound, and a
`FilePermissions` value. The `FilePermissions` type will be revamped
* `metadata` (renamed from `stat`). Take `AsPath` bound. Yield a new
struct, `Metadata`, with no public fields, but `len`, `is_dir`,
`is_file`, `perms`, `accessed` and `modified` accessors. The various
`os::platform` modules will offer extension methods on this
structure.

* `set_perms` (renamed from `chmod`). Take `AsPath` bound, and a
`Perms` value. The `Perms` type will be revamped
as a struct with private implementation; see below.

**Directories**:
Expand Down Expand Up @@ -762,8 +763,8 @@ passing in Rust strings and literals directly, for example).
The `File` type will largely stay as it is today, except that it will
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens when close fails, because you are using an async filesystem? (note this is different than caring about the server crashing). and please don't say fsync/flush each time because some people might want to run some rust programs from a script and then run 'sync' manually.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This RFC is largely concerned with the API of std::fs and how it's affected by the implementation details. I believe that we'll always have a Drop implementation which calls close on the file descriptor, so in some sense this is an implementation detail. Do you have an idea in mind where this would change the API of File, however?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API http://doc.rust-lang.org/std/io/fs/struct.File.html doesn't tell me how I know close succeeded, so I view it as incomplete. I also think what functions panic is part of the public API that shouldn't change.

As I posted http://discuss.rust-lang.org/t/fs-file-should-panic-if-implicit-close-fails-and-no-panic-is-active/1349 , I think there should be an explicit .close() method that people can call, that returns an error, and otherwise drop should panic if implicit close() fails and there is not already a panic in progress. I think that makes the most code correct.

edit: and having an explicit .close() could change what errors are returned by other functions, e.g. read() could then panic or return an error ("file is closed").

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we may want to continue the discussion about explicit close over here.

use the `AsPath` bound everywhere.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the file descriptor management / safety strategy? FDs are a global heap, just like memory, that can alias each other, have double frees, dangling pointers, etc. What precautions are taken by the IO types to make fd handling safe? It would be shame if safe rust code couldn't corrupt memory, but could corrupt the FD table.

I think at a bare minimum, IO types should abort if(errno == EBADF) on any operation, but that is closing the barn door after the horses are out.

I'm looking at as_raw_fd and wondering what its purpose is. Is anything mutating that going to be marked as unsafe? What about reading it? Obviously that fd could be invalidated by the time you get it, if another thread got a mutex on the file and closed it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I consider our story here very analagous to that of plain old memory management pointers like Box<T> You can get a very unsafe *mut T, for example, safely from a Box<T> but mutation is all unsafe (similar to how as_raw_fd is safe but usage is likely unsafe). Similarly, Box<T> doesn't attempt to check if the memory is valid and I don't think that our bindings will special-case EBADF as well.

In general each object will own its file descriptor. The descriptor can be inspected and lent out via a safe API (but in this case platform specific) and relies on modification being unsafe elsewhere. You are, however, guaranteed that the raw component (in this case a file descriptor) is valid so long as the file itself is still alive.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do I get a *mut T from a Box using safe code?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be clear, this is what bothers me:

let p = Path::new("/dev/null");
let f = File::open(&p).unwrap().as_raw_fd();
println!("fd: {}", f);

That's a dangling, non-sensical "pointer" even in single threaded, 100%
safe code. Can as_raw_fd just be made unsafe?

On Mon, Jan 26, 2015 at 7:42 PM, Alex Crichton notifications@github.com
wrote:

In text/0517-io-os-reform.md
#739 (comment):

+* remove_dir (renamed from rmdir). Take AsPath bound.
+* remove_dir_all (renamed from rmdir_recursive). Take

  • AsPath bound.
    +* walk_dir. Take AsPath bound. Yield an iterator over IoResult<DirEntry>.

+Links:
+
+* hard_link (renamed from link). Take AsPath bound.
+* sym_link (renamed from symlink). Take AsPath bound.
+* read_link (renamed form readlink). Take AsPath bound.
+
+#### Files
+[Files]: #files
+
+The File type will largely stay as it is today, except that it will
+use the AsPath bound everywhere.

I consider our story here very analagous to that of plain old memory
management pointers like Box You can get a very unsafe *mut T, for
example, safely from a Box but mutation is all unsafe (similar to how
as_raw_fd is safe but usage is likely unsafe). Similarly, Box doesn't
attempt to check if the memory is valid and I don't think that our bindings
will special-case EBADF as well.

In general each object will own its file descriptor. The descriptor can be
inspected and lent out via a safe API (but in this case platform specific)
and relies on modification being unsafe elsewhere. You are, however,
guaranteed that the raw component (in this case a file descriptor) is valid
so long as the file itself is still alive.


Reply to this email directly or view it on GitHub
https://github.com/rust-lang/rfcs/pull/739/files#r23581430.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what way is that unsafe? The fd is just an integer, and I can't see how you'd extract any memory unsafety from it. In the worst case, it's closed while there's another copy of it floating around, in which case calls using it start returning errors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do I get a *mut T from a Box using safe code?

Something like:

let foo = Box::new(3);
let foo_ptr = &*foo as *const i32;
drop(foo);
println!("{:?}", foo_ptr);

Can as_raw_fd just be made unsafe?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er sorry, didn't finish that last comment, I was going to mention that your example is very similar to:

use std::ffi::CString;
let c_str = CString::from_slice(b"foo").as_ptr();
unsafe { c_funtion(c_str); }

This is also passing a dangling pointer to an external function, but I don't think it necessarily means we should make as_ptr and unsafe method.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Wed, Jan 28, 2015 at 10:39 AM, Steven Fackler notifications@github.com
wrote:

In what way is that unsafe? The fd is just an integer, and I can't see how
you'd extract any memory unsafety from it. In the worst case, it's closed
while there's another copy of it floating around, in which case calls using
it start returning errors.

No, the worst case is it aliases another later opened file and you
truncate /etc/passwd without meaning to. You can also pass it to open
/proc/self/fd/N and get the wrong (aliased) file. Heh.


The `stat` method will be renamed to `attr`, yield a `FileAttr`, and
take `&self`.
The `stat` method will be renamed to `metadata`, yield a `Metadata`
structure (as described above), and take `&self`.

The `fsync` method will be renamed to `sync_all`, and `datasync` will be
renamed to `sync_data`. (Although the latter is not available on
Expand All @@ -774,39 +775,42 @@ filesystems.)
The `path` method wil remain `#[unstable]`, as we do not yet want to
commit to its API.

The `open_opts` function (renamed from `open_mode`) will take an `OpenOptions`
struct, which will encompass today's `FileMode` and `FileAccess` and support a
builder-style API.
The `open_mode` function will be removed in favor of and will take an
`OpenOptions` struct, which will encompass today's `FileMode` and
`FileAccess` and support a builder-style API.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One method I think we forgot to consider is truncate on files today. Right now the function matches Unix semantics where it basically acts as a set_len function. It will shorten the file or extend it (filling in with 0s) if necessary.

This function, however doesn't exist on Windows to my knowledge (cc @retep998, I may be missing something). Currently we're using SetEndOfFile which operates slightly different than the Unix variant. On Windows you must first seek to a position and then "truncate" the file, which is to say that Windows will simply set the length of the file on disk to the current pointer in the file.

Unfortunately, the current semantics may lead to some loss on windows. Our "emulation" of ftruncate on Windows involves seeking the file to the desired length, calling SetEndOfFile, and then seeking back to where the file used to be. The "seek back" operation, however, may fail, which would mean the operation actually completed successfully but we're forced to still return an error.

I think I would personally recommend matching Windows semantics rather than unix semantics in this case. We would drop the u64 argument to the function, call SetEndOfFile on Windows, and unix would be a combination of lseek (to learn the position) followed by ftruncate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha! @retep998 has pointed out SetFileInformationByHandle to me which worked like a charm!

In that case I think we're fine :)

#### File kinds
[File kinds]: #file-kinds

The `FileType` module will be renamed to `FileKind`, and the
underlying `enum` will be hidden (to allow for platform differences
and growth). It will expose at least `is_file` and `is_dir`; the other
methods need to be audited for compatibility across
The `FileType` type will be removed. As mentioned above, `is_file` and
`is_dir` will be provided directly on `Meatadata`; the other types
need to be audited for compatibility across
platforms. Platform-specific kinds will be relegated to extension
traits in `std::os::platform`.

It's possible that an
[extensible](https://github.com/rust-lang/rfcs/pull/757) `Kind` will
be added in the future.

#### File permissions
[File permissions]: #file-permissions

The permission models on Unix and Windows vary greatly -- even between
different filesystems within the same OS. Rather than offer an API
that has no meaning on some platforms, we will initially provide a
very limited `FilePermissions` structure in `std::fs`, and then rich
very limited `Perms` structure in `std::fs`, and then rich
extension traits in `std::os::unix` and `std::os::windows`. Over time,
if clear cross-platform patterns emerge for richer permissions, we can
grow the `FilePermissions` structure.
grow the `Perms` structure.

On the Unix side, the constructors and accessors for `FilePermissions`
On the Unix side, the constructors and accessors for `Perms`
will resemble the flags we have today; details are left to the implementation.

On the Windows side, initially there will be no extensions, as Windows
has a very complex permissions model that will take some time to build
out.

For `std::fs` itself, `FilePermissions` will provide constructors and
For `std::fs` itself, `Perms` will provide constructors and
accessors for "world readable" -- and that is all. At the moment, that
is all that is known to be compatible across the platforms that Rust
supports.
Expand All @@ -819,11 +823,6 @@ This trait will essentially remain stay as it is (renamed from

#### Items to move to `os::platform`

* `change_file_times` will move to `os::unix` and remain `#[unstable]`
*for now* (cf `SetFileTime` on Windows). Eventually we will add back
a cross-platform function, when we have grown a notion of time in
`std` and have a good compatibility story across all platforms.

* `lstat` will move to `os::unix` and remain `#[unstable]` *for now*
since it is not yet implemented for Windows.

Expand All @@ -834,7 +833,7 @@ This trait will essentially remain stay as it is (renamed from
function in `std::fs`.

* In general, offer all of the `stat` fields as an extension trait on
`FileAttr` (e.g. `os::unix::FileAttrExt`).
`Metadata` (e.g. `os::unix::MetadataExt`).

### `std::net`
[std::net]: #stdnet
Expand Down