Custom Dynamically Sized Types for Rust #1524
@dgrunwald Ah, well, this RFC does support that.
struct CStr([libc::c_char]);
impl Dst for CStr {
    type Meta = ();
}
impl Sizeable for CStr {
    fn size_of_val(&self) -> usize {
        // strlen is an unsafe FFI call; the data must be NUL-terminated
        unsafe { libc::strlen(self.0.as_ptr()) }
    }
}
impl CStr {
    // The returned reference needs an explicit lifetime, since there is no input reference to elide from
    pub unsafe fn from_ptr<'a>(p: *const libc::c_char) -> &'a CStr {
        // either
        std::mem::transmute::<*const libc::c_char, &CStr>(p)
        // or
        std::ptr::make_fat_ptr(p as *const (), ())
    }
} |
I'm not sure about this. My main concern is around how actually useful this is compared to the complexity. There is a lot of code that assumes fat pointers are two words and two words only. This would completely change that. It's a very large increase in complexity. Another issue is how this interacts with generics and unsizing. Currently Right now, my thought is that it's a nice feature (like most that get proposed), but not worth the increased complexity. |
@Aatch We need something like this. Look at the documentation for
This is just in the standard library. Out there in real code, we have to deal with C interop, where stuff like
typedef struct _TOKEN_GROUPS {
    DWORD GroupCount;
    SID_AND_ATTRIBUTES Groups[ANYSIZE_ARRAY];
} TOKEN_GROUPS, *PTOKEN_GROUPS;
has to be translated into
struct TokenGroups {
    group_count: u32,
    groups: [SID_AND_ATTRIBUTES; 0],
}
And traits like
struct 2DSlice<'a, T> {
    width: usize,
    height: usize,
    ptr: *const T,
    lifetime: PhantomData<&'a [T]>,
}
struct 2DSliceMut<'a, T> {
    width: usize,
    height: usize,
    ptr: *mut T,
    lifetime: PhantomData<&'a mut [T]>,
} |
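For readers following along, here is a minimal sketch of the borrowed 2-D view workaround described in this comment, written so that it compiles on stable Rust. The struct is renamed to `Slice2D` because a Rust identifier cannot start with a digit, and the `new`, `row` and `get` methods are assumptions added for illustration; they are not from the RFC or from any library.

```rust
use std::marker::PhantomData;

/// Borrowed, row-major 2-D view over a flat buffer (hypothetical name).
pub struct Slice2D<'a, T> {
    width: usize,
    height: usize,
    ptr: *const T,
    lifetime: PhantomData<&'a [T]>,
}

impl<'a, T> Slice2D<'a, T> {
    /// Returns `None` unless `data` holds exactly `width * height` elements.
    pub fn new(data: &'a [T], width: usize, height: usize) -> Option<Self> {
        if width.checked_mul(height)? != data.len() {
            return None;
        }
        Some(Slice2D {
            width,
            height,
            ptr: data.as_ptr(),
            lifetime: PhantomData,
        })
    }

    /// Borrows one row as an ordinary slice, or `None` if `r` is out of range.
    pub fn row(&self, r: usize) -> Option<&'a [T]> {
        if r >= self.height {
            return None;
        }
        // SAFETY: `new` checked that the buffer holds `width * height`
        // elements and `r < height`, so the row lies inside the borrow.
        Some(unsafe { std::slice::from_raw_parts(self.ptr.add(r * self.width), self.width) })
    }

    /// Bounds-checked element access.
    pub fn get(&self, r: usize, c: usize) -> Option<&'a T> {
        self.row(r)?.get(c)
    }
}
```

Because the view is a separate struct and not a reference, it cannot be the output of `Deref` or `Index` (both must return `&Target`), which seems to be the ergonomic gap this comment is pointing at.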
@Aatch Unsizing isn't touched in this RFC. We really need integer generics to start doing stuff with unsizing. However, once we do get integer generics, it'll be quite easy to add that to this and we can have unsizing for everybody :) |
@Aatch A generic |
@ubsan |
@reem 1) That's not an unsized type, 2) that wouldn't actually be a problem because you know what Meta is (as long as |
@japaric Didn't you work on user-defined DSTs before? Do you have a link to your old effort? |
How would indexing the inner Idea: Add |
@thepowersgang It's an unsafe operation. The inner |
I originally proposed using @Aatch Fat pointers being hardcoded to one pointer-sized piece of metadata is a mistake. There are important extensions to slice and traits we need to support and that hardcoding has to go (which is easier with the MIR). |
I liked |
Yes, I was working on it last year. I don't have time right now to analyze/comment on the design proposed in this RFC. So I'm just going to drop a few links here: |
@thepowersgang actually brings up a good point, if you can get at the Either use a |
Hmm, yeah, I do think [T; 0] is better now that it's been discussed. |
Although, I'm thinking about how it would work with integer generics and Unsizing coercions... I think that
struct PascalStr {
    len: usize,
    buf: [u8],
}
// &PascalStr == ptr -> { len, [buf; len] }
// &PascalStr<N> == ptr -> { N, [buf; N] }
// PascalStr<N> == { N, [buf; N] } |
@ubsan eh, I'm not sure that's really worth it here. My main concern is that the compiler now has to be aware of a lot more context when handling field access. The main advantage of Unsizing is probably better handled via separate types. This is already the case, |
@Aatch but they're not completely different types. You need them to be exactly the same type, in fact, behind the pointer, for unsizing coercions to work. |
@ubsan no, they're the same representation, not the same type. |
// `?Sized` is required so that `B` can be the unsized `[u8]`
struct PascalStrBuffer<B: ?Sized> {
    len: usize,
    buf: B,
}
type PascalStrArray<const N: usize> = PascalStrBuffer<[u8; N]>;
type PascalStr = PascalStrBuffer<[u8]>;
This might be an okay compromise of sorts. I've also considered having a single optional integer parameter, which denotes the minimal contained size and defaults to |
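As a rough illustration of this generic-buffer approach, the sketch below compiles on stable Rust and shows the unsizing coercion from the array-backed type to the slice-backed one. The `new`, `as_bytes` and `erase_len` helpers are assumptions added for the example. Note that `&PascalStr` here is an ordinary fat pointer, so the length is stored twice (once in `len`, once in the pointer metadata), which is the redundancy a custom DST would avoid.

```rust
/// Length-prefixed buffer; `B: ?Sized` lets the last field be a slice.
#[repr(C)]
pub struct PascalStrBuffer<B: ?Sized> {
    len: usize,
    buf: B,
}

pub type PascalStrArray<const N: usize> = PascalStrBuffer<[u8; N]>;
pub type PascalStr = PascalStrBuffer<[u8]>;

impl<const N: usize> PascalStrBuffer<[u8; N]> {
    /// Builds a sized, array-backed string (hypothetical constructor).
    pub fn new(buf: [u8; N]) -> Self {
        PascalStrBuffer { len: N, buf }
    }
}

impl PascalStrBuffer<[u8]> {
    /// Returns the bytes covered by the stored length.
    pub fn as_bytes(&self) -> &[u8] {
        &self.buf[..self.len]
    }
}

/// Sized reference in, fat reference out: the built-in unsizing coercion
/// `&PascalStrBuffer<[u8; N]>` to `&PascalStrBuffer<[u8]>` happens at the return.
pub fn erase_len<const N: usize>(s: &PascalStrArray<N>) -> &PascalStr {
    s
}
```

The coercion works because `B` appears only in the last field of the struct, which is the condition the compiler places on struct unsizing.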
@Aatch fine; you can argue whether they're the same type or not. They are the same representation, definitely, and if you had to write separate Sized versions of each type, imagine the really big types:
#[repr(C)]
struct CLAIM_SECURITY_ATTRIBUTE_RELATIVE_V1 {
    Name: DWORD,
    ValueType: WORD,
    Reserved: WORD,
    Flags: DWORD,
    ValueCount: DWORD,
    union Values {
        pInt64: [DWORD],
        pUint64: [DWORD],
        ppString: [DWORD],
        pFqbn: [DWORD],
        pOctetString: [DWORD],
    }
}
imagine writing two of these types, and now for each of the Windows flexible array structs. It gets ridiculous. |
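To make the flexible-array workaround concrete, here is a hedged sketch of how the `[T; 0]` translation of `TOKEN_GROUPS` from earlier in the thread is typically read today. `Attr` is a hypothetical two-field stand-in for `SID_AND_ATTRIBUTES`, and the `groups` accessor is an assumption for illustration, not a Windows or `libc` API. It takes a raw pointer so that the pointer's provenance can cover the trailing elements.

```rust
use std::ptr;
use std::slice;

/// Hypothetical stand-in for `SID_AND_ATTRIBUTES`.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Attr {
    pub sid: u32,
    pub attributes: u32,
}

/// Header plus a zero-length array marking where the trailing data starts.
#[repr(C)]
pub struct TokenGroups {
    pub group_count: u32,
    pub groups: [Attr; 0],
}

impl TokenGroups {
    /// Reads the trailing elements that follow the header in memory.
    ///
    /// # Safety
    /// `this` must point to a `TokenGroups` header followed, in the same
    /// allocation, by `group_count` initialized `Attr` values, all of which
    /// stay valid and unmodified for `'a`.
    pub unsafe fn groups<'a>(this: *const TokenGroups) -> &'a [Attr] {
        unsafe {
            let count = (*this).group_count as usize;
            // Project to the field without creating a reference to the header,
            // so the resulting pointer may address the elements after it.
            let first = ptr::addr_of!((*this).groups) as *const Attr;
            slice::from_raw_parts(first, count)
        }
    }
}
```

Nothing in the type system ties `group_count` to the real allocation size, so every such struct needs its own hand-written unsafe accessor.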
It would be almost impossible to ensure the same layout for two different definitions though. |
In that case how about we restrict this to slice-like DSTs and use @thepowersgang's suggestion of having an |
@Aatch that seems overly restrictive. For example, if you have a
struct 2dSlice<T>([T]);
impl<T> Dst for 2dSlice<T> {
    type Meta = (usize, usize);
    fn element_count(&self) -> usize {
        // (meta.0 * meta.1)
    }
}
impl<T> Sizeable for 2dSlice<T> {...}
fn main() {
    let x = 2dSlice::<(), (std::usize::MAX, std::usize::MAX)>::new(); // not possible with `element_count`
} |
@ubsan that's not possible anyway! You can't have something that large. And it's not like the same reasoning doesn't apply to the Oh, and something I just thought of: destructors/dropping. It's all well and good having something that only contains Copy types, but what if you have something non-Copy in your custom DST? |
@Aatch yes you can :) look at the type inside! You only have a pointer to the values, and aren't the owner. You could |
@ubsan "not the owner"? Then who is? Do I now have to explicitly implement So far everything has been focused on custom strings, N-dimensional arrays of |
I can no longer keep this open in good faith. Someone else can take it from me, but ... personal issues. |
So, for some time now I've been wanting to write a followup to my previous comment sharing my thoughts about this work. Even though @ubsan has closed the RFC, I figured I should leave it anyway. Hopefully it is clear from my previous comments that I think the ideas in here are good in a technical sense, and I'd like to do some experimentation with them. However, it's probably also clear that I feel somewhat hesitant about this. My concerns here are not technical, but rather all about prioritization and motivation. Rust is an ambitious language. Making it successful is a big job and there is a lot of work ahead for us. When I ask myself "how would Custom DST help Rust grow over the next year", I don't feel like I hear a convincing answer. The main use case that I see is helping make matrix libraries more ergonomic. Now, admittedly, this is somewhat compelling, but otoh lack of custom dst isn't, I don't think, blocking work on Matrix-like things, it's just a way to make them nicer to use. (I'd be interested to hear otherwise; are there any applications where the presence of custom DST would dramatically change the design in some way that it's not even worth doing it otherwise?) Earlier I had thoughts, which I mentioned briefly in my comment, about the idea of trying to land this work in a very provisional way, to enable hacking. I am still interested in establishing a process for this. I think there are a number of things that might benefit from having a more "experimentation friendly" process (e.g., naked fns, inline asm, and interrupt ABIs come to mind). But I haven't gone and publicly worked on defining such a process because I also have concerns: the fact is that any time there is significant churn in the codebase, it has the potential to take a lot of time for core developers who are trying to focus on other things. I think the work around (On the other hand, there are some areas where we are adopting a looser process.
We did land naked fns in a "provisional state", for example, and that has largely been a non-event. There is some experimentation around the embedded area, particularly with ABIs. But all of these are quite narrow and targeted compared to custom dst, which affects the code that handles every single reference.) Before @ubsan closed this RFC, I was working up my courage to move to postpone (close) -- but I kept hesitating, because I think it's a good idea and I like it. Yet still I have the feeling that on balance it just isn't a good investment of resources. I regret that hesitating perhaps sends a more frustrating message than actually moving to close. So @ubsan, I'm sorry about that. |
@nikomatsakis Thank you for your comment. I agree with what you're saying - there are more pressing things to work on right now. I also wanted to comment on the following:
I think this is an accurate statement. There are a few things that I cannot do without custom DSTs, some of which don't even have ugly workarounds, but these are only mechanisms to tidy up the code and make things easier for users. Here are some example issues that we had to put off: AtheMathmo/rulinalg#149 (I think custom DSTs help this) It would be really nice to parallel the |
@AtheMathmo interestingly, I was planning to come and post a comment here anyhow just to mention I myself ran into a use-case for this just yesterday when hacking on my NLL prototype, specifically the bitset code. Really it's just the "2-D matrix use case" in disguise, but in trying to play with it, I did appreciate that there are things that are hard to do "just with methods". Though maybe I didn't find the best answer. In particular, I have a In any case, I definitely think that this conversation -- to what extent is this "just sugar" vs unlocking some fundamental capability -- is the critical one to deciding how to prioritize this change. |
Another thing that I was thinking about was this question: is this the sort of capability that, when added, would mean that the interface for existing libraries would want to be completely reworked to take advantage of it? |
I cannot speak for the others who are looking to make use of this feature, but for me I would be making significant changes. Right now we have the following types in rulinalg: Another particularly difficult area is with operator overloading. Right now we have an explosion of overloading implementations for all combinations of matrix types. I think that custom DSTs could help here too but I haven't explored the idea too much. (Hoping I'll try to spend some time thinking about whether there is anything else that I can add to this discussion. I've got quite used to working around the lack of custom DSTs and I don't think there are many features missing for the user. There are however quite a few ways we could make things nicer - not needing to import traits everywhere, being able to have |
@AtheMathmo it seems like it would be a fruitful exercise to try and use e.g. the design from my comment and sketch out how the types in your library would work (in contrast to now) and throw it in a gist. (I'd like to see how you envision the operator overloading working, too.) Apologies if you've already done it and I missed it. (The design from the RFC seems mostly fine too, I don't recall there being any major differences.) |
@nikomatsakis that sounds like a good idea! I'm a little busy for the rest of this week but hopefully will be able to sketch something out early next week. |
@AtheMathmo great, I'd love to see it. Since this RFC is closed, and it feels like we're still a bit in "design discussion" here, I'm going to open a thread on internals to carry on the conversation: https://internals.rust-lang.org/t/custom-dst-discussion/4842 |
For some more motivation, here's what I need in Crossbeam:
Pointers to such objects must be thin because atomically manipulating multiple words is a pain to do portably and performantly. Some support from the language for such use of DSTs would be great to have. |
Heap-allocating |
@SimonSapin Looking for exactly that as well. Have you found a suitable workaround yet? |
@bergus Today, the best you can do is manual memory layout computation, raw allocation, and https://github.com/rust-lang/rust/blob/1.26.2/src/liballoc/rc.rs#L714-L727 |
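A minimal sketch of that "manual layout computation plus raw allocation" approach, for a header followed by a trailing array behind a thin pointer. `ThinVec`, `Header` and the method names are hypothetical, and elements are restricted to `Copy` so the example does not need per-element drop logic. This is an illustration under those assumptions, not the code used in `Rc`.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::marker::PhantomData;
use std::ptr::{self, NonNull};

#[repr(C)]
struct Header {
    len: usize,
}

/// Thin (one-word) owning pointer to a `Header` followed by `len` elements.
pub struct ThinVec<T: Copy> {
    ptr: NonNull<Header>,
    _marker: PhantomData<T>,
}

impl<T: Copy> ThinVec<T> {
    /// Layout of header plus `len` elements, and the offset of the first element.
    fn layout(len: usize) -> (Layout, usize) {
        let array = Layout::array::<T>(len).expect("length overflow");
        let (layout, offset) = Layout::new::<Header>()
            .extend(array)
            .expect("layout overflow");
        (layout.pad_to_align(), offset)
    }

    pub fn from_slice(items: &[T]) -> Self {
        let (layout, offset) = Self::layout(items.len());
        // SAFETY: the layout is never zero-sized because it includes `Header`;
        // the writes below stay inside the fresh allocation.
        unsafe {
            let raw = alloc(layout);
            if raw.is_null() {
                handle_alloc_error(layout);
            }
            (raw as *mut Header).write(Header { len: items.len() });
            ptr::copy_nonoverlapping(items.as_ptr(), raw.add(offset) as *mut T, items.len());
            ThinVec {
                ptr: NonNull::new_unchecked(raw as *mut Header),
                _marker: PhantomData,
            }
        }
    }

    pub fn as_slice(&self) -> &[T] {
        // SAFETY: `from_slice` initialized the header and `len` elements.
        unsafe {
            let len = (*self.ptr.as_ptr()).len;
            let (_, offset) = Self::layout(len);
            let data = (self.ptr.as_ptr() as *const u8).add(offset) as *const T;
            std::slice::from_raw_parts(data, len)
        }
    }
}

impl<T: Copy> Drop for ThinVec<T> {
    fn drop(&mut self) {
        // SAFETY: the layout is recomputed from the stored length, so it
        // matches the one used for allocation.
        unsafe {
            let len = (*self.ptr.as_ptr()).len;
            let (layout, _) = Self::layout(len);
            dealloc(self.ptr.as_ptr() as *mut u8, layout);
        }
    }
}
```

The length lives in the allocation instead of in the pointer, so the handle stays one word wide, which is the property the Crossbeam comment above asks for.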
There's also this pattern for a slice with a header, and |
@SimonSapin Thanks |
Yes, and if you need mutable references, it prevents someone from (mis-)using |
Hi folks, I'm new to the Rust community, but was really hoping to implement a cross-platform library for high performance compressed vectors using this DST feature. These vectors would have, for compatibility reasons, a u32 size header and other header fields followed by something like |
I believe this fixes #813, and is a nicer, and far more powerful, solution than #709.