Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update the safety preconditions of from_raw_parts #129483

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

lolbinarycat
Copy link

@lolbinarycat lolbinarycat commented Aug 23, 2024

they now reflect the fact that zero-capacity collections do not allocate

fixes #119304

@rustbot
Copy link
Collaborator

rustbot commented Aug 23, 2024

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @workingjubilee (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Aug 23, 2024
@oskgo
Copy link
Contributor

oskgo commented Aug 24, 2024

Pointers backing Vecs of ZSTs can be dangling without capacity being 0.

they now reflect the fact that
zero-capacity collections do not allocate

fixes rust-lang#119304
@workingjubilee
Copy link
Member

Huh. My understanding was always that the zero-length Vec should be obtained specifically from dangling().

@workingjubilee
Copy link
Member

I would like it if you could clarify the motivation here.

The argument is that reconstructing a Vec may be unsound if this is not made true, because of Vec::new()'s internals.

However, it isn't actually language UB. This is about the library's contracts, and you're trying to argue the library is breaking its contract with... itself. Which it is allowed to do, as long as this doesn't lead to language UB. And it doesn't, as far as I understand. And more specifically, it is allowed to define an unfair contract, such that e.g. operating on a pointer "allocated" via Vec::new() is acceptable but you calling dangling() is not.

So, what do you actually have to say to the response of "I am the standard library. I make the rules."?

@lolbinarycat
Copy link
Author

@workingjubilee
this singular line right here:

These requirements are always upheld by any ptr that has been allocated via Vec<T, A>.

means that these "prerequisites" are not just that, but they are also a guarantee for Vec::as_raw_ptr and into_raw_parts, and the standard library does not uphold these guarantees as they are written today.

@workingjubilee
Copy link
Member

Hmm.

@oskgo
Copy link
Contributor

oskgo commented Aug 28, 2024

By simultaneously requiring that ptr is allocated by the global allocator and saying that Vec upholds the requirements, it's also saying that ptr being allocated by the global allocator is an invariant of Vec (at least as far as users are concerned).

This means that deallocating pointers obtained from Vec should be sound, but it's not. It's easy to make a custom global allocator that meets the requirements on GlobalAlloc but causes UB when trying to dealloc dangling pointers.

If the stdlib wants to prevent users from constructing dangling ptr manually, an alternative solution would be to require that ptr meets the existing requirements or is obtained from a Vec.

Edit: Note that I don't think the alternative solution I mentioned is a good one. I don't see a good motivation for making the use of manually created dangling pointers unsound. I also think that this would make the docs harder to read.

@workingjubilee
Copy link
Member

workingjubilee commented Aug 31, 2024

Ah. Noticed that since this PR was opened, this one was also: #129748

That makes it far less likely that Vec::from_raw_parts should (notionally) deviate.

@workingjubilee workingjubilee changed the title update the saftey preconditions of from_raw_parts Aug 31, 2024
@@ -913,10 +913,9 @@ impl String {
/// This is highly unsafe, due to the number of invariants that aren't
/// checked:
///
/// * The memory at `buf` needs to have been previously allocated by the
/// same allocator the standard library uses, with a required alignment of exactly 1.
/// * unless `capacity` is 0, `buf` must have been allocated using the global allocator with an alignment of 1 and a capacity of `capacity`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So do we really want to promise that any non-null pointer is okay if capacity is 0?

Also, "allocated using the global allocator with [...] a capacity" is odd -- the allocation ABI doesn't talk about "capacity", it talks about "size". So this should probably say "and a size of capacity".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I kinda think we don't, and should specify the NonNull::dangling() pointer if we accept any.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the point about using "size" rather than "capacity"; note that this is consistent with some of the current language on Vec::from_raw_parts "capacity needs to be the capacity that the pointer was allocated with". Maybe that should be changed as well then?

Regarding the requirement on String::from_raw_parts; what about doing the same as in Vec::from_raw_parts, making the alignment requirement always present? That would allow making your own dangling pointers, but I don't see the motivation for restricting the creation of dangling pointers to a specific API here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the requirement on String::from_raw_parts; what about doing the same as in Vec::from_raw_parts, making the alignment requirement always present? That would allow making your own dangling pointers, but I don't see the motivation for restricting the creation of dangling pointers to a specific API here.

So that we can change the value that we're expecting as the dangling pointer.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that we can change the value that we're expecting as the dangling pointer.

NonNull::dangling does specifically specify that it should never be used as a sentinel value, as it may also be a valid pointer.

what about doing the same as in Vec::from_raw_parts, making the alignment requirement always present?

is it possible to have a pointer that isn't byte aligned? I considered saying "the pointer must have an alignment of 1", but isn't that a given? maybe it can change the behavior of allocators? but, that shouldn't matter for a dangling pointer that is never freed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NonNull::dangling does specifically specify that it should never be used as a sentinel value, as it may also be a valid pointer.

See prior comments about the standard library being allowed to write unfair contracts and rely on its own internal implementation details in ways you cannot.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be being a bit silly here, but it just still doesn't seem like a good idea to bless all pointers thusly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See prior comments about the standard library being allowed to write unfair contracts and rely on its own internal implementation details in ways you cannot.

except that the behavior of allocators is not an internal implementation detail of the standard library. even if it was, NonNull::dangling can be manipulated into taking any value by changing the size of T. therefore, even with the special privilege of being the standard library, using NonNull::dangling as a sentinel value is still a logic error.

Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel I can just r+ on my own recognizance a doc fix to make it clear that whatever pointer is produced via String::new() or Vec::new() and then decomposing the type via into_raw_parts or a similar API can then be passed into String::from_raw_parts or Vec::from_raw_parts. That is obviously a fix. Saying it's not allowed is silly, because you shouldn't have to check to make sure std didn't pass you the dangling pointer or anything.

...but I feel anything more runs into a question about stabilizing API details for container types, so lmk if you want this I-libs-api-nominated.

@lolbinarycat
Copy link
Author

because you shouldn't have to check to make sure std didn't pass you the dangling pointer or anything.

it would be impossible to do this check, since there's not just one dangling pointer in existence, and also because the return value of NonNull::dangling may coincidentally be a valid pointer.

@oskgo
Copy link
Contributor

oskgo commented Sep 1, 2024

Requiring libs-api signoff for this seems fair to me. For that I would like to bring up std::slice::from_raw_parts as relevant, which (from my reading) already accepts manually created dangling pointers and whose language is much clearer to me (though that is in part because the functionality is less complex).

@workingjubilee
Copy link
Member

slices aren't container types, they're pointers.

@oskgo
Copy link
Contributor

oskgo commented Sep 2, 2024

Even though slices aren't container types they are integrated with the Vec API through slice::into_vec.

You can use std::slice::from_raw_parts (well actually, using std::ptr::slice_from_raw_parts is better) together with Box::from_raw and slice::into_vec to implement your own Vec::from_raw_parts that accepts any smaller than isize::MAX, aligned and non-null pointer when capacity or size_of::<T>() are 0: playground.

@workingjubilee
Copy link
Member

    pub fn into_vec<A: Allocator>(self: Box<Self, A>) -> Vec<T, A> {
        // N.B., see the `hack` module in this file for more details.
        hack::into_vec(self)
    }

I love it when std's code has comments like this.

@workingjubilee
Copy link
Member

Hmm, I see. That does constrain things a lot. I don't think we have a realistic choice otherwise, then.

I will re-review this for conformance with changes like #129748 and those formerly tracked by the now-closed #117945

@workingjubilee
Copy link
Member

@lolbinarycat It might be appropriate to add comments in this file that capture the fact that our implementation choices are stapled down by things like slice::into_vec.

@workingjubilee
Copy link
Member

@oskgo Thank you for finding that example, it was basically what I was wondering if it already existed or not. It didn't occur to me to go find it on another type that isn't Vec.

@lolbinarycat
Copy link
Author

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 2, 2024
@RalfJung
Copy link
Member

RalfJung commented Sep 2, 2024

You can use std::slice::from_raw_parts (well actually, using std::ptr::slice_from_raw_parts is better) together with Box::from_raw and slice::into_vec to implement your own Vec::from_raw_parts that accepts any smaller than isize::MAX, aligned and non-null pointer when capacity or size_of::() are 0: playground.

We don't guarantee that the Vec you end up with has the same pointer as what you started with, do we?

Though at this point we're getting deep into rules lawyering territory and there is probably little justification for not also guaranteeing this on Vec::from_raw_parts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.
5 participants