Revise lang_item demo to something unrelated to Box impl #22499

Closed
155 changes: 124 additions & 31 deletions src/doc/trpl/unsafe.md
@@ -649,69 +649,162 @@ it exists. The marker is the attribute `#[lang="..."]` and there are
 various different values of `...`, i.e. various different 'lang
 items'.

-For example, `Box` pointers require two lang items, one for allocation
-and one for deallocation. A freestanding program that uses the `Box`
-sugar for dynamic allocations via `malloc` and `free`:
+For example, there are lang items related to the implementation of
+string slices (`&str`); one of these is `str_eq`, which implements the

Review comment (Member): We already talk about this all the time, so you can just say 'string slices' or &str, rather than both. I guess it doesn't hurt though.

+equivalence relation on two string slices. This is a lang item because
+string equivalence is used for more than just the `==` operator; in
+particular, it is also used when pattern matching string literals.
+
+A freestanding program that provides its own definition of the
+`str_eq` lang item, with a slightly different semantics than
+usual in Rust:


 ```
-#![feature(lang_items, box_syntax, start, no_std)]
+#![feature(lang_items, intrinsics, start, no_std)]
 #![no_std]

-extern crate libc;
+// Our str_eq lang item; it normalizes ASCII letters to lowercase.
+#[lang="str_eq"]
+fn eq_slice(s1: &str, s2: &str) -> bool {
+    unsafe {
+        let (p1, s1_len) = str::repr(s1);
+        let (p2, s2_len) = str::repr(s2);

-extern {
-    fn abort() -> !;
-}
+        if s1_len != s2_len { return false; }
+
+        let mut i = 0;
+        while i < s1_len {
+            let b1 = str::at_offset(p1, i);
+            let b2 = str::at_offset(p2, i);

-#[lang = "owned_box"]
-pub struct Box<T>(*mut T);
+            let b1 = lower_if_ascii(b1);
+            let b2 = lower_if_ascii(b2);

-#[lang="exchange_malloc"]
-unsafe fn allocate(size: usize, _align: usize) -> *mut u8 {
-    let p = libc::malloc(size as libc::size_t) as *mut u8;
+            if b1 != b2 { return false; }

-    // malloc failed
-    if p as usize == 0 {
-        abort();
-    }
+            i += 1;
+        }
+
+    }

-    p
-}
-#[lang="exchange_free"]
-unsafe fn deallocate(ptr: *mut u8, _size: usize, _align: usize) {
-    libc::free(ptr as *mut libc::c_void)
+    return true;
+
+    fn lower_if_ascii(b: u8) -> u8 {
+        if 'A' as u8 <= b && b <= 'Z' as u8 {
+            b - ('A' as u8) + ('a' as u8)
+        } else {
+            b
+        }
+    }
 }

 #[start]
-fn main(argc: isize, argv: *const *const u8) -> isize {
-    let x = box 1;
+fn main(_argc: isize, _argv: *const *const u8) -> isize {
+    let a = "HELLO\0";
+    let b = "World\0";
+    unsafe {
+        let (a_ptr, b_ptr) = (str::as_bytes(a), str::as_bytes(b));
+        match (a,b) {
+            ("hello\0", "world\0") => {
+                printf::print2p("Whoa; matched \"hello world\" on \"%s, %s\"\n\0",
+                                a_ptr, b_ptr);
+            }
+
+            ("HELLO\0", "World\0") => {
+                printf::print2p("obviously match on %s, %s\n\0", a_ptr, b_ptr);
+            }
+
+            _ => printf::print0("No matches at all???\n\0"),
+        }
+    }
+    return 0;
+}

-    0
+// To be able to print to standard output from this demonstration
+// program, we link with `printf` from the C standard library. Note
+// that this requires we null-terminate our strings with "\0".
+mod printf {
+    use super::str;
+
+    #[link(name="c")]
+    extern { fn printf(f: *const u8, ...); }
+
+    pub unsafe fn print0(s: &str) {
+        // guard against failure to include '\0'
+        if str::last_byte(s) != '\0' as u8 {
+            printf(str::as_bytes("(invalid input str)\n\0"));
+        } else {
+            let bytes = str::as_bytes(s);
+            printf(bytes);
+        }
+    }
+
+    pub unsafe fn print2p<T,U>(s: &str, arg1: *const T, arg2: *const U) {
+        // guard against failure to include '\0'
+        if str::last_byte(s) != '\0' as u8 {
+            printf(str::as_bytes("(invalid input str)\n\0"));
+        } else {
+            let bytes = str::as_bytes(s);
+            printf(bytes, arg1, arg2);
+        }
+    }
+}
+
+/// A collection of functions to operate on string slices.
+mod str {
+    /// Extracts the underlying representation of a string slice.
+    pub unsafe fn repr(s: &str) -> (*const u8, usize) {
+        extern "rust-intrinsic" { fn transmute<T,U>(e: T) -> U; }
+        transmute(s)
+    }
+
+    /// Extracts the pointer to bytes representing the string slice.
+    pub fn as_bytes(s: &str) -> *const u8 {
+        unsafe { repr(s).0 }
+    }
+
+    /// Returns the last byte in the string slice.
+    pub fn last_byte(s: &str) -> u8 {
+        unsafe {
+            let (bytes, len): (*const u8, usize) = repr(s);
+            at_offset(bytes, len-1)
+        }
+    }
+
+    /// Returns the byte at offset `i` in the byte string.
+    pub unsafe fn at_offset(p: *const u8, i: usize) -> u8 {
+        *((p as usize + i) as *const u8)
+    }
 }

+// Again, these functions and traits are used by the compiler, and are
+// normally provided by libstd. (The `Sized` and `Copy` lang_items
+// require definitions due to the type-parametric code above.)
+
 #[lang = "stack_exhausted"] extern fn stack_exhausted() {}
 #[lang = "eh_personality"] extern fn eh_personality() {}
 #[lang = "panic_fmt"] fn panic_fmt() -> ! { loop {} }
+
+#[lang="sized"] pub trait Sized: PhantomFn<Self,Self> {}
+#[lang="copy"] pub trait Copy: PhantomFn<Self,Self> {}
+#[lang="phantom_fn"] pub trait PhantomFn<A:?Sized,R:?Sized=()> { }
 ```
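
Editorial sketch (not part of the patch): the comparison that the proposed `str_eq` demo performs, byte-wise equality after lowering ASCII letters, can be tried on current stable Rust without any lang items. The function name `eq_ascii_lowered` is hypothetical and chosen only for this illustration; `lower_if_ascii` follows the helper in the diff. Ordinary functions like these do not change how `==` or `match` behave, which is exactly what the lang item in the demo is for.

```rust
/// Lowercases `b` if it is an ASCII uppercase letter; other bytes pass through.
fn lower_if_ascii(b: u8) -> u8 {
    if b'A' <= b && b <= b'Z' {
        b - b'A' + b'a'
    } else {
        b
    }
}

/// Hypothetical helper: compares two string slices byte by byte after
/// lowering ASCII letters, the same rule the demo's `eq_slice` applies.
fn eq_ascii_lowered(s1: &str, s2: &str) -> bool {
    let (b1, b2) = (s1.as_bytes(), s2.as_bytes());
    // Different lengths can never be equal under this rule.
    if b1.len() != b2.len() {
        return false;
    }
    b1.iter()
        .zip(b2.iter())
        .all(|(&x, &y)| lower_if_ascii(x) == lower_if_ascii(y))
}
```

Under this rule `"HELLO\0"` and `"hello\0"` compare equal, which is why the first `match` arm in the demo is expected to fire. On stable Rust the standard library's `str::eq_ignore_ascii_case` should give the same answers.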


-Note the use of `abort`: the `exchange_malloc` lang item is assumed to
-return a valid pointer, and so needs to do the check internally.

 Other features provided by lang items include:

 - overloadable operators via traits: the traits corresponding to the
   `==`, `<`, dereferencing (`*`) and `+` (etc.) operators are all
   marked with lang items; those specific four are `eq`, `ord`,
   `deref`, and `add` respectively.
-- stack unwinding and general failure; the `eh_personality`, `fail`
-  and `fail_bounds_checks` lang items.
+- stack unwinding and general failure; the `eh_personality`, `panic`
+  `panic_fmt`, and `panic_bounds_check` lang items.
 - the traits in `std::marker` used to indicate types of
   various kinds; lang items `send`, `sync` and `copy`.
 - the marker types and variance indicators found in
   `std::marker`; lang items `covariant_type`,
   `contravariant_lifetime`, etc.
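
Editorial sketch (not part of the patch): the first bullet, operators resolved through lang-item traits, can be seen from the user's side on current stable Rust. The type `Meters` is hypothetical; `PartialEq` and `Add` are the standard traits behind `==` and `+`. The lang item names quoted in the list are those given in the documentation under review and are not checked here against any particular compiler version.

```rust
use std::ops::Add;

/// Hypothetical wrapper type used only for this illustration.
#[derive(Debug, Clone, Copy)]
struct Meters(u32);

// `a == b` is compiled into a call to `PartialEq::eq`.
impl PartialEq for Meters {
    fn eq(&self, other: &Meters) -> bool {
        self.0 == other.0
    }
}

// `a + b` is compiled into a call to `Add::add`.
impl Add for Meters {
    type Output = Meters;

    fn add(self, rhs: Meters) -> Meters {
        Meters(self.0 + rhs.0)
    }
}
```

With these impls in scope, `Meters(2) + Meters(3) == Meters(5)` type-checks and evaluates to `true`; the compiler finds the traits to call because they carry the lang-item markers.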


 Lang items are loaded lazily by the compiler; e.g. if one never uses
-`Box` then there is no need to define functions for `exchange_malloc`
-and `exchange_free`. `rustc` will emit an error when an item is needed
-but not found in the current crate or any that it depends on.
+array indexing (`a[i]`) then there is no need to define a function for
+`panic_bounds_check`. `rustc` will emit an error when an item is
+needed but not found in the current crate or any that it depends on.
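
Editorial sketch (not part of the patch): the array-indexing case named above can be observed on a hosted target, assuming the standard library is linked and panics unwind. Indexing with `v[i]` makes the compiler insert a bounds check that panics on failure, which is the path the text associates with `panic_bounds_check`; `slice::get` returns an `Option` and needs no such check. The helper names `index_panics` and `checked_get` are hypothetical.

```rust
use std::panic;

/// Hypothetical helper: reports whether evaluating `v[i]` panics.
/// Assumes unwinding panics (the default for hosted builds).
fn index_panics(v: &[u8], i: usize) -> bool {
    panic::catch_unwind(|| v[i]).is_err()
}

/// Hypothetical helper: the non-panicking alternative to `v[i]`.
fn checked_get(v: &[u8], i: usize) -> Option<u8> {
    v.get(i).copied()
}
```

An out-of-range index makes `index_panics` return `true` (the default panic hook also prints a message to standard error), while `checked_get` simply returns `None`. A freestanding crate that only uses the `get` style never triggers the bounds-check path, which is consistent with the lazy loading described above.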