Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build out initial key values API #324

Merged
merged 8 commits into from
Apr 29, 2019
Merged

Conversation

KodrAus
Copy link
Contributor

@KodrAus KodrAus commented Mar 13, 2019

For #149

Introduces an initial structured logging API that's feature gated by the opt-in kv_unstable. APIs behind this feature gate aren't considered stable and may get broken between patch releases that happen to include them.

This implementation requires rustc 1.21.0 for making it nicer to treat a closure as a fn pointer, and for making it nicer to capture a constant local with a lifetime that's wider than its local scope. Since the feature gate is opt-in we aren't going to break anyone currently on compilers before 1.21.0. If we want to make this API available by default then we'll have to bump to 1.21.0 (which seems ok to me).

There is a lot of new code in here, so I appreciate that reviews might take time and it could be a lot of back-and-forth before we're happy. The core pieces are:

  • trait Source: A source for key-value pairs. Like a repeatable iterator for key-value pairs.
  • trait Visitor: The mutable part of the repeatable iterator for key-value pairs. It's driven by a Source.
  • struct Key: The key in a key-value pair. A container around a string that gives us room to abstract over ownership and add an optional index.
  • struct Value: The value in a key-value pair. A container around an erased fragment of data where its structure is retained by an internal serialization contract. Right now you can only format a value using its debug implementation, but the machinery is there for structured backends (I ended up needing it to implement impl<T: ToValue> ToValue for Option<T>).
  • struct KeyValueError: An error working with a key-value pair that lets us propagate errors serializing a key-value pair naturally. It's expected that consumers will be producing these.

The implementation is based on the structured logging RFC with a key difference:

The Source trait is object-safe. That means we can't add convenient non-object-safe methods like for_each to it even though I've found those super useful in practice and I think we should add them eventually. Making Source object-safe saves some clumsy machinery though and we can work around providing non-object-safe methods:

  • Non object-safe convenient methods that implementors will want to override like get and to_owned can be made object-safe. For instance, get could be changed from fn get<Q: ToKey>(&self, key: Q) -> Option<Value> to fn get(&self, key: Key) -> Option<Value>.
  • Non-chaining convenient methods like for_each and as_map can be added as inherent impls on dyn Source and/or to an extension trait.
  • Chaining convenient methods like chain and filter can be added by an extension trait.

Once we've merged I'd like to work on support in env_logger and integration with slog and tokio-trace and use those as drivers for the next round.

r? @sfackler

@KodrAus
Copy link
Contributor Author

KodrAus commented Mar 13, 2019

Note: CI won't pass yet, I have to tweak it to only run on 1.21.0+ for the new structured API.

Also missing in this first round:

  • Docs. I haven't bothered writing nice docs when things are likely to change a lot.
  • A bunch of Source trait impls for things like BTreeMaps.
  • A bunch of ToValue trait impls for things like SocketAddrs.

@KodrAus KodrAus changed the title Build out innitial key values API Build out initial key values API Mar 13, 2019
@KodrAus KodrAus force-pushed the feat/kv_unstable branch 3 times, most recently from 58b19a3 to 30207e8 Compare March 13, 2019 22:13
use super::backend::Backend;

/// A function for converting some type `T` into a [`Value`](struct.Value.html).
pub type FromAnyFn<T> = fn(FromAny, &T) -> Result<(), KeyValueError>;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pub types in this module are currently pub(super) in visibility.

@KodrAus
Copy link
Contributor Author

KodrAus commented Mar 25, 2019

r? @dtolnay

@KodrAus
Copy link
Contributor Author

KodrAus commented Mar 25, 2019

Sorry, I don't mean to put you on the spot to feel compelled to invest your time into reviewing this large new API. If you don't have the bandwidth/interest that's no problem :) It'd be great to at least get a sanity check from somebody on this new code since we the Libs team will be supporting it at some point.

If we'd like to try minimise the amount of bandwidth needed from the wider Libs team to flesh this feature out I'm also ok to continue landing stuff behind this unstable feature-gate and defer a thorough review to a point where everything is implemented and there's experimental integration from structured frameworks in the ecosystem.


/// An error encountered while working with structured data.
#[derive(Clone, Debug)]
pub struct KeyValueError {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative here could be to make this error like fmt::Error, so it doesn't carry any context on its own and is easy to create.

@dtolnay
Copy link
Member

dtolnay commented Apr 2, 2019

I would be happy to review but I just got home from 2 weeks of travel and have a substantial backlog. Will aim to get you some feedback within the next 14 days.

@KodrAus KodrAus force-pushed the feat/kv_unstable branch from c463161 to f21ce41 Compare April 9, 2019 05:51
@KodrAus KodrAus force-pushed the feat/kv_unstable branch from f21ce41 to 32d080a Compare April 9, 2019 05:55
script:
- cargo test --verbose --features kv_unstable
- cargo test --verbose --features "kv_unstable std"
- rust: stable
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up rolling in this tweak to our CI so we do the embedded build as a separate task like the kv_unstable one, but it does mean we only check it on stable instead of also on beta and nightly. I can revert this change if we don't like it.

Copy link
Member

@dtolnay dtolnay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't gotten a chance to look at key_values/value/*.rs at all yet, but here are a few comments on the rest. My first impression is that this seems promising and very nicely put together! I would be on board with landing something like this.

src/key_values/mod.rs Outdated Show resolved Hide resolved
src/key_values/key.rs Outdated Show resolved Hide resolved
src/key_values/key.rs Outdated Show resolved Hide resolved

/// A key in a structured key-value pair.
#[derive(Clone)]
pub struct Key<'k> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You wrote:

A container around a string that gives us room to abstract over ownership and add an optional index.

And in the RFC I see:

The Key type could be extended to hold an optional index into a source. This could be used to retrieve a specific key-value pair more efficiently than scanning. There's some subtlety to consider though when the sources of keys are combined in various ways that might invalidate a previous index.

I know this isn't designed yet but is there anything you would be able to describe about what use cases you would like for the index to enable and roughly what sort of code would they want to write using indices?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this all came out of a possible Source::get(Key) -> Option<Value> method, where for sources like a BTreeMap<String, Value> the implementation could do an indexed lookup, but for Vec<(String, Value)> it would need to scan. It could be a useful thing for a logging framework that wraps log and has some well-known structured pairs like a timestamp or correlation id that consumers want to poke at.

How you would get a hold of an index, and how useful that index would be if you, say, concat two Sources together is where I stopped the design train and left it as a possible extension. For the well-known pairs use-case you could also just write a Source like:

struct MyFrameworkSource<'s> {
    timestamp: Timestamp,
    correlation: Uuid,
    the_rest: Option<&'s dyn Source>,
}

All that's to say it's something I didn't want to absolutely rule out as an enhancement but it's still airy-fairy.

src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated
record: Record {
args: format_args!(""),
metadata: Metadata::builder().build(),
module_path: None,
file: None,
line: None,
key_values: KeyValues(&Option::None::<(key_values::Key, key_values::Value)>),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the extremely common case of no key-values, I wonder whether there would be a measurable performance difference from using a (private) Source impl whose visit fn does nothing, unlike Option's which needs to check for Some every time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like a good excuse to bootstrap some micro-benchmarks for the log crate, so I'll come back to this one later!

src/key_values/error.rs Outdated Show resolved Hide resolved
src/key_values/error.rs Outdated Show resolved Hide resolved
src/key_values/error.rs Outdated Show resolved Hide resolved
src/key_values/value/backend.rs Outdated Show resolved Hide resolved
/// A helper for converting any type into a [`Value`](struct.Value.html).
pub struct FromAny<'a>(&'a mut Backend);

impl<'a> FromAny<'a> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This type looks like "&mut dyn Backend, but with extra helpers". It sounds like we'll want Backend in the public API eventually, so consider putting the helpers directly on dyn Backend instead:

impl dyn Backend + '_ {
    pub(crate) fn value<T>(&mut self, v: T) -> Result<(), KeyValueError>
    where
        T: ToValue,
    {...}

    pub fn debug<T>(&mut self, v: T) -> Result<(), KeyValueError>
    where
        T: Debug,
    {...}
}

We can seal the Backend trait for now if we'd like to be free to add mandatory methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea behind the FromAny type (and the rest of the any API here) is to try really hard not to make Value look like a serialization contract. The reason is mostly philosophical. The exercise of building out the sval framework convinced me that building a serialization API is tricky, and design decisions can seem arbitrary but have significant impacts on usability and interoperability (sval has some really wacky code to integrate with serde), so log should avoid owning that story and defer to existing libraries.

As an example, the way I imagine we'd add serde support to Value would be:

impl<'v> Value<'v> {
    /// Capture a type that implements `serde::Serialize` as a `Value.
    pub fn from_serde(v: &'v impl Serialize) -> Self {
        Self::from_any(v, |from, v| from.serde(v)) 
    }
}

impl<'v> Serialize for Value<'v> {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        //  an implementation of the internal `Backend` trait
        let mut backend = SerdeBackend::new(serializer);
        self.visit(&mut backend)?;
        
        backend.into_result()
    }
}

impl<'b> FromAny<'b> {
    pub fn serde(self, v: impl Serialize) -> Result<(), KeyValuesError> {
        self.0.serde(v)
    }
}

trait Backend {
    /// Visit a `serde::Serialize`. All backends have to support visiting a `Serialize`.
    fn serde(&mut self, v: &erased_serde::Serialize) -> Result<(), KeyValuesError>;
}

Internally, we can't get away with not having some contract for frameworks to conform to, but we can avoid ever exposing it directly. Instead, we can implement traits like std::fmt::Debug, serde::Serialize, andsval::Value on log::Value as the way you can inspect the underlying structure instead of implementing a trait.

But as someone with a lot of experience working on serialization APIs I'd welcome any input you've got here!

unsafe {
Any {
data: mem::transmute::<&'a T, &'a Void>(data),
from: mem::transmute::<FromAnyFn<T>, FromAnyFn<Void>>(from),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems sketchy as the Rust calling convention for those two function types might be arbitrarily different. I think this is in the realm of "probably works but who knows". It would be good to have a compiler person take a look.

  • fn(FromAny, &T) -> Result<(), KeyValueError>
  • fn(FromAny, &Void) -> Result<(), KeyValueError>

Making them something like unsafe extern "C" fn(FromAny, *const T) -> Result<(), KeyValueError> seems like it would be sufficient to avoid undefined behavior but I don't know how much worse that makes the rest of the API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeh, this was the weird API I was hoping to get some more eyes on at the All Hands earlier this year.

It's the same approach that std::fmt::Argument uses, but that's unstable. It's also similar to the std::task::RawWakerVTable which is up for stabilization now, but includes some of the extra type safety @eddyb suggested in the stabilization PR. The impression I've gotten is that this approach is ok, but is well off the beaten track and not something you should reach for if you don't have to.

As for why we need it, there's a case where you're bridging between log and another framework where you effectively need:

&'kv T ->  Value<'kv> where T: ToValue || Debug

If your bridging crate doesn't own the type T and it doesn't satisfy either of those bounds on its own then you're stuck. You can't wrap &'kv T in a newtype because you need the data within Value to live for 'kv. That's where Value::from_any comes in. It lets you create a Value<'kv> from any arbitrary &'kv T, regardless of the traits T implements.

Additionally, when we come to integrating other serialization frameworks like serde into log, we can do it by erasing into the same Any mechanism, so there's no matching required.

This approach comes with drawbacks though:

  • It is kinda sketchy and will probably surprise any future contributors that come across it.
  • Once we make from_any public, I can't think of any alternative internal implementation of Value that could support that &'kv T -> Value<'kv> method.
  • Supporting owned values in the future using this erased approach is much trickier.

This PR doesn't make the from_any method public though, so right now there's no reason to use this erased approach over an internal trait object. I just copied it from my reference implementation. For implementation simplicity I'd be on board with replacing this with the more obvious trait-object-based implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alrighty, in ee05032 I've refactored this to remove this FromAny infrastructure and use a pair of internal traits Visit and Visitor to preserve the structure of primitive values.

- fix trailing file newlines
- tweak KeyValuesError type not to rely on deprecated Error APIs
- change ToKey impl from &str to str
- hide the sub-modules under key_values
- add a Visitor impl for Box<impl Visitor>
@KodrAus
Copy link
Contributor Author

KodrAus commented Apr 18, 2019

Thanks @dtolnay for a thorough review! I'm working through your feedback and have left some of my thoughts on the value API. I appreciate the fresh perspective.

@KodrAus
Copy link
Contributor Author

KodrAus commented Apr 25, 2019

@dtolnay Alrighty, this should be ready for another look now.

@Thomasdezeeuw
Copy link
Collaborator

@KodrAus Quick note: I'm missing ToValue implementations for usize and isize. More feedback to follow, looking good so far!

@Thomasdezeeuw
Copy link
Collaborator

I've added support for this to my logging implementation to test this pr, it's available here: Thomasdezeeuw/std-logger#14.

As mentioned in a previous comment. I'm missing ToValue for usize and isize.

It would be nice to detect if the record has any key value pairs, this would avoid needless
formatting. Could be as simple as changing key_values to return Option<&dyn Source>.

Could KeyValueError be renamed to kv::Error (where the kv is the module), this would match types like fmt::Error and io::Error. This does mean that without the module name in front of it the name loses all meaning.

I'm missing a Value::from_display method that uses the fmt::Display implementation in printing/formatting. I've also noticed that the fmt::Display implementation for Value actually uses the fmt::Debug implementation via FmtVisitor.

I'm also missing a lot of docs. For example for the kv module: what the types and traits and how they work (together). But this can be future work.

@KodrAus
Copy link
Contributor Author

KodrAus commented Apr 26, 2019

Thanks for the feedback @Thomasdezeeuw!

missing ToValue for usize and isize.

We should definitely add these 👍

It would be nice to detect if the record has any key value pairs, this would avoid needless
formatting. Could be as simple as changing key_values to return Option<&dyn Source>.

We could look at a convenience method on Source that can tell if there are any pairs present so you could avoid formatting. I think it would be good to avoid exposing a Option<&dyn Source> though so we avoid the confusion of None vs Some(&[]).

Could KeyValueError be renamed to kv::Error

That's the name I'd prefer too, but used KeyValueError instead so that it's consistent with the explicitly named errors that already exist in the log crate. I'd happily change it to kv::Error, and maybe consider re-exporting it as log::KeyValueError at the crate root (or not).

I'm missing a Value::from_display method

That's a good idea too!

fmt::Display implementation for Value actually uses the fmt::Debug implementation via FmtVisitor.

This actually shouldn't be an issue, because we can forward a fmt::Display impl through a fmt::Debug one, so whatever fmt trait you use for a particular value is the one that'll be used in both the Value as Debug and Value as Display impls.

@Thomasdezeeuw
Copy link
Collaborator

I think it would be good to avoid exposing a Option<&dyn Source> though so we avoid the confusion of None vs Some(&[]).

I'm not sure I understand, when do we want to use Some(&[])? And what confusion would it cause?

If the key_values field in Record is changed to Option<&dyn Source> we can just return that right? Then RecordBuilder, and any log messages with key/values, can just default to using None.

@KodrAus
Copy link
Contributor Author

KodrAus commented Apr 27, 2019

Ah sorry that wasn’t very clear. The confusion is assuming that if key_values returns some then there are more than zero pairs to work with. That’s not necessarily going to be the case, so having to jump through the extra hoops of matching on an Option doesn’t seem worth it.

Copy link
Member

@dtolnay dtolnay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this looks terrific.

@KodrAus
Copy link
Contributor Author

KodrAus commented Apr 29, 2019

🎉

@Thomasdezeeuw The intention now is that we can continue developing off this base without having to break it, but you never know. So in case that does need to happen I'll try remember to give you a CC in any PRs I see that would result in breakage!

Any more input you have in the form of issues and PRs will be most welcome! I've created a tracking issue we can use to follow progress.

@KodrAus KodrAus merged commit 84fffab into rust-lang:master Apr 29, 2019
@Thomasdezeeuw
Copy link
Collaborator

The confusion is assuming that if key_values returns some then there are more than zero pairs to work with. That’s not necessarily going to be the case, so having to jump through the extra hoops of matching on an Option doesn’t seem worth it.

But we could ensure that when key_values is some there are one or more key values. Then we won't have that problem.

EFanZh pushed a commit to EFanZh/log that referenced this pull request Jul 23, 2023
…of `HashMap`.

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants