Opaque values #1913

Merged
merged 5 commits into from
May 24, 2024

Member

@jneem jneem commented May 10, 2024

Here's a first stab at opaque values, as in #1909. The payload is a u64 for now -- I think we could change it without breaking our backwards-compatibility promises, because you can only generate these when embedding nickel, and we don't promise much about the stability of nickel-as-a-library.
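A minimal sketch of what "generating these when embedding" could look like from the host side. Everything here is a toy: the `Term` enum is a three-variant stand-in for the real one, and `Host`, `inject`, and `resolve` are hypothetical names, not part of Nickel's API.

```rust
/// Toy stand-in for the interpreter's term type (the real enum is much larger).
pub enum Term {
    Num(f64),
    Str(String),
    /// Opaque payload; in this sketch only host (Rust) code builds this variant.
    Opaque(u64),
}

/// Hypothetical host-side registry: the u64 payload is an index into `secrets`.
pub struct Host {
    secrets: Vec<String>,
}

impl Host {
    pub fn new() -> Self {
        Host { secrets: Vec::new() }
    }

    /// Store a host value and return an opaque term that refers to it.
    pub fn inject(&mut self, secret: &str) -> Term {
        self.secrets.push(secret.to_owned());
        Term::Opaque((self.secrets.len() - 1) as u64)
    }

    /// Resolve an opaque term back to host data; other terms yield `None`.
    pub fn resolve(&self, term: &Term) -> Option<&str> {
        match term {
            Term::Opaque(id) => self.secrets.get(*id as usize).map(String::as_str),
            _ => None,
        }
    }
}
```

The evaluated program only ever carries the `u64` around; the data it stands for stays in the host's table.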

@jneem jneem requested review from yannham and vkleen May 10, 2024 18:00
Member

@yannham yannham left a comment

I wonder if we should have a type for that. A type is a bit useless, in that opaque values can't be deconstructed or computed with. So maybe a contract in the stdlib is sufficient (and wouldn't reserve a new keyword).

On another front, I personally find Opaque to be a bit too generic, although it's rather descriptive. I wonder if we should insist on the fact that those values are intended to be injected by the environment, so maybe something like ForeignValue, ForeignId, or OpaqueForeign? It's just a suggestion, though.

@@ -261,6 +261,13 @@ pub enum Term {
///
/// This is a temporary solution, and will be removed in the future.
Closure(CacheIndex),

#[serde(skip)]
/// An opaque value that cannot be constructed within Nickel code.
Member

It could be worth insisting on the fact that they not only cannot be constructed, but also should never be observable from within Nickel code (compared or distinguished by any means).

Member Author

I made sure they can't be compared for equality; is there some other kind of observability that needs to be handled?

I was imagining that these values could be copied around (and possibly manipulated by functions provided by the nickel embedder). So you'd be able to write a contract like { username | String, token | ForeignId }, and then by applying this contract you could observe the presence or absence of the token.

Member

Right, I said observable, but the right technical term is probably separable, which means "distinguishable".

That is, for every context C (which you can think of as just a function here) and any pair of foreign keys k1 and k2, if C k1 k2 ~> v (the context evaluates to the value v), then for all other foreign keys k and l, C k l ~> v' with v ~ v' (I won't define the precise meaning of ~, but let's say it's an observational equivalence). Maybe we need to extend that to arbitrarily long finite lists of keys.
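One hedged way to write that property down, assuming $K$ denotes the set of foreign keys, $\leadsto$ is evaluation, and $\sim$ is the observational equivalence left unspecified above (the notation is an assumption of this sketch, not something fixed in the thread):

```latex
\forall C.\ \forall k_1, k_2, k, l \in K.\quad
C\,k_1\,k_2 \leadsto v
\;\implies\;
\exists v'.\ \bigl( C\,k\,l \leadsto v' \bigr) \wedge \bigl( v \sim v' \bigr)
```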

Put differently, no result of an expression can depend on the actual values of the keys. They must be all interchangeable without affecting the semantics. We need to include error messages as well (so my above specification is incomplete), because that's another way for a malicious user to get the value of the key indirectly - but you properly don't print the content of the key in the pretty printer, which is all good.
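A toy sketch of the two guarantees discussed so far: equality never answers for opaque values, and printing never shows the payload. The `Value` type, `equals`, and `EvalError` are hypothetical. How the real interpreter reports a rejected comparison is not stated in this thread, so the error variant is an assumption.

```rust
use std::fmt;

/// Toy value type; `Debug` is deliberately not derived so the payload
/// cannot leak through `{:?}` either.
pub enum Value {
    Num(i64),
    Str(String),
    Opaque(u64),
}

#[derive(Debug, PartialEq)]
pub enum EvalError {
    /// Returned instead of an answer, so no result depends on a payload.
    OpaqueComparison,
}

/// Equality that rejects any comparison involving an opaque value.
pub fn equals(a: &Value, b: &Value) -> Result<bool, EvalError> {
    match (a, b) {
        (Value::Opaque(_), _) | (_, Value::Opaque(_)) => Err(EvalError::OpaqueComparison),
        (Value::Num(x), Value::Num(y)) => Ok(x == y),
        (Value::Str(x), Value::Str(y)) => Ok(x == y),
        _ => Ok(false),
    }
}

impl fmt::Display for Value {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Value::Num(n) => write!(f, "{n}"),
            Value::Str(s) => write!(f, "{s:?}"),
            // Every opaque value prints the same way, whatever its payload.
            Value::Opaque(_) => write!(f, "<opaque>"),
        }
    }
}
```

Because both `equals` and `Display` ignore the number inside `Opaque`, swapping one key for another cannot change what either of them produces.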

@aiverson aiverson May 16, 2024

Comparing by equality between foreignIds is actually specifically fine and useful. Even being able to use them as keys in maps is fine and useful as long as that doesn't reveal the content of the opaque reference directly or indirectly. (There is some subtlety, though, about exactly which equality is being checked, since there could be two copies of the same reference, two different references to the same local data, or two different references to promises which will eventually become the same.)
It's being able to forge them or inspect the backing data that's the problem, and as long as there's no way to make a new opaque reference (from inside the language; making them from foreign functions into the trusted platform is fine) that is satisfied, and no way to look up what's behind the foreign reference. Being able to check the arbitrary number in the foreign reference that indexes the backing table isn't even a security issue as long as the table itself is impossible to access, but it is good practice to forbid that since any kind of behavioral reliance on the specific values of those numbers is fragile and should be prevented, and if the numbers are viewable in the first place someone accidentally allowing forgery is more likely.
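The capability-style view above (equality and map keys allowed, forging and inspection forbidden) can be sketched in plain Rust with a private field. This is only an analogy on the host side; the module `host`, `Table`, and this `ForeignId` struct are hypothetical and say nothing about how Nickel itself enforces the property.

```rust
mod host {
    use std::collections::HashMap;

    /// Handle with a private field: code outside this module can copy,
    /// compare, and hash it, but cannot build one or read the number.
    #[derive(Clone, Copy, PartialEq, Eq, Hash)]
    pub struct ForeignId(u64);

    /// Backing table owned by the trusted platform.
    pub struct Table<T> {
        next: u64,
        entries: HashMap<u64, T>,
    }

    impl<T> Table<T> {
        pub fn new() -> Self {
            Table { next: 0, entries: HashMap::new() }
        }

        /// The only way to obtain a handle is to register a value here.
        pub fn insert(&mut self, value: T) -> ForeignId {
            let id = self.next;
            self.next += 1;
            self.entries.insert(id, value);
            ForeignId(id)
        }

        /// Lookup requires access to the table, not just the handle.
        pub fn get(&self, id: ForeignId) -> Option<&T> {
            self.entries.get(&id.0)
        }
    }
}

pub use host::{ForeignId, Table};
```

Outside `host`, writing `ForeignId(3)` or `id.0` does not compile, which is the "no forging, no peeking" half; deriving `Eq` and `Hash` is the "comparison and map keys are fine" half.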

Member

I think using them as map keys is much more invasive. Currently, record fields in Nickel are all strings, which you can list with stdlib functions. So opaque values would need special-casing here.

Regarding equality, this is easy to add - I just wonder if there are other use-cases where you would not even want equality. In any case, I would propose to start with the most restrictive option possible and move forward with this PR, and then see case by case what we would add and why, if that sounds good to you all.

///
/// This can be used by programs that embed Nickel, as they can inject these opaque
/// values into the AST.
Opaque(u64),
Member

I don't know if we spell u64 somewhere else, but I wonder if we should use a type alias instead to make it easy to switch to something else (i.e. define Opaque(OpaquePayload) with type OpaquePayload = u64 or something).
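A short sketch of the alias suggestion, on a two-variant toy `Term`. The `payload` helper is hypothetical and only shows that downstream code can avoid naming `u64` directly.

```rust
/// Single place where the payload representation is chosen.
pub type OpaquePayload = u64;

/// Toy stand-in for the real term enum.
pub enum Term {
    Null,
    Opaque(OpaquePayload),
}

/// Helper written against the alias, so it survives a change of representation
/// as long as the new type is still `Copy`.
pub fn payload(term: &Term) -> Option<OpaquePayload> {
    match term {
        Term::Opaque(p) => Some(*p),
        Term::Null => None,
    }
}
```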

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might even be fairly straightforward to make the interpreter generic over these if desired; I don't actually need that for my use cases, so it might not be worth it, but it might allow generalizing to other systems nicely.

Member Author

I think that would be a bit painful, since it would involve carrying around an extra type parameter everywhere that Term is used (which is basically everywhere). An easier version would be to have a Box<dyn Any> to allow downcasting.

Member

I agree with Joe: on paper it's just adding a generic, but API-wise this is painful, because you often have to propagate this type parameter everywhere (every site that uses a term, which is pretty much everywhere). For less fundamental data structures we could add a parameter and make an alias like pub type Foo = ParametrizedFoo<usize>, but for terms, I'm afraid this will leak into typechecking, transforms, etc., which might not be worth it. The Box<dyn Any> doesn't look very nice, but yeah, if it's really needed, this might be a possibility.
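For comparison, here is a rough sketch of both alternatives mentioned in this thread: a type-erased `Box<dyn Any>` payload with downcasting, and the "parameter plus alias" trick. The toy `Term` and the `downcast_payload` helper are hypothetical. Note that `Box<dyn Any>` is not `Clone`, so a real term type that needs cloning would have to pick something else (for example a reference-counted pointer).

```rust
use std::any::Any;

/// Toy term type with a type-erased payload chosen by the embedder.
pub enum Term {
    Null,
    Opaque(Box<dyn Any>),
}

/// Recover a payload of type `T`; `None` if the term is not opaque
/// or holds a value of another type.
pub fn downcast_payload<T: Any>(term: &Term) -> Option<&T> {
    match term {
        Term::Opaque(payload) => payload.downcast_ref::<T>(),
        Term::Null => None,
    }
}

/// The aliasing trick for less fundamental structures: keep the generic
/// version, but let most code use a fixed instantiation.
pub struct ParametrizedFoo<P> {
    pub payload: P,
}
pub type Foo = ParametrizedFoo<usize>;
```

The downcast keeps `Term` free of type parameters at the cost of a runtime check at every use site.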

core/src/typ.rs Outdated
@@ -269,6 +269,8 @@ pub enum TypeF<Ty, RRows, ERows> {
///
/// See [`crate::term::Term::Sealed`].
Symbol,
/// An opaque value, the type of `Term::Opaque`.
Opaque,
Member

I wonder if we should even worry about having a type for that, given that the user can't do much with opaque values, and can't use this type right now (it's not in the grammar, and it would be a breaking change to add it). However it doesn't cost much, and maybe it could be useful for Rust binaries consuming the library? I'm not sure.

Member Author

I was thinking that the Nickel embedder might provide a function like union: ForeignId -> ForeignId -> ForeignId, so it would be nice if we could type-check it.
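To make the `union` example concrete, here is a hypothetical host-side implementation in which each id indexes a set of strings. `Embedder`, `register`, and `contents` are invented for illustration; how such a function would actually be exposed to Nickel code is not covered by this sketch.

```rust
use std::collections::BTreeSet;

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct ForeignId(u64);

/// Hypothetical embedder state: ids are indices into `sets`.
pub struct Embedder {
    sets: Vec<BTreeSet<String>>,
}

impl Embedder {
    pub fn new() -> Self {
        Embedder { sets: Vec::new() }
    }

    /// Register a new set and hand back the id that refers to it.
    pub fn register(&mut self, items: &[&str]) -> ForeignId {
        self.sets.push(items.iter().map(|s| s.to_string()).collect());
        ForeignId((self.sets.len() - 1) as u64)
    }

    /// union : ForeignId -> ForeignId -> ForeignId
    /// Returns `None` if either id is unknown to this embedder.
    pub fn union(&mut self, a: ForeignId, b: ForeignId) -> Option<ForeignId> {
        let left = self.sets.get(a.0 as usize)?;
        let right = self.sets.get(b.0 as usize)?;
        let merged: BTreeSet<String> = left.union(right).cloned().collect();
        self.sets.push(merged);
        Some(ForeignId((self.sets.len() - 1) as u64))
    }

    /// Host-only view of what an id refers to, in sorted order.
    pub fn contents(&self, id: ForeignId) -> Option<Vec<&str>> {
        self.sets
            .get(id.0 as usize)
            .map(|set| set.iter().map(String::as_str).collect())
    }
}
```

The function takes ids and returns an id, so the evaluated program never needs to see the underlying sets.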

Member

@yannham yannham May 13, 2024

Ah, you mean customizing the set of operations possible on opaque values with their own operations? There is still the problem that if the user can't write this type down, it is somehow a second class citizen (and once again, making it possible to spell it out is a breaking change).

All of that being said, having the type internally is like 10 additional lines of code, so let's not argue about it too much. It doesn't really hurt to have this type internally, and see later if it can be turned into something useful. I'm fine with keeping it as it is for now.

Member

A second idea to make it not a breaking change is to wait for the let-type RFC, and make it possible to define primitive types - think %Opaque% (just so it doesn't clash with user-land). Then, once we can export types from a record, we could define stdlib namespaced types, like type Opaque = %Opaque% in std.foreign or something.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a good way to do this would be to have a newtype construct to allow creating a distinct type family that is backed by an existing type, so that we can have this core Opaque/ForeignId type, then give names and possibly generics to various kinds of foreign data. That allows making more specific function signatures that work on foreign data too.

Member

Yeah, in fact the let type or type = would precisely act like a newtype. But the point stands that you need to be able to refer to a primitive Opaque type at some point, and just adding it would break backward compatibility (maybe someone called their contract Opaque already). So the idea would be to have the primitive type use an obscure internal syntax - as other primops work in nickel already - and the stdlib would just export type Opaque = %BuiltinOpaqueType%.
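The newtype idea has a familiar shape in Rust, which may help picture it. The sketch below is only an analogy: Nickel's `let type` is an RFC at this point, and `Foreign`, `Token`, `FileHandle`, and `revoke` are invented names.

```rust
use std::marker::PhantomData;

/// Core opaque handle, standing in for the primitive type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ForeignId(pub u64);

/// Zero-sized markers naming kinds of foreign data (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Token;
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FileHandle;

/// Newtype over `ForeignId`, tagged with the kind of data it refers to.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Foreign<Kind> {
    id: ForeignId,
    _kind: PhantomData<Kind>,
}

impl<Kind> Foreign<Kind> {
    pub fn new(id: ForeignId) -> Self {
        Foreign { id, _kind: PhantomData }
    }

    /// Drop the tag and return the underlying handle.
    pub fn raw(self) -> ForeignId {
        self.id
    }
}

/// Accepts only token handles; passing a `Foreign<FileHandle>` does not compile.
pub fn revoke(token: Foreign<Token>) -> ForeignId {
    token.raw()
}
```

Both tagged types share one representation, yet signatures such as `revoke` can still tell them apart.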

core/src/typ.rs Outdated
@@ -818,6 +821,7 @@ impl Subcontract for Type {
TypeF::Number => internals::num(),
TypeF::Bool => internals::bool(),
TypeF::String => internals::string(),
TypeF::Opaque => internals::opaque(),
Member

I think this contract can never be instantiated in practice (unless programmatically). For example, we don't even bother elaborating a contract for Symbol below (which is also arguably not a very useful type). Maybe it would be more useful to have a std.contract.Opaque contract?

Member

That being said, if you keep the type, I think keeping the contract as well makes sense, rather than replacing it with a panic, even if it's not useful right now.

Member

@yannham yannham left a comment

LGTM, but I think you need to update the signature of the typeof function in the stdlib (the return enum type), otherwise its contract will blame on 'ForeignId.

Member Author

jneem commented May 21, 2024

Hm, it looks like I never updated UnaryOp::Typeof to support 'ForeignId; that's added now.
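A toy model of the two pieces touched here: a `typeof`-style operation that has to know about the new variant, and a contract-like check built on it. `TypeTag`, `type_of`, and `check_foreign_id` are hypothetical and list only a few tags; the real stdlib enum has more cases.

```rust
/// A few illustrative tags; not the full set returned by the real `typeof`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TypeTag {
    Number,
    Bool,
    String,
    ForeignId,
    Other,
}

/// Toy stand-in for the real term enum.
pub enum Term {
    Num(f64),
    Bool(bool),
    Str(String),
    Opaque(u64),
    Null,
}

/// With an exhaustive match and no wildcard arm, adding a `Term` variant
/// without updating this function is a compile error.
pub fn type_of(term: &Term) -> TypeTag {
    match term {
        Term::Num(_) => TypeTag::Number,
        Term::Bool(_) => TypeTag::Bool,
        Term::Str(_) => TypeTag::String,
        Term::Opaque(_) => TypeTag::ForeignId,
        Term::Null => TypeTag::Other,
    }
}

/// Contract-like check: succeeds only on opaque values and otherwise
/// reports what was found, without ever touching a payload.
pub fn check_foreign_id(term: &Term) -> Result<(), String> {
    match type_of(term) {
        TypeTag::ForeignId => Ok(()),
        other => Err(format!("expected ForeignId, got {other:?}")),
    }
}
```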

Member

@yannham yannham left a comment

LGTM. Let's start like this and add operations on foreign ids as we see fit.

@yannham yannham added this pull request to the merge queue May 24, 2024
Merged via the queue into master with commit c8e8401 May 24, 2024
5 checks passed
@yannham yannham deleted the opaque branch May 24, 2024 22:53
3 participants