Skip to content
This repository has been archived by the owner on Aug 8, 2023. It is now read-only.

Expressions #9439

Merged
merged 39 commits into from
Nov 8, 2017
Merged

Expressions #9439

merged 39 commits into from
Nov 8, 2017

Conversation

anandthakker
Copy link
Contributor

@anandthakker anandthakker commented Jul 6, 2017

GL Native portion of mapbox/mapbox-gl-js#4777

Tasks:

  • Move any and all to a special form (to allow early-stopping behavior)
  • Drop {expression: ...} wrapper in favor of isExpression() heuristic.
  • Add unit tests for isExpression()
  • Make sure we're enforcing style spec function constraints (e.g. only 'step' curves for piecewise-constant functions, etc.)
  • Add constant folding
  • Fix and simplify stop-function conversions
  • Add unit test for runtime styling w/ expression values
  • Mark {Composite,Source,Camera,Identity}Function data members (stops, etc.) const and deal with the fallout. (Ideally, we wouldn’t store these at all, accepting them only as constructor args used to build an expression; but SDKs rely on reading these values back out, so we need to keep them for now.) Actually, I think we should leave this as-is for now and do it once the SDK expressions implementations are underway.
  • Bloaty size delta check
  • Benchmark check

@anandthakker anandthakker added Core The cross-platform C++ core, aka mbgl feature ⚠️ DO NOT MERGE Work in progress, proof of concept, or on hold labels Jul 6, 2017
@anandthakker anandthakker force-pushed the expressions branch 2 times, most recently from 5525b38 to a6d0cf8 Compare July 7, 2017 21:17
@anandthakker
Copy link
Contributor Author

anandthakker commented Jul 7, 2017

@jfirebaugh @kkaefer this is still very preliminary, but I'd love to get your early 👀 just on the basic class structure here. Definitely no need for line-by-line review -- my question at this point is just: does this overall strategy look reasonable, such that it makes sense to start adding implementations of each specific expression? (You can see +, -, /, * there, which I did just to exercise the setup.)

class Typename;
class Lambda;

using ValueType = variant<
Copy link
Contributor

@jfirebaugh jfirebaugh Jul 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This encoding of types looks odd to me.

  • Why do we need both ValueType and Type?
  • Is it necessary to divide types into "primitives" and other types? That seems like an unnecessary division.

Here's my sketch for how I'd do this. First, the types of values:

class Null {};
class Number {};
class String {};
class Color {};
class Object {};
class Array {
    Type itemType;
    optional<std::size_t> length;
};
class Value {};

using Type = variant<
    Null,
    Number,
    String,
    Color,
    Object,
    Array,
    Value>;

Then, you have special stuff for typing lambdas:

class Typename {
    std::string name;
};

class NArgs {
    std::vector<variant<Type, Typename>> types;
};

class Variant {
    std::vector<variant<Type, Typename>> choices;
};

using ResultType = variant<
    Type,
    Typename>;

using ArgumentType = variant<
    Type,
    Typename,
    NArgs,
    Variant>;

class LambdaExpression {
    ResultType resultType;
    std::vector<ArgumentType> argumentTypes;
}

Expression::type should be the type of the expression's result, not its lambda type. (When we talk about the "type of an expression" in any language, we're almost always referring to the type of its result value, not the type of the method if it's a method call.)

These suggestions apply to the JS implementation as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 on organizing things into types-of-values / stuff-for-lambdas.

Is it necessary to divide types into "primitives" and other types? That seems like an unnecessary division

The reason for this was that from the point of view of our code (as opposed to the the pov from 'within' the expression language), all these types are values, not classes.
I.e., there need only ever be a single object representing NumberType, StringType, etc. (So Primitive is just there to be the type of said objects.)

Copy link
Contributor Author

@anandthakker anandthakker Jul 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jfirebaugh what about something like this instead for that first section ("types of values"):

class Type {
  static Type Null;
  static Type Number;
  static Type String;
  static Type Color;
  static Type Object;
  static Type Value;
}
class Array : Type {
  Type itemType;
  optional<size_t> length;
}

(It's at least a bit simpler than what's there now by avoiding an unnecessary Primitive class...)

Copy link
Contributor Author

@anandthakker anandthakker Jul 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, another thought: i think we should allow Typenames to be considered a valid expression type, so that an Expression can represent an un-typechecked AST. (E.g. before we do type inference, the type of [==, 1, 2] would be T.)

(Alternatively, we could have Expression and TypedExpression, with only the former allowing type to be Typename.)

I'm also on the fence about whether to consider Variant a value type (and also whether to use it to represent Value, as we're currently doing in JS). As of now, Value is the only variant type that an expression can take, but it's conceivable that that could change. I suppose can can cross that bridge later.

Expression(std::string key_, type::Type type_) : key(key_), type(type_) {}
virtual ~Expression() {}

virtual optional<OutputValue> evaluate(float, const GeometryTileFeature&, EvaluationError& e) const = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest packaging the arguments to evaluate in a struct, so if we need to change types or add new parameters it doesn't require touching the definition of every subexpression.

}

template <class V>
static ParseResult parse(const V& value, const ParsingContext& ctx) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these parses follow the same form, with only the expression type varying. That suggests it should be defined once as a template on the expression type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 🙇

return isZC;
}

virtual std::unique_ptr<Expression> applyInferredType(const type::Type& type, Args args) const = 0;
Copy link
Contributor Author

@anandthakker anandthakker Jul 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a broken design: e.applyInferredType wants to return a transformed copy of e, but because it transfers ownership of child expressions, it's destructive of e in the process. Either expressions should be made copyable or this should be replaced with just a setType() that the type inference process can use to refine expression types in place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not super familiar with the typing algorithm, but my gut is that applyInferredType should be combined with typecheck, and return a new AST -- i.e. it transforms itself and recursively transforms child expressions, and all transforms are immutable and return new objects.


using namespace mbgl::style::expression::type;

TypecheckResult typecheck(const Type& expected, const std::unique_ptr<Expression>& e) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this just be a (virtual) method of Expression?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 And I still think Typename should be eliminated as a type; a virtual typecheck method could be part of accomplishing that. The other part, I think, is replacing as many uses of Typename as possible with a set of overload signatures. The overload signature should include the return type, and expressions like ==, <, at should have a set of overloads, one for each supported type. Not only will this reduce the need for Typename, it will force us to be more specific about which types these operators support and the definition for each supported type. For example, in the JS implementation I can see that == doesn't work correctly for Color, and < likely should not be defined for Object.

Then for the expression types that are still using Typename, define a custom typecheck method instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sketch of possible definition syntax in JS:

    RegisterCompoundExpression('==', [
        [BooleanType, NullType, NullType],
        () => `true`,
        [BooleanType, NumberType, NumberType],
        (lhs, rhs) => `${lhs} === ${rhs}`,
        [BooleanType, StringType, StringType],
        (lhs, rhs) => `${lhs} === ${rhs}`,
        [BooleanType, BooleanType, BooleanType],
        (lhs, rhs) => `${lhs} === ${rhs}`,
        [BooleanType, ColorType, ColorType],
        (lhs, rhs) => `${lhs}.equal(${rhs})`
    ]),

In C++, you could just provide the evaluation methods, and the signatures could be deduced from those.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In C++, you could just provide the evaluation methods, and the signatures could be deduced from those.

Hm, I don't see how this would work... I think we need the list of signatures to be modeled as a value (e.g. std::vector<std::vector<type::Type>>) in order to do type checking. That wouldn't be possible just from the existence of certain evaluation methods, right?

Copy link
Contributor

@jfirebaugh jfirebaugh Jul 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can translate from compile time C++ types to runtime expression types via templates. You're already doing this for values with valueTypeToExpressionType, and it's also possible to write functionTypeToSignature:

template <class Result, class... Inputs>
type::Signature functionTypeToSignature(Result (*fn) (Inputs...) = nullptr) {
    return Signature(valueTypeToExpressionType<Result>(),
                     valueTypeToExpressionType<Inputs>()...);
}

// Can be used with deduced template parameters:
//   functionTypeToSignature([] (bool, float) -> float { return 0.0; });
// or explicitly specified:
//   functionTypeToSignature<float, bool, float>();

(Again, just a sketch. Haven't tried to compile this.)

@anandthakker anandthakker force-pushed the expressions branch 2 times, most recently from 636e250 to a5b2f6a Compare July 19, 2017 21:14
// (const EvaluationParameters&, T1, T2, ...) => Result<U>,
// where T1, T2, etc. are the types of successfully-evaluated subexpressions.
template <class R, class... Params>
struct Signature<R (const EvaluationParameters&, Params...)> : SignatureBase {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signature is probably a misnomer, since this object holds not just the types, but also a pointer to an expression's evaluate implementation.

@anandthakker anandthakker force-pushed the expressions branch 2 times, most recently from e470b06 to ca091ab Compare July 21, 2017 20:57
@anandthakker
Copy link
Contributor Author

@jfirebaugh Got a decent draft implementation of using expressions to drive {Identity,Source,Camera}Function<T>. Still plenty left to do (added to top of ticket), but could you give this another round of 👀 ? mbgl/style/function/convert.hpp and mbgl/style/expression/value.xpp are new & possibly good places to start.

Copy link
Contributor

@jfirebaugh jfirebaugh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I only reviewed match.hpp/cpp so far.)

namespace expression {

using InputType = variant<int64_t, std::string>;
using MatchKey = variant<InputType, std::vector<InputType>>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we reduce singletons to one-element vectors as early as possible.

using InputType = variant<int64_t, std::string>;
using MatchKey = variant<InputType, std::vector<InputType>>;

template <typename T>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we need to make a decision. We definitely want to support:

["match", input: T, (label: T | Array<T>, expression: R)..., otherwise: R] (labels must be literals)

for T ∈ {number, string, boolean}. Do we want to support:

["match", input: Value, (label: number | string | boolean | Array<number | string | boolean>, expression: R)..., otherwise: R] (labels must be literals)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My feeling is no on the latter, because I would think that any realistic use case for that could be handled by simply coercing the input to a string and then matching on string keys.

If users really want to distinguish between 0 and "0" or true and "true", they could do things like the following (though I think this would be quite rare):

[ "let", "input", input,
  [ "match", ["concat", ["typeof", ["var", "input"]], "-", ["to_string", ["var", "input"]]],
    "string-0", e_a: R,
    "number-0", e_b: R,
    ...
  ]
]  
    
  

}

EvaluationResult evaluate(const EvaluationParameters& params) const override {
const auto& inputValue = evaluateInput(params);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For readability, name the type instead of using auto (unless the type name already appears elsewhere on the line).

if (cases.find(*inputValue) == cases.end()) {
return otherwise->evaluate(params);
}
return cases.at(*inputValue)->evaluate(params);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use an iterator variable rather than finding *inputValue twice (once with find and once with at).

assert(isArray(value));
auto length = arrayLength(value);
if (length < 3) {
return CompileError {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JS throws exceptions rather than using a Result type. Should be consistent one way or the other.

if (*inputValue == rounded) {
return rounded;
} else {
return EvaluationError {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want an evaluation error in this case? I'd expect it to use the otherwise value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're right; since we don't distinguish among numeric types, this shouldn't be treated as a type error

private:
static bool isIntegerValue(const mbgl::Value& v) {
return v.match(
[] (uint64_t) { return true; },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to handle overflowing of int64_t for this case. Also, what about overflowing Number.MAX_SAFE_INTEGER in JS? Should we enforce a range that's compatible with JS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. Yeah, I suppose we might as well enforce a range compatible with JS.

return parsedInput;
}

auto otherwise = parseExpression(arrayMember(value, length - 1), ParsingContext(ctx, {length - 1}, {"match"}));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JS parses the non otherwise branches first. Should be consistent one way or the other.

// Expect odd-length array: ["match", input, 2 * (n pairs)..., otherwise]
if (length % 2 != 1) {
return CompileError {
"Missing final output value for \"match\" expression.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JS error is "Expected an even number of arguments."

checkedCases.emplace_back(pair.first, std::move(*checkedOutput));
}

error = matchType(*outputType, (*checkedOtherwise)->getType());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

outputType can still be empty here.

@anandthakker anandthakker force-pushed the expressions branch 3 times, most recently from 62049c6 to 2746c28 Compare August 13, 2017 02:44
@anandthakker
Copy link
Contributor Author

@jfirebaugh stumbled on a dependency issue as I was working on wiring up expression parsing into the mbgl::style::conversion system. Currently, I have a template <typename V> static parse(const V&, ParsingContext) method on each Expression subclass.

The symptom I'm seeing is that, when I try to include expression parsing in the data-driven property value converter (598c7a1), references to conversion methods like isUndefined(rapidjson object) from the expression parse() methods are coming up undefined.

I think the problem is that, due to the following dependency chain...

  • parser.hpp -> light.hpp -> property_value.hpp -> camera_function.hpp -> mbgl/style/function/convert.hpp -> various Expression subtypes.

... the expressions are included too early -- namely, before rapidjson_conversion.hpp, which is private and only comes in via parser.cpp.

Do you see a way to fix this without moving each expression subclass's parse() to a separate file?

@jfirebaugh
Copy link
Contributor

I think you want to split up your conversions into two groups:

  • Templated conversion from V representing some JSON-like type -- requires rapidjson_conversion.hpp or similar to have been included
  • Conversion from Stops -- this shouldn't require isUndefined, etc. to be declared

@anandthakker anandthakker force-pushed the expressions branch 3 times, most recently from 80508d0 to ce48199 Compare August 17, 2017 19:07
@anandthakker
Copy link
Contributor Author

Reran the function benchmarks, and they're pretty much indistinguishable from #9439 (comment)

@anandthakker
Copy link
Contributor Author

anandthakker commented Nov 4, 2017

@jfirebaugh okay, I think I've addressed everything from your latest review. Benchmarks are looking okay, and if we're okay with the +300K for now, then I think the only thing outstanding is to add Expression::operator==() (so that *Function::operator==() can do a value == rather than pointer ==). That's WIP in 21172a9.

virtual ~Expression() = default;

virtual EvaluationResult evaluate(const EvaluationContext& params) const = 0;
virtual void eachChild(const std::function<void(const Expression*)>&) const = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason the function argument is a pointer rather than a reference?

return {};
}
if (isFeatureConstant(expression->get())) {
return { CameraFunction<T>(std::move(*expression)) };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the findZoomCurve check needed here too?


// consume the first output value, which doesn't have a corresponding input value,
// before proceeding into the "stops" loop below.
auto first_output = ctx.parse(arrayMember(value, 2), 2, outputType);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

firstOutput

return ParseResult();
}

if (*label < previous) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<=

for (std::size_t i = 3; i + 1 < length; i += 2) {
const optional<mbgl::Value> labelValue = toValue(arrayMember(value, i));
optional<double> label;
optional<std::string> labelError;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused.


if (!itemType) { itemType = {type::Value}; }

return type::Array(*itemType, arr.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use itemType.value_or(type::Value) here.

const std::string tname = type::toString(t);
if (!itemType) {
itemType = {t};
} else if (type::toString(*itemType) == tname) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we can use a direct == here now rather than stringifying.

[&] (const std::unordered_map<std::string, Value>& obj) {
writer.StartObject();
for(const auto& entry : obj) {
writer.String(entry.first);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

writer.Stringwriter.Key

}

const auto i = *fromExpressionValue<double>(*evaluatedIndex);
const auto inputArray = *fromExpressionValue<std::vector<Value>>(*evaluatedInput);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to convert just the indexed value, rather than the whole array?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fromExpressionValue(v), where T is member of expression::Value, is a no-op conversion, equivalent to v->get(). I'll rewrite using the latter, since it's clearer.

- Disallow duplicate stops
- Ignore the no-token-replacement-for-expressions test
@anandthakker anandthakker changed the title [core] WIP - Expressions Expressions Nov 8, 2017
@anandthakker anandthakker merged commit f648cfe into master Nov 8, 2017
@anandthakker anandthakker deleted the expressions branch November 8, 2017 17:34
@tobrun tobrun removed the ⚠️ DO NOT MERGE Work in progress, proof of concept, or on hold label Nov 27, 2017
@tobrun tobrun mentioned this pull request Dec 7, 2017
24 tasks
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Core The cross-platform C++ core, aka mbgl feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants