Some additional performance optimization #119
Conversation
```scala
val l = visitExpr(lhs)
val r = visitExpr(rhs)
def fail() = Error.fail(s"Unknown binary operation: ${l.prettyName} ${Expr.BinaryOp.name(op)} ${r.prettyName}", pos)
op match {
```
Should this be annotated with `: @switch` to ensure the `tableswitch`-based compilation isn't accidentally broken? Ditto for the `visitUnaryOp` pattern match above.
AFAIR `@switch` is broken. But even if it works, it also accepts a `lookupswitch`, so it's not much help.
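For context, a minimal sketch (hypothetical example, not sjsonnet code) of what `: @switch` does: scalac complains if the annotated match cannot be compiled to a JVM switch instruction at all, but it accepts either a `tableswitch` or a `lookupswitch`, so it indeed cannot guarantee the faster dense form:

```scala
import scala.annotation.switch

// Hypothetical example: dense Int cases typically compile to a tableswitch.
// @switch only verifies that *some* switch instruction is emitted,
// not which one.
def opSymbol(op: Int): String = (op: @switch) match {
  case 0 => "*"
  case 1 => "/"
  case 2 => "%"
  case 3 => "+"
  case _ => "unknown"
}
```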
```scala
def cast[T: ClassTag: PrettyNamed] =
  if (implicitly[ClassTag[T]].runtimeClass.isInstance(this)) this.asInstanceOf[T]
  else throw new Error.Delegate(
    "Expected " + implicitly[PrettyNamed[T]].s + ", found " + prettyName
  )

def pos: Position
```

```scala
private[this] def failAs(err: String): Nothing =
```
Should we make this take an implicit `T: PrettyNamed` for the error message, rather than passing in the `err` string manually? That would help ensure that we're consistent with our names for the various `Val.*` classes.
```diff
   parse("1 + 2 * 3") ==>
-    BinaryOp(pos(2), Num(pos(0), 1), BinaryOp.`+`, BinaryOp(pos(6), Num(pos(4), 2), BinaryOp.`*`, Num(pos(8), 3)))
+    BinaryOp(pos(2), Num(pos(0), 1), BinaryOp.OP_+, BinaryOp(pos(6), Num(pos(4), 2), BinaryOp.OP_*, Num(pos(8), 3)))
 }
 test("duplicateFields") {
   parseErr("{ a: 1, a: 2 }") ==> """Expected no duplicate field: a:1:14, found "}""""
```
Can we add a few tests to `ParserTests` to validate the construction of static `Val.Arr`s and `Val.Obj`s? A lot of the new changes are semantically indistinguishable from the status quo, and thus wouldn't get validated through the normal course of compiling jsonnet. We're also starting to have edge cases that are worth validating in tests, e.g. nested static arrays, nested static objects, alternating nested static arrays and static objects, and static arrays containing a mix of static primitives and other static arrays.
```diff
@@ -33,20 +33,24 @@ class Evaluator(parseCache: collection.mutable.HashMap[(Path, String), fastparse
   def visitExpr(expr: Expr)
                (implicit scope: ValScope): Val = try {
     expr match {
       case Id(pos, value) => visitId(pos, value)
```
I assume you re-ordered these to try and speed up lookup for the most common cases. How did you find the order? e.g. was it just a guesstimate, did you instrument it to see which ones are most common, or something else?
I gathered statistics from our universe benchmark. It may not be representative of all use cases, but the current version doesn't seem to be ordered for performance anyway.
`sjsonnet/src/sjsonnet/Val.scala` (outdated):
```scala
    Error.fail("Too many args, function has " + params.names.length + " parameter(s)", outerPos)
  }
  arrI
} else if(params.indices.length < argsSize) {
```
Would moving this case to the top of the if-else chain allow us to avoid the `try`-`catch` above? That would also allow us to avoid duplicating the `Error.fail` call. AFAICT the most common code path is the last `else`, so re-ordering the first two cases shouldn't slow things down too much.
Looks good, left some comments.

Some of those might already be obsolete with the further changes I made. Current benchmark time is 483ms. I'm still exploring a few options for improving it.

Updated with the latest changes. I'm running out of ideas and haven't made progress in a while. We should get this version merged. I tested it against universe. It will require 2 changes there because we relied on incorrect behavior of the old Sjsonnet release.
0.4.0 stands at 751 ms in our benchmark. Here's the progress over the course of this PR:
```diff
@@ -6,7 +6,7 @@ cancelable in Global := true

 lazy val main = (project in file("sjsonnet"))
   .settings(
-    scalacOptions in Compile ++= Seq("-opt:l:inline", "-opt-inline-from:sjsonnet.**"),
+    scalacOptions in Compile ++= Seq("-opt:l:inline", "-opt-inline-from:sjsonnet.*,sjsonnet.**"),
```
The `-opt-inline-from` syntax is confusing. `sjsonnet.**` only inlines from subpackages (which we don't have). We need to add `sjsonnet.*` to inline from the `sjsonnet` package, too. Ultimately it doesn't make a big difference. HotSpot has become so good that it doesn't matter in most cases, but the optimizations are not entirely deterministic. Sometimes you end up with a benchmark run that is 10% slower than it should be. Letting scalac do the trivial inlining (and thus creating less work for HotSpot) makes this more reliable.
Looks good, a few things before we merge it:

Other than that, I have reasonable confidence in the test suite and our ZZZ golden files to catch any bugs before they slip through. The main concern now is making sure that the knowledge you earned from the hours you spent on this stuff (why you made each of these changes) is preserved in the codebase/git, for future maintainers to pick up and continue. Especially non-obvious things like avoiding megamorphic code in hot paths, removing boxing, and fast paths for common usage patterns would be much harder for someone to reverse-engineer later vs. simply reading why you did something.
This hides the internal array and provides all the necessary methods to access it efficiently. There are experiments in https://github.com/szeiger/sjsonnet/tree/wip/perf-arropt that didn't improve performance.
Applyer required an extra object and indirection for every call of a higher-order function. We can do without the convenience and use Val.Func directly, passing the EvalScope and FileScope directly.
This introduces the Val.Builtin types for built-in functions of various arities, plus arity-specific apply methods for all functions. The arity-specific methods and types allow us to avoid array allocation for parameters. The Builtin types further avoid allocating a new ValScope when calling a built-in function. This commit starts the refactoring of standard library functions to the new Builtin types, which is an ongoing effort. Normal Val.Func functions are still supported, but they do not benefit from the same optimizations.
Refactor more standard library functions to use `Val.Builtin` and introduce convenience methods in `Val` for type coercions. These take the place of the `ReadWriter` implicits (which are still used in some places). The old way of implementing built-in functions as `Val.Func` via the `builtin` methods handled type coercions automatically (with `ReadWriter`), but the new style of manually implementing `Val.Builtin` is much easier with the convenience methods.
Some performance optimizations for `builtinWithDefaults` to avoid complex collection operations when calling such a function.
This starts the refactoring of built-in functions into objects (instead of anonymous classes) and removes the unnecessary FileScope.
Avoid creating many individual subclasses of `Lazy` for representing strict values (which have already been evaluated or are safe to evaluate immediately because we know they will be evaluated anyway). The new shared `Strict` class assigns the cached value in the constructor (in addition to returning it in `compute`). This looks unnecessary from a functional point of view, but it is important for performance as it avoids the megamorphic `compute` call the first time it is forced. `force` itself is a short monomorphic method that can easily be inlined by HotSpot.
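A minimal sketch of this pattern (simplified hypothetical classes; in sjsonnet itself `Val` extends `Lazy` and the details differ):

```scala
sealed trait Val
final case class Num(value: Double) extends Val

abstract class Lazy {
  protected var cached: Val = null
  def compute(): Val
  // Short monomorphic method that HotSpot can easily inline; the
  // potentially megamorphic compute() is only reached for values that
  // have not been evaluated yet.
  final def force: Val = {
    if (cached == null) cached = compute()
    cached
  }
}

// One shared class for all strict values: assigning the cache in the
// constructor means force never hits compute() for a Strict, avoiding
// the megamorphic call the first time the value is forced.
final class Strict(v: Val) extends Lazy {
  cached = v
  def compute(): Val = v
}
```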
Arrays can now be literals. Any array expression encountered by the parser which contains only literals is itself turned into a literal, i.e. an instance of `Val.Arr` rather than `Expr.Arr`.
We already introduced the concept of static objects (created from object literals, containing only members with static names and literal values) in the previous round of optimizations. This can be used for faster key and value lookup.
`Val.Obj.Member` is now an abstract class with an abstract `invoke` method instead of taking a function argument for the member implementation. This avoids the extra object allocation per definition and the extra indirection per call site. When a member returns a statically known value (which is always the case in a static object) we use the special `ConstMember` class which serves the same purpose as `Val.Strict` for `Val.Lazy`.
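The difference can be sketched as follows (hypothetical simplified types, not the actual `Val.Obj.Member` signature):

```scala
// Old style: the implementation is passed as a function value, costing an
// extra closure allocation per member definition and an extra indirection
// per call site.
final class MemberOld(f: String => Int) {
  def invoke(self: String): Int = f(self)
}

// New style: an abstract method, implemented directly by subclasses.
abstract class Member {
  def invoke(self: String): Int
}

// Analogue of ConstMember: returns a statically known value, playing the
// same role for members that Val.Strict plays for Val.Lazy.
final class ConstMember(v: Int) extends Member {
  def invoke(self: String): Int = v
}
```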
The main benchmark is still over 600ms at this point. The new parser and materializer benchmarks tell us how much of this time is spent outside the evaluator (a bit over 30ms for the parser and 60ms for the materializer).
This is based on statistics gathered from our benchmark. The class-based lookups in `visitExpr` get compiled to a `lookupswitch` which is generally pretty fast, but with linear time based on the position in the method.
Unary and binary operators already used the same expression types, with the `Op` objects only being used as markers. They can be easily replaced by `Int` literals, thus allowing `visitUnaryOp` and `visitBinaryOp` to be compiled to a `tableswitch` at the outer layer, which makes all operator lookups equally fast.
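A sketch of the idea with hypothetical op codes (the PR's real constants follow the same `OP_*` naming pattern): dense, gap-free `Int` constants let scalac compile the outer match to a `tableswitch`, which dispatches in constant time:

```scala
object BinaryOp {
  // Dense Int codes with no gaps, so a match over them can become a
  // tableswitch rather than a lookupswitch.
  final val OP_* = 0
  final val OP_/ = 1
  final val OP_% = 2
  final val OP_+ = 3
}

// Constant patterns over the dense codes; op objects are no longer needed.
def name(op: Int): String = op match {
  case BinaryOp.OP_* => "*"
  case BinaryOp.OP_/ => "/"
  case BinaryOp.OP_% => "%"
  case BinaryOp.OP_+ => "+"
  case _ => "?"
}
```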
This commit introduces the static optimizer. The base class `ExprTransform` implements an AST transformer which can recursively transform and rebase an `Expr`. `StaticOptimizer` adds a scoped transform (which keeps track of the names that are in scope for each `Expr`) and implements the first optimization: Calls of the form `std.x(...)`, where `std` is the standard library (i.e. the name `std` has not been shadowed by a local definition) and `x` is a valid method name in the standard library, are replaced by one of the new `Expr.ApplyBuiltin` expression types (for various arities). This allows us to skip looking up `std` and `x` again during evaluation.
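The core rewrite can be sketched with a toy AST (hypothetical names and a single arity; the real `StaticOptimizer` handles several arities and tracks scoping via `ScopedExprTransform` so that a shadowed `std` is left alone):

```scala
sealed trait Expr
final case class Id(name: String) extends Expr
final case class Select(qual: Expr, name: String) extends Expr
final case class Apply1(fn: Expr, arg: Expr) extends Expr
final case class ApplyBuiltin1(name: String, arg: Expr) extends Expr

// Rewrite `std.x(arg)` into a direct builtin call when `std` is not
// shadowed by a local definition. Simplified sketch: only function
// applications are recursed into.
def optimize(e: Expr, locals: Set[String]): Expr = e match {
  case Apply1(Select(Id("std"), x), arg) if !locals("std") =>
    ApplyBuiltin1(x, optimize(arg, locals))
  case Apply1(fn, arg) =>
    Apply1(optimize(fn, locals), optimize(arg, locals))
  case other => other
}
```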
- Add a new benchmark for the optimizer (~2-3ms in our normal benchmark run) - Introduce the `Resolver` abstraction for resolving imports - Refactor the scoped transformations from `StaticOptimizer` into a new superclass `ScopedExprTransform`
Another statistics-based reordering since we added a number of new expression types since last time.
This replaces `ValidSuper` with `SelectSuper`, `InSuper` and `LookupSuper`. A `super` call can only appear in these 3 contexts and the code paths for evaluating them are very different from the non-`super` versions, so it makes sense to split them up.
We can't beat HotSpot when it comes to making the right decision about which small methods should be inlined into `visitExpr`.
We are adding another new step in the optimization of function application expressions. Any `Builtin` now gets the opportunity to rewrite the call site. This is particularly useful for partially applying literal arguments during optimization. For example, the `from` argument of `std.replaceAll` has to be parsed as a regular expression every time the function is called. When it is statically known we now generate a call to a specialized version that performs the parsing only once during optimization.
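The `std.replaceAll` case can be sketched like this (hypothetical simplified signatures; the real `Builtin.specialize` hook returns a rewritten call site on `Expr` level):

```scala
import java.util.regex.Pattern

// Generic version: compiles the regex on every call.
def replaceAll(str: String, from: String, to: String): String =
  Pattern.compile(from).matcher(str).replaceAll(to)

// Specialized version, created once at optimization time when `from` is a
// statically known literal: the pattern is compiled a single time and
// reused for every subsequent call.
final class SpecializedReplaceAll(from: String) {
  private val pattern = Pattern.compile(from)
  def apply(str: String, to: String): String =
    pattern.matcher(str).replaceAll(to)
}
```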
This is similar to what JFR does for the JVM, but based on the Sjsonnet AST.
The `%` operator can benefit greatly from partial application when the lhs is a string literal, so we add an optimizer rule for this. This is not useful for any other operators. We do not need a generic mechanism like we have for `Builtin` functions.
- As another step in function application optimization we now try to statically apply a `Builtin` function when all arguments are literals. - Some micro-optimizations for `std.length`.
- Don't parse JSON to an intermediate ujson AST first in `std.parseJson`. The new `ValVisitor` can parse directly to an Sjsonnet `Val`. - Similarly the new `MaterializeJsonRenderer` used by `std.manifestJson` renders the output without an intermediate ujson AST.
We can avoid allocating a new `ValScope` for each predicate call in `std.filter`. Normally every definition has to copy the existing scope instead of updating a shared array because any value could be read at an arbitrary later time due to lazy evaluation. But this is not the case when a function returns a primitive value. In particular, in `std.filter` we are repeatedly calling the same predicate function and we only check if it returned `Val.True`. The value is not stored anywhere for later use. This makes it safe to reuse a single `ValScope` with a single bindings array for all calls.
Built-in functions with defaults were still treated as `Val.Func`. With the old scope handling we would have needed a more complex implementation to also handle default values but this is no longer a problem. We simply have to pass them on to the superclass. Now that we can turn `std.setInter` into a `Builtin` we can optimize it further by partially applying a static argument.
Calls of the form `std.length(std.filter(...))` can be optimized to skip creation of the filtered array. We can simply count as we go along. It is not clear at this time if the Jsonnet specification allows us to go further and skip evaluation entirely in cases like `std.length(std.filter(...)) == 0`, so we still evaluate everything.
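The fused counting can be sketched as (hypothetical helper, not the actual sjsonnet code):

```scala
// Count matching elements directly instead of materializing the filtered
// collection first, as a naive std.length(std.filter(...)) would do.
def lengthFilter[A](xs: Seq[A])(p: A => Boolean): Int = {
  var n = 0
  xs.foreach(x => if (p(x)) n += 1)
  n
}
```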
Some micro-optimizations
When looking up definitions in the static scope of a `ScopedExprTransform` it is not always possible to see the value at the current phase (i.e. after `StaticOptimizer` vs after `Parser` at the moment; we do not have more phases yet). This is a general problem in any language that allows recursive references in definitions.

Previously we used the simplest possible implementation in `ScopedExprTransform`: all definitions that are made together (e.g. in a single `local` expression) are stored in the scope at the same time (using their value after the previous phase), and then they are evaluated to provide an improved scope in which the body of the `local` can see them with the updated values after the current phase. This prevents full optimization in cases like this:

```
local a = 1, b = a, c = b; c
```

When the rhs of `c` is optimized, the scope still contains `b = a` even though `a` has already been inlined (`b = 1`). With this PR we do the next better thing: update the scope incrementally to allow back-references to see the current phase.

Note that `a`, `b` and `c` are all allowed to refer to each other and they may be defined in an arbitrary order. In these cases we can still miss some optimizations, but supporting forward references would require lazy evaluation of scopes and recursion detection. In practice most references that benefit from these optimizations are expected to go backwards, and we want to keep it simple.
This avoids creating `Lazy` objects in some cases when we already know that evaluation will be strict.
Some refactoring to simplify and generalize these optimizations.
We have to do them before the optimizer because they are based on syntax.
PrettyYamlRenderer relies on subVisitor calls for individual elements (and no subVisitor call for an empty array)
The current behavior (after the scope handling overhaul) is correct (matching the specification and Google Jsonnet) but the error message was misleading. Sjsonnet 0.4 did not detect the illegal call at all.
`extVar` depends on external variables which are part of a specific evaluation. We have to ensure that they do not end up in the shared parse cache.
The result is puzzling but it matches the specification and Google Jsonnet. Returning a nullary `function() true` causes it to be evaluated during materialization to `true` but comparing it explicitly to `true` must yield `false` because there is no implicit evaluation in this case. This was previously broken in Sjsonnet.
They are only needed for tests and benchmarks. All production code uses the new implementations.
Oops, replaceAll accidentally did the work twice.
Updated with additional docs.
The latest result from 0.4.0 was:
This PR brings it down to:
The main optimizations are:
- `ValScope` is now symbol-based instead of name-based. New bindings are appended at the end with no shadowing; `self`, `$` and `super` are treated like regular variables in the scope.
- `StaticOptimizer`, which uses a single AST transformation after parsing to implement several of the following optimizations (per source file; stored in the parse cache for reuse)
- Specialized call sites of various arities for built-in functions (`ApplyBuiltin`) and user-defined functions (`Apply`)
- Less indirection (`Applyer`, lambdas in members & standard library functions)
- Avoiding `ValScope` allocation for standard library calls
- Call-site rewriting via `Builtin.specialize` (e.g. pre-compiled patterns in `format` and `strReplaceAll`; specialized implementation of `length(filter(...))`)
- Making `Val` a subclass of `Lazy` to avoid unnecessary wrappers when a `Lazy` is required for an already computed `Val`
- Faster dispatch in `Evaluator` methods by using a `tableswitch`-based dispatch for operators and a `lookupswitch` ordered by frequency (in our benchmarks) for node types
- New `Renderer` implementations based on the latest ujson
- Rendering a `Val` without an intermediate ujson AST

New features:

- `ExprTransform` and `ScopedExprTransform` for implementing tree transforms (used by the optimizer and some benchmarks)