Generator Enhancements #1523

steven-johnson · 2016-09-27T23:10:32Z

This PR introduces some enhancements to the Generator infrastructure; the goals are:

Improve readability and flexibility of Generators
Provide machine-generated Stubs that make it easier for one Generator to use another
Make integration with the Autoscheduler easier and more reliable

Note that none of these changes break existing Generators (all existing Generators should work as-is); it is not the intent of this PR to force existing Generators to migrate to this style anytime in the near future.

Replacing `Param<X>` with `Input<X>` (and `ImageParam` with `Input<Func>`)

Param<> will continue to exist, but Generators can now use a new
class, Input<>, instead. For scalar types, these can be considered essentially
identical to Param<>, but have a different name for reasons of code clarity,
as we'll see later.

Similarly, ImageParam will continue to exist, but Generators can
instead use an Input<Func>. This is (essentially) like an ImageParam, with
the main difference being that it may (or may not) be backed by an actual
buffer, and thus has no defined extents.

Input<Func> input{"input", Float(32), 2};

It is an error for a Generator to declare both Input<> and Param<> or
ImageParam (i.e.: if you use Input<> you may not use the previous syntax).

Note that Input<> is intended only for use with Generator, and is not
intended for use in other Halide code; in particular, it is not intended to
replace Param<>, except for inside Generators.

Example:

class SumColumns : Generator<SumColumns> {
  ImageParam input{Float(32), 2, "input"};

  Func build() {
    RDom r(0, input.width());
    Func f;
    Var y;
    f(y) = 0.f;
    f(y) += input(r.x, y);
    return f;
  }
};

becomes

class SumColumns : Generator<SumColumns> {
  Input<Func> input{"input", Float(32), 2};
  Input<int32_t> width{"width"};

  Func build() {
    RDom r(0, width);
    Func f;
    Var y;
    f(y) = 0.f;
    f(y) += input(r.x, y);
    return f;
  }
};

You can optionally leave the type and/or dimensions of an Input<Func> unspecified, in which case they are inferred from the actual Func passed to it. If you do specify an explicit Type or Dimension, the input Func must match it, or a compilation error results.

Input<Func> input{ "input", 3 };  // require 3-dimensional Func,
                                  // but leave Type unspecified

When a Generator using Input<Func> is compiled directly (e.g., using GenGen), the Input<Func> must be concretely specified; if Type and/or Dimensions are unspecified, you can specify them using implicit GeneratorParams with names derived from the Input or Output. (In the example above, input has an implicit GeneratorParam named "input.type" and an implicit GeneratorParam named "input.dim".)
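The infer-or-check rule for unspecified types and dimensions can be modeled without Halide. The following is a minimal standalone sketch under stated assumptions, not the code in this PR: `PortDecl`, `ActualFunc`, `resolve`, and the two name helpers are hypothetical, and types are represented as plain strings purely for illustration.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a declared Input<Func>: an empty type string or a
// negative dimension count means "unspecified".
struct PortDecl {
    std::string name;
    std::string type;  // e.g. "float32"; "" if unspecified
    int dims;          // e.g. 2; -1 if unspecified
};

// Hypothetical stand-in for the Func that is actually passed in.
struct ActualFunc {
    std::string type;
    int dims;
};

// Returns the declaration with unspecified fields filled in from `actual`.
// Throws if an explicitly declared field disagrees with `actual`.
inline PortDecl resolve(const PortDecl &decl, const ActualFunc &actual) {
    PortDecl out = decl;
    if (decl.type.empty()) {
        out.type = actual.type;
    } else if (decl.type != actual.type) {
        throw std::runtime_error(decl.name + ": type mismatch");
    }
    if (decl.dims < 0) {
        out.dims = actual.dims;
    } else if (decl.dims != actual.dims) {
        throw std::runtime_error(decl.name + ": dimension mismatch");
    }
    return out;
}

// Names of the implicit GeneratorParams derived from the port name,
// e.g. "input.type" and "input.dim".
inline std::string implicit_type_param(const PortDecl &d) { return d.name + ".type"; }
inline std::string implicit_dim_param(const PortDecl &d) { return d.name + ".dim"; }
```

In this model, a declaration such as `{"input", "", 3}` accepts any 3-dimensional Func and adopts its type, while a 2-dimensional Func is rejected.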

Explicitly Declaring Outputs

Currently, all of a Generator's inputs can be determined by introspecting its
members, but information about its outputs must be determined by calling its
build() method and examining the return value (which may be a Func or a
Pipeline).

With this change, a Generator can, instead, explicitly declare its output(s) as
member variables, and provide a generate() method instead of a build() method.
(These are equivalent aside from the fact that generate() does not return a
value.)

Example:

class SumColumns : Generator<SumColumns> {
  Input<Func> input{"input", Float(32), 2};
  Input<int32_t> width{"width"};

  Func build() {
    RDom r(0, width);
    Func f;
    Var y;
    f(y) = 0.f;
    f(y) += input(r.x, y);
    return f;
  }
};

becomes

class SumColumns : Generator<SumColumns> {
  Input<Func> input{"input", Float(32), 2};
  Input<int32_t> width{"width"};

  Output<Func> sum_cols{"sum_cols", Float(32), 1};

  void generate() {
    RDom r(0, width);
    Var y;
    sum_cols(y) = 0.f;
    sum_cols(y) += input(r, y);
  }
};

As with Input<Func>, you can optionally make the type and/or dimensions of an
Output<Func> unspecified; any unspecified types must be resolved via an implicit GeneratorParam in order to use top-level compilation.

Note that Output<> is intended only for use with Generator, and is not
intended for use in other Halide code.

The Generator infrastructure will verify (after calling generate()) that all
outputs are defined, and have definitions that match the declaration.

You can specify an output that returns a Tuple by specifying a list of Types:

class Tupler : Generator<Tupler> {
  Input<Func> input{"input", Int(32), 2};
  Output<Func> output{"output", {Float(32), UInt(8)}, 2};

  void generate() {
    Var x, y;
    output(x, y) = Tuple(cast<float>(input(x, y)), cast<uint8_t>(input(x, y)));
  }
};

A Generator can define multiple outputs (which is quietly implemented as a
Pipeline under the hood):

class SumRowsAndColumns : Generator<SumRowsAndColumns> {
  Input<Func> input{"input", Float(32), 2};
  Input<int32_t> width{"width"};
  Input<int32_t> height{"height"};

  Output<Func> sum_rows{"sum_rows", Float(32), 1};
  Output<Func> sum_cols{"sum_cols", Float(32), 1};

  void generate() {
    RDom rc(0, height);
    Var x;
    sum_rows(x) = 0.f;
    sum_rows(x) += input(x, rc);

    RDom rr(0, width);
    Var y;
    sum_cols(y) = 0.f;
    sum_cols(y) += input(rr, y);
  }
};

We also allow you to specify Output for any scalar type (except for Handle
types); this is merely syntactic sugar on top of a zero-dimensional Func, but
can be quite handy, especially when used with multiple outputs:

class Sum : Generator<Sum> {
  Input<Func> input{"input", Float(32), 2};
  Input<int32_t> width{"width"};
  Input<int32_t> height{"height"};

  Output<Func> sum_rows{"sum_rows", Float(32), 1};
  Output<Func> sum_cols{"sum_cols", Float(32), 1};
  Output<float> sum{"sum"};

  void generate() {
    RDom rc(0, height);
    Var x;
    sum_rows(x) = 0.f;
    sum_rows(x) += input(x, rc);

    RDom rr(0, width);
    Var y;
    sum_cols(y) = 0.f;
    sum_cols(y) += input(rr, y);

    RDom r(0, width, 0, height);
    sum() = 0.f;
    sum() += input(r.x, r.y);
  }
};

Note that it is an error to define both a build() and generate() method.
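One way such a rule could be checked at C++ compile time is with expression-detection traits. This is a sketch of the general technique only, and an assumption about how it might be done rather than a description of this PR's implementation; `has_build`, `has_generate`, `is_well_formed_generator`, and the four toy structs are hypothetical names.

```cpp
#include <type_traits>
#include <utility>

// C++11-compatible void_t (std::void_t arrived in C++17).
template <typename...> struct make_void { using type = void; };
template <typename... Ts> using void_t = typename make_void<Ts...>::type;

// True if `t.build()` is a valid expression for an lvalue t of type T.
template <typename T, typename = void>
struct has_build : std::false_type {};
template <typename T>
struct has_build<T, void_t<decltype(std::declval<T &>().build())>> : std::true_type {};

// True if `t.generate()` is a valid expression for an lvalue t of type T.
template <typename T, typename = void>
struct has_generate : std::false_type {};
template <typename T>
struct has_generate<T, void_t<decltype(std::declval<T &>().generate())>> : std::true_type {};

// Well-formed means exactly one of build() / generate() is provided.
template <typename T>
struct is_well_formed_generator
    : std::integral_constant<bool, has_build<T>::value != has_generate<T>::value> {};

// Hypothetical toy generators used only to exercise the traits.
struct OldStyle { int build() { return 0; } };
struct NewStyle { void generate() {} };
struct Both     { int build() { return 0; } void generate() {} };
struct Neither  {};
```

A base class could then `static_assert(is_well_formed_generator<Derived>::value, ...)` inside a member function, where `Derived` is complete.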

Array Inputs and Outputs

You can also use the new syntax to declare an array of Input or Output, by
using an array type as the type parameter:

// Takes exactly 3 images and outputs exactly 3 sums.
class SumRowsAndColumns : Generator<SumRowsAndColumns> {
  Input<Func[3]> inputs{"inputs", Float(32), 2};
  Input<int32_t[2]> extents{"extents"};

  Output<Func[3]> sums{"sums", Float(32), 0};

  void generate() {
    assert(inputs.size() == sums.size());
    // assume all inputs are same extent
    Expr width = extents[0];
    Expr height = extents[1];
    for (size_t i = 0; i < inputs.size(); ++i) {
      RDom r(0, width, 0, height);
      // each sum is a zero-dimensional Func
      sums[i]() = 0.f;
      sums[i]() += inputs[i](r.x, r.y);
    }
  }
};

You can also leave the array size unspecified, in which case it will be inferred from the input vector, or (optionally) explicitly specified via a resize() method:

class Pyramid : public Generator<Pyramid> {
public:
    GeneratorParam<int32_t> levels{"levels", 10};

    Input<Func> input{ "input", Float(32), 2 };

    Output<Func[]> pyramid{ "pyramid", Float(32), 2 };

    void generate() {
        // Size the output array before indexing into it.
        pyramid.resize(levels);

        pyramid[0](x, y) = input(x, y);
        for (size_t i = 1; i < pyramid.size(); i++) {
            pyramid[i](x, y) = (pyramid[i-1](2*x, 2*y) +
                               pyramid[i-1](2*x+1, 2*y) +
                               pyramid[i-1](2*x, 2*y+1) +
                               pyramid[i-1](2*x+1, 2*y+1))/4;
        }
    }

private:
    Var x, y;
};

An array Input/Output with unspecified size must be resolved to a concrete size for top-level compilation; there are now implicit GeneratorParam<size_t> values that allow you to set this, based on the name ("pyramid.size" in the example above).

Note that both Input and Output arrays support a limited subset of the methods from std::vector<>:

operator[]
size()
begin()
end()
resize()
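To make the shape of that subset concrete, here is a minimal standalone sketch of a wrapper that exposes only those methods. It is an illustration under stated assumptions, not the PR's implementation: `PortArray` and `build_levels` are hypothetical names, and plain `int` values stand in for Funcs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical wrapper exposing only the std::vector subset listed above.
template <typename T>
class PortArray {
public:
    PortArray() = default;

    T &operator[](std::size_t i) { return items_[i]; }
    const T &operator[](std::size_t i) const { return items_[i]; }
    std::size_t size() const { return items_.size(); }
    typename std::vector<T>::iterator begin() { return items_.begin(); }
    typename std::vector<T>::iterator end() { return items_.end(); }
    typename std::vector<T>::const_iterator begin() const { return items_.begin(); }
    typename std::vector<T>::const_iterator end() const { return items_.end(); }
    void resize(std::size_t n) { items_.resize(n); }

private:
    std::vector<T> items_;  // push_back, erase, etc. are deliberately not exposed
};

// Follows the same resize-then-fill pattern as a pyramid of levels, using ints:
// level i holds the base value divided by 4^i (integer division).
inline PortArray<int> build_levels(int base, std::size_t levels) {
    PortArray<int> pyramid;
    pyramid.resize(levels);  // the size must be set before indexing
    if (levels == 0) return pyramid;
    pyramid[0] = base;
    for (std::size_t i = 1; i < pyramid.size(); ++i) {
        pyramid[i] = pyramid[i - 1] / 4;
    }
    return pyramid;
}
```

Because `begin()` and `end()` are provided, range-based `for` loops work over the wrapper just as they would over a vector.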

Separating Scheduling from Building

A Generator can now split the existing build() method into two methods:

void generate() { ... }
void schedule() { ... }

Such a Generator must move all scheduling code for intermediate Func into
the schedule() method. Note that this means that schedulable Func, Var,
etc. will need to be stored as member variables of the Generator. (Since
Output<> are required to be declared as member variables, these are simple
enough, but intermediate Func that need scheduling may have to be moved from
locals in generate() to class scope.)

Example:

class Example : Generator<Example> {
  Output<Func> output{"output", Float(32), 2};

  void generate() {
    Var x, y;

    Func intermediate;
    intermediate(x, y) = SomeExpr(x, y);

    output(x, y) = intermediate(x, y);

    intermediate.compute_at(output, y);
  }
};

becomes

class Example : Generator<Example> {
  Output<Func> output{"output", Float(32), 2};

  void generate() {
    intermediate(x, y) = SomeExpr(x, y);
    output(x, y) = intermediate(x, y);
  }

  void schedule() {
    intermediate.compute_at(output, y);
  }

  Func intermediate;
  Var x, y;
};

Note that the output Func doesn't have a scheduling directive for
compute_at() or store_at() in either case: it is either implicitly
compute_root() (when being compiled directly into a filter), or explicitly
scheduled by its caller (when being used as a subcomponent, as we'll see later).

Even if the intermediate Halide code doesn't have any scheduling necessary (e.g.
it's all inline), you should still provide an empty schedule() method to make
this fact obvious and clear.

Example:

class ExampleInline : Generator<ExampleInline> {
  Output<Func> output{"output", Float(32), 2};

  void generate() {
    Var x, y;
    output(x, y) = SomeExpr(x, y);
  }
};

becomes

class ExampleInline : Generator<ExampleInline> {
  Output<Func> output{"output", Float(32), 2};

  void generate() {
    output(x, y) = SomeExpr(x, y);
  }

  void schedule() {
    // empty
  }

  Var x, y;
};

Converting `GeneratorParam` into `ScheduleParam` where necessary

GeneratorParam is now augmented by the new ScheduleParam type. All
generator params that are intended to be used by the schedule() method should
be declared as ScheduleParam rather than GeneratorParam. This has two
purposes:

It allows a declarative way to enumerate and communicate scheduling
information between arbitrary Generators (as we'll see later).
It makes clear which GeneratorParams are used for scheduling, which will aid
future Autoscheduler work.

Note that there are common GeneratorParam conventions that already act as
ScheduleParam (most notably, vectorize and parallelize); this merely
formalizes the previous convention.

GeneratorParam and ScheduleParam will continue to live inside a single
namespace (i.e., it is an error to declare a GeneratorParam and
ScheduleParam with the same name).
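The shared-namespace rule amounts to recording both kinds of param in a single set of names. A minimal standalone sketch, assuming a hypothetical `ParamNames` registry (this is not the PR's data structure):

```cpp
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical registry: GeneratorParam and ScheduleParam names are recorded
// in one set, so a duplicate is rejected regardless of which kind declared it.
class ParamNames {
public:
    void declare_generator_param(const std::string &name) { declare(name); }
    void declare_schedule_param(const std::string &name) { declare(name); }
    std::size_t count() const { return names_.size(); }

private:
    void declare(const std::string &name) {
        // std::set::insert reports whether the name was newly added.
        if (!names_.insert(name).second) {
            throw std::invalid_argument("duplicate param name: " + name);
        }
    }
    std::set<std::string> names_;
};
```

With this model, declaring a GeneratorParam "iters" and then a ScheduleParam "iters" fails on the second declaration.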

While a GeneratorParam can be used from anywhere inside a Generator (either
the generate() or schedule() method), a ScheduleParam should be accessed
only within the schedule() method. (We'd like to make this a compile-time
error in the future.)

Note that while GeneratorParam values will continue to be serializable to and from
strings, some ScheduleParam values are not
serializable, as they may reference runtime-only Halide structures (most
notably, LoopLevel, which cannot be reliably specified by name in the general
case). Attempting to set such a ScheduleParam from GenGen will cause a
compile-time error.

Example:

class Example : Generator<Example> {
  GeneratorParam<int32_t> iters{"iters", 10};
  GeneratorParam<bool> vectorize{"vectorize", true};

  Func build() {
    Var x, y;
    vector<Func> intermediates;
    for (int i = 0; i < iters; ++i) {
      Func g;
      // Each stage consumes the previous one; the first has no predecessor.
      g(x, y) = (i == 0) ? SomeExpr(x, y) : SomeExpr2(intermediates.back()(x, y));
      intermediates.push_back(g);
    }
    Func f;
    f(x, y) = intermediates.back()(x, y);

    // Schedule
    for (auto fi : intermediates) {
      fi.compute_at(f, y);
      if (vectorize) fi.vectorize(x, natural_vector_size<float>());
    }
    return f;
  }
};

becomes

class Example : Generator<Example> {
  GeneratorParam<int32_t> iters{"iters", 10};
  ScheduleParam<bool> vectorize{"vectorize", true};

  Output<Func> output{"output", Float(32), 2};

  void generate() {
    for (int i = 0; i < iters; ++i) {
      Func g;
      g(x, y) = (i == 0) ? SomeExpr(x, y) : SomeExpr2(intermediates.back()(x, y));
      intermediates.push_back(g);
    }
    output(x, y) = intermediates.back()(x, y);
  }

  void schedule() {
    for (auto fi : intermediates) {
      fi.compute_at(output, y);
      if (vectorize) fi.vectorize(x, natural_vector_size<float>());
    }
  }

  Var x, y;
  vector<Func> intermediates;
};

Note that ScheduleParam can have other interesting values too, most notably
LoopLevel:

class Example : Generator<Example> {
  // Specify a LoopLevel at which we want intermediate Func(s)
  // to be computed and/or stored.
  ScheduleParam<LoopLevel> intermediate_compute_level{"intermediate_compute_level", "undefined"};
  ScheduleParam<LoopLevel> intermediate_store_level{"intermediate_store_level", "root"};
  Output<Func> output{"output", Float(32), 2};

  void generate() {
    intermediate(x, y) = SomeExpr(x, y);
    output(x, y) = intermediate(x, y);
  }

  void schedule() {
    intermediate
      // If intermediate_compute_level is undefined,
      // default to computing at output's rows
      .compute_at(intermediate_compute_level.defined() ?
                  intermediate_compute_level :
                  LoopLevel(output, y))
      .store_at(intermediate_store_level);
  }

  Func intermediate;
  Var x, y;
};

Note that ScheduleParam<LoopLevel> can default to "root", "inline", or
"undefined"; all other values (e.g. Func-and-Var) must be specified in actual
code. (It is explicitly not possible to specify LoopLevel(Func, Var) by name,
e.g. "func.var"; although Halide uses such a convention internally, it is not
currently possible to guarantee unique Func names across an arbitrary set of Generators.)

Note that it is an error to use an undefined LoopLevel for scheduling.
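The by-name rule can be sketched as a small parser that accepts only the three named values. This is a standalone illustration under stated assumptions: `NamedLevel`, `parse_named_level`, and `usable_for_scheduling` are hypothetical and do not correspond to Halide's actual LoopLevel API.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical model of the three LoopLevel values that can be named by string.
enum class NamedLevel { Root, Inline, Undefined };

// Accepts only "root", "inline", or "undefined". Anything else, including a
// "func.var"-style name, is rejected and must be constructed in code instead.
inline NamedLevel parse_named_level(const std::string &s) {
    if (s == "root") return NamedLevel::Root;
    if (s == "inline") return NamedLevel::Inline;
    if (s == "undefined") return NamedLevel::Undefined;
    throw std::invalid_argument("LoopLevel cannot be specified by name: " + s);
}

// Scheduling with an undefined level is an error, so callers check first.
inline bool usable_for_scheduling(NamedLevel level) {
    return level != NamedLevel::Undefined;
}
```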

Generator Stubs

Let's start with an example of usage, then work backwards to explain what's
going on. Say we have an RGB-to-YCbCr component we want to re-use:

class RgbToYCbCr : public Generator<RgbToYCbCr> {
  Input<Func> input{"input", Float(32), 3};
  Output<Func> output{"output", Float(32), 3};
  void generate() { ... conversion code here ... }
  void schedule() { ... scheduling code here ... }
};
RegisterGenerator<RgbToYCbCr> register_me{"rgb_to_ycbcr"};

GenGen now can produce a "Func-like" stub class around a generator, which (by convention)
is emitted in a file with the extension ".stub.h". It looks something like:

/path/to/rgb_to_ycbcr.stub.h:

  // MACHINE-GENERATED
  struct RgbToYCbCr {
    struct ScheduleParams { ... };
    struct GeneratorParams { ... };

    // ctor, with required inputs, and (optional) GeneratorParams.
    RgbToYCbCr(Context* context,
               // All the Input<>s declared in the Generator are listed here,
               // as either Func or Expr
               Func input,
               const GeneratorParams& = {}) { ... }

    // Output(s)
    Func output;

    // Overloads for first output
    operator Func() const { return output; }
    Expr operator()(Expr x, Expr y, Expr z) const  { return output(x, y, z); }
    Expr operator()(std::vector<Expr> args) const  { return output(args); }
    Expr operator()(std::vector<Var> args) const  { return output(args); }

    void schedule(const ScheduleParams &params = {});
  };

Note that this is a "header-only" class; all methods are inlined (or
template-multilinked, etc) so there is no associated .cpp to incorporate. Also
note that this is a "by-value", internally handle-based class, like most other
types in Halide (e.g. Func, Expr, etc).

We'd consume this downstream like so:

#include "/path/to/rgb_to_ycbcr.stub.h"

class AwesomeFilter : public Generator<AwesomeFilter> {
 public:
  Input<Func> input{"input", Float(32), 3};
  Output<Func> output{"output", Float(32), 3};

  void generate() {
    // Snap image into buckets while still in RGB.
    quantized(x, y, c) = Quantize(input(x, y, c));

    // Convert to YCbCr.
    rgb_to_ycbcr = RgbToYCbCr(this, quantized);

    // Do something awesome with it. Note that rgb_to_ycbcr autoconverts to a Func.
    output(x, y, c) = SomethingAwesome(rgb_to_ycbcr(x, y, c));
  }
  void schedule() {
    // explicitly schedule the intermediate Funcs we used
    // (including any reusable Generators).
    quantized
      .vectorize(x, natural_vector_size<float>())
      .compute_at(rgb_to_ycbcr, y);
    rgb_to_ycbcr
      .vectorize(x, natural_vector_size<float>())
      .compute_at(output, y);

    // *Also* call the schedule method for all reusable Generators we used,
    // so that they can schedule their own intermediate results as needed.
    // (Note that we may have to pass them appropriate values for ScheduleParam,
    // which vary from Generator to Generator; since RgbToYCbCr has none,
    // we don't need to pass any.)
    rgb_to_ycbcr.schedule();
  }

 private:
  Var x, y, c;
  Func quantized;
  RgbToYCbCr rgb_to_ycbcr;

  Expr Quantize(Expr e) { ... }
  Expr SomethingAwesome(Expr e) { ... }
};

It's worth pointing out that all inputs to the subcomponent must be explicitly
provided when the subcomponent is created (as arguments to its ctor); the caller
is responsible for providing these. (There is no concept of automatic input
forwarding from the caller to a subcomponent.)

What if RgbToYCbCr has array inputs or outputs? For instance:

class RgbToYCbCrMulti : public Generator<RgbToYCbCrMulti> {
  Input<Func[3]> inputs{"inputs", Float(32), 3};
  Input<float[3]> coefficients{"coefficients", 1.f};
  Output<Func[3]> outputs{"outputs", Float(32), 3};
  ...
};

In that case, the generated RgbToYCbCrMulti class requires vector-of-Func (or
vector-of-Expr) for inputs, and provides vector-of-Func as output members:

struct RgbToYCbCrMulti {
    RgbToYCbCrMulti(Context* context,
               const std::vector<Func>& inputs,
               const std::vector<Expr>& coefficients,
               const GeneratorParams& = GeneratorParams()) { ... }

    ...

    std::vector<Func> outputs;
};

What if RgbToYCbCr has multiple outputs? For instance:

class RgbToYCbCrMulti : public Generator<RgbToYCbCrMulti> {
  Input<Func> input{"input", Float(32), 3};
  Output<Func> output{"output", Float(32), 3};
  Output<Func> mask{"mask", UInt(8), 2};
  Output<float> score{"score"};
  ...
};

In that case, the generated RgbToYCbCrMulti class has all outputs as struct
members, with names that match the declared names in the Generator:

struct RgbToYCbCrMulti {
    ...
    Func output;
    Func mask;
    Func score;
};

Note that scalar outputs are still represented as (zero-dimensional) functions,
for consistency. (Also note that "output" isn't a magic name; it just happens to
be the name of the first output of this Generator.)

Note also that the first output is always represented both in an "is-a"
relationship and a "has-a" relationship: RgbToYCbCrMulti overloads the necessary
operators so that accessing it as a Func is the same as accessing its "output"
field, i.e.:

struct RgbToYCbCrMulti {
    ...
    Func output;

    operator Func() const { return output; }
    Expr operator()(Expr x, Expr y, Expr z) const  { return output(x, y, z); }
    Expr operator()(std::vector<Expr> args) const  { return output(args); }
    Expr operator()(std::vector<Var> args) const  { return output(args); }
    ...
};

This is (admittedly) redundant, but is deliberate: it allows convenience for the
most common case (a single output), but also orthogonality in the multi-output
case.
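The "is-a" plus "has-a" arrangement can be shown in miniature without Halide. In this sketch `ToyFunc`, `ToyStub`, and `make_toy_stub` are hypothetical stand-ins (a `ToyFunc` simply wraps a function of two ints); the point is only that the first output is reachable three equivalent ways.

```cpp
#include <functional>

// Hypothetical stand-in for Halide::Func: wraps a function of two ints.
struct ToyFunc {
    std::function<int(int, int)> fn;
    int operator()(int x, int y) const { return fn(x, y); }
};

// Hypothetical stub with two outputs. The first is reachable both as a member
// ("has-a") and through conversion and call operators ("is-a").
struct ToyStub {
    ToyFunc output;
    ToyFunc mask;

    operator ToyFunc() const { return output; }
    int operator()(int x, int y) const { return output(x, y); }
};

inline ToyStub make_toy_stub() {
    ToyStub s;
    s.output.fn = [](int x, int y) { return x + 10 * y; };
    s.mask.fn = [](int x, int y) { return (x + y) % 2; };
    return s;
}
```

Given `ToyStub s = make_toy_stub();`, the expressions `s(3, 2)`, `s.output(3, 2)`, and `ToyFunc(s)(3, 2)` all evaluate the first output, while `s.mask` must always be named explicitly.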

The consumer might use this like so:

#include "/path/to/rgb_to_ycbcr_multi.stub.h"

class AwesomeFilter : public Generator<AwesomeFilter> {
  ...
  void generate() {
    rgb_to_ycbcr_multi = RgbToYCbCrMulti(this, input);
    output(x, y, c) = SomethingAwesome(rgb_to_ycbcr_multi.output(x, y, c),
                                       rgb_to_ycbcr_multi.mask(x, y),
                                       rgb_to_ycbcr_multi.score());
  }
  void schedule() {
    rgb_to_ycbcr_multi.output
      .vectorize(x, natural_vector_size<float>())
      .compute_at(output, y);
    rgb_to_ycbcr_multi.mask
      .vectorize(x, natural_vector_size<float>())
      .compute_at(output, y);
    rgb_to_ycbcr_multi.score
      .compute_root();
    // Don't forget to call the schedule() function.
    rgb_to_ycbcr_multi.schedule();
  }
};

What if there were GeneratorParam we wanted to set in RgbToYCbCr, to
configure code generation? In that case, we'd pass a value for the optional
GeneratorParams argument when calling its constructor. Given:

class RgbToYCbCr : public Generator<RgbToYCbCr> {
  GeneratorParam<bool> fast_but_less_accurate{"fast_but_less_accurate", false};
  ...
};

This would produce a different (generated) definition of
GeneratorParams, with a field for each GeneratorParam, initialized
to the proper default:

struct GeneratorParams {
  Halide::Type input_type{UInt(8)};
  bool fast_but_less_accurate{false};
};

We could then fill this in manually:

class AwesomeFilter : public Generator<AwesomeFilter> {
  void generate() {
    ...
    RgbToYCbCr::GeneratorParams generator_params;
    generator_params.input_type = Float(32);
    generator_params.fast_but_less_accurate = true;
    rgb_to_ycbcr = RgbToYCbCr(this, input, generator_params);
    ...
  }
};

Alternately, if we know the types at C++ compilation time, we can use a templated
construction method that is terser:

class AwesomeFilter : public Generator<AwesomeFilter> {
  void generate() {
    ...
    rgb_to_ycbcr = RgbToYCbCr::make<float, true>(this, input);
    ...
  }
};
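The relationship between the struct form and the templated form can be sketched as follows: the template arguments are simply forwarded into the same params struct. This is a standalone illustration, not the generated stub code; `ToyRgbToYCbCr` and `toy_type_name` are hypothetical, and type names are plain strings standing in for Halide::Type.

```cpp
#include <cstdint>
#include <string>

// Hypothetical mapping from a C++ type to a type name, standing in for a
// real type-to-Halide::Type conversion.
template <typename T> struct toy_type_name;
template <> struct toy_type_name<float> {
    static const char *get() { return "float32"; }
};
template <> struct toy_type_name<std::uint8_t> {
    static const char *get() { return "uint8"; }
};

// Hypothetical stub with the same two params as the example above.
struct ToyRgbToYCbCr {
    struct GeneratorParams {
        std::string input_type;
        bool fast_but_less_accurate;
        // Defaults mirror the declared defaults of the generator.
        GeneratorParams() : input_type("uint8"), fast_but_less_accurate(false) {}
    };

    GeneratorParams params;

    // Runtime form: any subset of fields may be overridden before the call.
    explicit ToyRgbToYCbCr(const GeneratorParams &p = GeneratorParams()) : params(p) {}

    // Compile-time form: template arguments are forwarded into the struct.
    template <typename InputType, bool Fast>
    static ToyRgbToYCbCr make() {
        GeneratorParams p;
        p.input_type = toy_type_name<InputType>::get();
        p.fast_but_less_accurate = Fast;
        return ToyRgbToYCbCr(p);
    }
};
```

The templated form is terser but requires every value to be a C++ compile-time constant; the struct form lets a caller keep most defaults and compute the rest at generation time.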

What if there are ScheduleParam in RgbToYCbCr?

class RgbToYCbCr : public Generator<RgbToYCbCr> {
  ScheduleParam<LoopLevel> level{"level"};
  ScheduleParam<bool> vectorize{"vectorize"};

  void generate() {
    intermediate(x, y) = SomeExpr(x, y);
    output(x, y) = intermediate(x, y);
  }

  void schedule() {
    intermediate.compute_at(level);
    if (vectorize) intermediate.vectorize(x, natural_vector_size<float>());
  }

  Var x, y;
  Func intermediate;
};

In that case, the generated stub code would have a different declaration for ScheduleParams:

struct ScheduleParams {
  LoopLevel level{"undefined"};
  bool vectorize{false};
};

And we might call it like so:

class AwesomeFilter : public Generator<AwesomeFilter> {
  ...
  void schedule() {
    rgb_to_ycbcr
      .vectorize(x, natural_vector_size<float>())
      .compute_at(output, y);

    rgb_to_ycbcr.schedule({
      // We want any intermediate products also at compute_at(output, y)
      LoopLevel(output, y),
      // vectorization: yes please
      true
    });
  }
  ...
};

steven-johnson · 2016-09-29T01:05:08Z

No comments?

abadams · 2016-09-29T02:32:24Z

Haven't looked closely at the code yet, but I have a few half-formed thoughts on some parts of the design. Sorry if this comes out a little incoherent - I am sleep-deprived and have a cold.

Do we want to distinguish between generator params and params on the call side? They could all just be serialized into one argument list. The distinction is helpful inside the component, because it marks things that have a value at compile-time, but outside the component I'm not sure if the distinction is important enough to warrant syntactic separation like that. The call signature will just have a float/int in that spot rather than an Expr. Passing them as a struct seems weird when the other params are passed as an argument list, and making the generator params template parameters seems a bit wonky too, because it needlessly requires them to be c++ compile-time constants, which breaks when you want to pass your GeneratorParams down into subcomponents. Another way to separate them out would be: some_component(generator param args)(param args). I.e. constructor args vs operator() args, but that gets weird when there are no generator params.
The context() thing is ugly and opaque, but I can't think of a good way around it. Is it just the target? If so we could just pass the target. Another option is passing this (i.e. the calling component).
It looks like the CRTP base class just requires the syntax derived.schedule() to be valid, rather than requiring that the derived class has a void schedule() method. schedule could therefore actually be a callable object created by the generate method, which means it could be a wrapped-up lambda that captures local Funcs, which means you wouldn't have to declare your Funcs at class scope... I think @dsharletg's trick of expressing scheduling inside a lambda that captures Funcs by value is a viable idiom in this framework. Not sure if that's good, bad, or just a curiosity.
People using a component are going to want to be able to jump to the definition of the thing they're calling and read the docs for it. This means the stubs need to be checked-in and either commented (e.g. with a comment supplied by a const char *docs member of the generator), or they need to redirect people to the original generator source in some reasonable way.

steven-johnson · 2016-09-29T17:23:57Z

Do we want to distinguish between generator params and params on the call side?

One legitimate reason to keep the distinction is that GeneratorParams will often have useful default values that you want to keep (so you only want to specify some of them), while Inputs must always be specified. This is likely to be even more pervasive going forward (as we use this approach to write more-flexible, more-reusable Generator libraries); we don't want to end up with (say) a Blur component that requires recapitulation of a half-dozen extra arguments.

Passing them as a struct seems weird when the other params are passed as an argument list

Yeah, it's suboptimal; if C++ had a named-parameter-with-default-values mode that would be the thing to use here, but, alas, it doesn't.

making the generator params template parameters seems a bit wonky too

The intent is that you'd only use this form when the values are compile-time constants. (I kinda like having this as an optional syntax as IMHO it reinforces the idea that certain params are "compile-time" vs runtime.)

Another way to separate them out would be: some_component(generator param args)(param args).

IMHO the double-call-operator idiom is rare for a good reason; it looks weird and jars the reader. I'd prefer to avoid it.

The context() thing is ugly and opaque

The actual code already accepts "this"; this documentation is out of date (from an earlier rev of the code) and I neglected to update. Willfix.

schedule could therefore actually be a callable object created by the generate method

This is true, and I can't think of a reason offhand to prevent such an approach from working (other than "that's not how I'd code it", which is a terrible reason). Assuming we can't think of a reason this should be prevented, we should document that it's supported and add tests to verify it works. That said, I think that the explicit schedule() method approach makes for more-readable code and should be the approach that we document as recommended.

People using a component are going to want to be able to jump to the definition
of the thing they're calling and read the docs for it.

Agreed...

This means the stubs need to be checked-in

...Disagree strongly. Checking in machine-generated code is pretty much always a bad idea.

and either commented (e.g. with a comment supplied by a const char *docs member of the generator)

Agreed 100%, adding documentation is critical and is missing from this PR. (I'd prefer to add it in a subsequent PR, however.)

or they need to redirect people to the original generator source in some reasonable way.

Disagree: IMHO one of the desirable properties of Stubs is to add a separation of interface vs implementation to a Generator that's intended as a reusable component; if the Stub doesn't have enough information in its API + Documentation to be useful, that's a failure of this design.

dsharletg · 2016-09-30T15:58:59Z

I also haven't looked at the code yet, just read through the description.

Can you declare a Func Input that produces a Tuple the same way you can declare Output tuples?

The template method of passing generator params to a make method will not work for floats (can't be template parameters) or anything other than integral types I believe. That will be annoying, might be better to just enable syntax like:

rgb_to_ycbcr = RgbToYCbCr::make(this, input, {type_of<float>(), true});

Maybe that already works?

In the examples, the outer generator (AwesomeFilter) schedules the last step of the inner generator (RgbToYCbCr) by vectorizing it. In my experience, the inner generator often wants to do something interesting in the schedule that the outer component shouldn't have to know about. For example, if the inner generator is a demosaic, then it is quite likely that it will want its output to be scheduled with at least the following:

demosaic
    .tile(x, y, xo, yo, x, y, vector_size * 2, 2)
    .vectorize(x)
    .unroll(y);

This is not something that is easy to communicate in the design. If this is left up to the calling generator, then I don't see any way to get this other than to specify in the documentation of the generator that it wants a particular property (e.g. unrolling in x and y by 2 to simplify some selects).

In my code, I use the lambda strategy mentioned above, and I let the inner code (not a generator, just a function) do the "loop" scheduling (some splits, vectorizing, unrolling, etc.) and have the outer code do the "locality" scheduling. However, the split isn't perfect. Some splits are relevant to the inner code, and some splits are relevant to the outer code.

steven-johnson · 2016-09-30T16:45:42Z

Can you declare a Func Input that produces a Tuple the same way you can declare Output tuples?

Not at present, but that could be made to happen with modest work.

The template method of passing generator params to a make method will not work for floats

Correct; we support this via a hacky std::ratio workaround, but as it turns out, float GeneratorParams are almost non-existent (I count exactly two instances inside of Google, both of which could be implemented other ways.)
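For readers unfamiliar with the idea, a float can be passed through a template parameter by encoding it as a std::ratio type. The following is a minimal sketch of that general technique only; it is an assumption about the approach, not the code in this PR, and `ratio_to_float` and `ToyScaler` are hypothetical names.

```cpp
#include <ratio>

// Hypothetical helper: converts a std::ratio type to the float it encodes.
template <typename Ratio>
constexpr float ratio_to_float() {
    return static_cast<float>(Ratio::num) / static_cast<float>(Ratio::den);
}

// Hypothetical component whose float parameter is supplied as a template
// argument by encoding it as a ratio, e.g. std::ratio<1, 4> for 0.25f.
template <typename Gain>
struct ToyScaler {
    static float apply(float v) { return v * ratio_to_float<Gain>(); }
};
```

The obvious cost is that only rational values are expressible, and the call site reads as `std::ratio<1, 4>` rather than `0.25f`.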

rgb_to_ycbcr = RgbToYCbCr::make(this, input, {type_of<float>(), true});
Maybe that already works?

Yes, via the standard ctor.

However, the split isn't perfect. Some splits are relevant to the inner code, and some splits are relevant to the outer code

Fair enough -- I don't think we're going to be able to achieve perfect insulation here. I think these cases are going to need to be handled case-by-case via documentation.

abadams · 2016-09-30T17:37:39Z

I think not having the prototype available at coding-time when calling a generator is problematic. We need to figure out something reasonable for when someone clicks on one of these calls and wants to jump to its declaration to check the docs for it, or if someone expects their IDE to give them hints while typing out a call to it. Checking in the stubs violates the common rule against checking in build artifacts, but I think the alternative is worse.

This issue is already a common complaint when calling AOT-compiled Halide pipelines, but it's less of a big deal because typically the person calling the pipeline is also the person who wrote it, so they know what the arguments are based on the Params and Generator Params.

Regarding GeneratorParams in the call syntax - I think we should take a look at some generators we have and see if the GeneratorParams are more or less likely to be left at their default values relative to the params. Something like a Type always needs to be set. If they're no more or less likely, there's no reason to separate them from params in the call syntax. If they're much less likely to be anything other than the default, then it makes more sense to put them in a struct.

Another option is to put them in the signature in order after all the params, with default values. People writing generators would order the generator params by how likely it is that they'll be non-default, which lets callers specify as many of them as they like:

RgbToYCbCr(this, input, Float(32)) // true left as the default value
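
For illustration, a minimal sketch of the "defaults at the end of the signature" idea in plain C++. Everything here is a hypothetical stand-in (`SketchType`, `SketchFloat`, `rgb_to_ycbcr_sketch`, and the meaning given to the trailing `bool`), not Halide types or the generated Stub API:

```cpp
#include <string>

// Hypothetical stand-ins for illustration only; not Halide types.
struct SketchType {
    int bits;
};
inline SketchType SketchFloat(int bits) { return SketchType{bits}; }

// Regular inputs come first. Generator-param-like settings follow with
// default values, ordered by how often callers are expected to override them.
// The name "full_range" is an assumption made up for this sketch.
inline std::string rgb_to_ycbcr_sketch(const std::string &input,
                                       SketchType type = SketchFloat(32),
                                       bool full_range = true) {
    return input + ":" + std::to_string(type.bits) +
           (full_range ? ":full" : ":limited");
}
```

A caller can then stop at whichever trailing argument they no longer care about: `rgb_to_ycbcr_sketch("in")`, `rgb_to_ycbcr_sketch("in", SketchFloat(16))`, or all three arguments.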

It's also a little weird that Type generator params need to be set at all when they describe the type of input Funcs. What if Input<Func> could accept Funcs of unspecified type, leaving it up to the component to inspect the type and do the right thing? Maybe that's already possible and I'm misunderstanding.

zvookin · 2016-09-30T17:50:57Z

We are planning to automatically generate outputs in addition to the stub,
including documentation (e.g. Doxygen). Possibly also tests with a small
amount of extra specification input. Stubs for other languages are an
obvious thing as well.

Access to the stub prototypes is basically the same as with stub generators
for RPC, etc. They can be built ahead of time and used that way, for
example. I don't think common tool chains make it easy to prepopulate the
code database before compilation, but we may be able to come up with some
way to do so. (Basically all the IDE has to do is run the compilation
dependencies for the generated header file eagerly, but my guess is few do
that. This solution would similarly work for Halide AOT, I believe.)

Doing automatic inference of types from input Funcs is near the top of the
list of things to investigate. I am hopeful it will be fairly easy.

-Z-

steven-johnson · 2016-10-04T19:20:36Z

From offline discussion: it was suggested that Stubs should accept Inputs (aka Params) in struct form, similar to how Generator/ScheduleParams are handled; this allows for nicer code symmetry, while still allowing inline “call-style” declarations via C++11 aggregate initialization syntax. PTAL.
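
A rough sketch of what the struct form looks like at a call site, using C++11 aggregate initialization. The struct names, fields, and `make_sketch` function are hypothetical stand-ins, not the generated Stub API:

```cpp
#include <string>

// Hypothetical stand-ins for illustration only; not the generated Stub API.
struct SketchInputs {
    std::string input_name;
    int width;
};

struct SketchGeneratorParams {
    int bits;
    bool flag;
};

// Inputs and generator params are both passed as plain aggregates.
inline std::string make_sketch(const SketchInputs &in,
                               const SketchGeneratorParams &gp) {
    return in.input_name + "/" + std::to_string(in.width) + "/" +
           std::to_string(gp.bits) + "/" + (gp.flag ? "true" : "false");
}

// "Call-style" use: the braces build each struct in place, so the call
// still reads like an ordinary argument list.
inline std::string example_call_sketch() {
    return make_sketch({"input", 640}, {32, true});
}
```

The structs deliberately have no default member initializers, since in C++11 those would stop them from being aggregates.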

steven-johnson · 2016-10-06T21:01:20Z

Gentle ping for comments on 77a0e97

steven-johnson · 2016-10-18T17:18:54Z

Added revisions to add type-and-dim inference for Func, and size inference for Array (and updated long PR comment description to reflect this). PTAL, I think this is close enough to consider landing.

abadams · 2016-10-18T18:53:29Z

src/Generator.cpp

+std::pair<int64_t, int64_t> rational_approximation(double d) {
+    if (std::isnan(d)) return {0, 0};
+    if (!std::isfinite(d)) return {(d < 0) ? -1 : 1, 0};
+    // TODO: fix this abomination to something more intelligent


This is a fun topic:
https://www.youtube.com/watch?v=CaasbfdJdJg

I'll open a small PR with my solution. It's overkill.

Ouch, yeah, I overlooked that one. Frankly, at this point I'm tempted to say we should just pull the templated "constructors" entirely; I'm not sure they're worthwhile now that we've added type/dim/size inference. Anyone else have opinions?
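
For readers following along, here is a minimal sketch of a continued-fraction rational approximation. It is not the code from this PR or from the follow-up change; the function name, the `max_den` bound, the tolerances, and the handling of large magnitudes are all assumptions made for the sketch. The NaN and infinity cases mirror the two lines quoted in the diff above.

```cpp
#include <cmath>
#include <cstdint>
#include <utility>

// Hypothetical sketch: approximate d by num/den using the convergents of its
// continued-fraction expansion, keeping den <= max_den (assumed >= 1).
std::pair<std::int64_t, std::int64_t> rational_approximation_sketch(
    double d, std::int64_t max_den = 1000000) {
    if (std::isnan(d)) return {0, 0};
    if (!std::isfinite(d)) return {(d < 0) ? -1 : 1, 0};

    const bool negative = d < 0;
    const double target = std::fabs(d);
    // Assumption of this sketch: large magnitudes are rounded (and saturated)
    // so that the integer arithmetic below cannot overflow.
    if (target > 1e9) {
        const auto n = static_cast<std::int64_t>(
            std::llround(std::fmin(target, 9e18)));
        return {negative ? -n : n, 1};
    }

    // Convergents h/k follow h_n = a_n * h_{n-1} + h_{n-2} (same for k).
    std::int64_t h_prev = 0, h = 1;
    std::int64_t k_prev = 1, k = 0;
    double x = target;
    for (int i = 0; i < 64; i++) {
        const double a_f = std::floor(x);
        // Stop before the denominator would exceed the bound.
        if (a_f * static_cast<double>(k) + static_cast<double>(k_prev) >
            static_cast<double>(max_den)) break;
        const auto a = static_cast<std::int64_t>(a_f);
        const std::int64_t h_next = a * h + h_prev;
        const std::int64_t k_next = a * k + k_prev;
        h_prev = h; h = h_next;
        k_prev = k; k = k_next;
        // Stop once the fraction reproduces the input exactly.
        if (static_cast<double>(h) / static_cast<double>(k) == target) break;
        const double frac = x - a_f;
        if (frac < 1e-12) break;
        x = 1.0 / frac;
    }
    return {negative ? -h : h, k};
}
```

With these choices, 0.5 comes back as 1/2 and -0.25 as -1/4; the sign is carried on the numerator so the denominator stays non-negative.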

abadams · 2016-10-18T19:03:52Z

test/generator/stubtest_jittest.cpp

+Halide::Var x, y, c;
+
+template<typename Type>
+Image<Type> MakeImage(int extra) {


nit: make_image here and in the other tests

abadams · 2016-10-18T19:04:44Z

test/generator/stubtest_jittest.cpp

+
+    // We statically know the types we want, so the templated construction method
+    // is most convenient.
+    auto gen = StubTest::make<>(


I'm confused by the comment - no template arguments are provided

Yes, the comment is stale. Willfix.

abadams · 2016-10-18T19:07:38Z

test/generator/stubtest_jittest.cpp

+    // This generator defaults intermediate_level to "undefined", 
+    // so we *must* specify something for it (else we'll crater at
+    // Halide compile time). We'll use this:
+    sp.intermediate_level = LoopLevel(gen.f, Var("y"));


Passing in the variable name by string seems like an anti-pattern. Is there another way to do this?

Right now: no, not really. The alternative which we've discussed is to put the public-facing Vars into the Stub as part of the contract, e.g.

sp.intermediate_level = LoopLevel(gen.f, gen.y);

but since we currently guarantee/require that Vars always match by name, I haven't done so.

Updating this: as it turns out, the addition of type/dim/size inference to Stubs makes it hard to infer the output Vars... To infer the output Vars for the Stub, we have to call generate() so that all the Output values are valid. But if any Input<> or Output<> values have unspecified values, we can't do that... and we can't just guess at a hopefully-valid value since there may be constraints in the code we can't predict.

I'm pushing a change with a revision to use Func::args() instead:

sp.intermediate_level = LoopLevel(gen.f, gen.f.args().at(1));

(Note that I'm using args().at() rather than args()[] since at() is bounds-checked: it guarantees an exception (or halt) if the index is out of range, whereas an out-of-range operator[] on a std::vector is undefined behavior rather than a checked error.)
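
A small stand-alone sketch of the bounds-checked lookup, using a `std::vector<std::string>` as a stand-in for the vector of Vars returned by `Func::args()`. The helper name and the placeholder return string are made up for illustration:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for a Func's argument list: plain strings instead of Halide::Var.
inline std::string arg_name_or_error(const std::vector<std::string> &args,
                                     std::size_t index) {
    try {
        // at() throws std::out_of_range when index >= args.size().
        return args.at(index);
    } catch (const std::out_of_range &) {
        return "<no such argument>";
    }
}
```

In the Stub idiom above nothing catches the exception, so a bad index stops the build immediately; the catch here exists only so the sketch can report the failure as a value.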

After an offline discussion with zalman@, we'd like to hold off on trying to gather the used-by-output-vars into the Stub, mainly because (1) it's hard, and (2) we'd like to actually try out the idiom above in zalman@'s code to see if it feels good enough.

AFAIK this is the last nontrivial stumbling block; LMK your thoughts.

abadams · 2016-10-18T19:10:14Z

src/Generator.h

+// Use a little variadic macro hacking to allow two or three arguments.
+// This is suboptimal, but allows us more flexibility to mutate registration in
+// the future with less impact on existing code.
+#define _HALIDE_REGISTER_GENERATOR2(GEN_CLASS_NAME, GEN_REGISTRY_NAME) \


If we're going to use macro magic, couldn't we also encapsulate the "auto register_me = " part?

The embarrassing answer: I can't figure out a way to guarantee a unique name for the variable when multiple registrations occur in the same source file.

Scratch that comment: an approach just occurred to me and I like it, so I've pushed it out there.

string-pasting the class name looks good. __COUNTER__ is also handy for unique names
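
To make the two options concrete, here is a sketch of both ways of generating a unique variable name for a registration object. The registry, the `Registrar` type, and the macro names are hypothetical and are not Halide's registration machinery. `__COUNTER__` is not part of standard C++, but MSVC, GCC, and Clang all provide it.

```cpp
#include <string>
#include <vector>

// Hypothetical registry for illustration only.
inline std::vector<std::string> &sketch_registry() {
    static std::vector<std::string> names;
    return names;
}

struct Registrar {
    explicit Registrar(const char *name) { sketch_registry().push_back(name); }
};

// Two-level expansion so that __COUNTER__ is expanded before token pasting.
#define SKETCH_CONCAT_INNER(a, b) a##b
#define SKETCH_CONCAT(a, b) SKETCH_CONCAT_INNER(a, b)

// Variant 1: paste the class name into the variable name.
#define SKETCH_REGISTER_BY_NAME(CLASS_NAME, REGISTRY_NAME) \
    static Registrar sketch_register_##CLASS_NAME{REGISTRY_NAME};

// Variant 2: paste a per-translation-unit counter into the variable name.
#define SKETCH_REGISTER_COUNTED(REGISTRY_NAME) \
    static Registrar SKETCH_CONCAT(sketch_register_, __COUNTER__){REGISTRY_NAME};

struct Blur {};
struct Sharpen {};

SKETCH_REGISTER_BY_NAME(Blur, "blur")
SKETCH_REGISTER_BY_NAME(Sharpen, "sharpen")
SKETCH_REGISTER_COUNTED("counted_a")
SKETCH_REGISTER_COUNTED("counted_b")
```

Variant 1 breaks if the same class is registered twice in one source file; variant 2 avoids that but relies on the extension.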

Had to google __COUNTER__ -- looks like an MS extension? Is it widely supported?

abadams · 2016-10-18T19:12:41Z

test/generator/stubtest_generator.cpp

+
+}  // namespace
+
+namespace StubNS1 {


Why must it go in a namespace?

Requiring good hygiene up front: I submit that if you are going to use a Stub class, putting it in the global namespace is ~always the wrong thing to do. Thus we require it to be elsewhere.

abadams · 2016-10-18T19:16:22Z

I agree that we should remove them, though I'm sad to not be able to use
the continued fraction thingy :)

steven-johnson · 2016-10-18T21:12:30Z

I pulled & applied your fraction thingy for now: offline conversations with zalman@ indicated he had plans for use of the templated constructors, at least for now, so I'm inclined to keep with the option to remove later if they don't prove useful.

jrk · 2016-10-18T23:38:49Z

src/Generator.cpp

+    static const std::map<std::string, LoopLevel> halide_looplevel_enum_map{
+        {"root", LoopLevel::root()},
+        {"undefined", get_halide_undefined_looplevel()},
+        {"inline", LoopLevel()},


Why are ScheduleParam<LoopLevel> default values specified by string instead of by actual LoopLevel objects? This is one part that felt a little weird to me.

The textual values specified here can only be used in build rules (e.g. Makefiles); since there is no unique textual representation for a Func, there can't be a unique textual representation for an arbitrary LoopLevel. That said: these three possibilities are "special" and arguably worth providing a way to specify at build time.
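
As a rough illustration of that lookup, the sketch below maps the three build-time strings onto an enum and rejects everything else. `SketchLevel` and `parse_level_sketch` are hypothetical stand-ins; the real map in the diff stores LoopLevel objects.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the three "special" LoopLevel values.
enum class SketchLevel { Root, Undefined, Inline };

// Returns true and writes *out if the text names one of the special levels.
// Any other string (for example, something meant to name a Func) is rejected,
// because it has no unique textual representation.
inline bool parse_level_sketch(const std::string &text, SketchLevel *out) {
    static const std::map<std::string, SketchLevel> table{
        {"root", SketchLevel::Root},
        {"undefined", SketchLevel::Undefined},
        {"inline", SketchLevel::Inline},
    };
    const auto it = table.find(text);
    if (it == table.end()) return false;
    *out = it->second;
    return true;
}
```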

jrk · 2016-10-19T00:14:18Z

General comment, now that I've finally read through most of this: I like the direction this is going. Andrew and Dillon have much more practical experience using this stuff lately, so their specific concerns are clearly relevant, but at a high level, I like it!

abadams · 2016-10-19T00:35:37Z

I believe clang and gcc also support it.

steven-johnson · 2016-10-19T21:06:59Z

re: __COUNTER__, that level of preprocessor-fu makes me a little nervous; I'm inclined to avoid it unless/until we have clear evidence that the current approach is not good enough.

zvookin · 2016-10-21T19:42:05Z

Makefile

@@ -907,16 +912,18 @@ $(FILTERS_DIR)/pyramid.a: $(BIN_DIR)/pyramid.generator
 	@-mkdir -p $(TMP_DIR)
 	cd $(TMP_DIR); $(CURDIR)/$< -f pyramid -o $(CURDIR)/$(FILTERS_DIR) target=$(HL_TARGET) levels=10

+MDTEST_GEN_ARGS=input.type=uint8 input.dim=3 output.type=float32,float32 output.dim=3 input_not_nod.type=uint8 input_not_nod.dim=3 input_nod.dim=3 input_not.type=uint8 array_input.size=2 array_i8.size=2 array_i16.size=2 array_i32.size=2 array_h.size=2 array_outputs.size=2


"MDTEST" does not feel like a productive abbreviation here.

zvookin · 2016-10-21T19:43:49Z

src/Generator.cpp

+    }
+
+    /** Emit spaces according to the current indentation level */
+    std::string ind();


I don't think the savings of three characters here is worth the loss of clarity.

zvookin · 2016-10-21T19:53:11Z

src/Generator.h

+    }
+};
+
+class GIOBase {


Can we have a short comment explaining the purpose of this class? (Or perhaps the whole class hierarchy around this.)

zvookin · 2016-10-21T19:55:56Z

src/Generator.h

@@ -533,14 +1370,26 @@ class NamesInterface {
    static inline Type UInt(int bits, int lanes = 1) { return Halide::UInt(bits, lanes); }
 };

+class JITGeneratorContext : public GeneratorContext {


I think the Generator context classes get used as part of the API, so they should have Doxygen comments.

zvookin · 2016-10-21T19:59:12Z

Modulo a few requests for comments and discussion re: updating the description to reflect recent changes, LGTM.

steven-johnson · 2016-10-24T23:53:49Z

Revised & expanded version of the PR comment above added to the wiki at https://github.com/halide/Halide/wiki/Generator-Enhancements

steven-johnson · 2016-10-26T00:02:56Z

Updated (rewrote) doxygen comment for Generator.

I have an LGTM from zalman@, so if there are no objections, I'm going to land this once the travis checks pass again.

steven-johnson added 5 commits September 23, 2016 14:52

Merge branch 'master' into generator_revisions

c7ab93f

Merge branch 'master' into generator_revisions

20663db

Merge branch 'master' into generator_revisions

3c85603

Merge branch 'master' into generator_revisions

fb26100

Generator Enhancements

1810833

steven-johnson added 3 commits October 3, 2016 13:04

Merge branch 'master' into generator_revisions

4aaa3d5

# Conflicts: # src/Generator.cpp

Merge branch 'master' into generator_revisions

4c9dd45

Merge branch 'master' into generator_revisions

7814575

steven-johnson added 3 commits October 6, 2016 14:05

Merge branch 'master' into generator_revisions

485ce28

Merge branch 'master' into generator_revisions

ef0899d

Add inference for type-and-dim for Func, and size for Array

acd10e3

abadams reviewed Oct 18, 2016

Update Generator.cpp

b590b6c

Fancier method for finding a good rational

abadams reviewed Oct 18, 2016

steven-johnson added 2 commits October 18, 2016 13:54

rename make_image

7f28b71

rework HALIDE_REGISTER_GENERATOR macro to avoid "auto register_me = "

76dc2dc

jrk reviewed Oct 18, 2016

steven-johnson added 5 commits October 19, 2016 14:25

Merge branch 'master' into generator_revisions

35836c6

rational_approximation tweaks

19b1fb8

— add tests — special-case non-finite numbers and negative numbers to get predictable results

Add NamesInterface to GeneratorStub

d585b04

This allows us to remove lots of Halide:: noise from generated code

Omit certain ctors in stubs when there are no GP/SP present

4c009d1

Use args().at() for output Vars

8abacb5

zvookin reviewed Oct 21, 2016

steven-johnson added 6 commits October 24, 2016 13:54

Merge branch 'master' into generator_revisions

8a6a8f7

# Conflicts: # test/CMakeLists.txt

Use *this instead of context() in StubUser

d18d210

Minor nomenclature cleanup

5c6fcbc

Tweak GeneratorContext API

db33afe

— remove Generator::context() — allow Stubs to accept a context by either pointer or ref

Add doc comments, EXPORTS, etc

cecfed4

Add link to https://github.com/halide/Halide/wiki/Generator-Enhancements

a8861d8

steven-johnson added 3 commits October 25, 2016 15:10

Fixes for MSVC issues

70ded22

Merge branch 'master' into generator_revisions

f842a4b

Rewrite Generator doxygen comment

3195766

zvookin merged commit c8de157 into master Oct 26, 2016

jrk deleted the generator_revisions branch April 20, 2017 01:40
