RFC: Customizable lazy broadcasting with options for pure-Julia fusion and eager evaluation #25377

Closed
wants to merge 62 commits into from

Commits on Jan 7, 2018

  1. Reduce build-time calls to broadcasting machinery

    The main one left is in concatenation, in a line
    
        inds[i] = offsets[i] .+ cat_indices(x, i)
    timholy committed Jan 7, 2018
    c0db74c
  2. Allow test/core.jl to be run from REPL

    If you've already said `using Test`, defining a function named `Test` causes problems.
    timholy committed Jan 7, 2018
    0c6617a
  3. Turn range&number arithmetic operations into broadcast methods

    This is consistent with the deprecation of methods like `[1,2,3] + 1`.
    timholy committed Jan 7, 2018
    98fe8ab
  4. Make lazy dot fusion

    vtjnash authored and timholy committed Jan 7, 2018
    aeba265
  5. 98f6bdc
  6. Integrate lazy broadcast representation into new broadcast machinery

    Among other things, this supports returning AbstractRanges for appropriate inputs.
    
    Fixes #21094, fixes #22053
    timholy committed Jan 7, 2018
    0698edc
  7. 4c02b07
  8. e4d1962

Commits on Jan 9, 2018

  1. 944e069
  2. 69eca0b

Commits on Jan 10, 2018

  1. Update doctests for TupleLLEnd

    and use a slightly more thorough search through nested arguments
    mbauman authored Jan 10, 2018
    3cf994b

Commits on Jan 11, 2018

  1. Fix and test nested scalar broadcasts

    within .-fused expressions that contain custom arrays with custom broadcast styles.
    mbauman committed Jan 11, 2018
    de9e321

Commits on Jan 13, 2018

  1. Merge remote-tracking branch 'origin/master' into teh-jn/lazydotfuse

    Only needed to manually resolve a trivial conflict in NEWS.md; everything else git did automagically.
    mbauman committed Jan 13, 2018
    61bb21f

Commits on Jan 19, 2018

  1. Merge remote-tracking branch 'origin/master' into teh-jn/lazydotfuse

    Conflicts:
    	NEWS.md
    	base/broadcast.jl
    	base/compiler/optimize.jl
    	stdlib/SparseArrays/src/higherorderfns.jl
    mbauman committed Jan 19, 2018
    8dcd8c1
  2. fixup merge

    mbauman committed Jan 19, 2018
    a14ed08

Commits on Jan 20, 2018

  1. Allow construction of instantiated Broadcasted{Nothing} objects

    This comes up when `flatten`-ing a broadcasted object within a "fallback"
    `copyto!` method: `flatten` wants to construct a new Broadcast object and
    copy the instantiated information, but we've already destroyed the `Style`
    information when we deferred dispatch to the destination! So this simply permits
    instantiating `Broadcasted{Nothing}` objects in the sole signature that gets
    called by `flatten`.
    mbauman committed Jan 20, 2018
    1774bdf

Commits on Jan 25, 2018

  1. Replace BitArray piecemeal broadcast...

    with new infrastructure.  Captures many more cases in a very straightforward manner
    mbauman committed Jan 25, 2018
    7381ea4
  2. Fix literal_pow broadcast issue. (#25665)

    ajkeller34 authored and mbauman committed Jan 25, 2018
    25598ea
  3. 99507a2
  4. Structured broadcasts: Support Bidiagonal broadcasts and perform runtime test for zero preserving

    mbauman committed Jan 25, 2018
    2e371e4
  5. 3e42812
  6. 0115326
  7. fixup comment

    mbauman committed Jan 25, 2018
    1de78ac

Commits on Jan 26, 2018

  1. 8e41f2f

Commits on Jan 27, 2018

  1. adaf337

Commits on Jan 29, 2018

  1. Fix Sparse inference; improve allocations

    Things are vastly improved; the majority of allocations still appear to be coming from the repeated construction of the same function.
    mbauman committed Jan 29, 2018
    cf0f8ce
  2. 4f8233f
  3. fixup merge

    mbauman committed Jan 29, 2018
    5749afc

Commits on Jan 30, 2018

  1. f90f5fe

Commits on Apr 4, 2018

  1. 28d5421

Commits on Apr 10, 2018

  1. cfa9caa
  2. 8147932

Commits on Apr 12, 2018

  1. Removing broadcasting from the new optimizer

    to make it bootstrap friendly
    mbauman committed Apr 12, 2018
    90ad8eb
  2. Remove Structured broadcast deferral to DefaultArrayStyle

    We effectively do that in any case with broadcast similar, and it remains type-stable
    mbauman committed Apr 12, 2018
    03287c1
  3. work around SparseArrays inference failure in broadcast!

    and remove some unneeded code
    mbauman committed Apr 12, 2018
    f71db14
  4. Merge remote-tracking branch 'origin/master' into teh-jn/lazydotfuse

    * origin/master:
      A few more #26670 fixes (#26773)
      Revert "deprecate using the value of `.=`. fixes #25954" (#26754)
      change dim arguments for `diff` and `unique` to keyword args (#26776)
      reorder pmap arguments to allow do-block syntax (#26783)
      correct deprecated parametric method syntax (#26789)
      [NewOptimizer] handle new IR nodes correctly in binary format
      [NewOptimizer] support line number emission from new IR format
      fix #26453, require obviously-concrete lower bound for a var to be diagonal (#26567)
      fix #26743, spurious `return` path in try-finally in tail position (#26753)
      Also lift SelectInst addrspaces
    mbauman committed Apr 12, 2018
    e3eede4

Commits on Apr 13, 2018

  1. Decouple Broadcasting API from inference

    Some of the broadcasting API users still lean on inference -- that can be fixed up later -- but this now no longer hand-feeds them the inferred result of the broadcast.
    
    This has the slight downside that a type-unstable broadcast will not fall back to the simpler `copyto!` method as it must incrementally widen instead. I find this a worthwhile tradeoff.
    
    Also simplify instantiation now that we no longer need to worry about the eltype.
    mbauman committed Apr 13, 2018
    0edbd99
  2. 71b830f
  3. b248953

Commits on Apr 18, 2018

  1. Remove broadcast_skip_axes_initialization

    in favor of just overloading `instantiate(::Broadcasted{CustomStyle})`.
    mbauman committed Apr 18, 2018
    964039a

Commits on Apr 19, 2018

  1. 37220d5
  2. Expose simpler axes/getindex methods for Broadcasted objects

    as a nicer internal API. Also accommodate the loss of a broadcast style due to falling back to a `Broadcasted{Nothing}`.
    mbauman committed Apr 19, 2018
    79ce497
  3. Documentation update

    [ci skip]
    mbauman committed Apr 19, 2018
    a6cc656
  4. 98b5e84
  5. 2e9c0f2
  6. Merge remote-tracking branch 'origin/master' into teh-jn/lazydotfuse

    * origin/master: (22 commits)
      separate `isbitstype(::Type)` from `isbits` (#26850)
      bugfix for regex matches ending with non-ASCII (#26831)
      [NewOptimizer] track inbounds state as a per-statement flag
      change default LOAD_PATH and DEPOT_PATH (#26804, fix #25709)
      Change url scheme to https (#26835)
      [NewOptimizer] inlining: Refactor todo object
      inference: enable CodeInfo method_for_inference_limit_heuristics support (#26822)
      [NewOptimizer] Fix _apply elision (#26821)
      add test case from issue #26607, cfunction with no args (#26838)
      add `do` in front-end deparser. fixes #17781 (#26840)
      Preserve CallInst metadata in LateLowerGCFrame pass.
      Improve differences from R documentation (#26810)
      reserve syntax that could be used for computed field types (#18466) (#26816)
      Add support for Atomic{Bool} (Fix #26542). (#26597)
      Remove argument restriction on dims2string and inds2string (#26799) (#26817)
      remove some unnecessary `eltype` methods (#26791)
      optimize: ensure merge_value_ssa doesn't drop PiNodes
      inference: improve tmerge for Conditional and Const
      ensure more iterators stay type-stable
      code loading docs (#26787)
      ...
    mbauman committed Apr 19, 2018
    3870fdf
  7. c1f2eba

Commits on Apr 20, 2018

  1. WIP: maybe don't use indexers?

    this solves the allocations in perf_op_bcast
    mbauman committed Apr 20, 2018
    9afdbbe
  2. 81bd635

Commits on Apr 22, 2018

  1. 82d0a3b
  2. Completely move indexing helpers into wrappers

    The key insight here is that these indexing helpers are an _implementation detail_ of an optimization for a particular argument type within a given broadcast implementation. They are not universal across all Broadcasted wrappers -- which is precisely why some styles had wanted to opt out of them. Now the `_broadcast_getindex` function is solely responsible for indexing into arguments with the broadcasted dimensions properly constrained as appropriate. The `Extruded` type pre-computes the dimensions to constrain, allowing an optimization for types that do not statically know this answer -- by default, all `AbstractArray`s.
    
    This still has a performance regression over master in the reduced example `f(r, x) = r .= x.*x.*x.*x` because it does not currently vectorize on this branch. Not sure why.
    mbauman committed Apr 22, 2018
    fb8234a
  3. Don't recursively initialize the Broadcasted objects

    We only need to store the outer set of axes; we do not need the axes of any of the nested Broadcasted objects once that is known -- all other accesses defer to individual argument axes.
    mbauman committed Apr 22, 2018
    aba2da7
  4. 5f99c2e
  5. Hack around losing Type{T} information in the final tuple...

    that constructs the arguments to call the function. Julia actually knows the value statically, but it doesn't follow the type information through that transient tuple.
    mbauman committed Apr 22, 2018
    52a3202

Commits on Apr 23, 2018

  1. a8a2608
  2. Mitigate some of the performance issues with non-type-stable...

    broadcasting by preprocessing the arguments to potentially wrap them with indexing helpers.
    mbauman committed Apr 23, 2018
    db690e0
  3. broadcast.jl cleanup:

    * Slightly clearer recursion through arg lists in not_nested
    * Move show(::IO, ::Broadcasted) to a more sensible location and have it print its type fully qualified with the `Style` parameter.
    mbauman committed Apr 23, 2018
    a2b9015
  4. c8bb374
  5. Merge remote-tracking branch 'origin/master' into teh-jn/lazydotfuse

    * origin/master: (23 commits)
      fix deprecations of \cdot and \times (#26884)
      Support reshaping custom 0-dimensional arrays (#26870)
      fix some cases of dot syntax lowering (#26878)
      Pkg3: deterministically close the LibGit2 repo in tests (#26883)
      code loading docs: add missing graph edge (#26874)
      add news for #26858 and #26859 [ci skip] (#26869)
      Deprecate using && and || within at-dot expressions (#26792)
      widen `Int8` and `Int16` to `Int` instead of `Int32` (#26859)
      fix #26038, make `isequal` consistent with `hash` for `Ptr` (#26858)
      Deprecate variadic size(A, dim1, dim2, dims...) method (#26862)
      add using Random to example in manual (#26864)
      warn once instead of depwarn since we want to test it
      Revert "reserve syntax that could be used for computed field types (#18466) (#26816)" (#26857)
      Fix compilation on LLVM 6.0
      change promotion behaviour of `cumsum` and `cumsum!` to match `sum`
      [LLVM 6] add patch to diamond if-conversion
      add a precompile command that can be used to precompile all dependencies (#254)
      use registry if no version entry exist in project for developed packages
      make Pkg3 work as a drop in for the old CI scripts
      update registries when adding (#253)
      ...
    mbauman committed Apr 23, 2018
    110a0a5
  6. 6fdb86e
  7. Fix broadcast_similar docstring

    [ci skip]
    mbauman committed Apr 23, 2018
    df51b31
  8. a1d4e7e