Add support for aggregating transforms #57

Merged: thbar merged 3 commits into master from experimental-buffering-transform-support on Sep 5, 2018

thbar
Owner

@thbar thbar commented Jun 17, 2018

See #53.

As seen in this StackOverflow question and other similar situations, it is helpful to be able to aggregate rows together.

This PR introduces an optional #close method on transforms, which can either:

  • close resources (e.g. a tracing transform could close its output file this way)
  • yield one or more rows (e.g. an aggregating transform can provide the last aggregate row to the rest of the pipeline)
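
To illustrate the first case, here is a minimal sketch of a tracing transform whose #close releases a file handle. TracingTransform and its path: argument are hypothetical names made up for this example, not part of Kiba, and the last lines call the methods by hand to mimic what a runner would do.

```ruby
require "tempfile"

# Hypothetical tracing transform (not part of Kiba): logs each row
# to a file and releases the file handle in #close.
class TracingTransform
  def initialize(path:)
    @file = File.open(path, "w")
  end

  def process(row)
    @file.puts(row.inspect)
    row # pass the row through unchanged
  end

  def close
    @file.close
  end
end

# Hand-driven stand-in for a runner: process every row, then close.
trace = Tempfile.new("trace")
transform = TracingTransform.new(path: trace.path)
rows = [{ id: 1 }, { id: 2 }]
passed = rows.map { |row| transform.process(row) }
transform.close
traced_lines = File.readlines(trace.path, chomp: true)
```

Since this #close takes no block and yields nothing, it only performs cleanup.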

Because of the yielding support, you must use the new StreamingRunner for this to work:

extend Kiba::DSLExtensions::Config
config :kiba, runner: Kiba::StreamingRunner

The bundled AggregateTransform provides an example of use:

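class AggregateTransform
  def initialize(aggregate_size:)
    @aggregate_size = aggregate_size
    # Initialize here so #close is safe even if #process was never called.
    @buffer = []
  end

  # Buffers rows and yields them downstream in groups of @aggregate_size.
  def process(row)
    @buffer << row
    if @buffer.size == @aggregate_size
      yield @buffer
      @buffer = []
    end
    # Return nil so the runner does not pass the individual row along.
    nil
  end

  # Called once after the last row: flush any partial group.
  def close
    yield @buffer unless @buffer.empty?
  end
end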
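
To make the flow concrete, the sketch below drives the transform by hand. The run_transform helper is a hypothetical stand-in written for this example, not Kiba's StreamingRunner; it only assumes that a runner passes a block to #process and to the optional #close and forwards whatever they yield. A condensed copy of the class is included so the snippet runs on its own.

```ruby
# Condensed copy of the transform so this snippet is self-contained.
class AggregateTransform
  def initialize(aggregate_size:)
    @aggregate_size = aggregate_size
    @buffer = []
  end

  def process(row)
    @buffer << row
    if @buffer.size == @aggregate_size
      yield @buffer
      @buffer = []
    end
    nil
  end

  def close
    yield @buffer unless @buffer.empty?
  end
end

# Hypothetical stand-in for a streaming runner (an assumption, not Kiba code):
# collects yielded rows, keeps non-nil return values, then calls #close.
def run_transform(rows, transform)
  output = []
  rows.each do |row|
    result = transform.process(row) { |yielded| output << yielded }
    output << result unless result.nil?
  end
  # #close is optional, so only call it when the transform defines it.
  transform.close { |yielded| output << yielded } if transform.respond_to?(:close)
  output
end

batches = run_transform([1, 2, 3, 4, 5], AggregateTransform.new(aggregate_size: 2))
# batches is [[1, 2], [3, 4], [5]]: two full groups, then the partial one from #close.
```

Without the #close call, the final partial group ([5] here) would stay in the buffer and never reach the rest of the pipeline.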

thbar changed the title from "(Experimental) Add support for aggregating transforms (see #53)" to "Add support for aggregating transforms (see #53)" Sep 5, 2018
thbar changed the title from "Add support for aggregating transforms (see #53)" to "Add support for aggregating transforms" Sep 5, 2018
thbar merged commit a2b5573 into master Sep 5, 2018
thbar deleted the experimental-buffering-transform-support branch September 5, 2018 15:12
@ttilberg

Hey @thbar, I've seen some recent activity where you mention Kiba 3 plans. Do you still intend to officially release Kiba 2.5 with this feature, or will it wait for the full 3.0 release? I've been looking forward to using this, but don't want to target unofficial releases. Cheers!

@thbar
Owner Author

thbar commented May 28, 2019

Hi @ttilberg! Thanks for asking!

I'll release aggregating transforms as part of Kiba 2.5.

I'll make an official release shortly!

@thbar
Owner Author

thbar commented May 29, 2019

@ttilberg it's ready - Kiba v2.5.0 is out. Happy coding!
