Bios-Marcel/gostream

gostream

This repository contains a generic implementation of streams, similar to Java's Stream API. For a list of available functions, check the documentation.

Dude, what are those stream things you are talking about

I am glad you asked! Streams are a small concept that lets you process data in a declarative way. The resulting code is usually easy to understand and can, in theory, be scheduled automatically onto multiple threads or, in the case of Go, goroutines.

Streams offer two kinds of methods: terminating and non-terminating. A terminating method ends the call chain, because it returns some kind of end result. A non-terminating method simply returns the stream itself. A stream cannot be reused; every time you want to use a stream, you have to create a new one.
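To make the distinction concrete, here is a minimal sketch of an eager stream. The names intStream and streamOf are hypothetical and exist only for this illustration; they are not part of gostream's generated API.

```go
package main

import "fmt"

// intStream is a hypothetical, minimal eager stream over ints.
// It is NOT the type generated by gostream; it only illustrates the idea.
type intStream struct {
	data []int
}

// streamOf creates a new stream for the given values (hypothetical helper).
func streamOf(data []int) *intStream {
	return &intStream{data: data}
}

// Filter is non-terminating: it returns the stream so calls can be chained.
func (s *intStream) Filter(keep func(int) bool) *intStream {
	kept := make([]int, 0, len(s.data))
	for _, v := range s.data {
		if keep(v) {
			kept = append(kept, v)
		}
	}
	s.data = kept
	return s
}

// Map is non-terminating as well.
func (s *intStream) Map(f func(int) int) *intStream {
	mapped := make([]int, len(s.data))
	for i, v := range s.data {
		mapped[i] = f(v)
	}
	s.data = mapped
	return s
}

// Collect is terminating: it ends the chain and returns the end result.
func (s *intStream) Collect() []int {
	return s.data
}

func main() {
	result := streamOf([]int{1, 2, 3, 4}).
		Filter(func(v int) bool { return v%2 == 0 }).
		Map(func(v int) int { return v * 2 }).
		Collect()
	fmt.Println(result) // prints [4 8]
}
```

Because each non-terminating call changes the stream's internal state, reusing the same stream value for a second pipeline would start from already-processed data, which is one reason to create a fresh stream each time.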

Limitations

Since this library was written before Go supported generics (type parameters arrived in Go 1.18), this API isn't as powerful as it could be in a language like Java or Kotlin. In Java, the element type of a Stream can change after every call. This isn't possible with gostream, since each stream is generated for one specific type. If you wanted a truly generic stream API, you'd have to use interface{} everywhere and type-assert values back to their concrete types, which would reduce compile-time safety.

I might eventually create an interface{} version of this library, which would be more powerful than the genny-powered version.
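The trade-off can be sketched roughly as follows. The anyStream type and the lengths helper are hypothetical and only illustrate what an interface{}-based stream might look like; no such version exists in this repository.

```go
package main

import "fmt"

// anyStream is a hypothetical interface{}-based stream, for illustration only.
type anyStream struct {
	data []interface{}
}

// Map may change the element type, because everything is an interface{}.
func (s anyStream) Map(f func(interface{}) interface{}) anyStream {
	out := make([]interface{}, len(s.data))
	for i, v := range s.data {
		out[i] = f(v)
	}
	return anyStream{data: out}
}

// Collect returns the untyped results.
func (s anyStream) Collect() []interface{} {
	return s.data
}

// lengths maps strings to their lengths, so the element type changes from
// string to int inside the pipeline.
func lengths(words []string) []int {
	in := make([]interface{}, len(words))
	for i, w := range words {
		in[i] = w
	}

	mapped := anyStream{data: in}.
		Map(func(v interface{}) interface{} {
			// This assertion is only checked at run time: it panics if v
			// is not a string, and the compiler cannot warn about that.
			return len(v.(string))
		}).
		Collect()

	out := make([]int, len(mapped))
	for i, v := range mapped {
		out[i] = v.(int) // another run-time check
	}
	return out
}

func main() {
	fmt.Println(lengths([]string{"go", "stream"})) // prints [2 6]
}
```

The pipeline can change its element type, but every type assertion is a place where a mistake surfaces as a panic at run time instead of a compile error.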

Usage

This library is implemented using genny, which means that each implementation you need has to be generated before you compile your actual code.

  1. You'll have to install genny in order to generate the necessary source
    go get -u github.com/cheekybits/genny
    go install github.com/cheekybits/genny
  2. Pull the repository
    go get github.com/Bios-Marcel/gostream
  3. Generate the versions you need
    genny -in="$GOPATH/src/github.com/Bios-Marcel/gostream/stream.go" \
        -out="folder/outputfile.go" -pkg="newpackagename" \
        gen "GenericStreamEntity=desiredtypes"
  4. Compile your code

Examples

For a standalone example, check out the repository gostreamexample.

Get all even numbers, multiply them by two and sum up the results.

For this example I am just going to use an eager stream.

data := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
summedEvens := gostream.
    StreamIntEager(data).
    Filter(func(value int) bool { return value%2 == 0 }).
    Map(func(value int) int { return value * 2 }).
    Reduce(func(one, two int) int { return one + two })

fmt.Println(summedEvens)

Laziness is more interesting, however. For example, look at this code:

testData := []int{1, 2, 3, 4, 5}
firstValueValid := gostream.
    StreamIntLazy(testData).
    Filter(func(value int) bool { return value != 2 }).
    Map(func(value int) int { return value * 4 }).
    FindFirst()

The functions passed to Filter(...) and Map(...) will each be executed exactly once: the first element, 1, passes the filter and is mapped, and FindFirst() stops as soon as it has a value.

In an eager stream, the function passed to Filter(...) would execute five times and the function passed to Map(...) would execute four times.
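The call counts for the lazy case can be checked with a small hand-rolled lazy stream. The lazyStream type, the stage type and countCalls are hypothetical names for this sketch; the generated gostream code may be structured quite differently.

```go
package main

import "fmt"

// stage processes one value and reports whether the value survives.
type stage func(int) (int, bool)

// lazyStream is a hypothetical lazy stream: Filter and Map only record
// stages, and no work happens until a terminating method is called.
type lazyStream struct {
	data   []int
	stages []stage
}

// Filter records a filtering stage without running it.
func (s *lazyStream) Filter(keep func(int) bool) *lazyStream {
	s.stages = append(s.stages, func(v int) (int, bool) { return v, keep(v) })
	return s
}

// Map records a mapping stage without running it.
func (s *lazyStream) Map(f func(int) int) *lazyStream {
	s.stages = append(s.stages, func(v int) (int, bool) { return f(v), true })
	return s
}

// FindFirst pushes one element at a time through all stages and stops at
// the first element that survives every stage.
func (s *lazyStream) FindFirst() (int, bool) {
	for _, v := range s.data {
		ok := true
		for _, st := range s.stages {
			if v, ok = st(v); !ok {
				break
			}
		}
		if ok {
			return v, true
		}
	}
	return 0, false
}

// countCalls runs the pipeline from the example above and counts how often
// each passed function is invoked.
func countCalls(data []int) (first, filterCalls, mapCalls int) {
	s := &lazyStream{data: data}
	first, _ = s.
		Filter(func(v int) bool { filterCalls++; return v != 2 }).
		Map(func(v int) int { mapCalls++; return v * 4 }).
		FindFirst()
	return first, filterCalls, mapCalls
}

func main() {
	first, filterCalls, mapCalls := countCalls([]int{1, 2, 3, 4, 5})
	fmt.Println(first, filterCalls, mapCalls) // prints 4 1 1
}
```

An eager implementation would instead run each stage over the whole slice before moving on, which is where the five Filter calls and four Map calls come from.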

In case you don't care whether the implementation is eager or lazy, simply use the method gostream.StreamGenericStreamEntity([]int) IntStream (with GenericStreamEntity replaced by the name of the type you generated the stream for).

Making use of parallelism

In order to get the maximum out of Go and streams, having streams make use of parallelism would be very useful.

A stream should only be parallel if the user asks the framework to make it so; otherwise a lot of logic would be required to find out whether parallelism makes sense. Introducing such logic would create overhead and make the code more prone to errors. Therefore, a parallel stream will always create multiple channels per step, no matter if the terminal action is a Collect() or a FindFirst().

A downside is that in the case of a FindFirst() the stream might execute more code than necessary, since goroutines can't be killed from outside unless the user prepares them to be killable, and that would introduce too much boilerplate code when using the parallel API. However, since the executed code shouldn't have side effects, that might be fine.
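The extra work can be illustrated with a rough sketch. The function parallelFindFirst is hypothetical and is not how gostream implements parallelism: it uses a single buffered channel for simplicity, returns whichever result arrives first rather than the first in input order, and waits for all goroutines only so that the calls can be counted.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// parallelFindFirst is a hypothetical sketch: it applies f to every element
// in its own goroutine and returns whichever result arrives first, plus the
// total number of times f was started.
func parallelFindFirst(data []int, f func(int) int) (first int, calls int64) {
	if len(data) == 0 {
		return 0, 0
	}

	// Buffered, so that goroutines whose result is never read can still
	// finish instead of blocking forever.
	results := make(chan int, len(data))

	var wg sync.WaitGroup
	for _, v := range data {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			atomic.AddInt64(&calls, 1)
			results <- f(v)
		}(v)
	}

	// Only one result is needed ...
	first = <-results

	// ... but the other goroutines cannot be stopped from outside, so they
	// run to completion anyway. Waiting here is purely for counting; a real
	// implementation could return immediately.
	wg.Wait()
	return first, atomic.LoadInt64(&calls)
}

func main() {
	first, calls := parallelFindFirst(
		[]int{1, 2, 3, 4, 5},
		func(v int) int { return v * 4 },
	)
	// Which value arrives first depends on scheduling; calls is always 5.
	fmt.Println(first, calls)
}
```

Even though only one value is consumed, f runs for all five elements, which is acceptable as long as f has no side effects.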

About

implementations for eager and lazy streams in golang.
