Add various streaming transducers #282

Closed
wants to merge 8 commits into from

Member

@satabin satabin commented Dec 31, 2021

  • Copyless Streaming Tree Transducers which can be used to implement query languages over tree-like data
  • Streaming Finite State Transducers which can be used for matching inputs and emitting some output
  • Streaming String Transducers which can be used for matching inputs and emitting some output as soon as possible

@satabin satabin added enhancement New feature or request json xml labels Dec 31, 2021
@satabin satabin added this to the 1.4.0 milestone Dec 31, 2021
@satabin satabin marked this pull request as draft December 31, 2021 14:28
@satabin satabin force-pushed the transducers/bootstrap branch 7 times, most recently from 5ecab7c to dbbcb90 Compare January 2, 2022 18:58
Collaborator

@ybasket ybasket left a comment

Not entirely sure what it does without docs, but left some small Scala-specific comments and will stay tuned for updates 😄

Comment on lines +148 to +153
for {
  acc <- loop(left, acc)
  acc <- loop(right, acc)
} yield acc
Collaborator

Suggested change
- for {
-   acc <- loop(left, acc)
-   acc <- loop(right, acc)
- } yield acc
+ loop(left, acc).flatMap(loop(right, _))

I find the for-comprehension with re-use of acc unnecessarily hard to read and it has the tail map problem on Scala 3 IIRC.
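
To make the equivalence concrete, here is a minimal sketch. It assumes `Option` as the effect and uses a made-up `loop` that appends a label to the accumulator; the PR's real `loop` and effect type are not shown in this thread and are likely different.

```scala
object ForVsFlatMapSketch {
  // Hypothetical stand-in for the PR's `loop`: appends a label to the
  // accumulator and fails with None on an empty label.
  def loop(node: String, acc: List[String]): Option[List[String]] =
    if (node.isEmpty) None else Some(acc :+ node)

  // Original shape: each generator shadows the previous `acc`.
  def viaFor(left: String, right: String, acc: List[String]): Option[List[String]] =
    for {
      acc <- loop(left, acc)
      acc <- loop(right, acc)
    } yield acc

  // Suggested shape: one flatMap, no shadowing, no final identity `map`.
  def viaFlatMap(left: String, right: String, acc: List[String]): Option[List[String]] =
    loop(left, acc).flatMap(loop(right, _))
}
```

Both versions thread the accumulator through `left` and then `right`; the second simply avoids the rebinding and the trailing `map(acc => acc)` that the for-comprehension desugars to.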

m.lift(from)
}

implicit object MapTable extends Table[Map] {
Collaborator

Map <: MapOps <: PartialFunction, so we might run into ambiguous implicit here. Maybe move the partial function instance into a low priority trait to be safe?
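
A rough sketch of the low-priority layout, for illustration only. The `Table` signature, the `get` method, `LowPriorityTableInstances`, and `lookup` are assumptions made up for this example; only the names `Table` and `MapTable` appear in the diff.

```scala
object LowPrioritySketch {
  // Hypothetical typeclass shape; the PR's actual `Table` may differ.
  trait Table[T[_, _]] {
    def get[From, To](table: T[From, To])(from: From): Option[To]
  }

  // Instances inherited from a parent trait rank below instances declared
  // directly in the companion object when both are eligible.
  trait LowPriorityTableInstances {
    implicit object PartialFunctionTable extends Table[PartialFunction] {
      def get[From, To](pf: PartialFunction[From, To])(from: From): Option[To] =
        pf.lift(from)
    }
  }

  object Table extends LowPriorityTableInstances {
    implicit object MapTable extends Table[Map] {
      def get[From, To](m: Map[From, To])(from: From): Option[To] =
        m.get(from)
    }
  }

  // Hypothetical helper that resolves the instance from the table's type.
  def lookup[T[_, _], A, B](table: T[A, B], key: A)(implicit ev: Table[T]): Option[B] =
    ev.get(table)(key)
}
```

With this layout, `lookup(Map(1 -> "one"), 1)` picks `MapTable`, and a value typed as `PartialFunction[Int, String]` picks the fallback instance. If a call site ever made both instances eligible, the companion-level one would win instead of producing an ambiguity error.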

Member Author

> Not entirely sure what it does without docs, but left some small Scala-specific comments and will stay tuned for updates 😄

Documentation will come. This is a first basic building block for higher-level features. More transducers to come soon, I hope, especially SSTs, which will allow for parsing. This STT can be the base for streaming query languages on tree structures.

@satabin satabin force-pushed the transducers/bootstrap branch 2 times, most recently from f9ae1b9 to 2c01c74 Compare January 8, 2022 19:01
@satabin satabin changed the title Add copyless streaming tree transducers Add various streaming transducers Jan 8, 2022
@satabin satabin mentioned this pull request Jan 8, 2022
@satabin satabin force-pushed the transducers/bootstrap branch 2 times, most recently from 10bd29a to 76d0de3 Compare January 9, 2022 20:07
Member Author

satabin commented Apr 12, 2022

I decided to merge this PR without support for SST because:

  • STT can be used to implement features that are independent of SST (e.g. query languages)
  • SSTs require a lot more work to be functional

I will also add documentation (and probably some reworks and improvements) when implementing the higher-level features on top of these transducers.

@satabin satabin marked this pull request as ready for review April 12, 2022 08:16
@satabin satabin modified the milestones: 1.4.0, future Jun 5, 2022
The copyless nature is ensured by using a controlled environment update
language, without losing expressiveness.
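
As an illustration of the property itself, the sketch below checks copylessness in the usual sense for streaming transducers: no variable is read more than once across the right-hand sides of a simultaneous update. The commit enforces this by construction through its update language; the `Term`, `Update`, and `isCopyless` names here are hypothetical and only demonstrate the invariant as a runtime check.

```scala
object CopylessSketch {
  // Hypothetical update language: a right-hand side is a sequence of
  // output symbols and reads of variables.
  sealed trait Term
  final case class Out(symbol: Char) extends Term
  final case class Ref(variable: String) extends Term

  // A simultaneous update assigns one right-hand side to each variable.
  type Update = Map[String, List[Term]]

  // Copyless: every variable is read at most once over all right-hand sides.
  def isCopyless(update: Update): Boolean = {
    val reads = update.values.flatten.collect { case Ref(v) => v }.toList
    reads.distinct.size == reads.size
  }
}
```

For example, `x := x a` is copyless, while `x := x x` or the pair `x := y`, `y := y` is not, because a variable would have to be duplicated.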

This execution model emits as soon as possible and requires transducers
to be total; otherwise the stream fails.
This also ensures that everything is emitted up to the first error.
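
The behaviour described above can be sketched with the standard library alone (Scala 2.13+). This is not the PR's implementation: `LazyList` stands in for the library's stream type, and `Step` and `run` are names invented for the example.

```scala
object EagerEmitSketch {
  // Hypothetical transition function: (state, input) => (next state, outputs).
  // Being a PartialFunction, it may be undefined, i.e. the machine is not total.
  type Step[S, I, O] = PartialFunction[(S, I), (S, List[O])]

  // Lazily runs the machine. Outputs of each transition become available as
  // soon as its input is consumed; the first undefined transition produces a
  // single Left and ends the stream.
  def run[S, I, O](state: S, step: Step[S, I, O])(inputs: LazyList[I]): LazyList[Either[String, O]] =
    if (inputs.isEmpty) LazyList.empty
    else {
      val in = inputs.head
      step.lift((state, in)) match {
        case Some((next, outs)) =>
          // `#:::` defers its right operand, so the rest of the run stays lazy.
          LazyList.from(outs.map(o => Right[String, O](o))) #::: run(next, step)(inputs.tail)
        case None =>
          LazyList(Left[String, O](s"no transition from state $state on input $in"))
      }
    }
}
```

A consumer of the resulting stream sees every output produced before the failing input, followed by exactly one error.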

The STT model can be used to implement query languages on tree
structures and some transformations.
By abstracting the transition table type with a dedicated typeclass, one
can create symbolic transducers, which can come in handy to handle
infinite input alphabets.
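
A small sketch of what this abstraction could enable. Everything here is assumed for illustration: the `Table` signature, the `Guarded` predicate table, and the `run` helper are not taken from the PR.

```scala
object SymbolicTableSketch {
  // Hypothetical typeclass shape; the PR's actual `Table` may differ.
  trait Table[T[_, _]] {
    def get[From, To](table: T[From, To])(from: From): Option[To]
  }

  // Symbolic table: ordered (guard, result) pairs; the first matching guard wins.
  final case class Guarded[From, To](rules: List[(From => Boolean, To)])

  object Table {
    implicit val mapTable: Table[Map] = new Table[Map] {
      def get[From, To](m: Map[From, To])(from: From): Option[To] = m.get(from)
    }
    implicit val guardedTable: Table[Guarded] = new Table[Guarded] {
      def get[From, To](g: Guarded[From, To])(from: From): Option[To] =
        g.rules.collectFirst { case (guard, to) if guard(from) => to }
    }
  }

  // Runs a state machine keyed by (state, symbol), whatever the table type is.
  // Returns None if some transition is missing.
  def run[T[_, _], S, I](table: T[(S, I), S], init: S, inputs: List[I])(implicit ev: Table[T]): Option[S] =
    inputs.foldLeft(Option(init)) { (state, in) =>
      state.flatMap(s => ev.get(table)((s, in)))
    }

  type Key = (Int, Int)

  // State 0: no negative number seen yet; state 1: one was seen.
  // The alphabet is all of Int, which a finite Map could not enumerate.
  val sawNegative: Guarded[Key, Int] = Guarded(List[(Key => Boolean, Int)](
    (((k: Key) => k._1 == 0 && k._2 >= 0), 0),
    (((k: Key) => k._1 == 0 && k._2 < 0), 1),
    (((k: Key) => k._1 == 1), 1)
  ))
}
```

The same `run` works unchanged with a plain `Map[(S, I), S]` for finite alphabets, which is the point of hiding the table behind a typeclass.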
The streaming FSTs are simulated and try to emit as early as possible
when there is no ambiguity.
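
One common way to get this behaviour, shown here as an assumption rather than as what the commit does, is to track every live run together with its pending output and emit only the prefix on which all runs agree. `Delta`, `commonPrefix`, and `step` are hypothetical names.

```scala
object EarlyEmitFstSketch {
  // Hypothetical nondeterministic FST: each (state, input) may lead to
  // several (next state, output) alternatives.
  type Delta = Map[(Int, Char), List[(Int, String)]]

  // Longest common prefix of all pending outputs ("" for no runs).
  def commonPrefix(outs: List[String]): String =
    outs.reduceOption { (a, b) =>
      a.zip(b).takeWhile { case (x, y) => x == y }.map(_._1).mkString
    }.getOrElse("")

  // One simulation step: advance every live run on `in`, then emit the part
  // of the pending output that is no longer ambiguous and keep the rest.
  def step(delta: Delta)(runs: List[(Int, String)], in: Char): (List[(Int, String)], String) = {
    val advanced = for {
      (state, pending) <- runs
      (next, out) <- delta.getOrElse((state, in), Nil)
    } yield (next, pending + out)
    val emit = commonPrefix(advanced.map(_._2))
    (advanced.map { case (s, p) => (s, p.drop(emit.length)) }, emit)
  }
}
```

In this sketch, if reading `a` from state 0 can output either `x` or `xy`, the shared `x` is emitted immediately and the `y` is held back until a later input rules out one of the runs. An empty set of runs after a step would correspond to a failed stream, which the sketch leaves to the caller.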
This index can be used to implement efficient lookup tables later on.
@satabin satabin closed this Sep 3, 2022