id	title
spark-backend-relational	Spark Relational

Spark backend as a Special Form of Relational backend

Context and Problem Statement

Apache Spark's data model is an extended relational data model, and most of the complexity in the backend will likely come from the differences between the functional programming (FP) model and the relational model (for example, flattening data structures and turning lookups into joins). Should we therefore implement the Spark backend as a special case of a Relational backend? Doing so would let us reuse the solutions we develop in this process across a wide variety of backends, including relational databases and technologies like Apache Kafka SQL (any technology that is fundamentally relational).
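To make the two differences named above concrete, here is a minimal sketch in plain Python, not Morphir or Spark code. The function names `flatten` and `lookup_as_join`, the dotted column naming, and the choice of an inner join are all assumptions made for illustration.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def flatten(record: Row, prefix: str = "") -> Row:
    """Flatten nested records into one row with dotted column names.

    Hypothetical helper: the dotted naming scheme is an assumption.
    """
    row: Row = {}
    for key, value in record.items():
        column = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # A nested record becomes several top-level columns.
            row.update(flatten(value, column))
        else:
            row[column] = value
    return row


def lookup_as_join(left: List[Row], right: List[Row],
                   left_key: str, right_key: str) -> List[Row]:
    """Express a per-row lookup as an inner join between two row sets.

    Assumption: rows without a match are dropped (inner join semantics).
    """
    index: Dict[Any, List[Row]] = {}
    for r in right:
        index.setdefault(r[right_key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[left_key], [])]
```

A lookup that can fail (for example, one returning an optional value in the FP model) may map more naturally to a left outer join; the inner join here only keeps the sketch short.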

Decision Drivers

Future potential/limitations
Implementation effort

Considered Options

Map to Relational IR first, then to Spark IR
Map directly to Spark IR
Map directly to Spark IR, but implement flatten and joins as reusable functions

Decision Outcome

Chosen option: "{option 1}", because {justification. e.g., only option, which meets k.o. criterion decision driver | which resolves force {force} | … | comes out best (see below)}.

Positive Consequences

{e.g., improvement of quality attribute satisfaction, follow-up decisions required, …}
…

Negative Consequences

{e.g., compromising quality attribute, follow-up decisions required, …}
…

Pros and Cons of the Options

Map to Relational IR first, then to Spark IR

graph LR
    A[Morphir IR] --> A2B(Relational Backend) --> B[Relational IR]
    B --> B2C(Spark Backend) --> C[Spark IR]

Good, because it makes the relational mapping reusable
Bad, because the Relational IR limits the scope of Spark operations we can map to
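The trade-off in this option can be sketched with a toy intermediate representation. Everything below is hypothetical: the node types `From`, `Where` and `Select` and the function `to_spark` are not part of Morphir, and the second stage only renders a Spark DataFrame call chain as text rather than calling Spark.

```python
from dataclasses import dataclass
from typing import Tuple, Union


# Hypothetical Relational IR: only classic relational operators exist here.
@dataclass(frozen=True)
class From:
    table: str


@dataclass(frozen=True)
class Where:
    source: "Rel"
    predicate: str  # a SQL-style predicate expression


@dataclass(frozen=True)
class Select:
    source: "Rel"
    columns: Tuple[str, ...]


Rel = Union[From, Where, Select]


def to_spark(rel: Rel) -> str:
    """Second stage: render a Relational IR node as a DataFrame call chain."""
    if isinstance(rel, From):
        return rel.table
    if isinstance(rel, Where):
        return f"{to_spark(rel.source)}.filter({rel.predicate!r})"
    if isinstance(rel, Select):
        cols = ", ".join(repr(c) for c in rel.columns)
        return f"{to_spark(rel.source)}.select({cols})"
    raise TypeError(f"unsupported Relational IR node: {rel!r}")


query = Select(Where(From("people"), "age > 21"), ("name",))
print(to_spark(query))
```

In a design like this, the Spark stage can only emit what the Relational IR can express. A Spark-specific operation with no relational counterpart would have no node to map from unless the Relational IR is extended, which is the limitation noted above.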

Map directly to Spark IR

graph LR
    A[Morphir IR] --> B2C
    B2C(Spark Backend) --> C[Spark IR]

Good, because we can fully utilize Spark's capabilities to implement all possible Morphir features
Bad, because we cannot reuse the solutions we come up with for flattening and joins

Map directly to Spark IR, but implement flatten and joins as reusable functions

graph LR
    A[Morphir IR] --> B2C
    B2C(Spark Backend) --> C[Spark IR]
    B2C -.-> F[flatten]
    B2C -.-> J[join]

Good, because we can fully utilize Spark's capabilities to implement all possible Morphir features
Good, because we can reuse the solutions we come up with for flattening and joins
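One way to picture this option is a backend-neutral description of a join that several backends render in their own syntax. The sketch below is an assumption about how such reuse could look, not the project's design: `JoinPlan`, `plan_lookup`, `render_spark` and `render_sql` are hypothetical names, and both renderers only produce text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinPlan:
    """Backend-neutral description of a lookup expressed as a join."""
    left: str
    right: str
    left_key: str
    right_key: str


def plan_lookup(source: str, key: str,
                lookup_table: str, lookup_key: str) -> JoinPlan:
    """Reusable step: decide that a lookup becomes a join on the given keys."""
    return JoinPlan(source, lookup_table, key, lookup_key)


def render_spark(plan: JoinPlan) -> str:
    """Spark backend: render the plan as a DataFrame join call (as text)."""
    return (f'{plan.left}.join({plan.right}, '
            f'{plan.left}["{plan.left_key}"] == '
            f'{plan.right}["{plan.right_key}"], "inner")')


def render_sql(plan: JoinPlan) -> str:
    """A hypothetical SQL backend reusing the same plan."""
    return (f"SELECT * FROM {plan.left} INNER JOIN {plan.right} "
            f"ON {plan.left}.{plan.left_key} = {plan.right}.{plan.right_key}")


plan = plan_lookup("orders", "customer_id", "customers", "id")
print(render_spark(plan))
print(render_sql(plan))
```

The planning logic is shared, while each backend stays free to use any target-specific operation outside of it.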

Links

{Link type} {Link to ADR}
…
