This issue was moved to a discussion.

Question: overall goals and influences for language design #79

Closed
monfera opened this issue Nov 8, 2021 · 1 comment

Comments

@monfera

monfera commented Nov 8, 2021

The XQuery and LINQ analogies miss a key point: this is an implementation of functional relational programming, most famously advocated in "Out of the Tar Pit".

One concern is that the language so far seems to reflect its sole researcher's individual contribution, and locking in what the language is about risks settling into a specific groove too early, making it difficult to generalize or accommodate desirable things later. For example, the syntax has already been proposed without a robust discussion of semantics or of the goals that drive the syntax (that I could find; I might have missed it).

I wonder, in the diverse space of relational algebra and (ill-defined) extended relational algebra, what's included and what's not, or whether it's somehow kept malleable or even deferred at this time.

  • Ordered tuples (tables) or unordered (sets/bags)? I propose direct support for partial orders, because no order and full order are just special cases of that, and partial orders are very valuable (think of "interesting orders")
  • Is it a set or bag algebra? Similarly, maybe it can be a continuum: the ability to treat rows as duplicates by certain attributes but not by others admits both a strict set algebra and a full bag algebra, and things in between (a bit like keys / candidate keys / functional dependencies)
  • Are attributes ordered (table-like) or unordered? Hopefully the latter
  • Is the physical representation of a relation defined? Hopefully not, because the same relation can be represented in a multitude of ways
  • Specifically, what things are anticipated that can act as relations, yet may be counterintuitive to most folks?
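
To make the set/bag and attribute-order questions above concrete, here is a minimal Python sketch. It is not Morel code and not a proposal for its semantics; the names `row`, `project_bag`, and `project_set` are hypothetical, and the representation (a `Counter` of frozen attribute sets) is just one assumption among many possible ones.

```python
from collections import Counter


def row(**attrs):
    # Hypothetical helper: a row is a frozenset of (name, value) pairs,
    # so attribute order carries no meaning.
    return frozenset(attrs.items())


def project_bag(rel, names):
    # Bag semantics: projection keeps multiplicities.
    out = Counter()
    for r, n in rel.items():
        out[frozenset((k, v) for k, v in r if k in names)] += n
    return out


def project_set(rel, names):
    # Set semantics: the same projection with duplicates collapsed.
    return set(project_bag(rel, names))


# A relation as a bag: each distinct row maps to its multiplicity.
emps = Counter([
    row(name="a", dept=1),
    row(dept=1, name="b"),  # attributes given in a different order
    row(name="c", dept=2),
])

depts_bag = project_bag(emps, {"dept"})  # dept=1 appears twice
depts_set = project_set(emps, {"dept"})  # dept=1 appears once
```

The same stored relation yields both answers here; which one a language returns by default is exactly the design choice being asked about.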

Though more related to implementation and optimization, I'm curious about the stance on the ongoing discussions in query optimization, and how the language at least permits these techniques: vectorization vs. pipelining (and mixes); pipeline breakers and elective materialization; quick but approximate queries; and prioritizing latency of the initial response (partial query results) over overall throughput.

Still on the operational side: I'm very interested in the point that the optimizer needs to be cost-based rather than (just) rule-based, since that requires some coupling of data, or at least metadata and statistics, to the static code/query. I'm also curious how the language is expected to excel at arbitrary layers of the memory hierarchy, from optimal register and stack allocation, through the L1, L2, and L3 CPU caches and main memory, to fast storage, slow magnetic and serial storage, down to things like AWS Glacier.

The type system is also interesting in detail, for example, nominal vs. structural typing (should I be able to add an angle in radians to one in degrees?).
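
As an illustration of the radians/degrees question, here is a small Python sketch of the nominal answer. The classes `Radians` and `Degrees` are hypothetical and say nothing about how Morel's type system behaves. Both types have the same structure (one float), yet adding one to the other is rejected unless the conversion is explicit.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Radians:
    value: float

    def __add__(self, other):
        # Nominal check: only another Radians is accepted.
        if not isinstance(other, Radians):
            return NotImplemented
        return Radians(self.value + other.value)


@dataclass(frozen=True)
class Degrees:
    value: float

    def __add__(self, other):
        if not isinstance(other, Degrees):
            return NotImplemented
        return Degrees(self.value + other.value)

    def to_radians(self):
        # Explicit conversion is the only way across the two types.
        return Radians(math.radians(self.value))


# Allowed: the conversion is spelled out by the caller.
full_turn = Radians(math.pi) + Degrees(180.0).to_radians()

# Rejected: Radians(1.0) + Degrees(1.0) raises TypeError, even though
# both operands are structurally "a record with one float field".
```

Under purely structural typing the two classes would be interchangeable and the mixed addition would silently succeed.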

Types of indices or things like clustered indices would be premature to lock in, yet it'd be interesting to know what the language designer would say about indices in general, and what the influences are (concepts, example paper links etc.)

Any thoughts about integrating relational algebra with elements of linear algebra? Or about including solvers, so that users can pose problems that are not just executed directly as RA operators but are partly driven in reverse, with hill climbing or more sophisticated algorithms yielding the answers?

Any support for reactivity, streaming, and efficient incremental recalculation? As in, can the user have maybe tens of levels of "views" that are then kept up to date (via materialization or pipelining), or is this expected to be an edge use case, as in most DBs, which handle it very inefficiently?
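
A minimal sketch of what incremental recalculation means for a single view level, in Python. `CountByView`, `apply`, and `recompute` are hypothetical names used only for illustration, and a real system would need to handle joins, nested views, and much more than a grouped count.

```python
from collections import Counter


# Hypothetical sketch: a "count rows per key" view maintained from deltas.
class CountByView:
    def __init__(self, key):
        self.key = key
        self.counts = Counter()

    def apply(self, row, delta):
        # delta is +1 for an inserted row and -1 for a deleted row.
        k = row[self.key]
        self.counts[k] += delta
        if self.counts[k] == 0:
            del self.counts[k]  # drop empty groups so the view matches a recompute


def recompute(rows, key):
    # Full recalculation, used here only as a reference to compare against.
    return Counter(r[key] for r in rows)


rows = [{"dept": "a"}, {"dept": "a"}, {"dept": "b"}]
view = CountByView("dept")
for r in rows:
    view.apply(r, +1)

# Delete one row: the view is patched in constant time instead of rescanning `rows`.
removed = rows.pop()
view.apply(removed, -1)
```

Stacking tens of such views means each level consumes the deltas produced by the level below it, which is where the efficiency question becomes hard.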

How about multi-query optimization where, due to complex querying or multiple users querying simultaneously, it's the overall optimum and constraints like fairness and SLA adherence that matter?

Is there a list of papers or terms that are meant to be considered for inclusion? E.g., current literature is full of work on compiling individual queries to optimized machine code (which has pros and cons).

Any consideration for GPU based computation and visualization?

There are so many facets of what makes a good language for FRP, considering use cases, and a healthy discussion of all these seems valuable.

@julianhyde
Collaborator

I'm going to convert this into a discussion, and I will respond further there.

@hydromatic hydromatic locked and limited conversation to collaborators Nov 15, 2021
