TODO

Parser and lexer:

* Support for parentheses
* Type casts
* Trinary conditional operator ("?:")
* Allow location specifiers to be attached to constants
* Unicode identifiers and string literals
* In string literals, support the "\000" syntax (Octal ASCII
  character) from C89; or perhaps some syntax for specifying Unicode
  code points?
* Error reporting; location of syntax error
* Support for assignments?
* Warn about redundant joins: "foo(A), foo(A);" is redundant

Planner:

* Add support for stratification
* Exploit implied equalities more effectively, suppress duplicate
  predicate evaluations
* Avoid Cartesian products, if possible
* We can skip projection in an operator if no operator in the
  remainder of the op chain requires values that are NOT in the input
  to that operator; e.g. if we do a join and there isn't anything in
  the rest of the op chain that needs a value in the scan_rel, we can
  skip projection on the output of the join
  * This might be complicated by the uniqification of variable names

Executor/Router/Operators:

* More efficient joins: the existing approach to joins essentially
  re-materializes intermediate join results. Stems and stairs from
  earlier Eddies work might point toward a better way of doing this.
* Use a heap to implement timer deadlines
* Aggs: don't emit a deletion/insertion pair if the agg value is
  unchanged (e.g. sum<> on input 0, min<> on non-min input, etc.)
  * Similarly, if an agg moves from n => m, we currently delete n and insert
    m. Might be more efficient to emit a single "update n => m" tuple
* Implement DRED

Network:

* Add an ad-hoc compression method, to avoid resending table names in
  every single packet
  * Have the client and server negotiate that "table 10" means "table
    foo, with schema bar" once, and use that information for the
    remainder of the session
* Add a UDP transport
* Consider adding an SCTP transport
* Consider adding an SSL-over-TCP transport, and/or secure
  communication in general
* Consider adding a multicast transport?
* Consider using TCP_NODELAY in TCP transport

Data types and expressions:

* Consider using a variable-size length word for C4String: more
  storage-efficient for short strings, which is the common case (or
  special-case this just for network format?)
* Replace string location specifier type with an IPv4 endpoint (scalar
  value containing IPv4 address + port)
* Consider removing refcount from Tuple OR use the resulting padding
  on LP64 machines for something useful (e.g. cache tuple_hash())
* Consider using a packed tuple representation; reorder Tuple fields
  to reduce padding requirements
* Allow sum, avg to work on a broader range of data types
  * The sum of int4s might be an int8
* Check for integer overflow in addition, multiply, etc.

Tables and storage:

* Add internal "table IDs", and use them instead of table names
* Support for event tables
* Support for BDB persistent tables
  * Can we store in-memory Tuple/Datum format directly to BDB?
* Consider adding a "regexp" table type: given a string input, parses
  into tuple format by applying a regular expression
  * Rather than regexp, look at PADS and related work
  * Also do output to external format
* Optimize the hash table implementation
* Consider using Judy trees/arrays instead of hash tables
* Import red-black tree code
* Support for secondary indexes, index scans on PK
  * Push predicates down to SQLite table scan

Build system:

* Make use of profile-guided optimization with GCC
* Support GCC 4.5's interprocedural optimization mode
* Only export the official C4 client API from the shared library
  * All symbols prefixed with c4_, etc.
  * Implement via linker scripts?

APR:

* Report queue performance issue
* Add support for "apr-config --configure"
* Modify queue type to allow variably-sized queue elements, to avoid
  the need to malloc() small queue messages

Broader issues:

* Unit testing framework
* Error handling: exceptions via longjmp?
* Add a "$LOCALHOST" variable that expands to the network address of
  the evaluating C4 instance
  * Complicated by the fact that a machine can have multiple network
    addresses (one per interface + localhost + IPv4 vs. IPv6, etc.)
  * Perhaps adopt something like Reactor's "ref" concept instead
* Invent something similar to makeNode() from Postgres: infer node
  size from node type tag
* Consider caching per-tuple hash code
* Change the node system to work with strict aliasing per C99
* Add a concept of "programs" or "modules"
* Implement a simple interactive shell
  * As a first step, read input program from stdin unless terminal
* Locking / concurrency control

Minor:

* If a fact is defined at node X but has a location specifier for node
  Y, should we send the tuple to node Y, or simply ignore it?
* Make core dumps more obvious
  * Print backtrace on core dump