Add Remap specification #3740

binarylogic · 2020-09-07T18:17:14Z

Now that we're expanding the Remap language, we should materialize all of the rules and guidelines into a spec. This spec can live in the Remap crate/folder in markdown format.

Requirements

Format

Follow RFC 2119 terminology (SHOULD, MUST, etc). (ex: Timber's library spec)

Language

Principles (performance, safety, self-documenting).
How do we enforce these principles? Create a feature/principle matrix. (ex: not introducing methods, avoiding loops, state management, preserving self-documentation, type safety, etc)
Language execution contexts. Assignment & mutability rules. (ex: the del and merge functions)

Limits

Things we have explicitly decided not to do:

Types

List the types.
Types should align with JSON as much as possible, with the exception of Timestamps. (ex: we will never introduce sets).
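
As a rough illustration of a JSON-aligned type set with Timestamp as the single exception, the types could be modelled like this in Rust. The `Value` enum and its `kind` and `is_json_native` helpers are hypothetical names for this sketch, not the Remap crate's actual definitions. Splitting JSON's number into integer and float, and storing the timestamp as seconds since the Unix epoch, are both assumptions.

```rust
use std::collections::BTreeMap;

/// Hypothetical value type: the JSON kinds plus a timestamp.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Boolean(bool),
    // JSON has a single number type; the integer/float split is an assumption.
    Integer(i64),
    Float(f64),
    String(String),
    Array(Vec<Value>),
    Object(BTreeMap<String, Value>),
    // The one non-JSON type, simplified here to seconds since the Unix epoch.
    Timestamp(i64),
}

impl Value {
    /// Human-readable name of the type, e.g. for error messages.
    pub fn kind(&self) -> &'static str {
        match self {
            Value::Null => "null",
            Value::Boolean(_) => "boolean",
            Value::Integer(_) => "integer",
            Value::Float(_) => "float",
            Value::String(_) => "string",
            Value::Array(_) => "array",
            Value::Object(_) => "object",
            Value::Timestamp(_) => "timestamp",
        }
    }

    /// True for every type that maps directly onto a JSON type.
    pub fn is_json_native(&self) -> bool {
        !matches!(self, Value::Timestamp(_))
    }
}
```

Note that there is deliberately no set variant: anything set-like would have to be expressed as an array or object.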

Syntax

Just cover the basics.

Strictness

Require that all errors be handled at compile time?
Type checking at compile time?
- How does this work with schemas (both when the schema is known and when it is not)?

Functions

Errors

How are errors returned from functions?
How are errors handled?

JeanMertz · 2020-12-18T16:38:36Z

There's a lot to write down, but I'll just focus on a few key things that we want to get right before we ship v1:

Functions

general

Work towards composability over individual capabilities. For example, use built-in iteration solutions (once that lands) over allowing multiple arguments of the same type to apply a function more than once. Or, if we have a function that provides a certain task, don't incorporate that into another function as an optional argument, but instead make it possible to compose the two functions together. Using performance as a reason not to follow this rule should be backed by real-life examples and benchmarks.
Functions should rarely mutate their input; instead, they should create a new value and return that. Exceptions to this are functions that directly impact the object/event, such as del and merge.
When implementing functions, focus on performance, but not at the expense of usability.
Try to design your function to be infallible. If you can make it infallible by removing one specific, obscure part of it, that's usually the best choice; we can always introduce that specific behaviour in a new, fallible (and less often used) function.
We have test_type_def and test_function macros, use them to validate your expectations!
We also have a bench_function macro to add Criterion benchmarks. Add them if you think it'll help us understand the performance profile of complex functions.
Be sure to update function documentation when you update a function.
When first implementing a function, keep its scope narrow. We can always expand its scope later, but we can't take away features without breaking people's code.
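
A minimal sketch of how a few of the guidelines above might look in practice, written as plain Rust rather than against the real Remap function traits. All three function names are hypothetical. `truncate` borrows its input and returns a new value, and stays infallible. The obscure case (a limit that arrives as text) lives in a separate fallible function. Upcasing is its own function to be composed at the call site, not an optional flag on `truncate`.

```rust
use std::num::ParseIntError;

/// Infallible core: returns a new string and never mutates `value`.
pub fn truncate(value: &str, limit: usize) -> String {
    value.chars().take(limit).collect()
}

/// Fallible, less often used variant that hosts the obscure behaviour:
/// the limit is given as text and has to be parsed first.
pub fn truncate_with_text_limit(value: &str, limit: &str) -> Result<String, ParseIntError> {
    let limit: usize = limit.trim().parse()?;
    Ok(truncate(value, limit))
}

/// A separate single-purpose function, meant to be composed with others.
pub fn upcase(value: &str) -> String {
    value.to_uppercase()
}
```

Composing the two small functions, as in `upcase(&truncate("hello world", 5))`, covers the combined use case without widening either function's scope.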

naming

When adding a function, make sure it doesn't conflict with existing ones (meaning, it shouldn't be too similar to existing function names or capabilities)
In terms of naming, I'd say keep the most-often used functions short and simple, and use more descriptive names for more obscure functions unlikely to be used as often.
If a function is specific to a provider/source/sink, name it accordingly, e.g. aws_... etc
use is_* only for functions that return a boolean (but of course there are plenty of functions that don't use that pattern and still return a boolean, such as contains, which is fine)
use parse_* when parsing a string into a specific type (e.g. parse_timestamp, parse_json, and parse_url)
use format_* to go from any type to a string, where the string can be formatted in different ways
use to_* to convert between specific values (e.g. to_string, etc)
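
To make the prefixes concrete, here is a small sketch with one plain-Rust stand-in per convention. The four functions (`is_blank`, `parse_bool`, `format_float`, `to_bool`) are hypothetical illustrations chosen for this example. They only demonstrate the naming pattern and are not the signatures of any existing Remap function.

```rust
/// is_*: always returns a boolean.
pub fn is_blank(value: &str) -> bool {
    value.trim().is_empty()
}

/// parse_*: turns a string into a specific type, so it can fail.
pub fn parse_bool(value: &str) -> Result<bool, String> {
    match value.trim() {
        "true" => Ok(true),
        "false" => Ok(false),
        other => Err(format!("unable to parse {:?} as a boolean", other)),
    }
}

/// format_*: goes from a value to a string, with a knob for how it is formatted.
pub fn format_float(value: f64, decimals: usize) -> String {
    format!("{:.*}", decimals, value)
}

/// to_*: converts between specific value types.
pub fn to_bool(value: i64) -> bool {
    value != 0
}
```

In every case the value being acted on comes first and is named `value`.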

parameters

Keep function parameters limited; try not to add functions with too many parameters.
As a convention, put the value you're acting on as the first parameter, and name it value.
Try not to restrict input types too much. E.g. don't limit the input to a string literal if you don't have to; instead, accept any expression and resolve it to a string at runtime. We already have runtime checks that return an error if the expression doesn't return a string, and we're working towards more compile-time checking, so at some point we'll inform users at boot time that they need to make sure their expression returns a string at runtime (e.g. by wrapping it in to_string).
Avoid parameters such as default. There are (and will be) other language constructs for falling back to default values, so we don't have to implement this for every function individually.
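
The "accept any expression, resolve at runtime" guideline could be sketched roughly as follows. Everything here is a simplified assumption made for illustration: the two-variant `Value`, the `Expression` trait, and the `Literal`, `ToStringExpr`, and `downcase` names do not mirror the actual Remap traits. The function takes the value it acts on as its first parameter, named `value`, and returns an error at runtime if the expression does not resolve to a string.

```rust
/// Simplified value type for this sketch only.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    String(String),
    Integer(i64),
}

/// Anything that can be resolved to a value at runtime.
pub trait Expression {
    fn resolve(&self) -> Value;
}

/// A literal is just one kind of expression.
pub struct Literal(pub Value);

impl Expression for Literal {
    fn resolve(&self) -> Value {
        self.0.clone()
    }
}

/// Stand-in for wrapping an expression in a string conversion.
pub struct ToStringExpr<E: Expression>(pub E);

impl<E: Expression> Expression for ToStringExpr<E> {
    fn resolve(&self) -> Value {
        match self.0.resolve() {
            Value::Integer(i) => Value::String(i.to_string()),
            other => other,
        }
    }
}

/// Accepts any expression, not only string literals, and checks the
/// resolved type at runtime.
pub fn downcase(value: &dyn Expression) -> Result<Value, String> {
    match value.resolve() {
        Value::String(s) => Ok(Value::String(s.to_lowercase())),
        other => Err(format!("expected a string, got {:?}", other)),
    }
}
```

Passing an integer literal directly yields an error, while wrapping the same literal in `ToStringExpr` succeeds. That wrapping is the kind of fix a boot-time check could suggest to users.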

binarylogic · 2021-01-01T16:39:00Z

@JeanMertz I updated the requirements to cover problems and questions that I've seen come up.

binarylogic · 2021-05-06T14:28:41Z

Reopening since additional design questions are still popping up. Ex: should the new unnest function in #7038 take string or path arguments?

binarylogic added the type: task, domain: internal docs, and domain: vrl labels Sep 7, 2020

This was referenced Sep 7, 2020

Rename remap syntax coercion functions #3741

Closed

New format_timestamp remap function #3742

Closed

binarylogic self-assigned this Sep 12, 2020

binarylogic modified the milestones: 2020-08-31 - Digitization Laser, 2020-09-14 - The Grid Sep 12, 2020

jamtur01 modified the milestones: 2020-09-14 - The Grid, 2020-09-28 - Derezzed Sep 28, 2020

jamtur01 modified the milestones: 2020-09-28 - Derezzed, 2020-10-12: Son of Flynn Oct 14, 2020

JeanMertz mentioned this issue Oct 23, 2020

feat(remap transform): add "remap-lang" crate #4695

Merged

jamtur01 modified the milestones: 2020-10-12: Son of Flynn, 2020-10-26: Recognizer Oct 25, 2020

jamtur01 removed this from the 2020-10-26: Recognizer milestone Nov 5, 2020

jamtur01 assigned StephenWakely Nov 6, 2020

jamtur01 added this to the 2020-11-09: Augur of Dunlain milestone Nov 6, 2020

jamtur01 unassigned StephenWakely Nov 9, 2020

jamtur01 modified the milestones: 2020-11-09: Augur of Dunlain, 2020-11-23: Pseudo-chitin armor Nov 24, 2020

jamtur01 modified the milestones: 2020-11-23: Pseudo-chitin armor, 2020-12-07: Nanite Repair System Dec 7, 2020

binarylogic assigned JeanMertz and unassigned binarylogic Dec 17, 2020

This was referenced Dec 18, 2020

chore(ci): Restrict usage of benchmark runs #5610

Merged

Document Remap syntax, grammar, algebra, and other primitives #5373

Closed

jamtur01 removed this from the 2020-12-07: Nanite Repair System milestone Dec 18, 2020

jamtur01 added this to the 2020-12-21 Kryptek Yeti milestone Dec 18, 2020

binarylogic mentioned this issue Dec 21, 2020

add (syntax-only) modules for functions vectordotdev/vrl#145

Open

jamtur01 removed this from the 2020-12-21 Kryptek Yeti milestone Dec 21, 2020

jamtur01 unassigned JeanMertz Dec 21, 2020

StephenWakely mentioned this issue Dec 29, 2020

feat(remap): add parse_regex and parse_regex_all remap functions #5594

Merged

jamtur01 assigned JeanMertz Dec 31, 2020

jamtur01 added this to the 2021-01-04 Xenomass Well milestone Dec 31, 2020

binarylogic changed the title ~~Add remap syntax guidelines~~ Add Remap specification Jan 1, 2021

binarylogic mentioned this issue Jan 4, 2021

Document reverse_dns remap function and add local caching #4517

Open

JeanMertz mentioned this issue Jan 6, 2021

chore(remap): Make del function only take one arg #5633

Merged

jamtur01 modified the milestones: 2021-01-04 Xenomass Well, 2021-01-18 Tabula E-Rasa Jan 15, 2021

binarylogic unassigned JeanMertz Feb 22, 2021

binarylogic closed this as completed Apr 2, 2021

binarylogic reopened this May 6, 2021

JeanMertz mentioned this issue May 7, 2021

chore: RFC for emitting multiple log events from remap #7038

Merged

JeanMertz self-assigned this Jul 13, 2021

JeanMertz mentioned this issue Jul 21, 2021

chore(vrl): CSV enrichment RFC #8400

Merged

JeanMertz mentioned this issue Aug 16, 2021

chore(vrl): add development design document #8735

Merged

JeanMertz closed this as completed in #8735 Oct 1, 2021
