xyr

xyr is a very lightweight, simple, and powerful data ETL platform that helps you to query available data sources using SQL.

Example (Local Filesystem)

here we define a new table called users which will load all json files in that directory (recursive) with any of the following json formats: (object/object[] per-file, newline delimited json objects/object[], or event no delimiter json objects/object[] like what kinesis firehose json output format).

Let's image we have a directory of json files called /tmp/data/users and here is an example of a json file there:

{"id":10,"email":"u10@example.com"}{"id":20,"email":"u20@example.com"}{"id": 3,"email":"u3@example.com"}{"id": 4,"email":"u4@example.com"}

Then we can define its schema as following

# where xyr should store its internal database
data_dir = "./tmp/db/"

# this file is `./config.xyr.hcl`
table "users" {
    // the driver we want
    driver = "jsondir"

    // the data source directory
    source = "/tmp/data/users"

    // xyr will try to create a table into its internal storage, so it needs
    // to know at least what are the required columns names of your data.
    // i.e: {"id": 1, "email": "user@example.com", "age": 20}
    // but we only need "id" and "email", so we defined both in the below columns array
    // and not that the ordering is the same as our example.
    columns = ["id", "email"]

    // what do you want to load
    // in case of jsondir, we can specify a regex pattern to filter the files 
    // using the filename
    // but if we're using an SQL driver we can provide an sql statement that reads the data
    // from the source SQL based database.
    // i.e: "SELECT * FROM SOME_TABLE"
    filter = ".*"
}

Now its the time to load it

$ xyr table:import users

Now let's query it

$ xyr exec "SELECT * FROM users"

All tables you define could be joined in the same query easily, let's imagine that we have the following defination

# debug mode "affects the log level"
debug = true

# how many workers to use to write into our sqlite db
# 0 means current cpu cores count
workers_count = 0

# where xyr should store its internal database
data_dir = "./tmp/db/"


table "users" {
    driver = "s3jsondir"
    source = "s3://ACCESS_KEY:SECRET_KEY@/BUCKET_NAME?region=&ssl=false&path=true&perpage=1000"

    # which prefix we want to select
    filter = "xyr/users/"

    columns = ["id", "email"]
}

table "user_vists" {
    driver = "postgres"
    source = "postgresql://username:password@server:port/dbname?option1=value1"
    columns = ["user_id", "vists"]
    filter = "SELECT user_id, count(vists) FROM USERS GROUP BY user_id"
}

Now let's join them

$ xyr exec "SELECT * FROM users LEFT JOIN user_vists ON users_vists.user_id = users.id"

Installation

use this docker package

Supported Drivers

Driver	Source Connection String
`jsondir`	`/PATH/TO/JSON/DATA/DIR`
`s3jsondir`	`s3://[access_key_url_encoded]:[secret_key_url_encoded]@[endpoint_url]/bucket_name?region=&ssl=false&path=true&perpage=1000&downloaders_count=8&downloader_concurrency=8`
`mysql`	`usrname:password@tcp(server:port)/dbname?option1=value1&...`
`postgres`	`postgresql://username:password@server:port/dbname?option1=value1`
`sqlite3`	`/path/to/db.sqlite?option1=value1`
`sqlserver`	`sqlserver://username:password@host/instance?param1=value&param2=value`
	`sqlserver://username:password@host:port?param1=value&param2=value`
	`sqlserver://sa@localhost/SQLExpress?database=master&connection+timeout=30`
`hana`	`hdb://user:password@host:port`
`clickhouse`	`tcp://host1:9000?username=user&password=qwerty&database=clicks&read_timeout=10&write_timeout=20&alt_hosts=host2:9000,host3:9000`
`oracle`	`oracle://user:pass@server1/service?server=server2&server=server3`

Use Cases

Simple Presto Alternative.
Simple AWS Athena Alternative.
Convert your JSON documents into a SQL DB.

How does it work?

internaly xyr utilizes SQLite as an embeded sql datastore (it may be changed in future and we can add multiple data stores), when you define a table in XYRCONFIG file then run $ xyr table:import you will be able to import all defined tables as well querying them via $ xyr exec "SELECT * FROM TABLE_NAME_HERE" which outputs json result by default.

Name		Name	Last commit message	Last commit date
Latest commit History 40 Commits
internals		internals
utils		utils
.dockerignore		.dockerignore
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
config.xyr.hcl.example		config.xyr.hcl.example
go.mod		go.mod
go.sum		go.sum
main.go		main.go

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

xyr

Example (Local Filesystem)

Installation

Supported Drivers

Use Cases

How does it work?

Plan

About

Releases

Packages

Languages

License

alash3al/xyr

Folders and files

Latest commit

History

Repository files navigation

xyr

Example (Local Filesystem)

Installation

Supported Drivers

Use Cases

How does it work?

Plan

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages