This guide will help you get started with Echelon's components and its metadata requirements.
Although the initial learning curve can be steep, adding and maintaining Echelon becomes straightforward once you are familiar with it.
Set up the necessary Echelon components, referring to their respective README files for instructions:
- Command Line Interface (CLI)
- Database
- Web Application
Once all the framework components are in place, the next step is to generate your metadata.
- Begin by copying the files from the /database/migrations/data directory into a designated working directory.
- Open the job.json file and create a single job. This represents your data flow; for simplicity, start with a basic task such as loading a file into a database table.
- A recommended naming convention is [action]_[target entity], for example: load_salesforce_customer_staging_table
- For all other mandatory fields, consult Echelon's documentation.
Example
"job": [ { "name": "load_salesforce_customer_staging_table", "description": "Job to load Salesforce Customer data into a staging table", "type": "file_load", "priority": 1, "dependency_logic": "and" } ]
- Open the entity.json file and create two entities. These correspond to your source and target (namely, your source file and target table).
- A recommended naming convention is [source system]_[entity name]_[entity type], for example: salesforce_customer_file
- For all other mandatory fields, consult Echelon's documentation.
Example
"entity": [ { "name": "salesforce_customer_file", "description": "CSV file containing Salesforce Customer data", "business_description": null, "type": "delimited_file", "source": "Salesforce" }, { "name": "salesforce_customer_staging_table", "description": "Staging table containing Salesforce Customer data", "business_description": null, "type": "staging_table", "source": "Salesforce" } ]
- Open the job_entity_rel.json file and create a single relationship connecting the entities created in step 3 with the job created in step 1.
Note: For a many-to-one relationship, create multiple entries, each with a distinct source_entity_name but all with the same target_entity_name.
Example
"job_entity_rel": [ { "job_name": "load_salesforce_customer_staging_table", "sequence_number": 1, "source_entity_name": "salesforce_customer_file", "target_entity_name": "salesforce_customer_staging_table", "required_flag": true } ]
- Open the entity_constant.json file and create several constants for both the source file and the target table.
- For the source file, consider the following (a usage sketch follows the example below):
data_file_path
data_file_pattern
column_delimiter
column_quote_character
header_count
Example
"entity_constant": [ { "entity_name": "salesforce_customer_file", "name": "data_file_path", "value": "./data/salesforce" }, { "entity_name": "salesforce_customer_file", "name": "data_file_pattern", "value": "./customer_\\d{8}.csv" }, { "entity_name": "salesforce_customer_file", "name": "column_delimiter", "value": "," }, { "entity_name": "salesforce_customer_file", "name": "header_count", "value": "1" } ]
- For the target table, consider the following (a connection sketch follows the example below):
database_name
database_port
database_server
schema_name
table_name
Example
"entity_constant": [ { "entity_name": "salesforce_customer_staging_table", "name": "database_name", "value": "prod001" }, { "entity_name": "salesforce_customer_staging_table", "name": "database_port", "value": "5432" }, { "entity_name": "salesforce_customer_staging_table", "name": "database_server", "value": "127.0.1.0" }, { "entity_name": "salesforce_customer_staging_table", "name": "header_count", "value": "staging" }, { "entity_name": "salesforce_customer_staging_table", "name": "table_name", "value": "salesforce_customer" } ]
- A comprehensive list of the constants available in Echelon can be found in the documentation at constraints/entity_constant_name.
- If desired, you can introduce additional constants by appending them to the constraint.json file under the constraint_entity_constant_name category and then re-importing the data (a rough sketch of the shape follows).
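The exact shape of constraint.json is not shown in this guide, so the following is a rough sketch only; check the existing file for the actual structure before editing. The my_custom_constant entry is hypothetical.
"constraint_entity_constant_name": [
  "data_file_path",
  "data_file_pattern",
  "my_custom_constant"
]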
- Open the field.json file and create all the fields contained within the entities created in the steps above.
Example
"field": [ { "entity_name": "salesforce_customer_file", "physical_name": "first_name", "data_type": "varchar", "length": "100", "precision": null, "scale": null, "sequence_number": 1, "group_number": null, "business_description": "Customer's first name", "business_name": "First Name", "business_alias": null, "acronym_name": null, "classification": "restricted", "required_flag": true, "computed_flag": false, "sequence_flag": false, "hash_key_flag": false, "hash_diff_flag": true, "record_source_flag": false, "business_date_flag": false } ... ]
With your metadata generated, the next step is to load it. This is done using the Echelon Command Line Interface (CLI).
To load the metadata, run the following command in your terminal:
echelon data:import --input [metadata directory] --insert
If your metadata already exists in Echelon and you have since modified it, you can update it with the following command:
echelon data:import --input [metadata directory] --insert --update
When you need to read your technical metadata or record operational metadata, there are two options to consider:
- API: the preferred choice for most interactions.
- CLI: recommended for scripted or automated interactions.
The example workflow below illustrates most of the interactions your workflow will need to have with Echelon during its execution.
Generate a run when the workflow starts. The unique run_id returned is used to record operational metadata associated with the run.
Request
POST /job/run
{
"name": "load_salesforce_customer_staging_table"
}
Response
[
{
"run_id": 1,
"job_id": 3,
"start": "2023-08-10T08:13:47.748Z",
"end": null,
"status": "running"
}
]
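In a workflow script, this call might look like the sketch below; the requests library and the base URL are assumptions, and the run_id is read from the first element of the returned list.

import requests

BASE_URL = "http://localhost:3000"  # assumption: wherever your Echelon API is hosted

response = requests.post(
    f"{BASE_URL}/job/run",
    json={"name": "load_salesforce_customer_staging_table"},
)
response.raise_for_status()

# The API returns a list; the newly created run is its first element.
run_id = response.json()[0]["run_id"]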
Fetch the source entities linked to the job along with their corresponding constants.
GET /job/source?name=load_salesforce_customer_staging_table
Response
[
{
"entity_id": 4,
"name": "salesforce_customer_file",
"description": "CSV file containing Salesforce Customer data",
"business_description": null,
"type": "delimited_file",
"source": "Salesforce",
"required_flag": true,
"sequence_number": 1,
"insert_date": "2023-08-10T08:04:00.105Z",
"update_date": null
}
]
GET /entity/constant?name=salesforce_customer_file&jq=from_entries
Response
{
"column_delimiter": ",",
"data_file_path": "./data/salesforce",
"data_file_pattern": "./customer_\\d{8}.csv",
"header_count": "1"
}
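Note: the jq=from_entries query parameter applies the jq from_entries filter to the result, collapsing the list of name/value rows into a single object keyed by constant name, which is convenient for configuration lookups.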
Fetch the target entities linked to the job along with their corresponding constants.
GET /job/target?name=load_salesforce_customer_staging_table
Response
[
{
"entity_id": 5,
"name": "salesforce_customer_staging_table",
"description": "Staging table containing Salesforce Customer data",
"business_description": null,
"type": "staging_table",
"source": "Salesforce",
"insert_date": "2023-08-10T08:04:00.144Z",
"update_date": null,
"migration_insert_id": 3,
"migration_update_id": null
}
]
GET /entity/constant?name=salesforce_customer_staging_table&jq=from_entries
Response
{
"database_name": "prod001",
"database_port": "5432",
"database_server": "127.0.1.0",
"header_count": "staging",
"table_name": "salesforce_customer"
}
Retain all logs relevant to the workflow, including informational messages, warnings, and errors. Err on the side of logging more rather than less.
POST /job/run/log
{
"id": "1",
"job": "tutorial",
"function": "load()",
"priority": "INFO",
"message": "Loading of salesforce_customer table has begun"
}
Response
[
{
"log_id": 1,
"run_id": 1,
"job": "tutorial",
"function": "load()",
"priority": "INFO",
"message": "Loading of salesforce_customer table has begun",
"code": null,
"insert_date": "2023-08-10T08:28:40.971Z"
}
]
Capture the number of records in the source file and the number of records inserted into the target table.
Note: To gain a deeper understanding of the flow metrics that should be documented, refer to the Technical Reconciliation guide.
POST /job/run/flow
{
"id": "1",
"job": "tutorial",
"function": "load()",
"label": "source_count",
"count": "100"
}
Response
[
{
"flow_id": 1,
"run_id": 1,
"job": "tutorial",
"function": "load()",
"label": "source_count",
"count": 100,
"insert_date": "2023-08-10T08:30:42.617Z"
}
]
POST /job/run/flow
{
"id": "1",
"job": "tutorial",
"function": "load()",
"label": "insert_count",
"count": "100"
}
Response
[
{
"flow_id": 2,
"run_id": 1,
"job": "tutorial",
"function": "load()",
"label": "insert_count",
"count": 100,
"insert_date": "2023-08-10T08:31:08.007Z"
}
]
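The log and flow endpoints share the same envelope (run id, job, and function), so a thin helper keeps workflow code tidy. A sketch, again assuming the requests library and the base URL from the earlier example:

import requests

BASE_URL = "http://localhost:3000"  # assumption, as in the earlier sketch

def record(endpoint: str, run_id: int, **fields) -> None:
    # POST one operational-metadata record (a log line or a flow metric).
    payload = {"id": str(run_id), "job": "tutorial", "function": "load()", **fields}
    requests.post(f"{BASE_URL}/job/run/{endpoint}", json=payload).raise_for_status()

# The log message and flow metrics from the examples above.
record("log", 1, priority="INFO", message="Loading of salesforce_customer table has begun")
record("flow", 1, label="source_count", count="100")
record("flow", 1, label="insert_count", count="100")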
While the job is running, you can optionally run proactive monitoring, which performs statistical checks on the ongoing run.
If any of these checks fail, a new entry is added to the alert table.
Note: To gain a deeper understanding of the monitoring process, refer to the Proactive Monitoring guide.
POST /job/run/monitoring
{
"id": "1"
}
Response
null
After the workflow has concluded, carry out the technical reconciliation process to ensure that all records are accounted for.
Note: To gain a deeper understanding of the reconciliation process, refer to the Technical Reconciliation guide.
Request
POST /job/run/reconciliation
{
"id": "1"
}
Response
[
{
"flow_id": 3,
"run_id": 1,
"job": "echelon cli",
"function": "create:job:run:reconciliation",
"label": "technical_reconciliation_variance",
"count": 0,
"insert_date": "2023-08-10T08:33:02.059Z"
}
]
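Here the technical_reconciliation_variance of 0 indicates that the recorded source and insert counts match; a non-zero variance would suggest records were lost or duplicated along the way.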
When the workflow finishes, update the run's status to either completed or failed (the failure case is shown after the response below).
PUT /job/run
{
"id": "1",
"status": "completed"
}
Response
[
{
"run_id": 1,
"job_id": 3,
"start": "2023-08-10T08:13:47.748Z",
"end": "2023-08-10T08:34:41.220Z",
"status": "completed"
}
]
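If the workflow terminated with an unrecoverable error, record that instead:
PUT /job/run
{
  "id": "1",
  "status": "failed"
}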
After the workflow completes, you can optionally run proactive monitoring again. This performs statistical checks on the finished run; distinct checks are applied to ongoing and completed jobs.
If any of these checks fail, a new record is inserted into the alert table.
Note: To gain a deeper understanding of the monitoring process, refer to the Proactive Monitoring guide.
POST /job/run/monitoring
{
"id": "1"
}
Response
null
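Putting the pieces together, a workflow skeleton might sequence these calls as follows. This is a sketch under the same assumptions as above (the requests library, a local base URL); load_file_into_table() is a hypothetical stand-in for your actual loading logic.

import requests

BASE_URL = "http://localhost:3000"  # assumption: your Echelon API
JOB = "load_salesforce_customer_staging_table"

def load_file_into_table(source: dict, target: dict) -> tuple[int, int]:
    # Hypothetical placeholder for your actual loading logic;
    # returns (source record count, inserted record count).
    return 100, 100

# 1. Start the run and capture its id.
run_id = str(requests.post(f"{BASE_URL}/job/run", json={"name": JOB}).json()[0]["run_id"])

try:
    # 2. Resolve source and target configuration from the metadata.
    source = requests.get(f"{BASE_URL}/entity/constant",
                          params={"name": "salesforce_customer_file", "jq": "from_entries"}).json()
    target = requests.get(f"{BASE_URL}/entity/constant",
                          params={"name": "salesforce_customer_staging_table", "jq": "from_entries"}).json()

    # 3. Do the actual work.
    source_count, insert_count = load_file_into_table(source, target)

    # 4. Record flow metrics so reconciliation has something to compare.
    for label, count in [("source_count", source_count), ("insert_count", insert_count)]:
        requests.post(f"{BASE_URL}/job/run/flow",
                      json={"id": run_id, "job": JOB, "function": "load()",
                            "label": label, "count": str(count)})

    # 5. Reconcile, then mark the run as completed.
    requests.post(f"{BASE_URL}/job/run/reconciliation", json={"id": run_id})
    requests.put(f"{BASE_URL}/job/run", json={"id": run_id, "status": "completed"})
except Exception:
    requests.put(f"{BASE_URL}/job/run", json={"id": run_id, "status": "failed"})
    raise

# 6. Post-run proactive monitoring.
requests.post(f"{BASE_URL}/job/run/monitoring", json={"id": run_id})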