Virtual Schema Common Document Files lets you query data stored in document files as if it were stored in a regular Exasol database table.
This module is part of a larger project called Virtual Schemas, which covers document-based as well as JDBC-based dialects; see the complete list of dialects.
Document-based virtual schemas are characterized by
- a storage, which is the container that hosts the document files and defines the access control and the type of account needed to access them, and
- a document type, which defines the format of the documents containing the data.
You cannot use this adapter directly. Instead, use one of the dialects for the specific storage variants below.
If this list does not contain your file source, you can implement your own file source.
Each storage variant can contain documents using any of the following supported document types:
- JSON
- JSON-Lines (one JSON document per line)
- Parquet
- CSV
You can also add support for other document types.
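To make the JSON-Lines layout from the list above concrete, here is a minimal Java sketch that treats each non-blank line of a file's content as one raw JSON document. The class `JsonLinesSketch` and its method `splitDocuments` are hypothetical names made up for this illustration; they are not part of this project and do not reflect how the adapter actually reads files. The sketch does not parse or validate the JSON.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of this project's API.
public class JsonLinesSketch {

    /**
     * Splits JSON-Lines content into one raw JSON string per non-blank line.
     * The strings are returned as-is; no JSON parsing or validation happens here.
     */
    public static List<String> splitDocuments(final String content) {
        return content.lines()               // String.lines() requires Java 11 or later
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(final String[] args) {
        // Two documents, one per line (sample data invented for this sketch).
        final String jsonLines = "{\"id\": 1, \"name\": \"book\"}\n"
                + "{\"id\": 2, \"name\": \"pen\"}\n";
        for (final String document : splitDocuments(jsonLines)) {
            System.out.println(document);
        }
    }
}
```

In contrast, a plain JSON file holds a single document, which may span many lines, so it cannot be split line by line in this way.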
VSDF builds and publishes a test-jar with common integration tests for document-oriented virtual schemas that any derived virtual schema can use. The derived virtual schema only needs to extend the class `com.exasol.adapter.document.files.AbstractDocumentFilesAdapterIT` to inherit all common integration tests.
`AbstractDocumentFilesAdapterIT` also contains performance regression tests tagged with `regression`.
The following changes to the performance regression tests might affect the comparability of test results:
- Version 7.3.1
  - CSV tests now use all six data types (string, boolean, integer, double, date and timestamp) instead of only string. The column count is unchanged.
  - Test names in the test report changed. They now use the suffix `()` instead of `(TestInfo)`.