-
Notifications
You must be signed in to change notification settings - Fork 2
CSV Preparation
The input data needs to be prepared in valid CSV (Comma separated values) files .
⚠️ Note As by RFC4180 and the name of Comma Separated Values, the use of a comma is suggested as delimiter if you have the choice. If your export does only allow other delimiters you have the possibility to also use other delimiters if needed in Cube Creator.
-
UTF8 Encoding
All CSVs must be in UTF8 encoding, otherwise special characters will wrongly displayed. On Windows we recommend Notepad to detect and change encoding. (The easiest way to tell if it's UTF8 is to open the file in notepad.exe and click "Save as". Whatever encoding is proposed is the actual encoding of your CSV.)
-
No empty lines
Be especially careful to remove empty lines at the bottom of the file. -
Syntactic Validity
-
Quotes within a string must be doubled (
"Hans ""Johnny"" Müller"
) - Float, Decimal: Decimal character is
.
, thousands separators are not supported. - If the string contains the separator character or quotes then it must be enclosed in quotes (
"Hans, Heiri"
).
The following tool can check the syntactic validity of your CSV: https://zazuko.com/csv-validate/ (This tool works inside your browser and thus might need, depending your machine, some time for bigger CSV files.)
For a more in depth description of syntactic validity please refer to RFC4180.
-
Quotes within a string must be doubled (
It is important to make sure the values of a column are in the correct format . Even if the input format is not correct, the data can be imported to the Cube Creator. Only after the mapping, the correctness of the data type will be validated. In-correct data types will not allow the correct visualisation of your data later on.
-
Dates and Time
-
Date
: Data must be in xsd:date format (example:2001-01-31
) -
Time
: Data must be in xsd:time format (example:17:30:00
) -
DateTime
: Data must be in xsd:datetime format (example:2001-01-31T17:30:00
)
-
-
Boolean value
-
Boolean
: Data must be in xsd:boolean format. (example:true
,false
)
-
-
Geo Coordinates
In general are geo spatial data (shapes, linkes, points) solved by using concepts like municipalities or measurement stations which are provided through shared dimensions. It is only possible to provide coordinates directly with the input data, which needs to be decimal WGS84 coordinates. (E.g. 46.7937, 7.1589). If you have LV95, or LV03 coordinates need to be converted upfront. See also NAVREF.
-
Units
Make sure that the the rows with the input data should not feature any units (e.g. °C or kg) as they are added afterwards explicitly as metadata per dimension.
-
Multilingual Texts
If a dimension is nominal (names) then all 4 languages need to be provided in the CSV if necessary to show them correctly. One language needs to be in one column. (Metadata on the other hand can be added in all languages in Cube Creator itself.)
If there are missing values they must be an empty field in the CSV.
A cube should not have any missing values in general. It is possible to have some random missing values (measurement errors). But it is not the goal to have systematically missing values e.g. for a combination of key dimensions. In case you have systematic missing values, e.g. no data for a dimension before a certain date, you might consider creating multiple cubes, with the same input files.
The input values which are used as identifiers must be complete and unique for each column / observation.
Excel uses your system defaults for CSV export and there is no way you could specify the separator character during export. (How to change the delimiter (on Windows) is described here: https://www.koskila.net/how-to-change-the-delimiter-when-exporting-csv-from-excel/)
While Cube Creator can basically handle different field delimiters the "Replace CSV" feature can't. So you might make sure you've got the right one from the beginning.
Depending on the input data it might be necessary to change the format of your input data to be easier read by Cube Creator. See the following explanation (in German) how to move to a flat table: Excel-Anleitung_Umwandlung_einer_Kreuztabelle_in_eine_flache_Tabelle.docx