Scripts to transcode UOE questionnaires between different DSD versions

LucaGramaglia/UOE-Transcoding

Purpose

This GitHub repository contains a script to transcode UOE questionnaire files between different DSD versions. The script takes a CSV file structured according to a source version of the UOE DSDs (2017 or 2019) and transcodes it to a target DSD version. It produces both CSV and SDMX-ML Compact output.

Requirements

In order to run the programme, you will need:

  • An installation of R on your PC (version 3.3.1 or above).
  • The external XML package for R, downloaded and installed.
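
Both requirements can be checked from an R console. The lines below are a minimal sketch using only base R functions (getRversion, requireNamespace); the variable names and message wording are illustrative and not part of the programme.

```r
# Check that the running R version meets the stated minimum (3.3.1).
r_ok <- getRversion() >= "3.3.1"

# Check whether the XML package is installed, without attaching it.
xml_ok <- requireNamespace("XML", quietly = TRUE)

if (!r_ok) {
  message("R ", as.character(getRversion()), " is older than 3.3.1; please upgrade.")
}
if (!xml_ok) {
  message("XML package not found; install it with install.packages(\"XML\").")
}
```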

Description of the contents of the programme

The programme downloaded contains the following elements:

  • Mapping Algorithm.R script: this file contains the R script which transcodes the input files structured according to a source DSD version into files structured according to a target DSD version.
  • params.csv file: this file contains the set of parameters needed by the script: the code of the questionnaire to be transcoded, the name of the DSD to be used, the source DSD version and the target DSD version.
  • Input folder: this folder contains the input CSV files structured according to the source DSD. The GitHub repository contains a sample input file.
  • Output-CSV folder: this folder is the place where the R script will generate the transcoded files in CSV format. The GitHub repository contains a sample output file.
  • Output-XML folder: this folder is the place where the R script will generate the transcoded files in SDMX-ML Compact format. The GitHub repository contains a sample output file.
  • Maps and Template-XML folders: these folders contain all the information needed by the script in order to apply the correct transcoding and generate correct SDMX-ML files.

Configuring the programme to run correctly on your computer

Once you have downloaded and unzipped the contents of this GitHub repository on your own computer, you must tell the R script where you have unzipped the programme so that it knows where it can find the different input / output folders it expects.

To do this, open the Mapping Algorithm.R script with any text editor. At the beginning of the script, you will find a variable assignment of the form path <- "....." (see screenshot below). Replace the default value of path with the path to the working directory in which you have unzipped the programme, then save the file.

[Screenshot: the path variable at the beginning of the Mapping Algorithm.R script]
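
Before running the script, it can help to confirm that the directory you set in path really contains the folders described above. The following is a sketch only: missing_folders is a hypothetical helper, not part of the programme, and the folder names are assumed to match the contents list exactly.

```r
# Folder names taken from the contents list above (assumed to be exact).
expected_folders <- c("Input", "Output-CSV", "Output-XML", "Maps", "Template-XML")

# Hypothetical helper: returns the expected folders that are missing under `root`.
missing_folders <- function(root) {
  present <- dir.exists(file.path(root, expected_folders))
  expected_folders[!present]
}

# Demonstration on a throwaway directory that contains only two of the folders.
demo_root <- file.path(tempdir(), "uoe-demo")
dir.create(file.path(demo_root, "Input"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(demo_root, "Maps"), showWarnings = FALSE)
print(missing_folders(demo_root))
```

Calling missing_folders with your own working directory should return an empty character vector if the repository was unzipped completely.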

The params.csv file also contains some parameters that need to be set prior to running the script:

  • DSD: The ID of the DSD to be used must be provided. There are two possible values: UOE_NON_FINANCE and UOE_FINANCE.
  • datasetID: The ID of the questionnaire must be provided. Possible values are ENRL, ENTR, PERS, DEM, CLASS, GRAD, and FIN.
  • SourceYear: The year indicating the source DSD version to be used (i.e. the version of the DSD according to which the input files are structured). Possible values are 2017 and 2019.
  • TargetYear: The year indicating the target DSD version to be used (i.e. the version of the DSD according to which the output files are structured). Possible values are 2017 and 2019.
  • CSVoutput (optional): Boolean value indicating whether the script should output the transcoded file in CSV format. The default value is TRUE.
  • XMLoutput (optional): Boolean value indicating whether the script should output the transcoded file in SDMX-ML format. The default value is TRUE.
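
The allowed values listed above can be checked before a run. The sketch below assumes the parameters have already been read into a named list; how params.csv is laid out on disk is not described here, so the reading step is left out. check_params is a hypothetical helper, not a function of the script.

```r
# Allowed values as listed above.
allowed <- list(
  DSD        = c("UOE_NON_FINANCE", "UOE_FINANCE"),
  datasetID  = c("ENRL", "ENTR", "PERS", "DEM", "CLASS", "GRAD", "FIN"),
  SourceYear = c("2017", "2019"),
  TargetYear = c("2017", "2019")
)

# Hypothetical helper: returns a character vector of problems (empty if none).
check_params <- function(params) {
  problems <- character(0)
  for (name in names(allowed)) {
    value <- params[[name]]
    if (is.null(value)) {
      problems <- c(problems, paste0(name, " is missing"))
    } else if (!(as.character(value) %in% allowed[[name]])) {
      problems <- c(problems, paste0(name, " has unsupported value '", value, "'"))
    }
  }
  problems
}

# Illustrative parameter sets (made up for this example).
good <- list(DSD = "UOE_NON_FINANCE", datasetID = "ENRL", SourceYear = 2017, TargetYear = 2019)
bad  <- list(DSD = "UOE_FINANCE", datasetID = "XYZ", SourceYear = 2017)

print(check_params(good))
print(check_params(bad))
```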

How to use the programme

To use the programme, drop one or more files structured according to the source DSD indicated in the params.csv file into the Input folder. You can then run the Mapping Algorithm.R script. To do this, open your R application and type the source command: source("yourpath/Mapping Algorithm.R"). See screenshot below.

[Screenshot: the source command typed in the R console]

The script will generate the transcoded files in CSV format in the Output-CSV folder and in SDMX-ML Compact format in the Output-XML folder.
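
The run can also be wrapped in a small function that checks the script is where you expect it and then lists what was produced. This is a sketch under assumptions: run_transcoding is a hypothetical helper, and it presumes the path variable inside the script has already been set to the same directory.

```r
# Hypothetical helper: sources the script found under `root` and returns the
# files present in the two output folders afterwards.
run_transcoding <- function(root) {
  script <- file.path(root, "Mapping Algorithm.R")
  if (!file.exists(script)) {
    stop("Script not found: ", script)
  }
  source(script)
  list(
    csv = list.files(file.path(root, "Output-CSV")),
    xml = list.files(file.path(root, "Output-XML"))
  )
}

# Usage (replace the placeholder with your own working directory):
# outputs <- run_transcoding("yourpath")
# print(outputs)
```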

Potential improvements

One possible improvement is to use a YAML file rather than a CSV file for the parameters. This would, however, add a dependency on the external yaml package for R.
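
To illustrate what such a parameter file might look like, the sketch below parses a YAML string with the yaml package's yaml.load function. The key names mirror the params.csv parameters; the YAML layout itself is an assumption, since the script does not currently support it.

```r
# Hypothetical YAML equivalent of params.csv (the layout is an assumption).
yaml_text <- "
DSD: UOE_NON_FINANCE
datasetID: ENRL
SourceYear: 2017
TargetYear: 2019
CSVoutput: true
XMLoutput: true
"

# The yaml package is optional, so check for it before parsing.
if (requireNamespace("yaml", quietly = TRUE)) {
  params <- yaml::yaml.load(yaml_text)
  str(params)
} else {
  params <- NULL
  message("yaml package not installed; install it with install.packages(\"yaml\").")
}
```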

Error messages and validation of the input parameters could also be improved.
