Slocum Glider Data File Primer
The core classes of the SPT toolbox, Dbd and DbdGroup, create instances of native Slocum glider data files from one of 2 types of files:
- dba: dba data files are ascii data files decoded from the original binary data file and contain a header with file metadata.
- .m & .dat: Pairs of segment data files used to facilitate the loading of the segment data using Matlab. The .m file is a script file used to load the data stored in the .dat sibling. These file pairs are missing much of the metadata found in the dbd header.
This document presents a thorough discussion of the Slocum glider native file formats and presents examples of the typical steps for renaming, decoding and merging them in preparation for downstream processing. While either of the above formats may be used to create instances of the Dbd and DbdGroup classes, I recommend the single-file dba format as these files contain more extensive metadata records.
The Slocum glider is built with 2 processors. The first processor, referred to as the flight processor, is used for flight navigation and logging of all engineering sensor data. The second processor, referred to as the science processor, is used to control the science package instruments and log all scientific data sets.
The glider stores all sensor data in a set of 6 files encoded using the dinkum binary data format. This format consists of an ascii header followed by binary encoded data. The official, detailed documentation on this format can be found here. The six files can be further divided into 3 file pairs, listed below. Each pair consists of a file used to store the glider's flight data and a file used to store the science datasets. If you're interested in the detailed background on this, it can be found here.
During a deployment, the sbd or tbd files are transferred to shore, where they are renamed, decoded and merged into a single file used for real-time data processing. These file types typically contain a subset of the glider sensor data to allow for transfer of the files while minimizing the time the glider spends on the surface. Once the glider is recovered, the remaining (dbd, ebd and, optionally, the mbd and nbd) files are transferred from the glider and provide a full record of the data set.
The following is a brief description of each of the 6 file types created by a Slocum glider during each segment of a mission. Glider missions are composed of segments. The glider uses an 8.3 file naming scheme to store data files. Each segment's files are named using the mission number (first 4 digits) followed by the segment number (last 4 digits), both of which are zero-based. For example, the following file:
04940000.sbd
is the first segment (0000) of the 495th mission (mission number 0494, zero-based). The mission number ranges from 0000 to 9999 and is assigned on the day that the mission begins. This number is not incremented until a new mission is started. The segment number also ranges from 0000 to 9999 and is incremented each time a new segment begins. These 8.3-named files are typically renamed once they are transferred to shore using the rename_dbd_files utility provided by TWRC. The renamed files use a significantly more useful format, described below.
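The 8.3 naming scheme described above can be sketched in Python as follows. This is a minimal illustration rather than part of the SPT toolbox or the TWRC utilities; the names `SegmentName` and `parse_83_name` are hypothetical.

```python
import re
from typing import NamedTuple


class SegmentName(NamedTuple):
    mission: int    # zero-based mission number (first 4 digits)
    segment: int    # zero-based segment number (last 4 digits)
    file_type: str  # lower-cased extension, e.g. "sbd"


# Eight digits, a dot, and a three-letter extension.
_NAME_83 = re.compile(r"(\d{4})(\d{4})\.([A-Za-z]{3})")


def parse_83_name(filename: str) -> SegmentName:
    """Split an 8.3 glider file name into mission, segment and file type."""
    match = _NAME_83.fullmatch(filename)
    if match is None:
        raise ValueError(f"not an 8.3 glider data file name: {filename!r}")
    mission, segment, ext = match.groups()
    return SegmentName(int(mission), int(segment), ext.lower())


print(parse_83_name("04940000.sbd"))
# SegmentName(mission=494, segment=0, file_type='sbd')
```

Because both numbers are zero-based, mission 494 is the 495th mission and segment 0 is its first segment.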
- dbd: dbd files contain the full-resolution sensor data gathered by the flight controller. The size of these files prohibits them from being transferred during a deployment, particularly if done over the Iridium satellite link.
- ebd: ebd files contain the full-resolution sensor data gathered by the science controller. As with the dbd files, the size of these files prohibits real-time transfer to shore.
- sbd: sbd files contain a subset of the dbd file contents, as configured by the sbdlist.dat. An example of this file can be found here. These files are configured to enable transfer of the files during the deployment regardless of the communication link.
- tbd: tbd files contain a subset of the ebd file contents, as configured by the tbdlist.dat. An example of this file can be found here.
- mbd: mbd files contain a subset of the dbd file contents, as configured by the mbdlist.dat. An example of this file can be found here. These files are typically used to provide an adequate sampling of the flight data in the event that diagnosis of glider behavior is warranted during a deployment. The files, while larger than the sbd file, can be transferred from a deployed glider.
- nbd: nbd files contain a subset of the ebd file contents, as configured by the nbdlist.dat. An example of this file can be found here.
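The six file types above can be summarized as a small lookup table. The following Python sketch only restates the list; the names `FileType`, `FILE_TYPES` and `companion` are hypothetical and do not come from the SPT toolbox.

```python
from typing import NamedTuple, Optional


class FileType(NamedTuple):
    processor: str            # "flight" or "science"
    resolution: str           # "full" or "subset"
    list_file: Optional[str]  # file that configures the subset, if any


FILE_TYPES = {
    "dbd": FileType("flight", "full", None),
    "ebd": FileType("science", "full", None),
    "sbd": FileType("flight", "subset", "sbdlist.dat"),
    "tbd": FileType("science", "subset", "tbdlist.dat"),
    "mbd": FileType("flight", "subset", "mbdlist.dat"),
    "nbd": FileType("science", "subset", "nbdlist.dat"),
}

# Each pair is (flight file, science file).
_PAIRS = (("dbd", "ebd"), ("sbd", "tbd"), ("mbd", "nbd"))


def companion(ext: str) -> str:
    """Return the other member of the flight/science pair for an extension."""
    ext = ext.lower()
    for flight, science in _PAIRS:
        if ext == flight:
            return science
        if ext == science:
            return flight
    raise ValueError(f"unknown glider file type: {ext!r}")


print(companion("sbd"))  # tbd
```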
IMPORTANT: I've written a shell script that wraps the executables detailed below and greatly simplifies the file renaming, conversion and merging. You can find it here. The following is a thorough discussion of the tools provided by TWRC, should you choose to go this route on your own.
The following utilities are provided by TWRC for renaming, decoding, filtering and merging binary data files to their ascii equivalents:
- rename_dbd_files: renames the 8.3 filename to its more informative format
- dbd2asc: decodes the 8.3 binary file (requires the correct ASCII sensor list header) to its ASCII equivalent. The result is printed to STDOUT.
- dba_sensor_filter: Takes the output of dbd2asc and filters out any sensors not contained in the specified sensor list. The sensor list is a list of whitespace delimited sensors to include.
- dba_merge: Merges the 2 files of a segment file pair (i.e., dbd & ebd or sbd & tbd) by timestamp (m_present_time and sci_m_present_time).
- dba2_orig_matlab: Takes the output of either dbd2asc or dba_sensor_filter and writes a pair of files for importing into the Matlab programming environment.
Recent versions of these utilities are available from the secure TWRC site. I also host recent versions for Linux and Windows systems.
These utilities are developed as unix-style utilities that can accept the output (STDOUT) from one utility as the input (STDIN) to the subsequent utility. Let's take a look at how the utilities are used to convert binary data files.
The following is an end to end example of renaming, decoding, merging and writing binary Slocum glider data file segment pairs to the merged ASCII equivalent. I will also discuss the use of optional filters for modifying the contents of the resulting files and their final formats.
Assuming we have 2 files, 00500000.EBD and 00500000.DBD, the 8.3 binary files are renamed to their full filename equivalents as follows:
kerfoot-lin: tmp > ~/bin/twrc/rename_dbd_files 00500000.[DE]BD
ru28-2013-197-5-0.dbd
ru28-2013-197-5-0.ebd
The file is renamed according to the following conventions:
- ru28: name of the glider
- 2013: year the mission was started
- 197: julian day (zero-based) on which the mission was started
- 5: sequential mission number (zero-based) started on day 197
- 0: sequential segment number (zero-based) of the current mission number
The Dockserver application uses this utility internally to rename all files that have been completely transferred, but can also be used from the command line.
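The renamed-file convention above can be unpacked with a few lines of Python. This is a sketch under the conventions stated in this document (in particular, that the day number is zero-based, so day 0 is January 1); `RenamedFile`, `parse_renamed` and `mission_start` are hypothetical names.

```python
from datetime import date, timedelta
from typing import NamedTuple


class RenamedFile(NamedTuple):
    glider: str     # name of the glider
    year: int       # year the mission was started
    day: int        # day of year (zero-based) the mission was started
    mission: int    # zero-based mission number on that day
    segment: int    # zero-based segment number of the mission
    file_type: str  # lower-cased extension


def parse_renamed(filename: str) -> RenamedFile:
    """Split a name such as ru28-2013-197-5-0.dbd into its components."""
    stem, _, ext = filename.rpartition(".")
    # Split from the right so a glider name containing "-" stays intact.
    glider, year, day, mission, segment = stem.rsplit("-", 4)
    return RenamedFile(glider, int(year), int(day), int(mission),
                       int(segment), ext.lower())


def mission_start(info: RenamedFile) -> date:
    """Calendar date of the mission start, assuming day 0 is January 1."""
    return date(info.year, 1, 1) + timedelta(days=info.day)


info = parse_renamed("ru28-2013-197-5-0.dbd")
print(info.glider, info.mission, info.segment)  # ru28 5 0
print(mission_start(info))                      # 2013-07-17
```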
The next step in this process is to decode each binary file to ASCII and, optionally, filter out unwanted sensors. For dbd & ebd pairs, sensor filtering is often used as the majority of sensors are of no use in scientific analysis. This must be done on a per-file basis and is done differently depending on whether the full ASCII header used in the decoding is contained in the file. The presence or absence of this header is controlled by the u_dbd_sensor_list_xmit_control (dbd files) and u_sci_dbd_sensor_list_xmit_control sensors.
If the full ASCII header is contained in the file, use the following to decode the binary file to ASCII and include all sensors in the file:
kerfoot-lin: tmp > ~/bin/twrc/dbd2asc ru28-2013-197-5-0.dbd > ru28-2013-197-5-0.dba
An error will be returned if the full ASCII header was not contained in the file:
kerfoot-lin: tmp > ~/bin/twrc/dbd2asc ru28-2013-197-5-0.dbd
Error, ignoring: file:ru28-2013-197-5-0.dbd Can't open cache file ./cache/868d75a7.cac
Nothing to process!
In this case, you can use the -c option with dbd2asc and specify the location of the header file:
kerfoot-lin: tmp > ~/bin/twrc/dbd2asc -c /home/coolgroup/gliderData/glider_cache ru28-2013-197-5-0.dbd > ru28-2013-197-5-0.dba
Assuming the full header is in the original binary file, use of this option creates the header file and writes it to the specified directory for future use.
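When wrapping dbd2asc in a script, it can be useful to detect the missing-header error shown above so the caller knows to retry with the -c option. The following Python sketch simply searches captured output for the error text quoted above; the function name `missing_cache_file` is hypothetical, and whether dbd2asc writes this message to STDOUT or STDERR is not stated here, so the function takes plain text.

```python
import re
from typing import Optional

# Matches the error text quoted above and captures the cache file path.
_CACHE_ERROR = re.compile(r"Can't open cache file (\S+)")


def missing_cache_file(output: str) -> Optional[str]:
    """Return the cache file dbd2asc could not open, or None if no error."""
    match = _CACHE_ERROR.search(output)
    return match.group(1) if match else None


captured = (
    "Error, ignoring: file:ru28-2013-197-5-0.dbd "
    "Can't open cache file ./cache/868d75a7.cac\n"
    "Nothing to process!\n"
)
print(missing_cache_file(captured))  # ./cache/868d75a7.cac
```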
The resulting ASCII file, ru28-2013-197-5-0.dba, contains all of the flight controller sensors and is 6.7 Mb. We can remove the majority of the useless (at least for scientific analysis purposes) sensors by piping the output of dbd2asc to dba_sensor_filter, along with the list of sensors to include, before redirecting to the filename. Here's how:
kerfoot-lin: tmp > ~/bin/twrc/dbd2asc -c /home/coolgroup/gliderData/glider_cache ru28-2013-197-5-0.dbd | ~/bin/twrc/dba_sensor_filter -f /home/coolgroup/auvs/auvMeta/standard_sensors.txt > ru28-2013-197-5-0.dba
where /home/coolgroup/auvs/auvMeta/standard_sensors.txt contains the following list of sensors (comments begin with #):
# Always include this!
m_present_time
# Tells us if the glider thinks it's at the surface
m_appear_to_be_at_surface
# Steering parameters
m_fin
c_fin
m_pitch
c_pitch
m_roll
c_roll
m_heading
c_heading
# Glider health
m_coulomb_amphr
m_battery
m_vacuum
# Iridium parameters
m_iridium_call_num
m_iridium_signal_strength
m_iridium_redials
# Depth parameters
m_depth
m_pressure
m_altitude
m_water_depth
# Depth-averaged currents
m_final_water_vx
m_final_water_vy
m_water_vx
m_water_vy
# GPS sensors
c_wpt_lat
c_wpt_lon
m_gps_lat
m_gps_lon
m_lat
m_lon
m_gps_status
m_gps_full_status
# Science sensors
sci_ctd41cp_timestamp
sci_m_present_time
m_science_clothesline_lag
# CTD sensors
sci_water_pressure
sci_water_cond
sci_water_temp
# GliderDOS version
x_software_ver
This results in a 210 Kb file containing all of the sensors listed above (non-existent sensors are ignored).
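Before handing a sensor list to dba_sensor_filter, it can help to check its contents. The Python sketch below reads a list in the format described above (whitespace-delimited names, comments beginning with #) and checks that m_present_time is present. It is only an illustration of the file format, not a description of how dba_sensor_filter parses the file; `read_sensor_list` and `missing_required` are hypothetical names.

```python
def read_sensor_list(text):
    """Return sensor names in order of first appearance, skipping comments."""
    sensors = []
    for line in text.splitlines():
        uncommented = line.split("#", 1)[0]  # drop everything after '#'
        for name in uncommented.split():     # names are whitespace-delimited
            if name not in sensors:
                sensors.append(name)
    return sensors


def missing_required(sensors, required=("m_present_time",)):
    """Return the required sensors that are absent from the list."""
    return [name for name in required if name not in sensors]


example = """\
# Always include this!
m_present_time
# Depth parameters
m_depth m_pressure
"""
sensors = read_sensor_list(example)
print(sensors)                    # ['m_present_time', 'm_depth', 'm_pressure']
print(missing_required(sensors))  # []
```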
Next, we convert the corresponding ebd file in the same manner:
kerfoot-lin: tmp > ~/bin/twrc/dbd2asc -c /home/coolgroup/gliderData/glider_cache ru28-2013-197-5-0.ebd | ~/bin/twrc/dba_sensor_filter -f /home/coolgroup/auvs/auvMeta/standard_sensors.txt > ru28-2013-197-5-0.eba
We now need to merge the dba & eba pair into a single ASCII file using dba_merge:
kerfoot-lin: tmp > ~/bin/twrc/dba_merge ru28-2013-197-5-0.dba ru28-2013-197-5-0.eba > ru28-2013-197-5-0.deba
This produces a single merged file, ru28-2013-197-5-0.deba, with a total file size of 914 Kb. This file is one of the 2 filetypes accepted by the Dbd or DbdGroup class:
>> dbd = Dbd('ru28-2013-197-5-0.deba')
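The decode, filter and merge steps above can also be driven from Python instead of the shell. The sketch below builds the same command lines shown above (the -c and -f options are the ones used in this document) and chains them STDOUT to STDIN with the standard subprocess module. The helper names, and the directory arguments passed to them, are hypothetical; this is not the wrapper script mentioned earlier.

```python
import subprocess
from pathlib import Path


def decode_filter_commands(binary_file, bin_dir, cache_dir, sensor_list):
    """Build argv lists for: dbd2asc -c CACHE FILE | dba_sensor_filter -f LIST."""
    bin_dir = Path(bin_dir)
    decode = [str(bin_dir / "dbd2asc"), "-c", str(cache_dir), str(binary_file)]
    sensor_filter = [str(bin_dir / "dba_sensor_filter"), "-f", str(sensor_list)]
    return [decode, sensor_filter]


def merge_command(flight_ascii, science_ascii, bin_dir):
    """Build the argv list for: dba_merge FLIGHT_FILE SCIENCE_FILE."""
    return [[str(Path(bin_dir) / "dba_merge"),
             str(flight_ascii), str(science_ascii)]]


def run_pipeline(commands, out_path):
    """Run commands as a pipe chain and write the final STDOUT to out_path.

    Returns the exit code of each command, in order.
    """
    procs = []
    with open(out_path, "wb") as out:
        upstream = None
        for index, argv in enumerate(commands):
            is_last = index == len(commands) - 1
            proc = subprocess.Popen(
                argv,
                stdin=upstream,
                stdout=out if is_last else subprocess.PIPE,
            )
            if upstream is not None:
                # Close our copy so the upstream process sees a closed pipe
                # if the downstream process exits early.
                upstream.close()
            upstream = proc.stdout
            procs.append(proc)
        return [proc.wait() for proc in procs]
```

With the TWRC utilities installed, calling run_pipeline(decode_filter_commands(...), "ru28-2013-197-5-0.dba") for the dbd file, the same for the ebd file, and then run_pipeline(merge_command(...), "ru28-2013-197-5-0.deba") would mirror the shell commands above. Any non-zero value in the returned list indicates that one of the steps failed.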
Producing the Matlab file pair discussed above is as easy as replacing the redirection with a pipe to dba2_orig_matlab:
kerfoot-lin: tmp > ~/bin/twrc/dba_merge ru28-2013-197-5-0.dba ru28-2013-197-5-0.eba | ~/bin/twrc/dba2_orig_matlab
ru28_2013_197_5_0_sf_dbd.m
Here, 2 files are produced, ru28_2013_197_5_0_sf_dbd.m and ru28_2013_197_5_0_sf_dbd.dat. This is the other file type accepted by the Dbd or DbdGroup class, but remember to specify the .m file, not the .dat sibling:
>> dbd = Dbd('ru28_2013_197_5_0_sf_dbd.m')
I recommend using the ru28-2013-197-5-0.deba file, not the Matlab "equivalents", when creating instances of the Dbd or DbdGroup classes. The .deba file contains the default masterdata sensor units, while the Matlab files do not contain these units, so a default of nodim is used instead.