- TERRA-REF data from Seasons 4 and 6 contained in this repository are licensed under CC-0. Please cite LeBauer et al. (2020) when using these data.
- KSU data are unpublished and should not be reused without permission from Geoff Morris.
- Clemson data are from Brenton et al. (2016).
- This repository is licensed under the MIT License (see the LICENSE file).
LeBauer, David et al. (2020), Data From: TERRA-REF, An open reference data set from high resolution genomics, phenomics, and imaging sensors, v6, Dryad, Dataset, https://doi.org/10.5061/dryad.4b8gtht99
Brenton, Zachary W., et al. "A genomic resource for the development, improvement, and exploitation of sorghum for bioenergy." Genetics 204.1 (2016): 21-33. https://doi.org/10.1534/genetics.115.183947
David LeBauer, University of Arizona, dlebauer@arizona.edu
This repository contains source code for accessing and curating TERRA-REF data used to support the GenoPhenoEnvo project and machine learning research. It supports our goal of providing open data and reproducible code that follow the FAIR data principles and contribute to open science.
This repository focuses on Sorghum bicolor trait data collected from four experiments and the associated weather data for those locations, listed below.
- Maricopa Agricultural Center, University of Arizona, Season 4
- Coordinates: 33.069, -111.972
- Elevation: 362 meters
- Planting: 2017-04-20, Day 110
- Last Day of Harvest: 2017-09-16, Day 259
- Maricopa Agricultural Center, University of Arizona, Season 6
- Coordinates: 33.068941, -111.972244
- Elevation: 362 meters
- Planting: 2018-04-25, Day 115
- Harvest: 2018-08-01, Day 213
- Kansas State University, Ashland Bottoms
- Coordinates: 39.126, -96.677
- Elevation: 325 meters
- Planting: 2016-06-17, Day 169
- Harvest: 2016-10-21, Day 295
- Clemson University Pee Dee Research and Education Center, South Carolina
- Coordinates: 34.289, -79.737
- Elevation: 42 meters
- Planting: 2014-05-06, Day 126
- Latest date in Clemson trait data: 2014-10-15, Day 288
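The "Day" values listed with each planting and harvest date are days of the year. As a minimal sketch (the helper name `day_of_year` is hypothetical and not part of this repository), they can be reproduced from the ISO dates with the Python standard library:

```python
from datetime import date


def day_of_year(iso_date):
    """Return the 1-based day of year for a YYYY-MM-DD date string."""
    return date.fromisoformat(iso_date).timetuple().tm_yday


# Planting and last harvest dates listed above for MAC Season 4.
planting = day_of_year("2017-04-20")
harvest = day_of_year("2017-09-16")
print(planting, harvest)   # 110 259
print(harvest - planting)  # 149 days between planting and last day of harvest
```

Leap years are handled by `datetime`, which matters for the 2016 KSU season.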
Trait data prepared for this analysis can be downloaded in .csv format from CyVerse.
MAC Season 4
mac_season_4_aboveground_dry_biomass.csv
mac_season_4_canopy_height_manual.csv
mac_season_4_days_gdd_to_flowering.csv
mac_season_4_days_gdd_to_flag_leaf_emergence.csv
MAC Season 6
KSU
Clemson
These tables have the following structure ...
The following traits were selected from the raw data for analysis. Each is listed with its units and the sites that collected that phenotype. The calculation used for growing degree days (gdd) can be found here.
days_to_flowering: days, gdd
- MAC Season 4, KSU, Clemson

days_to_flag_leaf_emergence: days, gdd
- MAC Season 4

canopy_height: cm
- MAC Season 4, MAC Season 6, KSU, Clemson

aboveground_dry_biomass: kg/ha
- MAC Season 4, MAC Season 6, Clemson
mac_season_4_data_cleaning.ipynb
mac_season_6_data_cleaning.ipynb
ksu_data_cleaning.ipynb
clemson_data_cleaning.ipynb
Each weather file contains daily weather data covering the season dates of the sorghum experiment at that location, with the following variables:
- Date: YYYY-MM-DD format
- Day of year
- Minimum temperature: Celsius
- Maximum temperature: Celsius
- Mean temperature: Celsius
- Accumulated growing degree days (gdd): heat units
- 10 degrees Celsius is the base temperature for sorghum
- Daily gdd value = ((max temp + min temp) / 2) - 10 (base temperature)
- Accumulated growing degree days = cumulative sum of daily gdd values
- Minimum relative humidity: percentage
- Maximum relative humidity: percentage
- Mean relative humidity: percentage
- Vapor pressure deficit: kilopascals
- es = 6.11 * np.exp((2500000/461) * (1/273 - 1/(273 + temp_avg)))
- vpd = ((100 - rh_avg)/1000) * es
- Precipitation: millimeters
- Cumulative precipitation: millimeters
- First water deficit treatment: boolean value; True values only found in MAC Season 4
- Second water deficit treatment: boolean value; True values only found in MAC Season 4
Information about MAC season 4 water deficit treatments can be found here
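The growing degree day and vapor pressure deficit formulas in the variable list can be sketched as plain Python functions. This is an illustration under stated assumptions, not the repository's processing script: the function names are hypothetical, `math.exp` stands in for `np.exp` because the inputs here are scalars, and `temp_avg` and `rh_avg` are taken to be the mean temperature (Celsius) and mean relative humidity (percent) from the list.

```python
import math
from itertools import accumulate

BASE_TEMP_C = 10.0  # base temperature for sorghum, as stated in the list


def daily_gdd(temp_max, temp_min, base=BASE_TEMP_C):
    """Daily gdd: mean of max and min temperature minus the base temperature."""
    return (temp_max + temp_min) / 2 - base


def accumulated_gdd(temps):
    """Cumulative sum of daily gdd over a sequence of (temp_max, temp_min) pairs."""
    return list(accumulate(daily_gdd(tmax, tmin) for tmax, tmin in temps))


def vapor_pressure_deficit(temp_avg, rh_avg):
    """VPD in kilopascals from mean temperature (Celsius) and mean humidity (%)."""
    # Saturation vapor pressure, same expression as in the list.
    es = 6.11 * math.exp((2500000 / 461) * (1 / 273 - 1 / (273 + temp_avg)))
    return ((100 - rh_avg) / 1000) * es


# Hypothetical three-day record of (max, min) temperatures in Celsius.
example_temps = [(30.0, 14.0), (32.0, 16.0), (28.0, 12.0)]
print(accumulated_gdd(example_temps))  # [12.0, 26.0, 36.0]
```

The formula as written does not clip negative daily values to zero, so this sketch does not either. Some growing degree day conventions apply such a floor.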
- Maricopa (Arizona) Agricultural Center
- Kansas Mesonet
- NASA Daymet - can search by coordinates
- Climate Engine - can search by coordinates
- Weather variables that were not available for all four seasons were dropped during processing but can be accessed in the raw data
- The Python 3 code used to process weather data can be found in src/weather_data_cleaning.py. This script produces the following output data:
- mac_season_4_weather.csv
- mac_season_6_weather.csv
- ksu_weather.csv
- clemson_weather.csv
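The processing note about dropping weather variables that were not available for every season amounts to keeping the columns shared by all four tables. A minimal sketch of that idea follows. The function name, table keys, and column names are all hypothetical and are not taken from src/weather_data_cleaning.py.

```python
def common_columns(tables):
    """Return column names present in every table, in the first table's order.

    `tables` maps a table name to its list of column names and must contain
    at least one entry.
    """
    shared = set.intersection(*(set(cols) for cols in tables.values()))
    first = next(iter(tables.values()))
    return [col for col in first if col in shared]


# Hypothetical column lists; the real raw files may differ.
raw_columns = {
    "mac_season_4": ["date", "temp_min", "temp_max", "wind_speed"],
    "ksu": ["date", "temp_min", "temp_max", "solar_radiation"],
    "clemson": ["date", "temp_min", "temp_max"],
}
print(common_columns(raw_columns))  # ['date', 'temp_min', 'temp_max']
```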
TODO: what does this refer to?
├── LICENSE
│
├── README.md <- The top-level README for developers using this project.
│
├── data
│ ├── external <- Data from third party sources.
│ ├── interim <- Intermediate data that has been transformed.
│ ├── processed <- The final, canonical data sets for modeling.
│ └── raw <- The original, immutable data dump.
│
├── notebooks <- Jupyter notebooks.
│
├── references <- Data dictionaries, manuals, and all other explanatory materials.
│
├── scripts <- Source code for use in this project.