WholeTale Summer Internship 2018

Taxonomy alignment as a key to enhance reproducibility in biodiversity research: a case study of Magnolia

Author: Yi-Yun Cheng (Jessica), University of Illinois at Urbana-Champaign

Mentors: Dr. Bertram Ludaescher, UIUC; Dr. Nico Franz, ASU

Goal of this project

In biodiversity research, we often expect the scientific names of species to serve as unique identifiers, but in practice they may not. Why is that?

(1) The scientific names can vary over time

(2) The names stay the same, but the semantics of the names change

Other complicated issues:

(1) Different people may have different perceptions of the taxonomy of the same topic

(2) Species distribution datasets often include only a species ‘name’, without crediting the authorship of the taxonomy that the name follows

This is why there is a pressing need to align different taxonomies that address the same topic: not only to make the names more interoperable, but also to make way for further use of the datasets.
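To make the naming problem concrete, here is a minimal Python sketch. It assumes that a concept is identified by a name together with the source that defines it. The `TaxonConcept` class, its fields, and the placeholder sources `taxonomy_A` and `taxonomy_B` are illustrative only and are not part of this repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxonConcept:
    """A name as used by one particular source (hypothetical structure)."""
    name: str    # scientific name string
    source: str  # taxonomy that gives the name its meaning (placeholder values below)


# The same name string as used in two taxonomies (placeholder sources).
older = TaxonConcept(name="Magnolia", source="taxonomy_A")
newer = TaxonConcept(name="Magnolia", source="taxonomy_B")

# Comparing names alone hides the fact that the meaning may differ.
same_name = older.name == newer.name
# Comparing concepts keeps the two usages apart until an expert aligns them.
same_concept = older == newer

print(same_name, same_concept)
```

Treating the pair (name, source) as the identifier is what leaves room for an explicit alignment step: whether the two usages actually mean the same thing is a question for the domain experts, not something a string comparison can settle.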

Overview of the tasks for this project

Step 1: Decide which species (or genus) to examine

Step 2: Domain experts provide a mapping table for the taxonomies used over time for that particular species

Step 3: Researchers transcribe the domain experts’ table into an Euler/X or LeanEuler input file

Step 4: Gather species distribution datasets from biodiversity portals

Step 5: Concept mapping of the taxonomies and create new datasets based on different taxonomies

Step 6: Data cleaning - geocode missing lat-long information

Step 7: Visualizing species co-occurrence distribution & synthesized taxonomy alignment distribution

Step 8: Niche modeling and further analysis
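As a rough illustration of Step 5, the sketch below relabels occurrence records from one taxonomy to another using an alignment table. It is not the code in the ConceptMapping folder. The concept identifiers, the record layout, the relation labels (`equals`, `is_included_in`, `overlaps`), and the `translate` function are all assumptions made for the example.

```python
# Hypothetical alignment rows: (concept in taxonomy A, relation, concept in taxonomy B).
# Concept names and relation labels are made up for illustration.
ALIGNMENT = [
    ("A.species_1", "equals", "B.species_1"),
    ("A.species_2", "is_included_in", "B.species_1"),
    ("A.species_3", "overlaps", "B.species_2"),
    ("A.species_3", "overlaps", "B.species_3"),
]

# Relations under which every record of the A concept also belongs to the B concept.
UNAMBIGUOUS = {"equals", "is_included_in"}


def translate(records, alignment):
    """Relabel occurrence records from taxonomy A to taxonomy B.

    Returns (translated, unresolved): records whose A concept maps cleanly to
    exactly one B concept, and records that need expert review.
    """
    targets = {}
    for a, relation, b in alignment:
        targets.setdefault(a, []).append((relation, b))

    translated, unresolved = [], []
    for record in records:
        options = targets.get(record["concept"], [])
        clean = [b for relation, b in options if relation in UNAMBIGUOUS]
        if len(options) == 1 and len(clean) == 1:
            translated.append({**record, "concept": clean[0]})
        else:
            unresolved.append(record)
    return translated, unresolved


records = [
    {"id": 1, "concept": "A.species_1"},
    {"id": 2, "concept": "A.species_2"},
    {"id": 3, "concept": "A.species_3"},
]
translated, unresolved = translate(records, ALIGNMENT)
print(translated)
print(unresolved)
```

Records whose concept overlaps several target concepts are set aside rather than guessed at, since only additional evidence (for example the specimen itself or its locality) can decide which target concept they belong to.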

Refer to the following for details

Step 5: Concept mapping process. Refer to this notebook.
Step 6: Filling in missing geo-location information. Clone this repository and run geocode.py along with testgeocode.py.
Step 7: Species co-occurrence distribution visualization. Refer to this notebook.

Try plotdata.py to run the code directly.
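For readers who want a feel for Step 6 before opening geocode.py, the following is a minimal sketch of filling in missing coordinates. It does not reproduce geocode.py. The record fields (`locality`, `lat`, `lon`), the `fill_missing_coordinates` function, and the one-entry gazetteer (with approximate coordinates) are assumptions, and the lookup is a plain dictionary standing in for a real geocoding service.

```python
def fill_missing_coordinates(records, lookup):
    """Return copies of records with missing lat/long filled from `lookup`.

    `lookup` is any callable mapping a locality string to (lat, lon) or None.
    """
    filled = []
    for record in records:
        record = dict(record)  # copy so the input records stay untouched
        if record.get("lat") is None or record.get("lon") is None:
            coords = lookup(record.get("locality", ""))
            if coords is not None:
                record["lat"], record["lon"] = coords
                record["geocoded"] = True  # mark coordinates as inferred, not observed
        filled.append(record)
    return filled


# Stand-in gazetteer; a real run would query a geocoding service instead.
GAZETTEER = {"Champaign, Illinois": (40.1164, -88.2434)}

records = [
    {"id": 1, "locality": "Champaign, Illinois", "lat": None, "lon": None},
    {"id": 2, "locality": "Unknown place", "lat": None, "lon": None},
    {"id": 3, "locality": "Anywhere", "lat": 10.0, "lon": 20.0},
]
result = fill_missing_coordinates(records, GAZETTEER.get)
print(result)
```

Flagging inferred coordinates keeps them distinguishable from observed ones, which may matter later when the points are plotted or used for niche modeling.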



idaks/wt-biodiversity-summer-2018
