Emilio Righi edited this page Oct 15, 2024 · 5 revisions

BioGenome Portal

A web-based platform for managing biodiversity genomics metadata.


About the Project

The BioGenome Portal provides a user-friendly platform to showcase, coordinate, and manage biodiversity genomics metadata.

API & Online Instances

Built With

This project is powered by:

Getting Started

To run this application locally, you need to have Docker Compose installed.

Containers Overview

This app is composed of six Docker containers managed via a docker-compose file.

Front End

  • Vue3 SPA compiled with Vite and served via NGINX.
  • Features a user-authenticated Content Management System (CMS) for CRUD operations.
  • Multi-language support.
  • JSON files are used to configure the layout.

Back End

  • A Flask API served by uWSGI that handles requests from the front end, queries the database, and returns JSON or TSV responses.
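
To illustrate the two response formats, here is a minimal sketch of serializing the same records as JSON or TSV. It uses only the Python standard library; the `serialize` helper and the field names in the usage note are hypothetical and are not taken from the portal's code.

```python
import csv
import io
import json


def serialize(records, fmt="json"):
    """Serialize a list of flat dicts as a JSON or TSV string."""
    if fmt == "json":
        return json.dumps(records)
    if fmt == "tsv":
        # Use the union of all keys, in first-seen order, as the header row.
        fields = list(dict.fromkeys(key for rec in records for key in rec))
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer, fieldnames=fields, delimiter="\t",
            lineterminator="\n", restval="",
        )
        writer.writeheader()
        writer.writerows(records)  # missing fields are written as empty cells
        return buffer.getvalue()
    raise ValueError(f"unsupported format: {fmt}")
```

For example, `serialize([{"taxid": "9606", "name": "Homo sapiens"}], "tsv")` produces a header line followed by one tab-separated row.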

Database

  • MongoDB container to store and manage all data.

Cronjob (Optional)

  • A simple container with a crontab file that triggers scheduled jobs via the Back End API.

Celery and Redis

  • A distributed task queue (Celery) and its message broker (Redis).

External APIs

This project relies on external APIs for taxonomic and genomic data. If a change in one of these APIs breaks something, please open an issue.

APIs in use:

CMS Features

The admin area allows the management of various database records:

  • Reads, Biosamples, and Assemblies: Imported via INSDC accession numbers. Imported records cannot be modified, but they can be deleted.
  • Organisms (Taxons): Imported via NCBI taxonomic identifier, or automatically when related metadata is added. You can manually add metadata such as images, vernacular names, publications, and custom attributes.
  • Samples: Import sample metadata from .xlsx files, declaring which columns hold the taxon identifier, scientific name, and sample ID. This feature is useful for managing samples before submission to INSDC. Columns whose names contain "ORCID" are not imported.
  • Annotations: Import genome annotations related to assemblies with download links and metadata.
  • GoaT Reports: Upload GoaT-compliant reports (format here).
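
As a rough illustration of the sample import rules above, the sketch below checks that every row has values in the three declared columns and drops any column whose name contains "ORCID". It works on rows already loaded as dictionaries (reading the .xlsx file itself is left out). The function name, the case-insensitive match, and the column names in the usage note are assumptions; the portal's actual import logic may differ.

```python
def prepare_sample_rows(rows, taxid_col, name_col, sample_id_col):
    """Validate the declared columns and drop ORCID columns.

    `rows` is a list of dicts, one per spreadsheet row, keyed by column name.
    """
    required = (taxid_col, name_col, sample_id_col)
    prepared = []
    for index, row in enumerate(rows, start=1):
        missing = [
            col for col in required
            if row.get(col) is None or str(row.get(col)).strip() == ""
        ]
        if missing:
            raise ValueError(f"row {index} is missing values for: {missing}")
        # Assumption: the "ORCID" match is a case-insensitive substring test.
        prepared.append(
            {key: value for key, value in row.items() if "orcid" not in key.lower()}
        )
    return prepared
```

A call such as `prepare_sample_rows(rows, "taxid", "species", "id")` returns the rows with any ORCID columns removed, or raises `ValueError` at the first row that lacks a declared value.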

Genome Browser

This app includes a genome browser (JBrowse2) for visualizing genomic annotations for chromosome-level assemblies.

Annotation Files

Annotations should be provided as a bgzip-compressed GFF file together with its tabix index:

  • genes.gff.gz
  • genes.gff.gz.tbi

Steps to Generate Files

gt gff3 -sortlines -tidy -retainids genes.gff3 > genes.sorted.gff3
bgzip genes.sorted.gff3
tabix genes.sorted.gff3.gz
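
tabix expects position-sorted input in which all features of a sequence sit together, which is what the sorting step above is meant to produce. As an optional sanity check before compressing, the following sketch scans GFF3 lines and reports the first ordering problem it finds. The `check_gff_sorted` helper is hypothetical and is not part of the portal or of any of the tools above.

```python
def check_gff_sorted(lines):
    """Return None if features are grouped by sequence and sorted by start,
    otherwise a message describing the first problem found."""
    finished = set()   # sequence names whose block has already ended
    current = None
    last_start = 0
    for number, line in enumerate(lines, start=1):
        if line.startswith("##FASTA"):
            break      # an embedded sequence section follows; stop checking
        if not line.strip() or line.startswith("#"):
            continue   # skip blank lines, comments, and directives
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            return f"line {number}: expected 9 tab-separated columns"
        seqid, start = fields[0], int(fields[3])
        if seqid != current:
            if seqid in finished:
                return f"line {number}: {seqid} appears in more than one block"
            if current is not None:
                finished.add(current)
            current, last_start = seqid, 0
        if start < last_start:
            return f"line {number}: start {start} comes before {last_start}"
        last_start = start
    return None
```

It can be run on an open file handle, for example `with open("genes.sorted.gff3") as handle: print(check_gff_sorted(handle))`.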

You can upload these files directly, or link to them if they are hosted on a server that supports HTTP range requests (e.g., AWS).
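
One way to check whether a host honours range requests is to ask for a single byte and inspect the reply: a `206 Partial Content` status, or an `Accept-Ranges: bytes` header, indicates support. The sketch below uses only the Python standard library; `supports_range` and `check_url` are hypothetical helper names, and the URL you pass is your own.

```python
import urllib.request


def supports_range(status, headers):
    """Interpret the reply to a `Range: bytes=0-0` request."""
    if status == 206:  # Partial Content: the server honoured the range
        return True
    lowered = {key.lower(): value for key, value in headers.items()}
    return lowered.get("accept-ranges", "").strip().lower() == "bytes"


def check_url(url, timeout=10):
    """Request the first byte of `url` and report whether ranges are supported."""
    request = urllib.request.Request(url, headers={"Range": "bytes=0-0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return supports_range(response.status, dict(response.headers.items()))
```

Run `check_url(...)` against both the `.gz` file and its `.tbi` index, since the genome browser fetches both.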

If chromosome/sequence names differ between annotations and assemblies, you can provide a tab-separated text file for aliasing. More info here.
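
A minimal sketch of writing such an alias file is shown below. It assumes that each line lists one sequence name followed by its alternative names, separated by tabs, with the assembly's own name in the first column; confirm the exact convention in the JBrowse2 documentation. The `write_alias_file` helper and the names in the usage note are hypothetical.

```python
def write_alias_file(path, aliases):
    """Write one line per sequence: its name, then its aliases, tab-separated.

    `aliases` maps a sequence name to a list of alternative names.
    """
    with open(path, "w", newline="") as handle:
        for name, others in aliases.items():
            handle.write("\t".join([name, *others]) + "\n")
```

For example, `write_alias_file("aliases.txt", {"1": ["chr1"], "MT": ["chrM"]})` writes two lines, one per sequence.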

For more information, visit the JBrowse2 Documentation.

Contributing

Contributions are what make the open-source community an amazing place to learn, inspire, and create. All contributions are greatly appreciated!

To contribute:

  1. Fork the project
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a pull request

Don't forget to give the project a ⭐ if you like it!

License

Distributed under the MIT License. See LICENSE.txt for details.

Contact

Emilio Righi - emilio.righi@crg.eu