Annotation plugin support #426

Open
xiamaz opened this issue Apr 1, 2024 · 7 comments
Assignees
Labels
enhancement New feature or request

Comments

@xiamaz
Contributor

xiamaz commented Apr 1, 2024

Is your feature request related to a problem? Please describe.
Both VEP and open-cravat support plugins, which can extend annotation capabilities without requiring these to be directly integrated into the core software.

Describe the solution you'd like
mehari should offer a plugin interface with at least the features provided by VEP. Ideally, the interface should be VEP-compatible.

Describe alternatives you've considered
Most software supports annotating from custom TSV files, but this might be too limited for most use cases.

Additional context
First, we will need to investigate the approach taken by both VEP and open-cravat for plugin support. Potentially something like wasmer might help, as a wasm intermediate step is used by multiple Rust projects to allow for easy plugin integration without putting strong constraints on either programming language or environment.

@xiamaz xiamaz added the enhancement New feature or request label Apr 1, 2024
@xiamaz xiamaz moved this to In review in Release Planning Apr 1, 2024
@xiamaz xiamaz moved this from In review to In progress in Release Planning Apr 1, 2024
@xiamaz
Contributor Author

xiamaz commented Apr 1, 2024

VEP Plugins

Supported language: perl

Approach

Plugins are run for each line of input, before anything is printed to the output file. Each call receives an object that provides the variant allele and the overlapping genomic features.

Plugins need to implement new, get_header_info and run. On calling run, the return value is the additional info to be added to the entry.
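
The three-method contract described above can be mirrored in a few lines. This is only a Python analogy for illustration: real VEP plugins are Perl modules, and the class name, the record layout, and the `EXAMPLE_LEN` key are all hypothetical.

```python
class AlleleLengthPlugin:
    """Hypothetical plugin mirroring the new / get_header_info / run contract."""

    def __init__(self, params=None):
        # Counterpart of `new`: keep any user-supplied configuration.
        self.params = params or {}

    def get_header_info(self):
        # Map each output key to a description for the output header.
        return {"EXAMPLE_LEN": "Length of the variant allele"}

    def run(self, variant):
        # Called once per input line; the returned dict is the additional
        # info to be added to the entry.
        return {"EXAMPLE_LEN": len(variant["allele"])}


def annotate(variants, plugin):
    """Run the plugin on every variant before anything is written out."""
    annotated = []
    for variant in variants:
        entry = dict(variant)              # copy, so the input stays untouched
        entry.update(plugin.run(variant))  # merge the plugin's return value
        annotated.append(entry)
    return annotated
```

A host would call `get_header_info` once to extend the output header and then `run` for every line, which is the part a mehari interface would presumably need to reproduce.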

Implementation concerns

Directly supporting Perl-based plugins would require either integrating a Perl FFI interface into Rust (complicated) or looking into Perl-to-wasm compilation, which might work (https://perlwasm.github.io/).

@xiamaz
Contributor Author

xiamaz commented Apr 1, 2024

open-cravat plugins

Supported language: python

Approach

open-cravat supports modular annotators for a large number of annotation scores. Otherwise, plugin functionality is pretty similar to VEP.

Implementation concerns

PyO3 can be used. Compiling Python to wasm is also not well supported.

@xiamaz
Contributor Author

xiamaz commented Apr 1, 2024

Design

Look into https://perldoc.perl.org/perlembed and https://github.com/PyO3/pyo3. We might be able to get some very simple VEP and open-cravat extensions running.

Afterwards we should compare performance of these against e.g. wasm based extensions and potentially offer that as the main plugin approach.

@xiamaz xiamaz self-assigned this Apr 1, 2024
@holtgrewe
Contributor

I looked a bit and now wonder how many plugins we can get to run. For example, the VEP plugins often rely on the "tva" argument, which is a complex data structure. See NMD for a simple VEP plugin.

It might be easier to provide some infrastructure for tabix lookup, then implement some plugins ourselves and crowdsource from then on (after publication).

Overall, our native interface could pass the current VCF record as a JSON serialization plus, say, transcript info as JSON (serde is really cool) and the VCF header as JSON, and return a changed record as JSON.
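
A minimal sketch of that JSON-in, JSON-out exchange, seen from the plugin side. The function name `run_plugin`, the record fields (`chrom`, `pos`, `info`), and the `N_TRANSCRIPTS` key are assumptions made up for illustration; the thread does not define a schema.

```python
import json


def run_plugin(record_json: str, transcripts_json: str, header_json: str) -> str:
    """Hypothetical plugin entry point: three JSON documents in, changed record out."""
    record = json.loads(record_json)
    transcripts = json.loads(transcripts_json)
    header = json.loads(header_json)  # parsed but unused in this toy example

    # Toy annotation: count the transcripts the host handed over.
    info = dict(record.get("info", {}))
    info["N_TRANSCRIPTS"] = len(transcripts)
    record["info"] = info

    # The changed record goes back to the host as JSON.
    return json.dumps(record)
```

Because only strings cross the boundary, the same contract could be served by a plugin written in any language that can parse JSON.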

@holtgrewe
Contributor

What about the following?

We create a native plugin system based on extism. This allows writing plugins in wasm. We pass data through interfaces as JSON for simplicity. We can model interfaces inspired by VEP and cravat.

We implement some core plugins, such as annotation based on annonars/dbsnp, in Rust. We provide a reference implementation of the VEP plugin NMD in Rust and in Python compiled to WASM. We then explore how we can make a wrapper in the wasm layer that allows running the VEP plugin NMD and some basic cravat plugin in Python.

We will be able to create the native interface and the NMD demo in Python and Rust. The exploration can be time-boxed to one day, and we can postpone it if needed. I don't know whether we will be able to expose all of VEP's data structures needed for the plugins.

The strategy above allows for implementing something that should work easily enough, with 98% confidence, and the wrapper layer can be postponed or terminated.
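
To illustrate the host side of this proposal, here is a rough sketch in which plain Python callables stand in for wasm plugin exports. It deliberately does not use the extism API; it only models the assumed shape of such a call (bytes in, bytes out, JSON as payload). The names `add_flag` and `apply_plugins` are hypothetical.

```python
import json
from typing import Callable, List

# A plugin export is modelled as bytes -> bytes, i.e. opaque payloads
# crossing the plugin boundary.
PluginFn = Callable[[bytes], bytes]


def add_flag(key: str) -> PluginFn:
    """Build a toy plugin that sets info[key] = True on the record it receives."""
    def run(payload: bytes) -> bytes:
        record = json.loads(payload)
        record.setdefault("info", {})[key] = True
        return json.dumps(record).encode("utf-8")
    return run


def apply_plugins(record: dict, plugins: List[PluginFn]) -> dict:
    """Serialize the record, pass it through each plugin in order, decode the result."""
    payload = json.dumps(record).encode("utf-8")
    for plugin in plugins:
        payload = plugin(payload)
    return json.loads(payload)
```

Chaining plugins this way re-serializes the record at every step, which is simple but is also the kind of overhead a later performance comparison would have to measure.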

@xiamaz
Contributor Author

xiamaz commented Apr 2, 2024

Sounds good. This keeps the overhead to a minimum and allows us to create a clean plugin interface.

@tedil
Contributor

tedil commented Apr 4, 2024

I have implemented a dummy plugin + calling the plugin from mehari in the plugin-system branch, just to get a feeling for extism.

Projects
Status: In progress
Development

No branches or pull requests

3 participants