Added FAQs section #346

Merged
merged 5 commits into from
Dec 5, 2024
31 changes: 19 additions & 12 deletions README.md
@@ -42,35 +42,42 @@ Full CLI options (check out with ``geneplexus --help``)
```txt
Run the GenePlexus pipline on a input gene list.

optional arguments:
options:
-h, --help show this help message and exit
-i , --input_file Input gene list (.txt) file (one gene per line). (default: None)
-i , --input_file Input gene list (.txt) file. (default: None)
-d , --gene_list_delimiter
Delimiter used in the gene list. Use 'newline' if the genes are separated
by new line, and use 'tab' if the genes are seperate by tabs. Other
generic separator are also supported, e.g. ', '. (default: newline)
-n , --network Network to use. {format_choices(config.ALL_NETWORKS)} (default: STRING)
-f , --feature Types of feature to use. The choices are: {Adjacency, Embedding,
Influence} (default: Embedding)
-g , --gsc Geneset collection used to generate negatives and the modelsimilarities.
The choices are: {GO, DisGeNet} (default: GO)
-s , --small_edgelist_num_nodes
Number of nodes in the small edgelist. (default: 50)
-dd , --data_dir Directory in which the data are stored, if set to None, then use the
default data directory ~/.data/geneplexus (default: None)
-n , --network Network to use. The choices are: {BioGRID, STRING, IMP} (default: STRING)
-f , --feature Types of feature to use. The choices are: {SixSpeciesN2V} (default:
SixSpeciesN2V)
-s1 , --sp_trn Species of training data The choices are: {Human, Mouse, Fly, Worm,
Zebrafish, Yeast} (default: Human)
-s2 , --sp_res Species of results data The choices are: {Human, Mouse, Fly, Worm,
Zebrafish, Yeast} (default: Mouse)
-g1 , --gsc_trn Geneset collection used to generate negatives. The choices are: {GO,
Monarch, Mondo, Combined} (default: GO)
-g2 , --gsc_res Geneset collection used for model similarities. The choices are: {GO,
Monarch, Mondo, Combined} (default: GO)
-s , --small_edgelist_num_nodes
Number of nodes in the small edgelist. (default: 50)
-od , --output_dir Output directory with respect to the repo root directory. (default:
result/)
-l , --log_level Logging level. The choices are: {CRITICAL, ERROR, WARNING, INFO, DEBUG}
(default: INFO)
-ad, --auto_download_off
Turns off autodownloader which is on by default. (default: False)
-q, --quiet Suppress log messages (same as setting log_level to CRITICAL). (default:
False)
-z, --zip-output If set, then compress the output directory into a Zip file. (default:
False)
--clear-data Clear data directory and exit. (default: False)
--overwrite Overwrite existing result directory if set. (default: False)
--skip-mdl-sim Skip model similarity computation. This computation is not yet available
when using custom networks due to the lack of pretrained models for
comparison. (default: False)
--skip-mdl-sim Skip model similarity computation (default: False)
--skip-sm-edgelist Skip making small edgelist. (default: False)
```
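
As a rough illustration of how the updated options fit together, the sketch below assembles a command line from the long option names listed in the help text above. The helper `build_geneplexus_command` and the file name `my_genes.txt` are hypothetical, introduced only for this example; the defaults mirror the ones printed in the help output.

```python
import shlex


def build_geneplexus_command(
    input_file,
    network="STRING",
    sp_trn="Human",
    sp_res="Mouse",
    gsc_trn="GO",
    gsc_res="GO",
    output_dir="result/",
):
    """Assemble a geneplexus CLI call as an argument list.

    Hypothetical helper: it only builds the list and does not run anything.
    The option names are copied from the help text above.
    """
    return [
        "geneplexus",
        "--input_file", input_file,
        "--network", network,
        "--sp_trn", sp_trn,
        "--sp_res", sp_res,
        "--gsc_trn", gsc_trn,
        "--gsc_res", gsc_res,
        "--output_dir", output_dir,
    ]


# "my_genes.txt" is a placeholder input file name.
cmd = build_geneplexus_command("my_genes.txt")
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` once the package and its data are installed.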

# Dev
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -17,6 +17,7 @@ PyGenePlexus
   notes/api
   notes/r
   notes/data
   notes/faqs

.. toctree::
   :maxdepth: 1
27 changes: 27 additions & 0 deletions docs/source/notes/data.rst
@@ -17,6 +17,33 @@ Features :term:`SixSpeciesN2V`
GSCs [GO]_, [Monarch]_, [Mondo]_, Combined
======== =======================================================

**Detailed species info:**

.. list-table::
   :widths: 10 10 10

   * -
     - Specific Name
     - Taxon Id
   * - Human
     - Homo sapiens
     - 9606
   * - Mouse
     - Mus musculus
     - 10090
   * - Fly
     - Drosophila melanogaster
     - 7227
   * - Zebrafish
     - Danio rerio
     - 7955
   * - Worm
     - Caenorhabditis elegans
     - 6239
   * - Yeast
     - Saccharomyces cerevisiae
     - 4932
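
For scripts that need these identifiers, the species table can be mirrored as a small lookup. This is only a convenience sketch: the names `SPECIES` and `taxon_id` are hypothetical and are not part of the PyGenePlexus API; the values are copied from the table.

```python
# Common name -> (species name, NCBI taxon ID), copied from the table above.
SPECIES = {
    "Human": ("Homo sapiens", 9606),
    "Mouse": ("Mus musculus", 10090),
    "Fly": ("Drosophila melanogaster", 7227),
    "Zebrafish": ("Danio rerio", 7955),
    "Worm": ("Caenorhabditis elegans", 6239),
    "Yeast": ("Saccharomyces cerevisiae", 4932),
}


def taxon_id(common_name):
    """Return the taxon ID for a common species name, ignoring case."""
    key = common_name.strip().capitalize()
    if key not in SPECIES:
        raise KeyError(f"Unknown species: {common_name!r}")
    return SPECIES[key][1]
```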

Due to the availability of the data, the following combinations are supported:

.. list-table:: Available Network Options
22 changes: 22 additions & 0 deletions docs/source/notes/faqs.rst
@@ -0,0 +1,22 @@
PyGenePlexus FAQs
=====================

Frequently Asked Questions
--------------------------

**How are positive and negative genes determined?**

In the supervised machine learning model, any gene from the user-supplied
gene list that can be converted to an Entrez ID and is also in the network is
considered part of the positive class.

Genes in the negative class are selected based on the chosen Geneset Context. The default Geneset
Context is Combined, which uses all available geneset collections.

GenePlexus then automatically selects the genes in the negative class by:

#. Considering the total pool of possible negative genes to be any gene that has an annotation to at least one of the terms in the selected geneset collection.
#. Retaining all terms in the selected geneset collection that have between 10 and 200 genes annotated to them.
#. Removing genes that are in the positive class.
#. Performing a hypergeometric test between the genes in the positive class and the lists of genes annotated to every term in the selected geneset collection. If the value of this hypergeometric test is less than 0.05, all genes from the given term are also removed from the pool of possible negative genes.
#. Declaring all the remaining genes in the pool of possible negative genes as the negative class.
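
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the PyGenePlexus implementation. It assumes that the "value" of the hypergeometric test is a one-sided (enrichment) p-value, that the background is the set of all annotated genes, and that the candidate pool is built from all terms before the size filter is applied. The function names and the toy data are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(overlap, term_size, n_positives, universe_size):
    """P(X >= overlap) when n_positives genes are drawn from universe_size
    genes, term_size of which are annotated to the term."""
    upper = min(term_size, n_positives)
    favorable = sum(
        comb(term_size, i) * comb(universe_size - term_size, n_positives - i)
        for i in range(overlap, upper + 1)
    )
    return favorable / comb(universe_size, n_positives)


def select_negatives(positives, gsc, min_size=10, max_size=200, alpha=0.05):
    """Return the negative class for a geneset collection given as
    {term: set_of_genes}. Assumption: the background is every annotated gene."""
    # Step 1: candidate pool = genes annotated to at least one term.
    universe = set().union(*gsc.values())
    pool = set(universe)
    # Step 2: keep only terms with between min_size and max_size genes.
    kept = {t: g for t, g in gsc.items() if min_size <= len(g) <= max_size}
    # Step 3: remove genes in the positive class.
    pos = set(positives) & universe
    pool -= set(positives)
    # Step 4: drop all genes of terms significantly overlapping the positives.
    for genes in kept.values():
        p = hypergeom_upper_tail(len(pos & genes), len(genes), len(pos), len(universe))
        if p < alpha:
            pool -= genes
    # Step 5: whatever remains is the negative class.
    return pool


# Toy data: term_a overlaps the positives, term_b does not, term_c is too small.
term_a = {f"g{i}" for i in range(12)}
term_b = {f"g{i}" for i in range(20, 32)}
term_c = {f"g{i}" for i in range(40, 45)}
toy_gsc = {"term_a": term_a, "term_b": term_b, "term_c": term_c}
toy_positives = {f"g{i}" for i in range(6)}
negatives = select_negatives(toy_positives, toy_gsc)
```

In this toy case every positive gene is annotated to `term_a`, so that term is enriched and all of its genes leave the pool, while the genes of `term_b` and of the too-small `term_c` remain as negatives.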
8 changes: 4 additions & 4 deletions geneplexus/__init__.py
@@ -2,10 +2,10 @@

.. currentmodule:: geneplexus.GenePlexus

PyGenePlexus enables researchers to predict novel genes similar to their
genes of interest based on their patterns of connectivity in genome-scale
molecular interaction networks, and addtionaly translate these findings
across species.
PyGenePlexus enables researchers to predict genes similar to an uploaded
geneset of interest based on patterns of connectivity in genome-scale
molecular interaction networks, with the ability to translate these
findings across species.

.. figure:: ../figures/mainfigure.png
   :scale: 20 %