Merge branch 'master' into dev
tensulin committed Aug 28, 2024
2 parents 64d51b4 + 5643f4f commit ed48e02
Showing 12 changed files with 228 additions and 47 deletions.
4 changes: 3 additions & 1 deletion .gitattributes
@@ -1 +1,3 @@
/src/meta_tran_sim/data/* filter=lfs diff=lfs merge=lfs -text
src/marbel/data/deduplicated_pangenome_EDGAR_Microbiome_JLAB2.fas.bgz filter=lfs diff=lfs merge=lfs -text
src/marbel/data/deduplicated_pangenome_EDGAR_Microbiome_JLAB2.fas.bgz.bio_index filter=lfs diff=lfs merge=lfs -text
src/marbel/data/orthologues_processed_combined_all.parquet filter=lfs diff=lfs merge=lfs -text
33 changes: 19 additions & 14 deletions README.md
@@ -1,4 +1,4 @@
# meta_tran_sim_dev (Meta Transcriptomic Simulation)
# marbel (MetAtranscriptomic Reference Builder Evaluation Library)

This project generates an in silico metatranscriptomic dataset based on specified parameters.

@@ -35,7 +35,7 @@ BiocManager::install("polyester")
```

Install the package:
Install the package:

```
pip install -e .
@@ -46,36 +46,41 @@ pip install -e .
To get help on how to use the script, run:

```sh
meta-tran-sim --help
marbel --help
```

### Command Line Arguments

```
Usage: meta-tran-sim [OPTIONS] [N_SPECIES] [N_ORTHOGROUPS] [N_SAMPLES]...
Arguments:
n_species [N_SPECIES] Number of species to be drawn for the metatranscriptomic in silico dataset [default: 20]
n_orthogroups [N_ORTHOGROUPS] Number of orthologous groups to be drawn for the metatranscriptomic in silico dataset [default: 1000]
n_samples [N_SAMPLES]... Number of samples to be created for the metatranscriptomic in silico dataset. The first number is the number of samples for group 1 and
the second number is the number of samples for group 2 [default: 10, 10]
Usage: marbel [OPTIONS]
Options:
--version Show the version and exit.
--help Show this message and exit.
--n-species INTEGER Number of species to be drawn for the metatranscriptomic in silico dataset [default: 20]
--n-orthogroups INTEGER Number of orthologous groups to be drawn for the metatranscriptomic in silico dataset [default: 1000]
--n-samples <INTEGER INTEGER>... Number of samples to be created for the metatranscriptomic in silico dataset. The first number is the number of samples for group 1 and the second is the number of samples for group 2 [default: 10, 10]
--outdir TEXT Output directory for the metatranscriptomic in silico dataset [default: simulated_reads]
--max-phylo-distance TEXT Maximum mean phylogenetic distance for orthologous groups. Specify stricter limit to avoid groups with a more diverse phylogenetic distance. [default: None]
--min-identity FLOAT Minimum mean sequence identity score for orthologous groups. Specify for more stringent identity requirements. [default: None]
--deg-ratio <FLOAT FLOAT>... Ratio of up- and down-regulated genes. The first value is the ratio of up-regulated genes, the second represents the ratio of down-regulated genes [default: 0.1, 0.1]
--seed INTEGER Seed for sampling. Set for reproducibility [default: None]
--read-length INTEGER Read length for the generated reads [default: 100]
--output-format [fastq.gz|fastq|fasta] Output format for the reads [default: fastq.gz]
--version Show the version and exit.
--help Show this message and exit.
```
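
When `marbel` is driven from a script, for example to sweep over several parameter settings, the options listed above can be assembled into an argument list programmatically. The following is a minimal sketch, not part of the package: `build_marbel_command` is a hypothetical helper, and the only things taken from the help text are the flag names, their defaults, and the allowed output formats.

```python
from typing import List, Optional, Tuple

# Choices listed for --output-format in the help text above.
OUTPUT_FORMATS = ("fastq.gz", "fastq", "fasta")


def build_marbel_command(
    n_species: int = 20,
    n_orthogroups: int = 1000,
    n_samples: Tuple[int, int] = (10, 10),
    deg_ratio: Tuple[float, float] = (0.1, 0.1),
    outdir: str = "simulated_reads",
    read_length: int = 100,
    output_format: str = "fastq.gz",
    seed: Optional[int] = None,
) -> List[str]:
    """Hypothetical helper: build an argv list for the `marbel` CLI."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"output_format must be one of {OUTPUT_FORMATS}")
    # --n-samples and --deg-ratio each take exactly two values.
    if len(n_samples) != 2 or len(deg_ratio) != 2:
        raise ValueError("n_samples and deg_ratio need exactly two values")

    cmd = [
        "marbel",
        "--n-species", str(n_species),
        "--n-orthogroups", str(n_orthogroups),
        "--n-samples", str(n_samples[0]), str(n_samples[1]),
        "--deg-ratio", str(deg_ratio[0]), str(deg_ratio[1]),
        "--outdir", outdir,
        "--read-length", str(read_length),
        "--output-format", output_format,
    ]
    # --seed has no default, so it is only passed when one is given.
    if seed is not None:
        cmd += ["--seed", str(seed)]
    return cmd


if __name__ == "__main__":
    # Prints the command as a single shell-style string.
    print(" ".join(build_marbel_command(n_species=30, seed=42)))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once the package is installed. Passing a list rather than a single shell string avoids quoting problems with the output directory.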

## Examples

### Running with Default Parameters

```sh
meta-tran-sim
marbel
```

### Specifying Number of Species, Orthogroups, and Samples

```sh
meta-tran-sim --n-species 30 --n-orthogroups 1500 --n-samples 15 20
marbel --n-species 30 --n-orthogroups 1500 --n-samples 15 20
```

This command will generate a dataset with:
14 changes: 7 additions & 7 deletions pyproject.toml
@@ -3,7 +3,7 @@ requires = ["hatchling", "hatch-requirements-txt"]
build-backend = "hatchling.build"

[project]
name = "meta_tran_sim"
name = "marbel"
version = "0.0.1"
authors = [
{ name="Timo Wentong Lin", email="ge2317@uni-giessen.de" },
@@ -26,18 +26,18 @@ dynamic = ["dependencies"]
"Bug Tracker" = "https://github.com/jlab/meta_tran_sim_dev/issues"

[project.scripts]
meta-tran-sim = "meta_tran_sim.meta_tran_sim:main"
marbel= "marbel.meta_tran_sim:app"

[tool.hatch.metadata.hooks.requirements_txt]
files = ["requirements.txt"]

[tool.hatch.build.targets.wheel]
packages = ["src/meta_tran_sim"]
packages = ["src/marbel"]

[tool.hatch.build]
include = [
"src/meta_tran_sim/data/deduplicated_pangenome_EDGAR_Microbiome_JLAB2.fas.bgz",
"src/meta_tran_sim/data/deduplicated_pangenome_EDGAR_Microbiome_JLAB2.fas.bgz.bio_index",
"src/meta_tran_sim/data/orthologues_processed_combined_all.parquet",
"src/meta_tran_sim/data/EDGAR_all_species.newick",
"src/marbel/data/deduplicated_pangenome_EDGAR_Microbiome_JLAB2.fas.bgz",
"src/marbel/data/deduplicated_pangenome_EDGAR_Microbiome_JLAB2.fas.bgz.bio_index",
"src/marbel/data/orthologues_processed_combined_all.parquet",
"src/marbel/data/EDGAR_all_species.newick",
]
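
The `[project.scripts]` change above repoints the console command from a `main` function to an object named `app`. An entry in that table has the form `module:attribute`: the installer generates a small launcher that imports the module and calls the named object. The sketch below illustrates that lookup with a hypothetical helper, `resolve_entry_point`. It is not the launcher code itself, and it uses standard-library targets so that it runs without `marbel` installed.

```python
import importlib
from typing import Any


def resolve_entry_point(spec: str) -> Any:
    """Hypothetical helper: resolve a 'module:attribute' string to an object."""
    module_name, _, attr_path = spec.partition(":")
    if not module_name or not attr_path:
        raise ValueError(f"expected 'module:attribute', got {spec!r}")
    obj = importlib.import_module(module_name)
    # The attribute part may be dotted, e.g. 'package.module:Class.method'.
    for name in attr_path.split("."):
        obj = getattr(obj, name)
    return obj


if __name__ == "__main__":
    # Standard-library stand-in for something like "marbel.meta_tran_sim:app".
    dumps = resolve_entry_point("json:dumps")
    print(dumps({"n_species": 20}))
```

With the package installed, the same lookup applied to `marbel.meta_tran_sim:app` would return the object that the `marbel` command invokes. This also means the attribute name in `pyproject.toml` must match a callable that actually exists in the module.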
File renamed without changes.
1 change: 1 addition & 0 deletions src/marbel/data/EDGAR_all_species.newick

Large diffs are not rendered by default.
