-
Notifications
You must be signed in to change notification settings - Fork 436
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add bwa-mem2 data manager #6376
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
categories: | ||
- Data Managers | ||
description: Bwa-mem2 is the next version of the bwa-mem algorithm in bwa. | ||
homepage_url: https://github.com/bwa-mem2/bwa-mem2 | ||
long_description: | | ||
Bwa-mem2 is the next version of the bwa-mem algorithm in bwa. It produces | ||
alignment identical to bwa and is ~1.3-3.1x faster depending on the use-case, | ||
dataset and the running machine. Bwa-mem2 uses a different index format that | ||
is efficient on disk space and runtime memory but requires larger amounts of | ||
memory (roughly 27x the reference) when building. | ||
name: data_manager_bwa_mem2_index_builder | ||
owner: iuc | ||
remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/master/data_managers/data_manager_bwa_mem2_index_builder | ||
type: unrestricted |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,75 @@ | ||
<tool id="bwa_mem2_index_builder_data_manager" name="Build BWA-MEM2 indexes" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" tool_type="manage_data" profile="22.05"> | ||
<description></description> | ||
<macros> | ||
<token name="@TOOL_VERSION@">2.2.1</token> | ||
<token name="@VERSION_SUFFIX@">0</token> | ||
</macros> | ||
<requirements> | ||
<requirement type="package" version="@TOOL_VERSION@">bwa-mem2</requirement> | ||
</requirements> | ||
<command detect_errors="exit_code"><![CDATA[ | ||
#set $fasta_file_name = str($all_fasta_source.fields.path).split('/')[-1] | ||
mkdir -p '${out_file.extra_files_path}' && | ||
ln -s '${all_fasta_source.fields.path}' '${out_file.extra_files_path}/${fasta_file_name}' && | ||
bwa-mem2 index '${out_file.extra_files_path}/${fasta_file_name}' && | ||
cp '$dmjson' '$out_file' | ||
]]> | ||
</command> | ||
<configfiles> | ||
<configfile name="dmjson"><![CDATA[#slurp | ||
#set $fasta_file_name = str($all_fasta_source.fields.path).split('/')[-1] | ||
#set $value = $sequence_id or $all_fasta_source.fields.dbkey | ||
#set $name = $sequence_name or $all_fasta_source.fields.name | ||
{ | ||
"data_tables":{ | ||
"bwa_mem2_indexes":[ | ||
{ | ||
"value": "${value}", | ||
"dbkey": "${all_fasta_source.fields.dbkey}", | ||
"name": "${name}", | ||
"path": "${fasta_file_name}" | ||
} | ||
] | ||
} | ||
} | ||
]]></configfile> | ||
</configfiles> | ||
<inputs> | ||
<param name="all_fasta_source" type="select" label="Source FASTA Sequence"> | ||
<options from_data_table="all_fasta"/> | ||
</param> | ||
<param name="sequence_name" type="text" value="" label="Name of sequence" /> | ||
<param name="sequence_id" type="text" value="" label="ID for sequence" /> | ||
</inputs> | ||
<outputs> | ||
<data name="out_file" format="data_manager_json" /> | ||
</outputs> | ||
<tests> | ||
<test> | ||
<param name="all_fasta_source" value="phiX174"/> | ||
<output name="out_file" file="bwa_mem2_data_manager.1.json"/> | ||
</test> | ||
<test> | ||
<param name="all_fasta_source" value="phiX174"/> | ||
<param name="sequence_name" value="Galeocerdo cuvier"/> | ||
<param name="sequence_id" value="fooBar1"/> | ||
<output name="out_file" file="bwa_mem2_data_manager.2.json"/> | ||
</test> | ||
</tests> | ||
<help> | ||
<![CDATA[ | ||
.. class:: infomark | ||
|
||
**Notice:** If you leave name, description, or id blank, it will be generated automatically. | ||
|
||
Bwa-mem2 is the next version of the bwa-mem algorithm in bwa. It produces | ||
alignment identical to bwa and is ~1.3-3.1x faster depending on the use-case, | ||
dataset and the running machine. Bwa-mem2 uses a different index format that | ||
is efficient on disk space and runtime memory but requires larger amounts of | ||
memory (roughly 27x the reference) when building. | ||
]]> | ||
</help> | ||
<citations> | ||
<citation type="doi">10.1038/nmeth.3317</citation> | ||
</citations> | ||
</tool> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
<?xml version="1.0"?> | ||
<data_managers> | ||
<data_manager tool_file="data_manager/bwa_mem2_index_builder.xml" id="bwa_mem2_index_builder"> | ||
<data_table name="bwa_mem2_indexes"> | ||
<output> | ||
<column name="value" /> | ||
<column name="dbkey" /> | ||
<column name="name" /> | ||
<column name="path" output_ref="out_file" > | ||
<move type="directory" relativize_symlinks="True"> | ||
<!-- <source>${path}</source>--> <!-- out_file.extra_files_path is used as base by default --> <!-- if no source, eg for type=directory, then refers to base --> | ||
<target base="${GALAXY_DATA_MANAGER_DATA_PATH}">${dbkey}/bwa_mem2_index/${value}</target> | ||
</move> | ||
<value_translation>${GALAXY_DATA_MANAGER_DATA_PATH}/${dbkey}/bwa_mem2_index/${value}/${path}</value_translation> | ||
<value_translation type="function">abspath</value_translation> | ||
</column> | ||
</output> | ||
</data_table> | ||
</data_manager> | ||
</data_managers> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
#This file lists the locations and dbkeys of all the fasta files | ||
#under the "genome" directory (a directory that contains a directory | ||
#for each build). The script extract_fasta.py will generate the file | ||
#all_fasta.loc. This file has the format (white space characters are | ||
#TAB characters): | ||
# | ||
#<unique_build_id> <dbkey> <display_name> <file_path> | ||
# | ||
#So, all_fasta.loc could look something like this: | ||
# | ||
#apiMel3 apiMel3 Honeybee (Apis mellifera): apiMel3 /path/to/genome/apiMel3/apiMel3.fa | ||
#hg19canon hg19 Human (Homo sapiens): hg19 Canonical /path/to/genome/hg19/hg19canon.fa | ||
#hg19full hg19 Human (Homo sapiens): hg19 Full /path/to/genome/hg19/hg19full.fa | ||
# | ||
#Your all_fasta.loc file should contain an entry for each individual | ||
#fasta file. So there will be multiple fasta files for each build, | ||
#such as with hg19 above. | ||
# | ||
phiX174 phiX174 phiX174 ${__HERE__}/phiX174.fasta |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
{ | ||
"data_tables":{ | ||
"bwa_mem2_indexes":[ | ||
{ | ||
"value": "phiX174", | ||
"dbkey": "phiX174", | ||
"name": "phiX174", | ||
"path": "phiX174.fasta" | ||
} | ||
] | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
{ | ||
"data_tables":{ | ||
"bwa_mem2_indexes":[ | ||
{ | ||
"value": "fooBar1", | ||
"dbkey": "phiX174", | ||
"name": "Galeocerdo cuvier", | ||
"path": "phiX174.fasta" | ||
} | ||
] | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,79 @@ | ||
>phiX174 | ||
GAGTTTTATCGCTTCCATGACGCAGAAGTTAACACTTTCGGATATTTCTGATGAGTCGAAAAATTATCTT | ||
GATAAAGCAGGAATTACTACTGCTTGTTTACGAATTAAATCGAAGTGGACTGCTGGCGGAAAATGAGAAA | ||
ATTCGACCTATCCTTGCGCAGCTCGAGAAGCTCTTACTTTGCGACCTTTCGCCATCAACTAACGATTCTG | ||
TCAAAAACTGACGCGTTGGATGAGGAGAAGTGGCTTAATATGCTTGGCACGTTCGTCAAGGACTGGTTTA | ||
GATATGAGTCACATTTTGTTCATGGTAGAGATTCTCTTGTTGACATTTTAAAAGAGCGTGGATTACTATC | ||
TGAGTCCGATGCTGTTCAACCACTAATAGGTAAGAAATCATGAGTCAAGTTACTGAACAATCCGTACGTT | ||
TCCAGACCGCTTTGGCCTCTATTAAGCTCATTCAGGCTTCTGCCGTTTTGGATTTAACCGAAGATGATTT | ||
CGATTTTCTGACGAGTAACAAAGTTTGGATTGCTACTGACCGCTCTCGTGCTCGTCGCTGCGTTGAGGCT | ||
TGCGTTTATGGTACGCTGGACTTTGTGGGATACCCTCGCTTTCCTGCTCCTGTTGAGTTTATTGCTGCCG | ||
TCATTGCTTATTATGTTCATCCCGTCAACATTCAAACGGCCTGTCTCATCATGGAAGGCGCTGAATTTAC | ||
GGAAAACATTATTAATGGCGTCGAGCGTCCGGTTAAAGCCGCTGAATTGTTCGCGTTTACCTTGCGTGTA | ||
CGCGCAGGAAACACTGACGTTCTTACTGACGCAGAAGAAAACGTGCGTCAAAAATTACGTGCAGAAGGAG | ||
TGATGTAATGTCTAAAGGTAAAAAACGTTCTGGCGCTCGCCCTGGTCGTCCGCAGCCGTTGCGAGGTACT | ||
AAAGGCAAGCGTAAAGGCGCTCGTCTTTGGTATGTAGGTGGTCAACAATTTTAATTGCAGGGGCTTCGGC | ||
CCCTTACTTGAGGATAAATTATGTCTAATATTCAAACTGGCGCCGAGCGTATGCCGCATGACCTTTCCCA | ||
TCTTGGCTTCCTTGCTGGTCAGATTGGTCGTCTTATTACCATTTCAACTACTCCGGTTATCGCTGGCGAC | ||
TCCTTCGAGATGGACGCCGTTGGCGCTCTCCGTCTTTCTCCATTGCGTCGTGGCCTTGCTATTGACTCTA | ||
CTGTAGACATTTTTACTTTTTATGTCCCTCATCGTCACGTTTATGGTGAACAGTGGATTAAGTTCATGAA | ||
GGATGGTGTTAATGCCACTCCTCTCCCGACTGTTAACACTACTGGTTATATTGACCATGCCGCTTTTCTT | ||
GGCACGATTAACCCTGATACCAATAAAATCCCTAAGCATTTGTTTCAGGGTTATTTGAATATCTATAACA | ||
ACTATTTTAAAGCGCCGTGGATGCCTGACCGTACCGAGGCTAACCCTAATGAGCTTAATCAAGATGATGC | ||
TCGTTATGGTTTCCGTTGCTGCCATCTCAAAAACATTTGGACTGCTCCGCTTCCTCCTGAGACTGAGCTT | ||
TCTCGCCAAATGACGACTTCTACCACATCTATTGACATTATGGGTCTGCAAGCTGCTTATGCTAATTTGC | ||
ATACTGACCAAGAACGTGATTACTTCATGCAGCGTTACCGTGATGTTATTTCTTCATTTGGAGGTAAAAC | ||
CTCTTATGACGCTGACAACCGTCCTTTACTTGTCATGCGCTCTAATCTCTGGGCATCTGGCTATGATGTT | ||
GATGGAACTGACCAAACGTCGTTAGGCCAGTTTTCTGGTCGTGTTCAACAGACCTATAAACATTCTGTGC | ||
CGCGTTTCTTTGTTCCTGAGCATGGCACTATGTTTACTCTTGCGCTTGTTCGTTTTCCGCCTACTGCGAC | ||
TAAAGAGATTCAGTACCTTAACGCTAAAGGTGCTTTGACTTATACCGATATTGCTGGCGACCCTGTTTTG | ||
TATGGCAACTTGCCGCCGCGTGAAATTTCTATGAAGGATGTTTTCCGTTCTGGTGATTCGTCTAAGAAGT | ||
TTAAGATTGCTGAGGGTCAGTGGTATCGTTATGCGCCTTCGTATGTTTCTCCTGCTTATCACCTTCTTGA | ||
AGGCTTCCCATTCATTCAGGAACCGCCTTCTGGTGATTTGCAAGAACGCGTACTTATTCGCCACCATGAT | ||
TATGACCAGTGTTTCCAGTCCGTTCAGTTGTTGCAGTGGAATAGTCAGGTTAAATTTAATGTGACCGTTT | ||
ATCGCAATCTGCCGACCACTCGCGATTCAATCATGACTTCGTGATAAAAGATTGAGTGTGAGGTTATAAC | ||
GCCGAAGCGGTAAAAATTTTAATTTTTGCCGCTGAGGGGTTGACCAAGCGAAGCGCGGTAGGTTTTCTGC | ||
TTAGGAGTTTAATCATGTTTCAGACTTTTATTTCTCGCCATAATTCAAACTTTTTTTCTGATAAGCTGGT | ||
TCTCACTTCTGTTACTCCAGCTTCTTCGGCACCTGTTTTACAGACACCTAAAGCTACATCGTCAACGTTA | ||
TATTTTGATAGTTTGACGGTTAATGCTGGTAATGGTGGTTTTCTTCATTGCATTCAGATGGATACATCTG | ||
TCAACGCCGCTAATCAGGTTGTTTCTGTTGGTGCTGATATTGCTTTTGATGCCGACCCTAAATTTTTTGC | ||
CTGTTTGGTTCGCTTTGAGTCTTCTTCGGTTCCGACTACCCTCCCGACTGCCTATGATGTTTATCCTTTG | ||
AATGGTCGCCATGATGGTGGTTATTATACCGTCAAGGACTGTGTGACTATTGACGTCCTTCCCCGTACGC | ||
CGGGCAATAATGTTTATGTTGGTTTCATGGTTTGGTCTAACTTTACCGCTACTAAATGCCGCGGATTGGT | ||
TTCGCTGAATCAGGTTATTAAAGAGATTATTTGTCTCCAGCCACTTAAGTGAGGTGATTTATGTTTGGTG | ||
CTATTGCTGGCGGTATTGCTTCTGCTCTTGCTGGTGGCGCCATGTCTAAATTGTTTGGAGGCGGTCAAAA | ||
AGCCGCCTCCGGTGGCATTCAAGGTGATGTGCTTGCTACCGATAACAATACTGTAGGCATGGGTGATGCT | ||
GGTATTAAATCTGCCATTCAAGGCTCTAATGTTCCTAACCCTGATGAGGCCGCCCCTAGTTTTGTTTCTG | ||
GTGCTATGGCTAAAGCTGGTAAAGGACTTCTTGAAGGTACGTTGCAGGCTGGCACTTCTGCCGTTTCTGA | ||
TAAGTTGCTTGATTTGGTTGGACTTGGTGGCAAGTCTGCCGCTGATAAAGGAAAGGATACTCGTGATTAT | ||
CTTGCTGCTGCATTTCCTGAGCTTAATGCTTGGGAGCGTGCTGGTGCTGATGCTTCCTCTGCTGGTATGG | ||
TTGACGCCGGATTTGAGAATCAAAAAGAGCTTACTAAAATGCAACTGGACAATCAGAAAGAGATTGCCGA | ||
GATGCAAAATGAGACTCAAAAAGAGATTGCTGGCATTCAGTCGGCGACTTCACGCCAGAATACGAAAGAC | ||
CAGGTATATGCACAAAATGAGATGCTTGCTTATCAACAGAAGGAGTCTACTGCTCGCGTTGCGTCTATTA | ||
TGGAAAACACCAATCTTTCCAAGCAACAGCAGGTTTCCGAGATTATGCGCCAAATGCTTACTCAAGCTCA | ||
AACGGCTGGTCAGTATTTTACCAATGACCAAATCAAAGAAATGACTCGCAAGGTTAGTGCTGAGGTTGAC | ||
TTAGTTCATCAGCAAACGCAGAATCAGCGGTATGGCTCTTCTCATATTGGCGCTACTGCAAAGGATATTT | ||
CTAATGTCGTCACTGATGCTGCTTCTGGTGTGGTTGATATTTTTCATGGTATTGATAAAGCTGTTGCCGA | ||
TACTTGGAACAATTTCTGGAAAGACGGTAAAGCTGATGGTATTGGCTCTAATTTGTCTAGGAAATAACCG | ||
TCAGGATTGACACCCTCCCAATTGTATGTTTTCATGCCTCCAAATCTTGGAGGCTTTTTTATGGTTCGTT | ||
CTTATTACCCTTCTGAATGTCACGCTGATTATTTTGACTTTGAGCGTATCGAGGCTCTTAAACCTGCTAT | ||
TGAGGCTTGTGGCATTTCTACTCTTTCTCAATCCCCAATGCTTGGCTTCCATAAGCAGATGGATAACCGC | ||
ATCAAGCTCTTGGAAGAGATTCTGTCTTTTCGTATGCAGGGCGTTGAGTTCGATAATGGTGATATGTATG | ||
TTGACGGCCATAAGGCTGCTTCTGACGTTCGTGATGAGTTTGTATCTGTTACTGAGAAGTTAATGGATGA | ||
ATTGGCACAATGCTACAATGTGCTCCCCCAACTTGATATTAATAACACTATAGACCACCGCCCCGAAGGG | ||
GACGAAAAATGGTTTTTAGAGAACGAGAAGACGGTTACGCAGTTTTGCCGCAAGCTGGCTGCTGAACGCC | ||
CTCTTAAGGATATTCGCGATGAGTATAATTACCCCAAAAAGAAAGGTATTAAGGATGAGTGTTCAAGATT | ||
GCTGGAGGCCTCCACTATGAAATCGCGTAGAGGCTTTACTATTCAGCGTTTGATGAATGCAATGCGACAG | ||
GCTCATGCTGATGGTTGGTTTATCGTTTTTGACACTCTCACGTTGGCTGACGACCGATTAGAGGCGTTTT | ||
ATGATAATCCCAATGCTTTGCGTGACTATTTTCGTGATATTGGTCGTATGGTTCTTGCTGCCGAGGGTCG | ||
CAAGGCTAATGATTCACACGCCGACTGCTATCAGTATTTTTGTGTGCCTGAGTATGGTACAGCTAATGGC | ||
CGTCTTCATTTCCATGCGGTGCATTTTATGCGGACACTTCCTACAGGTAGCGTTGACCCTAATTTTGGTC | ||
GTCGGGTACGCAATCGCCGCCAGTTAAATAGCTTGCAAAATACGTGGCCTTATGGTTACAGTATGCCCAT | ||
CGCAGTTCGCTACACGCAGGACGCTTTTTCACGTTCTGGTTGGTTGTGGCCTGTTGATGCTAAAGGTGAG | ||
CCGCTTAAAGCTACCAGTTATATGGCTGTTGGTTTCTATGTGGCTAAATACGTTAACAAAAAGTCAGATA | ||
TGGACCTTGCTGCTAAAGGTCTAGGAGCTAAAGAATGGAACAACTCACTAAAAACCAAGCTGTCGCTACT | ||
TCCCAAGAAGCTGTTCAGAATCAGAATGAGCCGCAACTTCGGGATGAAAATGCTCACAATGACAAATCTG | ||
TCCACGGAGTGCTTAATCCAACTTACCAAGCTGGGTTACGACGCGACGCCGTTCAACCAGATATTGAAGC | ||
AGAACGCAAAAAGAGAGATGAGATTGAGGCTGGGAAAAGTTACTGTAGCCGACGTTTTGGCGGCGCAACC | ||
TGTGACGACAAATCTGCTCAAATTTATGCGCGCTTCGATAAAAATGATTGGCGTATCCAACCTGCA | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
#This file lists the locations and dbkeys of all the fasta files | ||
#under the "genome" directory (a directory that contains a directory | ||
#for each build). The script extract_fasta.py will generate the file | ||
#all_fasta.loc. This file has the format (white space characters are | ||
#TAB characters): | ||
# | ||
#<unique_build_id> <dbkey> <display_name> <file_path> | ||
# | ||
#So, all_fasta.loc could look something like this: | ||
# | ||
#apiMel3 apiMel3 Honeybee (Apis mellifera): apiMel3 /path/to/genome/apiMel3/apiMel3.fa | ||
#hg19canon hg19 Human (Homo sapiens): hg19 Canonical /path/to/genome/hg19/hg19canon.fa | ||
#hg19full hg19 Human (Homo sapiens): hg19 Full /path/to/genome/hg19/hg19full.fa | ||
# | ||
#Your all_fasta.loc file should contain an entry for each individual | ||
#fasta file. So there will be multiple fasta files for each build, | ||
#such as with hg19 above. | ||
# |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
#This is a sample file distributed with Galaxy that enables tools | ||
#to use a directory of BWA indexed sequences data files. You will need | ||
#to create these data files and then create a bwa_index.loc file | ||
#similar to this one (store it in this directory) that points to | ||
#the directories in which those files are stored. The bwa_index.loc | ||
#file has this format (longer white space characters are TAB characters): | ||
# | ||
#<unique_build_id> <dbkey> <display_name> <file_path> | ||
# | ||
#So, for example, if you had phiX indexed stored in | ||
#/depot/data2/galaxy/phiX/base/, | ||
#then the bwa_index.loc entry would look like this: | ||
# | ||
#phiX174 phiX phiX Pretty /depot/data2/galaxy/phiX/base/phiX.fa | ||
# | ||
#and your /depot/data2/galaxy/phiX/base/ directory | ||
#would contain phiX.fa.* files: | ||
# | ||
#-rw-r--r-- 1 james universe 830134 2005-09-13 10:12 phiX.fa.amb | ||
#-rw-r--r-- 1 james universe 527388 2005-09-13 10:12 phiX.fa.ann | ||
#-rw-r--r-- 1 james universe 269808 2005-09-13 10:12 phiX.fa.bwt | ||
#...etc... | ||
# | ||
#Your bwa_index.loc file should include an entry per line for each | ||
#index set you have stored. The "file" in the path does not actually | ||
#exist, but it is the prefix for the actual index files. For example: | ||
# | ||
#phiX174 phiX phiX174 /depot/data2/galaxy/phiX/base/phiX.fa | ||
#hg18canon hg18 hg18 Canonical /depot/data2/galaxy/hg18/base/hg18canon.fa | ||
#hg18full hg18 hg18 Full /depot/data2/galaxy/hg18/base/hg18full.fa | ||
#/orig/path/hg19.fa hg19 hg19 /depot/data2/galaxy/hg19/base/hg19.fa | ||
#...etc... | ||
# | ||
#Note that for backwards compatibility with workflows, the unique ID of | ||
#an entry must be the path that was in the original loc file, because that | ||
#is the value stored in the workflow for that parameter. That is why the | ||
#hg19 entry above looks odd. New genomes can be better-looking. | ||
# |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
<!-- Use the file tool_data_table_conf.xml.oldlocstyle if you don't want to update your loc files as changed in revision 4550:535d276c92bc--> | ||
<tables> | ||
<!-- Locations of indexes in the BWA-MEM2 mapper format--> | ||
<table name="bwa_mem2_indexes" comment_char="#"> | ||
<columns>value, dbkey, name, path</columns> | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe we want to include the bwa-mem2 version here (that has been used for the generation). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The index format has been stable since 2020 and the Galaxy version of the tool was added after that index format change, so it should be compatible with all versions. For tools that have changed it in the past (e.g. STAR) we added a new versioned data table once that happens, but I guess if we want to be safe I could add a version column, even if it's unused for now? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we should add a version column to all DM. Its just more future-proof and does not cost anything. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Makes sense, but wouldn’t adding a column make it incompatible with the existing bwa-mem2 tool’s data table? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is true. I rephrase it, ideally, we should have a version column. Lets get it in for now without the version column and hope it will never change. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In that case I think this is ready to go. |
||
<file path="tool-data/bwa_mem2_index.loc" /> | ||
</table> | ||
</tables> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
<tables> | ||
<!-- Locations of all fasta files under genome directory --> | ||
<table name="all_fasta" comment_char="#"> | ||
<columns>value, dbkey, name, path</columns> | ||
<file path="${__HERE__}/test-data/all_fasta.loc" /> | ||
</table> | ||
<!-- Locations of indexes in the BWA-MEM2 mapper format--> | ||
<table name="bwa_mem2_indexes" comment_char="#"> | ||
<columns>value, dbkey, name, path</columns> | ||
<file path="${__HERE__}/test-data/bwa_mem2_index.loc" /> | ||
</table> | ||
</tables> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be
${out_file.extra_files_path}/${fasta_file_name}
?In the DMs that I created recently I just crated a directory (e.g.
DIR
) in the working dir and used this. I guess usingout_file.extra_files_path
is not really necessary.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing so seems to cause the index to be referenced from the object store after the DM completes: the path in the loc file points to
/path/to/galaxy/database/objects/c/0/f/dataset_c0ffee_files/fooBar1.fa
. Whereas if it is just the relative name inside theextra_files_path
, then the index is referenced fromtool_data_dir
as expected for a DM. Possibly just an artifact ofos.path.join([tool_data_dir, intermediate_dirs, '/path/to/galaxy/database/objects/c/0/f/dataset_c0ffee_files/fooBar1.fa'])
clobbering the earlier path elements, but either way that's not the desired outcome.