-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert the submodule star-sys/STAR to a git subtree #29
Conversation
…ctory path) for the file names in --readFilesIn . Implemented read group ID output as the last column of the Chimeric.out.junction file.
Don't exit when fastq contains a zero-length read
…alized memory access. The chimeric output may change for a very small number of reads.
…nique (i.e. NH=1).
…ing bins. Icnreasing this number reduces the amount of RAM required for sorting.
…it, use --peOverlapNbasesMin and --peOverlapMMp oprions.
…ends of the mates.
Add -Werror to CXXFLAGS. Remove register keyword. Comment out unused function arguments.
…' into sj/subtree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will have to learn the details at some but, but definitely seems like this is worth a shot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh -- we should move the license file from STAR to the top of the subtree. Don't want to elide the license accidentally.
I agree w/ Pat re: License. I may also be easier to just copy into place. Otherwise, dope! |
Good catch, Pat. I've added |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👌 SHIP IT 👌
$ time git clone https://github.com/10XGenomics/orbit
Cloning into 'orbit'...
remote: Enumerating objects: 3997, done.
remote: Counting objects: 100% (3997/3997), done.
remote: Compressing objects: 100% (969/969), done.
remote: Total 4378 (delta 2942), reused 3988 (delta 2940), pack-reused 381
Receiving objects: 100% (4378/4378), 7.36 MiB | 3.73 MiB/s, done.
Resolving deltas: 100% (3112/3112), done.
git clone https://github.com/10XGenomics/orbit orbit-subtree 0.39s user 0.16s system 19% cpu 2.787 total 13504 MB
$ du -hs orbit
10M orbit-subtree 😁 |
LGTM, though it would be good to include a script for updating the subtree. |
# Extract the source/ subdirectory of STAR into a Git subtree.
orbit-subtree:
git clone https://github.com/10XGenomics/orbit orbit-subtree
git -C orbit-subtree remote add star https://github.com/10XGenomics/STAR
git -C orbit-subtree fetch star
# Remove the STAR submodule.
git -C orbit-subtree switch -c orbit/subtree remotes/origin/master
git -C orbit-subtree rm star-sys/STAR .gitmodules
git -C orbit-subtree commit -m 'Remove the submodule star-sys/STAR'
# Create the subtree of star/master.
git -C orbit-subtree switch -c star/master remotes/star/master
git -C orbit-subtree subtree split -P source -b star/subtree
# Add the subtree of star/master to orbit.
git -C orbit-subtree switch orbit/subtree
git -C orbit-subtree subtree add -P star-sys/STAR/source star/subtree
# Create the subtree of star/orbit.
git -C orbit-subtree switch -c star/orbit remotes/star/orbit
git -C orbit-subtree subtree split -P source -b star/subtree
# Add the subtree of star/orbit to orbit.
git -C orbit-subtree switch orbit/subtree
git -C orbit-subtree subtree merge -P star-sys/STAR/source star/subtree
# Check for large files.
git -C orbit-subtree lfs migrate info --include-ref=orbit/subtree --above=1MB
git -C orbit-subtree rev-list --objects orbit/subtree \
| git -C orbit-subtree cat-file --batch-check='%(objectname) %(objecttype) %(objectsize) %(rest)' \
| awk '$$2 == "blob" && $$3 >= 1000000 { print substr($$1, 1, 7), $$3, $$4; fail=1 } END { exit fail }' Here's the script used to create the subtree. |
# Check for large files.
$ git -C orbit-subtree lfs migrate info --include-ref=orbit/subtree --above=1MB
migrate: Sorting commits: ..., done.
migrate: Examining commits: 100% (671/671), done.
STAR 7.4 MB 4/4 files(s) 100%
*.a 6.2 MB 2/2 files(s) 100%
STARstatic 3.6 MB 1/1 files(s) 100%
$ git -C orbit-subtree rev-list --objects orbit/subtree \
| git -C orbit-subtree cat-file --batch-check='%(objectname) %(objecttype) %(objectsize) %(rest)' \
| awk '$2 == "blob" && $3 >= 1000000 { print substr($1, 1, 7), $3, $4; fail=1 } END { exit fail }'
40b79fe 3106550 liborbit.a
9b28e14 3097906 orbit.a
8a26056 1793054 STAR
5f27ff2 1792798 STAR
43e95fd 3646356 STARstatic
e532a98 1792790 STAR
9c8e339 2019903 STAR Note that there are 17 MB of large binary files stored in |
Yes, agreed. I'll do that in a subsequent PR. Those commands scan be extracted from #29 (comment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because sometimes the bin is a trash bin, I love it!
Remove the Git submodule
star-sys/STAR
.Add the Git subtree
star-sys/STAR/source
.Thanks to Lance for this suggestion!
Files under
star-sys/STAR
that are not understar-sys/STAR/source
have been removed, in particular the large binaries instar-sys/STAR/bin
. There's thankfully no need to rewrite the history of Orbit. Sincestar-sys/STAR
was a Git submodule, once it has been removed, its Git objects will not be fetched by default. The Git commit history of STAR, and the Orbit-specific changes to STAR, has been preserved.This change addresses two issues:
star-sys/STAR/bin
git submodules are not cached
git submodules are not cached rust-lang/cargo#7987The challenges are that
git subtree
is an uncommon tool and will be unfamiliar to most developers, and merging upstream STAR into Orbit is somewhat more involved. https://gist.github.com/tswaters/542ba147a07904b1f3f5#fetching-upstramSee https://www.atlassian.com/git/tutorials/git-subtree
and https://stackoverflow.com/questions/23937436/add-subdirectory-of-remote-repo-with-git-subtree
and https://jrsmith3.github.io/merging-a-subdirectory-from-another-repo-via-git-subtree.html
and https://gist.github.com/tswaters/542ba147a07904b1f3f5