CPU HiCCUPS

Muhammad Saad Shamim edited this page Aug 28, 2018 · 9 revisions

HiCCUPS was developed to run on GPUs because it is computationally intensive: it scans all intrachromosomal space for areas of local enrichment in order to identify peaks. Because many groups do not have access to a GPU, we have introduced a modified version of HiCCUPS that runs on CPUs.

The primary difference is that the CPU version restricts the HiCCUPS search space to peaks within 8 Mb of the diagonal. In practice, the vast majority of loops (especially CTCF-mediated chromatin loops) lie within a few megabases of the diagonal.

"The vast majority of peaks (98%) reflected loops between loci that are <2 Mb apart" Rao and Huntley et al. 2014

However, because the intrachromosomal search space is restricted, the calculated FDR thresholds differ slightly from those of the full GPU-based HiCCUPS. In addition, since the area of the region analyzed varies with the -m flag, using different sub-matrix sizes (the area of the window/region examined at a given time) may produce slight differences in the loops called.
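The near-diagonal restriction described above can be illustrated with a minimal sketch. This is not the Juicer implementation: the function names are hypothetical, 8 Mb is taken to mean 8,000,000 bp, and the boundary is assumed to be inclusive. It only shows how a pixel is tested against the distance cutoff and how much smaller the restricted search space is than the full upper triangle of a chromosome matrix.

```python
MAX_DISTANCE_BP = 8_000_000  # assumed meaning of "8 Mb"; boundary treated as inclusive


def in_search_space(bin_i, bin_j, resolution_bp, max_distance_bp=MAX_DISTANCE_BP):
    """Return True if pixel (bin_i, bin_j) lies within the cutoff of the diagonal."""
    return abs(bin_i - bin_j) * resolution_bp <= max_distance_bp


def count_pixels(n_bins, resolution_bp, max_distance_bp=MAX_DISTANCE_BP):
    """Return (full, restricted) upper-triangle pixel counts for one chromosome."""
    # Number of off-diagonal offsets kept, capped by the matrix size.
    width = min(max_distance_bp // resolution_bp, n_bins - 1)
    full = n_bins * (n_bins + 1) // 2
    # Offsets d = 0..width each contribute (n_bins - d) pixels.
    restricted = (width + 1) * n_bins - width * (width + 1) // 2
    return full, restricted


if __name__ == "__main__":
    # Hypothetical 250 Mb chromosome binned at 10 kb resolution.
    full, restricted = count_pixels(n_bins=25_000, resolution_bp=10_000)
    print(f"restricted search space is {restricted / full:.1%} of the full matrix")
```

Because the kept band has a fixed width, the restricted pixel count grows roughly linearly with chromosome length rather than quadratically, which is what makes a CPU run practical.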

Usage

The CPU version of HiCCUPS takes exactly the same inputs as regular HiCCUPS, with the addition of the --cpu flag:

hiccups --cpu [-m matrixSize] [-c chromosome(s)] [-r resolution(s)]
                <HiC file> <outputDirectory>

You may also run GPU-based HiCCUPS with the same restriction to the region near the diagonal by using the --restrict flag. This produces results identical to those of CPU HiCCUPS.

hiccups --restrict [-m matrixSize] [-c chromosome(s)] [-r resolution(s)]
                <HiC file> <outputDirectory>

Example of Differences

Using the cohesin-degron maps from Rao et al. 2017, regular GPU-based HiCCUPS finds 2326 loops in the untreated megamap and 58 loops in the treated megamap. CPU-based HiCCUPS (and restricted GPU-based HiCCUPS) finds 2285 loops in the untreated megamap and 52 loops in the treated megamap.

Of the loops identified in the untreated megamap, 2261 match exactly; 13 match within a 10 kb tolerance; 11 are unique to the CPU version; and 52 are unique to the GPU version.

Of the loops identified in the treated megamap, 42 match exactly; 4 match within a 10 kb tolerance; 6 are unique to the CPU version; and 12 are unique to the GPU version.
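A comparison of this kind can be sketched as follows. This is an illustrative approach, not necessarily the procedure used to produce the numbers above: the function name is hypothetical, loops are simplified to (chromosome, x, y) tuples of anchor positions rather than full output records, and a "within tolerance" match is assumed to mean that both anchors differ by at most 10 kb, paired greedily one-to-one.

```python
def compare_loops(cpu_loops, gpu_loops, tolerance_bp=10_000):
    """Classify loops as exact matches, near matches, CPU-only, or GPU-only.

    Each loop is a (chromosome, x, y) tuple of anchor positions in bp
    (a simplified, hypothetical representation).
    """
    cpu_set, gpu_set = set(cpu_loops), set(gpu_loops)
    exact = cpu_set & gpu_set
    unmatched_gpu = gpu_set - exact

    def is_close(a, b):
        # Assumed criterion: same chromosome, both anchors within tolerance.
        return (a[0] == b[0]
                and abs(a[1] - b[1]) <= tolerance_bp
                and abs(a[2] - b[2]) <= tolerance_bp)

    near, cpu_only = [], []
    for loop in sorted(cpu_set - exact):
        # Greedy one-to-one pairing with the first remaining GPU loop in range.
        partner = next((g for g in sorted(unmatched_gpu) if is_close(loop, g)), None)
        if partner is None:
            cpu_only.append(loop)
        else:
            near.append((loop, partner))
            unmatched_gpu.remove(partner)

    return {
        "exact": sorted(exact),
        "near": near,
        "cpu_only": cpu_only,
        "gpu_only": sorted(unmatched_gpu),
    }


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    cpu = [("chr1", 100_000, 500_000), ("chr1", 1_000_000, 1_500_000), ("chr2", 200_000, 900_000)]
    gpu = [("chr1", 100_000, 500_000), ("chr1", 1_010_000, 1_500_000), ("chr3", 50_000, 300_000)]
    for category, loops in compare_loops(cpu, gpu).items():
        print(category, len(loops))
```

With this breakdown, exact + near + cpu_only equals the CPU loop count and exact + near + gpu_only equals the GPU loop count, which is the same bookkeeping the figures above satisfy (for example, 2261 + 13 + 11 = 2285 and 2261 + 13 + 52 = 2326 for the untreated megamap).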