############### 5.10 genomeCoverageBed ############### **genomeCoverageBed** computes a histogram of feature coverage (e.g., aligned sequences) for a given genome. Optionally, by using the **-d** option, it will report the depth of coverage at *each base* on each chromosome in the genome file (**-g**). ========================================================================== 5.10.1 Usage and option summary ========================================================================== Usage: :: genomeCoverageBed [OPTIONS] -i -g NOTE: genomeCoverageBed requires that the input BED file be sorted by chromosome. A simple sort -k1,1 will suffice. =========================== =============================================================================================================================================================================================================== Option Description =========================== =============================================================================================================================================================================================================== **-ibam** BAM file as input for coverage. Each BAM alignment in A added to the total coverage for the genome. Use "stdin" if passing it with a UNIX pipe: For example: | samtools view -b | genomeCoverageBed -ibam stdin -g hg18.genome **-d** Report the depth at each genome position. *Default behavior is to report a histogram*. **-max** Combine all positions with a depth >= max into a single bin in the histogram. **-bg** Report depth in BedGraph format. For details, see: http://genome.ucsc.edu/goldenPath/help/bedgraph.html **-bga** Report depth in BedGraph format, as above (i.e., -bg). However with this option, regions with zero coverage are also reported. This allows one to quickly extract all regions of a genome with 0 coverage by applying: "grep -w 0$" to the output. **-split** Treat "split" BAM or BED12 entries as distinct BED intervals when computing coverage. For BAM files, this uses the CIGAR "N" and "D" operations to infer the blocks for computing coverage. For BED12 files, this uses the BlockCount, BlockStarts, and BlockEnds fields (i.e., columns 10,11,12). **-strand** Calculate coverage of intervals from a specific strand. With BED files, requires at least 6 columns (strand is column 6). =========================== =============================================================================================================================================================================================================== ========================================================================== 5.10.2 Default behavior ========================================================================== By default, **genomeCoverageBed** will compute a histogram of coverage for the genome file provided. The default output format is as follows: 1. chromosome (or entire genome) 2. depth of coverage from features in input file 3. number of bases on chromosome (or genome) with depth equal to column 2. 4. size of chromosome (or entire genome) in base pairs 5. fraction of bases on chromosome (or entire genome) with depth equal to column 2. For example: :: cat A.bed chr1 10 20 chr1 20 30 chr2 0 500 cat my.genome chr1 1000 chr2 500 genomeCoverageBed -i A.bed -g my.genome chr1 0 980 1000 0.98 chr1 1 20 1000 0.02 chr2 1 500 500 1 genome 0 980 1500 0.653333 genome 1 520 1500 0.346667 ========================================================================== 5.10.3 (-max)Controlling the histogram's maximum depth ========================================================================== Using the **-max** option, **genomeCoverageBed** will "lump" all positions in the genome having feature coverage greather than or equal to **max** into the **max** histogram bin. For example, if one sets **-max** equal to 50, the max depth reported in the output will be 50 and all positions with a depth >= 50 will be represented in bin 50. ========================================================================== 5.10.4 (-d)Reporting "per-base" genome coverage ========================================================================== Using the **-d** option, **genomeCoverageBed** will compute the depth of feature coverage for each base on each chromosome in genome file provided. The "per-base" output format is as follows: 1. chromosome 2. chromosome position 3. depth (number) of features overlapping this chromosome position. For example: :: cat A.bed chr1 10 20 chr1 20 30 chr2 0 500 cat my.genome chr1 1000 chr2 500 genomeCoverageBed -i A.bed -g my.genome -d | head -15 | tail -n 10 chr1 6 0 chr1 7 0 chr1 8 0 chr1 9 0 chr1 10 0 chr1 11 1 chr1 12 1 chr1 13 1 chr1 14 1 chr1 15 1 ========================================================================== 5.1.13 (-split)Reporting coverage with spliced alignments or blocked BED features ========================================================================== As described in section 1.3.19, genomeCoverageBed will, by default, screen for overlaps against the entire span of a spliced/split BAM alignment or blocked BED12 feature. When dealing with RNA-seq reads, for example, one typically wants to only screen for overlaps for the portions of the reads that come from exons (and ignore the interstitial intron sequence). The **-split** command allows for such overlaps to be performed. For additional details, please visit the Usage From The Wild site and have a look at example 5, contributed by Assaf Gordon.