Usage:
breseq -r reference1.gbk [-r reference2.gbk ...] reads1.fastq [reads2.fastq, reads3.fastq...]
Run the breseq pipeline for mutation prediction from genome re-sequencing data.
Required options:
Input reference genome sequence files in GenBank format. If there are multiple reference sequences stored in separate GenBank files (e.g., a bacterial genome and a plasmid), this option can be supplied multiple times.
The remaining arguments at the command line are the FASTQ input files of reads. FASTQ files with base quality scores that are not in SANGER format will be converted. In addition, reads with >50% N bases will be removed from the converted FASTQ file by default. breseq re-calibrates the error rates for each FASTQ file separately, so data sets that were generated independently should be stored in different input files.
Expert options:
Ignore bases with a quality score lower than this value when calling mutations. This accommodates Illumina formats that use quality scores of 2 to flag bad data. These bases are still used for aligning to the reference genome and are shown highlighted in yellow when drawing alignments, but they do not contribute to read alignment evidence. Default: 3
Require the entire read to match for it to be counted as aligned.
Require at least this number of nucleotides in the read to match for it to be counted as aligned.
Identify and predict the frequencies of SNPs and small indels that are polymorphic (appear in only a subpopulation of reads). See Polymorphism Prediction for additional options and note that this option is still experimental.
Usage:
breseq bam2aln [-b input.bam] [-f input.fasta] [-o output/path] region1 [region2 region3 ...]
Creates HTML pileup files displaying reads aligned to each specified region.
Options:
BAM database file of read alignments. Defaults: reference.bam, data/reference.bam.
FASTA file of reference sequences. Defaults: reference.fasta, data/reference.fasta.
Output path. If there are multiple regions, must be a directory path, and all output files will be output here with names region1.html, region2.html, ... If there is just one region, the output file will be given this name if it is not the name of an already existing directory. Default: current path.
Maximum number of reads that will be aligned to a region. If there are more than this many reads, then the reads displayed are randomly chosen and a warning is added to the output. Default: 1000.
Regions to create output for must be provided in the format FRAGMENT:START-END, where FRAGMENT is a valid identifier for one of the sequences in the FASTA file, and START and END are 1-indexed coordinates of the beginning and end positions. Any read overlapping these positions will be shown. A separate output file is created for each region.
Usage:
breseq bam2cov [-b input.bam] [-f input.fasta] [-o output/path] region1 [region2 region3 ...]
Create a coverage plot or table for the specified region or regions.
Options:
BAM database file of read alignments. Defaults: reference.bam, data/reference.bam.
FASTA file of reference sequences. Defaults: reference.fasta, data/reference.fasta.
Base name of output files. Region specification (seq_id:start-end) appended if there are multiple output files. Default: seq_id:start-end for single regions or the current directory for multiple regions.
Format of output plot: PNG or PDF. Default: PNG
Create tab delimited file of coverage instead of a plot.
Only plot/tabulate total coverage, not per strand coverage.
Number of positions to output coverage information for in interval (0=ALL). Default: 600
Regions to create output for must be provided in the format FRAGMENT:START-END, where FRAGMENT is a valid identifier for one of the sequences in the FASTA file, and START and END are 1-indexed coordinates of the beginning and end positions. A separate output file is created for each region.