breseq produces results as a stand-alone HTML archive in the output directory. You can load these files directly in a browser, or copy the directory to a server to allow access via the web.
Important files include:
Each row displays a predicted mutation in the re-sequenced sample relative to the reference. Examples showing how the format varies depending on the type of mutation are provided in the following sections.
Column descriptions:
All gene information is taken from the input GenBank files. How informative the descriptions of mutation effects are depends entirely on the quality of the annotation in the reference sequence files.
Replacement of the reference T at position 70,867 with a C inside the araA gene. This mutation changes the 92nd codon of araA from GAC to GGC, causing an aspartic acid (D) to glycine (G) substitution in the encoded protein. The base change in the codon is the reverse-complement of the base change in the genome because this gene is encoded on the bottom strand of the reference sequence.
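The strand logic in this example can be checked with a few lines of Python. This is only an illustrative sketch, not part of breseq: the helper `reverse_complement` is a hypothetical name, and the codon position of the changed base (the middle base of GAC) is inferred from the GAC to GGC change described above.

```python
# Complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (hypothetical helper)."""
    return seq.translate(COMPLEMENT)[::-1]


# Base change as reported on the top strand of the genome: T -> C.
ref_top, new_top = "T", "C"

# The same change as read on the bottom strand, where araA is encoded.
ref_bottom = reverse_complement(ref_top)  # "A"
new_bottom = reverse_complement(new_top)  # "G"

# Apply the bottom-strand change to the middle base of codon 92 (GAC).
codon = "GAC"
assert codon[1] == ref_bottom
mutated = codon[0] + new_bottom + codon[2]
print(mutated)  # GGC
```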
Replacement of the reference T at position 1,298,712 with a G in the intergenic region between the ychE and oppA genes. The mutation is downstream of ychE by 674 bases (because this gene is before it in position and on the top strand of the reference) and upstream of oppA by 64 bases (because this gene is after it in position and also on the top strand of the genome).
Replacement of two adjacent reference bases at positions 47,977 and 47,978 with AC in an intergenic region. This mutation is near the end of the genome, so there is no gene after it. It is downstream of lambdap79 by 33 bases (because this gene is before it in position and on the top reference strand).
For insertion mutations, new bases are added after the specified position.
Insertion of a G after reference position 3,893,551. This mutation is 6 nucleotides downstream of kup and 50 nucleotides upstream of insJ-5.
Insertion of CC after reference position 3,290,071 inside the gltB gene. This mutation occurs after the 205th base of the 4554-base open reading frame of this gene.
For deletion rows, the position column gives the first missing reference base and the mutation column gives the size of the deletion. Thus, the deleted reference region extends from position to position + size - 1.
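As a worked illustration of this coordinate rule, the sketch below computes the inclusive range of deleted bases. The function name `deleted_range` is hypothetical and not part of breseq; the input values are the 6,934-base and single-base deletions used as examples in this section.

```python
def deleted_range(position: int, size: int) -> tuple[int, int]:
    """Return the inclusive (first, last) reference coordinates of a deletion.

    `position` is the first missing reference base (1-based) and `size`
    is the number of deleted bases, as reported in a deletion row.
    """
    if size < 1:
        raise ValueError("deletion size must be at least 1")
    return position, position + size - 1


# The 6,934-base deletion starting at position 3,894,997.
print(deleted_range(3_894_997, 6_934))  # (3894997, 3901930)

# A single-base deletion starts and ends at the same coordinate.
print(deleted_range(1_332_148, 1))  # (1332148, 1332148)
```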
A 6,934-base deletion starting at position 3,894,997. The annotation column reports that it is IS150-mediated, because this repetitive element occurs on one margin of the deletion. This deletion begins before the rbsD gene and ends within the yieO gene. This mutation is supported by New Junction (JC) and Missing coverage (MC) evidence.
A single-base deletion at position 1,332,148 in an intergenic region. The deleted nucleotide is located 131 bp downstream of the topA gene and 79 bp upstream of the cysB gene. This mutation is supported by Read alignment (RA).
Mobile element insertions can result in duplications of the target site. The provided position is the first of such possibly duplicated bases. The number in parentheses in the annotation, e.g. (+7) bp, gives how many bases are duplicated, starting at the indicated position, so that they now occur both before and after the new copy of the mobile element. Additional bases may be added or deleted at either end of the mobile element as a result of its insertion. These are indicated outside of the double colons (::) on the affected side of the mobile element name in the annotation column. The strand of the newly inserted mobile element is indicated in parentheses after its name.
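To make the target site duplication concrete, here is a minimal sketch that builds the mutated sequence from a reference, a position, and a duplication length. Everything in it is hypothetical: the function `insert_mobile_element`, the toy reference, and the lowercase placeholder standing in for the element. It does not model element orientation or extra bases gained or lost at the margins.

```python
def insert_mobile_element(reference: str, position: int,
                          dup_len: int, element: str) -> str:
    """Insert `element` so that the target site flanks it on both sides.

    `position` is the 1-based coordinate of the first duplicated base and
    `dup_len` is the number of duplicated bases.
    """
    start = position - 1      # 0-based index of the first duplicated base
    end = start + dup_len     # index just past the last duplicated base
    # Left part ends with the target site; right part begins with it again.
    return reference[:end] + element + reference[start:]


reference = "AAACGTTTT"   # toy reference sequence
element = "nnnn"          # placeholder for the mobile element sequence

# Duplicate the 3 bases starting at position 4 ("CGT").
print(insert_mobile_element(reference, 4, 3, element))  # AAACGTnnnnCGTTTT
```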
Insertion of an IS3 element in the reverse orientation. Bases 3,571,196 through 3,571,198 are duplicated, so that they now occur on each margin of the newly inserted element. In addition, the sequence TCA was added directly after the IS3 element on the right margin. The duplicated bases are positions 397 through 399 of the 435-base uspA reading frame.
Insertion of an IS186 element in the forward orientation. Bases 4,524,522 through 4,524,527 are duplicated, so that they now occur on each margin of the newly inserted element. These bases are 494 through 499 of the 549-base fimA reading frame.
Insertion of an IS186 element in the forward orientation. Bases 2,736,667 through 2,736,675 are duplicated, so that they now occur on each margin of the newly inserted element. Two bases of the mobile element on the left margin were lost, apparently during insertion. The duplicated bases are 818 through 826 of the 1425-base ascB reading frame.
For duplications and other tandem amplifications, position indicates the first repeated base.
Duplication of 8 bp inside the pykF gene. The bases 1,733,290 to 1,733,297 now appear twice at this location. This mutation would cause a frameshift.
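The frameshift statement follows from the length of the duplication not being a multiple of three. The sketch below illustrates that check, assuming the change lies entirely within a reading frame; `causes_frameshift` is a hypothetical helper, not a breseq function.

```python
def causes_frameshift(net_length_change: int) -> bool:
    """Return True if a length change inside a reading frame shifts the frame.

    `net_length_change` is the number of bases added (positive) or
    removed (negative) within the coding sequence.
    """
    return net_length_change % 3 != 0


print(causes_frameshift(8))  # True: the 8 bp duplication in pykF
print(causes_frameshift(3))  # False: a hypothetical in-frame change
```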
Evidence is shown in tables that have different fields from mutation predictions and that provide more detailed information about the support for genomic changes. Clicking on any evidence link for a mutation prediction will also bring up pages with tables showing all items of evidence that breseq used to predict the mutational event.
Each JC row consists of two sub-rows, each describing one side of the junction in the reference sequence. If a sub-row is highlighted in orange, that side of the junction maps ambiguously to more than one place in the reference. In this case, the coordinate shown is an example of one such site.
Column descriptions:
Examples:
This image shows the page from clicking on the * link for this junction. A partial alignment of reads to the new junction is shown. Notice the two joined pieces of the reference sequence at the top, to which the reads align. A piece is on the bottom strand of the reference if its start is greater than its end.
This image shows the page from clicking on one of the ? links for this junction. Notice that only a piece of the reads maps to this region and that it ends where these reads begin matching a disjoint region in the reference genome. Clearly the old junction is not supported by any reads in this sample and must no longer exist. Once again, only a partial alignment is shown.
Column descriptions:
Example:
Partial alignment of reads showing that most support a base substitution. The > and < for each named read indicate the strand of the reference sequence that it matched (top and bottom, respectively).
Column descriptions:
Example:
Read coverage depth around the missing coverage. The white area shows the maximal boundaries of the predicted range.
The graphed lines are labeled “unique” for reads with only one best match to the reference genome and “repeat” for reads with multiple equally good matches to repeat sequences. Repeat reads are down-weighted by how many matches they have, i.e. a read matching three places contributes 1/3 to the coverage depth at each matched site. Within each type, coverage is graphed separately for reads mapping to the “top” and “bottom” strands of the reference sequence (i.e., forward and reverse complement matches) to aid in detecting artifacts, and these sum to the “total” coverage value.
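The weighting scheme can be sketched as follows. This is a simplified illustration, not breseq's implementation: the function `coverage`, the input layout (one list of `(start, end, strand)` best matches per read), and the toy coordinates are all assumptions, as is treating “total” as the sum over both read types and both strands.

```python
from collections import defaultdict
from fractions import Fraction


def coverage(reads):
    """Accumulate per-position coverage, down-weighting multiply mapped reads.

    `reads` is a list with one entry per read; each entry is a list of
    (start, end, strand) tuples, one per equally good match (1-based,
    inclusive coordinates; strand is "top" or "bottom").
    """
    depth = defaultdict(lambda: defaultdict(Fraction))
    for matches in reads:
        kind = "unique" if len(matches) == 1 else "repeat"
        weight = Fraction(1, len(matches))  # e.g. 1/3 for three matches
        for start, end, strand in matches:
            for pos in range(start, end + 1):
                depth[pos][f"{kind}_{strand}"] += weight
                depth[pos]["total"] += weight
    return depth


reads = [
    [(1, 5, "top")],                                       # unique read
    [(3, 7, "bottom")],                                    # unique read
    [(4, 8, "top"), (20, 24, "top"), (40, 44, "bottom")],  # repeat read
]
depth = coverage(reads)
print(depth[4]["total"])        # 7/3
print(depth[20]["repeat_top"])  # 1/3
```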
breseq outputs several files that can be used by other software programs to further analyze the final processed read data.
You can visualize the “raw data” (how breseq aligned reads to the reference genome) using the Integrative Genomics Viewer (IGV) and files located in the data folder created by breseq.
- Click ‘File’, and then ‘Import Genome...’
- Fill out the requested information: ‘ID’, ‘Name’
- Choose the FASTA file: data/reference.fasta.
- The other fields are optional.
- Click ‘File’, and then ‘Load from File...’
- Choose the GFF3 file: data/reference.gff3.
- Click ‘File’, and then ‘Load from File...’
- Choose the BAM file: data/reference.bam.