genewise - options: gene model



-codon

[codon.table] Codon file. The default is the universal genetic code, but you can supply your own table.

-gene

[human.gf] Gene parameter file. Provides the statistics for different gene models. Currently human.gf and worm.gf are provided. The statistics are too complicated to explain here.

-subs

[1e-05] Substitution error rate, i.e. the assumed probability of base substitutions in the sequencing reaction/assembly that provided the DNA sequence. The substitution error rate is what dominates the penalty for stop codons: a higher error rate implies a smaller penalty for stop codons.

-indel

[1e-05] Insertion/deletion error rate, i.e. the assumed probability of indel events in the sequencing reaction/assembly that provided the DNA sequence. The indel rate is what provides the penalty for frameshift errors. A higher error rate implies a smaller penalty for indels.
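
The inverse relationship between error rate and penalty described for -subs and -indel can be illustrated with a small sketch. It assumes the penalty behaves like the negative log of the error probability, which is a common convention for probabilistic models but is not taken from the genewise source; the function name error_penalty_bits is hypothetical, and the real score also depends on other model terms.

```python
import math


def error_penalty_bits(error_rate: float) -> float:
    """Illustrative penalty, in bits, for an event with the given probability.

    Assumption: the penalty scales like -log2(probability). This is a
    sketch of the general idea, not genewise's actual scoring code.
    """
    if not 0.0 < error_rate <= 1.0:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -math.log2(error_rate)


# Compare the documented default (1e-05) with a higher, hypothetical rate.
for rate in (1e-5, 1e-3):
    print(f"rate {rate:g}: penalty about {error_penalty_bits(rate):.1f} bits")
```

Under this assumption, raising the rate from 1e-05 to 1e-03 lowers the penalty, so stop codons or frameshifts in the DNA become cheaper to explain as sequencing errors.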

-cfreq

[model/flat] Whether to use codon bias [default flat]. A reasonably pointless option now, as it only applies when using -syn flat. If codon bias is modelled, then common codons score more than uncommon ones for the same amino acid.

-splice

[model/flat] Whether to use the splice model or GT/AG [default model]. Use either the full-blown model for splice sites, or a simplistic GT/AG rule. Generally, if you are using a DNA sequence from human or worm, leave this on. If you are using a very different species (e.g. a plant), switch it off.

-intron

[model/tied] Use tied model for introns [default tied]. Controls whether the intron base distribution affects the parse. Because varying GC content and/or repeats can seriously drag the algorithm away from correct parses when intron base distribution is used, this is usually switched off.

-null

[syn/flat] Random model as synchronous or flat [default syn]. Whether to use a null model which is a simple base distribution (called flat), or to imagine that the Viterbi path is being compared to a gene-based null model that makes all the same gene exon/intron boundaries (synchronous). The latter is basically a hack which de-emphasises the gene prediction machinery and tries to trust the homology machinery (not ideal!).

-pg

[file] Potential gene file (a heuristic for speeding up alignments). The potential gene file should look like:

pgene # stands for potential gene
ptrans # stands for potential transcript
pexon <start-in-dna> <end-in-dna> <start-in-protein> <end-in-protein>
pexon <start-in-dna> <end-in-dna> <start-in-protein> <end-in-protein>
...
endptrans
<another ptrans if you like>
endpgene

When this file is read in, it provides a series of start/end positions in the DNA and protein sequences, around which an envelope of possible alignment area is drawn. The alignment is then calculated only in this area.

This feature has not been well tested yet. Reports of any potential bugs are very useful.
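
The layout above can be generated programmatically. The following is a minimal sketch, assuming that the "# stands for ..." notes in the layout are explanatory annotations rather than required file content, and that coordinates are passed through unchanged (the page does not say whether they are 0- or 1-based). The function name format_potential_gene and the example coordinates are hypothetical.

```python
from typing import List, Tuple

# One potential exon:
# (start_in_dna, end_in_dna, start_in_protein, end_in_protein)
Exon = Tuple[int, int, int, int]


def format_potential_gene(transcripts: List[List[Exon]]) -> str:
    """Render one pgene block containing one ptrans block per transcript."""
    lines = ["pgene"]
    for exons in transcripts:
        lines.append("ptrans")
        for dna_start, dna_end, prot_start, prot_end in exons:
            lines.append(f"pexon {dna_start} {dna_end} {prot_start} {prot_end}")
        lines.append("endptrans")
    lines.append("endpgene")
    return "\n".join(lines) + "\n"


# Made-up coordinates for a single transcript with two potential exons.
example = format_potential_gene([[(1200, 1350, 1, 50), (2100, 2250, 51, 100)]])
print(example, end="")
```

The resulting text could be written to a file and passed via -pg, subject to the caveat above that this feature is not well tested.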

-alg

[623/623L/2193/2193L/6LITE] Algorithm used [default 623/623L]. You should read the section on algorithms (4.4). Basically, 623 and 623L are computationally cheaper and more robust with respect to repeats etc. 2193 and 2193L are much more expensive and more sensitive to changes in parameters, but potentially more accurate.

-kbyte

[2000] Maximum number of kilobytes used in the main calculation. Indicates how much memory can be used for the dynamic programming calculation.
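
When genewise is driven from a script, it can help to check option values before launching a run. The sketch below builds an argument list from the options described on this page and rejects values outside the listed choices. The helper name build_genewise_args and the file names are hypothetical, and placing the protein file before the DNA file after the options is an assumption about the invocation that this page does not state.

```python
from typing import Dict, List, Set

# Allowed values, copied from the option descriptions on this page.
CHOICES: Dict[str, Set[str]] = {
    "-cfreq": {"model", "flat"},
    "-splice": {"model", "flat"},
    "-intron": {"model", "tied"},
    "-null": {"syn", "flat"},
    "-alg": {"623", "623L", "2193", "2193L", "6LITE"},
}


def build_genewise_args(protein_file: str, dna_file: str,
                        options: Dict[str, object]) -> List[str]:
    """Return an argument list, validating flags that have fixed choices."""
    args = ["genewise"]
    for flag, value in options.items():
        allowed = CHOICES.get(flag)
        if allowed is not None and str(value) not in allowed:
            raise ValueError(f"{flag} must be one of {sorted(allowed)}, got {value!r}")
        args += [flag, str(value)]
    # Assumption: positional arguments follow the options, protein first.
    args += [protein_file, dna_file]
    return args


args = build_genewise_args(
    "protein.fa", "dna.fa",
    {"-alg": "623", "-splice": "flat", "-kbyte": 4000},
)
print(" ".join(args))
```

The list form is suitable for passing to subprocess.run without a shell; free-form options such as -subs, -indel and -kbyte are passed through unchecked.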


Eric DEVEAUD 2015-02-27