MView: Frequently Asked Questions

What is MView?
Examples
Obtaining the software
What input formats are recognised?
What output formats are there?
Can MView process data from a Web page?
Command line options
How can I print?
Can I switch off HTML markup?
What do the percent identities mean?
Some sequences are incomplete or contain strange characters?
Memory usage
System requirements
Installation
Copyright and Licensing information
Citation
Acknowledgments

What is MView?

MView reformats the results of a sequence database search (BLAST, FASTA, etc) or a multiple alignment (MSF, PIR, CLUSTAL, etc) adding optional HTML markup to control colouring and web page layout. MView is not a multiple alignment program, nor is it a general purpose alignment editor.

Examples

Some examples illustrate use of various command line options.

Obtaining the software

The latest version of the software can be downloaded from the SourceForge MView project area as a UNIX/Linux gzipped tar archive or 'tarball'. MView is distributed under the terms of the GPL.

What input formats are recognised?

The code has been tested for the following formats and versions for protein and nucleotide sequences:

The "plain" multiple alignment format is a trivial format comprising a column of identifiers and an adjacent column of aligned sequences. If you can convert some strange alignment to this you can always read it into MView. More formats can be expected to follow.
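As a rough illustration of the "plain" layout described above, here is a minimal Python sketch that writes identifiers and aligned sequences as two columns. The function name to_plain, the '-' gap character and the two-space column separator are my assumptions for the example, not a specification of what the MView parser requires.

```python
def to_plain(rows):
    """Lay out (identifier, aligned_sequence) pairs as two columns."""
    lengths = {len(seq) for _, seq in rows}
    if len(lengths) > 1:
        raise ValueError("aligned sequences must all have the same length")
    width = max(len(ident) for ident, _ in rows)
    # Pad identifiers so the sequence column starts at the same offset on every line.
    return "\n".join(f"{ident.ljust(width)}  {seq}" for ident, seq in rows)


# Hypothetical three-row alignment; '-' marks a gap (an assumption).
rows = [("query", "MKV-LAAGIT"), ("hit_1", "MKVQLA-GIT"), ("hit_2", "MRV-LSAGLT")]
print(to_plain(rows))
```

Saving the printed text to a file should give something suitable for trying with -in plain, provided the identifiers contain no whitespace.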
Note that as of version 1.37 MView automatically selects an appropriate parser for the particular BLAST or FASTA program/version once it knows it is dealing with input from either program suite.

BLAST (NCBI series 2.2)
format	tested versions	status	MView option
blastp	2.2.6	ok	`-in blast`
blastn	2.2.6	ok	`-in blast`
blastx	2.2.6	ok	`-in blast`
tblastn	2.2.6	ok	`-in blast`
tblastx	2.2.6	ok	`-in blast`
psi-blast	2.2.6	ok	`-in blast`
phi-blast	2.2.6	ok	`-in blast`

BLAST (NCBI series 2.0)
format	tested versions	status	MView option
blastp	2.0.4, 2.0.5, 2.0.9, 2.0.10	ok	`-in blast`
blastn	2.0.4, 2.0.5, 2.0.9, 2.0.14	ok	`-in blast`
blastx	2.0.5, 2.0.9	ok	`-in blast`
tblastn	2.0.5, 2.0.10	ok	`-in blast`
tblastx	2.0.5	ok	`-in blast`
psi-blast	2.0.2, 2.0.4, 2.0.5, 2.0.6, 2.0.10	ok	`-in blast`
phi-blast	2.0.9	ok	`-in blast`

BLAST (NCBI series 1.4)
format	tested versions	status	MView option
blastp	1.4.7, 1.4.9	ok	`-in blast`
blastn	1.4.9	ok	`-in blast`
blastx	1.4.9	ok	`-in blast`
tblastn	1.4.9	ok	`-in blast`
tblastx	1.4.9	ok	`-in blast`

BLAST (WashU series 2.0)
format	tested versions	status	MView option
blastp	2.0a13, 2.0a19, 13-Apr-2004	ok	`-in blast`
blastn	2.0a19, 13-Apr-2004	ok	`-in blast`
blastx	2.0a19, 13-Apr-2004	ok	`-in blast`
tblastn	2.0a19, 13-Apr-2004	ok	`-in blast`
tblastx	2.0a19, 13-Apr-2004	ok	`-in blast`

FASTA (series 3.4)
format	tested versions	status	MView option
fasta34	3.4t23	ok	`-in fasta`
fastx34	3.4t23	ok	`-in fasta`
fasty34	3.4t23	ok	`-in fasta`
tfasta34	3.4t23	ok	`-in fasta`
tfastx34	3.4t23	ok	`-in fasta`
tfasty34	3.4t23	ok	`-in fasta`

FASTA (series 3.0 - 3.3)
format	tested versions	status	MView option
fasta3	3.0t76, 3.2t01, 3.2t07, 3.3t01, 3.3t07	ok	`-in fasta`
tfastx3	3.0t82, 3.1t07	ok	`-in fasta`

FASTA (series 2)
format	tested versions	status	MView option
fasta	2.0u	ok	`-in fasta`
tfastx	2.0u63	ok	`-in fasta`

FASTA (series 1)
format	tested versions	status	MView option
fasta	1.6c24	ok	`-in fasta`

multiple alignment formats
format	versions	status	MView option
plain	-	ok	`-in plain`
Pearson/FASTA	-	ok	`-in pearson`
PIR	-	ok	`-in pir`
MSF	-	ok	`-in msf`
CLUSTAL W	1.60, 1.70, 1.83	ok	`-in clustal`
MaxHom/HSSP	1.0 1991	ok	`-in hssp`
MULTAS/MULTAL	-	experimental	`-in multas`
MIPS-ALN	-	experimental	`-in mips`

secondary structure prediction formats
format	versions	status	MView option
jnet -z	-	experimental	`-in jnet`

What output formats are there?

The default behaviour is to (re)produce a multiple alignment from the input data as plain ASCII. HTML markup will be added if any of the HTML-specific or colouring options are set. Other useful output formats are dumps of the input in Pearson/FASTA format (-out pearson), PIR format (-out pir), or MSF format (-out msf) for processing by another program, or as an RDB table for storage/manipulation in relational database form (-out rdb). You can also produce plain columnar output (-out plain) without MView's decorations.

Can MView process data from a Web page?

Basically, no, unless you are lucky or prepared to edit the Web page. The MView parsers are all built to recognise the raw text output produced by the respective programs (BLAST, FASTA, etc.) or to recognise particular flat-file formats (MSF, PIR, etc.). When a site adds HTML markup to this to make a Web page, arbitrary parts are changed/deleted/added polluting the text, so that even dumping the page in text-only format still leaves traces. If you feel that MView is sufficiently useful, you could ask the Web site maintainer to add an MView output option to their service.

Command line options

Probably the first place to start is to invoke MView with:
    mview -help
There are a lot of options, but the commonest ones are detailed here. The basic action of the program is to generate a plain text dump of the input data with percent sequence identities computed with respect to the first sequence in the output.
There are more command line options that I haven't documented below - some were added for locally used features. Expect changes and new options as the software evolves.

Basic usage
Given an existing alignment in a file "data" in "plain" format, the minimal use might be:
    mview -in plain data > data.out
Or you might attach MView on the end of a pipeline:
    some_process | mview -in plain > data.out
To change the input format to scan a FASTA run, also in "data", use:
    mview -in fasta data > data.out

Basic usage - adding HTML
To add some HTML markup a few extra options are needed, for example:
    mview -in fasta -html body data > data.html
produces a page of HTML wrapped inside a <BODY></BODY> tag pair with a coloured background, and you can load this into your Web browser with a URL like "file://your_path/data.html".
If you want a complete Web page, you can use -html full (gives the MIME-type and wraps the output in <HTML> and <BODY> tag pairs) or -html head (the same but without the MIME-type).
To get just the alignment block without these tags use -html data, so that you can merge the output into another web page.
Adding some colour is simple. To colour all the residues:
    mview -in fasta -html head -coloring any data > data.html
and this looks better in my Netscape if the residues are emboldened, so
    mview -in fasta -html head -coloring any -bold data > data.html
Now try colouring by identity to the first sequence:
    mview -in fasta -html head -coloring identity -bold data > data.html
and then make the non-identical residues and gaps grey, instead of black:
    mview -in fasta -html head -coloring identity -bold -symcolor gray -gapcolor gray data > data.html
Now try using an internal style sheet to get blocked colouring. The -bold option is no longer needed:
    mview -in fasta -html head -css on -coloring identity -symcolor gray -gapcolor gray data > data.html
The -in option isn't always necessary. If the filename extension, or the filename itself minus any directory path begins with or contains the first few letters of the valid -in options (eg., mydata.msf or mydata.fasta or tfastx_run1.dat), MView tries to choose a sensible input format, allowing multiple files in mixed formats to be supplied on the command line. The -in option will always override this mechanism but requires that all input files be of the same format.
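The filename-based guessing described above can be pictured with a short Python sketch. This is only my illustration of the idea, not MView's actual code: the list of option names is copied from the tables in this FAQ, while the function name guess_format, the three-letter prefix length and the order in which formats are tried are all assumptions.

```python
import os

# -in option names as listed in this FAQ; the search order is an assumption.
IN_OPTIONS = ["blast", "fasta", "plain", "pearson", "pir", "msf",
              "clustal", "hssp", "multas", "mips", "jnet"]


def guess_format(path, prefix_length=3):
    """Guess an -in value from a filename, ignoring any directory path."""
    name = os.path.basename(path).lower()
    extension = os.path.splitext(name)[1].lstrip(".")
    # First preference: the filename extension begins like an option name.
    for fmt in IN_OPTIONS:
        if extension and extension.startswith(fmt[:prefix_length]):
            return fmt
    # Otherwise: the filename contains the first few letters of an option name.
    for fmt in IN_OPTIONS:
        if fmt[:prefix_length] in name:
            return fmt
    return None  # no guess; an explicit -in would be needed


for example in ("mydata.msf", "mydata.fasta", "tfastx_run1.dat"):
    print(example, "->", guess_format(example))
```

A heuristic like this can misfire on unrelated names that happen to contain a matching prefix, which is one reason an explicit -in is the safer choice in scripts.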

Rulers
Add a ruler along the top, with -ruler on. Only one kind of ruler is currently provided, numbering the columns of the final alignment from M to N (incrementing) or N to M (decrementing) based on the input sequence numbering, if any. This defaults to 1 to the length of the alignment for multiple alignments. TBLASTX rulers differ slightly in that the native query numbering is given in nucleotide units, but MView reports amino acid units instead (using modulo 3 arithmetic).

Alignment colouring modes
There are several ways to colour the alignment:
-coloring any, will colour every residue according to the currently selected palette.
-coloring identity, will colour only those residues that are identical to some reference sequence (usually the query or first row).
-coloring consensus, will colour only those residues that belong to a specified physicochemical class that is conserved in at least a specified percentage of all rows for a given column. This defaults to 70% and may be set to another threshold, eg., -coloring consensus -threshold 80 would specify 80%. Note that the physicochemical classes in question can be confined to individual residues.
-coloring group, is like -coloring consensus, but colours residues by the colour of the class to which they belong.
By default, the consensus computation counts gap characters, so that sections of the alignment may be uncolored where the presence of gaps prevents the non-gap count from reaching the threshold. Setting -con_gaps off prevents this, allowing sequence-only based consensus thresholding.
The default palette assumes the input alignment is of protein sequences and sets their colours according to amino acid physicochemical properties: another palette should be selected for DNA or RNA alignments.
Consensus colouring is complicated and some understanding of palettes and consensus patterns is required first before trying to explain alignment consensus colouring.
Finally, if the first character of a "sequence" is the hash '#' character it will not be coloured by the prevailing colourmap. Instead, a colourmap having a matching name (minus the '#' character) will be used, if it exists. For example, if the input alignment contains a line identified by '#sec-struct' then a colormap called [sec-struct] would apply to any rows containing that string in their identifier. New colourmaps can be loaded from a file using the -colorfile option.

Colour palettes
Palettes have (arbitrary) names. MView assumes a protein alignment and defaults to the palette P1; the -dna option changes the default molecule type to nucleotide and selects the palette D1 instead. Other palettes are selected explicitly using the -colormap option. For example, to select one of the built-in palettes for viewing nucleotide sequences, use -colormap D1.
The built-in palettes can be listed from the command line with -listcolors, and new colour schemes can be defined and loaded from a file using the -colorfile option.
Here are the default palettes:
[P1]
#protein: highlight amino acid physicochemical properties
*        ->  dark-gray            #mismatch
?        ->  light-gray           #unknown
Aa       =>  bright-green         #hydrophobic
Bb       =>  dark-gray            #D or N
Cc       =>  yellow               #cysteine
Dd       =>  bright-blue          #negative charge
Ee       =>  bright-blue          #negative charge
Ff       =>  dark-green           #large hydrophobic
Gg       =>  bright-green         #hydrophobic
Hh       =>  dark-green           #large hydrophobic
Ii       =>  bright-green         #hydrophobic
Kk       =>  bright-red           #positive charge
Ll       =>  bright-green         #hydrophobic
Mm       =>  bright-green         #hydrophobic
Nn       =>  purple               #polar
Pp       =>  bright-green         #hydrophobic
Qq       =>  purple               #polar
Rr       =>  bright-red           #positive charge
Ss       =>  dull-blue            #small alcohol
Tt       =>  dull-blue            #small alcohol
Vv       =>  bright-green         #hydrophobic
Ww       =>  dark-green           #large hydrophobic
Xx       ->  dark-gray            #any
Yy       =>  dark-green           #large hydrophobic
Zz       =>  dark-gray            #E or Q

[D1]
#DNA: highlight purine versus pyrimidine
*        ->  dark-gray            #mismatch
?        ->  light-gray           #unknown
Aa       =>  bright-blue          #purine
Bb       =>  dark-gray            #C or G or T; not A
Cc       =>  dull-blue            #pyrimidine
Dd       =>  dark-gray            #A or G or T; not C
Gg       =>  bright-blue          #purine
Hh       =>  dark-gray            #A or C or T; not G
Kk       =>  dark-gray            #G or T
Mm       =>  dark-gray            #A or C
Nn       =>  dark-gray            #A or C or G or T
Rr       =>  dark-gray            #A or G
Ss       =>  dark-gray            #C or G
Tt       =>  dull-blue            #pyrimidine
Uu       =>  dull-blue            #pyrimidine
Vv       =>  dark-gray            #A or C or G; not T
Ww       =>  dark-gray            #A or T
Xx       ->  dark-gray            #any
Yy       =>  dark-gray            #C or T
When writing a new palette, palette names (in square brackets, ie., [P1] and [D1] above) are case-insensitive. Symbols to be coloured are case-sensitive and may be given as a single character or as a character pair in each definition. Comments beginning with a hash '#' character continue to the end of the line. In these examples, both upper and lowercase versions of each residue are given with their associated colour to ensure that either case is coloured the same. In other contexts it may be desirable to specify different colours separately for upper and lowercase symbols.
A symbol to name mapping can use a predefined colour name (as above) or an explicit hexadecimal RGB code. The arrow separating the symbol(s) from the colour code can be double => or single ->. When style sheets have been selected with -css on, a double arrow means that the colour should be applied to the background of the symbol while a single arrow means that only the letter should be coloured. When Style Sheets are off, only letters can be coloured anyway and the arrows are equivalent.
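To make the palette syntax above concrete, here is a small Python sketch of a reader for such definitions. It is an illustration written from the rules stated in this section, not the parser MView uses; the function name parse_palettes and the returned data layout are assumptions.

```python
import re

HEADER = re.compile(r"^\[(.+)\]$")
# One or two symbols, an arrow, then a colour name or a hexadecimal RGB code,
# optionally followed by a '#' comment.
ENTRY = re.compile(r"^(\S{1,2})\s+(=>|->)\s+(#[0-9A-Fa-f]{6}|[A-Za-z][\w-]*)\s*(?:#.*)?$")


def parse_palettes(text):
    """Return {PALETTE_NAME: {symbol: (colour, 'background' or 'foreground')}}."""
    palettes = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or whole-line comment
        header = HEADER.match(line)
        if header:
            # Palette names are case-insensitive, so store them uppercased.
            current = palettes.setdefault(header.group(1).upper(), {})
            continue
        entry = ENTRY.match(line)
        if entry is None or current is None:
            raise ValueError(f"unrecognised palette line: {raw!r}")
        symbols, arrow, colour = entry.groups()
        target = "background" if arrow == "=>" else "foreground"
        for symbol in symbols:  # symbols are case-sensitive
            current[symbol] = (colour, target)
    return palettes


sample = """
[p1]
#protein: a two-line excerpt
*        ->  dark-gray            #mismatch
Aa       =>  bright-green         #hydrophobic
"""
print(parse_palettes(sample))
```

The 'background' versus 'foreground' distinction only matters when style sheets are on, as explained above; with them off both arrows would colour the letter.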
Predefined colours are defined as in the following short segment of the built in palette:
#Palette:
#color                     : #RGB 
color black                : #000000
color white                : #ffffff
color red                  : #ff0000
color green                : #00ff00
color blue                 : #0000ff
color cyan                 : #00ffff
color magenta              : #ff00ff
color yellow               : #ffff00    
...
The full range of colours and palettes can be dumped out from MView with the -listcolors or -listcolors -html head options. The latter will produce a correspondingly coloured web page suitable for viewing in a browser.

Consensus patterns
A block of consensus lines can be added beneath the alignment using -consensus on. By default, this adds 4 extra lines giving consensus patterns computed at thresholds of 100,90,80,70%.
Consensus patterns are based on residue equivalence classes, that is, sets of residues that share some physicochemical property. There are two default consensus group definitions for protein P1 and nucleotide D1 alignments, the latter being selected with the -dna option.
At a given percentage threshold, the most discriminating equivalence class is chosen to represent the residues in a given column and an associated symbol is displayed. For example, the default protein and nucleotide consensus groups define the following symbols and equivalence class mappings:
[P1]
#protein consensus: report conserved physicochemical classes, derived from
#the Venn diagrams of:
# Taylor W. R. (1986). The classification of amino acid conservation.
# J. Theor. Biol. 119:205-218.
#as used in:
# Bork, P., Brown, N.P., Hegyi, H., Schultz, J. (1996). The protein
# phosphatase 2C (PP2C) superfamily: Detection of bacterial homologues.
# Protein Science. 5:1421-1425.
#description     =>  symbol  members
*                =>  .
A                =>  A       { A }
C                =>  C       { C }
D                =>  D       { D }
E                =>  E       { E }
F                =>  F       { F }
G                =>  G       { G }
H                =>  H       { H }
I                =>  I       { I }
K                =>  K       { K }
L                =>  L       { L }
M                =>  M       { M }
N                =>  N       { N }
P                =>  P       { P }
Q                =>  Q       { Q }
R                =>  R       { R }
S                =>  S       { S }
T                =>  T       { T }
V                =>  V       { V }
W                =>  W       { W }
Y                =>  Y       { Y }
alcohol          =>  o       { S, T }
aliphatic        =>  l       { I, L, V }
aromatic         =>  a       { F, H, W, Y }
charged          =>  c       { D, E, H, K, R }
hydrophobic      =>  h       { A, C, F, G, H, I, K, L, M, R, T, V, W, Y }
negative         =>  -       { D, E }
polar            =>  p       { C, D, E, H, K, N, Q, R, S, T }
positive         =>  +       { H, K, R }
small            =>  s       { A, C, D, G, N, P, S, T, V }
tiny             =>  u       { A, G, S }
turnlike         =>  t       { A, C, D, E, G, H, K, N, Q, R, S, T }

[D1]
#DNA consensus: report conserved ring types
#description     =>  symbol  members
*                =>  .
A                =>  A       { A }
C                =>  C       { C }
G                =>  G       { G }
T                =>  T       { T }
U                =>  U       { U }
purine           =>  r       { A, G }
pyrimidine       =>  y       { C, T, U }
Alternative equivalence classes can be selected using -con_groupmap, the available list of built-ins can be seen with -listgroups, and new groups can be defined in the same format and read in from a file using -groupfile.
Alternative thresholds to be displayed can be specified as a comma-separated list using the -con_threshold option.
Tip: A useful capability is to control whether only consensus properties (-con_ignore singleton) or just the conserved residues themselves (-con_ignore class) are displayed in consensus lines. The default is to show both using whichever equivalence class is most specific.
By default, the consensus computation counts gap characters, so that sections of the alignment may have gaps as the consensus. Setting -con_gaps off prevents this, producing consensus patterns based only on sequence.
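The choice of a consensus symbol for one alignment column can be sketched in Python as follows. This is a simplified illustration, not MView's implementation: I take "most discriminating" to mean the class with the fewest members that still reaches the threshold, break ties alphabetically by symbol, and model gap counting only through the denominator (a gap is never reported as the consensus here). The function name consensus_symbol is hypothetical; the class memberships are copied from the [P1] groups above.

```python
# Property classes from the default protein groups: symbol -> members.
PROPERTY_CLASSES = {
    "o": set("ST"), "l": set("ILV"), "a": set("FHWY"), "c": set("DEHKR"),
    "h": set("ACFGHIKLMRTVWY"), "-": set("DE"), "p": set("CDEHKNQRST"),
    "+": set("HKR"), "s": set("ACDGNPSTV"), "u": set("AGS"),
    "t": set("ACDEGHKNQRST"),
}


def consensus_symbol(column, threshold=70.0, count_gaps=True, gap="-"):
    """Return the symbol of the smallest class conserved at >= threshold percent."""
    residues = [c.upper() for c in column if c != gap]
    total = len(column) if count_gaps else len(residues)
    if total == 0:
        return "."
    # Each residue present in the column is also a class of size one.
    classes = {r: {r} for r in set(residues)}
    classes.update(PROPERTY_CLASSES)
    # Try the most specific (smallest) classes first.
    for symbol, members in sorted(classes.items(), key=lambda kv: (len(kv[1]), kv[0])):
        hits = sum(1 for r in residues if r in members)
        if 100.0 * hits / total >= threshold:
            return symbol
    return "."  # nothing conserved at this threshold


print(consensus_symbol("SSSTT"))                       # serine/threonine column
print(consensus_symbol("II-LV", 90))                   # the gap counts against the threshold
print(consensus_symbol("II-LV", 90, count_gaps=False))  # the gap is ignored
```

The last two calls show the effect described above for -con_gaps: with gaps counted, four aliphatic residues out of five rows give 80% and miss a 90% threshold, whereas ignoring the gap gives four out of four.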
You can specify a colour scheme for the consensus lines using -con_coloring and -con_colormap to change the default palette (PC1 for protein or DC1 for nucleotide). These options are analogous to those for controlling the alignment colouring and follow the same naming scheme.

Alignment consensus colouring
This section assumes an understanding of palettes and consensus patterns.
Colouring of an alignment by consensus determines which residues to colour and the colours to use based on (1) the consensus threshold chosen for the colouring operation (covered in the section on alignment colouring modes), (2) a consideration of the common physicochemical properties of the residues in that column, and (3) the chosen colour scheme:
Given the most specific equivalence class describing the column using the prevailing consensus equivalence classes, any residues in the column belonging to that class will be coloured using the prevailing palette.
In practice, for the default situation of a protein alignment and no special selection of palettes or consensus groups from the command line, the P1 equivalence classes and the P1 colour palette will be used (or their D1 counterparts if the -dna option is set).
Tip: If you want to see only the conserved residues above the threshold (ie., only one type of conserved residue per column), add the option -ignore class.
Alternative consensus classes and palettes can be specified using -groupmap and -colormap. Note that these are distinct from any settings used to control displayed consensus lines, although the option naming is similar.

Sequence numbering or ranking
One can colour and compute identities with respect to a sequence other than the first/query sequence using the -reference option. This takes either the sequence identifier or an integer argument corresponding to the ranking or ordering of a sequence. For multiple alignment input formats, sequences are numbered from 1, while for searches the hits are numbered from 1, but the query itself is 0, so beware.

Labelling and pagination
The labelling information can be too broad, so you can switch some off. Labels at the left of the alignment are in blocks numbered from zero: (0) rank, (1) identifier, (2) description, (3) score block, (4) percent identities, (5) query sequence positions, and (6) hit sequence positions. Labels 5 and 6 only appear for search data from blast or fasta family input and can be used to read off the positions of HSPs, for example. Each of these can be switched off with an option like -label2 to remove descriptions.
The default layout is a single unbroken horizontal band of alignment - fine if scrolling inside Netscape. However, you may prefer to break the alignment into vertically stacked panes. For panes, for example, 80 columns wide, set -width 80. Widths refer to the alignment, not to the descriptor information at left.
It is possible to narrow (or expand!) the displayed sequence range, for example, -range 10:78 would select only that column range of the alignment using the numbering scheme reported when -ruler on is set (see Rulers). The order of the numbers is unimportant making it simpler to state interest in a region of the alignment that might actually be reversed in the output (eg., a BLASTN search hit matching the reverse complement of the query strand). Note that any range setting has no effect on, and is not related to, the sequence position labelling for blast/fasta input.
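The effect of a column range and a pane width on a single alignment row can be sketched in Python like this. It is only an illustration: the function names are hypothetical, the ruler is assumed to start at 1 and increment, and the "expand" case (a range wider than the alignment) is not modelled.

```python
def select_range(sequence, spec, first=1):
    """Cut out an inclusive column range given as 'M:N'; the order is unimportant."""
    low, high = sorted(int(n) for n in spec.split(":"))
    return sequence[low - first:high - first + 1]


def split_into_panes(sequence, width):
    """Break one alignment row into consecutive chunks of at most `width` columns."""
    return [sequence[i:i + width] for i in range(0, len(sequence), width)]


row = "ABCDEFGHIJ"
print(select_range(row, "3:5"))   # same columns as "5:3"
print(split_into_panes(row, 4))
```

In the real output every row of the alignment would be cut at the same columns, so that the panes stay in register.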

Filtering alignments
Usually, specifying a limited number of hits to view from a long search alignment speeds things up a lot as there's less parsing and less formatting to be generated, so to get the best 10 hits, use the option -top 10.
You also can squeeze more out of a deep alignment and get a less biased view if a threshold on the pairwise sequence identity is set using -maxident N, where N is some value between 0 and 100.
Other filters specific to BLASTP, FASTA, etc., input formats allow cutoffs on scores or p-values, etc. In particular, it is possible to apply some control over the selection of HSPs used in building the MView alignment using the -hsp filtering option.
Of interest to anyone using PSI-BLAST, you can display alignments for any/all iterations of a PSI-BLAST run using, say:
    mview -in blast -cycle 1,5,10,20 mydata  
to get just those iterations. The default is to display only the last iteration. If you want all output, use -cycle '*'.
Rows can be dropped explicitly using the -disc option. This can be supplied a comma separated list of row identifiers, rank numbers (see above for an explanation of sequence rank numbers), rank number ranges, regular expressions (case insensitive, enclosed between // characters) to match against row identifiers, or the '*' symbol meaning all rows.
Likewise, the -keep option specifies a list of rows to keep in the alignment. The -keep option overrides -disc whenever a row is common to both.
For example, the options
    -disc '*'  -keep '2,3,6..10,/^pdb/'
or even
    -disc '/.*/'  -keep '2,3,6..10,/^pdb/'
would discard everything except rows 2, 3, 6 through 10 inclusive, and any hits beginning with the string 'pdb'.
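The interplay of -disc and -keep in the example above can be sketched in Python. This is my reading of the rules described here, not MView's code: the function names are hypothetical, rows are modelled as (rank, identifier) pairs, and the list is split naively on commas, so a regular expression containing a comma would not survive.

```python
import re


def parse_selector(spec):
    """Turn a comma-separated list into a test over (rank, identifier)."""
    tests = []
    for item in spec.split(","):  # naive split: no commas inside /regex/
        item = item.strip()
        if item == "*":
            tests.append(lambda rank, ident: True)
        elif len(item) >= 2 and item.startswith("/") and item.endswith("/"):
            pattern = re.compile(item[1:-1], re.IGNORECASE)
            tests.append(lambda rank, ident, p=pattern: p.search(ident) is not None)
        elif re.fullmatch(r"\d+\.\.\d+", item):
            low, high = sorted(int(n) for n in item.split(".."))
            tests.append(lambda rank, ident, lo=low, hi=high: lo <= rank <= hi)
        elif item.isdigit():
            tests.append(lambda rank, ident, n=int(item): rank == n)
        else:
            tests.append(lambda rank, ident, name=item: ident == name)
    return lambda rank, ident: any(test(rank, ident) for test in tests)


def select_rows(rows, disc="", keep=""):
    """Drop rows matched by disc unless they are also matched by keep."""
    discard = parse_selector(disc) if disc else (lambda rank, ident: False)
    retain = parse_selector(keep) if keep else (lambda rank, ident: False)
    return [(rank, ident) for rank, ident in rows
            if retain(rank, ident) or not discard(rank, ident)]


# Hypothetical search output: query at rank 0, hits ranked 1 to 11.
rows = [(rank, f"hit{rank}") for rank in range(11)] + [(11, "pdb_example")]
print(select_rows(rows, disc="*", keep="2,3,6..10,/^pdb/"))
```

Giving -keep priority over -disc is what lets '*' be used as a blanket discard with a short list of exceptions.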
Note: the currently set reference row is still used for percent identity and colouring operations, even though the row may have been dropped from display by the -disc list.
Another control option can be used to prevent MView from using rows for colouring or for calculation of percent identities although these rows will still be displayed. Use -nop to specify a list (comma separated as usual) of id's or row numbers to flag for 'NO Processing'. This is useful for displaying non-alignment data (eg., secondary structure predictions) alongside the alignment.

SRS (Sequence Retrieval System) links
If HTML markup is produced it is possible to embed SRS links in sequence identifiers if they conform to the following patterns:
    database|accession|identifier 
    database:identifier
as produced by some BLAST and FASTA servers. Such links will be to the EBI and EMBL SRS services and will only be constructed if the database names are listed in the SRS.pm library with this software. This library can be modified for your site if you know some Perl and a little SRS syntax.
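Recognising the two identifier patterns above is straightforward; the following Python sketch shows one way to do it. It does not reproduce the SRS.pm library or build any SRS URLs: the function name, the regular expressions and the set of known database names passed in are all assumptions made for the example.

```python
import re

# database|accession|identifier
PIPE_FORM = re.compile(r"^([^|:\s]+)\|([^|\s]*)\|([^|\s]+)$")
# database:identifier
COLON_FORM = re.compile(r"^([^|:\s]+):([^|:\s]+)$")


def parse_identifier(identifier, known_databases):
    """Split a sequence identifier into its parts, or return None if no link applies."""
    match = PIPE_FORM.match(identifier)
    if match:
        database, accession, name = match.groups()
    else:
        match = COLON_FORM.match(identifier)
        if match is None:
            return None
        database, name = match.groups()
        accession = None
    # Only identifiers from listed databases would get a link.
    if database.lower() not in known_databases:
        return None
    return {"database": database, "accession": accession, "identifier": name}


KNOWN_DATABASES = {"sp", "embl"}  # hypothetical stand-in for the names in SRS.pm
print(parse_identifier("sp|P00000|EXAMPLE_HUMAN", KNOWN_DATABASES))
print(parse_identifier("embl:EXAMPLE1", KNOWN_DATABASES))
```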

Using Cascading Style Sheets

Release 1.40 added cascading style sheets allowing more specific control of HTML elements. In particular, this enables selective colouring of text fore/backgrounds allowing alignments to use coloured blocks instead of just coloured lettering.
This is enabled with the -css on option in combination with the -html option to switch HTML processing on generally. It is disabled with -css off. You can refer to an external style file with -css URL where the URL gives a valid path for the Web server to find the file (ie., file:/some/path or http://server/path).
Having loaded your own colour schemes into MView with the -colorfile option, you can dump these as a style file with -html css which just dumps the style sheet to standard output for redirection to a file.
Controlling coloured fore/backgrounds for alignment lettering is handled in the colour scheme definition mechanism.

How can I print?

With difficulty from Netscape. First, my UNIX Netscape (4.03) won't produce colour PostScript, but a Mac version does. On the Mac one can zoom and preview the image. To produce something that fits on A4 one must set -width 60 or similar and turn off some of the leading text, eg., -label2 -label3.
However, I have been told that you can mark and copy MView output in your browser and paste it into a Microsoft Word 2000 document where you can process it as you wish.

What do the percent identities mean?

Percent identities reported in each alignment row are calculated with respect to the reference sequence (usually the query or first row), as follows:
            number of identical residues
           ------------------------------  x 100
            length of ungapped reference
            sequence over aligned region
Still, in the case of BLAST MView output, minor deviations from the percentages reported by BLAST are due to (i) different rounding, and (ii) the way MView assembles a single pseudo-sequence for a hit composed of multiple HSPs, giving an averaged percent identity.
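The formula above can be written out as a short Python sketch. It assumes that both rows come from the same alignment (equal length, '-' for gaps) and that the "aligned region" is the span between the first and last non-gap column of the row being compared; that interpretation, and the function name, are my assumptions rather than a statement of how MView computes it internally.

```python
def percent_identity(reference, row, gap="-"):
    """Percent identity of `row` to `reference` over the region that `row` covers."""
    if len(reference) != len(row):
        raise ValueError("rows must come from the same alignment")
    occupied = [i for i, symbol in enumerate(row) if symbol != gap]
    if not occupied:
        return 0.0
    start, end = occupied[0], occupied[-1] + 1  # assumed aligned region
    pairs = list(zip(reference[start:end], row[start:end]))
    identical = sum(1 for r, h in pairs if r != gap and r.upper() == h.upper())
    # Denominator: ungapped length of the reference over the aligned region.
    reference_length = sum(1 for r, _ in pairs if r != gap)
    return 100.0 * identical / reference_length if reference_length else 0.0


# 5 identities over 7 ungapped reference positions
print(round(percent_identity("MKVLAAGIT-", "--VLSAG-T-"), 1))
```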

Can I switch off HTML markup?

Yes - the program defaults to plain ASCII output unless the -html option is set, or the output format is set to PIR, MSF, Pearson, or RDB.

Some sequences are incomplete or contain strange characters?

There are three kinds of input to MView: (i) a preformatted multiple alignment, (ii) an ungapped search (eg., BLAST 1.4.x), or (iii) a gapped search (eg., FASTA, BLAST2). Multiple alignments require minimal parsing and are subjected only to formatting stages. Searches are processed according to whether they are ungapped or gapped, and this can lead to some apparent inconsistencies in the output (discussed below).

Why is the BLAST query sequence incomplete?
The query sequence is recovered from the input search results. If, like in BLAST, portions of the query were unmatched, they will not appear on the output. Nevertheless MView will pad the missing sequence with 'X' characters based on the numeric match ranges - the worst that can happen then is that the trailing end (ie., the C-terminus) of the alignment is missing. Occasionally, you may see a '?' character - this means that a non-standard residue was seen on input.

How are overlapping BLAST HSPs processed?
Ungapped BLAST input is processed to produce a stack of hit sequence strings aligned against a contiguous query sequence. The query sequence acts as a template for each hit sequence onto which hit fragments are overlaid in the query positions.
In outline the default method of processing of HSPs is as follows:
For BLAST (series 1), as of MView version 1.37, only the HSPs contributing to the ranked hit contribute to this overlay process. A sorting scheme ensures that the best of these fragments are overlaid last and are not obscured by weaker ones, for example, BLAST hits are sorted by score and length. Differences of ordering of fragments along query and hit naturally result in a patchwork that may not correspond exactly to the real hit sequences. Nevertheless, the resulting alignment stack is very informative, and the user can always run and view a gapped search if that is preferred.
For BLAST (series 2) and PSI-BLAST, often only a single gapped alignment is reported by blast for a given database hit. However, sometimes there are alternative alignments and the same stacking rules apply.
More detailed descriptions of the rules for HSP selection and tiling are available. Some control over the choice of HSPs is available through the -hsp option described therein which allows (i) only ranked HSPs (the default) to be tiled; (ii) all HSPs to be tiled, or (iii) all HSPs to be extracted separately.
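The overlay idea described in this section can be sketched in a few lines of Python. This is a toy illustration, not MView's tiling code: fragments are modelled as hypothetical (score, query_start, text) tuples with 1-based query positions, unmatched positions are shown as '-', and the rules for choosing which HSPs take part are not modelled at all.

```python
def tile_fragments(query_length, fragments, gap="-"):
    """Overlay hit fragments on a query-length template, best fragment last."""
    row = [gap] * query_length
    # Sort ascending by score, then length, so stronger fragments overwrite weaker ones.
    for score, start, text in sorted(fragments, key=lambda f: (f[0], len(f[2]))):
        if start < 1 or start - 1 + len(text) > query_length:
            raise ValueError("fragment lies outside the query")
        for offset, symbol in enumerate(text):
            row[start - 1 + offset] = symbol
    return "".join(row)


# A weak fragment at query positions 4-7 is partly hidden by a stronger one at 1-5.
print(tile_fragments(10, [(50, 1, "AAAAA"), (20, 4, "CCCC")]))
```

Because later fragments simply overwrite earlier ones, the resulting row is a patchwork, which is the behaviour cautioned about above.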

Why are some symbols lowercased?
Gapped input (eg., FASTA, BLAST2, PSI-BLAST) is subject to a further processing step when producing the stacked alignment. The query sequence again acts as a template, but gapped regions introduced into the query are excised from both query and hit to ensure a contiguous query string. In the affected hit sequence, the position of the excision is marked by lowercasing the boundary symbols. Again, the stacked alignments produced are very informative since only unmatched regions of hits are lost from the display.
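The excision step described above can be illustrated with a small Python sketch operating on one gapped query/hit pair. The function name and the '-' gap character are assumptions, and the real program works on parsed search results rather than bare strings.

```python
def excise_query_gaps(query, hit, gap="-"):
    """Remove columns where the query has a gap; lowercase the hit symbols at each cut."""
    kept_query, kept_hit = [], []
    after_cut = False  # True when an excision immediately precedes the next kept column
    for q, h in zip(query, hit):
        if q == gap:
            # Column inserted into the query: drop it and mark the left boundary.
            if kept_hit:
                kept_hit[-1] = kept_hit[-1].lower()
            after_cut = True
            continue
        kept_query.append(q)
        # Mark the right boundary of an excision, if one has just happened.
        kept_hit.append(h.lower() if after_cut else h)
        after_cut = False
    return "".join(kept_query), "".join(kept_hit)


# The hit residues 'QR' align to a query gap and are excised; 'V' and 'L' mark the cut.
print(excise_query_gaps("MKV--LAG", "MKVQRLAG"))
```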

Memory usage

Use of memory by MView can be very great, particularly if you try to process complete sets of PSI-BLAST cycles each containing 1000s of hits all at once. Use of most filtering options should reduce memory requirements by cutting down the number of internal data structures created. Likewise, processing each alignment separately will save memory or you can use the option -register off to cause each alignment to be output when ready (by default all alignments are saved until the end so they can be printed with fields in register). Finally, the choice of malloc library compiled into your perl may affect memory use.

System requirements

MView and its underlying class libraries are implemented in Perl, version 5, for UNIX, and should be easily portable to other systems.
As of MView release 1.40, the code requires a minimum of perl version 5.004.
Formatting and colouring of HTML alignments requires a fixed-width font (eg., Courier) and support for the <FONT> tag, so a modern standards-compliant browser such as Mozilla Firefox is recommended. In particular, use of style sheets as of MView release 1.40 requires that your browser supports HTML 4.0.

Installation

Installation on UNIX or Linux is easy, assuming you have Perl.
Save the archive to your software area, eg., /usr/local.
Gunzip it and extract it through tar, eg.,
    gunzip < mview-1.47.tgz | tar xvf - 
or on a Linux box,
    tar xvzf mview-1.47.tgz 
This would create a directory called mview-1.47 and place all the files under there.
Change to this directory and load bin/mview into an editor.
Set a Perl interpreter path valid for your machine after the '#!' magic number.
Change the "use lib 'some stuff';" line to, in our example,
    use lib '/usr/local/mview-1.47/lib';
Finally, copy or symlink mview to somewhere on your PATH and rehash or login again.
Ask your system manager or a Perl guru for help if this looks weird.

Copyright and Licensing information

MView and associated libraries are Open Source Software protected by the GPL. All users must adhere to these licensing terms, acceptance of which is implicit when the software is downloaded.

Citation

If you use MView in your work, please cite:
Brown, N.P., Leroy, C., Sander, C. (1998). MView: A Web compatible database search or multiple alignment viewer. Bioinformatics. 14(4):380-381.

Acknowledgments

People who have contributed include C. Leroy (early versions of the FASTA, BLAST2 (WashU), BLAST 1.4, and PSI-BLAST parsers and SRS mappings). Useful suggestions came from R. Lopez at EBI, and my former colleagues in the old Sander group there. Thanks also to Nigel Douglas and Willie Taylor for allowing me to serve MView from the Laboratory of Mathematical Biology at NIMR, London for several years. Many other people have suggested new features and reported bugs. I hope I have acknowledged them in the change log and apologise if I have missed anyone out.
This project is unrelated to the Bioperl project, but probably should be...

Maintained by Nigel P. Brown. Last update Dec 19 2005.