edu.msu.cme.rdp.classifier.readseqwrapper
Class SequenceParser

java.lang.Object
  extended by edu.msu.cme.rdp.classifier.readseqwrapper.SequenceParser

public class SequenceParser
extends java.lang.Object

A class which parses the input sequences and creates ParsedSequence objects.


Field Summary
private static char[] charLookup
           
private  ParsedSequence curSeq
           
private  java.lang.String emblErrorMsg
           
private  java.lang.String fastaErrorMsg
           
private  java.lang.String format
           
private  java.lang.String formatError
           
private  java.lang.String genbankErrorMsg
           
(package private)  java.util.regex.Matcher matcher
           
private static int MAX_ASCII
           
(package private)  java.util.regex.Pattern pattern
           
(package private)  java.io.BufferedReader reader
           
(package private)  java.lang.String regexEmbl
           
(package private)  java.lang.String regexFasta
           
(package private)  java.lang.String regexGenbank
           
private static java.lang.String TEXT_FORMAT
           
(package private) static java.lang.String UNKNOWN_SEQ
           
 
Constructor Summary
SequenceParser(java.io.InputStream inStream)
          Creates a new SequenceParser to parse the sequences from an InputStream.
SequenceParser(java.io.Reader rhs)
          Creates a new SequenceParser to parse the sequences from a Reader.
 
Method Summary
 void close()
          Closes the reader.
 ParsedSequence getNextSequence()
          Returns the next available ParsedSequence from input.
private  void init()
          Checks the format of the input.
private  java.lang.String modifySequence(java.lang.String s)
          Modifies the sequence string.
private  void setSequenceFormat()
          Checks the format of the first input sequence.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

reader

java.io.BufferedReader reader

pattern

java.util.regex.Pattern pattern

matcher

java.util.regex.Matcher matcher

regexFasta

java.lang.String regexFasta

regexGenbank

java.lang.String regexGenbank

regexEmbl

java.lang.String regexEmbl

TEXT_FORMAT

private static final java.lang.String TEXT_FORMAT
See Also:
Constant Field Values

UNKNOWN_SEQ

static final java.lang.String UNKNOWN_SEQ
See Also:
Constant Field Values

curSeq

private ParsedSequence curSeq

formatError

private java.lang.String formatError

format

private java.lang.String format

fastaErrorMsg

private java.lang.String fastaErrorMsg

genbankErrorMsg

private java.lang.String genbankErrorMsg

emblErrorMsg

private java.lang.String emblErrorMsg

MAX_ASCII

private static final int MAX_ASCII
See Also:
Constant Field Values

charLookup

private static char[] charLookup
Constructor Detail

SequenceParser

public SequenceParser(java.io.InputStream inStream)
               throws java.io.IOException,
                      SequenceParserException
Creates a new SequenceParser to parse the sequences from an InputStream. Supported formats: Fasta, Genbank, EMBL, or free text for a single sequence.

Throws:
java.io.IOException
SequenceParserException

SequenceParser

public SequenceParser(java.io.Reader rhs)
               throws java.io.IOException,
                      SequenceParserException
Creates a new SequenceParser to parse the sequences from a Reader. Supported formats: Fasta, Genbank, EMBL, or free text for a single sequence.

Throws:
java.io.IOException
SequenceParserException
Method Detail

init

private void init()
           throws java.io.IOException,
                  SequenceParserException
Checks the format of the input.

Throws:
java.io.IOException
SequenceParserException - if the format is not one of the supported formats: Fasta, Genbank, EMBL, or free text for a single sequence.

setSequenceFormat

private void setSequenceFormat()
                        throws java.io.IOException,
                               SequenceParserException
Checks the format of the first input sequence. It assumes that all the sequences from the input share the same format.

Throws:
java.io.IOException
SequenceParserException
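
The format check described above can be pictured with a small stand-alone sketch. This is not the class's actual logic: the real values of regexFasta, regexGenbank and regexEmbl are not shown on this page, so the sketch assumes the conventional leading markers instead (">" for Fasta, a "LOCUS" line for Genbank, an "ID" line for EMBL). The FormatSniffer class, its Format enum and the detect method are hypothetical names used only for illustration.

```java
/**
 * Hypothetical illustration of first-record format detection.
 * Not part of the RDP classifier; the marker checks are assumptions.
 */
final class FormatSniffer {
    enum Format { FASTA, GENBANK, EMBL, TEXT }

    private FormatSniffer() {}

    /** Guesses the format from the first non-empty line of the input. */
    static Format detect(String firstLine) {
        String line = firstLine.trim();
        if (line.startsWith(">")) {
            return Format.FASTA;      // Fasta headers begin with '>'
        }
        if (line.startsWith("LOCUS")) {
            return Format.GENBANK;    // Genbank records open with a LOCUS line
        }
        if (line.startsWith("ID ")) {
            return Format.EMBL;       // EMBL entries open with an ID line
        }
        return Format.TEXT;           // fall back to free text (single sequence)
    }
}
```

Because only the first record is inspected, a file that mixes formats would be read entirely under the format of its first record, which matches the stated assumption that all sequences share one format.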

getNextSequence

public ParsedSequence getNextSequence()
                               throws java.io.IOException,
                                      SequenceParserException
Returns the next available ParsedSequence from input. If no sequence is available, then null is returned.

Throws:
java.io.IOException
SequenceParserException
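
The "return null when no sequence is left" contract implies a read loop that runs until null and then closes the reader. The sketch below shows that pattern with a self-contained stand-in so that it can run without the RDP library. MiniFastaReader, nextRecord and readAllNames are hypothetical names, the stand-in handles Fasta only, and it returns a String[] of {name, sequence} rather than a ParsedSequence.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in reader; not the SequenceParser implementation. */
final class MiniFastaReader implements AutoCloseable {
    private final BufferedReader reader;
    private String pendingHeader; // header line already read for the next record

    MiniFastaReader(Reader in) throws IOException {
        this.reader = new BufferedReader(in);
        String line;
        // Skip anything before the first header line.
        while ((line = reader.readLine()) != null) {
            if (line.startsWith(">")) { pendingHeader = line; break; }
        }
    }

    /** Returns {name, sequence} for the next record, or null when input is exhausted. */
    String[] nextRecord() throws IOException {
        if (pendingHeader == null) {
            return null;
        }
        String name = pendingHeader.substring(1).trim();
        pendingHeader = null;
        StringBuilder seq = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith(">")) { pendingHeader = line; break; }
            seq.append(line.trim());
        }
        return new String[] { name, seq.toString() };
    }

    @Override
    public void close() throws IOException {
        reader.close();
    }

    /** Reads until nextRecord() returns null, then closes the reader. */
    static List<String> readAllNames(String fasta) {
        List<String> names = new ArrayList<>();
        try (MiniFastaReader r = new MiniFastaReader(new StringReader(fasta))) {
            String[] rec;
            while ((rec = r.nextRecord()) != null) {
                names.add(rec[0]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```

With the real class, the same loop shape would presumably call getNextSequence() until it returns null and then call close().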

close

public void close()
           throws java.io.IOException
Closes the reader.

Throws:
java.io.IOException

modifySequence

private java.lang.String modifySequence(java.lang.String s)
                                 throws java.io.IOException
Modifies the sequence string by removing every '-' and '~' character and every digit, and returns the resulting string.

Throws:
java.io.IOException
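
A minimal sketch of the clean-up described for modifySequence, assuming it amounts to dropping '-', '~' and the ASCII digits 0-9 while leaving all other characters untouched. The actual method is private and may do more (the charLookup and MAX_ASCII fields suggest a lookup table); SequenceCleaner and stripGapsAndDigits are hypothetical names.

```java
/** Hypothetical illustration; not the SequenceParser implementation. */
final class SequenceCleaner {
    private SequenceCleaner() {}

    /** Returns a copy of s with every '-', '~' and ASCII digit removed. */
    static String stripGapsAndDigits(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean drop = c == '-' || c == '~' || (c >= '0' && c <= '9');
            if (!drop) {
                out.append(c); // keep bases and any other character as-is
            }
        }
        return out.toString();
    }
}
```

For example, stripGapsAndDigits("AC-GT~12NN") yields "ACGTNN". Whitespace is not removed in this sketch.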