edu.msu.cme.rdp.classifier.rrnaclassifier
Class Classifier

java.lang.Object
  extended by edu.msu.cme.rdp.classifier.rrnaclassifier.Classifier

public class Classifier
extends java.lang.Object

Classifies query sequences against the training data held in a TrainingInfo object.


Field Summary
private  float[] accumulateProbArr
           
private  int MAX_NUM_OF_WORDS
          The assumed maximum number of words per sequence.
static int MIN_GOOD_WORDS
           
static int MIN_SEQ_LEN
          The minimum number of bases per sequence.
private  int NUM_OF_RUNS
          The number of bootstrap trials.
private  float[][] querySeq_wordProbArr
           
private  TrainingInfo trainingInfo
           
 
Constructor Summary
Classifier(TrainingInfo t)
          Creates new Classifier.
 
Method Summary
private  void addBestGenusNode(HierarchyTree node, java.util.List resultList)
          Adds a single RankAssignment to the list of classification results. If the tree node is already included in the list, simply increases the confidence for that RankAssignment by 1, for easy calculation.
(package private)  void addConfidence(HierarchyTree node, java.util.HashMap map)
          Increases the count of each RankAssignment in the map that matches the given node or any ancestor of that node.
 ClassificationResult classify(Sequence pSeq)
          Takes a query sequence and returns the classification result.
private  void findAncestor(HierarchyTree root, RankAssignment curNode, java.util.List ancestorList)
          Finds the ancestors of the given node up to the root.
private  java.util.List getFinalResultList(java.util.HashMap map, HierarchyTree aNode)
           
private  java.util.List getGreedyPath(HierarchyTree root, java.util.List resultList)
          Returns a list of RankAssignment in which root is the first item.
private  int getRandomIndex(int maxi)
          Generates a random integer in the range of 0 to the specified maximum integer.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

trainingInfo

private TrainingInfo trainingInfo

NUM_OF_RUNS

private final int NUM_OF_RUNS
The number of bootstrap trials. Initially set to 100.

See Also:
Constant Field Values

MAX_NUM_OF_WORDS

private final int MAX_NUM_OF_WORDS
The assumed maximum number of words per sequence. Initially set to 5000.

See Also:
Constant Field Values

MIN_SEQ_LEN

public static final int MIN_SEQ_LEN
The minimum number of bases per sequence. Initially set to 200.

See Also:
Constant Field Values

MIN_GOOD_WORDS

public static final int MIN_GOOD_WORDS
See Also:
Constant Field Values

querySeq_wordProbArr

private float[][] querySeq_wordProbArr

accumulateProbArr

private float[] accumulateProbArr

Constructor Detail

Classifier

Classifier(TrainingInfo t)
Creates new Classifier.

Method Detail

classify

public ClassificationResult classify(Sequence pSeq)
Takes a query sequence and returns the classification result. Each query sequence is first assigned to a genus node using all of its words. Then, in each bootstrap trial, one-eighth of all overlapping words in the query are chosen at random to calculate the joint probability. The number of times a genus is selected out of the total number of bootstrap trials is used as an estimate of confidence in the assignment to that genus.

Throws:
ShortSequenceException - if the sequence length is less than the minimum sequence length.
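
The bootstrap confidence estimate described above can be sketched roughly as follows. This is an illustration only, not the RDP implementation: the class BootstrapSketch, the wordLogProb[genus][word] layout, the use of log probabilities, and sampling with replacement are all assumptions made for the example. Only the one-eighth subset size and the 100 trials come from this page.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical stand-in for the bootstrap step; not the RDP implementation. */
class BootstrapSketch {
    static final int NUM_OF_RUNS = 100; // number of bootstrap trials, as documented

    /** Index of the genus with the highest summed log probability over the chosen words. */
    static int bestGenus(float[][] wordLogProb, int[] wordIndices) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int g = 0; g < wordLogProb.length; g++) {
            double score = 0.0;
            for (int w : wordIndices) {
                score += wordLogProb[g][w]; // sum of logs = log of the joint probability
            }
            if (score > bestScore) {
                bestScore = score;
                best = g;
            }
        }
        return best;
    }

    /** Fraction of bootstrap trials that select the same genus as the full word set. */
    static float confidence(float[][] wordLogProb, int numWords, Random rng) {
        int[] all = new int[numWords];
        for (int i = 0; i < numWords; i++) {
            all[i] = i;
        }
        int assigned = bestGenus(wordLogProb, all); // assignment using all words

        int subsetSize = Math.max(1, numWords / 8); // one-eighth of the words
        int votes = 0;
        for (int run = 0; run < NUM_OF_RUNS; run++) {
            int[] subset = new int[subsetSize];
            for (int i = 0; i < subsetSize; i++) {
                subset[i] = rng.nextInt(numWords); // with replacement (assumption)
            }
            if (bestGenus(wordLogProb, subset) == assigned) {
                votes++;
            }
        }
        return votes / (float) NUM_OF_RUNS;
    }

    public static void main(String[] args) {
        float[][] wordLogProb = new float[2][16]; // 2 genera, 16 query words
        Arrays.fill(wordLogProb[0], -1.0f);
        Arrays.fill(wordLogProb[1], -2.0f);
        System.out.println(confidence(wordLogProb, 16, new Random(42)));
    }
}
```

Summing log probabilities instead of multiplying raw probabilities avoids floating-point underflow when a query has many words; whether the classifier itself does this is not stated here.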

addConfidence

void addConfidence(HierarchyTree node,
                   java.util.HashMap map)
Increases the count of each RankAssignment in the map that matches the given node or any ancestor of that node.

Parameters:
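node - the tree node to match, together with its ancestors
map - the map holding the RankAssignment counts to update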

getFinalResultList

private java.util.List getFinalResultList(java.util.HashMap map,
                                          HierarchyTree aNode)

addBestGenusNode

private void addBestGenusNode(HierarchyTree node,
                              java.util.List resultList)
Adds a single RankAssignment to the list of classification results. If the tree node is already included in the list, simply increases the confidence for that RankAssignment by 1, for easy calculation. Later this confidence will be divided by NUM_OF_RUNS.


findAncestor

private void findAncestor(HierarchyTree root,
                          RankAssignment curNode,
                          java.util.List ancestorList)
Finds the ancestors of the given node up to the root. The ancestors are kept in a list in which the root is the first item.
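
A root-first ancestor list can be built as in the sketch below. It assumes a hypothetical Node type with a parent link and excludes the starting node itself from the result; the actual method receives the root and an output list, so it may search from the top down instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of collecting ancestors with the root first. */
class AncestorSketch {
    /** Minimal tree node with a parent link (assumed, not the real HierarchyTree). */
    static class Node {
        final String name;
        final Node parent;

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    /** Returns the names of all ancestors of the node, ordered from the root down. */
    static List<String> ancestors(Node node) {
        List<String> path = new ArrayList<>();
        for (Node cur = node.parent; cur != null; cur = cur.parent) {
            path.add(cur.name); // collected bottom-up
        }
        Collections.reverse(path); // root becomes the first item
        return path;
    }

    public static void main(String[] args) {
        Node root = new Node("Root", null);
        Node family = new Node("Family", root);
        Node genus = new Node("Genus", family);
        System.out.println(ancestors(genus));
    }
}
```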


getGreedyPath

private java.util.List getGreedyPath(HierarchyTree root,
                                     java.util.List resultList)
Returns a list of RankAssignment in which root is the first item. Each node in the list is the one with the highest confidence among the children of the previous node. Algorithm: starting from the top rank, select the child with the most votes and continue down to the bottom rank level.
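
The greedy descent described above might look like the following sketch. The Node type, its votes field, and the rule that the first child wins a tie are assumptions made for illustration; the real method works on HierarchyTree and a list of RankAssignment.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the greedy top-down path selection. */
class GreedyPathSketch {
    /** Minimal tree node carrying a vote count (assumed, not the real HierarchyTree). */
    static class Node {
        final String name;
        final int votes;
        final List<Node> children = new ArrayList<>();

        Node(String name, int votes) {
            this.name = name;
            this.votes = votes;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Starting at the root, repeatedly steps to the child with the most votes. */
    static List<String> greedyPath(Node root) {
        List<String> path = new ArrayList<>();
        Node cur = root;
        while (cur != null) {
            path.add(cur.name);
            Node best = null;
            for (Node child : cur.children) {
                if (best == null || child.votes > best.votes) {
                    best = child; // on a tie the earlier child is kept (assumption)
                }
            }
            cur = best; // null once a node has no children, which ends the walk
        }
        return path;
    }

    public static void main(String[] args) {
        Node a = new Node("A", 60).add(new Node("A1", 20)).add(new Node("A2", 40));
        Node b = new Node("B", 40).add(new Node("B1", 40));
        Node root = new Node("Root", 100).add(a).add(b);
        System.out.println(greedyPath(root)); // prints [Root, A, A2]
    }
}
```

The choice at each rank is final: once A is selected over B, only the children of A are considered at the next level.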


getRandomIndex

private int getRandomIndex(int maxi)
Generates a random integer in the range of 0 to the specified maximum integer.