halo.data
Class Filter

java.lang.Object
  extended by halo.data.Filter

public class Filter
extends java.lang.Object

Provides methods for filtering the values contained in a Data object

Author:
Stefanie Kaufmann

Constructor Summary
Filter()
           
 
Method Summary
static boolean checkForGenes(Data data)
          Checks if the data object has the gene name attribute loaded and can be used for PQS filtering
static Data filter(Data data, double threshold)
          Removes all experiments where at least one of the values is beneath a certain threshold. Command-line option: -f threshold='value'
static Data filterAbsent(Data data, java.util.ArrayList<java.lang.String> relevantColumns, java.lang.String call, int threshold)
          Removes values that are defined as 'absent' in the attribute list. Command-line option: -f present='label1,label2,...:call:threshold'
static Data filterCorrectionBias(Data data, java.util.HashMap<java.lang.String,java.lang.Double> corr)
          Filters the data according to given values for bias correction in such a way that only those probeset ids are kept that have corresponding correction values
static Data filterPQS(Data data, Normalization l, boolean histogram)
          Please note that this method can only be used after normalization!
static Data filterPQS(Data data, Normalization l, double threshold, boolean histogram)
          Please note that this method can only be used after normalization!
static Data filterPQS(Data data, Normalization l, int replicate, boolean histogram)
          Please note that this method can only be used after normalization!
static Data transferAllValues(Data oldData, Data newData)
          Transfers all values that are not directly used in filtering from the old data object to the new data object
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Filter

public Filter()

Method Detail

filter

public static Data filter(Data data,
                          double threshold)
Removes all experiments where at least one of the values is beneath a certain threshold. Command-line option: -f threshold='value'

Parameters:
data - Data object which shall be filtered
threshold - Cutoff-threshold
Returns:
Filtered Data object
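
The threshold rule described above can be illustrated with a minimal, self-contained sketch. It does not use the HALO classes: the map from entry id to values is a hypothetical stand-in for a Data object, the class name ThresholdFilterSketch is invented for illustration, and treating "beneath" as strictly less than the threshold is an assumption. The actual implementation of Filter.filter may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ThresholdFilterSketch {

    // Hypothetical stand-in for a Data object: entry id -> measured values.
    // Keeps an entry only if none of its values is beneath the threshold.
    static Map<String, double[]> filter(Map<String, double[]> rows, double threshold) {
        Map<String, double[]> kept = new LinkedHashMap<>();
        for (Map.Entry<String, double[]> row : rows.entrySet()) {
            boolean allAtOrAbove = true;
            for (double value : row.getValue()) {
                if (value < threshold) { // assumption: "beneath" means strictly less
                    allAtOrAbove = false;
                    break;
                }
            }
            if (allAtOrAbove) {
                kept.put(row.getKey(), row.getValue());
            }
        }
        return kept;
    }
}
```

The input map is left untouched and a new map is returned, which mirrors the documented behaviour of returning a filtered Data object.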

filterAbsent

public static Data filterAbsent(Data data,
                                java.util.ArrayList<java.lang.String> relevantColumns,
                                java.lang.String call,
                                int threshold)
Removes values that are defined as 'absent' in the attribute list. Command-line option: -f present='label1,label2,...:call:threshold'

Parameters:
data - Data object that will be filtered
relevantColumns - List of the labels of all columns that contain call information
call - The call according to which will be filtered, usually 'A' for absent
threshold - Number of times the call has to be noted for one spot id for that spot to be filtered
Returns:
Filtered Data object
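
The interplay of relevantColumns, call and threshold can be sketched as follows. This is a standalone illustration, not the HALO implementation: the nested map (spot id -> column label -> call) is a hypothetical stand-in for the attribute list, the class and method names are invented, and filtering a spot once the call count reaches the threshold (count >= threshold) is an assumption about how "how often" is interpreted.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class AbsentCallFilterSketch {

    // Hypothetical stand-in for the attribute list: spot id -> (column label -> call).
    // Returns the ids of all spots that survive the filter.
    static Set<String> keptSpots(Map<String, Map<String, String>> calls,
                                 List<String> relevantColumns,
                                 String call,
                                 int threshold) {
        Set<String> kept = new LinkedHashSet<>();
        for (Map.Entry<String, Map<String, String>> spot : calls.entrySet()) {
            int count = 0;
            for (String column : relevantColumns) {
                // Only the columns listed as call columns are inspected.
                if (call.equals(spot.getValue().get(column))) {
                    count++;
                }
            }
            // Assumption: a spot is removed once the call occurs at least 'threshold' times.
            if (count < threshold) {
                kept.add(spot.getKey());
            }
        }
        return kept;
    }
}
```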

filterPQS

public static Data filterPQS(Data data,
                             Normalization l,
                             double threshold,
                             boolean histogram)
Please note that this method can only be used after normalization! The Normalization object in which the normalization results are stored has to be given as a parameter, and normalization must already have been performed. Filters the data in such a way that all remaining entries have a quality control value lower than the given threshold. Command-line option: -f pqs='threshold'

Parameters:
data - Data object that shall be filtered
l - Normalization (linear regression) object whose correction factors will be used for filtering
threshold - Cutoff for the quality control value; only entries with a value lower than this threshold are kept
histogram - TRUE if a histogram about the PQS values should be produced, FALSE otherwise
Returns:
The new Data object, which no longer contains probe sets of low quality
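
The cutoff rule stated above (keep entries whose quality control value is lower than the threshold) can be sketched in isolation. The map from probe set id to quality control value is a hypothetical stand-in; in HALO these values are derived from the Normalization object, which this sketch does not model. The class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class PqsThresholdSketch {

    // Hypothetical input: probe set id -> quality control value (lower is better).
    // Returns the ids whose value is strictly lower than the threshold.
    static List<String> keep(Map<String, Double> pqsById, double threshold) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Double> entry : pqsById.entrySet()) {
            if (entry.getValue() < threshold) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }
}
```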

filterPQS

public static Data filterPQS(Data data,
                             Normalization l,
                             int replicate,
                             boolean histogram)
Please note that this method can only be used after normalization! The Normalization object in which the normalization results are stored has to be given as a parameter, and normalization must already have been performed. Filters the data such that only one entry per gene is kept, namely the one with the lowest quality control value. The calculation is performed for a single replicate. Command-line option: -f pqsmin=replicate (from 1 to #replicates)

Parameters:
data - Data object that shall be filtered
l - The normalization object which will be used for the filtering
replicate - The replicate for which the ratio calculation will be performed
histogram - TRUE if a histogram about the PQS values should be produced, FALSE otherwise
Returns:
The filtered Data object

filterPQS

public static Data filterPQS(Data data,
                             Normalization l,
                             boolean histogram)
Please note that this method can only be used after normalization! The Normalization object in which the normalization results are stored has to be given as a parameter, and normalization must already have been performed. Filters the data such that only one entry per gene is kept, namely the one with the lowest quality control value. The calculation is performed for all replicates. Command-line option: -f pqsmin=-1

Parameters:
data - Data object that shall be filtered
l - The normalization object which will be used for the filtering
histogram - TRUE if a histogram about the PQS values should be produced, FALSE otherwise
Returns:
The filtered Data object
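
The "one entry per gene, lowest quality control value" selection used by the pqsmin variants can be sketched as below. This is an illustration only: ProbeEntry is a hypothetical stand-in for a probe set with its gene name and quality control value, the way HALO computes that value per replicate or across replicates is not modelled, and keeping the first entry on ties is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PqsMinPerGeneSketch {

    /** Hypothetical stand-in for one probe set entry. */
    static final class ProbeEntry {
        final String probeSetId;
        final String gene;
        final double pqs;

        ProbeEntry(String probeSetId, String gene, double pqs) {
            this.probeSetId = probeSetId;
            this.gene = gene;
            this.pqs = pqs;
        }
    }

    // For each gene, keeps the entry with the lowest quality control value.
    // Assumption: on ties, the entry seen first is kept.
    static Map<String, ProbeEntry> bestPerGene(List<ProbeEntry> entries) {
        Map<String, ProbeEntry> best = new LinkedHashMap<>();
        for (ProbeEntry entry : entries) {
            ProbeEntry current = best.get(entry.gene);
            if (current == null || entry.pqs < current.pqs) {
                best.put(entry.gene, entry);
            }
        }
        return best;
    }
}
```

Because the selection is keyed by gene name, it only makes sense when gene names are available, which is what checkForGenes reports.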

filterCorrectionBias

public static Data filterCorrectionBias(Data data,
                                        java.util.HashMap<java.lang.String,java.lang.Double> corr)
Filters the data according to given values for bias correction in such a way that only those probeset ids are kept that have corresponding correction values

Parameters:
data - The original data object
corr - The correction values, mapped from id -> value
Returns:
The filtered Data object
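
The id-based restriction described above amounts to intersecting the probe set ids of the data with the keys of the correction map. A minimal sketch, assuming a hypothetical map from probe set id to values as a stand-in for the Data object (class and method names are invented, and the real method may handle additional attributes):

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CorrectionBiasFilterSketch {

    // Hypothetical stand-in for a Data object: probe set id -> measured values.
    // Keeps only those ids for which a correction value exists in 'corr'.
    static Map<String, double[]> keepCorrected(Map<String, double[]> rows,
                                               Map<String, Double> corr) {
        // Copy first so the original map is not modified.
        Map<String, double[]> kept = new LinkedHashMap<>(rows);
        kept.keySet().retainAll(corr.keySet());
        return kept;
    }
}
```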

transferAllValues

public static Data transferAllValues(Data oldData,
                                     Data newData)
Transfers all values that are not directly used in filtering from the old data object to the new data object

Parameters:
oldData - The data object before filtering
newData - The new data object, created while filtering
Returns:
The new data object with all the values of the old data object

checkForGenes

public static boolean checkForGenes(Data data)
Checks if the data object has the gene name attribute loaded and can be used for PQS filtering

Parameters:
data - The data object
Returns:
TRUE if gene names are loaded and PQS filtering can proceed