Sample Use Case 3: Using Normalization and Half-life calculation methods

This sample use case demonstrates normalization and half-life calculation with HALO. The complete source code can be found in the package halo.examples. Because the different parts of HALO depend on each other, the source code below assumes that the data has already been loaded as shown in the previous use case. Important note: All variable parameters (methods, thresholds, etc.) used in this example are chosen arbitrarily and for illustration only. For practical use, these parameters have to be chosen carefully depending on the data and the goals of the analysis.

Choose parameters

We first have to set two important parameters: the labeling time and the half-life calculation methods we want to use. The example below shows how to do this.

//choose labeling time
double time = 55;

//Choose half-life calculation methods
ArrayList<HalfLife> medMethods = new ArrayList<HalfLife>();
medMethods.add(new HalfLife_New());
medMethods.add(new HalfLife_Pre());

Calculate median half-life

Before starting the half-life calculation, we calculate the median half-life, which is used later for normalization.

double medianHL = median(data, medMethods, time);
System.out.println("Median half-life "+medianHL);
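
To make clear which statistic is meant here, the following minimal sketch computes the median of a list of half-life values in plain Java. It is not HALO's implementation: the class MedianSketch and the input values are hypothetical and serve only as an illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of HALO.
class MedianSketch {

    // Median of a list of values; returns NaN for an empty list.
    static double median(List<Double> values) {
        if (values.isEmpty()) {
            return Double.NaN;
        }
        // Sort a copy so the caller's list is left untouched.
        List<Double> sorted = new ArrayList<Double>(values);
        Collections.sort(sorted);
        int n = sorted.size();
        int mid = n / 2;
        if (n % 2 == 1) {
            return sorted.get(mid);
        }
        // Even number of values: average the two middle ones.
        return (sorted.get(mid - 1) + sorted.get(mid)) / 2.0;
    }

    public static void main(String[] args) {
        // Made-up half-life values for illustration only.
        List<Double> halfLives = new ArrayList<Double>();
        halfLives.add(410.0);
        halfLives.add(120.0);
        halfLives.add(300.0);
        System.out.println("Median half-life " + median(halfLives));
    }
}
```

The median is less sensitive to a few extreme half-lives than the mean, which makes it a robust summary value.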

Normalize data

A crucial step is the normalization of the data. The example below shows how to use linear regression as the normalization method and how to calculate correction factors. For a more detailed description of the available methods, see the Javadoc. Another parameter that can be set before the calculations is the method for calculating ratios: either the ratio of the averages over all replicates or the average of the per-replicate ratios. Alternatively, the calculations can be restricted to a single replicate.

//Normalization by linear regression
Normalization lr = new LinearRegression(data);
//set method for ratio calculation (default = RATIOFIRST)
data.setMethod(Data.AVERAGEFIRST);
//OR
//set replicate
// lr.setReplicate(1);
CorrectionFactors factors = lr.calculateCorrectionFactors();
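
The difference between the two ratio methods, and the kind of fit that linear regression performs, can be sketched in plain Java as follows. This is an illustration only and not HALO code: the class NormalizationSketch, its method names and the sample values are hypothetical. We assume here that AVERAGEFIRST averages over the replicates before taking the ratio and that RATIOFIRST takes the ratio per replicate before averaging. How HALO derives the correction factors from the fitted line is not shown here.

```java
// Hypothetical helper, not part of HALO.
class NormalizationSketch {

    // Arithmetic mean of an array.
    static double mean(double[] v) {
        double sum = 0.0;
        for (double x : v) {
            sum += x;
        }
        return sum / v.length;
    }

    // "Average first": average each quantity over the replicates, then take the ratio.
    static double averageFirst(double[] numerator, double[] denominator) {
        return mean(numerator) / mean(denominator);
    }

    // "Ratio first": take the ratio for each replicate, then average the ratios.
    static double ratioFirst(double[] numerator, double[] denominator) {
        double sum = 0.0;
        for (int i = 0; i < numerator.length; i++) {
            sum += numerator[i] / denominator[i];
        }
        return sum / numerator.length;
    }

    // Ordinary least-squares fit of y = intercept + slope * x.
    // Returns an array {slope, intercept}.
    static double[] fitLine(double[] x, double[] y) {
        double mx = mean(x);
        double my = mean(y);
        double sxy = 0.0;
        double sxx = 0.0;
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - mx) * (y[i] - my);
            sxx += (x[i] - mx) * (x[i] - mx);
        }
        double slope = sxy / sxx;
        return new double[] { slope, my - slope * mx };
    }

    public static void main(String[] args) {
        // Made-up intensities for one probeset in two replicates.
        double[] newly = { 2.0, 6.0 };
        double[] total = { 4.0, 8.0 };
        System.out.println("Average first: " + averageFirst(newly, total));
        System.out.println("Ratio first:   " + ratioFirst(newly, total));

        // Made-up points for the line fit.
        double[] line = fitLine(new double[] { 0.0, 1.0, 2.0 }, new double[] { 1.0, 3.0, 5.0 });
        System.out.println("Slope " + line[0] + ", intercept " + line[1]);
    }
}
```

The two ratio methods generally give different results as soon as the replicates differ in scale, so the choice should be made deliberately.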

Filtering with probeset quality scores

Filtering with probeset quality scores requires prior normalization, so this step can only be performed now. The example shows how to apply the filter and write the filtered data to a file. If the attribute label for the gene name is not gene_name, you have to set it to the correct label (as below). If you do not want a histogram as output, pass false as the last parameter of the filter method.

//Filter with PQS
data.setGeneName("Gene Symbol");
data = Filter.filterPQS(data, lr, true);
data.writeOutput("Examples_mouse_filtered_pqs.txt", colTot, colNew, colPre, colAtt);

Calculate half-lives

After the data is loaded and normalized, we can start directly with the calculation of half-lives. Half-lives can be calculated either based on the median half-life computed above or with the data normalized through linear regression. Both cases are demonstrated below: first, half-lives are calculated with the method based on newly transcribed/total RNA using the median half-life, for which normalization has to be repeated; second, half-lives are calculated with the pre-existing/total method on the data normalized by linear regression.

//Choose half-life calculation method: based on newly transcribed/total RNA
HalfLife hlNew = new HalfLife_New();
hlNew.initialize(data);

//Use normalization based on median half-life
hlNew.calculateCorrectionFactors(medianHL, time);
//calculate the half-lives
hlNew.calculateHalfLives(time);
//print the half-lives with gene names in an output file
hlNew.printHalfLivesWithGeneNames("Example_mouse_halflives_nt.txt");

//Use a second half-life calculation method: based on pre-existing/total RNA
HalfLife hlPre = new HalfLife_Pre();
hlPre.initialize(data);
//Use normalization based on linear regression (see above)
hlPre.setCorrectionFactor(factors);
//calculate the half-lives
hlPre.calculateHalfLives(time);
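
For orientation, the relation between a ratio and a half-life can be sketched with the standard first-order decay model. The sketch below assumes steady state, already corrected ratios strictly between 0 and 1, and no correction for cell growth. Under these assumptions pre-existing/total = exp(-lambda * t) and newly transcribed/total = 1 - exp(-lambda * t), with half-life = ln(2) / lambda. The class HalfLifeSketch and its method names are hypothetical, and the formulas HALO actually implements may differ in detail (see the Javadoc).

```java
// Hypothetical helper, not part of HALO.
class HalfLifeSketch {

    // Half-life from the corrected newly transcribed/total ratio after labeling time t.
    // Assumes 0 < ratio < 1; returns NaN otherwise.
    static double fromNewlyTotal(double ratio, double time) {
        if (ratio <= 0.0 || ratio >= 1.0) {
            return Double.NaN;
        }
        return -time * Math.log(2.0) / Math.log(1.0 - ratio);
    }

    // Half-life from the corrected pre-existing/total ratio after labeling time t.
    // Assumes 0 < ratio < 1; returns NaN otherwise.
    static double fromPreTotal(double ratio, double time) {
        if (ratio <= 0.0 || ratio >= 1.0) {
            return Double.NaN;
        }
        return -time * Math.log(2.0) / Math.log(ratio);
    }

    public static void main(String[] args) {
        double time = 55; // labeling time, same value as in the example above
        // Made-up corrected ratios for a single probeset.
        System.out.println("From newly transcribed/total: " + fromNewlyTotal(0.2, time));
        System.out.println("From pre-existing/total:      " + fromPreTotal(0.8, time));
    }
}
```

In this model both methods give the same half-life whenever the two corrected ratios add up to 1. With real data they usually differ, which is one reason to compare methods.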



Plot normalization results and half-lives

We can plot both the normalization results and the half-lives. Plotting the normalization requires only a single function call. Plotting half-lives requires more parameters, as the code example below shows: before constructing the graph, we have to prepare lists defining the methods, labeling times and replicates used for the legend of the graph.

GraphHandler.plotNormalization(lr, data);
//Prepare the parameters necessary for plotting
List<HalfLife> lives = new ArrayList<HalfLife>();
lives.add(hlNew);

//Define the names of the used methods
List<String> methods = new ArrayList<String>();
methods.add(HalfLife.NEWLY);

//Define the corresponding labeling times
List<Double> times = new ArrayList<Double>();
times.add(time);

//Leave replicate definition empty so that an average over all replicates will be used
List<Integer> replicates = new ArrayList<Integer>();

//Start plotting
XYGraphConstructor graphConstructor = GraphHandler.plotHalfLives(data, lives, methods,
times, replicates);
graphConstructor.generateGraph();



Printing half-lives to a file

Half-lives and ratios for each half-life calculation method can be printed to an output file as shown below. We first have to define a set of parameters: a header for the output file and whether to print half-life values, ratios or both. Any number of half-life results can be printed in any order.

lives.add(hlPre);
//Define parameters
String header = "Spotid\tNewly_transcribed/Total\tPre-existing/Total"; //header for the output file
int which = HalfLifeWriter.HALFLIFE; //defines if you want only half-lives, ratios or both
String output = SampleUseCase1.DATAFILE+".halflives";

//Start printing
new HalfLifeWriter(output, header, which, lives.get(0), lives.get(1));
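
The header above suggests a tab-separated table with one row per spot ID and one column per half-life method. The following sketch builds such a table in plain Java. The layout is an assumption based on the header only, and the class HalfLifeTableSketch, the spot IDs and the values are hypothetical; the actual format written by HalfLifeWriter may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of HALO.
class HalfLifeTableSketch {

    // Builds a tab-separated table: the header line first, then one line per spot ID
    // with one value per half-life method, in the order given.
    static String format(String header, Map<String, double[]> rows) {
        StringBuilder table = new StringBuilder(header).append('\n');
        for (Map.Entry<String, double[]> row : rows.entrySet()) {
            table.append(row.getKey());
            for (double value : row.getValue()) {
                table.append('\t').append(value);
            }
            table.append('\n');
        }
        return table.toString();
    }

    public static void main(String[] args) {
        String header = "Spotid\tNewly_transcribed/Total\tPre-existing/Total";
        // LinkedHashMap keeps the rows in insertion order.
        Map<String, double[]> rows = new LinkedHashMap<String, double[]>();
        rows.put("probe_1", new double[] { 310.5, 298.0 }); // made-up values
        rows.put("probe_2", new double[] { 95.25, 101.5 }); // made-up values
        System.out.print(format(header, rows));
    }
}
```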


Output

The output produced by HALO should look like this:

Loading data...
Done loading data.
You have 31451 probesets.
------------------------------
Loading attributes...
Done loading attributes.
------------------------------
Filtering data...
Done filtering data.
You have 11031 probesets.
------------------------------
Filtering data...
Done filtering data.
You have 10984 probesets.
------------------------------
Filtering data...
Done filtering data.
You have 10937 probesets.
------------------------------
Filtering data...
Done filtering data.
You have 10731 probesets.
------------------------------
Starting linear regression...
Done with linear regression.
These are your correction factors:
c_u: 0.8183804750680859
c_l: 0.14120777127743048
------------------------------
Starting half-life calculation...
Done calculating half-lives.
------------------------------
Starting linear regression...
Done with linear regression.
These are your correction factors:
c_u: 0.8183804750680859
c_l: 0.14120777127743048
------------------------------
Starting half-life calculation...
Done calculating half-lives.
------------------------------
Starting linear regression...
Done with linear regression.
These are your correction factors:
c_u: 0.8475997207116831
c_l: 0.08835652212253098
------------------------------
Starting half-life calculation...
Done calculating half-lives.
------------------------------
Starting linear regression...
Done with linear regression.
These are your correction factors:
c_u: 0.8475997207116831
c_l: 0.08835652212253098
------------------------------
Starting half-life calculation...
Done calculating half-lives.
------------------------------
Starting linear regression...
Done with linear regression.
These are your correction factors:
c_u: 0.8469761513565193
c_l: 0.1055589785867248
------------------------------
Starting half-life calculation...
Done calculating half-lives.
------------------------------
Starting linear regression...
Done with linear regression.
These are your correction factors:
c_u: 0.8469761513565193
c_l: 0.1055589785867248
------------------------------
Starting half-life calculation...
Done calculating half-lives.
------------------------------
Median half-life 320.3161021500094
Starting linear regression...
Done with linear regression.
These are your correction factors:
c_u: 0.8336859400282673
c_l: 0.11746855592414432
------------------------------
Filtering data...
Done filtering data.
You have 7208 probesets.
------------------------------
Starting half-life calculation...
Done calculating half-lives.
------------------------------
Starting half-life calculation...
Done calculating half-lives.
------------------------------
Data loaded for graph construction...
Graph generated.
Writing results into file...
Done writing results.

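The following files should be produced:

Examples_mouse_filtered_pqs.txt
Example_mouse_halflives_nt.txt
SampleUseCase1.DATAFILE + ".halflives" (the name of the data file with the suffix .halflives)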

The following plots should be produced:


