gosim
Calculates networks of semantic similarity scores using the Gene
Ontology as proposed in Schlicker et al., 2006.
Gene Ontology score networks can get very large; consider using a
cutoff value (parameter -c). Calculating a network may take some time,
depending on the power of the computer you use.
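To illustrate how a cutoff reduces network size, here is a minimal Python sketch. It is not ProCope code: the edge list, the protein names, the scores, and the helper `apply_cutoff` are all hypothetical. It assumes the cutoff keeps scores greater than or equal to the threshold.

```python
# Hypothetical scored edges: (protein A, protein B, similarity score).
# Names and scores are made up for illustration only.
edges = [
    ("protA", "protB", 0.4),
    ("protA", "protC", 3.2),
    ("protB", "protC", 1.7),
    ("protC", "protD", 5.0),
]


def apply_cutoff(edges, cutoff):
    """Keep only edges whose score is at least `cutoff`."""
    return [edge for edge in edges if edge[2] >= cutoff]


kept = apply_cutoff(edges, 3.0)
print(f"{len(kept)} of {len(edges)} edges kept")
```

Since the number of possible protein pairs grows quadratically with the number of annotated proteins, dropping low-scoring pairs early keeps the output manageable.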
Call the tool without any command line arguments to get detailed information about the parameters for this program.
Example calls
Note: The GO ontology and annotation files are only provided with this
package for yeast. The corresponding files for other species or updated
versions for yeast can be downloaded from the Gene Ontology website.
- load the ontology and use yeast gene annotations (-gonet, -anno)
- use the "biological process" namespace (-name)
- follow both types of relations (is_a and part_of, default setting)
- term similarity measure: relevance (default); functional similarity
measure: total maximum (default)
cmdline.sh gosim -gonet data/go/gene_ontology_edit.obo -name bp -anno data/go/gene_association.sgd
- use the "molecular function" namespace (-name)
- only follow is_a relationships in the ontology (-rel)
- term similarity measure: resnik (-termsim)
- functional similarity measure: col/row average (-funsim)
- only output scores >= 3.0 (-c)
- output a gzipped network to a file (-o, -oz)
cmdline.sh gosim -gonet data/go/gene_ontology_edit.obo -name mf -anno data/go/gene_association.sgd -rel isa -termsim resnik -funsim colrowavg -c 3.0 -o go.net -oz
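The two term similarity measures named above behave differently, which matters when choosing a cutoff. The following Python sketch computes both on a toy ontology, using the definitions as commonly stated in the literature (Resnik similarity as the highest information content of a common ancestor; relevance similarity as in Schlicker et al., 2006). The ontology, the term names, and the annotation probabilities are hypothetical, and ProCope's implementation may differ in details such as the logarithm base.

```python
import math

# Hypothetical toy ontology: each term maps to its ancestors (itself included).
ancestors = {
    "root": {"root"},
    "a": {"a", "root"},
    "b": {"b", "a", "root"},
    "c": {"c", "a", "root"},
}

# Hypothetical annotation probabilities p(term); the root covers everything.
prob = {"root": 1.0, "a": 0.2, "b": 0.05, "c": 0.1}


def resnik(t1, t2):
    """Highest information content (-log p) among the common ancestors."""
    common = ancestors[t1] & ancestors[t2]
    return max(-math.log(prob[t]) for t in common)


def relevance(t1, t2):
    """Lin-style ratio weighted by (1 - p) of the common ancestor."""
    denom = math.log(prob[t1]) + math.log(prob[t2])
    if denom == 0.0:
        # Both terms have probability 1 (e.g. the root): no information.
        return 0.0
    common = ancestors[t1] & ancestors[t2]
    return max(
        2.0 * math.log(prob[t]) / denom * (1.0 - prob[t]) for t in common
    )


print("resnik(b, c)    =", round(resnik("b", "c"), 3))
print("relevance(b, c) =", round(relevance("b", "c"), 3))
```

Under these definitions, relevance scores fall between 0 and 1, while Resnik scores have no fixed upper bound, so a cutoff such as 3.0 is only meaningful for an unbounded measure like Resnik.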
Important note: The protein identifier types used in the GO files might
not be the same as in the other data files. You might have to use a name
mapping file. ProCope provides such a name mapping file for yeast in
data/yeastmappings_YYMMDD.txt.
Command line option: -namemap data/yeastmappings_YYMMDD.txt