tools.namemapping
Class ProteinManager

java.lang.Object
  extended by tools.namemapping.ProteinManager

public class ProteinManager
extends Object

This class contains static methods which perform the mapping of string identifiers for proteins to internal integer IDs. This works as follows:

Quick start

If you just need internal IDs for string identifiers and nothing else, simply call getInternalID(String) for each of your identifiers. Identical identifiers always return the same internal ID.
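
The behaviour described above can be pictured with a minimal sketch. IdRegistrySketch is a hypothetical stand-in, not the ProteinManager implementation; in particular, numbering IDs from 0 in order of first appearance is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only, not the library class.
class IdRegistrySketch {
    private static final Map<String, Integer> ids = new HashMap<>();

    // Returns the ID already assigned to a known identifier,
    // or assigns the next free ID to a new one (assumed to start at 0).
    static int getInternalID(String label) {
        Integer id = ids.get(label);
        if (id == null) {
            id = ids.size();
            ids.put(label, id);
        }
        return id;
    }

    // Number of distinct identifiers seen so far.
    static int getProteinCount() {
        return ids.size();
    }
}
```

Calling getInternalID twice with the same string returns the same value, and the count only grows when a new identifier is seen.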

Name mappings

Name mappings are useful if you are working with proteins which have different synonyms or database identifiers. You simply provide the mappings as a directed network (which you can read from a file) and the protein manager will automatically map those names according to the edges in the network. See also: addNameMappings(ProteinNetwork)
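
As a rough illustration of edge-based name translation, the sketch below stores directed edges in a map. NameMappingSketch and its methods are hypothetical and do not reflect the ProteinNetwork or Synonyms classes; following only a single edge and leaving unmapped names unchanged are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only, not the library's Synonyms class.
class NameMappingSketch {
    // Directed edges: key is the source name, value is the target name.
    private final Map<String, String> edges = new HashMap<>();

    // Registers the edge from => to.
    void addMapping(String from, String to) {
        edges.put(from, to);
    }

    // Follows one outgoing edge if present (assumption: no chains are resolved);
    // names without an outgoing edge are returned unchanged.
    String translate(String name) {
        return edges.getOrDefault(name, name);
    }

    // Drops all edges.
    void clear() {
        edges.clear();
    }
}
```

With an edge A=>B registered, translate("A") yields "B", while a name with no edge comes back as it went in.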

Regular expressions

If you are reading files where the needed identifiers are contained in longer protein identification strings, you can use regular expressions to extract the information you need from each string. Use setRegularExpression(String) to parse all incoming identifiers using the given expression, and unsetRegularExpression() to stop using a regular expression.
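
The idea can be sketched with java.util.regex. RegexIdSketch is a hypothetical stand-in; this page does not state which part of the match the protein manager keeps, so using the first capturing group and falling back to the raw string on a non-match are assumptions. The input string is made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration only, not the library class.
class RegexIdSketch {
    private static Pattern pattern = null;

    // Pattern.compile throws PatternSyntaxException on invalid syntax.
    static void setRegularExpression(String regex) {
        pattern = Pattern.compile(regex);
    }

    static void unsetRegularExpression() {
        pattern = null;
    }

    // Extracts the first capturing group of the first match (assumption);
    // returns the raw string if no expression is set or nothing matches.
    static String parse(String raw) {
        if (pattern == null) {
            return raw;
        }
        Matcher m = pattern.matcher(raw);
        if (m.find() && m.groupCount() >= 1 && m.group(1) != null) {
            return m.group(1);
        }
        return raw;
    }
}
```

For example, after setRegularExpression("\\|(\\w+)\\|"), parsing the made-up string "db|ID123|some description" yields "ID123"; after unsetRegularExpression() the string is passed through unchanged.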

Author:
Jan Krumsiek
See Also:
ProteinLabel, ProteinLabelOrganism

Field Summary
protected static boolean caseSensitive
          Should protein identifiers be handled case-sensitive?
 
Method Summary
static Object addAnnotation(int internalID, String key, Object value)
          Adds an annotation to the protein with a given internal ID.
static void addAnnotations(int internalID, Map<String,Object> newAnnotations)
          Adds a set of annotations to a protein with a given internal ID.
static void addNameMappings(ProteinNetwork mappings)
          Add a list of name mappings to the protein manager.
static void clearNameMappings()
          Removes all existing name mappings from the protein manager.
static Object getAnnotation(int internalID, String key)
          Retrieves an annotation for a given protein.
static Map<String,Object> getAnnotations(int internalID)
          Retrieves all annotations for a given protein.
static Set<Integer> getFilteredProteins(BooleanExpression expression)
          Return the subset of proteins which match a given expression.
static int getInternalID(ProteinLabel label)
          Returns the internal protein ID for a given ProteinLabel
static int getInternalID(String label)
          Returns the internal ID for a given string identifier.
static ProteinLabel getLabel(int internalID)
          Returns the ProteinLabel associated with a given internal id.
static int getProteinCount()
          Returns the number of registered proteins.
static Synonyms getSynonyms()
          Return the Synonyms object currently used
static void loadProteinAnnotations(String file)
          Load protein annotations from a given file.
static void main(String[] args)
           
static void saveProteinAnnotations(File file)
          Saves protein annotations to a given file.
static void saveProteinAnnotations(OutputStream stream)
          Saves protein annotations to a given output stream.
static void saveProteinAnnotations(String file)
          Saves protein annotations to a given file.
static void setCaseSensitivity(boolean sensitive)
          Set if text identifiers should be case sensitive.
static void setRegularExpression(String regex)
          Use the given regular expression to parse the actual identifier from subsequent incoming text identifiers (see above).
static void unsetRegularExpression()
          Do not use regular expression parsing for subsequent incoming text identifiers.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

caseSensitive

protected static boolean caseSensitive
Should protein identifiers be handled case-sensitive? This value can be changed using setCaseSensitivity(boolean)

Method Detail

addNameMappings

public static void addNameMappings(ProteinNetwork mappings)
Add a list of name mappings to the protein manager. Name mappings are described by directed networks. An edge A=>B in the network means that each occurrence of the name A will be translated to B. The name mappings will be added to the Synonyms object of the protein manager.

Parameters:
mappings - directed network containing name mappings

clearNameMappings

public static void clearNameMappings()
Removes all existing name mappings from the protein manager.


getSynonyms

public static Synonyms getSynonyms()
Return the Synonyms object currently used

Returns:
Synonyms object used in the protein manager

getInternalID

public static int getInternalID(ProteinLabel label)
Returns the internal protein ID for a given ProteinLabel

Parameters:
label - protein label for which the internal ID will be returned
Returns:
internal ID of the given protein label

getInternalID

public static int getInternalID(String label)
Returns the internal ID for a given string identifier. This method automatically converts the string into a ProteinLabel object.

Parameters:
label - protein label for which the internal ID will be returned
Returns:
internal ID of the given protein label

getLabel

public static ProteinLabel getLabel(int internalID)
Returns the ProteinLabel associated with a given internal ID. If there is no label for this ID, it will return a new label with the identifier #UNASSIGNED ID: [id]#

Parameters:
internalID - internal ID for which the protein label will be returned
Returns:
protein label for the given internal ID

addAnnotation

public static Object addAnnotation(int internalID,
                                   String key,
                                   Object value)
                            throws ProCopeException
Adds an annotation to the protein with a given internal ID. An annotation consists of a String key and a value Object, which must be an Integer, Float, String or List.

Existing annotations with the same key will be overwritten.

Parameters:
internalID - internal ID of the protein for which an annotation is added
key - key of the annotation
value - value of the annotation
Returns:
old value if key already existed or null if this key is new
Throws:
ProCopeException - if the internal ID is not assigned
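
The overwrite-and-return-old-value contract matches that of Map.put, as the following minimal sketch shows. AnnotationSketch is a hypothetical stand-in for the annotations of a single protein; rejecting unsupported value types with an IllegalArgumentException is an assumption, since this page does not say how the protein manager reacts to them.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only, not the library class.
class AnnotationSketch {
    private final Map<String, Object> annotations = new HashMap<>();

    // Stores the value under the key and returns the previous value,
    // or null if the key is new.
    Object addAnnotation(String key, Object value) {
        boolean supported = value instanceof Integer || value instanceof Float
                || value instanceof String || value instanceof List;
        if (!supported) {
            // Assumption: the real behaviour for other types is not documented here.
            throw new IllegalArgumentException("unsupported annotation type: " + value);
        }
        return annotations.put(key, value);
    }

    // Returns the stored value or null if the key is unknown.
    Object getAnnotation(String key) {
        return annotations.get(key);
    }
}
```

The first call for a key returns null; a second call with the same key returns the value that was replaced.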

addAnnotations

public static void addAnnotations(int internalID,
                                  Map<String,Object> newAnnotations)
                           throws ProCopeException
Adds a set of annotations to a protein with a given internal ID. See also: addAnnotation(int, String, Object).

Existing annotations will be overwritten.

Parameters:
internalID - internal ID of the protein for which the annotations will be added
newAnnotations - map of annotations to be added.
Throws:
ProCopeException - if the internal ID is not assigned

getAnnotation

public static Object getAnnotation(int internalID,
                                   String key)
                            throws ProCopeException
Retrieves an annotation for a given protein.

Parameters:
internalID - internal ID of the protein
key - key of the annotation
Returns:
the value of that annotation or null
Throws:
ProCopeException - if the internal ID is not assigned

getAnnotations

public static Map<String,Object> getAnnotations(int internalID)
                                         throws ProCopeException
Retrieves all annotations for a given protein.

Parameters:
internalID - internal ID of the protein
Returns:
map of annotations, will be empty if no annotations are associated with the protein
Throws:
ProCopeException - if the internal ID is not assigned

getProteinCount

public static int getProteinCount()
Returns the number of registered proteins.

Returns:
number of proteins registered in the manager

getFilteredProteins

public static Set<Integer> getFilteredProteins(BooleanExpression expression)
Return the subset of proteins which match a given expression. See also: BooleanExpression

Parameters:
expression - expression to be evaluated
Returns:
subset of proteins which match the expression

setRegularExpression

public static void setRegularExpression(String regex)
                                 throws PatternSyntaxException
Use the given regular expression to parse the actual identifier from subsequent incoming text identifiers (see above).

Parameters:
regex - regular expression to be used
Throws:
PatternSyntaxException - If the expression's syntax is invalid

unsetRegularExpression

public static void unsetRegularExpression()
Do not use regular expression parsing for subsequent incoming text identifiers.


setCaseSensitivity

public static void setCaseSensitivity(boolean sensitive)
Set whether text identifiers should be case-sensitive. For example, if this value is set to false, YPR173C and ypr173c will return the same internal ID.

Parameters:
sensitive - identifiers case-sensitive?
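
One common way to get case-insensitive lookups is to normalize identifiers before they are used as keys. The sketch below shows that technique only; CaseSketch is hypothetical, and both the default value of the flag and the use of upper-casing are assumptions rather than documented behaviour of the protein manager.

```java
import java.util.Locale;

// Hypothetical stand-in for illustration only, not the library class.
class CaseSketch {
    // Assumed default; the real default is not stated on this page.
    static boolean caseSensitive = true;

    static void setCaseSensitivity(boolean sensitive) {
        caseSensitive = sensitive;
    }

    // Returns the key under which an identifier would be stored:
    // unchanged when case-sensitive, upper-cased otherwise.
    static String normalize(String label) {
        return caseSensitive ? label : label.toUpperCase(Locale.ROOT);
    }
}
```

With case sensitivity switched off, "YPR173C" and "ypr173c" normalize to the same key and would therefore share one ID; with it switched on, they remain distinct.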

saveProteinAnnotations

public static void saveProteinAnnotations(String file)
                                   throws IOException
Saves protein annotations to a given file.

Parameters:
file - output file
Throws:
IOException - if the file could not be written
See Also:
addAnnotation(int, String, Object)

saveProteinAnnotations

public static void saveProteinAnnotations(File file)
                                   throws IOException
Saves protein annotations to a given file.

Parameters:
file - output file
Throws:
IOException - if the file could not be written
See Also:
addAnnotation(int, String, Object)

saveProteinAnnotations

public static void saveProteinAnnotations(OutputStream stream)
Saves protein annotations to a given output stream.

Parameters:
stream - output stream
See Also:
addAnnotation(int, String, Object)

loadProteinAnnotations

public static void loadProteinAnnotations(String file)
                                   throws IOException,
                                          ProCopeException
Load protein annotations from a given file.

Parameters:
file - file to load annotations from
Throws:
IOException - if the file could not be opened
ProCopeException - if the file format is invalid or something else went wrong

main

public static void main(String[] args)
                 throws IOException
Throws:
IOException