Playing biology's name game: identifying protein names in scientific text
| Publication Type | Conference Paper |
| Authors | Daniel Hanisch, Juliane Fluck, Heinz-Theodor Mevissen, Ralf Zimmer |
| Year of Publication | 2003 |
| Editors | Russ B. Altman, Keith A. Dunker, Lawrence Hunter, Teri E. Klein |
| Proceedings Title | Proceedings of the 8th Pacific Symposium on Biocomputing (PSB 2003) |
| Pages | 403-414 |
| Keywords | textmining |
| Conference Date/Location | Lihue, Hawaii, USA, January 3-7, 2003 |
| Citation Key | bioinflmu-281 |
Abstract
A growing body of work is devoted to the extraction of protein or gene interaction information from the scientific literature. Yet, the basis for most extraction algorithms, i.e. the specific and sensitive recognition of protein and gene names and their numerous synonyms, has not been adequately addressed. Here we describe the construction of a comprehensive general purpose name dictionary and an accompanying automatic curation procedure based on a simple token model of protein names. We designed an efficient search algorithm to analyze all abstracts in MEDLINE in a reasonable amount of time on standard computers. The parameters of our method are optimized using machine learning techniques. Used in conjunction, these ingredients lead to good search performance. A supplementary web page is available at http://cartan.gmd.de/ProMiner/.
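The abstract mentions a token model of protein names and a dictionary search without giving details. As a rough illustration only, the sketch below shows one way a token-based dictionary lookup could work: names and text are split into lowercase alphabetic and numeric tokens, so that spelling variants such as "IL-2" and "IL2" map to the same token sequence, and the text is scanned with a greedy longest match. The tokenization rule, the function names, and the entry identifiers `P1` and `P2` are all assumptions made for this example; this is not the authors' ProMiner implementation, and it omits the curation and parameter optimization steps described in the paper.

```python
import re

# Assumed tokenization rule: runs of letters and runs of digits are separate
# tokens; punctuation and whitespace are dropped.
TOKEN_RE = re.compile(r"[A-Za-z]+|\d+")


def tokenize(text):
    """Split text into lowercase alphabetic and numeric tokens."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


def build_dictionary(synonyms):
    """Map each tokenized synonym (as a tuple) to its entry identifier."""
    index = {}
    for entry_id, names in synonyms.items():
        for name in names:
            index[tuple(tokenize(name))] = entry_id
    return index


def find_names(text, index):
    """Greedy longest-match scan over the token sequence.

    Returns a list of (start_token, end_token, entry_id) triples, where
    end_token is exclusive.
    """
    tokens = tokenize(text)
    max_len = max((len(key) for key in index), default=0)
    hits = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            entry_id = index.get(tuple(tokens[i:i + n]))
            if entry_id is not None:
                hits.append((i, i + n, entry_id))
                i += n
                break
        else:
            i += 1  # no dictionary entry starts at this token
    return hits


# Hypothetical dictionary entries, for illustration only.
synonyms = {
    "P1": ["IL-2", "interleukin 2"],
    "P2": ["TNF alpha"],
}
index = build_dictionary(synonyms)
hits = find_names("IL2 and interleukin-2 both bind; TNF-alpha does not.", index)
print(hits)
```

Because matching happens on token tuples rather than raw strings, hyphenation and spacing variants are found without listing each one in the dictionary. A real system would also need to handle ambiguous synonyms and common English words that collide with protein names, which this sketch ignores.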