MMRF for proteome annotation applied to human protein disease prediction

Authors: García Jiménez, Beatriz; Ledezma Espino, Agapito Ismael; Sanchis de Miguel, María Araceli

Date: 2011-07-07
Year: 2011

Citation: Inductive logic programming: 20th International Conference, ILP 2010, Florence, Italy, June 27-30, 2010. Berlin: Springer, 2011, p. 67-75 (Lecture notes in computer science. Lecture notes in artificial intelligence; 6489). ISBN 978-3-642-21295-6

ISBN: 978-3-642-21294-9 (Print); 978-3-642-21295-6 (Online)
ISSN: 0302-9743 (Print); 1611-3349 (Online)
URI: https://hdl.handle.net/10016/11716
DOI: 10.1007/978-3-642-21295-6

Proceedings of: 20th International Conference, ILP 2010, Florence, Italy, June 27-30, 2010

Abstract: Knowing the biological processes in which each gene and protein participates is essential for designing disease treatments. These annotations are still unknown for many genes and proteins. Since deriving annotations from in-vivo experiments is costly, computational predictors are needed for different kinds of annotation, such as metabolic pathway, interaction network, protein family, tissue, and disease. Biological data has an intrinsic relational structure, including genes and proteins, which can be grouped by many criteria. This structure makes it hard to find good hypotheses when an attribute-value representation is used. Hence, we propose the generic Modular Multi-Relational Framework (MMRF) to predict different kinds of gene and protein annotation using Relational Data Mining (RDM). The specific MMRF application to annotating human proteins with diseases verifies that group knowledge (mainly protein-protein interaction pairs) improves the prediction, in particular doubling the area under the precision-recall curve.

Keywords: Relational data mining; Human disease annotation; Multi-class relational decision tree; First-order logic; Structured data

Type: conference poster
Subject area: Computer science
Format: application/pdf
Language: eng
Rights: © Springer; open access
Pages: 67-75
Published in: Inductive Logic Programming: 20th International Conference, ILP 2010, Florence, Italy, June 27-30, 2010. Revised Papers