Journal of The Royal Society Interface

Implications of functional similarity for gene regulatory interactions

Kimberly Glass

Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA

Department of Physics, University of Maryland, College Park, MD, USA

Edward Ott

Department of Physics, University of Maryland, College Park, MD, USA

Wolfgang Losert

Department of Physics, University of Maryland, College Park, MD, USA

Institute for Physical Science and Technology, University of Maryland, College Park, MD, USA

Michelle Girvan

Department of Physics, University of Maryland, College Park, MD, USA

Institute for Physical Science and Technology, University of Maryland, College Park, MD, USA

Published: https://doi.org/10.1098/rsif.2011.0585

    If one gene regulates another, those two genes are likely to be involved in many of the same biological functions. Conversely, shared biological function may be suggestive of the existence and nature of a regulatory interaction. With this in mind, we develop a measure of functional similarity between genes based on annotations made to the Gene Ontology in which the magnitude of their functional relationship is also indicative of a regulatory relationship. In contrast to other measures that have previously been used to quantify the functional similarity between genes, our measure scales the strength of any shared functional annotation by the frequency of that function's appearance across the entire set of annotations. We apply our method to both Escherichia coli and Saccharomyces cerevisiae gene annotations and find that the strength of our scaled similarity measure is more predictive of known regulatory interactions than previously published measures of functional similarity. In addition, we observe that the strength of the scaled similarity measure is correlated with the structural importance of links in the known regulatory network. By contrast, other measures of functional similarity are not indicative of any structural importance in the regulatory network. We therefore conclude that adequately adjusting for the frequency of shared biological functions is important in the construction of a functional similarity measure aimed at elucidating the existence and nature of regulatory interactions. We also compare the performance of the scaled similarity with a high-throughput method for determining regulatory interactions from gene expression data and observe that the ontology-based approach identifies a different subset of regulatory interactions compared with the gene expression approach. We show that combining predictions from the scaled similarity with those from the reconstruction algorithm leads to a significant improvement in the accuracy of the reconstructed network.
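The idea of scaling a shared annotation by how often that function appears can be illustrated with a small sketch. The abstract does not give the formula, so the weighting below (each shared term contributes the reciprocal of the number of genes annotated to it) is an assumption chosen only for illustration and is not necessarily the authors' definition. The gene names, term labels, and function names are likewise hypothetical.

```python
from collections import Counter
from itertools import combinations


def term_frequencies(annotations):
    """Count how many genes are annotated to each term."""
    return Counter(term for terms in annotations.values() for term in set(terms))


def scaled_similarity(gene_a, gene_b, annotations, freq):
    """Sum 1/frequency over the terms shared by both genes.

    ASSUMPTION: the 1/frequency weight is an illustrative choice; the
    abstract only says that shared annotations are scaled by frequency.
    """
    shared = set(annotations[gene_a]) & set(annotations[gene_b])
    return sum(1.0 / freq[term] for term in shared)


def rank_pairs(annotations):
    """Score every gene pair and sort from most to least similar."""
    freq = term_frequencies(annotations)
    scores = {
        (a, b): scaled_similarity(a, b, annotations, freq)
        for a, b in combinations(sorted(annotations), 2)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up annotations: one rare term and one term shared by every gene.
toy_annotations = {
    "geneA": {"term_rare", "term_common"},
    "geneB": {"term_rare", "term_common"},
    "geneC": {"term_common"},
    "geneD": {"term_common"},
}

ranked = rank_pairs(toy_annotations)
for pair, score in ranked:
    print(pair, round(score, 3))
```

In this toy example the pair sharing the rare term ranks first, while pairs that share only the ubiquitous term receive a small score. That is the qualitative behaviour the abstract attributes to frequency scaling.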

      MacIsaac K. D., Wang T., Gordon D. B., Gifford D. K., Stormo G. D.& Fraenkel E.. 2006 An improved map of conserved regulatory sites for Saccharomyces cerevisiae. BMC Bioinform. 7, 113.doi:10.1186/1471-2105-7-113 (doi:10.1186/1471-2105-7-113). Crossref, PubMed, ISIGoogle Scholar