Inference of protein-protein interaction networks from multiple heterogeneous data
 Lei Huang^{1},
 Li Liao^{1} and
 Cathy H. Wu^{1, 2}
https://doi.org/10.1186/s13637-016-0040-2
© Huang et al. 2016
Received: 6 August 2015
Accepted: 9 February 2016
Published: 19 February 2016
Abstract
Protein-protein interaction (PPI) prediction is a central task in achieving a better understanding of cellular and intracellular processes. Because high-throughput experimental methods are both expensive and time-consuming, and are also known to suffer from incompleteness and noise, many computational methods have been developed, with varying degrees of success. However, the inference of PPI networks from multiple heterogeneous data sources remains a great challenge. In this work, we developed a novel method based on approximate Bayesian computation and modified differential evolution sampling (ABC-DEP) and the regularized Laplacian (RL) kernel. The method enables inference of PPI networks from topological properties and multiple heterogeneous features, including gene expression and Pfam domain profiles, in the form of weighted kernels. The optimal weights are obtained by ABC-DEP, and the kernel fusion built from the optimal weights serves as input to RL to infer missing or new edges in the PPI network. Detailed comparisons with control methods have been made, and the results show that the accuracy of PPI prediction measured by AUC is increased by up to 23 %, as compared to a baseline without optimal weights. The method can provide insights into the relations between PPIs and various feature kernels and demonstrates a strong capability of predicting faraway interactions that cannot be well detected by the traditional RL method.
Keywords
Protein interaction network; Network inference; Interaction prediction; Differential evolution
1 Introduction
Uncovering protein-protein interactions (PPIs) is crucial to a better understanding of intracellular signaling pathways, modeling of protein complex structures, and elucidating various biochemical processes. Although several high-throughput experimental methods, such as the yeast two-hybrid system and mass spectrometry, have been used to determine a large number of protein interactions, these methods are known to be prone to high false-positive rates, in addition to their high cost. Therefore, efficient and accurate computational methods for PPI prediction are urgently needed.
Generally, current computational methods for PPI prediction can be classified into two categories: A) pairwise biological similarity-based methods and B) network level-based methods. For category A, computational approaches have been developed to predict whether any given pair of proteins interact with each other, based on various properties such as sequence homology, gene co-expression, and phylogenetic profiles [1–5]. Moreover, some previous work also demonstrated that three-dimensional structural information, when available, can be used to predict PPIs with accuracy superior to predictions based on non-structural evidence [6, 7]. However, with no first principles yet to tell deterministically whether two given proteins interact, pairwise biological similarity based on various features and attributes can run out of predictive power, as the signals are often too weak or noisy. Therefore, much recent research has focused on integrating heterogeneous pairwise features, e.g., genomic features and semantic similarities, in search of better prediction accuracy [8–11]. It would be biologically meaningful to disentangle the relations among various pairwise biological similarities and PPIs, but this effort is still at an early stage because the pairwise similarity kernels are incomplete and noisy.
To circumvent the limitations of pairwise biological similarity, efforts have also been made to investigate PPI prediction in the context of networks, which may provide extra information to resolve ambiguities incurred at the pairwise level. A network can be constructed from reliable pairwise PPIs, with nodes representing proteins and edges representing interactions. Topological features, such as the number of neighbors, can be collected for nodes and then used to measure the similarity of any given node pair to make a PPI prediction for the corresponding proteins [12–15]. Inspired by the PageRank algorithm [16], variants of random walk-based methods have been proposed to go beyond these node-centric topological features and involve the whole network; the probability of interaction between two given proteins is measured in terms of how likely a random walk in the network starting at one node is to reach the other node [17–19]. These methods are suitable for PPI prediction when the task is to find all interacting partners of a particular protein, by using it as the start node for random walks. The computational cost increases from O(N) to O(N^{2}) for all-against-all PPI prediction. To overcome the limitation of single start-node random walks, many kernels on networks for link prediction and semi-supervised classification have been systematically studied [20], which can measure the random-walk distance for all node pairs at once. Compared with the random walk methods, kernel methods are more efficient and applicable to various network types. However, neither the variants of random walk nor random walk-based kernels can differentiate faraway interacting candidates well. In addition, instead of computing proximity measures between nodes directly from the network structure, Kuchaiev et al. and Cannistraci et al. proposed geometric de-noising methods that embed the PPI network into a low-dimensional geometric space, in which protein pairs that are closer to each other represent good candidate interactions [1, 21].
Furthermore, when the network is represented as an adjacency matrix, the prediction problem can be transformed into a spectral analysis and matrix completion problem. For example, Symeonidis et al. [22] performed link prediction for biological and social networks based on multiway spectral clustering. Wang et al. [23] and Menon and Elkan [24] predicted PPIs through matrix factorization-based methods. By and large, the prediction task is reduced to a convex optimization problem, and the performance depends on the objective function, which should be carefully designed to ensure fast convergence and to avoid getting stuck in local optima.
The two kinds of methods, pairwise biological similarity-based methods and network level-based methods, can be mutually beneficial. For example, weights can be assigned to edges in the network using pairwise biological similarity scores. In Backstrom et al. [19], a supervised learning task is proposed to learn a function that assigns weighted strengths to edges in the network such that a random walker is more likely to visit the nodes to which new links will be created in the future. The matrix factorization-based methods proposed by Wang et al. [23] and Menon and Elkan [24] also included multi-modal biological sources to enhance the prediction performance. In these methods, however, only the pairwise features for the existing edges in the network are utilized, even though, from a PPI prediction perspective, it is particularly useful to incorporate pairwise features for node pairs that are not currently linked by a direct edge but would be if a new edge (PPI) were predicted. Therefore, it would be of great interest to infer the PPI network directly from multi-modal biological feature kernels that involve all node pairs. In Yamanishi et al. [25], a method is developed to infer protein networks from multiple types of genomic data based on a variant of kernel canonical correlation analysis (CCA). In that work, all genomic kernels are simply added together, with no weights to regulate the contribution of these heterogeneous and potentially noisy data sources towards PPI prediction. Also, it seems that the partial network needed for supervised learning based on kernel CCA needs to be sufficiently large, e.g., leave-one-out cross-validation is used, to attain good performance.
In this paper, we propose a new method based on the ABC-DEP sampling method and the regularized Laplacian (RL) kernel to infer PPI networks from multiple heterogeneous data. The method uses both topological features and various genomic kernels, which are weighted to form a kernel fusion. The weights are optimized using ABC-DEP sampling [26]. Unlike data fusion with genomic kernels for binary classification [27], the combined kernel in our case is used instead to create a regularized Laplacian kernel [20, 28] for PPI prediction. We demonstrate how the method circumvents the issue of unbalanced data faced by many machine-learning methods in bioinformatics. One main advantage of our method is that only a small partial network is needed for training in order to make the inference at the whole-network level. Moreover, the results show that our method works particularly well in detecting interactions between nodes that are far apart in the network, which has been a difficult task for other methods. Tested on yeast PPI data and compared to two control methods, the traditional regularized Laplacian kernel method and the regularized Laplacian kernel based on equally weighted kernels, our method shows a significant improvement of over 20 % in performance measured by ROC score.
2 Methods and data
2.1 Problem definition
Formally, a PPI network is represented as a graph G=(V,E), where i and j are two nodes in the node set V, and (i,j)∈E represents an edge between i and j. The graph is called connected if there is a path of edges connecting any two nodes in the graph. For supervised learning, we divide the network into three parts: a connected training network G_{tn}=(V,E_{tn}), a validation set G_{vn}=(V_{vn},E_{vn}), and a testing set G_{tt}=(V_{tt},E_{tt}). G_{tn} consists of a minimum spanning tree augmented with a small set of randomly selected edges. Because all edges are equally weighted, each newly built minimum spanning tree will generally differ from the previous one. G_{vn} and G_{tt} are two non-overlapping subsets of edges randomly chosen from the edges that are not in G_{tn}.
Note that the training network is incomplete, i.e., many edges have been taken away and reserved as testing examples. Therefore, our inference task is to predict or recover the interactions in the testing set G_{tt} based on the kernel fusion.
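The split described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes that a random spanning tree obtained by a Kruskal-style pass over randomly ordered (equally weighted) edges is an acceptable stand-in for the minimum spanning tree, and the function name `split_network` and its parameters are hypothetical.

```python
import random


def split_network(nodes, edges, n_extra, n_val, n_test, seed=0):
    """Split edges into a connected training network plus validation and test sets."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)

    # Union-find over the nodes; with equal weights, a Kruskal pass over
    # randomly ordered edges returns a random spanning tree.
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    tree, rest = [], []
    for u, v in shuffled:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
        else:
            rest.append((u, v))

    if len(tree) != len(parent) - 1:
        raise ValueError("input graph is not connected")
    if n_extra + n_val + n_test > len(rest):
        raise ValueError("not enough non-tree edges for the requested split")

    # `rest` is already in random order, so consecutive slices are
    # random and mutually disjoint.
    train = tree + rest[:n_extra]
    val = rest[n_extra:n_extra + n_val]
    test = rest[n_extra + n_val:n_extra + n_val + n_test]
    return train, val, test


# Toy graph with 6 nodes and 10 edges (hypothetical data).
nodes = range(6)
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0),
         (0, 2), (1, 3), (2, 4), (3, 5)]
train, val, test = split_network(nodes, edges, n_extra=1, n_val=2, n_test=2)
print(len(train), len(val), len(test))
```

Changing `seed` yields a different spanning tree, which mirrors the remark that each newly built tree generally differs from the previous one.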
2.2 How to infer PPI network?
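As a hedged illustration of kernel-based inference, the sketch below computes the regularized Laplacian kernel in its standard closed form from the graph-kernel literature cited in the text [20, 28], RL = (I + αL)^{-1} with L = D − A, and ranks unlinked node pairs by their kernel value. It assumes that a (weighted) adjacency or fused kernel matrix is passed as `adj`; the function name, the value of `alpha`, and the toy graph are illustrative assumptions, and a dependency-free Gauss-Jordan inversion is used only to keep the example self-contained.

```python
def regularized_laplacian(adj, alpha):
    """Return (I + alpha * L)^-1, where L = D - A is the graph Laplacian of adj."""
    n = len(adj)
    # Build M = I + alpha * (D - A), augmented with the identity matrix.
    aug = []
    for i in range(n):
        degree = sum(adj[i])
        row = [-alpha * adj[i][j] for j in range(n)]
        row[i] += 1.0 + alpha * degree
        aug.append(row + [1.0 if j == i else 0.0 for j in range(n)])

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Toy path graph 0-1-2-3 (hypothetical data).
path = [[0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0]]
rl = regularized_laplacian(path, alpha=0.5)

# Rank currently unlinked pairs by kernel value, highest first.
candidates = sorted([(0, 2), (1, 3), (0, 3)],
                    key=lambda p: rl[p[0]][p[1]], reverse=True)
print(candidates)
```

On an unweighted path, the kernel value decays with distance, so the most distant pair is ranked last; this is the short-range behavior that weighting with additional feature kernels is meant to counteract.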
2.3 ABC-DEP sampling method for learning weights
In this work, we revise the ABC-DEP sampling method [26] to optimize the weights for the kernels in Eq. (2). The ABC-DEP sampling method, based on approximate Bayesian computation with differential evolution and propagation, shows a strong capability of accurately estimating parameters for multiple models at one time. The parameter optimization task here is easier than that in [26], as there is only one RL-based prediction model. Specifically, given the connected training network G_{tn} and N feature kernels in Eq. (2), the length of a particle in ABC-DEP is N+1, where a particle can be seen as a sample consisting of the N+1 weight values. As mentioned before, the PPI network is divided into three parts: the connected training network G_{tn}, the validation set G_{vn}, and the testing set G_{tt}. To obtain the optimal particle(s), a population of particles of size N_{p} is initialized, and ABC-DEP sampling is run iteratively until a particle is found in the evolving population that maximizes the AUC of inferring the training network G_{tn} and the validation set G_{vn}. The validation set G_{vn} is used to avoid overfitting as the algorithm converges. Algorithm 2 shows the detailed sampling process.
Algorithm 2 is the main structure, in which a population of particles with equal importance is initialized and each particle consists of kernel weights randomly generated from a uniform prior. Given the particle population, Algorithm 3 samples through the parameter space for good particles and assigns them weights according to the prediction quality of their corresponding kernel fusion K_{fusion}. Note that, unlike the ABC-DEP sampling method in [26], where the logarithm of the Boltzmann distribution is adopted, here we accept or reject a new candidate particle based on the Boltzmann distribution with the simulated annealing method [32]. Through the evolution process, bad particles are filtered out and good particles are kept for the next generation. We repeat this process until the algorithm converges. The optimal particle is used to build the kernel fusion K_{fusion} for PPI prediction.
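Since the particles are scored by AUC, a small sketch of that score may help. It uses the rank-sum (Mann-Whitney) formulation, in which the AUC equals the probability that a randomly chosen positive pair receives a higher score than a randomly chosen negative pair, with ties counted as one half. The function name and the example scores are hypothetical; in the setting of this section the scores would be kernel values of held-out edges (positives) and of non-edges (negatives).

```python
def auc_score(pos_scores, neg_scores):
    """AUC via the rank-sum statistic; tied scores share their average rank."""
    ranked = sorted([(s, 1) for s in pos_scores] + [(s, 0) for s in neg_scores])
    rank_sum_pos = 0.0
    i = 0
    while i < len(ranked):
        # Find the block of tied scores ranked[i:j].
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in ranked[i:j])
        i = j
    n_pos, n_neg = len(pos_scores), len(neg_scores)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Hypothetical scores: 8 of the 9 positive/negative comparisons are won by the positive.
example_auc = auc_score([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(example_auc)
```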
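The accept/reject idea can be illustrated with a generic differential-evolution loop that uses a Boltzmann (Metropolis) acceptance rule under a cooling temperature. This is a simplified sketch and not Algorithm 2 or Algorithm 3: the propagation and particle-importance steps of ABC-DEP are omitted, all names and parameter values (`scale`, `t0`, `cooling`) are assumptions, and a toy fitness function stands in for the AUC of the RL kernel built from the candidate weights.

```python
import math
import random


def de_propose(population, idx, rng, scale=0.5, lo=0.0, hi=1.0):
    """Differential-evolution move x_idx + scale * (x_a - x_b), clipped to [lo, hi]."""
    others = [k for k in range(len(population)) if k != idx]
    a, b = rng.sample(others, 2)
    return [min(hi, max(lo, x + scale * (xa - xb)))
            for x, xa, xb in zip(population[idx], population[a], population[b])]


def accept(old_fit, new_fit, temperature, rng):
    """Always accept improvements; accept worse candidates with Boltzmann probability."""
    if new_fit >= old_fit:
        return True
    return rng.random() < math.exp((new_fit - old_fit) / temperature)


def evolve(fitness, dim, n_particles=20, n_iter=30, t0=0.1, cooling=0.8, seed=0):
    rng = random.Random(seed)
    # Particles are weight vectors drawn from a uniform prior on [0, 1].
    pop = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    fit = [fitness(p) for p in pop]
    k = max(range(n_particles), key=fit.__getitem__)
    best, best_fit = list(pop[k]), fit[k]
    temperature = t0
    for _ in range(n_iter):
        for i in range(n_particles):
            cand = de_propose(pop, i, rng)
            cand_fit = fitness(cand)
            if accept(fit[i], cand_fit, temperature, rng):
                pop[i], fit[i] = cand, cand_fit
                if cand_fit > best_fit:
                    best, best_fit = list(cand), cand_fit
        temperature *= cooling  # annealing: worse moves become less likely over time
    return best, best_fit


# Toy fitness (hypothetical): closeness to a fixed target weight vector.
target = [0.9, 0.1, 0.5]


def toy_fitness(weights):
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


best, best_fit = evolve(toy_fitness, dim=3)
print(best, best_fit)
```

Early in the run the higher temperature lets particles escape poor regions; as the temperature drops, the rule approaches greedy selection.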
2.4 Data and kernels
We use the yeast PPI network downloaded from the DIP database (Release 20150101) [33] to test our algorithm. Notably, interactions without UniProtKB IDs have been filtered out in order to do name mapping and make use of genomic similarity kernels [27]. As a result, the PPI network contains 5093 proteins and 22,423 interactions, from which the largest connected component is used as the gold standard network. It consists of 5030 proteins and 22,394 interactions. Only tens of proteins and interactions are not included in the largest connected component, which makes the gold standard data almost as complete as the original network. As mentioned before, the gold standard PPI network is divided into three parts: the connected training network G_{tn}, the validation set G_{vn}, and the testing set G_{tt}. The training network G_{tn} is included in the kernel fusion, the validation set G_{vn} is used to find optimal weights for the feature kernels, and the testing set G_{tt} is used to evaluate the inference capability of our method.

G_{tn}: G_{tn} is the connected training network that provides connectivity information. It can also be thought of as a base network on which to do the inference.

K_{Jaccard} [34]: This kernel measures the similarity of a protein pair i,j in terms of \(\frac{|neighbors(i) \cap neighbors(j)|}{|neighbors(i) \cup neighbors(j)|}\).

K_{SN}: It measures the total number of neighbors of proteins i and j, K_{SN}=|neighbors(i)|+|neighbors(j)|.

K_{B} [27]: It is a sequence-based kernel matrix that is generated using BLAST [35].

K_{E} [27]: This is a gene co-expression kernel matrix constructed entirely from microarray gene expression measurements.

K_{Pfam} [27]: This is a generalization of the previous pairwise comparison-based matrices in which the pairwise comparison scores are replaced by expectation values derived from hidden Markov models (HMMs) in the Pfam database [36].
These kernels are positive semidefinite. Please refer to [27] for detailed analysis (or proof). Moreover, Eq. (2) is guaranteed to be positive semidefinite, because basic algebraic operations such as addition, multiplication, and exponentiation preserve the key property of positive semidefiniteness [37]. Finally, all these kernels are normalized to the scale of (0,1) in order to avoid bias.
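The two topological kernels and the normalization step can be sketched as below. Two points are assumptions rather than statements from the text: min-max scaling is used as one plausible way to bring kernels to a common scale, and the fusion is written as a plain weighted sum because the exact form of Eq. (2) is not restated here. All function names and the toy network are hypothetical.

```python
def neighbor_sets(n, edges):
    """Neighbor set of every node in an undirected graph with nodes 0..n-1."""
    nbrs = [set() for _ in range(n)]
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return nbrs


def jaccard_kernel(nbrs):
    """K_Jaccard(i, j) = |N(i) & N(j)| / |N(i) | N(j)|, or 0 if both sets are empty."""
    n = len(nbrs)
    k = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            union = nbrs[i] | nbrs[j]
            if union:
                k[i][j] = len(nbrs[i] & nbrs[j]) / len(union)
    return k


def sn_kernel(nbrs):
    """K_SN(i, j) = |N(i)| + |N(j)|."""
    n = len(nbrs)
    return [[float(len(nbrs[i]) + len(nbrs[j])) for j in range(n)] for i in range(n)]


def min_max_normalize(k):
    """Rescale all entries linearly so that the smallest is 0 and the largest is 1."""
    flat = [x for row in k for x in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in k]
    return [[(x - lo) / (hi - lo) for x in row] for row in k]


def fuse(kernels, weights):
    """Weighted sum of equally sized kernel matrices (assumed form of the fusion)."""
    n = len(kernels[0])
    return [[sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
            for i in range(n)]


# Toy network with 4 proteins (hypothetical data).
nbrs = neighbor_sets(4, [(0, 1), (1, 2), (2, 3), (0, 2)])
k_jaccard = jaccard_kernel(nbrs)
k_sn = sn_kernel(nbrs)
k_sn_scaled = min_max_normalize(k_sn)
k_fusion = fuse([k_jaccard, k_sn_scaled], weights=[0.2, 0.9])
print(k_jaccard[0][3], k_sn[0][1], k_sn_scaled[0][1])
```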
3 Results and discussion
3.1 Inferring PPI network
In Fig. 3, we also compare with another method, WOLP, which uses linear programming to optimize the weights W_{i} for the various kernel features [38]. It can be seen that WOLP, with an AUC of about 0.83, also performs significantly better than the baseline, indicating that the method is effective in weighting various features to improve PPI inference. Note that although reference [38] has “random walk” in its title, WOLP does not do sampling; instead, the weights for kernel features are optimized by linear programming, constrained by the transition matrix from the training network for any would-be random walk over the PPI network when kernel features are incorporated. As such, WOLP is more computationally efficient, with a trade-off of slightly worse performance compared to ABC-DEP, which has the best AUC, 0.86, in this study.
3.2 Effects of the training data
3.3 Detection of interacting pairs far apart in the network
The basic idea of using random walks or random walk-based kernels [17–20] for PPI prediction is that good interacting candidates are usually not far away from the start node, e.g., only 2 or 3 edges away in the network. Consequently, for some existing network-level link prediction methods, testing nodes have been chosen to be within a certain distance range, which largely contributes to their reported good performance. In reality, however, a method that is good at detecting interacting pairs far apart in the network can be even more useful, for example in uncovering cross talk between pathways that are not nearby in the PPI network.
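To examine performance as a function of distance, test pairs can be grouped by their shortest-path distance in the training network. The sketch below does this with breadth-first search; the function names and the toy data are hypothetical, and it is offered only as one straightforward way to set up such an analysis.

```python
from collections import deque


def bfs_distances(nbrs, source):
    """Shortest-path distance (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in nbrs.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def group_pairs_by_distance(train_edges, test_pairs):
    """Map each distance in the training network to the test pairs at that distance."""
    nbrs = {}
    for u, v in train_edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    groups, cache = {}, {}
    for u, v in test_pairs:
        if u not in cache:
            cache[u] = bfs_distances(nbrs, u)  # one BFS per distinct start node
        d = cache[u].get(v)  # None if v is unreachable from u
        groups.setdefault(d, []).append((u, v))
    return groups


# Toy training network: the path 0-1-2-3-4 (hypothetical data).
train_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
groups = group_pairs_by_distance(train_edges, [(0, 2), (0, 4), (1, 4)])
print(groups)
```

An AUC computed separately for each distance group then shows whether a method retains its accuracy for pairs that are far apart.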
3.4 Analysis of weights and efficiency
As the method incorporates multiple heterogeneous data, it can be insightful to inspect the final optimal weights. In our case, the optimal weights are 0.8608, 0.1769, 0.9334, 0, 0.0311, and 0.9837 for the feature kernels G_{tn}, K_{Jaccard}, K_{SN}, K_{B}, K_{E}, and K_{Pfam}, respectively. These weights indicate that K_{SN} and K_{Pfam} are the predominant contributors to PPI prediction. This observation is consistent with the intuition that proteins interact via interfaces made of conserved domains [39], that PPIs can be classified based on their domain families, and that domains from the same family tend to interact [40–42]. Although the true strength of our method lies in integrating multiple heterogeneous data for PPI network inference, the optimal weights can serve as guidance for selecting the most relevant features when time and resources are limited.
Lastly, despite the common concern about the time efficiency of methods based on evolutionary computing, the issue is mitigated in our case. In our experiments, only a small number of particles, 150 to be exact, is needed for the initial population of ABC-DEP sampling. Also, as shown in Fig. 2, our ABC-DEP algorithm converges quickly, within 10 iterations. Moreover, since the PPI inference from RL_{OPTK} is shown to be less sensitive to the size of the training data, only 5394 gold standard edges, less than 25 % of the total number, are used. Finally, we do not need to retrain the model for different testing data, which is another time-saving property of our method.
4 Conclusions
In this work, we developed a novel supervised method that enables inference of PPI networks from topological and genomic feature kernels in an optimized integrative way. Tested on the DIP yeast PPI network, the results show that our method exhibits competitive advantages over control methods in several ways. First, the proposed method achieved superior performance in PPI prediction, as measured by ROC score, over 20 % higher than the baseline, and this margin is maintained even when the control methods use a significantly larger training set. Second, we demonstrated that by integrating topological and genomic features into the regularized Laplacian kernel, the method avoids the short-range problem encountered by random walk-based methods (namely, that inference becomes less reliable for nodes that are far from the start node of the random walk) and shows clear improvements in predicting faraway interactions. Lastly, our method can also provide insights into the relations between PPIs and various similarity features of protein pairs, thereby helping us make good use of these features. As more features of proteins are collected from various omics studies, they can be used to characterize protein pairs in terms of feature kernels from different perspectives. Thus, we believe that our method provides a useful framework for fusing various feature kernels from heterogeneous data to improve PPI prediction.
Declarations
Acknowledgements
Funding: Delaware INBRE program, with a grant from the National Institute of General Medical Sciences (NIGMS) (P20 GM103446) of the National Institutes of Health.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Authors’ Affiliations
References
1. O Kuchaiev, M Rašajski, DJ Higham, N Pržulj, Geometric denoising of protein-protein interaction networks. PLoS Comput. Biol. 5(8), 1000454 (2009).
2. Y Murakami, K Mizuguchi, Homology-based prediction of interactions between proteins using averaged one-dependence estimators. BMC Bioinforma. 15(1), 213 (2014).
3. L Salwinski, D Eisenberg, Computational methods of analysis of protein-protein interactions. Curr. Opin. Struct. Biol. 13(3), 377–382 (2003).
4. R Craig, L Liao, Phylogenetic tree information aids supervised learning for predicting protein-protein interaction based on distance matrices. BMC Bioinforma. 8(1), 6 (2007).
5. A Gonzalez, L Liao, Predicting domain-domain interaction based on domain profiles with feature selection and support vector machines. BMC Bioinforma. 11(1), 537 (2010).
6. QC Zhang, D Petrey, L Deng, L Qiang, Y Shi, CA Thu, B Bisikirska, C Lefebvre, D Accili, T Hunter, T Maniatis, A Califano, B Honig, Structure-based prediction of protein-protein interactions on a genome-wide scale. Nature. 490(7421), 556–560 (2012).
7. R Singh, D Park, J Xu, R Hosur, B Berger, Struct2net: a web service to predict protein-protein interactions using a structure-based approach. Nucleic Acids Res. 38(suppl 2), 508–515 (2010).
8. Y Deng, L Gao, B Wang, ppipre: predicting protein-protein interactions by combining heterogeneous features. BMC Syst. Biol. 7(Suppl 2), 8 (2013).
9. J Sun, Y Sun, G Ding, Q Liu, C Wang, Y He, T Shi, Y Li, Z Zhao, Inpreppi: an integrated evaluation method based on genomic context for predicting protein-protein interactions in prokaryotic genomes. BMC Bioinforma. 8(1), 414 (2007).
10. YR Cho, M Mina, Y Lu, N Kwon, P Guzzi, Mfinder: uncovering functionally associated proteins from interactome data integrated with go annotations. Proteome Sci. 11(Suppl 1), 3 (2013).
11. SH Jung, WH Jang, DS Han, A computational model for predicting protein interactions based on multidomain collaboration. IEEE/ACM Trans. Comput. Biol. Bioinforma. 9(4), 1081–1090 (2012).
12. HH Chen, L Gou, XL Zhang, CL Giles, in Proceedings of the 27th Annual ACM Symposium on Applied Computing. Discovering missing links in networks using vertex similarity measures. SAC ’12 (ACM, New York, 2012), pp. 138–143.
13. L Lü, T Zhou, Link prediction in complex networks: a survey. Physica A. 390(6), 1150–1170 (2011).
14. C Lei, J Ruan, A novel link prediction algorithm for reconstructing protein-protein interaction networks by topological similarity. Bioinformatics. 29(3), 355–364 (2013).
15. N Pržulj, Protein-protein interactions: making sense of networks via graph-theoretic modeling. BioEssays. 33(2), 115–123 (2011).
16. L Page, S Brin, R Motwani, T Winograd, The PageRank Citation Ranking: Bringing Order to the Web (Stanford InfoLab, Stanford, CA, USA, 1999). Previous number = SIDL-WP-1999-0120, http://ilpubs.stanford.edu:8090/422/.
17. H Tong, C Faloutsos, JY Pan, Random walk with restart: fast solutions and applications. Knowl. Inf. Syst. 14(3), 327–346 (2008). doi:10.1007/s10115-007-0094-2.
18. RH Li, JX Yu, J Liu, in Proceedings of the 20th ACM International Conference on Information and Knowledge Management. Link Prediction: The Power of Maximal Entropy Random Walk (ACM, New York, NY, USA, 2011), pp. 1147–1156. http://doi.acm.org/10.1145/2063576.2063741.
19. L Backstrom, J Leskovec, in Proceedings of the Fourth ACM International Conference on Web Search and Data Mining. Supervised random walks: Predicting and recommending links in social networks. WSDM ’11 (ACM, New York, 2011), pp. 635–644.
20. F Fouss, K Francoisse, L Yen, A Pirotte, M Saerens, An experimental investigation of kernels on graphs for collaborative recommendation and semisupervised classification. Neural Netw. 31(0), 53–72 (2012).
21. CV Cannistraci, G Alanis-Lobato, T Ravasi, Minimum curvilinearity to enhance topological prediction of protein interactions by network embedding. Bioinformatics. 29(13), 199–209 (2013).
22. P Symeonidis, N Iakovidou, N Mantas, Y Manolopoulos, From biological to social networks: link prediction based on multiway spectral clustering. Data Knowl. Eng. 87(0), 226–242 (2013).
23. H Wang, H Huang, C Ding, F Nie, Predicting protein–protein interactions from multimodal biological data sources via nonnegative matrix trifactorization. J. Comput. Biol. 20(4), 344–358 (2013). doi:10.1089/cmb.2012.0273.
24. AK Menon, C Elkan, in Proceedings of the 2011 European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II. Link prediction via matrix factorization. ECML PKDD’11 (Springer, Berlin, 2011), pp. 437–452.
25. Y Yamanishi, JP Vert, M Kanehisa, Protein network inference from multiple genomic data: a supervised approach. Bioinformatics. 20(suppl 1), 363–370 (2004).
26. L Huang, L Liao, CH Wu, Evolutionary model selection and parameter estimation for protein-protein interaction network based on differential evolution algorithm. IEEE/ACM Trans. Comput. Biol. Bioinforma. 12(3), 622–631 (2015).
27. GRG Lanckriet, T De Bie, N Cristianini, MI Jordan, WS Noble, A statistical framework for genomic data fusion. Bioinformatics. 20(16), 2626–2635 (2004).
28. T Ito, M Shimbo, T Kudo, Y Matsumoto, in Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining. Application of kernels to link analysis. KDD ’05 (ACM, New York, 2005), pp. 586–592.
29. AJ Smola, R Kondor, in Learning Theory and Kernel Machines: 16th Annual Conference on Learning Theory and 7th Kernel Workshop, COLT/Kernel 2003, Washington, DC, USA, August 24–27, 2003. Proceedings, ed. by B Schölkopf, MK Warmuth. Kernels and regularization on graphs, vol. 2777 (Springer, Berlin, Heidelberg, 2003), pp. 144–158. doi:10.1007/978-3-540-45167-9_12.
30. A Mantrach, N van Zeebroeck, P Francq, M Shimbo, H Bersini, M Saerens, Semisupervised classification and betweenness computation on large, sparse, directed graphs. Pattern Recogn. 44(6), 1212–1224 (2011).
31. JY Pan, HJ Yang, C Faloutsos, P Duygulu, in Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Automatic multimedia crossmodal correlation discovery. KDD ’04 (ACM, New York, 2004), pp. 653–658.
32. S Kirkpatrick, CD Gelatt, MP Vecchi, Optimization by simulated annealing. Science. 220(4598), 671–680 (1983).
33. L Salwinski, CS Miller, AJ Smith, FK Pettit, JU Bowie, D Eisenberg, The database of interacting proteins: 2004 update. Nucleic Acids Res. 32(90001), 449–451 (2004).
34. P Jaccard, Étude comparative de la distribution florale dans une portion des Alpes et des Jura. Bulletin de la Société Vaudoise des Sciences Naturelles. 37, 547–579 (1901).
35. SF Altschul, W Gish, W Miller, EW Myers, DJ Lipman, Basic local alignment search tool. J. Mol. Biol. 215(3), 403–410 (1990).
36. ELL Sonnhammer, SR Eddy, R Durbin, Pfam: A comprehensive database of protein domain families based on seed alignments. Proteins Struct. Funct. Bioinforma. 28(3), 405–420 (1997).
37. C Berg, JPR Christensen, P Ressel, Harmonic Analysis on Semigroups: Theory of Positive Definite and Related Functions, 1st edn., vol. 100 (Springer-Verlag, New York, 1984).
38. L Huang, L Liao, CH Wu, in Bioinformatics and Biomedicine (BIBM), 2015 IEEE International Conference On. Protein-protein interaction network inference from multiple kernels with optimization based on random walk by linear programming, (2015), pp. 201–207. doi:10.1109/BIBM.2015.7359681.
39. M Deng, S Mehta, F Sun, T Chen, Inferring domain-domain interactions from protein-protein interactions. Genome Res. 12(10), 1540–1548 (2002).
40. Z Itzhaki, E Akiva, Y Altuvia, H Margalit, Evolutionary conservation of domain-domain interactions. Genome Biol. 7(12), 125 (2006).
41. J Park, M Lappe, SA Teichmann, Mapping protein family interactions: intramolecular and intermolecular protein family interaction repertoires in the PDB and yeast. J. Mol. Biol. 307(3), 929–938 (2001).
42. D Betel, R Isserlin, CWV Hogue, Analysis of domain correlations in yeast protein complexes. Bioinformatics. 20(suppl 1), 55–62 (2004).