Optimal cancer prognosis under network uncertainty
 Mohammadmahdi R Yousefi^{1} and
 Lori A Dalton^{1, 2}
https://doi.org/10.1186/s1363701400203
© Yousefi and Dalton; licensee Springer. 2015
Received: 21 July 2014
Accepted: 25 November 2014
Published: 27 January 2015
Abstract
Typically, a vast amount of experience and data is needed to successfully determine cancer prognosis in the face of (1) the inherent stochasticity of cell dynamics, (2) incomplete knowledge of healthy cell regulation, and (3) the inherently uncertain and evolving nature of cancer progression. There is hope that models of cell regulation could be used to predict disease progression and successful treatment strategies, but there has been little work focusing on the third source of uncertainty above. In this work, we investigate the impact of this kind of network uncertainty on predicting cancer prognosis. In particular, we focus on a scenario in which the precise aberrant regulatory relationships between genes in a patient are unknown, but the patient gene regulatory network is contained in an uncertainty class of possible mutations of some known healthy network. We optimistically assume that the probabilities of these abnormal networks are available, along with the best treatment for each network. Then, given a snapshot of the patient gene activity profile at a single moment in time, we study what can be said regarding the patient’s treatability and prognosis. Our methodology is based on recent developments in optimal control strategies for probabilistic Boolean networks and optimal Bayesian classification. We show that in some circumstances, prognosis prediction may be highly unreliable, even in this optimistic setting with perfect knowledge of healthy biological processes and ideal treatment decisions.
1 Introduction
NCI defines cancer prognosis as ‘...an estimate of the likely course and outcome of a disease. The prognosis of a patient diagnosed with cancer is often viewed as the chance that the disease will be treated successfully and that the patient will recover’ [1]. A central problem in translational medicine is thus to decide, given biological knowledge and a collection of observations, whether a cancer patient will bear any chance of successful treatment.
There are myriad approaches to modeling both normal (healthy) and aberrant (cancerous) cell dynamics, including biological pathways, coexpression networks, Bayesian networks, Boolean networks (BNs), probabilistic BNs (PBNs), Petri nets, and differential equation-based networks. It is believed that these may be used to predict disease diagnosis, progression, and successful treatment strategies, which has led to much work on the identification and analysis of biological networks in genomics and biomedicine.
There remain two questions regarding prognosis. First, even if the underlying network of a patient were perfectly known and the best drug to use for the patient were also known, would a patient necessarily be curable? Second, suppose the precise network of a patient were unknown, but probabilities of an uncertainty class of networks, for instance all possible mutations of some healthy network, were available along with the best drug to use for each abnormal network. Then based on available measurements, say genomic or proteomic profiles of the patient, what could be said regarding a patient’s treatability and prognosis? That is, might the very nature of cancer, with its uncertain progression and unique characteristics in each individual, make it impossible to predict prognosis, even given perfect knowledge of all biological processes and ideal treatment decisions? In this paper, we give quantitative answers to these questions, at least at a conceptual level in the context of optimal control strategies for PBNs, by studying intervention outcome in a framework of uncertain biology.
PBNs are a class of dynamical models for functional gene regulatory networks (GRNs) [2]. They can capture the intrinsic uncertainty of gene interactions and measurement error, rendering GRN dynamics as Markov chains. They also provide a systematic way of modeling intervention scenarios, where the theory of discrete-time Markov decision processes can be applied to determine optimal intervention strategies. The steady-state distribution (SSD) of the model Markov chain reflects the long-term behavior (phenotypes) of the underlying network, and changes imposed on the SSD through various types of network intervention serve as a guide for developing beneficial treatment strategies. In short, given a PBN, one can optimally design an intervention strategy to alter the dynamics of the network so that the gene activity profiles (GAPs) evolve in a desired manner.
Managing uncertainty is especially important in modeling biological networks, where there is inherent uncertainty in the state of a network due to immeasurable latent variables, as well as uncertainty due to a lack of knowledge or partial knowledge of the relationships between observable variables even in a healthy network [3]. Here, we focus on a third source of model uncertainty due to the inherent unpredictability of somatic gene mutations or aberrant pathway malfunctioning that may arise in a cancer. This corresponds to listing plausible scenarios in which a healthy network may undergo a functional disruption in normal gene regulation. It is imperative to take into account this uncertainty to provide a robust decision regarding cancer prognosis.
We assume that a patient’s network belongs to an uncertainty class of networks, each derived from a known healthy network that contains some structure essentially common to all networks. Each network in the uncertainty class possesses one or more ‘mutations’ of the healthy network, representing various possible subtypes or stages of cancer. Some networks in the uncertainty class may be very treatable (good prognosis), while others may be difficult or impossible to treat (bad prognosis). In fact, we will partition the space of networks into four classes based on the severity of disease with treatment and the benefit of treatment. We measure the severity of disease by the long-run probability that cancerous cells visit certain known undesirable states, or equivalently, the SSD mass of these undesirable states. We measure the benefit of treatment by the difference between steady-state mass in undesirable states before and after treatment, which we call the steady-state shift.
Our objective is to optimally classify patients into our four prognosis categories and to study the impact of network uncertainty on predicting prognosis. Recent work on optimal Bayesian classification (OBC) furnishes an elegant framework for designing optimal classifiers and optimally estimating their error [4,5]. In the general setting, it is assumed that the true underlying sampling distribution belongs to a parameterized uncertainty class of distributions associated with a known prior probability distribution. Closed-form solutions are available for several models with conjugate priors.
In prior work, there have been several studies developing subnetwork markers extracted from protein or gene interaction networks to improve cancer diagnosis [6-9]. While it is clear that classifier performance can be greatly improved using subnetwork markers, these works only consider groups of components known to interact and do not take full advantage of network structure itself. Furthermore, these works focus on diagnosis and do not model the effect of intervention. Work in [10] proposes a competition-based strategy using large datasets to identify the best methods to predict breast cancer prognosis. Several methods are employed using genomic or clinical information or both. While the authors demonstrate that some of the best methods for prognosis prediction incorporate molecular features selected by expert prior knowledge along with both molecular and clinical data, all methods used are based on data-driven machine learning rather than optimal prediction and error estimation and do not take full advantage of network structure to improve prediction. In [11,12], the authors present methods of constructing uncertainty classes of gene expression distributions in the OBC framework that are consistent with available pathway information to improve classification. However, the focus is on diagnosis rather than prognosis, and these works treat network uncertainty as stemming from ignorance. For instance, they assume that all data is drawn from the same sampling distribution, rather than modeling multiple subtypes of cancer that may exhibit different patterns of gene expression. While these advances improve cancer classification using various forms of prior knowledge, no work that we know of rigorously addresses optimal error rates that can be achieved in the presence of uncertain knowledge of the underlying network due to the inherent heterogeneity of cancer.
2 Network model
Since PBNs fundamentally rely on the dynamics of constituent BNs, we shall define BNs first. A BN is characterized by a set of n nodes, v _{ i }∈{0,1} for i=1,…,n, representing the expression level of genes or their products, and a collection of n Boolean predictor functions, f _{ i }:{0,1}^{ n }→{0,1} for i=1,…,n, describing the functional relationships between genes. In this setting, 0 and 1 represent down- and upregulation of genes, respectively.
There is a natural bijection between v ^{ k } and its integer representation \(x^{k}\in \mathcal {S}=\{0,1,\ldots,2^{n}-1\}\) given by \(x^{k}=\sum _{i=1}^{n} 2^{n-i} {v_{i}^{k}}\). We call x ^{ k } the state of the network at time k and \(\mathcal {S}\) the state space.
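The mapping between a binary gene activity profile and its integer state can be sketched in a few lines. This is only an illustration: the function names `gap_to_state` and `state_to_gap` are hypothetical, and the profile is assumed to be given as a list of bits with \(v_{1}\) first, so that \(v_{1}\) is the most significant bit.

```python
def gap_to_state(v):
    """Map bits (v_1, ..., v_n) to x = sum_i 2^(n-i) * v_i."""
    n = len(v)
    return sum(bit << (n - i) for i, bit in enumerate(v, start=1))


def state_to_gap(x, n):
    """Inverse map: integer state x in {0, ..., 2^n - 1} to n bits, v_1 first."""
    return [(x >> (n - i)) & 1 for i in range(1, n + 1)]


print(gap_to_state([1, 0, 1]))  # 5
print(state_to_gap(5, 3))       # [1, 0, 1]
```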
PBNs generalize BNs by introducing random switching between several contexts, where each context is a BN on its own. They also introduce a random gene perturbation, where the current state of each gene in the network is randomly flipped with probability p. If the PBN has only a single context, then the model becomes a BN with perturbation (BNp), which will serve as our model for GRNs in this paper.
Probabilistic transition rules of any PBN can be modeled by a homogeneous Markov chain. We denote the stochastic process of state transitions by \(\{Z^{k}\in \mathcal {S} : k=0,1,\ldots \}\). Originating from state \(x\in \mathcal {S}\), the successor state \(y\in \mathcal {S}\) is selected according to the transition probability matrix (TPM) \(\mathcal {P}\), with (x,y) element \(\mathcal {P}_{\textit {xy}}:= P(Z^{k+1}=y\mid Z^{k}=x)\) for all k=0,1,… [2]. Due to random gene perturbation, the equivalent Markov chain is ergodic and has a unique invariant distribution, π, equal to the SSD of the network under no intervention. We also use π _{ x } to denote the probability mass of π evaluated at state \(x \in \mathcal {S}\).
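As a rough illustration of how a BNp gives rise to a TPM and an SSD, the sketch below builds the TPM from a table of deterministic successors and solves for the invariant distribution with NumPy. It assumes a common perturbation convention that is not spelled out above: each gene flips independently with probability p, and the Boolean functions are applied only when no gene flips. The names `bnp_tpm` and `steady_state` and the toy successor table are hypothetical.

```python
import numpy as np


def bnp_tpm(next_state, n, p):
    """TPM of a BN with perturbation on n genes.

    next_state[x] is the deterministic BN successor of state x.
    Assumed model: each gene flips independently with probability p;
    if no gene flips, the Boolean functions determine the successor.
    """
    size = 2 ** n
    P = np.zeros((size, size))
    for x in range(size):
        P[x, next_state[x]] += (1 - p) ** n       # no gene perturbed
        for flips in range(1, size):              # every nonzero flip pattern
            k = bin(flips).count("1")             # number of flipped genes
            P[x, x ^ flips] += p ** k * (1 - p) ** (n - k)
    return P


def steady_state(P):
    """Solve pi P = pi subject to sum(pi) = 1 by least squares."""
    size = P.shape[0]
    A = np.vstack([P.T - np.eye(size), np.ones(size)])
    b = np.zeros(size + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi


# Toy 2-gene network with an arbitrary successor table (illustrative only).
P = bnp_tpm(next_state=[1, 3, 0, 2], n=2, p=0.01)
pi = steady_state(P)
```

Because every state can reach every other state through perturbation when p > 0, the linear system has a unique solution, which matches the ergodicity argument in the text.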
3 Optimal intervention in PBNs
Treatment aims to alter the dynamics of a cell to achieve some desirable property or behavior. To formalize this for a given PBN, let \(\mathcal {U}\) be a set of undesirable states, which may be an arbitrary subset of \(\mathcal {S}\). States in \(\mathcal {U}\) may correspond to pathological behavior or known cancer phenotypes. A natural measure of the performance of a treatment or control policy then becomes the long-run expected occupation of undesirable states. We now review optimal intervention, assuming the true TPM is perfectly known.
Two types of intervention methods for PBNs have been proposed: structural intervention [16] and external control [17,18]. The former aims to effectively change the wiring of a GRN so that long-run dynamics of the underlying Markov chain are moved toward beneficial states. Several advanced techniques, such as siRNA interference, can carry out pathway blockage [3]. The latter method involves designing a program for taking actions over time that alter the expression level of some genes (or gene products), known as control genes, effectively steering the long-run dynamics of the network away from undesirable states. This type of intervention corresponds to intervention using drugs to act on gene products. In this paper, we choose the latter method and assume that the PBN admits an external control input a from a set of actions, \(\mathcal {A}=\{0,1\}\), where a=0 indicates no intervention and a=1 indicates that the expression level of a single control gene, corresponding to a node c∈{1,2,…,n}, is flipped. Under control action a=1, the transition probabilities at state x, or equivalently the row corresponding to x in the original TPM, are replaced by the row corresponding to state \(\tilde {x}\) having the same binary representation as x except with node v _{ c } flipped. Let \(\left \{\left (Z^{k}, A^{k}\right) \in \mathcal {S} \times \mathcal {A} : k = 0, 1, \ldots \right \}\) denote the stochastic process of states and actions taken. The transition rules for the controlled PBN are given by a new TPM, \(\mathcal {P}(a)\), with (x,y) element \(\mathcal {P}_{\textit {xy}}(a)=P(Z^{k+1}=y\mid Z^{k}=x, A^{k}=a)\), for k=0,1,…. The ergodicity of the controlled TPM, \(\mathcal {P}(a)\), for each \(a \in \mathcal {A}\), is immediate from the ergodicity of the original uncontrolled TPM, \(\mathcal {P}\).
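The row replacement that defines \(\mathcal {P}(1)\) is a simple row permutation of the uncontrolled TPM. The sketch below is one way to write it, assuming the TPM is a NumPy array indexed by integer state and that node \(v_{1}\) is the most significant bit; `controlled_tpm` and `policy_tpm` are hypothetical helper names. The second helper assembles the chain induced by a stationary deterministic policy, given as a 0/1 action per state.

```python
import numpy as np


def controlled_tpm(P, n, c):
    """Return P(1): row x is the row of P for x with node v_c flipped.

    c is 1-indexed and v_1 is taken to be the most significant bit.
    """
    mask = 1 << (n - c)
    flipped = np.arange(P.shape[0]) ^ mask
    return P[flipped, :]


def policy_tpm(P, P1, policy):
    """TPM under a stationary deterministic policy (one action in {0, 1} per state)."""
    policy = np.asarray(policy)
    return np.where(policy[:, None] == 1, P1, P)
```

With `P0 = P` for a=0 and `P1 = controlled_tpm(P, n, c)` for a=1, any candidate policy can be evaluated by computing the SSD of `policy_tpm(P, P1, policy)`.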
Suppose we wish to optimally steer the dynamics away from undesirable states by applying a regimen of external control actions at each time k=0,1,…,N. This optimization problem has been well studied in the context of optimal Markov decision processes. We define a control policy, μ={μ ^{0},μ ^{1},…,μ ^{ N }}, as a sequence of instructions for taking actions that take into account the entire history of states and actions up to time k, h ^{ k }=(z ^{0},a ^{0},z ^{1},a ^{1},…,z ^{ k },a ^{ k }). In particular, after observing the history, h ^{ k−1}, and the current state, z ^{ k }, the control policy prescribes action \(a\in \mathcal {A}\) with some designated probability μ ^{ k }(a∣h ^{ k−1},z ^{ k }), satisfying 0≤μ ^{ k }(a∣h ^{ k−1},z ^{ k })≤1 and \(\sum _{a\in \mathcal {A}} \mu ^{k} \left (a\mid h^{k-1}, z^{k}\right)=1\).
where \(\mathrm {E}_{x}^{\mu }\) denotes the expectation relative to \(\mathrm {P}_{x}^{\mu }\) [20]. Let \(J^{\ast }(x) = \inf _{\mu \in \mathcal {M}}~J(x, \mu)\) for any initial state \(x \in \mathcal {S}\). A policy μ ^{∗} is optimal if J ^{∗}(x)=J(x,μ ^{∗}), for every \(x\in \mathcal {S}\). It can be shown that there exists an optimal control policy that belongs to \(\mathcal {M}_{\text {SD}}\), and that J ^{∗}(x) is independent of the initial state x [19].
Although the search space for μ is \(\mathcal {M}_{\text {SR}}\), it can be shown that \({\mu ^{\ast } \in \mathcal {M}_{\text {SD}}}\) [19,21,22]. Furthermore, since the controlled Markov chain is ergodic, \({\sum _{a\in \mathcal {A}}} \nu _{\textit {xa}}^{\ast } \neq 0\) for all \(x\in \mathcal {S}\).
4 Network uncertainty class
Having established a method to model networks and optimal intervention, we next discuss a model for network uncertainty that captures variability among cancer patients due to unpredictable and compounding mutations. Essentially, we assume that the patient’s network belongs to an uncertainty class of possible ‘cancer’ networks that are the result of one or several detrimental modifications (mutations) of a nominal ‘healthy’ network.
Let \(\mathcal {R}^{H}\) denote the regulatory matrix of a nominal healthy network, which possesses a small steady-state mass in undesirable states. We denote our uncertainty set of regulatory matrices by Θ and impose two constraints: (1) Regulatory matrices in Θ differ from \(\mathcal {R}^{H}\) in only a small number of elements. For example, assuming that each mutation, or perturbation, corresponds to a random edge addition (0 is mutated to 1 or −1) or removal (1 or −1 is mutated to 0), each element in Θ might have up to some number of edges added or removed relative to \(\mathcal {R}^{H}\). We allow different limits on the number of edges added versus removed, but assume that the total number of each type of edge mutation in any regulatory matrix of Θ is small relative to the size of the network. (2) Θ should contain only regulatory matrices for which the undesirable steady-state mass is greater than some threshold. Thus, cancers in our model have detrimental effects as mutations accumulate.
To reflect the reality that cancer cells with more mutations are rarer and that certain types of perturbations may be more or less likely, we assign prior probabilities to every network represented in Θ. To this end, we assume that the number of mutations of a network in Θ follows essentially a truncated geometric distribution, where the probability of l mutations is proportional to γ ^{ l } for some 0<γ≤1 (normalization is necessary since the number of mutations of networks in Θ is bounded). We further assume that all networks with l mutations are equally likely. That is, if there are N _{ l } regulatory matrices in Θ that have l elements mutated with respect to \(\mathcal {R}^{H}\), then each of them has probability proportional to γ ^{ l }/N _{ l }. Once we have calculated these values for all elements of Θ, we normalize their sum to one, guaranteeing a valid probability distribution, and denote the resulting probability distribution by Λ, i.e., we have \(\sum _{\mathcal {R} \in \Theta } \Lambda (\mathcal {R}) = 1\) and \(\Lambda (\mathcal {R}) > 0\) for all \(\mathcal {R} \in \Theta \).
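The construction of Λ can be illustrated with a short sketch. It takes only the mutation count of each network in Θ as input; the function name `mutation_prior` and the example counts are hypothetical.

```python
from collections import Counter


def mutation_prior(num_mutations, gamma):
    """Prior over networks from their mutation counts.

    num_mutations[k] is the number of mutations l of the k-th network in Theta.
    Each network gets weight gamma**l / N_l, where N_l is the number of
    networks with l mutations; weights are then normalized to sum to one.
    """
    counts = Counter(num_mutations)
    weights = [gamma ** l / counts[l] for l in num_mutations]
    total = sum(weights)
    return [w / total for w in weights]


# Four single-mutation networks and two double-mutation networks.
prior = mutation_prior([1, 1, 1, 1, 2, 2], gamma=0.5)
```

In this toy case the single-mutation networks jointly receive twice the probability of the double-mutation networks, which is the ratio γ : γ² for γ = 0.5.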
Each \(\mathcal {R}\) in Θ induces an SSD under no intervention, which we denote by \(\pi _{\mathcal {R}}\). Also, let \(\pi _{\mathcal {R}x}\) be the SSD of \(\mathcal {R}\) evaluated at state \(x \in \mathcal {S}\), and let \(\Pi = \{\pi _{\mathcal {R}} : \mathcal {R} \in \Theta \}\) be the multiset of all SSDs corresponding to networks in Θ. Note that SSDs in Π may not be unique.
Class 1 (Θ ^{1}): \(\pi _{\mathcal {R}^{\ast }\mathcal {U}} < \alpha \) and \(\pi _{\mathcal {R}\mathcal {U}} - \pi _{\mathcal {R}^{\ast }\mathcal {U}} < \beta _{1}\) (patient’s condition is not critical),
Class 2 (Θ ^{2}): \(\pi _{\mathcal {R}^{\ast }\mathcal {U}} < \alpha \) and \(\pi _{\mathcal {R}\mathcal {U}} - \pi _{\mathcal {R}^{\ast }\mathcal {U}} \geq \beta _{1}\) (patient responds well to an effective treatment),
Class 3 (Θ ^{3}): \(\pi _{\mathcal {R}^{\ast }\mathcal {U}} \geq \alpha \) and \(\pi _{\mathcal {R}\mathcal {U}} - \pi _{\mathcal {R}^{\ast }\mathcal {U}} \geq \beta _{2}\) (patient’s condition can be improved to some extent),
Class 4 (Θ ^{4}): \(\pi _{\mathcal {R}^{\ast }\mathcal {U}} \geq \alpha \) and \(\pi _{\mathcal {R}\mathcal {U}} - \pi _{\mathcal {R}^{\ast }\mathcal {U}} < \beta _{2}\) (patient’s condition is poor and cannot be improved).
for every \(\mathcal {R} \in \Theta ^{i}\) and i∈{1,2,3,4}.
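The four-class partition depends only on the undesirable steady-state mass after optimal treatment and on the steady-state shift. A minimal sketch of the rule follows; the function name `prognosis_class` and the threshold values in the example calls are hypothetical and chosen only for illustration.

```python
def prognosis_class(mass_before, mass_after, alpha, beta1, beta2):
    """Assign a network to one of the four prognosis classes.

    mass_before: undesirable steady-state mass without intervention.
    mass_after:  undesirable steady-state mass under optimal intervention.
    """
    shift = mass_before - mass_after        # steady-state shift
    if mass_after < alpha:
        return 1 if shift < beta1 else 2    # not critical / responds well
    return 3 if shift >= beta2 else 4       # partly improvable / poor


print(prognosis_class(0.50, 0.20, alpha=0.3, beta1=0.1, beta2=0.05))  # 2
print(prognosis_class(0.42, 0.40, alpha=0.3, beta1=0.1, beta2=0.05))  # 4
```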
5 Bayesian classification
Our objective is now to study optimal classification of patients into the four prognosis classes. A classifier, ψ, is a function that takes as input observations, in our case a point \(x \in \mathcal {S}\) representing the GAP of a cancer patient at a single time epoch, and outputs a prediction of some unknown label associated with the observations, here a member of {1,2,3,4} representing one of four possible prognoses of the patient. In general, classification performance depends on the underlying sampling distribution governing observations, which in our model is precisely the steady-state distribution of the patient’s network without control. Were the network of the patient perfectly known, prognosis could be determined perfectly as the class corresponding to this network, and it would not be necessary to obtain a GAP for the patient. In the case of network uncertainty, prediction is no longer perfect and observing the GAP of a patient potentially aids in making a better prognosis.
To perform optimal classification, we utilize OBC theory, which is founded on a Bayesian framework that models uncertainty in the underlying sampling distributions [4,5]. Essentially, a prior probability is assigned to all sampling distributions in an uncertainty class that may have produced the observed sample. In our application, the prior probability on the uncertainty class of networks induces a prior on the uncertainty class of steady-state distributions without control, making OBC classification very natural to implement. The main idea is to leverage minimum mean-square error (MMSE) estimation theory to obtain an optimal Bayesian error estimate for any classifier. Thanks to MMSE estimation theory, the optimal Bayesian error estimate (BEE) is precisely the expected misclassification rate with respect to the prior. The optimal Bayesian classifier is then defined to be the classifier that minimizes the BEE.
In the usual implementation of OBC, uncertainty is interpreted as more of an issue of ignorance, where there are some true underlying classconditional distributions, but their identity in the uncertainty class is unknown and can be revealed with training data. Here, all distributions in the uncertainty class may exist in the population, and the issue is in devising a robust classifier that can be applied generally to all distributions in the uncertainty class with minimal expected error. A consequence is that training data from different patients generally cannot be used to collapse the prior to a tighter posterior, unless care is taken to consider known connections to the patient of interest.
Now, suppose \(\mathcal {R}\) is unknown. Let \(\mathcal {L} = \{1,2,\ldots, L\}\), where L is the number of classes, each associated with a set of networks Θ ^{ i }, a multiset of sampling distributions, \(\{\pi _{\mathcal {R}} : \mathcal {R} \in \Theta ^{i}\}\), and priors, \(\{\Lambda ^{i}(\mathcal {R}) : \mathcal {R} \in \Theta ^{i}\}\). A natural metric for classifier performance is the expected misclassification rate, \(\hat {\varepsilon }(\psi) = \mathrm {E}_{\Lambda } [\varepsilon _{\mathcal {R}} (\psi)]\), where E_{ Λ } denotes an expectation over \(\mathcal {R} \in \Theta \) with respect to the distribution Λ. One can show that \(\hat {\varepsilon }(\psi) = \sum _{i \in \mathcal {L}} c^{i} \mathrm {E}_{\Lambda ^{i}} [\varepsilon _{\mathcal {R}} (\psi)]\), where \(\mathrm {E}_{\Lambda ^{i}}\) denotes an expectation over \(\mathcal {R} \in \Theta ^{i}\) with respect to the conditional distribution Λ ^{ i }. This quantity is, in fact, equivalent to the BEE, where the class probabilities, c ^{ i }, are perfectly known, and \(\mathrm {E}_{\Lambda ^{i}} [\varepsilon _{\mathcal {R}} (\psi)]\) is the expected error contributed by class i.
The following theorem shows how ψ _{OBC} can be found [4].
Theorem 1.
An optimal Bayesian classifier, ψ _{OBC}, satisfying Equation 11 exists and at point \(x \in \mathcal {S}\) is given by ψ _{OBC}(x)=i, where \(i \in \mathcal {L}\) is such that \(c^{i} {f^{i}_{x}} \geq c^{j} {f^{j}_{x}}\) for all \(j \in \mathcal {L}\). In the event of a tie, by convention we choose the class, i, satisfying \(c^{i} {f^{i}_{x}} \geq c^{j} {f^{j}_{x}}\) for all \(j \in \mathcal {L}\) with the smallest index.
for \(i \in \mathcal {L}\). Whereas Equation 14 evaluates the overall error rate over random networks and observations, Equation 15 may be used to evaluate the error rate over random networks conditioned on a particular observation, x.
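The decision rule of Theorem 1 can be sketched numerically under one explicit assumption that is not stated in the surviving text: that \(c^{i}\) is the total prior mass of class i and \(f^{i}_{x}\) is the prior-weighted average of the uncontrolled SSDs in class i, so that \(c^{i} f^{i}_{x} = \sum _{\mathcal {R} \in \Theta ^{i}} \Lambda (\mathcal {R}) \pi _{\mathcal {R}x}\). Under that assumption, the expected misclassification rate of the resulting classifier is one minus the sum over states of the winning score. The function name `obc` and the toy inputs are hypothetical.

```python
import numpy as np


def obc(ssds, prior, labels, num_classes=4):
    """Classifier that picks argmax_i c^i f^i_x, with ties to the smallest index.

    ssds:   array (num_networks, num_states) of uncontrolled SSDs.
    prior:  Lambda(R) for each network, summing to one.
    labels: prognosis class in {1, ..., num_classes} of each network.
    Assumption: c^i f^i_x = sum over R in Theta^i of Lambda(R) * pi_{R x}.
    """
    ssds = np.asarray(ssds, dtype=float)
    prior = np.asarray(prior, dtype=float)
    labels = np.asarray(labels)
    score = np.zeros((num_classes, ssds.shape[1]))
    for i in range(1, num_classes + 1):
        members = labels == i
        score[i - 1] = prior[members] @ ssds[members]
    psi = score.argmax(axis=0) + 1          # argmax keeps the first maximum
    error = 1.0 - score.max(axis=0).sum()   # expected misclassification rate
    return psi, error


# Toy example: two networks, two states, two classes.
psi, err = obc([[0.9, 0.1], [0.2, 0.8]], [0.5, 0.5], [1, 2], num_classes=2)
```

`np.argmax` returns the first maximizing index, which matches the smallest-index tie-breaking convention in the theorem.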
6 Simulation results
In this section, we implement our procedure to study prognosis prediction on synthetically generated networks, as well as two real networks derived from biological processes related to cancer development. The first real network models the mammalian cell cycle, and the second emulates cell response to various stress signals such as DNA damage, oxidative stress, and activated oncogenes.
6.1 Synthetic networks
To construct synthetic uncertainty classes of networks, we begin by outlining a methodology to construct healthy networks that are calibrated to have low undesirable steady-state mass. We generate a seed regulatory matrix, \(\mathcal {R}^{S}\), by randomly filling each row of \(\mathcal {R}^{S}\) with −1 or 1 as follows. Let r _{max} denote the maximum number of predictors for each gene. We draw the number of predictors for gene i, r(i), uniformly from the set {1,…,r _{max}}. The locations of the r(i) nonzero elements in the ith row of \(\mathcal {R}^{S}\), designating the predictors of gene i, are determined by drawing uniformly from the set {T⊂{1,2,…,n}:|T|=r(i)}. Once the predictors of each gene are determined, we assign 1 to each corresponding location in \(\mathcal {R}^{S}\) with probability β∈[0,1] and −1 with probability 1−β. The parameter β reflects a bias toward the type of regulatory relationship (activation or suppression) that is more likely to occur. Given the perturbation probability p, we calculate a TPM and its SSD for the network corresponding to the seed regulatory matrix [23]. We then select a nominal healthy network, \(\mathcal {R}^{H}\), as the network with minimum undesirable steady-state mass among all possible networks with a single mutation relative to \(\mathcal {R}^{S}\).
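The random construction of the seed regulatory matrix can be sketched as follows. The function name `seed_regulatory_matrix` is hypothetical, and the matrix is represented as a plain list of rows; computing the TPM and SSD from the matrix, and selecting \(\mathcal {R}^{H}\), are not covered here.

```python
import random


def seed_regulatory_matrix(n, r_max, beta, rng=None):
    """Draw a random n-by-n regulatory matrix with entries in {-1, 0, 1}.

    Row i gets r(i) ~ Uniform{1, ..., r_max} predictors at uniformly chosen
    columns; each predictor is +1 with probability beta, otherwise -1.
    """
    if rng is None:
        rng = random.Random()
    R = [[0] * n for _ in range(n)]
    for i in range(n):
        r_i = rng.randint(1, r_max)              # number of predictors of gene i
        for j in rng.sample(range(n), r_i):      # uniform subset of size r_i
            R[i][j] = 1 if rng.random() < beta else -1
    return R


R_seed = seed_regulatory_matrix(n=7, r_max=3, beta=0.5, rng=random.Random(0))
```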
Let REM and ADD be two nonnegative integers. We enumerate all regulatory matrices such that no more than REM edges are removed from and no more than ADD edges are added to \(\mathcal {R}^{H}\). We then exclude networks that have lower undesirable steady-state mass than the healthy network, as well as networks with undesirable steady-state mass less than the average undesirable mass of all networks with single mutations. This guarantees that the set Θ contains only networks with unfavorable steady-state distributions. Given γ, we then calculate the probability distribution Λ for elements of Θ.
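The enumeration step can be sketched with `itertools`. The code below lists every matrix obtained from a given healthy matrix by removing at most `rem` edges and adding at most `add` edges (each added edge being +1 or −1), together with its mutation count. It covers only the enumeration; the subsequent filtering by undesirable steady-state mass is omitted, and the function name `enumerate_mutations` is hypothetical.

```python
from itertools import combinations, product


def enumerate_mutations(R, rem, add):
    """Return a list of (matrix, num_mutations) pairs, excluding R itself."""
    cells = [(i, j) for i in range(len(R)) for j in range(len(R[0]))]
    nonzero = [(i, j) for i, j in cells if R[i][j] != 0]
    zero = [(i, j) for i, j in cells if R[i][j] == 0]
    out = []
    for r in range(rem + 1):
        for removed in combinations(nonzero, r):
            for a in range(add + 1):
                if r + a == 0:
                    continue                      # skip the unmutated matrix
                for added in combinations(zero, a):
                    for signs in product((1, -1), repeat=a):
                        M = [row[:] for row in R]
                        for i, j in removed:
                            M[i][j] = 0           # edge removal
                        for (i, j), s in zip(added, signs):
                            M[i][j] = s           # edge addition
                        out.append((M, r + a))
    return out


candidates = enumerate_mutations([[1, 0], [0, -1]], rem=1, add=1)
print(len(candidates))  # 14
```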
We generate 250 random seed networks with seven genes (n=7). For each network, we select at most three predictors for each gene (r _{max}=3), with both types of edges being equally likely (β=0.5) and set the BNp random gene perturbation probability p to 0.01. We define the set of undesirable states, \(\mathcal {U}\), to be the set of all states in which the gene corresponding to the most significant bit (v _{1}) in the binary representation of the state is downregulated. This results in half of the states being undesirable. We also set the number of edge removals to REM=1, the number of edge additions to ADD=1, and the mutation probability γ to 0.5. Each seed network corresponds to an uncertainty set Θ.
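For the integer state encoding used here, the undesirable set in this setup is simply the lower half of the state space. A small sketch, with hypothetical names `undesirable_states` and `undesirable_mass`:

```python
def undesirable_states(n):
    """States whose most significant bit (v_1) is 0, i.e., v_1 is downregulated."""
    return [x for x in range(2 ** n) if (x >> (n - 1)) & 1 == 0]


def undesirable_mass(pi, undesirable):
    """Steady-state probability mass placed on the undesirable states."""
    return sum(pi[x] for x in undesirable)


U = undesirable_states(7)
print(len(U))  # 64
```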
In the next stage of our procedure, given a control gene, we design the optimal intervention policy for each \(\mathcal {R} \in \Theta \), which results in a controlled SSD \(\pi _{\mathcal {R}^{\ast }}\). In our classification settings, L=4 and we partition Θ into four subsets by choosing α, β _{1}, and β _{2} such that these subsets have (almost) equal sizes. Given Λ, the prior probability of networks in Θ, we use Equation 13 to find the OBC for the uncertainty set Θ and probability distribution Λ. We also estimate the error of this classifier using Equation 14. Changing the control gene does not affect Θ; however, it does change the partitioning of Θ and the classification results. Thus, we set the control gene, in turn, to every gene in the network excluding the target gene.
See the supplementary materials for analogous results on 54 different settings, varying β∈{0.1,0.5,0.9}, p∈{0.1,0.5,0.9}, γ∈{0.005,0.01,0.1}, and r _{max}∈{2,3}, and a discussion on the effect of these parameters.
6.2 Real networks
6.2.1 Mammalian cell-cycle network
Regulatory relationships of the mammalian cell cycle network
Gene | Predictors: regulatory type (+ / −)
CycD | CycD: +
Rb | CycD: −, p27: +, CycE: −, CycA: −, CycB: −
p27 | CycD: −, p27: +, CycE: −, CycA: −, CycB: −
E2F | Rb: −, p27: +, CycA: −, CycB: −
CycE | Rb: −, p27: +, E2F: +, CycE: −, CycA: −
CycA | Rb: −, E2F: +, CycA: +, Cdc20: −, Cdh1: −, UbcH10: −
Cdc20 | Cdh1: −, CycB: +
Cdh1 | p27: +, CycA: −, Cdc20: +, CycB: −
UbcH10 | CycA: +, Cdc20: +, Cdh1: −, UbcH10: +, CycB: +
CycB | Cdc20: −, Cdh1: −
The expected undesirable mass after intervention and the OBC error rate for the mammalian cell cycle network
Control gene | \(\mathrm {E}_{\Lambda } [\pi _{\mathcal {R}^{\ast }\mathcal {U}}]\) | \(\hat {\varepsilon } (\psi _{\text {OBC}})\)
E2F | 0.2889 | 0.6572
CycE | 0.2334 | 0.6733
CycA | 0.2872 | 0.6650
Cdc20 | 0.3386 | 0.6799
Cdh1 | 0.3371 | 0.6704
UbcH10 | 0.3497 | 0.6705
CycB | 0.2941 | 0.6631
6.2.2 Stress response network
Regulatory relationships of a p53 signaling network
Gene/protein/signal | Predictors: regulatory type (+ / −)
DNAdamage | DNAdamage: +
p53 | ATR: +, CHEK1: +, CHEK2: +, MDM2: −, MDMX: −
p14ARF | p14ARF: +
ATR | DNAdamage: +
ATM | DNAdamage: +
CHEK1 | ATR: +
CHEK2 | ATM: +
MDM2 | p14ARF: −, MDMX: +
MDMX | MDM2: −
The expected undesirable mass after intervention and the OBC error rate for the stress response network
Control gene | \(\mathrm {E}_{\Lambda } [\pi _{\mathcal {R}^{\ast }\mathcal {U}}]\) | \(\hat {\varepsilon } (\psi _{\text {OBC}})\)
p14ARF | 0.0095 | 0.5175
ATR | 0.0104 | 0.4789
ATM | 0.0150 | 0.5935
CHEK1 | 0.0110 | 0.5218
CHEK2 | 0.0134 | 0.5561
MDM2 | 0.00666 | 0.5376
MDMX | 0.0084 | 0.5220
7 Conclusion
We have outlined a framework in which it is possible to utilize prior knowledge regarding cell regulation, for instance pathway information in healthy and aberrant networks, to optimally predict prognosis. That being said, there are several important generalizations of our model that merit further study: (1) integrating partial ignorance of the healthy network itself into our uncertainty class of networks, (2) allowing the network to change over time, thereby taking into account the progressive deterioration of cancer as mutations accumulate, (3) modeling uncertainty in the ideal drug regimen for each network, (4) integrating different types of observations into the analysis, and (5) combining optimal prognosis prediction with optimal treatment recommendations under network uncertainty.
While ψ _{OBC} makes optimal prognosis predictions under network uncertainty, obtaining the GAP or any other relevant information from a patient has the effect of reducing uncertainty. A key point in this work is that we study performance with respect to prognosis only. Although one must overcome network uncertainty, it is not necessary to actually infer the network or any mutations; rather, for our purposes one only needs enough relevant data to make good predictions regarding prognosis. Thus, a second major question we address is whether it is possible to successfully predict prognosis with a relatively small amount of data and available biological knowledge.
The larger the uncertainty class, generally the more difficult prognosis becomes. This is an intuitive result: more uncertainty requires more information to draw accurate conclusions. Furthermore, prognosis performance depends on many factors, including the type of cancer (the original healthy network and its associated uncertainty class), the individual patient’s network, and the particular sample drawn from the patient. Very often, prognosis prediction from a single GAP is highly unreliable, even in this optimistic setting with perfect knowledge of healthy biological processes and ideal treatment decisions. In this case, the remedy is to collect more data, for instance time-series GAP measurements, to help identify the patient’s network or at least ensure reliable prognosis prediction. One may be lucky and find that their condition is quite clear from a single measurement, but, at least in our examples, it is typical to find that very little is revealed about one’s condition, necessitating additional lab tests.
Declarations
Acknowledgements
The authors would like to thank Edward R. Dougherty for his fruitful discussions.
Authors’ Affiliations
References
 Understanding Cancer Prognosis. (www.cancer.gov/cancertopics/factsheet/Support/prognosisstats).
 I Shmulevich, ER Dougherty, S Kim, W Zhang, Probabilistic Boolean networks: A rule-based uncertainty model for gene regulatory networks. Bioinformatics 18(2), 261–274 (2002).
 BJ Yoon, X Qian, ER Dougherty, Quantifying the objective cost of uncertainty in complex dynamical systems. IEEE Trans. Signal Process. 61(9), 2256–2266 (2013).
 LA Dalton, ER Dougherty, Optimal classifiers with minimum expected error within a Bayesian framework – Part I: Discrete and Gaussian models. Pattern Recognit. 46(5), 1301–1314 (2013).
 LA Dalton, ER Dougherty, Optimal classifiers with minimum expected error within a Bayesian framework – Part II: Properties and performance analysis. Pattern Recognit. 46(5), 1288–1300 (2013).
 HY Chuang, E Lee, YT Liu, D Lee, T Ideker, Network-based classification of breast cancer metastasis. Mol. Syst. Biol. 3(140) (2007).
 E Lee, HY Chuang, JW Kim, T Ideker, D Lee, Inferring pathway activity toward precise disease classification. PLoS Comput. Biol. 4(11), e1000217 (2008).
 J Su, BJ Yoon, ER Dougherty, Accurate and reliable cancer classification based on probabilistic inference of pathway activity. PLoS One 4(12), e8161 (2009).
 J Su, BJ Yoon, ER Dougherty, Identification of diagnostic subnetwork markers for cancer in human protein-protein interaction network. BMC Bioinf. 11(Suppl 6), S8 (2010).
 E Bilal, J Dutkowski, J Guinney, IS Jang, BA Logsdon, G Pandey, BA Sauerwine, Y Shimoni, HK Moen Vollan, BH Mecham, OM Rueda, J Tost, C Curtis, MJ Alvarez, VN Kristensen, S Aparicio, AL Børresen-Dale, C Caldas, A Califano, SH Friend, T Ideker, EE Schadt, GA Stolovitzky, AA Margolin, Improving breast cancer survival analysis through competition-based multidimensional modeling. PLoS Comput. Biol. 9(5), e1003047 (2013).
 M Shahrokh Esfahani, J Knight, A Zollanvari, BJ Yoon, ER Dougherty, Classifier design given an uncertainty class of feature distributions via regularized maximum likelihood and the incorporation of biological pathway knowledge in steady-state phenotype classification. Pattern Recognit. 46(10), 2783–2797 (2013).
 M Shahrokh Esfahani, ER Dougherty, Incorporation of biological pathway knowledge in the construction of priors for optimal Bayesian classification. IEEE/ACM Trans. Comput. Biol. Bioinf. 11, 202–218 (2014).
 F Li, T Long, Y Lu, Q Ouyang, C Tang, The yeast cell-cycle network is robustly designed. Proc. Nat. Acad. Sci. USA 101(14), 4781–4786 (2004).
 Y Wu, X Zhang, J Yu, Q Ouyang, Identification of a topological characteristic responsible for the biological robustness of regulatory networks. PLoS Comput. Biol. 5(7), e1000442 (2009).
 A Garg, AD Cara, I Xenarios, L Mendoza, GD Micheli, Synchronous versus asynchronous modeling of gene regulatory networks. Bioinformatics 24(17), 1917–1925 (2008).
 X Qian, ER Dougherty, Effect of function perturbation on the steady-state distribution of genetic regulatory networks: Optimal structural intervention. IEEE Trans. Signal Process. 56(10), 4966–76 (2008).
 A Datta, A Choudhary, ML Bittner, ER Dougherty, External control in Markovian genetic regulatory networks. Machine Learning 52(1–2), 169–191 (2003).
 MR Yousefi, A Datta, ER Dougherty, Optimal intervention in Markovian gene regulatory networks with random-length therapeutic response to antitumor drug. IEEE Trans. Biomed. Eng. 60(12), 3542–3552 (2013).
 C Derman, Finite State Markovian Decision Processes (Academic Press, New York, 1970).
 MR Yousefi, ER Dougherty, Intervention in gene regulatory networks with maximal phenotype alteration. Bioinformatics 29(14), 1758–1767 (2013).
 LCM Kallenberg, Linear Programming and Finite Markovian Control Problems (Mathematisch Centrum, Amsterdam, 1983).
 E Altman, Constrained Markov Decision Processes (Chapman Hall/CRC, Boca Raton, 1999).
 I Ivanov, P Simeonov, N Ghaffari, X Qian, ER Dougherty, Selection policy-induced reduction mappings for Boolean networks. IEEE Trans. Signal Process. 58(9), 4871–4882 (2010).
 A Fauré, A Naldi, C Chaouiya, D Thieffry, Dynamical analysis of a generic Boolean model for the control of the mammalian cell cycle. Bioinformatics 22(14), e124–e131 (2006).
 M Kanehisa, S Goto, KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30 (2000).
Copyright
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.