Reverse engineering gene regulatory networks from measurement with missing values

Background Gene expression time series data are usually in the form of high-dimensional arrays. Unfortunately, the data may sometimes contain missing values: for either the expression values of some genes at some time points or the entire expression values of a single time point or some sets of consecutive time points. This significantly affects the performance of many algorithms for gene expression analysis that take as an input, the complete matrix of gene expression measurement. For instance, previous works have shown that gene regulatory interactions can be estimated from the complete matrix of gene expression measurement. Yet, till date, few algorithms have been proposed for the inference of gene regulatory network from gene expression data with missing values. Results We describe a nonlinear dynamic stochastic model for the evolution of gene expression. The model captures the structural, dynamical, and the nonlinear natures of the underlying biomolecular systems. We present point-based Gaussian approximation (PBGA) filters for joint state and parameter estimation of the system with one-step or two-step missing measurements. The PBGA filters use Gaussian approximation and various quadrature rules, such as the unscented transform (UT), the third-degree cubature rule and the central difference rule for computing the related posteriors. The proposed algorithm is evaluated with satisfying results for synthetic networks, in silico networks released as a part of the DREAM project, and the real biological network, the in vivo reverse engineering and modeling assessment (IRMA) network of yeast Saccharomyces cerevisiae. Conclusion PBGA filters are proposed to elucidate the underlying gene regulatory network (GRN) from time series gene expression data that contain missing values. In our state-space model, we proposed a measurement model that incorporates the effect of the missing data points into the sequential algorithm. This approach produces a better inference of the model parameters and hence, more accurate prediction of the underlying GRN compared to when using the conventional Gaussian approximation (GA) filters ignoring the missing data points. Electronic supplementary material The online version of this article (doi:10.1186/s13637-016-0055-8) contains supplementary material, which is available to authorized users.


State Update:
GivenX k−1|k−1 and P XX k−1|k−1 are available at time k − 1 and are defined as: We assume that v k is uncorrelated with Y k−1 . Given the Gaussian assumption in (10) -(11), the Gaussian distributions in (16) and the fact that p(X k−1 |Y k−1 ) is Gaussian, then the joint PDF of x a k−1 , x k and v k conditioned on Y k−1 is also Gaussian: whereX and considering the fact thatx a k−1|k−1 ,x k|k−1 and Y k−1 are uncorrelated with v k , then: wherex a k−1|k−1 in (30) and P aa k−1|k−1 in (31) are given inX k−1|k−1 and P XX k−1|k−1 respectively in (28). However, to obtainx k|k−1 , P xx k|k−1 and P ax k−1,k|k−1 we do the following. Given that w k−1 is uncorrelated with Y k−1 , and from (5) and (11), we obtainx k|k−1 as follows: likewise, P xx k|k−1 is obtained as follows: Assuming that w k−1 and x a k−1 are uncorrelated, we obtain P ax k−1,k|k−1 as follows: (34)

Measurement Update:
After we approximate the predictive PDF withX k|k−1 and P XX k|k−1 , the Gaussian approximation of the filtering PDF will be obtained by computing the first two moments of the augmented stateX k|k and P XX k|k . This is achieved by using the Kalman filter equations: where K X k is the Kalman gain. Next, we will derive each of the expressions in (35).
Substituting the delayed/missing measurement function described in (7) -(9) into (13) yields:ŷ Using the equations (7) and (36), the measurement error (innovation) can be written in terms of the errors in the delayed measurements z k−d : Given that γ d k is independent of measurement z k , substituting (8) -(9) and (37) into the definition of P yy k|k−1 in (13) , we express this conditional covariance as follows: and similarly P Xy k|k−1 can be written as: Next, we must obtain the expressionsẑ k−d|k−1 , P zz k−d|k−1 and P Xz k,k−d|k−1 for d = 0, 1, 2.
Noting that v k is zero mean Gaussian noise with covariance R k and it is independent of Y k−1 , then (for d = 0) we obtainẑ k|k−1 , P zz k|k−1 and P Xz k|k−1 as follows. By using (6): and and Lastly, for d = 2,ẑ k−2|k−1 , P zz k−2|k−1 and P Xz k,k−2|k−1 are obtained as follows:ẑ and where the expectation terms are given by and Considering that w k−1 is uncorrelated with z k−2 , (50) can be computed as: Finally, givenX k−1|k−1 and P XX k−1|k−1 at time k − 1, Gaussian approximations of p(y k |Y k−1 ) , p(X k |Y k−1 ) and p(X k , y k |Y k−1 ) are also Gaussian,i.e., With Bayes rule, the GA of p(X k |Y k ) with filtering estimationX k|k and the covariance P XX k|k at time k of the augmented state is: The joint PDF is expressed as: we rewrite Σ as: and where K X k and P XX k|k are given in (35). Then Σ −1 is obtained as: Substituting (57) -(58) into (54) and coupled with the results obtained in (36) and (38), (54) becomes: Thus, we obtain the posterior PDF p(X k |Y k ) of the augmented state by substituting (59) into (53) with the meanX k|k and covariance P XX k|k as in (35).

Point based numerical integration
For an l-dimensional random vector x ∼ N (x; µ, P ), the first and second moments (mean and covariance) of x can be captured by using a set of points deterministically, called sigma-points. Consider the following nonlinear transformation of x, i.e., y = h(x). The mean and covariance of the random vector y and the cross-covariance between x and y can be estimated by propagating each of the sigma points through the nonlinear function, and these estimates are accurate to the second order (or third order for true Gaussian priors) of the Taylor series expansion of h(x) for any nonlinear function [1]. Next, we briefly introduce certain sigma-point based numerical integration techniques in the literature.
Unscented Transformation (UT): In the unscented transform, a set of 2l + 1 sigma-points are chosen as and their respective weights as where √ P = chol(P ), (P ) i represents the i th column of matrix P , α is a scaling parameter and it is usually a small number, 0 α 1, in order to avoid oversampling non-local effects when linearities are strong; β is a parameter that incorporates prior knowledge of random variable x, β = 2 is optimal for Gaussian distribution [2]; λ = α 2 (l + κ) − l where κ 0, guarantees positive semi-definiteness of the covariance matrix during the recursive filtering. Then the mean and variance of y and the cross-covariance of x and y are approximated as:

(62)
Third-degree Cubature Rule: The third-degree cubature rule is a special form of the UT [3]. A set of sigma points Γ i , i = 1, ..., 2l, are chosen (one point short of UT) as follows: with their respective weights as: The mean and variance of y and the cross-covariance of x and y are calculated in the same manner as in UT. Central Difference Rule: In Stirling's approximation of h(·), the derivatives in the Taylor series are replaced by the central divided differences [4]. Correspondingly, we have 2l + 1 points: with their respective weights as where δ is the step size which can be set to √ 3 for Gaussian distribution. Given that Y i = h(Γ i ), the mean and variance of y and the cross-covariance of x and y are approximated as: 3 Algorithm for UKF with one-step or two-step missing measurements Letx 0|0 be the initial state, and P xx 0|0 be the initial state covariance. The filter is initialized in the following manner: where N is the number of genes in the network. The algorithm iterates the following steps (state update and measurement update) for k = 1, 2, ...

State update
Approximation of the conditional mean and covariance of X k given Y k−1 , i.e.,X k|k−1 and P XX k|k−1 .