Gene regulatory network inference by point-based Gaussian approximation filters incorporating the prior information

Jia, Bin; Wang, Xiaodong

doi:10.1186/1687-4153-2013-16

Research
Open access
Published: 17 December 2013


Bin Jia¹ &
Xiaodong Wang²

EURASIP Journal on Bioinformatics and Systems Biology volume 2013, Article number: 16 (2013)


Abstract

The extended Kalman filter (EKF) has been applied to inferring gene regulatory networks. However, it is well known that the EKF becomes less accurate when the system exhibits high nonlinearity. In addition, certain prior information about the gene regulatory network exists in practice, and no systematic approach has been developed to incorporate such prior information into the Kalman-type filter for inferring the structure of the gene regulatory network. In this paper, an inference framework based on point-based Gaussian approximation filters that can exploit the prior information is developed to solve the gene regulatory network inference problem. Different point-based Gaussian approximation filters, including the unscented Kalman filter (UKF), the third-degree cubature Kalman filter (CKF₃), and the fifth-degree cubature Kalman filter (CKF₅) are employed. Several types of network prior information, including the existing network structure information, sparsity assumption, and the range constraint of parameters, are considered, and the corresponding filters incorporating the prior information are developed. Experiments on a synthetic network of eight genes and the yeast protein synthesis network of five genes are carried out to demonstrate the performance of the proposed framework. The results show that the proposed methods provide more accurate inference results than existing methods, such as the EKF and the traditional UKF.

1 Introduction

Inferring gene regulatory networks (GRNs) has become one of the most important tasks in systems biology. Genome-wide expression data are widely used due to the development of several high-throughput experimental technologies. The gene regulatory network can be inferred from a number of gene expression samples taken over a period of time. Modeling of the GRN is required before its structure can be inferred. Common dynamical modeling methods of GRN include Bayesian networks [1], Boolean networks [2], ordinary differential equations [3], state-space models [4, 5], and so on. Various approaches based on different models have been used to infer the network from observed gene expression data, such as the Markov chain Monte Carlo (MCMC) methods for the dynamic Bayesian network model [6] and the ordinary differential equation model [7], as well as the Kalman filtering methods for the state-space model [4, 8] and the ordinary differential equation model [3]. Surveys can be found in [9–12].

Due to the ‘stochastic’ nature of gene expression, the Kalman filtering approach based on the state-space model is one of the most competitive methods for inferring the GRN. The Kalman filter is optimal for linear Gaussian systems. However, the GRN is generally highly nonlinear. Hence, advanced filtering methods for nonlinear dynamic systems should be considered. The extended Kalman filter (EKF), which uses the first-order Taylor series expansion to linearize the nonlinear model, is probably the most widely used nonlinear filter. However, the accuracy of the EKF is low when the system is highly nonlinear or contains large uncertainty. The point-based Gaussian approximation filters, which employ various quadrature rules to compute the integrals involved in the exact Bayesian estimation, have recently been proposed to improve on the performance of the EKF. Many filters fall into this category, such as the unscented Kalman filter (UKF) [13], the Gauss-Hermite quadrature filter [14], the cubature Kalman filter (CKF) [15], and the sparse-grid quadrature filter [16]. Besides the point-based Gaussian approximation filters, the particle filter has drawn much attention recently [17]. The particle filter uses random particles with weights to represent the probability density function (pdf) in the Bayesian estimation and provides better estimation results than the EKF. The main problem of the particle filter is its high computational complexity, which makes it hard to use for high-dimensional problems such as the one considered in this paper.

The EKF and the particle filter have been used for the inference of GRN [4, 8, 18]. In this paper, we consider the point-based Gaussian approximation filters. Our main objective is to provide a framework for incorporating network prior information into the filters. For example, some gene regulations may be known from the literature [19], and the inference accuracy of GRN can be improved by incorporating these known regulations [20]. Integration of prior knowledge or constraints with the GRN inference algorithm has been introduced to improve the inference result: the DNA motif sequence in gene promoter regions is incorporated in [21], while modeling of transcription factor interactions is incorporated in [22]. As mentioned in [20], experimentally determined physical interactions can also be obtained. In addition, the sparsity constraint is frequently used in the inference of the GRN. To the best of the authors’ knowledge, the work most closely related to incorporating prior information in Bayesian filters is [8]. In that work, rather than directly taking the inference results from the filter, an optimization method is used; in particular, a cost function is used in which the sparsity constraint is enforced. However, the cost function in [8] does not consider the uncertainty of the state in the filtering, so it is not well coupled with the filtering algorithm. In addition, [8] does not consider other kinds of prior information. In this paper, we propose a new framework that incorporates the prior information effectively in the filtering algorithm by solving a constrained optimization problem. Efficient recursive algorithms are provided to solve the associated optimization problem.

The remainder of this paper is organized as follows. In Section 2, the modeling of gene regulatory network is introduced. The point-based Gaussian approximation filters are briefly introduced in Section 3. The proposed new filtering framework is described in Section 4. In Section 5, experimental results are provided. Finally, concluding remarks are given in Section 6.

2 State-space modeling of gene regulatory network

The GRN can be described by a graph in which genes are viewed as nodes and edges depict causal relations between genes. The structure of the GRN reveals the mechanisms of biological cells, and analyzing it will pave the way for curing various diseases [23]. The learning of GRN has drawn much attention recently due to the availability of microarray data. By analyzing gene expression levels collected over a period of time, one can identify various regulatory relations between different genes. To facilitate the analysis of the GRN, a model of the GRN is required. Different models can be used, such as Bayesian networks [1], Boolean networks [2], ordinary differential equations [3], and state-space models [4, 5]. The state-space model has been widely used because it incorporates noise and can make use of computationally efficient filtering algorithms [5]. Thus, we also use the state-space modeling of GRN in this paper.

Under the discrete-time state-space modeling of the gene regulatory networks, the network evolution from time k - 1 to time k can be described by

x_{k} = f (x_{k - 1}) + v_{k},

(1)

where $x_{k} = {[x_{1, k}, \dots, x_{n, k}]}^{T}$ is the state vector and $x_{i, k}$ denotes the gene expression level of the i-th gene at time k. f is a nonlinear function that characterizes the regulatory relationship among the genes. $v_{k}$ is the state noise and it is assumed to follow a Gaussian distribution with mean 0 and covariance matrix $Q_{k}$, i.e., $v_{k} \sim N (0, Q_{k})$ .

Following [8], we use the following nonlinear function in the state Equation (1):

f (x) = A g (x),

(2)

with

g (x) = [\begin{array}{c} g_{1} (x_{1}) \\ ⋮ \\ g_{n} (x_{n}) \end{array}]

(3)

and

g_{i} (x) = \frac{1}{1 + e^{- μ_{i} x}} .

(4)

In (2), A is the regulatory coefficient matrix, with $a_{ij}$ denoting the regulation coefficient from gene j to gene i. A positive coefficient $a_{ij}$ indicates that gene j activates gene i, and a negative $a_{ij}$ indicates that gene j represses gene i. In (4), $μ_{i}$ is a parameter. Both A and $μ_{i}$ are unknown. The discrete-time nonlinear stochastic dynamic system [24] given by Eqs. (1)-(4) has been successfully used to describe the GRN [4, 8]. The function in (4) is the sigmoid function, which is frequently used since it is consistent with the fact that all concentrations get saturated at some point in time [25]. The sigmoid function has been used in modeling GRN to verify various methods, such as artificial neural networks [26], simulated annealing and clustering algorithms [27], the extended Kalman filter [4], the particle filter [8], and genetic programming and Kalman filtering [25].

For the measurement model, we consider the following general nonlinear observation equation

\begin{array}{l} y_{k} = h (x_{k}) + n_{k}, \end{array}

(5)

where h (·) is some nonlinear function, n_k is the measurement noise, which is assumed to follow the Gaussian distribution with mean 0 and covariance matrix R_k, i.e., $n_{k} \sim N (0, R_{k})$ . For example, if the noise corrupted expression levels are observed, then h(x) = x.
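The model in Eqs. (1)-(5) can be simulated in a few lines. The following is a minimal sketch in plain Python, assuming h(x) = x and isotropic noise; the function names and the scalar noise levels `q_std` and `r_std` are illustrative choices, not part of the paper.

```python
import math
import random


def sigmoid(x, mu):
    """Sigmoid regulation function g_i of Eq. (4)."""
    return 1.0 / (1.0 + math.exp(-mu * x))


def simulate_grn(A, mu, x0, num_steps, q_std, r_std, seed=0):
    """Simulate x_k = A g(x_{k-1}) + v_k and y_k = x_k + n_k.

    Assumption: Q_k = q_std^2 I and R_k = r_std^2 I (isotropic noise).
    Returns the list of states and the list of noisy measurements.
    """
    rng = random.Random(seed)
    n = len(x0)
    x = list(x0)
    states, measurements = [], []
    for _ in range(num_steps):
        # g(x) is applied gene by gene, Eq. (3)
        g = [sigmoid(x[j], mu[j]) for j in range(n)]
        # State equation (1) with f(x) = A g(x), Eq. (2)
        x = [sum(A[i][j] * g[j] for j in range(n)) + rng.gauss(0.0, q_std)
             for i in range(n)]
        # Measurement equation (5) with h(x) = x
        y = [xi + rng.gauss(0.0, r_std) for xi in x]
        states.append(x)
        measurements.append(y)
    return states, measurements
```

Setting both noise levels to zero gives the deterministic trajectory, which is a convenient sanity check before adding noise.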

3 Network inference using point-based Gaussian approximation filters

3.1 Gaussian approximation filters

In this section, the framework of point-based Gaussian approximation filters for the state-space dynamic model is briefly reviewed. We consider the state-space model consisting of the state Equation (1) and the measurement Equation (5). We denote $y^{k} ≜ [y_{1}, \dots, y_{k}]$ .

The optimal Bayesian filter is composed of two steps: prediction and filtering. Specifically, given the prior pdf $p (x_{k - 1} | y^{k - 1})$ at time k - 1, the predicted conditional pdf $p (x_{k} | y^{k - 1})$ is given by

p (x_{k} | y^{k - 1}) = \int p (x_{k} | x_{k - 1}) p (x_{k - 1} | y^{k - 1}) d x_{k - 1} .

(6)

After the measurement at time k becomes available, the filtered pdf is given by

p (x_{k} | y^{k}) = \frac{p (y_{k} | x_{k}) p (x_{k} | y^{k - 1})}{\int p (y_{k} | x_{k}) p (x_{k} | y^{k - 1}) d x_{k}} .

(7)

The pdf recursions in (6) and (7) are in general computationally intractable unless the system is linear and Gaussian. The Gaussian approximation filters approximate (6) and (7) by invoking Gaussian assumptions. Specifically, the first assumption is that given $y^{k - 1}$, $x_{k - 1}$ has a Gaussian distribution, i.e., $x_{k - 1} | y^{k - 1} \sim N ({\hat{x}}_{k - 1 | k - 1}, P_{k - 1 | k - 1})$ . The second assumption is that $(x_{k}, y_{k})$ are jointly Gaussian given $y^{k - 1}$.

It then follows from the second assumption that given $y^{k - 1}$, $x_{k}$ has a Gaussian distribution, i.e., $x_{k} | y^{k - 1} \sim N ({\hat{x}}_{k | k - 1}, P_{k | k - 1}) .$ Using (1) and the first assumption, we have the predicted mean ${\hat{x}}_{k | k - 1}$ and covariance $P_{k | k - 1}$ given respectively by

\begin{array}{rl} {\hat{x}}_{k | k - 1} & ≜ E \{x_{k} | y^{k - 1}\} = E_{x_{k - 1} | y^{k - 1}} \{f (x_{k - 1})\} \\ & = \int f (x) ϕ (x; {\hat{x}}_{k - 1 | k - 1}, P_{k - 1 | k - 1}) d x, \end{array}

(8)

and

\begin{array}{rl} P_{k | k - 1} & ≜ Cov \{x_{k} | y^{k - 1}\} \\ & = E_{x_{k - 1} | y^{k - 1}} \{(f (x_{k - 1}) - {\hat{x}}_{k | k - 1}) {(f (x_{k - 1}) - {\hat{x}}_{k | k - 1})}^{T}\} + Q_{k - 1} \\ & = \int (f (x) - {\hat{x}}_{k | k - 1}) {(f (x) - {\hat{x}}_{k | k - 1})}^{T} ϕ (x; {\hat{x}}_{k - 1 | k - 1}, P_{k - 1 | k - 1}) d x + Q_{k - 1}, \end{array}

(9)

where $ϕ (x; \hat{x}, P)$ denotes the multivariate Gaussian pdf with mean $\hat{x}$ and covariance P.

Then, following the second assumption, given $y^{k} = [y^{k - 1}, y_{k}]$ , $x_{k}$ is Gaussian distributed, i.e., $x_{k} | y^{k} \sim N ({\hat{x}}_{k | k}, P_{k | k}) .$ Using the conditional property of the multivariate Gaussian distribution, the filtered mean ${\hat{x}}_{k | k}$ and covariance $P_{k | k}$ are given respectively by

\begin{array}{rl} {\hat{x}}_{k | k} & ≜ E \{x_{k} | y_{k}, y^{k - 1}\} \\ & = {\hat{x}}_{k | k - 1} + L_{k} (y_{k} - {\hat{y}}_{k | k - 1}) \end{array}

(10)

and

\begin{array}{rl} P_{k | k} & ≜ Cov \{x_{k} | y_{k}, y^{k - 1}\} \\ & = P_{k | k - 1} - L_{k} {(P_{k}^{xy})}^{T}, \end{array}

(11)

with

\begin{array}{l} {\hat{y}}_{k | k - 1} & = E_{x_{k} | y^{k - 1}} \{h (x_{k})\} \\ = \int h (x) ϕ (x; {\hat{x}}_{k | k - 1}, P_{k | k - 1}) d x, \end{array}

(12)

\begin{array}{l} L_{k} & = P_{k}^{xy} {(R_{k} + P_{k}^{yy})}^{- 1}, \end{array}

(13)

\begin{array}{l} P_{k}^{xy} & = E_{x_{k} | y^{k - 1}} \{(x - {\hat{x}}_{k | k - 1}) {(h (x) - {\hat{y}}_{k | k - 1})}^{T}\} \\ = \int (x - {\hat{x}}_{k | k - 1}) {(h (x) - {\hat{y}}_{k | k - 1})}^{T} \\ ϕ (x; {\hat{x}}_{k | k - 1}, P_{k | k - 1}) d x, \end{array}

(14)

\begin{array}{l} P_{k}^{yy} & = E_{x_{k} | y^{k - 1}} \{(h (x) - {\hat{y}}_{k | k - 1}) {(h (x) - {\hat{y}}_{k | k - 1})}^{T}\} \\ = \int (h (x) - {\hat{y}}_{k | k - 1}) {(h (x) - {\hat{y}}_{k | k - 1})}^{T} \\ ϕ (x; {\hat{x}}_{k | k - 1}, P_{k | k - 1}) d x . \end{array}

(15)

3.2 Point-based Gaussian approximation filters

The integrals in (8), (9), (12), (14) and (15) are Gaussian-type integrals that can be efficiently approximated by various quadrature methods. Specifically, if a set of weighted points $\{(γ_{i}, w_{i}), i = 1, \dots, N\}$ can be used to approximate the integral

\int h (x) ϕ (x; 0, I) d x \approx \sum_{i = 1}^{N} w_{i} h (γ_{i}),

(16)

then the general Gaussian-type integral can be approximated by

\int h (x) ϕ (x; \hat{x}, P) d x \approx \sum_{i = 1}^{N} w_{i} h (S γ_{i} + \hat{x}),

(17)

where P= S S^T and S can be obtained by Cholesky decomposition or singular value decomposition (SVD).

Using (17), we can then approximate (8) and (9) as follows:

{\hat{x}}_{k | k - 1} \approx \sum_{i = 1}^{N} w_{i} f (ξ_{k - 1, i})

(18)

and

\begin{array}{rl} P_{k | k - 1} \approx & \sum_{i = 1}^{N} w_{i} (f (ξ_{k - 1, i}) - {\hat{x}}_{k | k - 1}) {(f (ξ_{k - 1, i}) - {\hat{x}}_{k | k - 1})}^{T} + Q_{k - 1}, \end{array}

(19)

where $ξ_{k - 1, i}$ is the transformed quadrature point obtained from the covariance decomposition, i.e.,

\begin{array}{l} P_{k - 1 | k - 1} & = S_{k - 1} S_{k - 1}^{T}, \end{array}

(20)

\begin{array}{l} ξ_{k - 1, i} & = S_{k - 1} γ_{i} + {\hat{x}}_{k - 1 | k - 1} . \end{array}

(21)

Similarly, we can approximate (12), (14) and (15) as follows:

\begin{array}{l} {\hat{y}}_{k | k - 1} & = \sum_{i = 1}^{N} w_{i} h ({\tilde{ξ}}_{k, i}), \end{array}

(22)

\begin{array}{l} P_{k}^{xy} & = \sum_{i = 1}^{N} w_{i} ({\tilde{ξ}}_{k, i} - {\hat{x}}_{k | k - 1}) {(h ({\tilde{ξ}}_{k, i}) - {\hat{y}}_{k | k - 1})}^{T}, \end{array}

(23)

\begin{array}{l} P_{k}^{yy} & = \sum_{i = 1}^{N} w_{i} (h ({\tilde{ξ}}_{k, i}) - {\hat{y}}_{k | k - 1}) {(h ({\tilde{ξ}}_{k, i}) - {\hat{y}}_{k | k - 1})}^{T}, \end{array}

(24)

where ${\tilde{ξ}}_{k, i}$ is the transformed point obtained from the decomposition of the predicted covariance, i.e.,

\begin{array}{l} P_{k | k - 1} & = {\tilde{S}}_{k} {\tilde{S}}_{k}^{T}, \end{array}

(25)

\begin{array}{l} {\tilde{ξ}}_{k, i} & = {\tilde{S}}_{k} γ_{i} + {\hat{x}}_{k | k - 1} . \end{array}

(26)
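The prediction step in Eqs. (18)-(21) can be sketched as follows, using plain Python lists to stay dependency-free. The helper names (`cholesky`, `transform_points`, `predict`) are illustrative, the Cholesky factor is one of the two decompositions mentioned after (17), and the points and weights are passed in so that any of the rules summarized below can be plugged in. The measurement-side quantities in (22)-(26) follow the same pattern with h in place of f.

```python
import math


def cholesky(P):
    """Lower-triangular S with P = S S^T (P symmetric positive definite)."""
    n = len(P)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = P[i][j] - sum(S[i][k] * S[j][k] for k in range(j))
            S[i][j] = math.sqrt(acc) if i == j else acc / S[j][j]
    return S


def transform_points(x_hat, P, gammas):
    """Eqs. (20)-(21): xi_i = S gamma_i + x_hat."""
    S = cholesky(P)
    n = len(x_hat)
    return [[x_hat[r] + sum(S[r][c] * g[c] for c in range(n)) for r in range(n)]
            for g in gammas]


def predict(f, x_hat, P, Q, gammas, weights):
    """Eqs. (18)-(19): predicted mean and covariance for x_k = f(x_{k-1}) + v_k."""
    n = len(x_hat)
    fx = [f(xi) for xi in transform_points(x_hat, P, gammas)]
    mean = [sum(w * v[r] for w, v in zip(weights, fx)) for r in range(n)]
    cov = [[Q[r][c] + sum(w * (v[r] - mean[r]) * (v[c] - mean[c])
                          for w, v in zip(weights, fx))
            for c in range(n)]
           for r in range(n)]
    return mean, cov
```

For a linear f(x) = Fx, any rule that matches the first two moments of the standard Gaussian reproduces the Kalman prediction F x̂ and F P Fᵀ + Q, which gives a simple correctness check.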

Various numerical rules can be used to form the approximation in (16), which lead to different Gaussian approximation filters. In particular, the unscented transformation, the Gauss-Hermite quadrature rule, and the sparse-grid quadrature rules are used in the unscented Kalman filter (UKF), the Gauss-Hermite quadrature Kalman filter (GHQF), and the sparse-grid quadrature filter (SGQF), respectively.

Recently, the fifth-degree quadrature filter has been proposed and shown to be more accurate than the third-degree quadrature filters, such as the UKF and the third-degree cubature Kalman filter (CKF₃), when the system is highly nonlinear or contains large uncertainty [16]. In this paper, we consider the UKF, the CKF₃, and the fifth-degree cubature Kalman filter (CKF₅). Other filters such as the central difference filter [14] and the divided difference filter [28] can also be used. The CKF₅ is based on Mysovskikh’s method, which uses fewer points than the fifth-degree quadrature filter in [16]. In the following, different numerical rules used in (16) are briefly summarized.

3.2.1 Unscented transform

In the unscented Kalman filter (UKF), we have N = 2 n + 1 where n is the dimension of x. The quadrature points and the corresponding weights are given respectively by

γ_{i} = \{\begin{array}{l} 0, & i = 1, \\ \sqrt{(n + κ)} e_{i - 1}, & i = 2, \dots, n + 1, \\ - \sqrt{(n + κ)} e_{i - n - 1}, & i = n + 2, \dots, 2 n + 1, \end{array}

(27)

and

w_{i} = \{\begin{array}{l} \frac{κ}{n + κ}, & i = 1, \\ \frac{1}{2 (n + κ)}, & i = 2, \dots, 2 n + 1, \end{array}

(28)

where κ is a tunable parameter, and e_i is the i-th n-dimensional unit vector in which the i-th element is 1 and other elements are 0.
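As an illustration, the points and weights in (27) and (28) can be generated as below. This is a sketch in plain Python; the function name `unscented_points` is an illustrative choice, and n + κ > 0 is assumed so that the square root is real.

```python
import math


def unscented_points(n, kappa):
    """Points gamma_i and weights w_i of Eqs. (27)-(28).

    Assumes n + kappa > 0. Returns 2n + 1 points, each a list of length n.
    """
    scale = math.sqrt(n + kappa)
    points = [[0.0] * n]  # central point, i = 1
    for sign in (1.0, -1.0):  # +sqrt(n+kappa) e_i, then -sqrt(n+kappa) e_i
        for i in range(n):
            points.append([sign * scale if j == i else 0.0 for j in range(n)])
    weights = [kappa / (n + kappa)] + [1.0 / (2.0 * (n + kappa))] * (2 * n)
    return points, weights
```

The weights sum to one and the weighted second moment of the points is the identity, so the rule reproduces the mean and covariance of the standard Gaussian for any admissible κ.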

3.2.2 Cubature rules

The left-hand side of (16) can be rewritten as

\int h (x) ϕ (x; 0, I) d x = \frac{1}{π^{n / 2}} \int h (\sqrt{2} x) exp (- x^{T} x) d x .

(29)

Consider the integral

I (h) = \int h (x) exp (- x^{T} x) d x .

(30)

By letting x = r s with $s^{T} s = 1$ and $r = \sqrt{x^{T} x}$ , I(h) can be rewritten in the spherical-radial coordinate system as

I (h) = \int_{0}^{\infty} \int_{U_{n}} h (r s) r^{n - 1} exp (- r^{2}) d σ (s) d r,

(31)

where $U_{n} = \{s \in R^{n} : ∥ s ∥ = 1\}$ , and $σ (\cdot)$ is the spherical surface measure or the area element on U_n.

Note that (31) contains two types of integrals: the radial integral $\int_{0}^{\infty} h_{r} (r) r^{n - 1} exp (- r^{2}) d r$ and the spherical integral $\int_{U_{n}} h_{s} (s) d σ (s)$ .

If the radial integral can be approximated by

\int_{0}^{\infty} h_{r} (r) r^{n - 1} exp (- r^{2}) d r \approx \sum_{i = 1}^{N_{r}} w_{r, i} h_{r} (r_{i}),

(32)

and the spherical integral can be approximated by

\int_{U_{n}} h_{s} (s) d σ (s) \approx \sum_{j = 1}^{N_{s}} w_{s, j} h_{s} (s_{j}),

(33)

then (31) can be approximated by

\begin{array}{l} I (h) & \approx & \sum_{i = 1}^{N_{r}} \sum_{j = 1}^{N_{s}} w_{r, i} w_{s, j} h (r_{i} s_{j}) . \end{array}

(34)

A third-degree cubature rule to approximate (29) is obtained by using the third-degree spherical rule and radial rule [15]:

\int h (x) ϕ (x; 0, I) d x \approx \frac{1}{2 n} \sum_{i = 1}^{n} [h (\sqrt{n} e_{i}) + h (- \sqrt{n} e_{i})] .

(35)

Remark: The third-degree cubature rule is identical to the unscented transformation with κ = 0.
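A short sketch of the rule in (35) follows; the names `cubature3_points` and `gaussian_expectation` are illustrative. With κ = 0 the central weight in (28) vanishes, and the remaining 2n unscented points coincide with the points generated here.

```python
import math


def cubature3_points(n):
    """The 2n points +/- sqrt(n) e_i with equal weights 1/(2n), Eq. (35)."""
    points = []
    for sign in (1.0, -1.0):
        for i in range(n):
            points.append([sign * math.sqrt(n) if j == i else 0.0
                           for j in range(n)])
    weights = [1.0 / (2 * n)] * (2 * n)
    return points, weights


def gaussian_expectation(h, n):
    """Approximate E[h(x)] for x ~ N(0, I_n) with the third-degree rule."""
    points, weights = cubature3_points(n)
    return sum(w * h(p) for w, p in zip(weights, points))
```

For example, `gaussian_expectation(lambda x: x[0] ** 2, n)` returns 1 up to rounding, which is exact, whereas the fourth moment `x[0] ** 4` evaluates to n instead of the true value 3. This is the accuracy gap that motivates the fifth-degree rule.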

To construct the fifth-degree cubature rule, Mysovskikh’s method [29] and the moment matching method [16] are used to provide the fifth-degree spherical rule and radial rule, respectively. The final fifth-degree cubature rule is given by

\begin{array}{rl} \int h (x) ϕ (x; 0, I) d x \approx & \frac{2}{n + 2} h (0) \\ & + \frac{n^{2} (7 - n)}{2 {(n + 1)}^{2} {(n + 2)}^{2}} \sum_{i = 1}^{n + 1} [h (\sqrt{n + 2} s_{1}^{(i)}) + h (- \sqrt{n + 2} s_{1}^{(i)})] \\ & + \frac{2 {(n - 1)}^{2}}{{(n + 1)}^{2} {(n + 2)}^{2}} \sum_{i = 1}^{n (n + 1) / 2} [h (\sqrt{n + 2} s_{2}^{(i)}) + h (- \sqrt{n + 2} s_{2}^{(i)})], \end{array}

(36)

where the point $s_{1}^{(i)}$ is given by

s_{1}^{(i)} = [p_{1}^{(i)}, p_{2}^{(i)}, \dots, p_{n}^{(i)}], i = 1, 2, \dots, n + 1,

(37)

with

p_{j}^{(i)} = \{\begin{array}{l} - \sqrt{\frac{n + 1}{n (n - j + 2) (n - j + 1)}}, & j < i, \\ \sqrt{\frac{(n + 1) (n - i + 1)}{n (n - i + 2)}}, & j = i, \\ 0, & j > i. \end{array}

(38)

Moreover, the set of points ${s_{2}^{(i)}}$ is given by

\begin{array}{l} \{s_{2}^{(i)}\} = & \{\sqrt{\frac{n}{2 (n - 1)}} (s_{1}^{(k)} + s_{1}^{(l)}) : k < l, k, l = 1, \\ 2, \dots, n + 1\} . \end{array}

(39)
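The point set in (36)-(39) is easier to check in code than by hand. The sketch below builds it in plain Python; the function names are illustrative, and n ≥ 2 is assumed because the factor in (39) is undefined for n = 1.

```python
import math
from itertools import combinations


def simplex_points(n):
    """Points s_1^(i), i = 1, ..., n + 1, of Eqs. (37)-(38) (1-based i and j)."""
    points = []
    for i in range(1, n + 2):
        p = []
        for j in range(1, n + 1):
            if j < i:
                p.append(-math.sqrt((n + 1) / (n * (n - j + 2) * (n - j + 1))))
            elif j == i:
                p.append(math.sqrt((n + 1) * (n - i + 1) / (n * (n - i + 2))))
            else:
                p.append(0.0)
        points.append(p)
    return points


def cubature5_points(n):
    """Points and weights of the fifth-degree rule in Eq. (36); assumes n >= 2."""
    s1 = simplex_points(n)
    c = math.sqrt(n / (2.0 * (n - 1)))
    # Eq. (39): scaled sums of all pairs of distinct simplex points
    s2 = [[c * (a + b) for a, b in zip(s1[k], s1[l])]
          for k, l in combinations(range(n + 1), 2)]
    r = math.sqrt(n + 2)
    w1 = n ** 2 * (7 - n) / (2.0 * (n + 1) ** 2 * (n + 2) ** 2)
    w2 = 2.0 * (n - 1) ** 2 / ((n + 1) ** 2 * (n + 2) ** 2)
    points, weights = [[0.0] * n], [2.0 / (n + 2)]
    for group, w in ((s1, w1), (s2, w2)):
        for s in group:
            for sign in (1.0, -1.0):
                points.append([sign * r * v for v in s])
                weights.append(w)
    return points, weights
```

The rule uses n² + 3n + 3 points. Note that the weight multiplying the first sum in (36) becomes negative for n > 7, which follows directly from the factor (7 - n).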

3.3 Augmented state-space model for network inference

In the state-space model for gene regulatory networks described in Section 2, the underlying network structure is characterized by the n × n regulatory coefficient matrix A in (2) and the parameters μ = [ μ₁,…,μ_n] in (4). The problem of network inference then becomes that of estimating A and μ. To do so, we incorporate the unknown parameters A and μ into the state vector to obtain an augmented state-space model, and then apply the point-based Gaussian approximation filters to estimate the augmented state vector, thereby obtaining the estimates of A and μ.

Specifically, we denote $θ = {[a_{11}, a_{12}, \dots, a_{1 n}, \dots, a_{nn}, μ_{1}, \dots, μ_{n}]}^{T}$ and the augmented state vector ${\bar{x}}_{k} = {[x_{k}^{T}, θ^{T}]}^{T}$ . Then, the augmented state equation can be written as

{\bar{x}}_{k} = \bar{f} ({\bar{x}}_{k - 1}) + {\bar{v}}_{k} = [\begin{array}{l} A_{k - 1} g_{k - 1} (x_{k - 1}) \\ θ_{k - 1} \end{array}] + [\begin{array}{l} v_{k} \\ 0 \end{array}] .

(40)

Note that $A_{k - 1}$ and $g_{k - 1}$ can be obtained from $θ_{k - 1}$, and ${\bar{v}}_{k} \sim N (0, {\bar{Q}}_{k})$ with ${\bar{Q}}_{k} = diag ([Q_{k} O_{n^{2} + n}])$ , where $O_{m}$ denotes an m × m all-zero matrix.

In the remainder of this paper, we assume that the noisy gene expression levels are observed. Therefore, the augmented measurement equation becomes

y_{k} = h ({\bar{x}}_{k}) + n_{k} = B {\bar{x}}_{k} + n_{k},

(41)

where $B = [I_{n}, O_{n \times (n^{2} + n)}]$ and $O_{n \times (n^{2} + n)}$ denotes an n × (n² + n) all-zero matrix.

The point-based Gaussian approximation filters can then be used to obtain the estimate of the augmented state, ${\hat{\bar{x}}}_{k}$ , from which the estimates of the unknown network parameters, i.e., $\hat{A}$ and $\hat{μ}$ can then be obtained.

Note that since the measurement Equation (41) is linear, the filtering Equations (10) and (11) become

\begin{array}{l} {\hat{\bar{x}}}_{k | k} & = {\hat{\bar{x}}}_{k | k - 1} + L_{k} (y_{k} - B {\hat{\bar{x}}}_{k | k - 1}), \end{array}

(42)

and

\begin{array}{l} P_{k | k} = P_{k | k - 1} - L_{k} B P_{k | k - 1}, \end{array}

(43)

with

\begin{array}{l} L_{k} = P_{k | k - 1} B^{T} {(R_{k} + B P_{k | k - 1} B^{T})}^{- 1}, \end{array}

(44)

which are the same as the filtering updates for Kalman filters.
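To make the augmented model concrete, the sketch below evaluates the noise-free part of (40) and the measurement map in (41) on a flat Python list. The helper names and the row-major storage of A inside θ (matching the ordering a₁₁, a₁₂, …, a_nn, μ₁, …, μ_n given above) are the only assumptions.

```python
import math


def unpack(x_bar, n):
    """Split the augmented state into x (n), A (n x n) and mu (n)."""
    x = x_bar[:n]
    A = [x_bar[n + i * n: n + (i + 1) * n] for i in range(n)]
    mu = x_bar[n + n * n:]
    return x, A, mu


def f_bar(x_bar, n):
    """Noise-free part of Eq. (40): [A g(x); theta], with theta carried over."""
    x, A, mu = unpack(x_bar, n)
    g = [1.0 / (1.0 + math.exp(-mu[j] * x[j])) for j in range(n)]
    x_next = [sum(A[i][j] * g[j] for j in range(n)) for i in range(n)]
    return x_next + x_bar[n:]


def measure(x_bar, n):
    """Eq. (41) without noise: B = [I_n, O] selects the first n entries."""
    return x_bar[:n]
```

Because B only selects the expression levels, the products in (44) reduce to sub-blocks of the predicted covariance: B P Bᵀ is its leading n × n block and P Bᵀ is its first n columns.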

4 Incorporating prior information

In practice, some prior knowledge on the underlying GRN is typically available. In this section, we outline approaches to incorporating such prior knowledge into the point-based Gaussian approximation filters for network inference. In particular, we consider two types of prior information, namely, sparsity constraints and range constraints on the network. For networks with sparsity constraints, we incorporate an iterative thresholding procedure into the Gaussian approximation filters. And to accommodate range constraints, we employ PDF-truncated Gaussian approximation filters.

4.1 Optimization-based approach for sparsity constraints

4.1.1 The optimization formulations

Note that under the Gaussian assumption, the state estimation ${\hat{\bar{x}}}_{k | k}$ of the Kalman filter is equivalently given by the solution to the following optimization problem [30, 31]

\begin{array}{l} {\hat{\bar{x}}}_{k | k} & = \underset{\bar{x}}{arg min} J (\bar{x}), \end{array}

(45)

with

\begin{array}{rl} J (\bar{x}) ≜ & {(y_{k} - h (\bar{x}))}^{T} R_{k}^{- 1} (y_{k} - h (\bar{x})) \\ & + {(\bar{x} - {\hat{\bar{x}}}_{k | k - 1})}^{T} P_{k | k - 1}^{- 1} (\bar{x} - {\hat{\bar{x}}}_{k | k - 1}) . \end{array}

(46)

To incorporate the prior information of the GRN, (46) is modified as

\tilde{J} (\bar{x}) = J (\bar{x}) + λ J_{p} (\bar{x}),

(47)

where $J_{p} (\bar{x})$ is a penalty function associated with the prior information and λ is a tunable parameter that regulates the tightness of the penalty.

For example, in gene regulatory networks, each gene only interacts with a few genes [20]. To capture such a sparsity constraint, a Laplace prior distribution can be used for the connection coefficient matrix A, i.e.,

p (A) = {(λ / 2)}^{n^{2}} exp (- λ \sum_{i = 1}^{n} \sum_{j = 1}^{n} | a_{ij} |) .

(48)

Therefore, in this case, $J_{p} (\bar{x}) = - log p (A) = c_{1} ∥ A ∥_{1} + c_{2}$ where c₁ and c₂ are constants. And, (47) can be rewritten as

\tilde{J} (\bar{x}) = J (\bar{x}) + λ ∥ A ∥_{1} .

(49)

Note that (49) can also be interpreted as the result of using the least absolute shrinkage and selection operator (LASSO) penalty in (47). The LASSO adds an L₁-norm penalty to the GRN inference so that the regulatory coefficient matrix A tends to be sparse with many zero elements.

As another example, if some known regulatory relationship exists, then it should be taken into account to improve the estimation accuracy. Specifically, define an n × n indicator matrix $E = [e_{ij}]$ where $e_{ij} = 1$ indicates that there is a lack of regulation from gene j to gene i. Then, similar to the use of LASSO, a penalty on $a_{ij}$ should be incurred if $e_{ij} = 1$. Thus, (47) can be rewritten as

\tilde{J} (\bar{x}) = J (\bar{x}) + λ ∥ E \circ A ∥_{1} .

(50)

Note that, as in [20], we do not force $a_{ij} = 0$ when $e_{ij} = 1$ but rather use an L₁-norm penalty. The advantage of this approach is that the algorithm can still pick different structures but is more likely to pick the edges without penalties. In (50), ‘∘’ denotes the entry-wise product of two matrices.

4.1.2 Iterative thresholding algorithm

Solving the optimization problems in (49) and (50) is not straightforward since |a| is non-differentiable at a = 0. In the following, an efficient solver called the iterative thresholding algorithm is introduced.

For convenience, we consider a general optimization problem of the form

\underset{\bar{x}}{arg min} J (\bar{x}) = L (\bar{x}) + ∥ λ \circ \bar{x} ∥_{1},

(51)

where $λ = {[λ_{1}, λ_{2}, \dots, λ_{n^{2} + 2 n}]}^{T}$ and $L (\bar{x})$ is a smooth function. Note that if $λ = {[0_{1 \times n}, λ \times 1_{1 \times n^{2}}, 0_{1 \times n}]}^{T}$ , then (51) becomes (49); and if $λ = {[0_{1 \times n}, λ \times \underline{\hat{θ}}, 0_{1 \times n}]}^{T}$ , then (51) becomes (50). Here, inside the brackets, λ is the scalar parameter of (49) and (50), and $\underline{\hat{θ}} = {[e_{11}, e_{12}, \dots, e_{1 n}, \dots, e_{nn}]}^{T}$ .

The solution to (51) can be iteratively obtained by solving a sequence of optimization problems. As in Newton’s method, the Taylor series expansion of $L (\bar{x})$ around the solution ${\bar{x}}^{t}$ at the t-th iteration is given by

L ({\bar{x}}^{t} + Δ \bar{x}) ≅ L ({\bar{x}}^{t}) + Δ {\bar{x}}^{T} \nabla L ({\bar{x}}^{t}) + \frac{α_{t}}{2} ∥ Δ \bar{x} ∥_{2}^{2},

(52)

where ∇L is the gradient of L and α_t is such that α_tI mimics the Hessian ∇²L. Then, ${\bar{x}}^{t + 1}$ is given by [32]

{\bar{x}}^{t + 1} = \underset{z}{arg min} {(z - {\bar{x}}^{t})}^{T} \nabla L ({\bar{x}}^{t}) + \frac{α_{t}}{2} ∥ z - {\bar{x}}^{t} ∥_{2}^{2} + ∥ λ \circ z ∥_{1} .

(53)

The equivalent form of (53) is given by [32]

\begin{array}{l} {\bar{x}}^{t + 1} & = \underset{z}{arg min} \frac{1}{2} ∥ z - u^{t} ∥_{2}^{2} + \frac{1}{α_{t}} ∥ λ \circ z ∥_{1}, \end{array}

(54)

with

\begin{array}{l} u^{t} = {\bar{x}}^{t} - \frac{1}{α_{t}} \nabla L ({\bar{x}}^{t}), \end{array}

(55)

\begin{array}{l} α_{t} & \approx & \frac{{(s^{t})}^{T} r^{t}}{∥ s^{t} ∥^{2}}, \end{array}

(56)

\begin{array}{l} s^{t} & = {\bar{x}}^{t} - {\bar{x}}^{t - 1}, \end{array}

(57)

\begin{array}{l} r^{t} & = \nabla L ({\bar{x}}^{t}) - \nabla L ({\bar{x}}^{t - 1}) . \end{array}

(58)

The solution to (54) is given by [32] ${\bar{x}}^{t + 1} = η^{S} (u^{t}, \frac{λ}{α_{t}})$ , where

η^{S} (u, a) = sign (u) max \{| u | - a, 0\}

(59)

is the soft thresholding function with sign(u) and $max \{| u | - a, 0\}$ being component-wise operators.

Finally, the iterative procedure for solving (51) is given by

{\bar{x}}^{t + 1} = sign ({\bar{x}}^{t} - \frac{1}{α_{t}} \nabla L ({\bar{x}}^{t})) max \{|{\bar{x}}^{t} - \frac{1}{α_{t}} \nabla L ({\bar{x}}^{t})| - \frac{λ}{α_{t}}, 0\} .

(60)

And the iteration stops when the following condition is met

\frac{| J ({\bar{x}}^{t}) - J ({\bar{x}}^{t - 1}) |}{| J ({\bar{x}}^{t - 1}) |} \leq ε,

(61)

where ε is a given small number.
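The recursion in (55)-(61) can be sketched as follows for a generic smooth L supplied together with its gradient. The function names, the initial step parameter `alpha0`, the iteration cap, and the safeguard that keeps the previous α_t when the ratio in (56) is undefined or non-positive are assumptions added for this illustration and are not specified in the text.

```python
import math


def soft_threshold(u, a):
    """Component-wise eta^S(u, a) = sign(u) max(|u| - a, 0), Eq. (59)."""
    return [math.copysign(max(abs(ui) - ai, 0.0), ui) for ui, ai in zip(u, a)]


def iterative_thresholding(L, grad_L, lam, x0, alpha0=1.0, eps=1e-8, max_iter=200):
    """Minimize L(x) + ||lam o x||_1 with the iteration of Eq. (60).

    lam is the vector of per-component penalty weights in Eq. (51).
    """
    def objective(x):
        return L(x) + sum(l * abs(v) for l, v in zip(lam, x))

    x, alpha = list(x0), alpha0
    x_prev, g_prev = None, None
    J_prev = objective(x)
    for _ in range(max_iter):
        g = grad_L(x)
        if x_prev is not None:
            s = [a - b for a, b in zip(x, x_prev)]   # Eq. (57)
            r = [a - b for a, b in zip(g, g_prev)]   # Eq. (58)
            ss = sum(v * v for v in s)
            sr = sum(a * b for a, b in zip(s, r))
            if ss > 0.0 and sr > 0.0:                # safeguard (assumption)
                alpha = sr / ss                      # Eq. (56)
        u = [xi - gi / alpha for xi, gi in zip(x, g)]    # Eq. (55)
        x_prev, g_prev = x, g
        x = soft_threshold(u, [l / alpha for l in lam])  # Eq. (60)
        J = objective(x)
        # Relative-change stopping rule, Eq. (61), guarded against J_prev = 0
        if abs(J - J_prev) <= eps * max(abs(J_prev), 1e-300):
            break
        J_prev = J
    return x
```

For the toy problem L(x) = ½‖x − b‖², the minimizer is the soft-thresholded vector η^S(b, λ), so the routine can be checked against a closed form before it is applied to the filtering cost in (46).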

4.2 PDF truncation method for range constraints

If the range constraints on the regulatory coefficients are available, the inference accuracy can be improved by enforcing the constraints in the Gaussian approximation filters.

In particular, assume that we impose the following range constraints on the state vector $\bar{x}$

c \leq \bar{x} \leq d .

(62)

The PDF truncation method [31] can be employed to incorporate the above range constraint into the Gaussian approximation filters, by converting the updated mean ${\hat{\bar{x}}}_{k | k}$ and covariance $P_{k | k}$ to the pseudo mean ${\hat{\bar{x}}}_{k | k}^{t}$ and covariance $P_{k | k}^{t}$ which are then used in the next prediction and filtering steps.

We next briefly outline the PDF truncation procedure. We use ${\hat{\bar{x}}}_{k | k, i}^{t}$ and $P_{k | k, i}^{t}$ to denote the mean and covariance after the first i constraints have been enforced. Initially, we set ${\hat{\bar{x}}}_{k | k, 0}^{t} = {\hat{\bar{x}}}_{k | k}$ and $P_{k | k, 0}^{t} = P_{k | k}$ . Consider the following transformation

z_{k, i} = G_{i} D_{i}^{- 1 / 2} S_{i}^{T} ({\bar{x}}_{k} - {\hat{\bar{x}}}_{k | k, i}^{t})

(63)

where $S_{i}$ and $D_{i}$ are obtained from the Jordan canonical decomposition $S_{i} D_{i} S_{i}^{T} = P_{k | k, i}^{t}$ and $G_{i}$ is obtained by using the Gram-Schmidt orthogonalization and it satisfies [33]

G_{i} D_{i}^{1 / 2} S_{i}^{T} e_{i} = {[{(e_{i}^{T} P_{k | k, i}^{t} e_{i})}^{1 / 2}, 0, \dots, 0]}^{T} .

(64)

Then, the upper bound $e_{i}^{T} \bar{x} \leq d_{i}$ is transformed to [33]

[1, 0, \dots, 0] z_{k, i} \leq \frac{d_{i} - e_{i}^{T} {\hat{\bar{x}}}_{k | k, i}^{t}}{{(e_{i}^{T} P_{k | k, i}^{t} e_{i})}^{1 / 2}} ≜ {\tilde{d}}_{i} .

(65)

Similarly, the lower bound $e_{i}^{T} \bar{x} \geq c_{i}$ is transformed to

[1, 0, \dots, 0] z_{k, i} \geq \frac{c_{i} - e_{i}^{T} {\hat{\bar{x}}}_{k | k, i}^{t}}{{(e_{i}^{T} P_{k | k, i}^{t} e_{i})}^{1 / 2}} ≜ {\tilde{c}}_{i} .

(66)

The constraint requires that the first element of z_k,i lies between ${\tilde{c}}_{i}$ and ${\tilde{d}}_{i}$ . Hence, only the truncated PDF of the first element of z_k,i is considered and it is given by [33]

\begin{array}{l} f (z) & = α_{i} exp (- z^{2} / 2), \end{array}

(67)

with

\begin{array}{l} α_{i} = \frac{\sqrt{2}}{\sqrt{π} [erf ({\tilde{d}}_{i} / \sqrt{2}) - erf ({\tilde{c}}_{i} / \sqrt{2})]} . \end{array}

(68)

Then, the mean and variance of the first element of z_k,i after imposing the i-th constraint are given respectively by

\begin{array}{l} μ_{i} = & \int_{{\tilde{c}}_{i}}^{{\tilde{d}}_{i}} zf (z) d z = α_{i} [exp (- {\tilde{c}}_{i}^{2} / 2) - exp (- {\tilde{d}}_{i}^{2} / 2)], \end{array}

(69)

\begin{array}{l} σ_{i}^{2} = & \int_{{\tilde{c}}_{i}}^{{\tilde{d}}_{i}} {(z - μ_{i})}^{2} f (z) d z \\ = & α_{i} [exp (- {\tilde{c}}_{i}^{2} / 2) ({\tilde{c}}_{i} - 2 μ_{i}) \\ - exp (- {\tilde{d}}_{i}^{2} / 2) ({\tilde{d}}_{i} - 2 μ_{i})] + μ_{i}^{2} + 1 . \end{array}

(70)

Thus, the mean and covariance of the transformed state vector after imposing the i-th constraint are given respectively by

\begin{array}{l} {\bar{z}}_{k, i} & = {[μ_{i}, 0, \dots, 0]}^{T}, \end{array}

(71)

\begin{array}{l} Q_{k, i} & = diag ([σ_{i}^{2}, 1, \dots, 1]) . \end{array}

(72)

By taking the inverse transform of (63), we then get

\begin{array}{l} {\hat{\bar{x}}}_{k | k, i + 1}^{t} & = S_{i} D_{i}^{1 / 2} G_{i}^{T} {\bar{z}}_{k, i} + {\hat{\bar{x}}}_{k | k, i}^{t}, \end{array}

(73)

\begin{array}{l} P_{k | k, i + 1}^{t} & = S_{i} D_{i}^{1 / 2} G_{i}^{T} Q_{k, i} G_{i} D_{i}^{1 / 2} S_{i}^{T} . \end{array}

(74)

After imposing all n constraints, the final constrained state estimate and covariance at time k are given respectively by ${\hat{\bar{x}}}_{k | k}^{t} ≜ {\hat{\bar{x}}}_{k | k, n}^{t}$ and $P_{k | k}^{t} ≜ P_{k | k, n}^{t}$ .
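The scalar moments in (68)-(70) are the core of the procedure and can be checked numerically. The sketch below implements them and applies them to a one-dimensional state, for which S_i = G_i = 1 and D_i equals the variance, so that (73) and (74) reduce to a shift and a scaling. The function names are illustrative, and the multivariate case additionally needs the decomposition and Gram-Schmidt steps described above.

```python
import math


def truncated_moments(c_tilde, d_tilde):
    """Mean and variance of a standard normal truncated to [c_tilde, d_tilde],
    following Eqs. (68)-(70)."""
    alpha = math.sqrt(2.0) / (
        math.sqrt(math.pi)
        * (math.erf(d_tilde / math.sqrt(2.0)) - math.erf(c_tilde / math.sqrt(2.0))))
    ec = math.exp(-c_tilde ** 2 / 2.0)
    ed = math.exp(-d_tilde ** 2 / 2.0)
    mu = alpha * (ec - ed)                                    # Eq. (69)
    var = (alpha * (ec * (c_tilde - 2.0 * mu) - ed * (d_tilde - 2.0 * mu))
           + mu ** 2 + 1.0)                                   # Eq. (70)
    return mu, var


def truncate_scalar(x_hat, p, c, d):
    """Constrain a scalar estimate N(x_hat, p) to c <= x <= d.

    One-dimensional special case of Eqs. (63)-(74).
    """
    sd = math.sqrt(p)
    mu, var = truncated_moments((c - x_hat) / sd, (d - x_hat) / sd)  # Eqs. (65)-(66)
    return x_hat + sd * mu, p * var                                  # Eqs. (73)-(74)
```

For instance, constraining N(0, 4) to be nonnegative (with a very large upper bound) gives the half-normal mean 2√(2/π) and variance 4(1 − 2/π), which the test below uses as a reference.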

5 Numerical results

5.1 Synthetic network

In this section, a synthetic network that contains eight genes is used to test the performance of the EKF, the UKF, the CKF₃, the CKF₅, and their corresponding filters incorporating the prior information. Forty data points are collected to infer the structure of the network. The system noise and measurement noise are assumed to be zero-mean Gaussian with covariances ${\bar{Q}}_{k} = diag ([0.01 I_{8}, O_{72}])$ and R_k = 0.01 I₈, respectively. The connection coefficient matrix is given by

A = \left( \begin{array}{rrrrrrrr} 0 & 0 & 0 & 0 & 0 & 0 & 2.4 & 3.2 \\ 0 & 0 & 0 & 4.1 & 0 & - 2.4 & 0 & 4.1 \\ - 5.0 & 2.1 & - 1.5 & 0 & 4.5 & 0 & 2.1 & 0 \\ 0 & 1.3 & 2.5 & - 3.7 & 1.8 & 0 & 0 & - 3.1 \\ 0 & 0 & 0 & - 2.6 & - 3.2 & 0 & - 1 & 4 \\ - 1.5 & - 1.8 & 0 & 3.4 & 1.4 & 1.1 & 0 & 1.7 \\ - 1.8 & 0 & 0 & - 3 & 1.1 & 2.4 & 0 & 0 \\ - 1.3 & 0 & - 1 & 0 & 2.1 & 0 & 0 & 2.2 \end{array} \right)

(75)

and μ_i = 2, i = 1,⋯,8. For the filter, each coefficient in $\hat{A}$ is initialized from a Gaussian distribution with mean 0 and variance 0.2. Moreover, the coefficient μ_i is initialized from a Gaussian distribution with mean 1.5 and variance 0.2. The system state is initialized using the first measurement.

The metrics used to evaluate the inferred GRN are the true positive rate (TPR), the false positive rate (FPR), and the positive predictive value (PPV, also called precision). They are given by [34]

TPR = \frac{TP #}{TP # + FN #},

(76)

FPR = \frac{FP #}{FP # + TN #},

(77)

PPV = \frac{TP #}{TP # + FP #},

(78)

where the number of true positives (TP #) denotes the number of links correctly predicted by the inference algorithm; the number of false positives (FP #) denotes the number of incorrectly predicted links; the number of true negatives (TN #) denotes the number of correctly predicted nonlinks; and the number of false negatives (FN #) denotes the number of missed links by the inference algorithm [8].
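The three metrics in (76) to (78) can be computed directly from the true and the estimated coefficient matrices. The following Python sketch is illustrative only: the function name `link_metrics` and the threshold `tol` that decides whether an estimated coefficient counts as a link are assumptions, not a rule stated in this paper.

```python
def link_metrics(true_a, est_a, tol=0.0):
    """Return (TPR, FPR, PPV) as defined in Eqs. (76)-(78).

    A true link j -> i exists when a_ij != 0. An estimated link is declared
    when |a_ij| > tol; `tol` is an assumption of this sketch.
    """
    tp = fp = tn = fn = 0
    for row_true, row_est in zip(true_a, est_a):
        for a_true, a_est in zip(row_true, row_est):
            actual = a_true != 0
            predicted = abs(a_est) > tol
            if actual and predicted:
                tp += 1  # correctly predicted link
            elif predicted:
                fp += 1  # incorrectly predicted link
            elif actual:
                fn += 1  # missed link
            else:
                tn += 1  # correctly predicted nonlink
    nan = float("nan")
    tpr = tp / (tp + fn) if tp + fn else nan
    fpr = fp / (fp + tn) if fp + tn else nan
    ppv = tp / (tp + fp) if tp + fp else nan
    return tpr, fpr, ppv


# Toy 2 x 2 example with made-up numbers.
true_a = [[0.0, 2.0], [-1.0, 0.0]]
est_a = [[0.5, 1.8], [0.0, 0.0]]
print(link_metrics(true_a, est_a, tol=0.1))  # (0.5, 0.5, 0.5)
```

In the toy example there is one true positive, one false positive, one false negative, and one true negative, so all three ratios equal 0.5.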

5.1.1 Comparison of the EKF with point-based Gaussian approximation filters

The UKF is tested with different values of the parameter κ. The simulation results based on 50 Monte Carlo runs are shown in Table 1. It can be seen that the UKFs with κ = 0, 2, 5 perform slightly better than the UKFs with κ = -5, -2. One possible reason is that the weights of all sigma points used in the UKF are nonnegative when κ ≥ 0. In general, nonnegative weights improve the stability of the filtering algorithm. However, it should be emphasized that, in this specific example, there is no big difference between UKFs with different κ. In addition, the objective of this paper is to investigate the proposed filters incorporating the prior information. Hence, in what follows, the UKF denotes the UKF with κ = 3 - n, and it is compared with the filters incorporating the prior information.
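The sign of the sigma-point weights for each tested κ can be checked with a few lines of Python. This is a sketch under two assumptions: the weights take the standard unscented-transform form w_0 = κ/(n + κ) and w_i = 1/(2(n + κ)) for the remaining 2n points, and the augmented state dimension is n = 80, which is inferred here from the size of the covariance ${\bar{Q}}_{k}$ (8 + 72). The function name is hypothetical.

```python
def ut_weights(n, kappa):
    """Center and outer sigma-point weights of the standard unscented transform."""
    if n + kappa <= 0:
        raise ValueError("n + kappa must be positive")
    w_center = kappa / (n + kappa)
    w_outer = 1.0 / (2.0 * (n + kappa))  # shared by the other 2n points
    return w_center, w_outer


n = 80  # assumed augmented state dimension (8 states + 72 parameters)
for kappa in (-5, -2, 0, 2, 5, 3 - n):
    w_center, w_outer = ut_weights(n, kappa)
    total = w_center + 2 * n * w_outer  # the weights always sum to one
    print(kappa, w_center, w_outer, total)
```

Under these assumptions the center weight is negative for every negative κ, including κ = 3 - n, and nonnegative otherwise, while the outer weights stay positive.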

Table 1 Comparison of UKF with different κ

The inference results of the EKF, the UKF, the CKF₃, and the CKF₅ are summarized in Table 2; all results are based on 50 Monte Carlo runs. It can be seen that all point-based Gaussian approximation filters perform better than the EKF, since their average (avg) FPR is lower and their average TPR and precision are higher than those of the EKF. Although the CKFs exhibit slightly better filtering performance than the UKF in some runs, the filters are comparable in terms of TPR, FPR, and PPV.

Table 2 Comparison of different filters

Based on the above tests, in the rest of the paper, only the UKF is used.

5.1.2 Comparison of the UKF and the UKF incorporating the prior information

As mentioned above, the UKF is used as a typical filter to compare the performance with and without the prior information.

Incorporating existing network information. The following prior network information is assumed to be known: 1) gene1, gene5, and gene7 are unlikely to regulate gene2; 2) gene2, gene3, and gene8 are unlikely to regulate gene7. Hence, the indicator matrix in (50) is given by

E = \left( \begin{array}{cccccccc} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{array} \right) .

(79)

The comparison of the UKF and the UKF incorporating the existing network information (denoted by UKF_{p 1}) with λ = 2 is shown in Table 3. It can be seen that the average TP # and TN # of the UKF_{p 1} are both higher than those of the UKF. In addition, the average FP # and FN # of the UKF_{p 1} are lower than those of the UKF. Hence, the UKF_{p 1} predicts more correct links and nonlinks than the UKF, and it produces fewer incorrect links and missed links. The average TPR and the precision of the UKF_{p 1} are higher than those of the UKF. In addition, the average FPR of the UKF_{p 1} is lower than that of the UKF. Hence, by using the existing network information, the inference accuracy can be improved.
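The indexing convention of the indicator matrix (row i is the regulated gene, column j is the regulator) can be made explicit with a short Python sketch. The helper name `indicator_matrix` and the pair-list input format are assumptions chosen for illustration.

```python
def indicator_matrix(n_genes, unlikely_links):
    """Build E with e_ij = 1 when gene j is unlikely to regulate gene i.

    `unlikely_links` holds 1-based (regulator, target) pairs.
    """
    e = [[0] * n_genes for _ in range(n_genes)]
    for regulator, target in unlikely_links:
        e[target - 1][regulator - 1] = 1
    return e


# Prior statements 1) and 2) above, written as (regulator, target) pairs.
prior = [(1, 2), (5, 2), (7, 2), (2, 7), (3, 7), (8, 7)]
E = indicator_matrix(8, prior)
for row in E:
    print(" ".join(str(v) for v in row))
```

The printed rows reproduce the matrix in (79): the only nonzero entries lie in row 2 (columns 1, 5, 7) and row 7 (columns 2, 3, 8).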

Table 3 Inferred results of the conventional filter and filters incorporating the prior information

The performance of UKF_{p 1} with different λ is shown in Table 4. It is seen that the performance of UKF_{p 1} and UKF is close when λ is small, since only weak regularization is imposed on the solution. When λ is large, the difference between the UKF_{p 1} and UKF is large. In particular, the UKF_{p 1} provides a sparser solution than the UKF when λ is large. It can be seen from Table 4 that the average FPR of UKF_{p 1} decreases as λ increases. The average TPR of UKF_{p 1}, however, does not increase monotonically with λ. The average PPV of UKF_{p 1} increases with λ. Hence, roughly speaking, the UKF_{p 1} is better than the UKF when a large λ is used.

Table 4 Comparison of UKF_{p 1} using different λ

To consider the strength of the links, e_ij (in the indicator matrix E) is set to different values rather than to 1. A large e_ij is used if the strength of the link from gene j to gene i is strong. For convenience, the UKF considering the strength of links is denoted as ${UKF}_{\hat{p} 1}$ . To compare the performance of ${UKF}_{\hat{p} 1}$ with that of UKF_{p 1}, the values of the second row in Equation (79) are multiplied by 5 for ${UKF}_{\hat{p} 1}$ . The performance of ${UKF}_{\hat{p} 1}$ using different λ is given in Table 5. It can be seen from Tables 4 and 5 that the performance of ${UKF}_{\hat{p} 1}$ and UKF_{p 1} is close when λ is small, e.g., λ = 0.1. In addition, the average TPR and FPR of ${UKF}_{\hat{p} 1}$ are smaller than those of UKF_{p 1} for all tested λ except λ = 0.1. Hence, the PPV is used to evaluate the performance of ${UKF}_{\hat{p} 1}$ and UKF_{p 1}. Although the average PPV of ${UKF}_{\hat{p} 1}$ and UKF_{p 1} is close when λ is large, e.g., λ = 10, the average PPV of ${UKF}_{\hat{p} 1}$ is consistently higher than that of UKF_{p 1}. The results indicate that the inference accuracy of ${UKF}_{\hat{p} 1}$ and UKF_{p 1} is close when λ is very small or very large, and that ${UKF}_{\hat{p} 1}$ outperforms UKF_{p 1} when an appropriate link strength and parameter λ are used.

Table 5 Effect of strength of the links using different λ

To consider the effect of false prior knowledge, the second row of the indicator matrix in Equation (79) is changed to [0,1,1,1,0,1,0,1], which conflicts with the truth. For convenience, we use ${UKF}_{\bar{p} 1}$ to denote the UKF incorporating this false prior knowledge. In Table 6, the performance of ${UKF}_{\bar{p} 1}$ with different λ is shown. It can be seen from Tables 4 and 6 that the average TPR of ${UKF}_{\bar{p} 1}$ is smaller than that of UKF_{p 1} when λ is small, e.g., λ = 0.1, 0.5. In addition, the average FPR of ${UKF}_{\bar{p} 1}$ is larger than that of UKF_{p 1} when λ is large, e.g., λ = 5, 10. Moreover, although the average PPV of ${UKF}_{\bar{p} 1}$ is close to that of UKF_{p 1} when λ is small, the average PPV of ${UKF}_{\bar{p} 1}$ is consistently lower than that of UKF_{p 1}. Hence, as expected, the results indicate that false prior knowledge leads to worse inference results.
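Where the false prior contradicts the true network can be read off by comparing the modified second row of E with the second row of A in (75). The following Python sketch does this comparison; the function name is hypothetical.

```python
# Second row of the true matrix A in Eq. (75): coefficients of the regulators of gene2.
a_row2 = [0, 0, 0, 4.1, 0, -2.4, 0, 4.1]
# Second row of E after it is replaced by the false prior.
false_e_row2 = [0, 1, 1, 1, 0, 1, 0, 1]


def conflicting_regulators(a_row, e_row):
    """1-based indices j that the prior flags as unlikely although a_ij != 0."""
    return [j + 1 for j, (a, e) in enumerate(zip(a_row, e_row)) if e and a != 0]


print(conflicting_regulators(a_row2, false_e_row2))  # [4, 6, 8]
```

The false prior therefore penalizes all three true regulators of gene2 (gene4, gene6, and gene8), whereas the entries for gene2 and gene3 happen to agree with the true network.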

Table 6 Effect of false prior information using different λ

Incorporating LASSO. The problem setup is the same as before except that the LASSO rather than the existing network information is used. The UKF incorporating LASSO is denoted as UKF_{p 2}.

As shown in Table 3, the average TP # and FP # of UKF_{p 2} are lower than those of UKF, and the average TN # and FN # of UKF_{p 2} are higher than those of UKF. Hence, UKF_{p 2} produces fewer links, both correct and incorrect ones, and more nonlinks and missed links. This is consistent with the fact that the LASSO tends to provide a sparse solution. It can be seen from Table 3 that the average FPR of UKF_{p 2} is lower than that of UKF and the average precision of UKF_{p 2} is higher than that of UKF. Hence, by incorporating LASSO, the inference accuracy is improved.
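Why an ℓ1 penalty removes links can be illustrated with the soft-thresholding operator that underlies iterative thresholding methods. The Python sketch below is a generic illustration with made-up coefficient values, not the algorithm used in this paper; the function names and the optional per-coefficient `weights` argument are assumptions.

```python
def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold`; small values become exactly zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def shrink(coeffs, lam, weights=None):
    """Apply soft thresholding with per-coefficient thresholds lam * weight."""
    if weights is None:
        weights = [1.0] * len(coeffs)  # plain LASSO: equal penalty on every coefficient
    return [soft_threshold(a, lam * w) for a, w in zip(coeffs, weights)]


estimates = [0.3, -2.4, 0.05, 4.1, -0.8]  # made-up coefficient estimates
zero_counts = {}
for lam in (0.1, 0.5, 1.0):
    shrunk = shrink(estimates, lam)
    zero_counts[lam] = sum(1 for a in shrunk if a == 0.0)
print(zero_counts)  # {0.1: 1, 0.5: 2, 1.0: 3}
```

A larger λ sets more of the small coefficients exactly to zero, which removes weak links whether they are true or false, while the large coefficients survive with reduced magnitude.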

A representative inference result of UKF_{p 2} and the true regulations are shown in Figure 1. For comparison, the inference result of UKF and the true regulations are shown in Figure 2. By comparing Figures 1 and 2, it can be seen that UKF falsely predicts links where the true network has none (from gene1 to gene2, from gene3 to gene6, from gene4 to gene8, from gene5 to gene2, and from gene6 to gene4), whereas UKF_{p 2} does not.

The performance of UKF_{p 2} with different λ is shown in Table 7. It is seen that the performance of UKF_{p 2} and UKF is close when λ is small, since only weak regularization is imposed on the solution. When λ is large, the difference between UKF_{p 2} and UKF is large. The average TPR and FPR of UKF_{p 2} decrease as λ increases. The average PPV does not increase monotonically with λ. Generally speaking, UKF_{p 2} is more sensitive to λ than UKF_{p 1}. Although the performance of UKF_{p 2} depends on λ, the average PPV of UKF_{p 2} is consistently higher than that of UKF. Hence, roughly speaking, UKF_{p 2} has better performance than UKF.

Table 7 Comparison of UKF_{p 2} using different λ

Incorporating the range constraint. The existing network information can be used to provide a rough range constraint on $\bar{x}$ . A tight constraint is imposed on the regulation coefficient a_ij when gene j is unlikely to regulate gene i, and a loose constraint is imposed on the regulation coefficients with no prior information. In the simulation, for the coefficients corresponding to the zero elements in (79), the lower bound and the upper bound are set as -10 and 10, respectively. For the coefficients corresponding to the nonzero elements in (79), the lower bound and the upper bound are set as -0.1 and 0.1, respectively. The UKF incorporating the range constraint is denoted as UKF_{p 3}. As shown in Table 3, the average FPR of UKF_{p 3} is lower than that of UKF and the average precision of UKF_{p 3} is higher than that of UKF.
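The mapping from the indicator matrix to per-coefficient bounds can be written down in a few lines of Python. This is a sketch: the function name `range_bounds` and its keyword arguments are hypothetical, while the numerical bounds (0.1 and 10) are the ones stated above.

```python
def range_bounds(e, tight=0.1, loose=10.0):
    """Per-coefficient (lower, upper) bounds: tight where e_ij != 0, loose elsewhere."""
    upper = [[tight if flag else loose for flag in row] for row in e]
    lower = [[-u for u in row] for row in upper]
    return lower, upper


# Indicator matrix of Eq. (79): ones in row 2 (columns 1, 5, 7) and row 7 (columns 2, 3, 8).
flagged = {(2, 1), (2, 5), (2, 7), (7, 2), (7, 3), (7, 8)}  # 1-based (row i, column j)
E = [[1 if (i, j) in flagged else 0 for j in range(1, 9)] for i in range(1, 9)]

lower, upper = range_bounds(E)
print(lower[1][0], upper[1][0])  # bounds on a_21 (flagged as unlikely)
print(lower[0][0], upper[0][0])  # bounds on a_11 (no prior information)
```

Each pair of bounds would then play the role of c_i and d_i for the corresponding coefficient in the PDF truncation step.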

5.2 Yeast protein synthesis network

In this section, time-series gene expression data of the yeast protein synthesis network are used. Five genes (HAP1, CYB2, CYC7, CYT1, and COX5A) of the yeast protein synthesis network are considered, and 17 data points, which can be found in [35], are collected. The regulation relationships between them have been revealed by biological experiments and are shown in Figure 3. The dashed arrow in Figure 3 denotes ‘repression’ and the solid arrow denotes ‘activation.’

The GRN is inferred by the UKF and UKF_{p 2}. The predicted gene expressions using parameters estimated by UKF_{p 2} and the true measured gene expressions are shown in Figure 4. It can be seen that the model output fits the measured data well. The variances of the regulatory coefficients of HAP1 (P_1i (1 ≤ i ≤ 5)) are shown in Figure 5. It can be seen that the filter converges since the variance P_1i approaches zero. The results for other regulatory coefficients are similar and not shown here. The evaluation of the inferred GRN by UKF and UKF_{p 2} is shown in Table 8.

Table 8 Inferred results of the UKF and UKF_{p 2}

By incorporating the sparsity constraint, UKF_{p 2} provides much better inference results than UKF. As shown in Table 8, the TP # and TN # of UKF_{p 2} are higher than those of UKF, and the FP # and FN # are lower than those of UKF. In addition, it can be seen from Table 8 that the FPR of UKF_{p 2} is lower than that of UKF and the TPR and the precision of UKF_{p 2} are higher than those of UKF.

6 Conclusions

In this paper, we have proposed a framework that employs point-based Gaussian approximation filters incorporating prior knowledge to infer the gene regulatory network (GRN) from gene expression data. The performance of the proposed framework is tested on a synthetic network and the yeast protein synthesis network. Numerical results show that the inference accuracy of the proposed point-based Gaussian approximation filters incorporating the prior information is higher than that of the traditional filters that do not incorporate prior knowledge. Owing to its computational complexity, the proposed method is suited to small- and medium-size GRNs. How to adapt the proposed inference framework to handle large GRNs at reasonable computational complexity remains a topic for future research.

References

1. Zou M, Conzen SD: A new dynamic Bayesian network (dbn) approach for identifying gene regulatory networks from time course microarray data. Bioinformatics. 2005, 21 (1): 71-79.
2. Zhou X, Wang X, Pal R, Ivanov I, Bittner M, Dougherty ER: A Bayesian connectivity-based approach to constructing probabilistic gene regulatory networks. Bioinformatics. 2004, 20 (17): 2918-2927.
3. Quach M, Brunel N, d’Alché Buc F: Estimating parameters and hidden variables in non-linear state-space models based on odes for biological networks inference. Bioinformatics. 2007, 23 (23): 3209-3216.
4. Wang Z, Liu X, Liu Y, Liang J, Vinciotti V: An extended Kalman filtering approach to modeling nonlinear dynamic gene regulatory networks via short gene expression time series. Comput. Biol. Bioinformatics, IEEE/ACM Trans. 2009, 6 (3): 410-419.
5. Wu X, Li P, Wang N, Gong P, Perkins EJ, Deng Y, Zhang C: State space model with hidden variables for reconstruction of gene regulatory networks. BMC Syst Biol. 2011, 5 (Suppl 3): S3. 10.1186/1752-0509-5-S3-S3.
6. Werhli AV, Husmeier D: Reconstructing gene regulatory networks with Bayesian networks by combining expression data with multiple sources of prior knowledge. Stat. Appl. Genet. Mol. Biol. 2007, 6: Article 15.
7. Mazur J, Ritter D, Reinelt G, Kaderali L: Reconstructing nonlinear dynamic models of gene regulation using stochastic sampling. BMC Bioinformatics. 2009, 10: 448.
8. Noor A, Serpedin E, Nounou M, Nounou H: Inferring gene regulatory networks via nonlinear state-space models and exploiting sparsity. Comput. Biol. Bioinformatics, IEEE/ACM Trans. 2012, 9 (4): 1203-1211.
9. Hecker M, Lambeck S, Toepfer S, van Someren E, Guthke R: Gene regulatory network inference: Data integration in dynamic models - a review. Biosystems. 2009, 96 (1): 86-103.
10. Markowetz F, Spang R: Inferring cellular networks - a review. BMC Bioinformatics. 2007, 8 (Suppl 6): S5. 10.1186/1471-2105-8-S6-S5.
11. Huang Y, Tienda-Luna I, Wang Y: Reverse engineering gene regulatory networks. Signal Process. Mag., IEEE. 2009, 26 (1): 76-97.
12. de Jong H: Modeling and simulation of genetic regulatory systems: a literature review. J. Comput. Biol. 2002, 9: 67-103.
13. Julier SJ, Uhlmann JK: Unscented filtering and nonlinear estimation. Proc. IEEE. 2004, 92 (3): 401-422. 10.1109/JPROC.2003.823141.
14. Ito K, Xiong K: Gaussian filters for nonlinear filtering problems. Automatic Control, IEEE Trans. 2000, 45 (5): 910-927. 10.1109/9.855552.
15. Arasaratnam I, Haykin S: Cubature kalman filters. Automatic Control, IEEE Trans. 2009, 54 (6): 1254-1269.
16. Jia B, Xin M, Cheng Y: Sparse-grid quadrature nonlinear filtering. Automatica. 2012, 48 (2): 327-341. 10.1016/j.automatica.2011.08.057.
17. Arulampalam M, Maskell S, Gordon N, Clapp T: A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. Signal Process., IEEE Trans. 2002, 50 (2): 174-188. 10.1109/78.978374.
18. Shen X, Vikalo H: Inferring parameters of gene regulatory networks via particle filtering. EURASIP J. Adv. Signal Process. 2010, 2010: 204612. 10.1155/2010/204612.
19. Steele E, Tucker A, ‘t Hoen PA, Schuemie M: Literature-based priors for gene regulatory networks. Bioinformatics. 2009, 25 (14): 1768-1774.
20. Christley S, Nie Q, Xie X: Incorporating existing network information into gene network inference. PLoS ONE. 2009, 4 (8): e6799.
21. Tamada Y, Kim S, Bannai H, Imoto S, Tashiro K, Kuhara S, Miyano S: Estimating gene networks from gene expression data by combining Bayesian network model with promoter element detection. Bioinformatics. 2003, 19 (suppl 2): 227-236.
22. Li H, Zhan M: Unraveling transcriptional regulatory programs by integrative analysis of microarray and transcription factor binding data. Bioinformatics. 2008, 24 (17): 1874-1880.
23. Bouaynaya N, Shterenberg R, Schonfeld D: Methods for optimal intervention in gene regulatory networks [applications corner]. Signal Process. Mag., IEEE. 2012, 29 (1): 158-163.
24. Chen L, Aihara K: Chaos and asymptotical stability in discrete-time neural networks. Physica D: Nonlinear Phenomena. 1997, 104 (3): 286-325.
25. Qian L, Wang H, Dougherty ER: Inference of noisy nonlinear differential equation models for gene regulatory networks using genetic programming and Kalman filtering. Signal Process., IEEE Trans. 2008, 56 (7): 3327-3339.
26. Vohradsky J: Neural model of the genetic network. J. Biol. Chem. 2001, 276 (39): 36168-36173.
27. Mjolsness E, Mann T, Castano R, Wold B: From coexpression to coregulation: an approach to inferring transcriptional regulation among gene classes from large-scale expression data. in Advances in Neural Information Processing Systems. 1999, 12: 928-934.
28. Nørgaard M, Poulsen NK, Ravn O: New developments in state estimation for nonlinear systems. Automatica. 2000, 36 (11): 1627-1638. 10.1016/S0005-1098(00)00089-3.
29. Mysovskikh IP: The Approximation of Multiple Integrals by Using Interpolatory Cubature Formulae in Quantitative Approximation, ed. by R DeVore, K Scherer. 1980, Academic Press, New York.
30. Jazwinski AH: Stochastic Processes and Filtering Theory. 2007, Academic Press Inc., Waltham, MA.
31. Teixeira BO, Tôrres LA, Aguirre LA, Bernstein DS: On unscented Kalman filtering with state interval constraints. J. Process Control. 2010, 20 (1): 45-57. 10.1016/j.jprocont.2009.10.007.
32. Wright S, Nowak R, Figueiredo M: Sparse reconstruction by separable approximation. Signal Process., IEEE Trans. 2009, 57 (7): 2479-2493.
33. Simon D, Simon DL: Constrained Kalman filtering via density function truncation for turbofan engine health estimation. Int. J. Syst. Sci. 2010, 41 (2): 159-171. 10.1080/00207720903042970.
34. Emmert-Streib F, Dehmer M: Analysis of Microarray Data. 2008, Wiley-Blackwell, Hoboken, NJ.
35. Wang H, Qian L, Dougherty E: Inference of gene regulatory networks using s-system: a unified approach. Syst. Biol., IET. 2010, 4 (2): 145-156. 10.1049/iet-syb.2008.0175.

Author information

Authors and Affiliations

Intelligent Fusion Technology, Germantown, MD, 20876, USA
Bin Jia
Department of Electrical Engineering, Columbia University, New York, NY, 10027, USA
Xiaodong Wang

Corresponding author

Correspondence to Xiaodong Wang.

Additional information

Competing interests

All authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Jia, B., Wang, X. Gene regulatory network inference by point-based Gaussian approximation filters incorporating the prior information. J Bioinform Sys Biology 2013, 16 (2013). https://doi.org/10.1186/1687-4153-2013-16

Received: 25 July 2013
Accepted: 11 November 2013
Published: 17 December 2013
DOI: https://doi.org/10.1186/1687-4153-2013-16
