Andre Fujita, Kaname Kojima, Alexandre G. Patriota, Joao R. Sato, Patricia Severino, Satoru Miyano
Summary: We propose a likelihood ratio test (LRT) with Bartlett correction to identify Granger causality between sets of time series gene expression data. The performance of the proposed test is compared with a previously published bootstrap-based approach. The LRT is shown to be significantly faster and to remain statistically powerful even under non-Normal distributions. An R package named gGranger, containing an implementation of both Granger causality identification tests, is also provided.
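To make the idea of a likelihood ratio test for Granger causality concrete, here is a minimal Python sketch of the simplest scalar case: one predictor series, one target series, one lag, zero-mean data. It is not the gGranger implementation. It does not handle sets of genes and applies no Bartlett correction. The function names (`ols_rss`, `granger_lrt`, `simulate`) and the simulation coefficients are hypothetical and chosen only for illustration.

```python
import math
import random


def ols_rss(columns, target):
    """Residual sum of squares of a least-squares fit without intercept.

    Handles one or two regressors, which is all this sketch needs.
    """
    if len(columns) == 1:
        (u,) = columns
        beta = sum(a * t for a, t in zip(u, target)) / sum(a * a for a in u)
        return sum((t - beta * a) ** 2 for a, t in zip(u, target))
    u, v = columns
    suu = sum(a * a for a in u)
    svv = sum(b * b for b in v)
    suv = sum(a * b for a, b in zip(u, v))
    suy = sum(a * t for a, t in zip(u, target))
    svy = sum(b * t for b, t in zip(v, target))
    det = suu * svv - suv * suv
    # Closed-form solution of the 2x2 normal equations.
    bu = (suy * svv - svy * suv) / det
    bv = (svy * suu - suy * suv) / det
    return sum((t - bu * a - bv * b) ** 2 for a, b, t in zip(u, v, target))


def granger_lrt(x, y):
    """LRT of the null hypothesis 'x does not Granger cause y' (one lag).

    Restricted model: y[t] = a * y[t-1] + e[t]
    Full model:       y[t] = a * y[t-1] + b * x[t-1] + e[t]
    Both series are assumed to be zero-mean (no intercept is fitted).
    """
    y_now, y_lag, x_lag = y[1:], y[:-1], x[:-1]
    rss_restricted = ols_rss([y_lag], y_now)
    rss_full = ols_rss([y_lag, x_lag], y_now)
    n = len(y_now)
    statistic = n * math.log(rss_restricted / rss_full)
    # One restriction (b = 0), so the statistic is asymptotically
    # chi-square with 1 d.f.; its survival function is erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(max(statistic, 0.0) / 2.0))
    return statistic, p_value


def simulate(n, coupling, seed):
    """Hypothetical toy data: x drives y with the given lag-1 coupling."""
    rng = random.Random(seed)
    x, y = [0.0], [0.0]
    for _ in range(n - 1):
        x_next = 0.5 * x[-1] + rng.gauss(0.0, 1.0)
        y_next = 0.3 * y[-1] + coupling * x[-1] + rng.gauss(0.0, 1.0)
        x.append(x_next)
        y.append(y_next)
    return x, y


if __name__ == "__main__":
    x, y = simulate(200, 0.5, seed=1)
    print("x -> y:", granger_lrt(x, y))
    print("y -> x:", granger_lrt(y, x))
```

In the set-to-set setting described above, the residual sums of squares would be replaced by determinants of residual covariance matrices, and the degrees of freedom would equal the number of coefficients set to zero under the null hypothesis.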
Contact: andrefujita AT riken DOT jp
Installation: To install gGranger, download the appropriate file below and run the following command in a terminal: R CMD INSTALL <file name>
R packages | |
---|---|
for Windows | ggranger_win_1.0.0.tar.gz |
for Linux | ggranger_linux_1.0.0.tar.gz |
Original paper (Identification of Granger causality between gene sets)
Paper (preprint): grangerforgroups.pdf DOI:10.1142/S0219720010004860
Supplementary files
Simulation 1 (Multivariate model): This simulation illustrates a multivariate (standard) case where nine (predictor) genes Granger cause one (target) gene. The weights are shown on the edges of the network. Gene x_{9} does not Granger cause gene y_{t}. (Figure 1: figure1suppl.pdf)
Simulation 2 (Module-module model): This simulation illustrates a module-module case where one set of genes Granger causes another set of genes. The details of this simulation are explained in the original paper (page 9, simulation 1 of grangerforgroups.pdf). (Figure 2: figure2suppl.pdf)
Simulation 2 with different noises
The following figures illustrate the p-value distributions for both the bootstrap and the likelihood ratio test procedures under the null hypothesis (set I to set III; set II to set I; set III to set and set III to set III), and the ROC curves for simulation 2 of the manuscript. In the ROC curves, solid lines represent the LRT and dashed lines the bootstrap test. The results show that both the LRT and the bootstrap procedures effectively control the rate of false positives (p-value histograms close to uniform distributions).
- Simulations with Gaussian noises (N(0,1)).
Time series length | Bootstrap test | Likelihood ratio test | ROC curve | |
---|---|---|---|---|
75 | pvaluebootfp-75.pdf | pvaluelikefp-75.pdf | roc-75.pdf | |
100 | pvaluebootfp-100.pdf | pvaluelikefp-100.pdf | roc-100.pdf |
- Simulations with Uniform noises (U(-0.5,0.5)).
Time series length | Bootstrap test | Likelihood ratio test | ROC curve | |
---|---|---|---|---|
200 | pvaluebootfp-uniform.pdf | pvaluelikefp-uniform.pdf | roc-uniform.pdf |
- Simulations with Exponential noises (Exp(1)-1, i.e. with zero mean).
Time series length | Bootstrap test | Likelihood ratio test | ROC curve | |
---|---|---|---|---|
75 | pvaluebootfp-75exp.pdf | pvaluelikefp-75exp.pdf | roc-75exp.pdf |
- Simulations with Gamma noises (Gamma(1,1)-1).
Time series length | Bootstrap test | Likelihood ratio test | ROC curve | |
---|---|---|---|---|
100 | pvaluebootfp-gamma.pdf | pvaluelikefp-gamma.pdf | roc-gamma.pdf |
- Simulations with half-normal noises (abs(N(0,1))-sqrt(1/Pi)).
Time series length | Bootstrap test | Likelihood ratio test | ROC curve | |
---|---|---|---|---|
100 | pvaluebootfp-halfnormal.pdf | pvaluelikefp-halfnormal.pdf | roc-halfnormal.pdf |
- Simulations with t-Student noises (d.f.=3).
Time series length | Bootstrap test | Likelihood ratio test | ROC curve | |
---|---|---|---|---|
100 | pvaluebootfp-tstudentdf3.pdf | pvaluelikefp-tstudentdf3.pdf | roc-tstudentdf3.pdf |
- Simulations with t-Student noises (d.f.=7).
Time series length | Bootstrap test | Likelihood ratio test | ROC curve | |
---|---|---|---|---|
100 | pvaluebootfp-tstudentdf7.pdf | pvaluelikefp-tstudentdf7.pdf | roc-tstudentdf7.pdf |
- Simulations with multivariate t-Student noises (d.f.=3).
Time series length | Bootstrap test | Likelihood ratio test | ROC curve | |
---|---|---|---|---|
100 | pvaluebootfp-tstudentxxxdf3.pdf | pvaluelikefp-tstudentxxxdf3.pdf | roc-tstudentxxxdf3.pdf |
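The noise distributions listed above can be drawn with Python's standard `random` module, as in the sketch below. This is an illustration, not the code used to produce the figures. The helper names (`make_noise_samplers`, `multivariate_t`) are hypothetical, the offsets are copied as written on this page, and the identity scale matrix for the multivariate t-Student case is an assumption, since the page does not specify one.

```python
import math
import random


def make_noise_samplers(rng):
    """Return a dict mapping a label to a zero-argument noise sampler."""

    def student_t(df):
        # t = Z / sqrt(chi2 / df), where chi2 with df degrees of freedom
        # is Gamma(shape=df/2, scale=2).
        chi2 = rng.gammavariate(df / 2.0, 2.0)
        return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / df)

    return {
        "gaussian": lambda: rng.gauss(0.0, 1.0),
        "uniform": lambda: rng.uniform(-0.5, 0.5),
        "exponential": lambda: rng.expovariate(1.0) - 1.0,
        "gamma": lambda: rng.gammavariate(1.0, 1.0) - 1.0,
        # Offset sqrt(1/pi) is taken verbatim from the list above.
        "half-normal": lambda: abs(rng.gauss(0.0, 1.0)) - math.sqrt(1.0 / math.pi),
        "t-student-df3": lambda: student_t(3),
        "t-student-df7": lambda: student_t(7),
    }


def multivariate_t(rng, dim, df):
    """One draw from a multivariate t with identity scale (an assumption).

    All coordinates share the same chi-square divisor, which makes them
    uncorrelated but not independent.
    """
    divisor = math.sqrt(rng.gammavariate(df / 2.0, 2.0) / df)
    return [rng.gauss(0.0, 1.0) / divisor for _ in range(dim)]


if __name__ == "__main__":
    rng = random.Random(0)
    samplers = make_noise_samplers(rng)
    for name, draw in samplers.items():
        sample = [draw() for _ in range(10000)]
        print(name, "sample mean:", sum(sample) / len(sample))
    print("multivariate t draw:", multivariate_t(rng, 3, 3))
```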
Verifying the control of type I error in actual biological data
To verify that the LRT controls the rate of false positives even in actual biological data, we selected the same genes used in (grangerforgroups.pdf) and permuted the values of the time series, thereby eliminating any Granger causality that may exist among them. The LRT-based test for Granger causality between sets was then applied to the permuted data. In hela-random.pdf one can observe that the type I error is effectively controlled by the LRT, since all the p-value histograms are close to uniform distributions (under the null hypothesis). This experiment was repeated 10,000 times.
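The permutation check described above can be sketched in Python as follows. The sketch assumes that "permuted" means shuffling each series independently in time. To stay short and self-contained, it uses a simple stand-in test (a two-sided z-test on the lag-1 cross-correlation) rather than the LRT with Bartlett correction; any function returning a p-value for a pair of series could be passed instead. All function names and the toy data are hypothetical.

```python
import math
import random


def lagged_correlation_pvalue(x, y):
    """Stand-in test: two-sided z-test on corr(x[t-1], y[t])."""
    a, b = x[:-1], y[1:]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    saa = sum((u - mean_a) ** 2 for u in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    r = sab / math.sqrt(saa * sbb)
    # Under independence, r * sqrt(n) is approximately N(0, 1).
    return math.erfc(abs(r) * math.sqrt(n) / math.sqrt(2.0))


def null_pvalues(x, y, test, n_rep, rng):
    """Shuffle both series in time, then re-run the test, n_rep times."""
    pvalues = []
    for _ in range(n_rep):
        xs, ys = x[:], y[:]
        rng.shuffle(xs)  # destroys any temporal dependence in x
        rng.shuffle(ys)  # and in y, so the null hypothesis holds
        pvalues.append(test(xs, ys))
    return pvalues


def histogram(pvalues, bins=10):
    """Counts of p-values in equal-width bins over [0, 1]."""
    counts = [0] * bins
    for p in pvalues:
        counts[min(int(p * bins), bins - 1)] += 1
    return counts


def toy_series(n, seed):
    """Hypothetical coupled pair: x drives y at lag 1."""
    rng = random.Random(seed)
    x, y = [0.0], [0.0]
    for _ in range(n - 1):
        x_next = 0.5 * x[-1] + rng.gauss(0.0, 1.0)
        y_next = 0.5 * x[-1] + rng.gauss(0.0, 1.0)
        x.append(x_next)
        y.append(y_next)
    return x, y


if __name__ == "__main__":
    x, y = toy_series(100, seed=3)
    print("p-value on original data:", lagged_correlation_pvalue(x, y))
    pvals = null_pvalues(x, y, lagged_correlation_pvalue, 2000, random.Random(7))
    print("histogram under permutation:", histogram(pvals))
    print("rejection rate at 0.05:", sum(p < 0.05 for p in pvals) / len(pvals))
```

If the test controls the type I error, the bin counts should be roughly equal and the rejection rate at 0.05 should be close to 5%.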
Application to actual biological data
The following figure (hela-network.pdf) illustrates the application of Granger causality for sets of genes with the LRT and Bartlett correction. The coefficients estimated by the method are shown on the edges. Solid lines represent statistically significant Granger causalities with p-value < 0.05, and dashed lines those with p-value < 0.10.