Table of Contents

Projects

Cell Illustrator

The aim of the Cell Illustrator project is to supply a standard in silico tool for biologists working at the bench. With easy drag-and-drop operations, users can model, simulate, visualize, and analyze their favorite biological pathways, e.g. metabolic pathways, signal transduction pathways, gene regulatory networks, and cell-cell interactions.

Cell System Markup Language (CSML)

The aim of the Cell System Markup Language (CSML) project is to develop a useful XML format for modeling, visualizing, and simulating biological pathways, e.g. metabolic pathways, signal transduction pathways, gene regulatory networks, and cell-cell interactions. Another aim is to develop infrastructure around the CSML format, e.g. a parser, simulator, editor, reporter, and analyzer, as well as support and documentation for those who would like to develop their own applications using this format.

Statistical Analysis for Gene Expression Data

Clustering of Gene Expression Data

An open source library for hierarchical clustering, k-means, and self-organizing maps (SOM) was developed in our laboratory. This library is written in C and is formally used by NCBI. We also developed a model-based clustering method called Mixed Factors Analysis and produced the software ArrayCluster as an implementation.
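
To illustrate the kind of computation involved, the following is a minimal sketch of k-means (Lloyd's algorithm) applied to expression profiles. It is written in Python for brevity and is not the laboratory's C library; the function name `kmeans`, its parameters, and the toy profiles are assumptions made for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, iterations=50, seed=0):
    """Group expression profiles into k clusters with Lloyd's algorithm.

    profiles: list of equal-length lists (one expression profile per gene).
    Returns (labels, centers). Hypothetical helper, for illustration only.
    """
    rng = random.Random(seed)
    # Start from k distinct profiles chosen at random.
    centers = [list(p) for p in rng.sample(profiles, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(profiles, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy data: two genes with low expression, two with high expression.
toy_profiles = [[0.1, 0.2, 0.1], [0.0, 0.3, 0.2], [5.0, 5.1, 4.9], [5.2, 4.8, 5.0]]
toy_labels, toy_centers = kmeans(toy_profiles, k=2)
```

On the toy data the two low-expression genes end up in one cluster and the two high-expression genes in the other; which cluster receives which index depends on the random initialization.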

Estimation of Gene Regulatory Network

Using gene expression data measured by GeneChip or other microarrays, we focus on estimating gene regulatory networks. Computational methodologies enhanced with cutting-edge statistical methods, including Bayesian networks, Lasso regression, and state space models, were developed in order to construct large-scale gene networks. As an application of gene networks, we focus on computational drug target discovery.
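
As one illustration of how Lasso regression can suggest network edges, the sketch below regresses a single target gene on candidate regulator genes and keeps the regulators whose coefficients are nonzero. It is a generic coordinate-descent Lasso under stated assumptions (centered data, no intercept), not the laboratory's actual method; the names `lasso_coordinate_descent`, `alpha`, and the toy matrix are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, sweeps=200):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by coordinate descent.

    X: list of samples, each a list of regulator expression values (centered).
    y: expression of the target gene in the same samples (centered).
    Returns the coefficient list w; nonzero entries suggest candidate edges.
    """
    n, p = len(X), len(X[0])
    columns = [[row[j] for row in X] for j in range(p)]
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, and w starts at zero
    for _ in range(sweeps):
        for j in range(p):
            col = columns[j]
            scale = sum(v * v for v in col) / n
            if scale == 0.0:
                continue  # a constant-zero regulator carries no information
            # Correlation of regulator j with the residual that excludes j.
            rho = sum(c * (r + c * w[j]) for c, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, alpha) / scale
            delta = new_w - w[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
                w[j] = new_w
    return w


# Toy data: the target follows regulator 0 and ignores regulator 1.
toy_X = [[1.0, 1.0], [-1.0, 1.0], [2.0, -1.0], [-2.0, -1.0]]
toy_y = [2.0, -2.0, 4.0, -4.0]
toy_w = lasso_coordinate_descent(toy_X, toy_y, alpha=0.1)
```

Repeating this regression with each gene in turn as the target gives one sparse set of candidate regulators per gene. The penalty `alpha` controls how many edges survive and would need to be chosen by cross-validation or a similar criterion.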

Analysis of Exon Array Data

The Exon Array, a very high-density array, enables us to measure the expression values of more than 1,000,000 exons simultaneously. An ordinary personal computer is not sufficient for high-level statistical analysis of such data. In our laboratory, a statistical method was developed for identifying aberrant splice variations in tumor cells. We implemented the method on the supercomputer system of the Human Genome Center as ExonMiner, a publicly available web service.

Gene Association Analysis

In significance analysis of microarrays, we usually define significant gene sets based on statistical testing. For evaluating such gene sets biologically, many software packages and web applications, such as GO::TermFinder, have been developed. However, other methods use a ranking of genes based on the p-values of statistical testing to evaluate the significance of each GO term. We elucidated a problem in these rank-based methods and developed a statistical method to solve it. The proposed method is called MetaGP and has been provided as a web service.
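
For context, the gene-set approach mentioned first is commonly implemented as a hypergeometric test: given a list of significant genes, how surprising is its overlap with the genes annotated to one GO term? The sketch below shows that standard test only; it is not MetaGP, and the function and parameter names are assumptions made for this example.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """Upper-tail hypergeometric p-value for GO term enrichment.

    population: number of genes on the array
    annotated:  number of those genes annotated to the GO term
    selected:   number of genes called significant
    overlap:    number of significant genes annotated to the GO term
    Returns P(X >= overlap) when the significant genes are drawn at random.
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy case: 10 genes, 3 annotated to the term, 3 selected, all 3 overlap.
toy_p = enrichment_p_value(population=10, annotated=3, selected=3, overlap=3)
```

In the toy case only one of the 120 possible selections contains all three annotated genes, so the p-value is 1/120. In practice one such test is run per GO term, and the resulting p-values need correction for multiple testing.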

Data Assimilation for Biological System Networks

Data assimilation is called “The Fourth Science”, which “blends” simulation models and observational data “rationally”. Our key issue is to develop and fuse data analysis technology and biological system simulation technology.
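
A small example may make the idea of blending concrete. The sketch below is a one-dimensional Kalman filter, one of the simplest data assimilation schemes: a linear simulation model forecasts the state, and each observation pulls the forecast toward the data in proportion to their relative uncertainties. The model, the parameter names (`a`, `q`, `r`), and the toy observations are assumptions for illustration and do not describe the laboratory's methods.

```python
def kalman_filter_1d(observations, a, q, r, x0, p0):
    """Assimilate scalar observations into the model x[t] = a * x[t-1] + noise.

    a:  model coefficient (the "simulation" step)
    q:  variance of the model noise
    r:  variance of the observation noise
    x0: initial state estimate, p0: its variance
    An observation of None means no data at that time step.
    Returns the list of state estimates after each step.
    """
    x, p = x0, p0
    estimates = []
    for z in observations:
        # Forecast step: run the simulation model forward.
        x = a * x
        p = a * a * p + q
        # Analysis step: blend the forecast with the observation, if any.
        if z is not None:
            gain = p / (p + r)  # near 1 trusts the data, near 0 trusts the model
            x = x + gain * (z - x)
            p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Toy run: a constant model, three identical observations of 1.0.
toy_estimates = kalman_filter_1d([1.0, 1.0, 1.0], a=1.0, q=0.0, r=1.0, x0=0.0, p0=1.0)
```

In the toy run the estimate moves from the initial guess of 0.0 toward the observed value of 1.0 in successively smaller steps (0.5, then about 0.667, then 0.75), because each assimilated observation reduces the remaining uncertainty.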