My main research interest is identifying causal interactions in biological systems from whole-genome studies. We use two approaches to infer causation: (1) the chronology of events observed in dynamical systems, especially time-dependent gene expression, and (2) non-Gaussian patterns in joint probability distributions.

Timing of cell-cycle regulated gene transcription: We develop computational methods for analyzing time-course gene expression data, based on maximum a posteriori (MAP) optimization and the Maximum Entropy principle. We have designed and implemented an algorithm that deconvolves the measured culture-average profiles and recovers the single-cell expression profile of each cell-cycle regulated gene. Peaks of transcripts regulated by the yeast cell cycle were recovered with a precision an order of magnitude better than the resolution of the source data. We have identified a previously undescribed pre-replicative (G1/P) wave of transcription of cell-cycle genes. Our results have provided new insight into the assembly and dynamics of molecular complexes involved in mitotic cell division (e.g. MCM, ORC), and have revealed transcriptional regulation of genes previously thought to be constitutively expressed, such as Cdc28/Cdk1, the master cell-cycle regulator.

Cell-cycle regulation in different species and conditions: We apply the deconvolution method to compare the temporal organization of cell-cycle events in different species and under different experimental conditions (e.g. high and low nutrient levels, healthy and disease states, cell cultures representing individuals of different ages). We identify the regulatory modules that are preserved across them. By analyzing the data in the context of regulatory motifs in the untranslated regions of the genes, we reconstruct transcription factor activity and its evolution or dependence on environmental factors.
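The deconvolution idea used in the cell-cycle work above can be illustrated with a toy example. This is only a minimal sketch under stated assumptions, not the published algorithm: the loss of synchrony in the culture is modeled by a fixed circular Gaussian kernel, the Maximum Entropy prior is replaced by a simple Gaussian (ridge) prior, and all names (`gaussian_kernel_matrix`, `map_deconvolve`, `prior_weight`) are hypothetical.

```python
import math

def circular_distance(i, j, n):
    """Shortest distance between time points i and j on a cycle of length n."""
    d = abs(i - j) % n
    return min(d, n - d)

def gaussian_kernel_matrix(n, width):
    """Row-normalized circulant smoothing matrix.

    Assumption: culture desynchronization acts as a fixed Gaussian blur.
    """
    rows = []
    for i in range(n):
        row = [math.exp(-0.5 * (circular_distance(i, j, n) / width) ** 2)
               for j in range(n)]
        total = sum(row)
        rows.append([v / total for v in row])
    return rows

def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def map_deconvolve(measured, kernel, prior_weight=0.01, step=0.9, iterations=300):
    """Gradient descent on 0.5*|K x - y|^2 + 0.5*prior_weight*|x|^2.

    This is the MAP estimate for Gaussian noise and a Gaussian prior
    (a stand-in for the Maximum Entropy prior named in the text).
    The step size suits a row-normalized symmetric kernel like the one above.
    """
    transpose = [list(col) for col in zip(*kernel)]
    x = [0.0] * len(measured)
    for _ in range(iterations):
        residual = [y - p for y, p in zip(measured, matvec(kernel, x))]
        data_term = matvec(transpose, residual)
        x = [xi + step * (g - prior_weight * xi) for xi, g in zip(x, data_term)]
    return x

# Synthetic example: one narrow single-cell peak blurred into a broad
# culture-average profile, then sharpened again by deconvolution.
n = 48
true_peak = 20
single_cell = [math.exp(-0.5 * (circular_distance(t, true_peak, n) / 2.0) ** 2)
               for t in range(n)]
kernel = gaussian_kernel_matrix(n, width=4.0)
measured = matvec(kernel, single_cell)
recovered = map_deconvolve(measured, kernel)
```

In this noise-free toy case the recovered profile peaks at the same time point as the simulated single-cell profile and is narrower than the culture average. With real data, the kernel and the prior weight would have to be estimated or calibrated.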
Spatiotemporal Organization of Somitogenesis: The generation of somites in a vertebrate embryo depends on waves of gene expression that are periodic in both space and time. We are developing data analysis methods tailored to microarray data collected in such systems. The algorithms include methods for detecting regulated genes and for reconstructing the underlying spatiotemporal expression patterns using a maximum a posteriori deconvolution procedure, similar to the one applied to the yeast cell cycle.

Inferring causation in protein networks from non-Gaussian probability distributions: Reconstructing protein networks is important for selecting candidate biomarkers and drug targets. The task is easier if the directionality (or causality) of interactions is known. We are working on inferring causal interactions in protein networks without the need for experimental interventions, by identifying asymmetric features in the joint distributions of expression levels of pairs of genes, measured under a large number of conditions. We select and calibrate various statistical measures of asymmetry, using known interactions in yeast and human protein networks as training sets.

Online Tools for Analysis of Time Course Gene Expression Profiles: SCEPTRANS is a comprehensive online tool for the analysis of microarray data from cell-division and metabolic cycles in budding yeast. We are expanding this project into a general repository of periodic expression profiles, including data from different processes, such as circadian rhythms, sleep phases and organism development, supplemented with relations based on ontologies, evolutionary homology, regulatory motifs and profile clustering.
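To illustrate the asymmetry-based inference described above for protein networks, the sketch below computes one possible asymmetry measure for a pair of variables. It is an assumption chosen for illustration, not necessarily one of the measures used in this work: a third-moment statistic that is informative when the relation is roughly linear, the noise is independent, and the upstream variable is positively skewed. The function name `skew_asymmetry` and the simulated data are hypothetical.

```python
import math
import random

def standardize(values):
    """Shift and scale values to zero mean and unit variance."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]

def skew_asymmetry(a, b):
    """Third-moment asymmetry: rho * E[a^2 b - a b^2] on standardized data.

    Positive values favor a -> b and negative values favor b -> a,
    assuming a linear relation, independent noise and a positively
    skewed cause. For jointly Gaussian data the expectation is zero,
    so the statistic carries no directional information.
    """
    a, b = standardize(a), standardize(b)
    n = len(a)
    rho = sum(x * y for x, y in zip(a, b)) / n
    third = sum(x * x * y - x * y * y for x, y in zip(a, b)) / n
    return rho * third

# Simulated pair: a skewed "regulator" level drives a "target" level.
rng = random.Random(0)
regulator = [rng.expovariate(1.0) for _ in range(20000)]
target = [0.8 * r + rng.gauss(0.0, 0.6) for r in regulator]

forward = skew_asymmetry(regulator, target)   # tests regulator -> target
backward = skew_asymmetry(target, regulator)  # tests target -> regulator
```

Swapping the arguments flips the sign of the statistic, so only its sign and magnitude relative to a calibrated threshold matter. In practice such a threshold would be set using known interactions as a training set.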