Recent and ongoing projects
Most of our projects aim at the development of data-driven Systems Biology. More specifically, they focus on two areas: data integration approaches, tools, and software; and bioinformatics for novel large-scale measurement technologies. Almost all projects are close collaborations with experimentalists.
Data integration
Most biological circuitry that regulates cellular stress responses involves many molecules acting at several levels. Computational modeling of regulatory networks therefore has to integrate varied, large-scale data sets.
Cytoscape
Cytoscape is the worldwide standard open-source platform for network visualization and data integration. This project, which we started at the Institute for Systems Biology in Seattle, provides a universal, customizable tool for the network-based visualization and analysis of large-scale datasets. The Cytoscape core software is now co-developed in the United States, Canada, and Europe, and is regarded as an innovative model for scientific collaboration. As part of the NIH-funded National Resource for Network Biology, we are currently extending Cytoscape towards the learning of networks from omics data.
Stress regulation in Bacillus subtilis
In the context of the BaSysBio EU project, we study the responses of Bacillus subtilis to a large panel of environmental changes. Using statistical network-based and multi-scale modeling approaches, and in collaboration with the laboratory of Jan Maarten van Dijl (Groningen, Netherlands), we aim to identify players and mechanisms in a surprising induction of competence under a mild nutrient change.
Stress regulation in Arabidopsis thaliana
Based principally on a large compendium of transcriptome time series, we try to identify key players and their interactions in a panel of abiotic and biotic stress responses in Arabidopsis thaliana. Our experimental partner in this project is Heribert Hirt (Évry, France).
Immune regulation in human intestinal epithelial cells
The immune reaction of human epithelial cells to microbial challenges has to be finely controlled. In collaboration with Philippe Sansonetti (Institut Pasteur, Paris), we explore what quantitative models based on QT-PCR and RNA-seq transcriptome data may add to our understanding.
Characterization of the immune response network
The immune reaction of humans in the general population is controlled by a complex network of genetic and environmental factors. Knowledge about this ‘immune response network’ represents a potentially important substrate for personalized diagnosis and treatment. In the context of this larger 10-year project (Milieu Intérieur, coordinated by Matthew Albert and Lluis Quintana-Murci at Institut Pasteur), our group will apply modern statistical models to help define the immune response network, and model its dynamic behavior.
Mining patterns in malaria infection data
Observations in multidimensional datasets from long-term studies on the modalities of malaria infection can point to significant, and sometimes subtle, factors that control infection. In collaboration with Anavaj Sakuntabhai (Institut Pasteur, Paris), we use existing and newly developed data mining approaches to extract such observations.
Bioinformatics for novel large-scale measurement technologies
We promote the evolution of novel large-scale measurement technologies by developing computational approaches that extract a maximum of meaningful information as the basis for descriptive and predictive models. Most current projects revolve around mass spectrometry-based proteomics, the current technology of choice for the comprehensive characterization of biological systems at the level of proteins.
Exploiting increasing accuracy and precision of mass spectrometry
Mass spectrometry is able to determine the mass of peptides more and more accurately, yet the information in peptide (MS) peaks is usually not exploited. In collaboration with the labs of David Goodlett and Tina Guina (U. Washington, Seattle), we have shown that the information in these peaks is rich enough to identify peptides even without further peptide fragmentation.
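As a rough illustration of why accurate mass alone carries identifying information, the sketch below matches an observed peptide mass against a list of candidate sequences within a parts-per-million tolerance. It is only a minimal example and not the method developed in this project: the function names, the candidate list, and the 5 ppm default tolerance are assumptions chosen for illustration.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.010565  # one water is added for the free N- and C-termini


def peptide_mass(sequence):
    """Theoretical monoisotopic mass of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


def match_by_mass(observed_mass, candidates, tol_ppm=5.0):
    """Return (sequence, ppm error) for candidates within the tolerance.

    tol_ppm is a hypothetical default, not a recommended setting.
    """
    matches = []
    for sequence in candidates:
        theoretical = peptide_mass(sequence)
        ppm_error = (observed_mass - theoretical) / theoretical * 1e6
        if abs(ppm_error) <= tol_ppm:
            matches.append((sequence, ppm_error))
    return matches


# Illustrative candidate list (hypothetical, not project data).
candidates = ["PEPTIDE", "SAMPLER", "GASPVK"]
hits = match_by_mass(799.3600, candidates)
```

The tighter the mass tolerance an instrument supports, the fewer candidates survive this filter. Mass alone cannot separate sequences with the same elemental composition (for example, peptides that differ only by leucine versus isoleucine), so any real approach needs additional information from the measured peaks.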
Computational identification of additional proteins through networks
Despite technological advances, the detection of low-abundance proteins, and their abundance changes, remains challenging. Together with the laboratory of Florence Pinet (Institut Pasteur, Lille), we developed a computational approach based on protein-protein interaction (PPI) networks to identify a list of proteins that might have remained undetected in differential proteomic profiling experiments, and demonstrated the proof-of-concept.
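The general idea of using a PPI network to propose proteins that may have been missed can be sketched with a simple neighbor count: an undetected protein that interacts with several detected proteins becomes a candidate. This is a minimal guilt-by-association example under stated assumptions, not the approach developed in the project; the function name, the `min_hits` threshold, and the toy network are hypothetical.

```python
from collections import defaultdict


def rank_candidates(ppi_edges, detected, min_hits=2):
    """Rank undetected proteins by their number of detected interaction partners.

    ppi_edges: iterable of (protein_a, protein_b) pairs, treated as undirected.
    detected:  proteins observed in the differential proteomic experiment.
    min_hits:  hypothetical cutoff on the number of detected partners.
    """
    neighbors = defaultdict(set)
    for a, b in ppi_edges:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)

    detected = set(detected)
    scores = {}
    for protein, partners in neighbors.items():
        if protein in detected:
            continue
        hits = len(partners & detected)
        if hits >= min_hits:
            scores[protein] = hits

    # Highest score first; ties broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy network (hypothetical identifiers).
edges = [("P1", "X"), ("P2", "X"), ("P3", "X"),
         ("P1", "Y"), ("P4", "Y"), ("P2", "Z")]
detected = {"P1", "P2", "P3"}
candidates = rank_candidates(edges, detected)
```

A raw neighbor count favors highly connected proteins, so a realistic scoring scheme would normally correct for node degree, for example by comparing against randomized networks.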
Computational optimization of mass spectrum search engine parameters
Correct adjustment of spectrum search engine parameters is key to successful proteomic data analysis. However, few guidelines are available, and parameters are often set intuitively. We have developed a computational approach to optimize the parameters of standard spectrum search engines, and demonstrated that this approach can lead to the identification of twice as many proteins as standard approaches. Our partner in this project was Fabio Cerqueira (Viçosa, Brazil).
Large-scale identification of glycoproteins
Glycosylated proteins are key players in processes such as infection and in many diseases, but their complex structure makes them particularly difficult to identify on a large scale. In the GlycoHIT project, we collaborate with Zohar Yakhini (Technion/Agilent Labs, Haifa) and Janne Lehtiö (Karolinska Institutet, Stockholm) to push the limits of the in-depth study of the glycoproteome.
Tools for analyzing data from combination fragmentation modes
It is the fragmentation of small peptides in the mass spectrometer that ultimately leads to protein identification. Modern mass spectrometers offer different fragmentation modes (such as collision-induced fragmentation and multistage activation) that can be used alone or in combination. In collaboration with Delphine Pflieger (Évry, France), we evaluate the efficacy of different fragmentation modes, as well as approaches and software for interpreting experiments that use combined fragmentation modes.
Differential analysis of genome-wide binding (ChIP-chip) data
Chromatin immunoprecipitation followed by DNA chip hybridization (ChIP-chip) is a key technology that aims at determining transcription factor binding sites and gene regulatory networks on a genome-wide scale. It is known that binding sites and networks vary under genetic and environmental perturbations, but computational tools for the analysis of these variations are currently underdeveloped. We collaborate with several labs (Pascale Cossart, Arthur Scherf, Institut Pasteur) on the extraction of differences in ChIP-chip signals between conditions, and on their biological interpretation.
