Bioinformatics tools crafted in the lab
Tools developed by ADlab and made available to the scientific community
BioDiscML (Biomarker Discovery by Machine Learning) is a tool that automates the analysis of complex biological datasets using machine learning methods. From a collection of samples and their associated characteristics, BioDiscML produces a minimal subset of biomarkers and a model that efficiently predicts a specified outcome. It uses a large variety of machine learning algorithms to select the best combination of biomarkers for predicting either categorical or continuous outcomes from highly unbalanced datasets. The software has been implemented to automate all machine learning steps, including data pre-processing, feature selection, model selection, and performance evaluation.
A bioinformatics tool to automatically provide annotations for genes included in DNA blocks in linkage disequilibrium with candidate SNPs.
LD-annot estimates experiment-specific linkage disequilibrium to delineate the regions genetically linked to each genetic marker in a list of polymorphisms (most often candidate SNPs from GWAS), and provides coordinates and annotations for the genes included in or overlapping those regions.
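The last step described above, finding genes included in or overlapping a linked region, can be illustrated with a minimal sketch. This is not LD-annot's code: the `Interval` type, the function names, the coordinates and the gene names are all hypothetical, and intervals are assumed to be closed (both ends inclusive).

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A named genomic interval with inclusive start and end (assumed convention)."""
    chrom: str
    start: int
    end: int
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True when both intervals are on the same chromosome and share at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def genes_in_regions(regions, genes):
    """Map each region name to the names of genes included in or overlapping it."""
    return {r.name: [g.name for g in genes if overlaps(r, g)] for r in regions}


# Hypothetical LD regions (one per candidate SNP) and gene coordinates.
regions = [
    Interval("chr1", 1000, 5000, "snp_A"),
    Interval("chr2", 200, 900, "snp_B"),
]
genes = [
    Interval("chr1", 4500, 8000, "GENE1"),  # partially overlaps the snp_A region
    Interval("chr1", 6000, 7000, "GENE2"),  # outside the snp_A region
    Interval("chr2", 300, 400, "GENE3"),    # fully inside the snp_B region
]

annotated = genes_in_regions(regions, genes)
print(annotated)
```

The nested scan is quadratic in the number of intervals, which is acceptable for a handful of candidate SNPs; a sorted sweep or an interval tree would be the usual choice for genome-wide lists.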
This package produces metagene plots to compare coverages of sequencing experiments at selected groups of genomic regions. It can be used for analyses such as assessing the binding of DNA-interacting proteins at promoter regions or surveying antisense transcription over the length of a gene. The metagene2 package can manage all aspects of the analysis, from normalization of coverages to plot faceting according to experimental metadata. Bootstrapping analysis is used to provide confidence intervals for per-sample mean coverages.
netOmics is a multi-omics network builder and explorer. It uses a combination of network inference algorithms and knowledge-based graphs to build multi-layered networks. The package can be combined with timeOmics to incorporate time-course expression data and build sub-networks from multi-omics kinetic clusters. Finally, from the generated multi-omics networks, propagation analyses allow the identification of missing biological functions (1), multi-omics mechanisms (2) and molecules between kinetic clusters (3). This helps to resolve complex regulatory mechanisms.
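As an illustration of how bootstrapping yields a confidence interval for a mean coverage, here is a minimal percentile-bootstrap sketch. It is written in Python rather than R, it is not metagene2's implementation, and it handles a single position only; the function name, its parameters and the coverage values are hypothetical.

```python
import random
from statistics import mean


def bootstrap_mean_ci(values, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`.

    Resamples the values with replacement `n_boot` times, computes the mean
    of each resample, and returns (sample mean, lower bound, upper bound).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    boot_means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_boot))
    lower = boot_means[int((alpha / 2) * n_boot)]
    upper = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return mean(values), lower, upper


# Hypothetical coverage values at one position across eight genomic regions.
coverages = [3.0, 5.0, 4.0, 6.0, 5.5, 4.5, 7.0, 3.5]
estimate, lower, upper = bootstrap_mean_ci(coverages)
print(f"mean coverage = {estimate:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

Repeating this at every position along the regions would give the confidence band usually drawn around a metagene curve.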
NGS++: C++ library for manipulating Next Generation Sequencing data
Our groups have developed a C++ library designed for Next Generation Sequencing data manipulation. It is specifically tailored to help develop applications that work with genomic regions and features, such as epigenomic marks, gene features and data often stored in BED-type files.
Permutation analysis, based on Monte Carlo sampling, for testing the hypothesis that the number of differentially methylated elements conserved across several generations is associated with an effect inherited from a treatment, and that a purely stochastic effect can be dismissed. Available here.
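The idea behind such a test can be sketched as follows. This is an illustrative null model, not the tool's actual procedure: it assumes that, under the null hypothesis, each generation's differentially methylated elements are drawn uniformly at random from a common universe of elements. The function name, its parameters and the numbers in the example are hypothetical.

```python
import random


def conserved_overlap_pvalue(n_elements, set_sizes, observed_overlap,
                             n_perm=2000, seed=0):
    """Monte Carlo p-value for the number of elements shared by all generations.

    For each simulation, one random set per generation is drawn from a
    universe of `n_elements`, using the observed set sizes, and the size of
    the intersection is compared with `observed_overlap`.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    universe = range(n_elements)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        common = set(rng.sample(universe, set_sizes[0]))
        for size in set_sizes[1:]:
            common &= set(rng.sample(universe, size))
        if len(common) >= observed_overlap:
            at_least_as_extreme += 1
    # The +1 correction keeps the estimated p-value strictly above zero.
    return (at_least_as_extreme + 1) / (n_perm + 1)


# Hypothetical example: 1000 elements, three generations with 50, 60 and 40
# differentially methylated elements, 10 of which are shared by all three.
p_value = conserved_overlap_pvalue(1000, [50, 60, 40], observed_overlap=10)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value indicates that the observed conservation is unlikely under this random model, which is the sense in which a stochastic explanation can be dismissed.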
A package for ChIP-chip and tiling arrays, rMAT: This package is an R version of the package MAT and contains functions to parse and merge Affymetrix BPMAP and CEL tiling array files.
A package for ChIP-Seq and motif analysis, rGADEM: rGADEM is an efficient de novo motif discovery tool for large-scale genomic sequence data. It is an open-source R package, which is based on the GADEM software.
R-Omix: Epigenomics Portal
Our team has developed a portal of novel bioinformatics tools with highly customizable user interfaces based on the Shiny framework. These interfaces make it possible to rapidly generate graphs that can be readily included in publications. All of these tools are closely related to epigenomics and are hosted on Compute Canada's infrastructure.
Varian/t ExplOreR: Exploratory tool for fine-mapping. VEXOR is a platform-independent browser-based integrative environment for functional annotation in R, based on the Shiny package. This interface provides a comprehensive analytical framework to characterize the role of variants driving susceptibility signals in regions defined by GWAS.
A package for single-nucleotide polymorphism visualization, ShinySNP: This package provides a highly customizable graphical user interface that enables the visualization of single-nucleotide polymorphisms (SNPs).
Time-Course Multi-Omics data integration
timeOmics is a generic data-driven framework to integrate multi-omics longitudinal data measured on the same biological samples and select key temporal features with strong associations within the same sample group. The main steps of timeOmics are: 1. Platform- and time-specific normalization and filtering steps; 2. Modelling each biological feature into one time expression profile; 3. Clustering features with the same expression profile over time; 4. Post-hoc validation step.
A package for nucleosome positioning, RJMCMC: This package uses an informative Multinomial-Dirichlet prior in a t-mixture with reversible jump estimation of nucleosome positions for genome-wide profiling.