Fact sheet 14: Epigenetic data visualization

Paloma Perez-Bello Gil and Nilay Can

Proper data visualization is crucial for analysing experimental results from NGS technologies, particularly in epigenomics, where multiple molecular events are relevant. Effective representation of epigenetic properties helps to extract meaningful information from experiments, for example by identifying significant genomic regions involved in an epigenetic process. It is equally useful in the initial phases of exploratory data analysis, when visualizing data in the correct context can lead to the formulation of biological questions and statistical models [1].

As the first step of DNA methylation analysis, it is useful to inspect a selection of genomic regions visually, either in locally run genome browsers such as IGV (Integrative Genomics Viewer) or in web-based portals like the UCSC Genome Browser [2]. This gives a first impression of how epigenetic marks are distributed across different genome features, such as genes, promoters, transposable elements and major genomic domains. In addition, several R/Bioconductor packages are available for data visualization. Some are general in scope and designed for processing NGS data, whereas others specialize in presenting data obtained from specific epigenomics methodologies [1].

Complementary to these general-purpose visualization tools, various statistical graphics approaches can be used to obtain a more informative view of DNA methylation data. For example, the Hilbert curve method compresses genome-wide maps into compact, two-dimensional diagrams, which are useful for detecting spatial patterns in the distribution of DNA methylation. The location of DNA methylation in the genome is critical for interpreting its regulatory role, because this epigenetic mark can have neutral (or possibly even positive) as well as negative effects on gene expression, depending on the genome features that are affected (e.g. gene body methylation vs transposon silencing). However, it should be noted that DNA methylation alone, albeit crucial for the regulation of gene expression, provides only partial information on epigenetic states. Chromatin dynamics is determined by a combination of several factors, including histone modifications, non-histone proteins and non-coding RNAs (ncRNAs), that together define chromatin structure and accessibility [3].
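As an illustration of the Hilbert curve layout mentioned above, the following minimal sketch maps a one-dimensional series of binned methylation levels onto a square grid, so that bins that are neighbours along the genome remain neighbours in the diagram. The function names `hilbert_d2xy` and `hilbert_grid` and the example values are hypothetical; the index-to-coordinate conversion is the standard iterative Hilbert curve mapping, not the implementation of any particular package.

```python
def hilbert_d2xy(side, d):
    """Map index d (0 <= d < side*side) to (x, y) on a Hilbert curve.

    `side` is the grid width and must be a power of two.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the quadrant so the sub-curves join up.
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_grid(values, order):
    """Place 4**order values (e.g. per-bin methylation levels) on a 2D grid."""
    side = 2 ** order
    if len(values) != side * side:
        raise ValueError("expected exactly 4**order values")
    grid = [[None] * side for _ in range(side)]
    for d, value in enumerate(values):
        x, y = hilbert_d2xy(side, d)
        grid[y][x] = value
    return grid


# Hypothetical example: four consecutive genomic bins on a 2 x 2 grid.
example = hilbert_grid([0.1, 0.2, 0.3, 0.4], order=1)
```

The resulting grid can be passed to any heat-map routine. Because consecutive indices always land in adjacent cells, a stretch of highly methylated bins appears as a contiguous patch rather than as a thin line.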

The high dimensionality of the data sets describing all these properties makes them too complex to be analysed one mark at a time. It is therefore necessary to address the problem from several perspectives at once. This approach leads to the concept of spatially defined chromatin states, which are characterized by multiple epigenetic marks enriched in the same genomic regions. To analyse chromatin modifications in a combinatorial manner, various NGS-based methods combine multiple genome-wide epigenomic maps and use combinatorial and spatial mark patterns to infer a complete epigenetic annotation of the genome. One example is the Hidden Markov Model based algorithm used by the ChromHMM software, which learns chromatin states from ChIP-seq data of various histone modifications, although more comprehensive approaches that also take DNA methylation into account might become available in the future. Recognizing chromatin states and identifying where they occur in the genome provides a systematic annotation of DNA elements and regulatory control regions, some of which can be involved in development, in cell differentiation and in the interaction between genome and environment [4].
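To make the idea of chromatin states as hidden states more concrete, the sketch below decodes the most likely state path for a series of genomic bins, each described by the presence (1) or absence (0) of several marks. It assumes a Hidden Markov Model in which each mark is emitted independently with a state-specific probability. All parameter values, the two-state setup and the function name `viterbi_states` are hypothetical and chosen by hand for illustration. ChromHMM learns its parameters from the data, whereas this sketch only performs Viterbi decoding with fixed parameters.

```python
import math


def viterbi_states(obs, start, trans, emit):
    """Return the most likely hidden state path for binarized mark data.

    obs   : list of tuples of 0/1 values, one tuple per genomic bin
    start : start[s] = probability of beginning in state s
    trans : trans[p][s] = probability of moving from state p to state s
    emit  : emit[s][m] = probability that mark m is present in state s
    """
    n_states = len(start)

    def log_emit(state, marks):
        # Marks are treated as independent given the state (an assumption).
        return sum(
            math.log(emit[state][m] if bit else 1.0 - emit[state][m])
            for m, bit in enumerate(marks)
        )

    score = [math.log(start[s]) + log_emit(s, obs[0]) for s in range(n_states)]
    back = []
    for marks in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n_states):
            best = max(range(n_states),
                       key=lambda p: prev[p] + math.log(trans[p][s]))
            ptr.append(best)
            score.append(prev[best] + math.log(trans[best][s]) + log_emit(s, marks))
        back.append(ptr)

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical example: two marks per bin, two states
# (state 0 favours mark 0, state 1 favours mark 1).
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.9, 0.1], [0.1, 0.9]]
bins = [(1, 0), (1, 0), (1, 0), (0, 1), (0, 1), (0, 1)]
states = viterbi_states(bins, start, trans, emit)
```

With these parameters the first three bins are assigned to state 0 and the last three to state 1, which is the kind of segmentation that a chromatin-state annotation provides along the genome.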

  1. Bayón, G., Fernández, A. and Fraga, M. (2018). Bioinformatics Tools in Epigenomics Studies.
  2. Bock, C. (2012). Analysing and interpreting DNA methylation data. Nature Reviews Genetics, 13(10), pp.705-719.
  3. Zhang, H., Lang, Z. and Zhu, J. (2018). Dynamics and function of DNA methylation in plants. Nature Reviews Molecular Cell Biology, 19(8), pp.489-506.
  4. Ernst, J. and Kellis, M. (2017). Chromatin-state discovery and genome annotation with ChromHMM. Nature Protocols, 12(12), pp.2478-2492.