Hierarchical cluster analysis in clinical research with heterogeneous study population: Highlighting its visualization with R

Zhongheng Zhang, Fionn Murtagh, Sven Van Poucke, Su Lin, Peng Lan

Research output: Contribution to journal, Article

27 Citations (Scopus)

Abstract

Big-data clinical research typically involves thousands of patients and numerous variables. Conventionally, these variables are handled with multivariable regression modeling. This article introduces hierarchical cluster analysis (HCA), a method for exploring similarity between observations and/or clusters. The results can be visualized using heat maps and dendrograms. It is sometimes useful to add scatter plots and smoothed lines to the panels of the heat map, which the built-in R heatmap() function does not support. Instead, a series of scatter plots can be created with the lattice package, with the background color of each panel mapped to the regression coefficient through custom panel functions, a distinctive feature of lattice. Dendrograms and color keys can be added as legend elements of the lattice system. The latticeExtra package provides some useful functions for this work.

Original language: English
Number of pages: 11
Journal: Annals of Translational Medicine
Volume: 5
Issue number: 4
Publication status: Published - 1 Feb 2017
Externally published: Yes
