Hierarchical cluster analysis in clinical research with heterogeneous study population

Highlighting its visualization with R

Zhongheng Zhang, Fionn Murtagh, Sven Van Poucke, Su Lin, Peng Lan

Research output: Contribution to journal › Article

18 Citations (Scopus)

Abstract

Big data clinical research typically involves thousands of patients and numerous variables. Conventionally, these variables are handled by multivariable regression modeling. This article introduces hierarchical cluster analysis (HCA), a method for exploring similarity between observations and/or clusters. The result can be visualized using heat maps and dendrograms. It is sometimes useful to add scatter plots and smooth lines to the panels of the heat map, which the base R heatmap() function does not support. A series of scatter plots can instead be created with the lattice package, with the background color of each panel mapped to the regression coefficient by a custom-made panel function. Such custom panel functions are a distinctive feature of the lattice package. Dendrograms and color keys can be added as legend elements of the lattice system. The latticeExtra package provides some useful functions for this work.
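The workflow summarized in the abstract can be sketched roughly as follows. This is a minimal illustration on simulated data, not the article's own code: the variable names (`var1` to `var4`), the patient labels, the simulated outcome, the choice of Euclidean distance with Ward linkage, and the helper `slope_col()` are all assumptions made for the example. Both the variables and the outcome are standardized, so each panel's regression slope lies between -1 and 1 and can be mapped directly onto a 21-step color ramp.

```r
library(lattice)  # recommended package shipped with standard R installations

# Simulated data: 60 hypothetical patients in 3 well-separated groups.
set.seed(42)
group   <- rep(1:3, each = 20)
centers <- c(0, 5, 10)
signs   <- c(1, 1, -1, -1)  # two variables rise with group, two fall
dat <- sapply(signs, function(s) rnorm(60, mean = s * centers[group], sd = 0.5))
colnames(dat) <- paste0("var", 1:4)
rownames(dat) <- paste0("pt", 1:60)

# Hierarchical clustering of observations on standardized variables.
scaled   <- scale(dat)
hc       <- hclust(dist(scaled, method = "euclidean"), method = "ward.D2")
clusters <- cutree(hc, k = 3)

# Base graphics views: a dendrogram, then a heat map reusing the same row tree.
plot(as.dendrogram(hc))
heatmap(scaled, Rowv = as.dendrogram(hc), scale = "none")

# Long format for lattice: one panel per variable, plotted against an outcome.
outcome <- as.vector(scale(dat[, "var1"] + rnorm(60)))
long <- data.frame(
  value    = as.vector(scaled),
  variable = factor(rep(colnames(scaled), each = nrow(scaled))),
  outcome  = rep(outcome, times = ncol(scaled))
)

# Map a regression slope in [-1, 1] to a blue-white-red palette.
pal <- colorRampPalette(c("blue", "white", "red"))(21)
slope_col <- function(x, y) {
  b <- unname(coef(lm(y ~ x))[2])
  b <- max(-1, min(1, b))  # guard against rounding just outside the range
  pal[round((b + 1) * 10) + 1]
}

# Custom panel function: colored background, then points, then a loess smooth.
p <- xyplot(outcome ~ value | variable, data = long,
  panel = function(x, y, ...) {
    panel.fill(col = slope_col(x, y))
    panel.xyplot(x, y, ...)
    panel.loess(x, y, ...)
  })
print(p)
```

Panels for variables positively associated with the outcome are drawn on a red background and negatively associated ones on blue. Attaching the dendrogram and a color key as legend elements, as the abstract describes, is left out of this sketch because it relies on the latticeExtra package.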

Original language: English
Number of pages: 11
Journal: Annals of Translational Medicine
Volume: 5
Issue number: 4
DOI: 10.21037/atm.2017.02.05
Publication status: Published - 1 Feb 2017
Externally published: Yes

Fingerprint

Cluster Analysis
Color
Hot Temperature
Research
Population

Cite this

@article{8abf49bb3d9c43318b6229cda5098574,
title = "Hierarchical cluster analysis in clinical research with heterogeneous study population: Highlighting its visualization with R",
keywords = "Clinical research, Dendrogram, Heat map, Hierarchical cluster analysis (HCA)",
author = "Zhongheng Zhang and Fionn Murtagh and Poucke, {Sven Van} and Su Lin and Peng Lan",
year = "2017",
month = "2",
day = "1",
doi = "10.21037/atm.2017.02.05",
language = "English",
volume = "5",
journal = "Annals of Translational Medicine",
issn = "2305-5839",
publisher = "AME Publishing Company",
number = "4",

}

Hierarchical cluster analysis in clinical research with heterogeneous study population: Highlighting its visualization with R. / Zhang, Zhongheng; Murtagh, Fionn; Poucke, Sven Van; Lin, Su; Lan, Peng.

In: Annals of Translational Medicine, Vol. 5, No. 4, 01.02.2017.

TY - JOUR

T1 - Hierarchical cluster analysis in clinical research with heterogeneous study population

T2 - Highlighting its visualization with R

AU - Zhang, Zhongheng

AU - Murtagh, Fionn

AU - Poucke, Sven Van

AU - Lin, Su

AU - Lan, Peng

PY - 2017/2/1

Y1 - 2017/2/1

KW - Clinical research

KW - Dendrogram

KW - Heat map

KW - Hierarchical cluster analysis (HCA)

UR - http://www.scopus.com/inward/record.url?scp=85014389283&partnerID=8YFLogxK

U2 - 10.21037/atm.2017.02.05

DO - 10.21037/atm.2017.02.05

M3 - Article

VL - 5

JO - Annals of Translational Medicine

JF - Annals of Translational Medicine

SN - 2305-5839

IS - 4

ER -