With the increasing availability of big data, there is an urgent need for more studies of best practices for handling these data. This book has six chapters. Chapter 1 provides an overview of big data clinical research, including its perspective, the general workflow for accessing data, a brief review of machine learning methods, and data acquisition and management. Chapter 2 discusses exploratory data analysis and data management. It focuses on the missing data problem frequently encountered in clinical studies and introduces a number of methods and their applications. It first discusses missing data exploration and data reshaping and aggregating, then introduces several imputation methods, including single imputation, multiple imputation, and multivariate imputation. Chapter 3 discusses variable selection methods for both the parametric and non-parametric models commonly used in clinical studies. It also discusses methods for diagnostics and introduces a useful R package for drawing nomograms. Chapter 4 discusses the analysis of survival data. It illustrates the application of both parametric and semi-parametric models, as well as the competing risk model. Chapter 5 discusses several commonly used unsupervised and supervised machine learning methods, including k-nearest neighbor, naïve Bayes classification, decision tree, and neural network. Chapter 6 addresses a number of other important statistical areas that have applications in clinical studies, for example, hierarchical cluster analysis and its visualization with R, causal mediation analysis, structural equation modeling, and case-crossover design.
|Place of publication|Hong Kong|
|Publisher|AME Publishing Company|
|Number of pages|233|
|Publication status|Published - 2018|