hdpGLM: An R Package to Estimate Heterogeneous Effects in Generalized Linear Models Using Hierarchical Dirichlet Process

Diogo Ferrari

Main Article Content


The existence of latent clusters with different responses to a treatment is a major concern in scientific research, as latent effect heterogeneity often emerges due to latent or unobserved features - e.g., genetic characteristics, personality traits, or hidden motivations - of the subjects. Conventional random- and fixed-effects methods cannot be applied to that heterogeneity if the group markers associated with that heterogeneity are latent or unobserved. Alternative methods that combine regression models and clustering procedures using Dirichlet process are available, but these methods are complex to implement, especially for non-linear regression models with discrete or binary outcomes. This article discusses the R package hdpGLM as a means of implementing a novel hierarchical Dirichlet process approach to estimate mixtures of generalized linear models outlined in Ferrari (2020). The methods implemented make it easy for researchers to investigate heterogeneity in the effect of treatment or background variables and identify clusters of subjects with differential effects. This package provides several features for out-of-the-box estimation and to generate numerical summaries and visualizations of the results. A comparison with other similar R packages is provided.

Article Details

Article Sidebar