Title: | Module to Compute Influence and Leverage Statistics for Regression Models with Clustered Errors |
---|---|
Description: | Module to compute cluster-specific information for regression models with clustered errors, including leverage and influence statistics. Models of type 'lm' and 'fixest' (from the 'stats' and 'fixest' packages) are supported. 'summclust' implements features similar to those of the user-written 'summclust.ado' Stata module (MacKinnon, Nielsen & Webb, 2022; <arXiv:2205.03288v1>). |
Authors: | Alexander Fischer [aut, cre] |
Maintainer: | Alexander Fischer <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.7.0 |
Built: | 2025-01-01 03:20:35 UTC |
Source: | https://github.com/s3alfisc/summclust |
summclust
Plots residual leverage, partial leverage and the leave-one-cluster-out regression coefficients
## S3 method for class 'summclust'
plot(x, ...)
x |
An object of type summclust |
... |
other optional function arguments |
Note that the function requires ggplot2 to be installed.
A list containing
residual_leverage |
A |
coef_leverage |
A |
coef_beta |
A |
MacKinnon, James G., Morten Ørregaard Nielsen, and Matthew D. Webb. "Leverage, influence, and the jackknife in clustered regression models: Reliable inference using summclust." arXiv preprint arXiv:2205.03288 (2022).
if (requireNamespace("summclust") && requireNamespace("haven")) {
  library(summclust)
  library(haven)

  nlswork <- read_dta("http://www.stata-press.com/data/r9/nlswork.dta")
  # drop NAs at the moment
  nlswork <- nlswork[, c("ln_wage", "grade", "age", "birth_yr",
                         "union", "race", "msp", "ind_code")]
  nlswork <- na.omit(nlswork)

  lm_fit <- lm(
    ln_wage ~ union + race + msp + as.factor(birth_yr) + as.factor(age) + as.factor(grade),
    data = nlswork
  )

  res <- summclust(
    obj = lm_fit,
    params = c("msp", "union"),
    cluster = ~ind_code
  )

  plot(res)
}
A summary() method for objects of type summclust
## S3 method for class 'summclust'
summary(object, ...)
object |
An object of type summclust |
... |
misc arguments |
The function summary.summclust returns a range of cluster leverage statistics based on an object of type summclust.
MacKinnon, James G., Morten Ørregaard Nielsen, and Matthew D. Webb. "Leverage, influence, and the jackknife in clustered regression models: Reliable inference using summclust." arXiv preprint arXiv:2205.03288 (2022).
if (requireNamespace("summclust") && requireNamespace("haven")) {
  library(summclust)
  library(haven)

  nlswork <- read_dta("http://www.stata-press.com/data/r9/nlswork.dta")
  # drop NAs at the moment
  nlswork <- nlswork[, c("ln_wage", "grade", "age", "birth_yr",
                         "union", "race", "msp", "ind_code")]
  nlswork <- na.omit(nlswork)

  lm_fit <- lm(
    ln_wage ~ union + race + msp + as.factor(birth_yr) + as.factor(age) + as.factor(grade),
    data = nlswork
  )

  res <- summclust(
    obj = lm_fit,
    params = c("msp", "union"),
    cluster = ~ind_code
  )

  summary(res)
}
Compute influence and leverage metrics for clustered inference based on the Cluster Jackknife described in MacKinnon, Nielsen & Webb (2022).
summclust(obj, ...)
obj |
An object of class lm or fixest |
... |
Other arguments |
An object of type summclust, including a CRV3 variance-covariance estimate as described in MacKinnon, Nielsen & Webb (2022).
MacKinnon, James G., Morten Ørregaard Nielsen, and Matthew D. Webb. "Leverage, influence, and the jackknife in clustered regression models: Reliable inference using summclust." arXiv preprint arXiv:2205.03288 (2022).
summclust.lm, summclust.fixest
if (requireNamespace("summclust") && requireNamespace("haven")) {
  library(summclust)
  library(haven)

  nlswork <- read_dta("http://www.stata-press.com/data/r9/nlswork.dta")
  # drop NAs at the moment
  nlswork <- nlswork[, c("ln_wage", "grade", "age", "birth_yr",
                         "union", "race", "msp", "ind_code")]
  nlswork <- na.omit(nlswork)

  lm_fit <- lm(
    ln_wage ~ union + race + msp + as.factor(birth_yr) + as.factor(age) + as.factor(grade),
    data = nlswork
  )

  res <- summclust(
    obj = lm_fit,
    params = c("msp", "union"),
    cluster = ~ind_code
  )

  summary(res)
  tidy(res)
  plot(res)
}
fixest
Compute influence and leverage metrics for clustered inference based on the Cluster Jackknife as described in MacKinnon, Nielsen & Webb (2022) for objects of type fixest.
## S3 method for class 'fixest'
summclust(
  obj,
  cluster,
  params,
  absorb_cluster_fixef = TRUE,
  type = "CRV3",
  ...
)
obj |
An object of type fixest |
cluster |
A clustering vector |
params |
A character vector of variables for which leverage statistics should be computed. If NULL, leverage statistics will be computed for all k model covariates |
absorb_cluster_fixef |
TRUE by default. Should the cluster fixed effects be projected out? This increases numerical stability and decreases computational costs |
type |
"CRV3" or "CRV3J" following MacKinnon, Nielsen & Webb (2022). CRV3 by default |
... |
other function arguments passed to 'vcov' |
An object of type summclust, including a CRV3 variance-covariance estimate as described in MacKinnon, Nielsen & Webb (2022).
coef_estimates |
The coefficient estimates of the linear model. |
vcov |
A CRV3 or CRV3J variance-covariance matrix estimate as described in MacKinnon, Nielsen & Webb (2022) |
leverage_g |
A vector of leverages. |
leverage_avg |
The cluster leverage. |
partial_leverage |
The partial leverages. |
coef_var_leverage_avg |
Coefficient of Variation for the leverage statistic |
coef_var_leverage_g |
Coefficient of Variation for the Partial Leverage Statistics |
coef_var_N_G |
Coefficient of Variation for the Cluster Sizes. |
beta_jack |
The jackknifed leave-one-cluster-out regression coefficients. |
params |
The input parameter vector 'params'. |
N_G |
The number of clusters. |
call |
The |
cluster |
The names of the clusters. |
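The leverage quantities listed above can be illustrated with a small base-R sketch. It is not the package's internal code: it assumes the usual definitions (cluster leverage as the trace of X_g'X_g (X'X)^{-1}, partial leverage from the residualized regressor, and the coefficient of variation as standard deviation over mean), and it uses the built-in mtcars data with cyl as a stand-in cluster variable.

```r
# Illustrative sketch only; mtcars and 'cyl' as cluster are assumptions.
fit <- lm(mpg ~ wt + hp, data = mtcars)
X <- model.matrix(fit)
cl <- mtcars$cyl
XtX_inv <- solve(crossprod(X))

# Cluster leverage: L_g = tr(X_g' X_g (X'X)^{-1})
leverage_g <- sapply(split(seq_len(nrow(X)), cl), function(idx) {
  Xg <- X[idx, , drop = FALSE]
  sum(diag(crossprod(Xg) %*% XtX_inv))
})
leverage_avg <- mean(leverage_g)  # equals k / G by construction

# Partial leverage for "wt": residualize wt on the remaining regressors,
# then take each cluster's share of the residual sum of squares
others <- X[, colnames(X) != "wt", drop = FALSE]
x_tilde <- lm.fit(others, X[, "wt"])$residuals
partial_leverage_wt <- tapply(x_tilde^2, cl, sum) / sum(x_tilde^2)

# Coefficient of variation (sd uses the G - 1 denominator)
coef_var <- function(a) sd(a) / mean(a)
coef_var_leverage <- coef_var(leverage_g)
coef_var_N_G <- coef_var(as.numeric(table(cl)))
```

Under these definitions the cluster leverages sum to the number of regressors and the partial leverages for a coefficient sum to one, which gives a quick sanity check on any hand-rolled computation.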
MacKinnon, James G., Morten Ørregaard Nielsen, and Matthew D. Webb. "Leverage, influence, and the jackknife in clustered regression models: Reliable inference using summclust." arXiv preprint arXiv:2205.03288 (2022).
if (requireNamespace("summclust") && requireNamespace("haven") && requireNamespace("fixest")) {
  library(summclust)
  library(haven)
  library(fixest)

  nlswork <- read_dta("http://www.stata-press.com/data/r9/nlswork.dta")
  # drop NAs at the moment
  nlswork <- nlswork[, c("ln_wage", "grade", "age", "birth_yr",
                         "union", "race", "msp", "ind_code")]
  nlswork <- na.omit(nlswork)

  # fit with fixest::feols() so that the 'fixest' method is dispatched
  feols_fit <- feols(
    ln_wage ~ union + race + msp + as.factor(birth_yr) + as.factor(age) + as.factor(grade),
    data = nlswork
  )

  res <- summclust(
    obj = feols_fit,
    params = c("msp", "union"),
    cluster = ~ind_code
  )

  summary(res)
  tidy(res)
  plot(res)
}
lm
Compute influence and leverage metrics for clustered inference based on the Cluster Jackknife as described in MacKinnon, Nielsen & Webb (2022) for objects of type lm.
## S3 method for class 'lm'
summclust(obj, cluster, params, type = "CRV3", ...)
obj |
An object of type lm |
cluster |
A clustering vector |
params |
A character vector of variables for which leverage statistics should be computed. |
type |
"CRV3" or "CRV3J" following MacKinnon, Nielsen & Webb. CRV3 by default |
... |
other function arguments passed to 'vcov' |
An object of type summclust, including a CRV3 variance-covariance estimate as described in MacKinnon, Nielsen & Webb (2022).
coef_estimates |
The coefficient estimates of the linear model. |
vcov |
A CRV3 or CRV3J variance-covariance matrix estimate as described in MacKinnon, Nielsen & Webb (2022) |
leverage_g |
A vector of leverages. |
leverage_avg |
The cluster leverage. |
partial_leverage |
The partial leverages. |
beta_jack |
The jackknifed leave-one-cluster-out regression coefficients. |
params |
The input parameter vector 'params'. |
N_G |
The number of clusters. |
call |
The |
cluster |
The names of the clusters. |
MacKinnon, James G., Morten Ørregaard Nielsen, and Matthew D. Webb. "Leverage, influence, and the jackknife in clustered regression models: Reliable inference using summclust." arXiv preprint arXiv:2205.03288 (2022).
if (requireNamespace("summclust") && requireNamespace("haven")) {
  library(summclust)
  library(haven)

  nlswork <- read_dta("http://www.stata-press.com/data/r9/nlswork.dta")
  # drop NAs at the moment
  nlswork <- nlswork[, c("ln_wage", "grade", "age", "birth_yr",
                         "union", "race", "msp", "ind_code")]
  nlswork <- na.omit(nlswork)

  lm_fit <- lm(
    ln_wage ~ union + race + msp + as.factor(birth_yr) + as.factor(age) + as.factor(grade),
    data = nlswork
  )

  res <- summclust(
    obj = lm_fit,
    cluster = ~ind_code,
    params = c("msp", "union")
  )

  summary(res)
  tidy(res)
  plot(res)
}
Obtain results from a summclust object in a tidy data frame.
## S3 method for class 'summclust'
tidy(x, ...)
x |
An object of class 'summclust' |
... |
Other arguments |
A data.frame containing coefficient estimates, t-statistics, standard errors, p-values, and confidence intervals based on the CRV3 variance-covariance matrix and a t(G-1) distribution, where G is the number of clusters.
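As a rough illustration of inference based on a t(G-1) distribution, the sketch below computes a t-statistic, two-sided p-value, and 95% confidence interval by hand in base R. The estimate, standard error, and number of clusters are made-up inputs, and the 95% level is an assumption; the code does not call the package.

```r
# Hypothetical inputs: a coefficient, its CRV3 standard error, and G clusters
estimate <- 0.10
std_error <- 0.04
G <- 12

# t-statistic and two-sided p-value with G - 1 degrees of freedom
t_stat <- estimate / std_error
p_value <- 2 * pt(abs(t_stat), df = G - 1, lower.tail = FALSE)

# 95% confidence interval from the t(G - 1) critical value
crit <- qt(0.975, df = G - 1)
conf_int <- c(
  lower = estimate - crit * std_error,
  upper = estimate + crit * std_error
)
```

With few clusters the t(G-1) critical value is noticeably larger than the normal one, so intervals built this way are wider than those from a standard normal approximation.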
MacKinnon, James G., Morten Ørregaard Nielsen, and Matthew D. Webb. "Leverage, influence, and the jackknife in clustered regression models: Reliable inference using summclust." arXiv preprint arXiv:2205.03288 (2022).
if (requireNamespace("summclust") && requireNamespace("haven")) {
  library(summclust)
  library(haven)

  nlswork <- read_dta("http://www.stata-press.com/data/r9/nlswork.dta")
  # drop NAs at the moment
  nlswork <- nlswork[, c("ln_wage", "grade", "age", "birth_yr",
                         "union", "race", "msp", "ind_code")]
  nlswork <- na.omit(nlswork)

  lm_fit <- lm(
    ln_wage ~ union + race + msp + as.factor(birth_yr) + as.factor(age) + as.factor(grade),
    data = nlswork
  )

  res <- summclust(
    obj = lm_fit,
    params = c("msp", "union"),
    cluster = ~ind_code
  )

  tidy(res)
}
Compute CRV3 covariance matrices via a cluster jackknife as described in MacKinnon, Nielsen & Webb (2022)
vcov_CR3J(obj, ...)
obj |
An object of class lm or fixest |
... |
misc function arguments |
An object of type 'vcov_CR3J'
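For intuition, a cluster jackknife variance estimate can be sketched in a few lines of base R. This is a brute-force illustration rather than the package's implementation: it refits the model once per omitted cluster, and it assumes that "CRV3" centers the jackknife coefficients at the full-sample estimate while "CRV3J" centers them at their own mean. The mtcars data, the cyl cluster variable, and the helper jackknife_vcov() are all hypothetical.

```r
# Illustrative sketch only; mtcars and 'cyl' as cluster are assumptions.
fit <- lm(mpg ~ wt + hp, data = mtcars)
cl <- mtcars$cyl
clusters <- unique(cl)
beta_hat <- coef(fit)

# Leave-one-cluster-out coefficients: one row per omitted cluster
beta_jack <- t(sapply(clusters, function(g) {
  coef(lm(mpg ~ wt + hp, data = mtcars[cl != g, ]))
}))

# Hypothetical helper: (G - 1) / G times the sum of outer products
# of the deviations of the jackknife coefficients from 'center'
jackknife_vcov <- function(beta_jack, center) {
  G <- nrow(beta_jack)
  dev <- sweep(beta_jack, 2, center)  # subtract 'center' from every row
  (G - 1) / G * crossprod(dev)
}

vcov_crv3 <- jackknife_vcov(beta_jack, beta_hat)             # full-sample center
vcov_crv3j <- jackknife_vcov(beta_jack, colMeans(beta_jack)) # jackknife-mean center
```

Because the mean minimizes the sum of squared deviations, the diagonal of the mean-centered estimate can never exceed that of the estimate centered at the full-sample coefficients.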
MacKinnon, James G., Morten Ørregaard Nielsen, and Matthew D. Webb. "Leverage, influence, and the jackknife in clustered regression models: Reliable inference using summclust." arXiv preprint arXiv:2205.03288 (2022).
vcov_CR3J.lm, vcov_CR3J.fixest
if (requireNamespace("summclust") && requireNamespace("haven")) {
  library(summclust)
  library(haven)

  nlswork <- read_dta("http://www.stata-press.com/data/r9/nlswork.dta")
  # drop NAs at the moment
  nlswork <- nlswork[, c("ln_wage", "grade", "age", "birth_yr",
                         "union", "race", "msp", "ind_code")]
  nlswork <- na.omit(nlswork)

  lm_fit <- lm(
    ln_wage ~ union + race + msp + as.factor(birth_yr) + as.factor(age) + as.factor(grade),
    data = nlswork
  )

  # CRV3 variance-covariance matrix
  vcov <- vcov_CR3J(
    obj = lm_fit,
    cluster = ~ind_code,
    type = "CRV3"
  )

  # CRV3J variance-covariance matrix
  vcovJN <- vcov_CR3J(
    obj = lm_fit,
    cluster = ~ind_code,
    type = "CRV3J"
  )
}
fixest
Compute CRV3 covariance matrices via a cluster jackknife as described in MacKinnon, Nielsen & Webb (2022) for objects of type fixest.
## S3 method for class 'fixest'
vcov_CR3J(
  obj,
  cluster,
  type = "CRV3",
  return_all = FALSE,
  absorb_cluster_fixef = TRUE,
  ...
)
obj |
An object of type fixest |
cluster |
A clustering vector |
type |
"CRV3" or "CRV3J" following MacKinnon, Nielsen & Webb. CRV3 by default |
return_all |
Logical scalar, FALSE by default. Should only the vcov be returned (FALSE) or additional results (TRUE) |
absorb_cluster_fixef |
TRUE by default. Should the cluster fixed effects be projected out? This increases numerical stability. |
... |
other function arguments passed to 'vcov' |
An object of class vcov_CR3J
MacKinnon, James G., Morten Ørregaard Nielsen, and Matthew D. Webb. "Leverage, influence, and the jackknife in clustered regression models: Reliable inference using summclust." arXiv preprint arXiv:2205.03288 (2022).
if (requireNamespace("summclust") && requireNamespace("haven") && requireNamespace("fixest")) {
  library(summclust)
  library(haven)
  library(fixest)

  nlswork <- read_dta("http://www.stata-press.com/data/r9/nlswork.dta")
  # drop NAs at the moment
  nlswork <- nlswork[, c("ln_wage", "grade", "age", "birth_yr",
                         "union", "race", "msp", "ind_code")]
  nlswork <- na.omit(nlswork)

  feols_fit <- feols(
    ln_wage ~ union + race + msp + as.factor(birth_yr) + as.factor(age) + as.factor(grade),
    data = nlswork
  )

  # CRV3 variance-covariance matrix
  vcov <- vcov_CR3J(
    obj = feols_fit,
    cluster = ~ind_code,
    type = "CRV3"
  )

  # CRV3J variance-covariance matrix
  vcovJN <- vcov_CR3J(
    obj = feols_fit,
    cluster = ~ind_code,
    type = "CRV3J"
  )
}
lm
Compute CRV3 covariance matrices via a cluster jackknife as described in MacKinnon, Nielsen & Webb (2022) for objects of type lm.
## S3 method for class 'lm'
vcov_CR3J(obj, cluster, type = "CRV3", return_all = FALSE, ...)
obj |
An object of type lm |
cluster |
A clustering vector |
type |
"CRV3" or "CRV3J" following MacKinnon, Nielsen & Webb. CRV3 by default |
return_all |
Logical scalar, FALSE by default. Should only the vcov be returned (FALSE) or additional results (TRUE) |
... |
other function arguments passed to 'vcov' |
An object of class vcov_CR3J
MacKinnon, James G., Morten Ørregaard Nielsen, and Matthew D. Webb. "Leverage, influence, and the jackknife in clustered regression models: Reliable inference using summclust." arXiv preprint arXiv:2205.03288 (2022).
if (requireNamespace("summclust") && requireNamespace("haven")) {
  library(summclust)
  library(haven)

  nlswork <- read_dta("http://www.stata-press.com/data/r9/nlswork.dta")
  # drop NAs at the moment
  nlswork <- nlswork[, c("ln_wage", "grade", "age", "birth_yr",
                         "union", "race", "msp", "ind_code")]
  nlswork <- na.omit(nlswork)

  lm_fit <- lm(
    ln_wage ~ union + race + msp + as.factor(birth_yr) + as.factor(age) + as.factor(grade),
    data = nlswork
  )

  # CRV3 variance-covariance matrix
  vcov <- vcov_CR3J(
    obj = lm_fit,
    cluster = ~ind_code,
    type = "CRV3"
  )

  # CRV3J variance-covariance matrix
  vcovJN <- vcov_CR3J(
    obj = lm_fit,
    cluster = ~ind_code,
    type = "CRV3J"
  )
}