Testing the equality of two regression coefficients

The default hypothesis test that software spits out when you run a regression model is a test of the null that the coefficient equals zero. Frequently there are other, more interesting tests though, and this is one I’ve come across often: testing whether two coefficients are equal to one another. The big point to remember is that Var(A-B) = Var(A) + Var(B) - 2*Cov(A,B). This formula gets you pretty far in statistics (and is one of the few I have memorized).

Note that this is not the same as testing whether one coefficient is statistically significant and the other is not. See this Andrew Gelman and Hal Stern article that makes this point. (The link is to a pre-print PDF, but the article was published in the American Statistician.) I will outline four different examples where I see people make this particular mistake.

One is when people have different models, and they compare coefficients across them. For example, say you have a base model predicting crime at the city level as a function of poverty, and then in a second model you include other control covariates on the right hand side. Let’s say the first effect estimate of poverty is 3 (1), where the value in parentheses is the standard error, and the second estimate is 2 (2). The first effect is statistically significant, but the second is not. Do you conclude that the effect sizes are different between models though? The evidence for that is much less clear.

The estimated decline is 3 - 2 = 1. What is the standard error around that decrease though? We can use the formula for the variance of a difference that I noted before to construct it. The variance of a parameter estimate is its standard error squared, so the standard error of the difference is sqrt(1^2 + 2^2) =~ 2.24, which assumes the covariance between the estimates is zero. So the standard error around our estimated decline is quite large, and we can’t be sure that the poverty estimates differ appreciably between the two models.
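
The arithmetic for this example can be sketched in a few lines of Python. This is only a minimal illustration: the function name `diff_test` and its arguments are made up for this sketch, and the covariance defaults to zero, which is the assumption made in the text.

```python
import math

def diff_test(b1, se1, b2, se2, cov=0.0):
    """Difference between two estimates, its standard error, z, and two-sided p.

    Uses Var(A - B) = Var(A) + Var(B) - 2*Cov(A, B). The default cov=0.0 is
    the zero-covariance assumption, not something estimated from data.
    """
    diff = b1 - b2
    se_diff = math.sqrt(se1**2 + se2**2 - 2.0 * cov)
    z = diff / se_diff
    # Two-sided normal p-value: 2*(1 - Phi(|z|)) equals erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return diff, se_diff, z, p

# Poverty effect of 3 (1) in the base model versus 2 (2) with controls
diff, se_diff, z, p = diff_test(3.0, 1.0, 2.0, 2.0)
print(f"difference = {diff:.2f}, SE = {se_diff:.2f}, z = {z:.2f}, p = {p:.2f}")
```

The z statistic here is well under 2, which matches the conclusion that the decline is not clearly different from zero.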

There are more complicated ways to measure moderation, but this ad-hoc approach can be easily applied as you read other people’s work. The assumption of zero covariance for parameter estimates is not as big of a deal as it may seem. In large samples these covariances tend to be very small, and they are frequently negative. So even though we know that assumption is wrong, just pretending it is zero is not a terrible folly.

The second is where you have models predicting different outcomes. So going with our same example, say you have a model predicting property crime and a model predicting violent crime. Again, I will often see people make an equivalent mistake to the moderator scenario, and say that the effect of poverty is larger for property crime than for violent crime because one is statistically significant and the other is not.

In this case, if you have the original data, you actually can estimate the covariance between those two coefficients. The simplest way is to estimate that covariance via seemingly unrelated regression. If you don’t have the data though, such as when you are reading someone else’s paper, you can just assume the covariance is zero. Because the parameter estimates often have negative correlations, this assumption will make the standard error estimate of the difference smaller than it would be with the covariance term included.
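
To get a feel for how much the zero-covariance assumption matters when reading someone else’s paper, you can recompute the standard error of the difference under a few assumed correlations. The sketch below uses hypothetical estimates of 3 (1) and 2 (2) and hypothetical correlations of -0.3, 0, and 0.3; none of these values come from real data, and `se_of_difference` is just an illustrative helper name.

```python
import math

def se_of_difference(se1, se2, corr=0.0):
    """Standard error of (b1 - b2) when the two estimates have correlation corr."""
    cov = corr * se1 * se2  # Cov(A, B) = corr * SE(A) * SE(B)
    return math.sqrt(se1**2 + se2**2 - 2.0 * cov)

# Hypothetical estimates 3 (1) and 2 (2); the difference is 3 - 2 = 1
difference = 3.0 - 2.0
for corr in (-0.3, 0.0, 0.3):
    se = se_of_difference(1.0, 2.0, corr)
    print(f"corr = {corr:+.1f}: SE of difference = {se:.2f}, z = {difference / se:.2f}")
```

A negative correlation makes the standard error larger than the zero-covariance value, and a positive correlation makes it smaller.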

The third is where you have different subgroups in the data, and you examine the differences in coefficients. Say you had recidivism data for males and females, and you estimated an equation of the effect of a treatment on males and another model for females. So we have two models:

Model Males  : Prob(Recidivism) = B_0m + B_1m*Treatment
Model Females: Prob(Recidivism) = B_0f + B_1f*Treatment

Where the B_0? terms are the intercepts, and the B_1? terms are the treatment effects. Here is another example where you can stack the data and estimate an interaction term to estimate the difference in the effects and its standard error. So we can estimate a combined model for both males and females as:

Combined Model: Prob(Recidivism) = B_0c + B_1c*Treatment + B_2c*Female + B_3c*(Female*Treatment)

Where Female is a dummy variable equal to 1 for female observations, and Female*Treatment is the interaction term for the treatment variable and the Female dummy variable. Note that you can rewrite the model for males and females as:

Model Mal.: Prob(Recidivism) =     B_0c      +      B_1c    *Treatment    ....(when Female=0)
Model Fem.: Prob(Recidivism) = (B_0c + B_2c) + (B_1c + B_3c)*Treatment    ....(when Female=1)

So we can interpret the interaction term, B_3c, as the difference in the treatment effect for females relative to males. The standard error of this interaction takes into account the covariance term, unlike estimating two totally separate equations would. (You can stack the property and violent crime outcomes I mentioned earlier in an analogous way to the subgroup example.)
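
Here is a minimal sketch of the stacking idea in plain Python, fitting the equations as linear probability models by ordinary least squares. The cell counts are made up for illustration, and `invert`, `ols`, and `fit` are small helpers written for this sketch rather than functions from any stats package.

```python
import math

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(a)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(k)]
           for i, row in enumerate(a)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def ols(X, y):
    """Ordinary least squares; returns (coefficients, standard errors)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    return beta, [math.sqrt(sigma2 * inv[i][i]) for i in range(k)]

# Hypothetical counts per cell: (female, treatment, number recidivating, cell size)
cells = [(0, 0, 20, 40), (0, 1, 12, 40), (1, 0, 10, 40), (1, 1, 8, 40)]
# Expand to one row per person: (female, treatment, recidivism indicator)
people = [(f, t, 1.0 if i < recid else 0.0)
          for f, t, recid, size in cells for i in range(size)]

def fit(rows, interaction):
    """Regress recidivism on treatment, optionally adding Female and Female*Treatment."""
    X = [[1.0, t] + ([f, f * t] if interaction else []) for f, t, _ in rows]
    return ols(X, [r for _, _, r in rows])

b_m, _ = fit([p for p in people if p[0] == 0], interaction=False)  # males only
b_f, _ = fit([p for p in people if p[0] == 1], interaction=False)  # females only
b_c, se_c = fit(people, interaction=True)                          # stacked data

print(f"male effect   B_1m = {b_m[1]:.3f}")
print(f"female effect B_1f = {b_f[1]:.3f}")
print(f"interaction   B_3c = {b_c[3]:.3f} (SE {se_c[3]:.3f})")
```

With these made-up counts the male effect is -0.20, the female effect is -0.05, and the interaction B_3c is their difference, 0.15. The stacked model also gives the standard error for that difference directly.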

The fourth and final example is the simplest: two regression coefficients in the same equation. One example is from my dissertation, the correlates of crime at small spatial units of analysis. I test whether different places that sell alcohol (such as liquor stores, bars, and gas stations) have the same effect on crime. For simplicity I will just test two effects, whether liquor stores have the same effect as on-premise alcohol outlets (this includes bars and restaurants). So let’s say I estimate a Poisson regression equation as:

log(E[Crime]) = Intercept + b1*Bars + b2*LiquorStores

And then my software spits out:

                  B     SE      
Liquor Stores    0.36  0.10
Bars             0.24  0.05

And then let’s say we also have the variance-covariance matrix of the parameter estimates, which most stat software will return for you if you ask it:

                 Liquor_Stores  Bars
Liquor_Stores     0.01
Bars             -0.0002        0.0025

On the diagonal are the variances of the parameter estimates, and their square roots equal the standard errors reported in the first table. So the difference estimate is 0.36 - 0.24 = 0.12, and the standard error of that difference is sqrt(0.01 + 0.0025 - 2*(-0.0002)) =~ 0.11. You can take the ratio of the difference and its standard error, here 0.12/0.11 or about 1.1, and treat that as a test statistic from a normal distribution. So the rule that it needs to be beyond plus or minus two to be statistically significant at the 0.05 level applies, and here the difference is not statistically significant.
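
The same calculation can be written as a contrast of the coefficient vector, which is the general form of this test. The sketch below copies the estimates and the variance-covariance matrix exactly as printed in the two tables (so the covariance is -0.0002); the helper name `wald_contrast` is made up for illustration.

```python
import math

# Estimates and variance-covariance matrix as printed in the tables
# (order: Liquor Stores, Bars)
b = [0.36, 0.24]
V = [[0.01, -0.0002],
     [-0.0002, 0.0025]]

def wald_contrast(b, V, c):
    """Estimate, SE, z, and two-sided p for the linear combination c'b."""
    k = len(c)
    est = sum(ci * bi for ci, bi in zip(c, b))
    # Var(c'b) = c' V c, which for c = [1, -1] reduces to
    # Var(A) + Var(B) - 2*Cov(A, B)
    var = sum(c[i] * V[i][j] * c[j] for i in range(k) for j in range(k))
    se = math.sqrt(var)
    z = est / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return est, se, z, p

est, se, z, p = wald_contrast(b, V, c=[1.0, -1.0])
print(f"difference = {est:.2f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.2f}")
```

Writing the test as a contrast vector means the same function handles other linear combinations, for example c = [1, -2] to test whether one effect is twice the other.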

This is called a Wald test specifically. I will follow up with another blog post and some code examples on how to do these tests in SPSS and Stata. For completeness and just because, I also list two more ways to accomplish this test for the last example.

The first alternative is a likelihood ratio test.

So we have the full model as:

 log(E[Crime]) = b0 + b1*Bars + b2*Liquor_Stores [Model 1]

And we have the reduced model as:

 log(E[Crime]) = b4 + b5*(Bars + Liquor_Stores)  [Model 2]

So we just estimate the full model with Bars and Liquor Stores on the right hand side (Model 1), then estimate the reduced model (Model 2) with the sum of Bars + Liquor Stores on the right hand side. Then you can just do a chi-square test based on twice the change in the log-likelihood. In this case there is a change of one degree of freedom.
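
Below is a rough, self-contained Python sketch of this likelihood ratio test. The data are made up, and the Poisson models are fit with a small hand-rolled Newton-Raphson routine (`fit_poisson`, a helper written for this sketch) so that nothing beyond the standard library is needed; in practice you would use your stat package’s Poisson regression and read off its log-likelihood.

```python
import math

def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with pivoting."""
    k = len(b)
    m = [[float(v) for v in row] + [float(bi)] for row, bi in zip(a, b)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    x = [0.0] * k
    for i in reversed(range(k)):
        x[i] = (m[i][k] - sum(m[i][j] * x[j] for j in range(i + 1, k))) / m[i][i]
    return x

def loglik(X, y, beta):
    """Poisson log-likelihood with a log link."""
    total = 0.0
    for row, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, row))
        total += yi * eta - math.exp(eta) - math.lgamma(yi + 1)
    return total

def fit_poisson(X, y, iters=50, tol=1e-10):
    """Newton-Raphson for Poisson regression; returns (coefficients, log-likelihood)."""
    k = len(X[0])
    # Start at the intercept-only fit (the first column of X must be the constant)
    beta = [math.log(sum(y) / len(y))] + [0.0] * (k - 1)
    ll = loglik(X, y, beta)
    for _ in range(iters):
        mu = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]
        grad = [sum(row[i] * (yi - mi) for row, yi, mi in zip(X, y, mu))
                for i in range(k)]
        info = [[sum(row[i] * row[j] * mi for row, mi in zip(X, mu))
                 for j in range(k)] for i in range(k)]
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
        new_ll = loglik(X, y, beta)
        converged = abs(new_ll - ll) < tol
        ll = new_ll
        if converged:
            break
    return beta, ll

# Made-up counts for 12 places
bars = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 4]
liquor = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
crime = [2, 2, 3, 3, 2, 3, 4, 5, 3, 5, 5, 9]

X_full = [[1.0, b, l] for b, l in zip(bars, liquor)]   # Model 1
X_red = [[1.0, b + l] for b, l in zip(bars, liquor)]   # Model 2

_, ll_full = fit_poisson(X_full, crime)
_, ll_red = fit_poisson(X_red, crime)

lr = 2.0 * (ll_full - ll_red)
# Chi-square upper tail with 1 degree of freedom equals erfc(sqrt(lr / 2))
p = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
print(f"LR statistic = {lr:.3f}, p = {p:.3f}")
```

The closed-form p-value only works for one degree of freedom. Testing three or more coefficients at once needs a general chi-square tail function from a stats library.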

I give an example of doing this in R on Cross Validated. This test is nice because it extends to testing multiple coefficients, for example if I wanted to test bars = liquor stores = convenience stores. The prior individual Wald tests are not as convenient for testing the equality of more than two coefficients at once.

Here is another way though to have the computer more easily spit out the Wald test for the difference between two coefficients in the same equation. So if we have the model (lack of intercept does not matter for discussion here):

y = b1*X + b2*Z [eq. 1]

We can test the null that b1 = b2 by rewriting our linear model as:

y = B1*(X + Z) + B2*(X - Z) [eq. 2]

And the test for the B2 coefficient is our test of interest. The logic goes like this: we can expand [eq. 2] to be:

y = B1*X + B1*Z + B2*X - B2*Z [eq. 3]

which you can then regroup as:

y = X*(B1 + B2) + Z*(B1 - B2) [eq. 4]

and note the equalities between equations 4 and 1.

B1 + B2 = b1; B1 - B2 = b2

So B2 captures the difference between the two original coefficients, while B1 is the combined effect. B2 is a little tricky to interpret in terms of effect size for how much larger b1 is than b2, since it is only half of that difference (b1 - b2 = 2*B2). An easier way to estimate that effect size is to insert (X-Z)/2 into the right hand side in place of (X-Z). The coefficient on that term, along with its confidence interval, is then the estimate of how much larger the effect of X is than Z.

Note that this gives a result equivalent to conducting the Wald test by hand as I mentioned before.
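
That equivalence can be checked numerically. The sketch below fits both parameterizations by ordinary least squares on made-up data, using (X-Z)/2 in the second model so its coefficient is the full difference. An intercept is included in both models, which is an assumption of this sketch that does not change the logic, and `invert` and `ols` are small helpers written for illustration.

```python
import math

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(a)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(k)]
           for i, row in enumerate(a)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def ols(X, y):
    """Ordinary least squares; returns (coefficients, variance-covariance matrix)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    return beta, [[sigma2 * v for v in row] for row in inv]

# Made-up data
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
z = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
y = [5, 4, 9, 7, 12, 17, 11, 16, 16, 15]

# Original model: y = b0 + b1*X + b2*Z
b, V = ols([[1.0, xi, zi] for xi, zi in zip(x, z)], y)
diff_hand = b[1] - b[2]
se_hand = math.sqrt(V[1][1] + V[2][2] - 2.0 * V[1][2])

# Reparameterized model: y = B0 + B1*(X + Z) + B2*(X - Z)/2
B, W = ols([[1.0, xi + zi, (xi - zi) / 2.0] for xi, zi in zip(x, z)], y)
diff_reparam = B[2]
se_reparam = math.sqrt(W[2][2])

print(f"by hand:         diff = {diff_hand:.4f}, SE = {se_hand:.4f}")
print(f"reparameterized: diff = {diff_reparam:.4f}, SE = {se_reparam:.4f}")
```

The two lines should print the same difference and standard error up to floating point error, since the second model is just a linear re-expression of the first.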

21 Comments

by Andy Wheeler on October 19, 2016

Posted in Regression, scholarly

Tagged regression, scholarly

https://andrewpwheeler.com/2016/10/19/testing-the-equality-of-two-regression-coefficients/

Thom
/ May 26, 2017

This is a really clear summary. I’d also add that the reparameterization to b1 * (x1+x2)/2 and b2 * (x1-x2) is also sometimes useful for handling collinearity when you have two highly correlated predictors that are also capturing some nuanced distinction.

Kate
/ August 8, 2019

Hi Andrew, thanks so much for the explanation. I currently encounter a similar question: to test the equality of two regression coefficients from two different models but in the same sample. I need to test whether the cross-sectional effects of an independent variable are the same at two time points. Since the effects/regression coefficients may be correlated at the two time points, and I don’t know how to calculate their covariance, could you advise what to do?

- apwheele
  / August 8, 2019
  
  From your description you can likely stack the models and construct an interaction effect. So something like
  
  y_it = B0 + B1*(X) + B2*(Time Period = 2) + B3*(X*[Time Period = 2])
  
  Then the B3 effect is the difference in the X effect across the two time periods.
  
  (A complication of this is that you should account for correlated errors across the shared units in the two groups, such as via clustered standard errors or random/fixed effects for units.)
  
  If X does not change over the two time periods, you could do the SUR approach and treat the two time periods as different dependent variables, see https://andrewpwheeler.wordpress.com/2017/06/12/testing-the-equality-of-coefficients-same-independent-different-dependent-variables/.
  
  (Which is another way to account for the correlated errors across the models.)
  
  - liuchaoyu@gmail.com
    / August 8, 2019
    
    Thanks Andrew. Can the model also apply when the DV is measured at two different times but the IVs are the same across time? Say, can I use it to compare the predictive effect of parent educational level on children’s grades at year 1 with the prediction for year 2 grades, since the year 1 grade will definitely be correlated with year 2? Thanks again!
  - apwheele
    / August 9, 2019
    
    Just based on that description I would use a multi-level growth type model, with a random intercept for students. Then you just have the covariates as I stated.
Alexander Pachanov
/ January 25, 2021

Dear Andrew, thank you for your clear explanation. Can we apply the method in the 4th case (a Wald test) also to a Poisson GLMM? I would like to check if two coefficients are equal. If yes, is 2*pnorm() applicable for calculating a p-value?
Thank you in advance,
Alexander

- apwheele
  / January 25, 2021
  
  Yep it will be OK for a mixed model as well for the fixed coefficients or for the individual random effects, and the formula will be the same.
  
Samuel
/ October 30, 2024

Dear Andrew, thank you very much for your answer. I am a beginner, and I currently have two different X variables and one identical Y. I would like to compare whether there is a significant difference between the two regression coefficients. Could I use the Z-test for this? (The two X variables are related, similar to patent application and authorization counts.)

- apwheele
  / October 30, 2024
  
  Yes, that sounds like the exact scenario I describe in this post.
  
  - Samuel
    / December 17, 2024
    
    Hi Andrew! I would like to seek your advice on a related research question. Specifically, I am considering running regressions of variable Y on both the mean and the maximum of variable X, aiming to explain how Y is influenced by the overall level and the extreme level of X. Are there any potential statistical pitfalls with this approach? Would it be more appropriate to replace the mean with the median? Alternatively, do you have any other suggestions for addressing this issue? I look forward to your guidance.
  - apwheele
    / December 17, 2024
    
    There isn’t any way for me to know just based on the description. So imagine you had an equation of the micro level items, e.g.
    
    #1 y = B1*x_1 + B2*x_2 ….. + Bk*x_k
    
    You are asking: if I can only model summary statistics, is
    
    #2 y = B*(mean[X])
    
    #3 y = B*(max[X])
    
    Appropriate? Well, if each of the B coefficients is approximately the same in equation #1, then equation #2 will, I think, be correct.
    
    For #3 to be correct, it would need to be a more complicated function in #1. Say a step function (the B’s in #1 are 0 if below a threshold), and only one of the B’s can be positive.
  - Samuel
    / December 17, 2024
    
    I apologize for not describing my question clearly. For example, I am interested in exploring how an individual’s time spent learning computer skills is influenced by their peers. Specifically, I divide the independent variable into the average time and the maximum time spent by peers, as shown in the following models:
    
    y = B1*peer_mean + B2*x_2 ….. + Bk*x_k (1);
    
    y = B1*peer_max + B2*x_2 ….. + Bk*x_k.
    
    Would this approach be valid? Thank you for your response.
