How to interpret one-sided tests for coefficient differences?

In my ask me anything series, Rob Case writes in a question about interpreting one-sided tests for the difference in coefficients:

Mr. Wheeler,

Thank you for your page https://andrewpwheeler.com/2016/10/19/testing-the-equality-of-two-regression-coefficients/

I did your technique (at the end of the page) of re-running the model with X+Z and X-Z as independent variables (with coefficients B1 and B2, respectively).

I understand:

  1. (although you did not say so) that testing whether coefficient b1 (X’s coefficient in the original equation) is LESS THAN coefficient b2 (Z’s coefficient in the original regression) is a one-sided test; and testing whether one coefficient is DIFFERENT from another is a two-sided test
  2. that the 90%-confidence t-distribution-critical-values-with-infinite-degrees-of-freedom are 1.282 for one-sided tests and 1.645 for two-sided tests
  3. that if the resulting t-stat for the B2 coefficient is say 1.5, then—according to the tests—I should therefore be 90% confident that b1 is in fact less than b2; and I should NOT be 90% confident that b1 is different from b2.

But—according to MY understanding of logic and statistics—if I am 90% confident that b1 is LESS THAN b2, then I would be MORE THAN 90% confident that b1 DIFFERS from b2 (because “differs” includes the additional chance that b1 is greater than b2), i.e. the tests and my logic conflict. What am I doing wrong?

Rob

So I realize null hypothesis statistical testing (NHST) can be tricky to interpret – but the statement in 3 is not consistent with how we do NHST for several reasons.

So if we have a null hypothesis that Beta1 = Beta2, for reasons to do with the central limit theorem we actually rewrite this to be:

Null: Theta0 = Beta1 - Beta2 = 0

I’ve noted this new parameter we are testing – the difference in the two coefficients – as Theta0. For NHST we assume this parameter is 0, and then test to see how close our data is to this parameter. So we estimate with our data:

b1 - b2 = Diff
DiffZ = Diff/StandardError_Diff

Now, to calculate a p-value, we need to say how unlikely our data estimate, DiffZ, is given the assumed null distribution centered at Theta0. So imagine we draw our standard normal distribution curve about Theta0. This defines the space for NHST; for a typical two-sided test we have (here assuming DiffZ is a negative value):

P(Z < DiffZ | Theta0 ) + P(Z > -DiffZ | Theta0 ) = Two tailed p-value

Here the less-than and greater-than regions are how we partition the space under the null hypothesis, since the probability of observing any exact value of DiffZ is zero when the distribution of potential outcomes is continuous. For a one-sided test, you just take the relevant portion of the above, and do not add the two portions together:

P(Z < DiffZ | Theta0 ) = One tail p-value for Beta1 < Beta2
P(Z > -DiffZ | Theta0 ) = One tail p-value for Beta1 > Beta2
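To make this concrete with the t-stat of 1.5 from the question, here is a minimal R sketch of the arithmetic (using the normal approximation, consistent with the infinite-degrees-of-freedom critical values Rob cites):

# R arithmetic for a test statistic of 1.5 (normal approximation)
DiffZ <- 1.5
pnorm(-abs(DiffZ))     # one-tail p-value, ~0.067
2*pnorm(-abs(DiffZ))   # two-tail p-value, ~0.134

So with a test statistic of 1.5, the one-sided p-value is below 0.10 while the two-sided p-value is not, which is the apparent conflict in the question.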

Note here that the test is conditional on the null hypothesis. Statements such as ‘I should therefore be 90% confident that b1 is in fact less than b2’, which take the complement of the p-value (e.g. 1 – p-value) and interpret it as a meaningful probability about the hypotheses, are incorrect.

P-values are basically numerical summaries of how close the data are to the presumed null distribution. Small p-values just indicate the data are not close to the assumed null distribution. The complement of the p-value is not evidence for the alternative hypothesis; it is just the leftover probability of the null distribution inside the Z cutoffs.

Statisticians oftentimes at this point in the conversation suggest Bayesian analysis, and interpret posterior probabilities instead of p-values. I will stop here though, as I am not sure “90% confident” readily translates into a specific Bayesian statement. (It could be people are better off doing non-inferiority/equivalence testing, for example, i.e. changing the null hypothesis.)
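As a quick aside, here is a small R simulation (my own toy example, not from the original post) illustrating the X+Z and X-Z reparameterization Rob mentions: the coefficient on the X-Z term equals (b1-b2)/2, so testing it against zero is exactly the test that the two original coefficients are equal.

# Toy R example of the reparameterization trick
set.seed(10)
n <- 1000
x <- rnorm(n); z <- rnorm(n)
y <- 0.5*x + 0.7*z + rnorm(n)         # true b1 = 0.5, b2 = 0.7
summary(lm(y ~ x + z))                # original model
summary(lm(y ~ I(x + z) + I(x - z)))  # reparameterized model
# The I(x - z) coefficient estimates (b1 - b2)/2, so its two-sided p-value tests
# b1 = b2, and halving it (checking the sign) gives the one-sided test of b1 < b2.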

Marginal effects vs Wald tests (Stata)

Calli Cain, a criminologist from FAU, asks:

What is the best method to examine whether there are group differences (e.g., gender, race) in the effects of several variables on binary outcomes (using logistic regression)? For example – if you want to look at the gendered effects of different types of trauma experiences on subsequent adverse behaviors (e.g., whether participant uses drugs, alcohol, has mental health diagnosis, has attempted suicide). Allison (1999) cautions against using Equality of Coefficients tests to look at group differences between regression coefficients like we might with OLS regression. If you wanted to look at the differences between a lot of predictors (n= 16) on various outcomes (n=6) – what would be best way to go about it (I know using interaction terms would be good if you were just interested in say gender differences of one or two variables on the outcome). Someone recommended comparing marginal effects through average discrete changes (ADCs) or discrete changes at representative values (DCRs) – which is new to me. Would you agree with this suggestion?

When I am thinking about ‘should I use method X or method Y’ type problems, I think about the specific value I want to estimate first, and then work backwards to the best method to get that estimate. So if we are talking about binary endpoints such as uses drugs (I will go with binary for now, but my examples readily extend to, say, counts or real-valued outcomes), there are generally only two values people are interested in: an absolute change in probability, e.g. some input shifts the probability from 10% to 5%, or a relative risk change, e.g. X decreases the overall relative risk by 20% (shifting 5% to 4%).
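To make those two quantities concrete, the arithmetic (here in R, using the hypothetical numbers above) is simply:

# Absolute vs relative changes (hypothetical numbers from above)
0.05 - 0.10            # absolute risk change, a 5 percentage point drop
(0.04 - 0.05)/0.05     # relative risk change, a 20% relative reduction (5% to 4%)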

The former, absolute changes in probabilities, is best accomplished via various marginal effects or average discrete changes, as Calli says. For most CJ examples I am aware of, I think these make the most sense to focus on, as they translate more directly to costs and benefits. Focusing on the ratios sometimes makes sense, such as in case-control studies, if you want to extrapolate coefficient estimates to a very different sample, or if you are hyper focused on theory and the underlying statistical model.

I will show an example in Stata using simulated data to illustrate the differences. Stata is very nice for working with different types of marginal estimates. Here I generate data with three covariates plus a female/male indicator, and then some interactions. Note the covariate x1 has the same effect for males and females; x2 and x3 though have countervailing effects (for females they decrease the probability, for males they increase it).

* Stata simulation to show differences in Wald vs Margins
clear
set more off
set seed 10
set obs 5000
generate female = rbinomial(1,0.5)
generate x1 = rnormal(0,1)
generate x2 = rnormal(0,1)
generate x3 = rnormal(0,1)

* x1 same effect, x2/x3 different across groups
generate logit = -0.1 + -2.8*female + 1.1*(x1 + x2 + x3) + -1.5*female*(x2 + x3)
generate prob = 1/(1 + exp(-1*logit))
generate y = rbinomial(1,prob)
drop logit prob

I intentionally generated the groups so females/males have quite different baseline probabilities for the outcome y here – something that happens in real victim data in criminology.

* Check out marginal differences 
tabstat y, by(female)

So you can see males have the proportion of the outcome near 50% in the sample, whereas females are under 10%. So Calli is basically interested in the case below, where we estimate all pairwise interactions, so we have many coefficient differences to test on the right-hand side.

* Estimate model with interactions (linear coefficients)
logistic y i.female x1 x2 x3 i.female#(c.x1 c.x2 c.x3), coef

This particular model does the Wald tests for the coefficient differences just by the way we have set up the model. The interaction effects test for differences from the baseline (male) group, and you can see the interaction for x1 is not statistically significant, but x2/x3 are (as they should be). But if you are interested in the marginal effects, one place to start is with the derivatives directly, and for logit models Stata automatically reports these on the probability scale:

* Marginal effects 
margins female, dydx(x1 x2 x3)
* x1 is the same linear effect, but margins are quite different

So even though I made the underlying effect for x1 equal between males/females for the true underlying data generation process, you can see here the marginal derivative is much smaller for females. This is the main difference between Wald tests and margins.

This is ok though for many situations. Say x1 is a real-valued treatment, for example y is victimization in a sample of high-risk youth and x1 is hours given in a summer job program. We want to know the returns of expanding the program – here the returns are higher for males than females due to the different baseline probabilities of risk between the two groups. This is true even if the relative effect of summer job hours is the same between the two groups.
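To see why the same coefficient gives different marginal effects, recall that for a logit model the derivative of the probability with respect to x1 is beta1*p*(1-p). A rough back-of-the-envelope check in R, plugging in the approximate baseline proportions from the tabstat output above (not the exact margins output, which averages over the sample):

# Same beta, different baselines, different derivatives
beta1 <- 1.1                      # true x1 effect for both groups in the simulation
p_male <- 0.50; p_female <- 0.08  # approximate baseline probabilities
beta1*p_male*(1 - p_male)         # ~0.275
beta1*p_female*(1 - p_female)     # ~0.081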

Again Stata is very convenient, and we can test for the probability differences between males/females by adding the r. contrast operator in front of female in the margins command.

* can test difference contrast in groups
margins r.female, dydx(x1 x2 x3)

But the marginal derivative can be difficult to interpret – it is the average slope, but what does that mean exactly? So I like evaluating at fixed points of the continuous variable. Going back to our summer job hours example, you could evaluate going from 0 to 50 hours, or going from 50 to 100, or 0 to 100, etc. and see the average returns in terms of reductions in the probability of trauma. Here because I simulated the data to be a standard normal, I just go from -1 to 0 to 1:

* Probably easier to understand at particular x1 values
margins female, at(x1=(-1(1)1))

So that table is dense, but it says when x1=-1, females have a probability of y of 2%, and males have a probability of y of 32%. Now go up the ladder to x1=0: females have a probability of 6% and males have a probability of 48%. So that is a discrete change of 4 percentage points for females and 16 percentage points for males. If we want to generate an interval around that discrete change effect:

* Can test increases one by one
margins female, at(x1=(-1 0 1))   contrast(atcontrast(ar) effects marginswithin)

See, isn’t Stata’s margins command so nice! (For the above, it actually may make more sense to use margins, at(x1=(-1 0 1)) over(female) contrast(atcontrast(ar) effects). Without over(), margins fills in the specified values and averages the changes over the whole sample; with over(), it only does the averaging within each group.) And finally we can test the difference-in-difference, to see if the discrete changes for males vs females going from -1 to 0 to 1 are themselves significantly different:

* And can test increases of males vs females
margins r.female, at(x1=(-1(1)1)) contrast(atcontrast(ar))

So the increase for females is 13 percentage points smaller than the increase for males going from -1 to 0, etc.

So I have spent a lot of time on the probabilities so far. I find them much easier to interpret, and I do not care so much that they do not necessarily tell you whether the underlying DGP differs between males and females. But many people are interested in the odds ratios (say in case-control studies). Or when generalizing to a different sample, say this is a low-risk sample of females and you want to generalize the odds ratios to a higher-risk sample with a baseline closer to 50%, then looking at the odds ratio may make more sense.

So far I have not even gotten to Calli’s main point, how to test many of these effects for no differences at once. There I would just suggest the likelihood ratio test (which does not have the problem with Wald tests on the coefficients that the variance estimates may be off):

* Estimate restricted model
logistic y ibn.female c.x1 c.x2 c.x3, coef noconstant
estimates store restrict

* Another way to do the full interaction model
* More like separate male and female
logistic y ibn.female ibn.female#(c.x1 c.x2 c.x3), coef noconstant
estimates store full

* LRT test between models
lrtest restrict full

So here, as expected, one rejects the null that the restricted model fits the data as well as the full model. But this is pretty uninformative – I would rather just go to the more general model and quantify the differences.

So if you really want to look at the odds ratios, we can do that using lincom after estimating our full interaction model:

logistic y i.female ibn.female#(c.x1 c.x2 c.x3), coef noconstant

And here is an example post-estimation Wald test for equality:

lincom 0.female#c.x1 - 1.female#c.x1

You may ask where this odds ratio of 0.921 comes from. Well, way back in our first full model, the interaction term for female*x1 is 0.0821688, and exp(-0.0821688) equals that odds ratio and has the same p-value as in the original model I showed. So you can see that the x1 effect is the same across each group. But the same is not true when estimating the other contrasts:

lincom 0.female#c.x2 - 1.female#c.x2

And Stata defaults this to estimating a difference in the odds ratios (as far as I can tell you cannot tell Stata to report the contrast of the linear coefficients after logit; you would need to refit the model as glm y x, family(binomial) link(logit) to do the tests on the linear scale).

I honestly don’t know how to really interpret this – but I have been asked for it several different times by clients. Maybe they know better than me, but I think it is more that people just expect to be dealing with odds ratios after a logistic regression.

So we can coerce margins to give us odds ratios:

* For the odds ratios
quietly margins female, at(x3=(-1(0.1)1)) expression(exp(predict(xb)))
marginsplot , yscale(log) ylabel(0.125 0.25 0.5 1 2 4 8)

Or give us the differences in the odds ratio:

* For the contrast in the OR
quietly margins r.female, at(x3=(-1(0.1)1)) expression(exp(predict(xb)))
marginsplot

(Since the difference is a negative number, it cannot be drawn on a log scale.) But again I find it much easier to wrap my head around probabilities:

* For the probabilities
quietly margins female, at(x3=(-2(0.1)2))
marginsplot

So here as x3 increases, the probability increases for males and decreases for females. The female decrease is smaller in absolute terms due to the smaller baseline risk for females.

So while Calli’s original question was how to do this reasonably for many different contrasts, I would prefer the actual empirical estimates of the differences. Doing a single contrast among a small number of representative values over many variables and placing in a table/graph I think is the best way to reduce the complexity.

I just don’t find the likelihood ratio tests for all equalities that informative, and for large samples they will almost always say the more flexible model is better than the restricted model.

We estimate models to actually look at the quantitative values of those estimates, not to do rote hypothesis testing.

Some microsynth notes

Nate Connealy, a criminologist colleague of mine heading to Tampa, asks:

My question is from our CPP project on business improvement districts (Piza, Wheeler, Connealy, Feng 2020). The article indicates that you ran three of the microsynth matching variables as an average over each instead of the cumulative sum (street length, percent new housing structures, percent occupied structures). How did you get R to read the variables as averages instead of the entire sum of the treatment period of interest? I have the microsynth code you used to generate our models, but cannot seem to determine how you got R to read the variables as averages.

So Nate is talking about this paper, Crime control effects of a police substation within a business improvement district: A quasi-experimental synthetic control evaluation (Piza et al., 2020), and here is the balance table in the paper:

To be clear to folks, I did not balance on the averages, but simply reported the table in terms of averages. So here is the original readout from R:

So I just divided those noted rows by 314 to make them easier to read. If you wanted microsynth to match on the averages instead, you could divide the values by the total number of treated units in the original data. Example below (this is R code; see the microsynth library and the paper by Robbins et al., 2017):

library(microsynth)
#library(ggplot2) #not loading here, some issue
set.seed(10)

data(seattledmi) #just using data in the package
cs <- seattledmi
# calculating proportions
cs$BlackPerc <- (cs$BLACK/cs$TotalPop)*100
cs$FHHPerc <- (cs$FEMALE_HOU/cs$HOUSEHOLDS)*100
# replacing NAs (e.g. from divide by zero population) with 0
cs[is.na(cs)] <- 0

cov.var <- c("TotalPop","HISPANIC","Males_1521","FHHPerc","BlackPerc")
match.out <- c("i_felony", "i_misdemea")

sea_prop <- microsynth(cs, 
                       idvar="ID", timevar="time", intvar="Intervention", 
                       start.pre=1, end.pre=12, end.post=16, 
                       match.out.min=match.out,match.out=FALSE,
                       match.covar=FALSE,check.feas=FALSE,
                       match.covar.min=cov.var, 
                       result.var=match.out)

summary(sea_prop) # balance table

And here you can see that we are matching on the cumulative sums for each of the areas, but we can divide our covariates by the number of treated units, and we will match on the proportional values.

# Can divide by 39 (the number of treated units) to match on averages instead
cs[,cov.var] <- cs[,cov.var]/39

sea_div <- microsynth(cs, 
                      idvar="ID", timevar="time", intvar="Intervention", 
                      start.pre=1, end.pre=12, end.post=16, 
                      match.out.min=match.out,match.out=FALSE,
                      match.covar=FALSE,check.feas=FALSE,
                      match.covar.min=cov.var, 
                      result.var=match.out)

summary(sea_div) # balance table

Note that these do not result in the same weights. If you look at the results you will see the treatment effects are slightly different. Also if you do:

# Showing weights are not equal
all.equal(sea_div$w$Weights,sea_prop$w$Weights)

It does not return TRUE. Honestly I am not familiar enough with the procedure microsynth uses to do the matching (raking survey weights) to know if this is due to stochastic stuff or due to how the weighting algorithm works (I would have thought a linear rescaling would not make a difference, but I was wrong).

On the bucket list is a matching algorithm that returns geographically contiguous areas and gives all the weights values of 1 (so it creates comparable neighborhoods), instead of estimating raking weights. It may be 5 years though before I get around to that. Gio has a nice map showing that the way the weights work now, they may be spread all over the place (Circo et al., 2021) – I am not sure that is a good thing though.

But I did want to share some functions I used for the paper I worked on with Nate. First, if you use the permutation approach, the function prep_synth below returns some of the data in a nicer format to make graphs and calculate your own stats:

# Function to scoop up the data nicely
prep_synth <- function(mod){
    #Grab the plot data
    plotStats <- mod[['Plot.Stats']]
    #For the left graph
    Treat <- as.data.frame(t(plotStats$Treatment))
    Treat$Type <- "Treat"
    #This works for my data at years, will not 
    #Be right for data with more granular time though
    Treat$Year <- as.integer(rownames(Treat))
    Cont <- as.data.frame(t(plotStats$Control))
    Cont$Type <- "Control"
    Cont$Year <- as.integer(rownames(Cont))
    AllRes <- rbind(Treat,Cont)
    #For the right graph
    Perm <- as.data.frame(t(as.data.frame(plotStats$Difference)))
    SplitStr <- t(as.data.frame(strsplit(rownames(Perm),"[.]")))
    colnames(SplitStr) <- c("Type","Year")
    rownames(SplitStr) <- 1:nrow(SplitStr)
    SplitStr <- as.data.frame(SplitStr)
    Perm$Type <- as.character(SplitStr$Type)
    Perm$Year <- as.integer(as.character(SplitStr$Year))
    Perm$Group <- ifelse(Perm$Type == 'Main','Treatment Effect','Permutations') 
    #Reordering factor levels for plots
    AllRes$Type <- factor(AllRes$Type,levels=c('Treat','Control'))
    levels(AllRes$Type) <- c('Treated','Synthetic Control')
    Perm$Group <- factor(Perm$Group,levels=c('Treatment Effect','Permutations'))
    #Exporting result
    Res <- vector("list",length=2)
    Res[[1]] <- AllRes
    Res[[2]] <- Perm
    names(Res) <- c("AggOutcomes","DiffPerms")
    return(Res)
}

It works for the prior model runs, but I really made these functions for when you use permutations to get the errors. (In the microsynth example it is easier to work with permutations than in the state-level example for synth, for which I think conformal prediction intervals make more sense; see De Biasi & Circo, 2021 for a recent real example with micro place-based data though.)

# Takes like 1.5 minutes
sea_perm <- microsynth(seattledmi, 
                      idvar="ID", timevar="time", intvar="Intervention", 
                      start.pre=1, end.pre=12, end.post=16, 
                      match.out.min=match.out,match.out=FALSE,
                      match.covar=FALSE,check.feas=FALSE,
                      match.covar.min=cov.var, 
                      result.var=match.out, perm=99)

res_prop <- prep_synth(sea_perm)
print(res_prop)

So the dataframe in the first slot is the overall treatment effect, and the second dataframe is a nice stacked version for the permutations. First, I really do not like the percentage change (see Wheeler, 2016 for the most direct critique, but I have a bunch on this site). So I wrote code to translate the treatment effects into crime count reductions instead of the percent change stuff.

# Getting the observed treatment effect on count scale
# vs the permutations

agg_fun <- function(x){
    sdx <- sd(x)
    minval <- min(x)
    l_025 <- quantile(x, probs=0.025)
    u_975 <- quantile(x, probs=0.975)
    maxval <- max(x)
    totn <- length(x)
    res <- c(sdx,minval,l_025,u_975,maxval,totn)
    return(res)
}

treat_count <- function(rp){
    # Calculating the treatment effect based on permutations
    keep_vars <- !( names(rp[[2]]) %in% c("Year","Group") )
    out_names <- names(rp[[2]])[keep_vars][1:(sum(keep_vars)-1)]
    loc_dat <- rp[[2]][,keep_vars]
    agg_treat <- aggregate(. ~ Type, data = loc_dat, FUN=sum)
    n_cols <- 2:dim(agg_treat)[2]
    n_rows <- 2:nrow(agg_treat)
    dif <- agg_treat[rep(1,max(n_rows)-1),n_cols] - agg_treat[n_rows,n_cols]
    dif$Const <- 1
    stats <- aggregate(. ~ Const, data = dif, FUN=agg_fun)
    v_names <- c("se","min","low025","up975","max","totperm")
    long_stats <- reshape(stats,direction='long',idvar = "Const", 
                      varying=list(2:ncol(stats)),
                      v.names=v_names, times=out_names)
    # Add back in the original stats
    long_stats <- long_stats[,v_names]
    rownames(long_stats) <- 1:nrow(long_stats)
    long_stats$observed <- t(agg_treat[1,n_cols])[,1]
    long_stats$outcome <- out_names
    ord_vars <- c('outcome','observed',v_names)
    return(long_stats[,ord_vars])
}

treat_count(res_prop)

So that is the cumulative total effect of the intervention. This is more similar to the WDD test (Wheeler & Ratcliffe, 2018), but since the pre-time period is matched perfectly, it is just the difference in the post time periods. And here it uses the permutations to estimate the error, not any Poisson approximation.
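For comparison, here is a rough R sketch of the WDD estimator as I understand it from Wheeler & Ratcliffe (2018), a difference-in-differences of counts with a Poisson variance approximation. The counts below are made up just for illustration; the microsynth approach above uses permutations for the error instead.

# Sketch of the WDD test (Poisson approximation, illustrative counts only)
wdd <- function(treat_pre, treat_post, cont_pre, cont_post){
    est <- (treat_post - treat_pre) - (cont_post - cont_pre)
    se  <- sqrt(treat_post + treat_pre + cont_post + cont_pre)
    c(estimate = est, se = se, z = est/se)
}
wdd(treat_pre = 100, treat_post = 80, cont_pre = 95, cont_post = 100)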

But I often see folks concerned about the effects further out in time for synthetic control studies. So here is a graph that just looks at the instant effects for each time period, showing the difference via the permutation lines:

# GGPLOT graphs, individual lines
library(ggplot2)
perm_data <- res_prop[[2]]
# Ordering factors to get the treated line on top
perm_data$Group <- factor(perm_data$Group, c("Permutations","Treatment Effect"))
perm_data$Type <- factor(perm_data$Type, rev(unique(perm_data$Type)))
pro_perm <- ggplot(data=perm_data,aes(x=Year,y=i_felony,group=Type,color=Group,size=Group)) + 
            geom_line() +
            scale_color_manual(values=c('grey','red')) + scale_size_manual(values=c(0.5,2)) +
            geom_vline(xintercept=12) + theme_bw() + 
            labs(x=NULL,y='Felony Difference from Control') + 
            scale_x_continuous(minor_breaks=NULL, breaks=1:16) + 
            scale_y_continuous(breaks=seq(-10,10,2), minor_breaks=NULL) +
            theme(panel.grid.major = element_line(linetype="dashed"), legend.title= element_blank(),
            legend.position = c(0.2,0.8), legend.background = element_rect(linetype="solid", color="black")) +
            theme(text = element_text(size=16), axis.title.y=element_text(margin=margin(0,10,0,0)))

And I also like looking at the cumulative effects as well, where you can see the permutation lines widen over time.

# Cumulative vs Pointwise
perm_data$csum_felony <- ave(perm_data$i_felony, perm_data$Type, FUN=cumsum)
pro_cum  <- ggplot(data=perm_data,aes(x=Year,y=csum_felony,group=Type,color=Group,size=Group)) + 
              geom_line() +
              scale_color_manual(values=c('grey','red')) + scale_size_manual(values=c(0.5,2)) +
              geom_vline(xintercept=12) + theme_bw() + 
              labs(x=NULL,y='Felony Difference from Control Cumulative') + 
              scale_x_continuous(minor_breaks=NULL, breaks=1:16) + 
              scale_y_continuous(breaks=seq(-20,20,5), minor_breaks=NULL) +
              theme(panel.grid.major = element_line(linetype="dashed"), legend.title= element_blank(),
              legend.position = c(0.2,0.8), legend.background = element_rect(linetype="solid", color="black")) +
              theme(text = element_text(size=16), axis.title.y=element_text(margin=margin(0,10,0,0)))

If you do a ton of permutations (say 999 instead of 99), it would likely make more sense to do fan chart type error bars and show bands for different percentiles instead of each individual line (Yim et al., 2020).
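A sketch of what that fan chart could look like, building on the perm_data frame above (the particular percentile bands are just my own illustration):

# Fan chart sketch: percentile bands of the permutation distribution per Year
perms <- perm_data[perm_data$Group == "Permutations",]
treat <- perm_data[perm_data$Group == "Treatment Effect",]
bands <- do.call(rbind, lapply(split(perms, perms$Year), function(d){
    q <- quantile(d$i_felony, probs=c(0.05,0.25,0.75,0.95))
    data.frame(Year=d$Year[1], lo90=q[1], lo50=q[2], hi50=q[3], hi90=q[4])
}))
ggplot() +
    geom_ribbon(data=bands, aes(x=Year, ymin=lo90, ymax=hi90), alpha=0.2) +
    geom_ribbon(data=bands, aes(x=Year, ymin=lo50, ymax=hi50), alpha=0.4) +
    geom_line(data=treat, aes(x=Year, y=i_felony), color='red', size=1.5) +
    geom_vline(xintercept=12) + theme_bw() +
    labs(x=NULL, y='Felony Difference from Control')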

I will need to slate a totally different blog post to discuss instant vs cumulative effects for time series analysis. I have been peer-reviewing quite a few time series analyses of Covid and crime changes – most everyone only focuses on instant changes and does not calculate cumulative changes. See for example the estimates of excess deaths for the Texas winter storm power outage (Aldhous et al., 2021). Folks could do similar analyses for short-term crime interventions. Jerry has a good example of using the CausalImpact package to estimate cumulative effects for a gang takedown intervention (Ratcliffe et al., 2017), for one criminal justice example I am familiar with.

Again, folks, feel free to ask me anything. I may not always be able to do as deep a dive as this, but always feel free to reach out.


Using the Google Places API in criminology research?

In my ask me anything series, Thom Snaphaan, a criminologist at Ghent University, writes in with this question (slightly edited by me):

I read your blog post on using the Google Places API for criminological research. I am interested in using these data in the context of my PhD research. Can I ask you some questions on this matter? We think Google Places might be a very rich data source, specifically the user ratings of places. (1) Is it allowed to use these data on a large scale (two large cities) for scientific research? (2) Is it possible to download a set without the limit of 1,000 requests per day? (3) Are there, in your experience, other (perhaps more interesting) data sources to conduct this study? Many thanks! Best, Thom

And here are my responses to Thom:

For 1) I believe it is OK to use for research purposes. You are not allowed to download the data and resell it though.

For 2) The quotas for the Places API are much larger now; you get $200 in credit per month, which amounts to 100,000 API calls. So that should be sufficient even for a large city.

For 3) I do not know; I haven’t paid much attention to the different online apps that do user reviews. Here in the States we have another service called Yelp (mostly for restaurants); I am not sure if that has more reviews or not though.

One additional piece of information not commonly used in place-based research (though I have seen it used some, e.g. Hipp, 2016; Perezin-Askey, 2018) is the number of employees or sales volume at particular crime generators/attractors. This is not available via Google, but is via Reference USA or Lexis Nexis. For Dallas, IIRC, Reference USA had much better coverage (almost twice as many businesses), but I recently reviewed a paper that did boots-on-the-ground validation of Google data in the Indian city of Chennai, and the validation for Google businesses was very high (Kuralarasan & Bernasco, 2021).

Answer in the comments if you think you have more helpful information on leveraging place-based user reviews in research projects.


In the past I have written about using various Google APIs, which I have used in my research for several different projects.

Google has new pricing now, where you get $200 in credits per month per API. But overall for the Places and the Street View APIs you get a crazy ton of potential calls, so they will work for most research projects. Looking it over, I actually don’t think I have used Google Places data in any projects; in Wheeler & Steenbeek (2021) I use Reference USA and some other sources.

Geocoding and distance API limits are tougher; I ended up accidentally charging myself ~$150 for my work with Gio on gunshot fatalities (Circo & Wheeler, 2021), calculating network distances and approximate drive times. The Vision API limit is also quite low (1,000 per month), so you will need to budget/plan if you need those services for your project. For geocoding you should be able to find alternatives, like the census geocoder (R, python), and then only use Google for the leftovers.
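For example, a minimal sketch using the tidygeocoder package in R (which wraps the free Census geocoder, among others); the package choice and the example address here are my own assumptions, not something from the original workflow:

# Sketch: geocode via the free Census service first, save Google for leftovers
library(tidygeocoder)
addr <- data.frame(address = "1600 Pennsylvania Ave NW, Washington, DC 20500")
geocode(addr, address = address, method = "census")
# rows the Census geocoder cannot match could then be sent to the Google API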

References

  • Circo, G. M., & Wheeler, A. P. (2021). Trauma Center Drive Time Distances and Fatal Outcomes among Gunshot Wound Victims. Applied Spatial Analysis and Policy, 14(2), 379-393.
  • Hipp, J. R. (2016). General theory of spatial crime patterns. Criminology, 54(4), 653-679.
  • Kuralarasan, K., & Bernasco, W. (2021). Location Choice of Snatching Offenders in Chennai City. Journal of Quantitative Criminology, Online First.
  • Perezin-Askey, A., Taylor, R., Groff, E., & Fingerhut, A. (2018). Fast food restaurants and convenience stores: Using sales volume to explain crime patterns in Seattle. Crime & Delinquency, 64(14), 1836-1857.
  • Wheeler, A. P., & Steenbeek, W. (2021). Mapping the risk terrain for crime using machine learning. Journal of Quantitative Criminology, 37(2), 445-480.