Jane Jacobs on Neighborhoods

Google Doodle on 5/4/16 – Jane Jacobs’ 100th birthday.

Continuing on with my discussion of neighborhoods, it seems apt to talk about how Jane Jacobs defines neighborhoods in her book The Death and Life of Great American Cities. She is often given perfunctory citations in criminology articles for the idea that mixed-zoned neighborhoods (those with both residential and commercial zoning intermingled) increase “eyes on the street” and thus can reduce crime. I will write another post more fully about informal social control and how her ideas fit in, but here I want to focus on her conception of neighborhoods.

She asks the question, what do neighborhoods accomplish – and from this attempts to define what a neighborhood is. She feels a neighborhood can only be defined in terms of having actual political capital to use for its constituents – so a neighborhood is any region that can be organized enough to act as an independent polity and campaign against the larger government in which it is nested. This is quite different from the current conception of neighborhoods in most social sciences – in which we assume neighborhoods exist, calculate their effects on behavior, and then tautologically say they exist when we find they do affect behavior.

Based on her polity idea, Jacobs defines three levels of possible neighborhoods:

  • your street
  • the district – a greater area of around 100,000 persons
  • the entire city

This again is quite different from most current social science practice. Census tracts and block groups are larger swaths than just one block. In very dense cities like NYC, census block groups are often the square defined by streets on each side, so they split apart people living across the street from one another. Census tracts are intentionally drawn to contain around 4,000 people (the Census Bureau’s target range is 1,200 to 8,000). For a counter-example, the criminologist Ralph Taylor does treat street blocks as neighborhood units. A hearsay paraphrase of his I recently heard was “looking out your front door, all that matters is how far you can see down the street in either direction”.

I think Jacobs’ guesstimate of 100,000 may be partly biased by only thinking about giant megalopolises. Albany itself has only around 100,000 residents – so per her definition it is either streets or the whole city. Although I can’t argue much with the point that smaller areas have little to no political capital to accomplish specific goals.

In some of the surveys I have participated in, respondents were asked how they define their neighborhood. Here are the responses for two different samples (from the same city at two different time points) as an example:

I think the current approach – that neighborhoods are defined by their coefficients on the right-hand side of regression models – is untenable in the end, based on ideas that are derivative of the work in my dissertation. Mainly, that discrete neighborhood boundaries are a fiction of social scientists.

Posting my peer reviews on Publons and a few notes about reviewing

Publons is a service that curates your peer review work. In a perfect world this would be done by publishers – they would just forward your reviews with some standard meta-data. This would be useful when one paper is reviewed multiple times, as well as for identifying good vs. poor reviewers. I forget where I saw the suggestion recently (maybe on the orgtheory or scatterplot blog), but someone mentioned it would be nice if, when you submit your paper to a different journal, you could forward your previous reviews at your discretion. I wouldn’t mind that at all, as often the best journals will reject for lesser reasons because they can be more selective. (Also I’ve gotten copy-paste identical reviews from different journals, even though I had updated the draft to address some of the comments. Forwarding would allow me to address those comments directly before the revise-resubmit decision.)

I’ve posted all of my reviews so far, but they are only public if the paper is eventually accepted. So here you can see my review for the recent JQC article Shooting on the Street: Measuring the Spatial Influence of Physical Features on Gun Violence in a Bounded Street Network by Jie Xu and Elizabeth Griffiths.

I’ve done my fair share of complaining about reviews before, but I don’t think the whole peer-review process is fatally flawed despite its low reliability. People take peer review a bit too seriously at times – but that is a problem for most academics in general. Even if you think your idea is not getting a fair shake, just publish it yourself on your website (or places like SSRN and ArXiv). This of course does not count towards things like tenure – but valuing quantity over quality is another separate problem currently in academia.

In the spirit of do unto others as you would have them do unto you, here are two main points I try to abide by when I review papers.

  • be as specific as possible in your critique

There is nothing more frustrating than getting a vague critique (e.g. “the paper has multiple misspellings and grammar issues”). A frequent one I have come across (both in reviews of my own papers and in comments other reviewers have made on papers I’ve reviewed) concerns the framing of the paper – a.k.a. the literature review. (Which makes sending the paper to multiple journals so frustrating – you will always get new, arbitrary framing debates each time with new reviewers.)

So for a few examples:

  • (bad) The literature review is insufficient
  • (good) The literature review skips some important literature, see specifically (X, 2004; Y, 2006; Z, 2007). The description of (A, 2000) is awkward/wrong.
  • (bad) The paper is too long, it can be written in half the length
  • (better) The paper could be shortened, section A.1 can be eliminated in my opinion, and section A.2 can be reduced to one paragraph on X.

Being specific provides a clear path for the author to correct what you think, or at least respond if they disagree. The “you can cut the paper in half” comment I don’t even know how to respond to effectively, nor the generic complaint about spelling. One I’ve gotten before is “x is an inappropriate measure” with no other detail. This is tricky because I have to guess why you think it is inappropriate, so I have to make your argument for you (mind read) and then respond why I disagree (which obviously I do, or I wouldn’t have included that measure to begin with). So to respond to the critique I first have to critique my own paper – maybe this reviewer is more brilliant than I thought.

Being specific, I also think, helps cut down on arbitrary complaints that are merely matters of taste.

  • provide clear signals to the editor, both about main critiques and the extent to which they can be addressed

Peer review has two potential functions: one is gate-keeping, and the other is improving the draft. Oftentimes arbitrary advice intended by reviewers for the latter is not clearly delineated in the review, so it is easily confused for evidence pertinent to the gate-keeping function. I’ve gotten reviews of 20 bullet points or 2,000 words that make the paper seem poor due to the sheer length of the comments, when the majority are minor points or arbitrary suggestions. Longer reviews actually suggest the paper is better – if there is something clearly wrong you can say it in a much shorter space.

Gabriel Rossman states these different parts of peer review a bit more succinctly than me:

You need to adopt a mentality of “is it good how the author did it” rather than “how could this paper be made better”

I think this is a good quip to follow. I might add “don’t sweat the small stuff” to that as well. Some editors will read the paper and reviews and make judgement calls – but some editors just follow the reviewers blindly – so I worry that a 20 bullet point review of minor issues unduly influences reject decisions. I’m happy to respond to the bullets, and happy you took the time, but I’m not happy about you (the reviewer) not giving clear advice to the editor about the extent to which I can address those points.

I still give advice about improving the manuscript, but I try to provide clear signals to the editor about main critiques, and I will also explicitly state whether they can be addressed. The “can be addressed” part is not for the person writing the paper – it is for the editor making the decision whether to allow a revise-and-resubmit! In my experience the main critiques will entail 2-3 main points (or none at all for some papers). I also typically say when things are minor and put them in a separate section, which editors can pretty much ignore.

Being a quantitative guy, the reviews that frustrate me the most are complaints about model specifications. Some are legitimately major complaints, but oftentimes they concern things that are highly unlikely to greatly influence the reported results. Examples are adding/dropping/changing a particular control variable, or changing the caliper used for propensity score matching. Note I’m not saying you shouldn’t ask to see the differences – I’m asking that you clearly articulate why your suggestion is preferable, and make an appropriate judgement as to whether it is a big problem or a little problem. A similar complaint is about what information to include in tables, or in the main manuscript versus the appendix. The author already has the information, so it is minor editing, not a major problem.


While I am here I will end with three additional complaints that don’t fit anywhere previously in my post. One, multiple rounds of review are a total waste. The life cycle of the paper review should be:

paper -> review -> editor decision -> reject/accept
                                      or
                                      revise-resubmit -> new paper + responses to reviews -> editor decision

The way the current system works, I have to submit another review after the revised paper has been submitted. I would rather the editor take the time to see if the authors sufficiently addressed the original complaints, because as a reviewer I am not an unbiased judge of that. If I say something is important and you retort that it is not, what else do you want me to say in my second review! It is the editor’s job at that point to arbitrate disagreements. This would stop the cycle of multiple rounds of review, which have sharply diminishing returns for improving the draft.

This leads into my second complaint, which is generally about keeping a civil tone in reviews. In general I don’t care if a reviewer is a bit gruff in the review – it is not personal. But since reviewers have a second go, when I respond I need to keep an uber-deferential tone. I don’t think that is really necessary, and I’d rather original authors had similar latitude to be gruff in their responses. Reviewers say stupid things all the time (myself included) and you should be allowed to retort that my critique is stupid! (Of course with reasoning as to why it is stupid…)

Finally, I had one reviewer recently start with:

This paper is interesting and very well written…I will not focus on the paper’s merits, but instead restrict myself to areas where it can be improved.

The good is necessary to signal to the editor whether a paper should be published. In my own reviews I’ve started to try to include more of the good (which is definitely not the norm) and to argue why a paper should be published. You can see in my linked review of the Xu and Griffiths paper that by the third round I simply gave arguments for why the paper should be published, despite a disagreement about the change-point model they reported in the paper.

Presentation at ACJS 2016

I will be presenting at the ACJS (Academy of Criminal Justice Sciences) conference in Denver in a few days. My talk will be on some of the work I have been conducting with the Albany Police Department via the Finn Institute (Rob Worden and Sarah McLean are co-authors on the presentation). The title is Making stops smart: Predicting arrest rates from discretionary police stops at micro places in Albany, NY and here is the abstract:

Police stops are one of the most invasive, but regularly used crime control tactics by police. Similar to how focusing police resources at hot spots of crime can improve police efficiency, here we examine the spatial variation in arrest rates at micro places (street segments and intersections) in Albany, NY. Using data from over 240,000 discretionary police stops, we fit random effects logistic regression models to predict the probability of an arrest at different micro places. We show that like hot spots, there are examples of high arrest rate locations next to low arrest rate locations. Using a simulation, we show that if one displaced stops from low arrest locations to high arrest locations, one could make half as many stops but still have the same number of total arrests.
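
Since the displacement result is the interesting bit, here is a toy sketch of that logic in Python with made-up numbers. This is not the Albany data or the actual analysis code, and the assumption that a place can absorb at most double its observed stops is mine for illustration only.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1000
    stops = rng.poisson(40, n) + 1             # observed stops per micro place
    rate = rng.beta(2, 20, n)                  # arrest probability per stop
    baseline = float(np.sum(stops * rate))     # expected arrests, status quo

    # Displace stops toward the highest hit rate places, assuming the rates
    # stay constant and each place can absorb at most double its observed
    # stops (both strong assumptions).
    cap = 2 * stops
    new_stops, need = 0.0, baseline
    for i in np.argsort(rate)[::-1]:
        s = min(cap[i], need / rate[i])
        new_stops += s
        need -= s * rate[i]
        if need <= 0:
            break

    print(f"status quo: {stops.sum()} stops for {baseline:.0f} expected arrests")
    print(f"displaced:  {new_stops:.0f} stops for the same expected arrests")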

Here is a funnel chart of the arrest hit rates at micro places across the city. You can see quite a bit of extra variation in arrest rates left to explain.
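
For those unfamiliar with funnel charts, the funnel is just binomial limits around the citywide hit rate, drawn as a function of the number of stops at a place. A minimal sketch, again with a made-up overall rate:

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.stats import binom

    overall = 0.09                   # citywide arrest hit rate (made up)
    n_stops = np.arange(10, 500)
    # 2.5% and 97.5% binomial limits around the overall rate
    lower = binom.ppf(0.025, n_stops, overall) / n_stops
    upper = binom.ppf(0.975, n_stops, overall) / n_stops

    plt.plot(n_stops, lower, 'k--')
    plt.plot(n_stops, upper, 'k--')
    plt.xlabel('number of stops at the place')
    plt.ylabel('arrest hit rate')
    plt.show()

Places plotted outside the dashed funnel show more variation in their hit rates than chance alone would produce.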

I am giving this talk at 8 am on Thursday (see Event #185 in the program).

At the moment there are two other presentations in the session (Ling Wu is not going to make it), and they are:

  • Results from a victim generated crime mapping software, Zavin Nazaretian et al. – Indiana University of PA
  • Spatial analysis of aggravated assault and homicide crime scene, arrest and offender residence locations in Houston, TX, Elishewah Weisz – Sam Houston

So if you are interested in crime mapping stuff it should be a good session.

Feel free to bug me if you see me around at ACJS.

Also before I forget, my co-workers are presenting a poster on analysis of Syracuse Truce – a focused deterrence gang intervention. The posters are on Friday, so I won’t be around unfortunately. The title is Gangs, groups, networks, and deterrence: An evaluation of Syracuse Truce. (See poster #45 in the same program I linked to earlier.) Rob and Kelly will at least be manning the poster though – so you can go and bug them about the details!

Here is a picture of the reach of call-ins for one particular gang. The idea is for those who attended call-ins to spread the message to other members. So this graph evaluates how well the call-ins would be expected to reach all of the members of the gang.

If you are wondering what I do for my job – yes I pretty much just make maps and graphs all day long 😉

Dropbox links blocked in China – just email me

A bit of material I share on the blog is hosted via Dropbox. A current student of mine mentioned that Dropbox links are blocked in China. So when he was home on break he did not have access to materials I linked to on my site. He mentioned Dropbox links have been blocked for around a year (although – his words – YouTube and Facebook have been blocked for years).

If I post something on the blog, but you do not have access to it, always feel free to send me an email. I will send the materials directly. My email is available via my About page, or on my CV.

It is unfortunate Dropbox links are blocked in China, but I see no reason to change services because of this – since any service I could switch to would potentially share the same fate as Dropbox.

License plate readers and the trade-off in privacy

As a researcher in criminal justice, tackling ethical questions is a difficult task. There are no hypotheses to test, nor models to fit, just opinions bantering around. I figured I would take my best shot at writing some coherent thoughts on the topic of the data police collect and its impact on personal privacy – and my blog is really the best outlet.

What prompted this is a recent Nick Selby post which suggested the use of license plate readers (LPRs) to target Johns in LA is one of the worst ideas ever and a good example of personal privacy invasion by law enforcement. (Also see this Washington Post opinion article.)

I have a bit of a different, more neutral take on the program, and will try to articulate some broader themes in personal privacy invasion and the collection/use of data by police. I think it is an important topic, and it will continue to be with the continual expansion of public sensor data being collected by the police (body worn cameras, stationary cameras, cell phone data, and GPS traces being some examples). Basically, much of the negative sentiment I’ve seen so far about this hypothetical intervention is for reasons that don’t have to do with privacy. I’ll articulate these points by presenting alternative, currently in use police programs that use similar means, but have different ends.

To describe the LA program in a nutshell, the police use license plate readers to identify particular vehicles being driven in known prostitution areas. LPRs are just cameras that take a snapshot of a license plate, automatically code the alpha-numeric plate, and then place that [date-time-location-plate-car image] record in a database. Linking this data up with vehicle registrations, the idea in LA is to send the owner of the vehicle a letter in the mail. The letter itself won’t have any legal consequences, just a note that says the police know you have been spotted. The idea in theory is that you will think you are more likely to be caught in the future, and may face some public shaming if your family happens to see the letter, so you will be less likely to solicit a prostitute in the future.

To start with, some of the critiques of the program focus on the possibility of false positives. Probably no reasonable person would think this is a worthwhile idea if the false positive rate is anything but small – people will be angry about being falsely accused, there are negative externalities in terms of family relationships, and any potential crime reduction upside would be so small that it is not worthwhile. But I don’t think that itself is damning to the idea – I think you could build a reasonable algorithm to limit false positives. Say the car is spotted multiple times at a very specific location, at specific times, and the registered owner’s address is not near that location. It would be harder to limit false positives in areas where people conduct other legitimate business, but I think it has potential with just LPR data, and would likely improve by adding in other information from police records.
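
To make that concrete, here is a toy sketch of the kind of filter I have in mind. The column names and thresholds are all hypothetical – the point is just that a letter would require repeated sightings, at specific places and times, from an owner who does not live nearby.

    import pandas as pd

    # One row per LPR read: the plate, the timestamp, whether the read was in
    # a known prostitution area, and the distance from the registered owner's
    # address to the read location (all hypothetical columns).
    def flag_plates(reads, min_reads=3, min_dist_km=5, night_only=True):
        s = reads[reads['in_hotspot'] & (reads['dist_home_km'] >= min_dist_km)]
        if night_only:
            hour = pd.to_datetime(s['timestamp']).dt.hour
            s = s[(hour >= 22) | (hour <= 4)]
        counts = s.groupby('plate').size()
        return counts[counts >= min_reads].index.tolist()

    demo = pd.DataFrame({
        'plate': ['ABC123', 'ABC123', 'ABC123', 'XYZ789'],
        'timestamp': ['2016-01-05 23:10', '2016-01-12 23:45',
                      '2016-01-19 00:30', '2016-01-05 14:00'],
        'in_hotspot': [True, True, True, True],
        'dist_home_km': [12.0, 12.0, 12.0, 0.4],
    })
    print(flag_plates(demo))   # ['ABC123']

Raising min_reads or min_dist_km pushes false positives down at the price of more false negatives – exactly the trade-off discussed above.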

If you have other video footage, like from a stationary camera, I think limiting false positives can definitely be done by incorporating things like loitering behavior and seeing the driver interact with an individual on the street. Eric Piza has done similar work on human coding/monitoring of video footage in Newark to identify drug transactions, and I have had conversations with an IBM Smart City rep and computer scientists about automatically coding audio and video to identify particular behaviors that are just as complicated. False negatives may still be high, but I would be confident you could achieve quite a low false positive rate for identifying Johns.

As researchers, we often limit our inquiries to evaluating 1) whether the program works (e.g. reduces crime) and 2) if it works, whether it is cost-effective. LPRs and custom notifications are an interesting case compared to, say, video cameras because they are so cheap. Cameras and the necessary data storage infrastructure are so expensive that, to be frank, they are unlikely to be a cost-effective return on investment in any short term time frame even given the best case scenario crime reductions (ditto for police body worn cameras). LPRs and mailed letters on the other hand are cheap (both in terms of physical capital and human labor), so even small benefits could be cost-effective.

So in short, I don’t think the idea should be dismissed outright because of false positives, and the idea of using public video/sensor footage to proactively identify criminal behavior could be expanded to other areas. I’m not saying this particular intervention would work, but I think it has better potential than some programs police departments are currently spending way more money on.

Assuming you could limit the false positives, the next question is whether it is OK for the police to intrude on the privacy of individuals who have not committed any particular crime. The answer to this I don’t know, but there are other examples of police sending letters that are similar in nature and haven’t generated much critique. One is the use of letters to trick offenders with active warrants into turning themselves in. A more similar example though is custom notifications. These are alike in that the individuals often aren’t identified because of specific criminal charges, but are identified using data analytics and human intelligence placing them as high risk, gang involved offenders. The intrusion to privacy is much higher for these custom notifications than for the suggested Dear John letters, but the individuals did much more to precipitate police action as well.

When the police stop you in your car or on the street, they are using discretion to intrude on your privacy under circumstances where you have not necessarily committed a crime. Is there any reason a cop has to take that action in person versus seeing it on a video? Automatic citations at red light cameras are similar in mechanics to what this program suggests.

The note about negative externalities to legitimate businesses in those areas, and about the cost of the letters, I consider hyperbole. Letters are cheap, and actual crime data is frequently available that could already be used to redline neighborhoods. But Nick’s critique of the information being collated by outside agencies and used in other actuarial decisions, such as loans and employment, I think is legitimate. I have no good answers to this problem – I have mixed feelings, as I think open data is important (which ironically I can’t quantify in any meaningful way), and I think perpetual online criminal histories are a problem as well. Should we not have public crime maps though because businesses are less likely to invest in high crime neighborhoods? I think doing a criminal background check is a legitimate query for many businesses as well.

I have mixed feelings about familial shaming being an explicit goal of the letters, but compared to an arrest the letter is mundane. It is even less severe than a citation (and under some state laws you could be given a citation simply for loitering in a high prostitution area). Is a program that intentionally tries to shame a person – which I agree could have serious family repercussions – a legitimate goal of the criminal justice system? Fair question, but in terms of privacy issues I think it is a red herring – you can swap in different letters that would not have those repercussions but still use the same means.

What if instead of the “my eyes are on you” letter the police simply sent a PSA-like postcard that talked about the plight of sex workers? Can police never send out letters? How about if police send letters to people with previous victimizations about ways to prevent future victimization? I have a feeling much of the initial negative reaction to the Dear John program is because of the false positive aspect and the “victimless” nature of the crime. The ethical collection and use of data is a bit more subtle though.

LPR data was initially intended to passively identify stolen cars, but it is pretty ripe for mission creep. One example is that the police could use LPR data to actively track a car’s location without a warrant. It is easy to think of both good and bad examples of its use. For good examples, I think retrospectively identifying a car at the scene of a crime is reasonable, as is notifying the police of a vehicle associated with a kidnapping.

For another example use of LPR data, what if the police did not send custom notifications, but used the LPR data to create a John list of vehicles, and then used that information to profile the cars? If we think using LPR data to identify stolen cars is a legitimate use, should we ignore the data we have for other uses? Does the potential abuse of the data outweigh the benefits – so that LPR collection shouldn’t be allowed at all?

For equivalent practices, most police departments have chronic offender or gang lists that use criminal history, victimizations, where you have been stopped, and who you have been stopped with to create similar databases. This is all from data the police routinely collect. Whether LPR data should be available for such analytic uses can reasonably be questioned – though police RMS data is often available in large swaths to the general public.

Although you can question whether police should be allowed to collect LPR data at all, I am going to assume LPR data is not going away, and cameras definitely are not. So how do you regulate the use of such data within police departments? In New York, when you conduct an online criminal history check you have to submit a reason for doing the check. That is, a police officer or a crime analyst can’t run a check on their next door neighbor out of curiosity – you are supposed to have a relevant reason related to some criminal investigation. You could have a similar set up with LPR data that prevents actively monitoring a car except in particular circumstances, and that purges the data after a particular time frame. It would be up to the state though to enact legislation and monitor its use. There is currently some regulation of gang databases, such as sending notifications to individuals when they are placed on the list and rules for when to take people off the list.
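
That kind of regulation is straightforward to operationalize in software. Here is a toy sketch of the two ideas – a stated reason for every lookup, and a purge window on the data. All the names and the 180 day window are hypothetical.

    from datetime import datetime, timedelta

    RETENTION = timedelta(days=180)   # hypothetical purge window
    audit_log = []                    # who looked up what, when, and why

    def query_plate(lpr_db, plate, officer_id, reason):
        # every lookup requires a stated, case-related reason and is logged
        if not reason:
            raise ValueError("a reason is required for each lookup")
        audit_log.append((datetime.now(), officer_id, plate, reason))
        # reads older than the retention window are treated as purged
        cutoff = datetime.now() - RETENTION
        return [r for r in lpr_db.get(plate, []) if r['seen_at'] >= cutoff]

    db = {'ABC123': [{'seen_at': datetime.now() - timedelta(days=10)}]}
    print(query_plate(db, 'ABC123', 'officer_7', 'stolen vehicle case 16-0042'))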

Similar questions can be extended beyond public cameras to other domains, such as DNA collection and cell phone data. Cell phone data is currently collected regularly with warrants. DNA searching is going beyond the individual to familial searches (imagine getting a DUI, and the police then using your DNA to determine that a close family member committed a rape).

Going forward, to frame the discussion of police behavior in terms of privacy issues, I would ask two specific questions:

  • Should the police be allowed to collect this data?
  • Assuming the police have said data, what are reasonable uses of that data?

I think the first question, should the police be allowed to collect this data, should be intertwined with how well the program works and how cost-effective it is (or could potentially be, if the program has not been implemented yet). There are no bright lines, but there will always be a trade-off between personal privacy and public intrusion. Higher personal intrusion demands a higher level of potential benefit in terms of safety. Given that LPRs passively collect data, I consider it an open question whether they meet a reasonable threshold for the police to collect such data.

Some data police now collect, such as public video and DNA, I don’t see going away whether or not they meet a reasonable trade-off. In those cases I think it is better to ask what are reasonable uses of that data and how to prevent abuses of it. Basically any police technology can be given extreme examples where it saved a life or where a rogue agent used it in a nefarious way. Neither extreme case should be the only information individuals use to evaluate whether such data collection and use is ethical though.

Spatial analysis course in CJ (graduate) – Spring 2016 SUNY Albany

This spring I am teaching a graduate level GIS course for the School of Criminal Justice on the downtown SUNY campus. There are still seats available, so feel free to sign up. Here is the page with the syllabus, and I will continue to add additional info/resources to that page.

Academics tend to focus on regression of lattice/areal data (e.g. see Matt Ingram’s course over in Poli. Sci.), and in this course I tried to mix in more things I regularly encountered while working as a crime analyst that I haven’t seen covered in other GIS courses. For example I have a week devoted to journey to crime and geographic offender profiling. I also have a week devoted to introducing the most popular models currently used to forecast crime.

I’ve started a specific WordPress page for courses, which I will update with additional courses I prepare.

Accepted position at University of Texas at Dallas

After being on the job market 2+ years, I have finally landed a tenure-track job at the University of Texas at Dallas in their criminology department. Long story short, I’m excited for the opportunity at Dallas, and I’m glad I’m done with the market.

I will refrain from giving job advice, as I doubt I did a good job at many of the stages. But now that it is over I wanted to map all the locations I applied to. Red balloons are places where I had an in person interview for a tenure track position.

In the end I applied to around 80 ads over the two year period (about 40 in each wave), and I had 8 in person interviews before I landed the Dallas position. My rate is worse than all of my friends/colleagues on the market during the past few years (hence why I shouldn’t give advice), but around 4-5 interviews before getting an offer is the norm among the small sample of my friends (SUNY Albany CJ grads, that is).

So for folks soon to be on the market, this is one data point of what to expect.

Music and distractions in the workplace

I was recently re-reading Zen and the Art of Motorcycle Maintenance, and it re-reminded me of why I do not like to listen to music in the workplace. The thesis in Pirsig’s book (in regards to listening to music) is simple: you can’t concentrate entirely on the task at hand if you have music distracting you. So those who value their work tend not to have idle distractions like music playing (and are instead all engrossed in their work).

I have worked in various shared workspaces (cubicles and shared offices) for quite a while now, and I do have a knack for going off into space and ignoring all of the background noise around me. But I still do not like listening to music, even though I have learned to cope with the situation. At this point I prefer the open office workspace, as there at least is no illusion of privacy. When I worked at a cubicle someone coming behind me and scaring me was basically a daily thing.

Scott Adams, the creator of the Dilbert comic, had a recent blog post saying that music is the lesser evil compared to constant distractions via the internet (email, Facebook, Twitter, etc.). This I can understand as well, and sometimes I turn off the wi-fi to try to get work done without distraction. I don’t see how turning on music helps, but given its prevalence it may just be a difference between myself and other people. I should probably turn off the wi-fi for all but an hour in the morning and an hour in the afternoon every day, but I’m pretty addicted to the internet at this point.

How easily I am distracted partly depends on the task I am currently working on though. Sometimes I can get really engrossed in a particular problem and become so obsessed with it that you could probably set the office on fire and I wouldn’t notice. For example this programming problem dominated my thoughts for around two days, and I ended up thinking of the general solution while I did not have access to a computer (while I was waiting for my car to get inspected). Most of the time though I can only muster that type of concentration for an hour or two a day, and the rest of the time I am working in a state of easy distraction.

Background music I don’t like, and other ambient noises I can manage to drown out, but background TV drives me crazy. My family was watching videos (on TV and tablets) the other day while I was reading Zen and ironically I became angry, because I was really into the book and wanted to give it my full concentration. I know people who watch TV in bed to go to sleep, and it is giving me a headache just thinking about it while I am writing this blog post.

I highly recommend both Zen and the Art of Motorcycle Maintenance and Scott Adams’ blog. I’m glad I revisited Zen, as it is an excellent philosophical book on the logic of science that did not make much of an impression on me as an undergrad, but I have a much better grasp of it after having my PhD and reading some other philosophy texts (like Popper).

New working paper: What We Can Learn from Small Units of Analysis

I’ve posted a new working paper, What We Can Learn from Small Units of Analysis to SSRN. This is a derivative of my dissertation (by the same title). Below is the abstract:

This article provides motivation for examining small geographic units of analysis based on a causal logic framework. Local, spatial, and contextual effects are confounded when using larger units of analysis, as well as treatment effect heterogeneity. I relate these types of confounds to all types of aggregation problems, including temporal aggregation, and aggregation of dependent or explanatory variables. Unlike prior literature critiquing the use of aggregate level data, examples are provided where aggregation is unlikely to hinder the goals of the particular research design, and how heterogeneity of measures in smaller units of analysis is not a sufficient motivation to examine small geographic units. Examples of these confounds are presented using simulation with a dataset of crime at micro place street units (i.e. street segments and intersections) in Washington, D.C.
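
To give a flavor of the aggregation confounds, here is a quick toy simulation. This is mine for the blog, not one of the examples from the paper: the relationship between x and y is negative within every area, yet a regression on the aggregated data returns a positive slope.

    import numpy as np

    rng = np.random.default_rng(1)
    n_areas, per_area = 50, 100
    area_mean = rng.uniform(0, 10, n_areas)            # between-area variation
    x = np.repeat(area_mean, per_area) + rng.normal(0, 1, n_areas * per_area)
    y = 2 * np.repeat(area_mean, per_area) - x + rng.normal(0, 1, x.size)
    # within any single area the slope of y on x is -1, but aggregated:
    xa = x.reshape(n_areas, per_area).mean(axis=1)
    ya = y.reshape(n_areas, per_area).mean(axis=1)
    print(np.polyfit(xa, ya, 1)[0])   # close to +1, the sign is flipped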

As always, if you have comments or critiques let me know.

Tables and Graphs paper rejection/update – and on the use of personal pronouns in scientific writing

My paper, Tables and Graphs for Monitoring Temporal Crime Patterns, was recently rejected from Policing: An International Journal of Police Strategies & Management. I’ve subsequently updated the SSRN draft based on feedback from the reviews, and here I post the reviews and my responses to those reviews (in the text file).

One of the main critiques by both reviewers was that the paper was too informal, mainly because of the use of "I" in the paper. I use personal pronouns in writing intentionally, despite typical conventions in scientific writing, so I figured a blog post about why I do this is in order. I’ve been criticized for it on other occasions as well, but this is the first time it was listed as a main reason to reject an article of mine.

My main motivation comes from Michael Billig’s book Learn to Write Badly: How to Succeed in the Social Sciences (see a prior blog post I wrote on its contents). In a nutshell, when you use personal pronouns it is clear that you, the author, are doing something. When you rewrite the sentence to avoid personal pronouns, you often obfuscate who the actor is in a particular sentence.

For an example of Billig’s point that personal pronouns can be more informative, I state in the paper:

I will refer to this metric as a Poisson z-score.

I could rewrite this sentence as:

This metric will be referred to as a Poisson z-score.

But that is ambiguous as to its source. Did someone else coin this phrase, and I am borrowing it? No – it is a phrase I made up, and using the personal pronoun clearly articulates that fact.
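
For the curious, the metric relies on the square root being a variance stabilizing transformation for a Poisson count. A minimal version (see the paper for the exact definition I use there):

    from math import sqrt

    def poisson_z(observed, expected):
        # if observed ~ Poisson(expected), sqrt(observed) has a standard
        # deviation of about 1/2, so this is approximately standard normal
        return 2 * (sqrt(observed) - sqrt(expected))

    print(poisson_z(25, 16))   # 2 * (5 - 4) = 2, about two standard deviations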

Pretty much all of the examples where I eliminated the first person in the updated draft were of the nature:

In this article I discuss the use of percent change in tables.

which I subsequently changed to:

This article discusses the use of percent changes as a metric in tables.

Formal I suppose, but insipid. All rewriting the sentence to avoid the first person pronoun does is make the article seem like a sentient being, as well as force me to use the passive voice. I don’t see how the latter is better in any way, shape, or form – yet this was one of the main reasons my paper was rejected. The use of “we” in academic articles seems to be more common, but using “we” when there is only one author is just silly. So I will continue to use “I” when I am the only author.