ptools R package

It has been on my bucket list for a bit, but I wanted to take the time to learn how to construct an R package (same as for a python package). So I crafted a package with only a few functions in it so far, ptools, short for Poisson tools.

These are a handful of functions I have blogged about over the years, including functions for various WDD tests and the variants I have blogged about (weighted harm scores, different time periods, and different area sizes).

Small sample counts in bins (which can be used for Benford’s test), or my original application was checking if a chronic offender had a propensity to commit crimes on certain days of the week.

The Poisson e-test, and a function to check whether a distribution is Poisson distributed and two more Poisson related functions as well.

I think I will add quite a few more functions in the soup before I bother submitting to CRAN. (Installing via devtools via github is quite easy, so I do not feel too bad about that.) If you have functions you think I should add just let me know. (Or just make a pull request and add them yourself!) I also need to work on unit tests, and getting github actions set up. I will probably crunch on this for a bit, and then migrate personal projects back to creating some python libraries for my other work.

I do not use R-studio, but the open book R packages has been immensely helpful. On my windows box I had to bother to add R to my system path, so I can start my R session at the appropriate directory, but besides that very minor hassle it has been quite easy to follow.


I probably have not put in my 10k total hours as a guesstimate to mastery in computer programming. I think maybe closer to 5000, and that is spread out (maybe quite evenly at this point) between python, R, SPSS (and just a little Stata). And I still learn new stuff all the time. Being in an environment where I need to work with more people has really hammered down getting environments right, and making it shareable with other teammates. And part and parcel with that is documenting code in a much more thorough manner than most of the code snippets I leave littered on this blog.

So it probably is worth me posting less, but spending more time making nicer packages to share with everyone.

I do not know really how folks do R programming for making packages. I know a little at this point about creating separate conda environments for python to provide some isolation – is there something equivalent to conda environments for R? Do the R CMD checks make this level of isolation unnecessary? Should I just be working on an isolated docker image for all development work? I do not know. I do not have to worry about that at the moment though.


Part of this self learning journey is because I am trying to start a journal aimed at criminologists where you can submit software packages. Similar to the Journal of Open Source Software or the Journal of Statistical Software, etc. For submission to there I want people to have documentation for functions, and really that necessitates having a nice package (whether in R or python or whatever). So I can’t tell people you need to make a package if I don’t do that myself!

The software papers are not a thing yet (I would call it a soft launch at this point), but I have been bugging folks about submitting papers to get a dry run of the process. If you have something you would like to submit, feel free to get in touch and we can get you set up.

Leave a comment

4 Comments

  1. Michael Rasmussen

     /  July 26, 2022

    Hi Andrew, just wondering if you ever pull approaches from other fields, especially health (I’m a biostatistician)? Can I suggest Statistical Methods for Hospital Monitoring with R. You’re interest in rates, Poisson and potentially funnel plots here very much akin to David Spiegelhalter’s approaches to adverse events, which I think could cross-pollinate to crime statistics to some extent. Might be some limitations there but a suggestion if you haven’t picked up this book before

    Reply
  1. Reversion in the tech stack and why DS models fail | Andrew Wheeler
  2. Staggered Treatment Effect DiD count models | Andrew Wheeler

Leave a comment