Surpassed 100k views in 2022

For the first time, yearly view counts have surpassed 100,000 for my blog.

I typically get a bump of (at best) a few hundred views when I first publish a post. But the most popular posts are all old ones, and I get the majority of my traffic via Google searches.

Around March this year, monthly views bumped up from around 9k to 11k. I am not sure of the reason (it is unlikely to be due to any specific individual post; as you can see, none of the most popular posts were posted this year). A significant number of the views are likely bots (what percent overall, though, I have no clue). So it is possible my blog was scooped up by some other aggregators/scrapers around that time (I would think those would not be counted as search engine referrals though).

One interesting source of traffic for the blog: when I write academic-style posts with citations, they get picked up by Google Scholar (see here for example). It is not a big source, but it likely refers a more academic crowd to the blog (I can tell people have Google Scholar alerts, since I get a handful of referrals whenever Scholar indexes a post).

I have some news coming soon about writing a more regular criminal justice column for an organization (readers will have to wait a little over a week). But I also do Ask Me Anything, so always feel free to send me an email or comment on here (I started AMA because I get a trickle of tech questions via email anyway, and might as well share my responses with everyone).

I typically just blog about things I am working on. So maybe next up is how auto-ml libraries often have terrible defaults for hypertuning random forests, or an example of data envelopment analysis, or quantile regression for analyzing response times, or monitoring censored data. Those are all random things I have been thinking about recently. But no guarantees about any of those topics in particular!

Over 10 years of blogging

I just realized the other day that I have been blogging for over 10 years (I am old!). My first hello world post was back in December 2011.

I would recommend that folks in academia/coding at a minimum set up a personal webpage. I use WordPress for my blog (I used a free WordPress site for quite a long time). WordPress takes zero code to make a personal page to host your CV.

I treat the blog mostly as my personal nerd journal, and blog about things I am working on, with rants on occasion. I do not make revenue off of the blog directly, but in terms of exposure it has given me quite a few consulting leads over the years. It has also given my academic work much wider exposure.

So I always have a few things I want to blog about in the hopper. But always feel free to ask me anything (similar to how Andrew Gelman answers emails), and if I get a chance I will throw up a blog post in response.

Blogging Year in Review 2021

As for total views of the blog in 2021: I will have a trickle of a few more views today, but I will not crack the 100k mark. So the blog viewership has not really grown over the past few years; it has just varied around 90k views per year.

Most of my traffic is a trickle of referrals for old blog posts from search engines. So my top posts of 2021 would be a quite boring old list if I did that.

I have to go down over 20 posts before ones I posted this year come into the views ranking. Typically I get a one time bump of 100~200 views for a single post when I first post it (I have never topped 600 views in one day). But after that it is just competing for search traffic referrals. (Those posts in 2021 are highlighted by the blue bar on the left in this screengrab.)

In other news, I have not written a blog post about it, but the move to a private sector data science gig was a good one for me (much less stressful than being an academic). Two years in I can safely make that assessment.

But, I have continued to do some academic papers on the side. The Buffalo paper was accepted at Journal of Experimental Crim, and the NIJ paper is under review for the IJOTCC open science special issue. So I can still do some criminology work to scratch that itch on the side.

In Covid times everything is remote, but I do enjoy participating in various groups (even if over zoom). As I posted on the blog, always feel free to send me an email to ask me anything.

SPSS Predictive Analytics Blog

SPSS had a blog on the old developerWorks site, but they’ve given it a bit of a reboot recently. I’ve volunteered to have my old SPSS posts uploaded to the site, and this is what I said I wanted back in 2012: a blogging community related to SPSS. So when blogging about SPSS related topics I will be cross-posting the posts both here and on the Predictive Analytics blog. Hopefully the folks at IBM can get more individuals to participate in writing posts.

2014 Blog stats, and why Blogging >> Articles

The readership of the blog has continued to grow. Here are the total site views per month since the beginning in December 2011.

At this point we can start to see some seasonal patterns. I take a big hit in December and January, and see increases when school is in session. I get quite a bit of my traffic from SPSS searches, so I presume much of the traffic is from students using SPSS.

I do not worry too much about posting regularly, but I like to take some time to write if I have not published anything in around 2 weeks. I just enjoy taking a break from specific work projects, and often I blog about something I have dealt with multiple times (or answered people's questions about multiple times), so I like making a blog post for my own and others' reference.

Now, one of the more popular posts I have written is Odds Ratios NEED To Be Graphed On Log Scales. I published it in October 2013, it received around 100 referrals from Twitter the day I published it, and it has since averaged about 5-10 views per day (it has accumulated a total of nearly 3,000). It is one of the first sites returned for odds ratio graph from a Google search.

Certainly not a number of views to write home to my mother about, but I believe it is better outreach for my opinion than a journal article (not that I would be able to publish such a limited point in a journal article anyway). Take for instance Rothman et al.’s 2011 article, Should Graphs of Risk or Rate Ratios be Plotted on a Log Scale? in the American Journal of Epidemiology, which takes a position that differs from mine. I cannot find any readership stats for AJE, but I highly doubt that article has been viewed by 3,000 people, and according to Google Scholar it only has 2 citations currently. One is the response by the editor to the article, and the other is likely in error as it was published before the Rothman article. Site views are superficial as well, but I would place a wager my blog post has reached more readers than the Rothman article.

3,000 is way higher than views or downloads for my papers on SSRN, and even the most viewed articles since 2011 on the Cartography and GIS website have not accumulated 3,000 downloads at this point. (My Viz JTC paper has just over 100 downloads so far after being up for close to a year.) AJE articles very likely have a larger readership than CaGIS, but I have no idea how much larger. I would guess the American Statistician has a more comparable (likely larger?) membership via the ASA, and articles from the first issue of 2014 have accumulated mostly between 200 and 1500 downloads currently (the last issue of 2013 is quite a bit lower). I suspect a download is a bit more of an investment than a page view of my blog (so both are over-estimates of those actually reading the article, but page views are likely a larger over-estimate). But in most cases I get so many more views on the blog than I would for an article that outreach on the blog is clearly the winner. The audience is different as well, not necessarily better or worse, just different.

I don’t take my work to be as venerable as Ken Rothman’s (obviously he is a well respected and influential epidemiologist, or methodologist more generally, for his books), but I disagree with his reasoning for using linear scales in some circumstances in the referenced article. My general response to the Rothman example is that if you want to show absolute risk differences then show them. Plotting the ratios on an arithmetic scale is misleading, and while close for his example it is still not as accurate as just plotting the risk differences. In Rothman et al.’s example plotting the odds ratios would result in an overestimate of the absolute risk differences by over 10%! (The absolute risk difference is 90 - 1 = 89, whereas the linear difference between the odds is 10 - .01 = 9.99. The former mapped onto a scale from 0 to 10 would result in a length of 8.9, so an overestimate of (9.99 - 8.9)/8.9 ~ 12%.)
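For anyone who wants to check that arithmetic, here is a minimal Python sketch of the same calculation. The function name and arguments are just made up for illustration, and I am assuming the risks sit on a 0 to 100 percent axis that gets rescaled onto the 0 to 10 axis used for the ratios. The input numbers are the ones quoted in the parenthetical above.

```python
def linear_scale_overestimate(risk_high, risk_low, ratio_high, ratio_low,
                              risk_axis_max=100.0, ratio_axis_max=10.0):
    """Relative overestimate of the risk difference when ratios are
    plotted on a linear axis (hypothetical helper, not from any library)."""
    # Absolute risk difference in percentage points, e.g. 90 - 1 = 89.
    risk_diff = risk_high - risk_low
    # Distance between the two ratios on a linear axis, e.g. 10 - .01 = 9.99.
    ratio_diff = ratio_high - ratio_low
    # Assumption: rescale the risk difference onto the ratio axis (89 -> 8.9).
    scaled_risk_diff = risk_diff * (ratio_axis_max / risk_axis_max)
    # How much longer the plotted ratio distance is, relative to the risk distance.
    return (ratio_diff - scaled_risk_diff) / scaled_risk_diff


overestimate = linear_scale_overestimate(90, 1, 10, 0.01)
print(f"{overestimate:.0%}")  # roughly 12%
```

Swapping in other risks and ratios shows how quickly the two scales can diverge, which is the whole reason to plot the risk differences directly if that is the quantity you care about.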

I don’t take blogging as a replacement for academic work, more like an open nerd journal. I’m pretty sure this venue has quite a bit more readership than my journal articles ever will though.


Musings on Comments

I recently posted a comment policy. I do not get that many comments now, but I did delete a comment recently asking for help on a code sample, so figured it would be worth elaborating on.

Comments are great, and I hope to get more, but from my participation on the Stack Exchange sites (and commenting on other blogs) I definitely see the need for active moderation. This is not a big deal for the site so far (I rarely post anything controversial), but it is really necessary for any blog generating a lot of comments.

My position is not quite as extreme as Ed Tufte’s, e.g. I don’t plan on deleting a comment just because I don’t like the content enough, but I will definitely delete off-topic comments and anything I find offensive. My position is more in line with the thoughts of Jeff Atwood, who discusses the topic quite a bit on his Coding Horror blog (see here for one example).

Basically the comments take curation, just like any site or wiki. Comments in any substantive amount without moderation (e.g. YouTube, many news articles) are simply not worth reading. The same moderation is what makes the Stack Exchange sites so much better than email listservs, and the lack of moderation is what makes some MOOC forums so chaotic.