pre-commit hooks and github actions

In keeping up with learning about code development in python and R, two things I have added to my retenmod python package recently are pre-commit hooks and github actions. I am not going to give a code example here, you can google pre-commit python black flake and get like a dozen different blog posts to describe the process. Ditto for github actions. I think it is good to do some googling on the processes, and then you can see the final yaml files I have made to do either process:

The idea behind pre-commit, is that it runs a set of commands to check your py files before you commit your changes to github on your local system. So here the pre-commit I created for this package does three things:

  • runs black code formatter for python (formats whitespace nicely where it can)
  • runs flake8 checks (checks whether py files meet pep standards)
  • updates my readme document

Note one thing I want to be able to do but cannot currently with pre-commit is to update all jupyter notebooks in place as well. Unfortunately this does not work, as notebooks generate some time meta-data under the hood (so the file is modified, and fails the test). There is probably a way to fix this (maybe some smart config via nbdime), but it is not a big deal for me at the moment. But it works fine executing notebooks and saving to different files, so the readme hook in that example works just fine. (And won’t be as painful as say for Rmarkdown as for Jupyter with that time meta-data.)

Pre-commit hooks run on your local system, so you might not want to have it do pytests. My retenmod package though is tiny (intentionally did it as a very simple example to learn). At work I do development on both a windows machine and Red Hat virtual machines, and fortunately so far have not had issues with needing different yaml files, although I could potentially seeing that happen in more complicated set ups. Although if that were the case, I would like just not install on windows (I use local windows laptop to edit powerpoint presentations, and use Red Hat for pretty much everything else).

Github actions does not run on your personal machine, but runs after you have pushed your changes to github. And then github basically spins up virtual environments and does whatever tests. This makes sense for package development, to make sure your package can be installed on multiple operating systems. And you can run other unit tests at this stage on those systems as well. But for certain tests that only make sense on your local system (say functions to generate database connections) github actions will not make sense.

Next up I will need to learn whether it makes sense to generate artifacts via github actions (for that Jupyter notebook example where pre-commit is not working so well?). Also I have never really figured out sphinx docs for python. So I will see if I can get that up and running as well for this retenmod package.

Using Jupyter Notebooks to make nice README’s for GitHub

Working on my R package ptools, the devtools folks have you make a readme R markdown file to compile to a nice readme markdown file for github. I thought to myself that you could functionally do the same thing with juypter notebooks for python. So here is a quick example of that for my retenmod python package.

So first, here is the old readme in its entirety, rendered on Github:

You can see I have an example code snippet, but it does not actually output the results. Here is the update, where I use jupyter to render the markdown nicely:

So we have nice syntax highlighting even. (Note that the pip install code is not run in a %sh cell, it is just code formatting inside of a markdown cell.) I do like the way R markdown renders the output a bit nicer than here (I also haven’t tried with pretty pandas tables). But you can also include matplotlib images as well:

You might typically want to add in README.ipynb to your gitignore file, but here I included it in the github package so you can see what this notebook looks like. To compile the notebook to markdown is quite simple:

jupyter nbconvert --execute --to markdown README.ipynb

If you have matplotlib images, it saves them in a folder named README_files (not sure if there is a flag to change this option). To get the images to render you then also need to push that image folder into Github.