To aid in reproducible analysis, I often have a set of FILE HANDLE commands at the header of my syntax. For example, here is basically what most of my syntax files look like at the top.
DATASET CLOSE ALL.
OUTPUT CLOSE ALL.
*Simple description here of what the syntax does.
FILE HANDLE data /NAME = "H:\ProjectX\OriginalData".
FILE HANDLE save /NAME = "C:\Users\axw161530\Dropbox\Documents\BLOG\FileHandles_SPSS".
What those commands do is point to particular locations on my machine: one that has the data I will use for the syntax, and one where I save the subsequent results. So instead of having to write something like:
GET FILE = "H:\ProjectX\OriginalData\SPSS_Dataset1.sav".
I can just write something like:
GET FILE = "data\SPSS_Dataset1.sav".
The same works for where I save the files, so I would use the second SAVE line instead of the first after I’ve defined a file handle.
*SAVE OUTFILE = "C:\Users\axw161530\Dropbox\Documents\BLOG\FileHandles_SPSS\TransformedData1.sav".
SAVE OUTFILE = "save\TransformedData1.sav".
Besides being shorter, this helps a great deal when you need to move files around. Say I needed to move where I saved the files from my personal drive to a work H drive. If you use file handles, all you need to do is change the one line of code at the top of your syntax. If you used absolute references, you would need to edit the file at every location where you used that absolute path (e.g. every GET FILE, GET TRANSLATE, or SAVE).
Another trick you can use is that you can stack multiple file handles together. Consider this example.
FILE HANDLE base /NAME = "H:\ProjectX".
FILE HANDLE data /NAME = "base\Datasets".
FILE HANDLE report /NAME = "base\Reports".
In this case I defined a base location, and then defined the data and report file handles as folders within the base file handle. Again, if you need to move this entire project, all you need to do is edit that original base file handle location.
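To make the stacking concrete, here is a minimal Python sketch of how nested handles could expand into a full path. It only illustrates the idea and is not how SPSS resolves handles internally. The handles dictionary, the resolve_handle function, the Table1.xlsx file name, and the D:\Work\ProjectX location are all hypothetical names made up for the example.

```python
# Hypothetical illustration of stacked file handles; not SPSS's actual resolver.
handles = {
    "base": r"H:\ProjectX",
    "data": r"base\Datasets",
    "report": r"base\Reports",
}


def resolve_handle(path, handles):
    """Expand a leading handle name in path, repeating while the result
    still starts with another handle name."""
    seen = set()  # guards against handles that refer to each other in a loop
    while True:
        head, sep, tail = path.partition("\\")
        if head not in handles or head in seen:
            return path
        seen.add(head)
        path = handles[head] + sep + tail


print(resolve_handle(r"data\SPSS_Dataset1.sav", handles))

# Moving the project only requires changing the base entry (made-up location).
moved = dict(handles, base=r"D:\Work\ProjectX")
print(resolve_handle(r"report\Table1.xlsx", moved))
```

The second call shows the payoff: only the base entry changed, yet every path built from data or report follows along.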
A final note: if I have a set of complicated code that I split up into multiple syntax files, one trick is to place all of your file handles into one syntax file, and then use INSERT to call that syntax. Depending on the structure, though, you may still need to edit the INSERT call at the header of every sub-syntax, so it may not be any less work.
I should also mention you could use the CD command to some of the same effect. So instead of defining a save file handle, you could do:
CD "C:\Users\axw161530\Dropbox\Documents\BLOG\FileHandles_SPSS".
SAVE OUTFILE = "TransformedData1.sav".
I don’t typically change SPSS’s current directory, but there are legitimate reasons to do so. If SPSS needs write access to the directory in particular, this is sometimes easier than dealing with permissions. That does not come up often for me, and most of the time I have data files in many different places, so I typically just stick with using file handles.
Accessing FILE HANDLES in Python or R
If you are using file handles in SPSS code, you may want to be able to access those file handles the same way in Python or R code called within SPSS. The reasons are much the same: you may want to read or save data in those programs, but use the file handles you have already defined. Albert-Jan Roskam on the Nabble group showed how to do this.
So for Python you could do the following:
BEGIN PROGRAM Python.
import spssaux
#Getting the SPSS file handles
file_handles = spssaux.FileHandles().resolve
#Grabbing a particular handle
data_loc = file_handles('data')
#Setting the python directory to the save location
import os
os.chdir(file_handles('save'))
END PROGRAM.
And for R you can do:
BEGIN PROGRAM R.
fh <- spssdata.GetFileHandles()
file_handles <- sapply(fh, function(x) x[2])
names(file_handles) <- sapply(fh, function(x) x[1])
#Accessing a particular file handle
file_handles["data"]
#setting the R directory to save handle
setwd(file_handles["save"])
END PROGRAM.
In either case the benefit is the same: if you change file locations, you only need to edit one line at the top of your syntax, instead of multiple lines throughout the rest of the syntax file.
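Once a handle resolves to a folder, you still need to attach a file name to it in Python. Here is a minimal sketch of one way to do that. The path_in_handle helper and the fake_resolve stand-in are hypothetical names, and the stand-in exists only so the example runs outside of SPSS; inside a BEGIN PROGRAM block you would pass in the resolve method shown above instead. It also assumes Windows-style paths, which is why it uses ntpath rather than os.path.

```python
import ntpath  # Windows path rules on any OS; an assumption for this sketch


def path_in_handle(resolve, handle, filename):
    """Join a file name onto the folder that a file handle resolves to."""
    return ntpath.join(resolve(handle), filename)


# Stand-in resolver so the sketch runs outside SPSS (hypothetical).
# Inside SPSS you would pass spssaux.FileHandles().resolve instead.
known_handles = {"data": r"H:\ProjectX\OriginalData"}


def fake_resolve(handle):
    return known_handles.get(handle, handle)


print(path_in_handle(fake_resolve, "data", "SPSS_Dataset1.sav"))
```

Passing the resolver in as an argument keeps the helper independent of SPSS, so the same function can be checked in a plain Python session.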
Bruce Weaver / July 12, 2022
Nice page, Andrew. I would like to suggest, though, that you add a short section with info for Mac users. For example, you define the “save” file handle as follows:
FILE HANDLE save /NAME = "C:\Users\axw161530\Dropbox\Documents\BLOG\FileHandles_SPSS".
I am not a Mac user, but AFAICT, a Mac user would have to modify that as follows:
FILE HANDLE save /NAME = "/Users/axw161530/Dropbox/Documents/BLOG/FileHandles_SPSS".
The first change was removal of C:, as it does not exist on a Mac. The second change was changing \ to /. The Mac is very fussy about this, and will balk at backslashes. Windows, on the other hand, will work with either.
Finally, Mac users can use the Finder to locate the file or folder they want to define with FILE HANDLE, copy the location info, and paste it into their command. E.g.,
https://osxdaily.com/2015/11/05/copy-file-path-name-text-mac-os-x-finder/
I hope this helps!
Cheers,
Bruce
apwheele / July 12, 2022
Thank you Bruce!