Reading as caching

When you spend a few years writing code, the principles of programming can start to spill over into other parts of your life. Programming has so many of its own names, its own procedures, its little rituals. Some of them are (as anthropologists like to say) “good to think with,” providing useful metaphors that we can take elsewhere.

I’ve gotten interested in programming as a stock of useful metaphors for thinking about intellectual labor. Here I want to think about scholarly reading in terms of what programmers call caching. Never heard of caching? Here’s what Wikipedia says:

In computing, a cache is a component that stores data so future requests for that data can be served faster; the data stored in a cache might be the result of an earlier computation, or the duplicate of data stored elsewhere. A cache hit occurs when the requested data can be found in a cache, while a cache miss occurs when it cannot. Cache hits are served by reading data from the cache, which is faster than recomputing a result or reading from a slower data store; thus, the more requests can be served from the cache, the faster the system performs.

Basically, the idea is that if you need information about X, and that information is time-consuming to get, it makes sense to look up X once and then keep the results nearby for future use. That way, if you refer to X over and over, you don’t waste time retrieving it again and again. You just look up X in your cache, which is designed to be quick to access.
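For readers who have never seen one, here is a minimal sketch of that idea in Python. Everything in it is hypothetical and chosen for illustration: slow_lookup stands in for whatever time-consuming retrieval you have in mind, and SimpleCache is just a dictionary with hit and miss counters, not any particular library's cache.

```python
import time


def slow_lookup(key):
    """Hypothetical stand-in for a time-consuming retrieval of X."""
    time.sleep(0.01)  # pretend this is a slow server or a long computation
    return key.upper()


class SimpleCache:
    """Keep results nearby so that repeated requests skip the slow lookup."""

    def __init__(self, fetch):
        self.fetch = fetch   # the slow function we want to avoid calling twice
        self.store = {}      # the cache itself: key -> previously fetched value
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:
            # Cache hit: answer from memory without touching the slow source.
            self.hits += 1
            return self.store[key]
        # Cache miss: do the slow lookup once and remember the result.
        self.misses += 1
        value = self.fetch(key)
        self.store[key] = value
        return value


cache = SimpleCache(slow_lookup)
for _ in range(3):
    cache.get("x")

print(cache.hits, cache.misses)
```

Asking for "x" three times triggers the slow lookup only on the first request; the other two are served from the dictionary.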

Caching, like pretty much everything that programmers do, is a tradeoff. You gain one thing, you lose something else. Typically, with a cache, you save time, but you take up more space in memory, because the cached data has to be stored someplace. For example, in my former programming job, we used to keep a cache of campus directory data. Instead of querying a central server every time we needed a user’s name or email address, we would just request all the data we needed every night, around 2am, and keep it on hand for 24 hours. That used up some space on our servers but made our systems run much faster.
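A cache like that directory one might be sketched roughly as follows. This is an illustration under assumptions, not the code we actually ran: fetch_directory, the sample record, and the DirectoryCache class are all invented for the example, and where the real system refreshed on a nightly schedule, this sketch simply reloads everything once the copy is more than 24 hours old.

```python
import time

REFRESH_INTERVAL_SECONDS = 24 * 60 * 60  # keep the bulk copy for 24 hours


def fetch_directory():
    """Hypothetical stand-in for the slow bulk query to the central server."""
    return {"jdoe": {"name": "Jane Doe", "email": "jdoe@example.edu"}}


class DirectoryCache:
    """Hold a full local copy of the directory and reload it when it is stale."""

    def __init__(self, fetch_all, clock=time.time):
        self.fetch_all = fetch_all  # function that returns every record at once
        self.clock = clock          # injectable so the staleness rule can be tested
        self.records = {}
        self.loaded_at = None       # None means nothing has been loaded yet

    def is_stale(self):
        if self.loaded_at is None:
            return True
        return self.clock() - self.loaded_at >= REFRESH_INTERVAL_SECONDS

    def refresh(self):
        # One slow request replaces the whole local copy (this costs memory).
        self.records = self.fetch_all()
        self.loaded_at = self.clock()

    def lookup(self, username):
        if self.is_stale():
            self.refresh()
        # Fast path: a dictionary lookup instead of a network round trip.
        return self.records.get(username)


directory = DirectoryCache(fetch_directory)
print(directory.lookup("jdoe"))
```

The tradeoff is visible in the code: records holds the entire directory in memory, and a record that changes on the central server can be up to a day out of date locally.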

One day, I had a thought: scholarly reading is really just a form of caching. When you read, in essence, you are caching a representation of some text in your head. Maybe your cache focuses on the main argument; maybe it focuses on the methodology; maybe on the examples or evidence. In any event, though, what you stick in your memory is always a provisional representation of whatever the original document says. If you are not sure whether your representation is accurate, you can consult the original, but consulting your memory is much faster.

I should probably issue a disclaimer here. I’m intentionally leaving aside a lot of other things about reading in order to make my point. Of course, academic reading isn’t only caching. Reading can be a form of pleasure, a form of experience valuable in itself; it can be a process of imaginary argument, or a way of training your brain to absorb scholarly ideas (which is why graduate students do a lot of it), or a way of forming a more general representation of an academic field. All of that is, of course, valuable and important. But I find that, after you spend long enough in academia, you don’t need to have imaginary arguments with every journal article; you don’t need to love the experience of reading; and you don’t need to constantly remind yourself about the overall shape of your field. Often, you need to read only a relatively well-defined set of things that are directly relevant to your own immediate research.

The analogy between reading and caching becomes important, in any event, when you start to ask yourself a question that haunts lots of graduate students: what should I read? I used to go around feeling terribly guilty that there were dozens, or probably hundreds, of books in my field that I should, theoretically, have been reading. I bought lots of these books, but honestly, I mostly never got around to reading them. That wasn’t because I don’t like reading; I do. It’s because reading (especially when done carefully) is very time-consuming, and time is in horribly short supply for most academics, precarious or not.

Now if we think about reading as a form of caching, we begin to realize that it might be pretty pointless to prematurely cache data that we may never use. That is what it means to read books pre-emptively, out of a general sense of moral obligation: you’re essentially caching scholarly knowledge whether or not it has any immediate use-value. To be sure, up to a point, it’s good to read just to get a sense of your field. But there is so much scholarship now that no one human being can, in effect, cache it all in their brain. It’s just not possible to have comprehensive knowledge of a field anymore.
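Programmers would call the two approaches eager and lazy caching, and the difference in cost is easy to sketch. The numbers below are made up purely for illustration (ten hours per book, a shelf of a hundred books, a project that ends up citing three of them), and the function names are my own invention.

```python
READ_COST_HOURS = 10  # assumed cost of reading one book carefully


def eager_cost(shelf):
    """Cache everything up front: pay for every book, used or not."""
    return READ_COST_HOURS * len(shelf)


def lazy_cost(requests):
    """Cache on demand: pay once per distinct book that is actually needed."""
    return READ_COST_HOURS * len(set(requests))


# A hypothetical shelf of 100 books you feel you ought to read.
shelf = [f"book-{i}" for i in range(100)]

# The books a hypothetical project actually consults (one of them twice).
requests = ["book-3", "book-17", "book-3", "book-42"]

print(eager_cost(shelf))    # hours spent reading the whole shelf in advance
print(lazy_cost(requests))  # hours spent reading only what the project needed
```

With these invented figures the eager strategy costs 1,000 hours and the lazy one 30. The real ratio will vary, but the shape of the comparison is the point.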

I find this a comforting thought. Once you drop comprehensive knowledge as an impossible academic ideal, you can replace it with something better: knowing how to look things up. In other words, you do need to know how to go find the right knowledge when you need it. If you’re writing about political protests, you need to cache some of the recent literature on protests in your brain. But you don’t need to do this years in advance. You can just do this as part of the writing process.

That’s a rather instrumentalist view of reading, I know, and I don’t always follow it. I do read things sometimes purely because they seem fascinating, or because my friends wrote them, or whatever. But these days, given the time pressures affecting every part of an academic career, we ought to know how to be efficient when that’s appropriate. So: have a caching strategy, and try not to cache scholarly knowledge prematurely.