over the horizon of API development

At APIStrat today I attended the “workshop”¹ on the OpenAPI Specification (OAS). A somewhat off-hand remark by the final presenter struck me as especially important. Someone had asked whether the new Links Object could be used to implement hypermedia, and the presenter first said no, but in subsequent comments admitted it was not impossible, just not a priority. (This seemed quite fair to me.) But in his disclaimers he said something I found remarkable: that hypermedia is fundamentally at odds with an API specification. This was probably the most interesting statement of the entire day for me.

When I was in college as a liberal arts student, I encountered a very odd and provocative journal on “ecotheology”. One article in the journal was about “ecodemonology”. Not just an evocative name, but a compelling concept: certain technologies, the article claimed, have an emergent effect that constrains as much as it generates. The big for instance was road infrastructure. We build roads wherever we most want to go. Wonderful, now we can get everywhere at the end of a road much faster. But then, once the roads exist, they affect not just our route-taking decisions but our actual destination-setting: we go where it is convenient to go on a road. To their credit, in correspondence, the author was open to a more neutral taxonomy of effects than naming demons, but it is rather more fun to keep with the original demonology theme.

A year or so ago I hacked a little on a project I called picatrix. I’d just led a book club at Zipcar on RESTful Web APIs and this project was my response to what I’d learned about hypermedia. In the book, the design process led API designers to create a sort of state machine of their API formed of link relations (as state transitions, or edges) and application states (as program states, or nodes). (A sidebar: notice resources don’t even enter into this. They’re an implementation detail!) I thought to myself: “why not formalize this into a directed graph?” So I wrote a little gem that takes a .dot file and a protobuf schema and generates an API out of them. Now, the project never went anywhere; it is a mere curiosity². But the principle, I think, is an intriguing one, and it was confirmed to me by the limits OAS admits. The currently expressible API, as a data structure, is not a graph.

But when hypermedia is respected as a constraint, our API must be a graph. Hypermedia essentially requires that each node, when rendered as a response, return its adjacency list as a list of edges in addition to returning itself or a redirect (a reified edge). In OAS, our horizon for thinking about APIs is that those edges are static and fixed when provided by the server. The edge information in this case being: given response node A, request edges B, C, and D are possible; oh, and by the way, here’s how to form those requests. Almost everyone embeds that type of programming information (of actual execution workflow) in the client, and more than that, we’ve come to think of that as the norm in leaving things “generic” or “flexible”.
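To make the "node plus adjacency list" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the pet-store states, the link relation names, and the `render` and `follow` helpers are hypothetical, not picatrix and not anything OAS defines.

```python
# Hypothetical pet-store state machine: nodes are application states,
# edges are link relations. All names are illustrative assumptions.
API_GRAPH = {
    "pet-list": {"item": "pet", "create": "pet-form"},
    "pet": {"collection": "pet-list", "edit": "pet-form"},
    "pet-form": {"submit": "pet"},
}


def render(state, body):
    """Server side: return the node itself plus its outgoing edges."""
    links = [
        {"rel": rel, "target": target}
        for rel, target in sorted(API_GRAPH[state].items())
    ]
    return {"state": state, "body": body, "links": links}


def follow(response, rel):
    """Client side: choose the next state using only what the response advertised."""
    for link in response["links"]:
        if link["rel"] == rel:
            return link["target"]
    raise KeyError(f"{rel!r} is not available from {response['state']!r}")


response = render("pet-list", {"pets": ["Rex", "Tom"]})
next_state = follow(response, "create")
```

The client here holds no workflow knowledge of its own: it only picks a relation from the edges the server handed back, which is the part a static description keeps out of band.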

When I talk to people about APIs, upwards of 90% think of them as: returning JSON, using HTTP, and providing a surface area for the application object model (a bunch of nouns) you’ve used to coalesce your database relations into some runtime language. OAS is great–I genuinely like it and am going to recommend it at work–but it is its own new demon. What kind of APIs can we build? Those we describe in our specs. Given we can’t describe hypermedia APIs, there is no reason to ever imagine building them. Instead, we build our zombie REST APIs and fiddle with Postman or Charles or…

Stepping back for a moment, your API is not just the surface area of your program, but ideally is the set of recomposable elements other developers can use to write their programs. Without hypermedia, or without an API as graph, we provide them a set of primitives with a higher burden of understanding from outside the actual API. OK, we returned you the index of pets/ with 200 status, but now what? What can you do with the response message we gave you? How can you continue the conversation with your server? We just pluck off one node and give it back to you, and then you go and check a document to see what edges you might traverse. We’ve moved to automatically generating the documentation, but it’s still out-of-band information. This is a problem even if we simply abandon REST.

Let’s say we have an RPC endpoint, which is some function. If you’ve used functional languages before, you may have noticed how helpful it is to have a good type system. Why? It helps organize how you can glue functions together. So, every response from a server is a value of some type, and is very likely useful in a future request. (Many codomains are domains.) Whatever RPC we made, we may want to do what is perfectly natural in any other language: find out what other methods operate on our return value.
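As a rough sketch of "find out what other methods operate on our return value", the Python below keeps a registry of RPC functions and uses their type hints to match one function's codomain against the others' domains. The `Pet` and `Owner` types, the four functions, and `methods_accepting` are all hypothetical names invented for this example.

```python
from typing import get_type_hints


class Pet:
    pass


class Owner:
    pass


# Hypothetical RPC methods; the bodies are stubs, only the types matter here.
def find_pet(pet_id: int) -> Pet:
    return Pet()


def owner_of(pet: Pet) -> Owner:
    return Owner()


def rename_pet(pet: Pet, name: str) -> Pet:
    return pet


def notify(owner: Owner) -> bool:
    return True


RPC_METHODS = [find_pet, owner_of, rename_pet, notify]


def methods_accepting(value_type):
    """Names of registered methods with at least one parameter of value_type."""
    names = []
    for fn in RPC_METHODS:
        hints = get_type_hints(fn)
        hints.pop("return", None)  # keep parameter types only
        if value_type in hints.values():
            names.append(fn.__name__)
    return names


# The codomain of find_pet is the domain of these follow-up calls.
return_type = get_type_hints(find_pet)["return"]
followups = methods_accepting(return_type)
```

With this registry, `followups` comes out as `owner_of` and `rename_pet`: the two calls a client could sensibly make next with the `Pet` it just received.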

How do you know an output can be used as a subsequent input? There are a few options³, but I have two in mind. One is reinflation into a runtime and inspection there. This includes all the isomorphic runtimes you see via client libraries–it is so common it seems not worth explaining its tradeoffs at the moment. But the second option is tantalizingly familiar to someone who’s played with hypermedia. Don’t just return the literal return value in the response. Instead, return the state of the program when it was at output, thus including information like callable functions on your return value along with other runtime context.

Hopefully it’s obvious in the latter case you want to do this judiciously. You don’t really want to dump the entire execution context on every request. (And it wouldn’t be the actual execution context anyway, but a translation into the API’s protocol.) But you could certainly return, optionally, a small subset of functions you think your client will be interested in now that you’ve responded to them with a valid input value. If you do that in a structured enough manner⁴, your client can actually blindly render what you’ve responded with because the only missing element should be the intention of the user.
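Here is one way that judicious, opt-in version might look, sketched in Python. The `context="methods"` switch mirrors the query parameter I wish for in the footnotes; the response shape, `METHODS_BY_INPUT`, `respond`, and `missing_inputs` are assumptions made up for this sketch rather than any existing protocol.

```python
# Hypothetical catalogue of RPC methods, keyed by the value type they accept.
METHODS_BY_INPUT = {
    "Pet": [
        {"name": "rename_pet", "params": {"pet": "Pet", "name": "string"}},
        {"name": "owner_of", "params": {"pet": "Pet"}},
    ],
}


def respond(value_type, value, context=None):
    """Return the value, plus applicable methods only when the client asks."""
    response = {"type": value_type, "value": value}
    if context == "methods":
        response["methods"] = METHODS_BY_INPUT.get(value_type, [])
    return response


def missing_inputs(method, response):
    """Client side: parameters the returned value cannot fill, i.e. what
    still has to come from the user's intention."""
    return [
        param
        for param, param_type in method["params"].items()
        if param_type != response["type"]
    ]


plain = respond("Pet", {"id": 7, "name": "Rex"})
rich = respond("Pet", {"id": 7, "name": "Rex"}, context="methods")
prompts = {m["name"]: missing_inputs(m, rich) for m in rich["methods"]}
```

A generic client could render `prompts` as a form: `owner_of` needs nothing further, while `rename_pet` needs only a `name` from the user.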

So is the future resource-driven specifications? I guess for my own part I hope not.

¹ This was not a workshop. It was some vendors speaking fairly tediously, and then a tour of OAS 3.0. There was so little enthusiasm at first the audience wasn’t even inclined to clap. I felt bad.
² I’m actually sufficiently curious about this idea again that I might pick it back up, or rewrite what existed from scratch.
³ I for one would love to curl an RPC endpoint with ?context=methods and see something like what I get when I call .methods in pry.
⁴ This aims to separate the graph of your program as a state machine from the representations of values that can exist in it. Those values could be expressed, as I did in my hack project, in something like protobuf.