An API is first and foremost a contract
Just as a contract between humans defines the rules both sides have to follow to make a certain interaction work, an API defines how one application can interact with another.
Not every contract between humans closely resembles a typical API, but some analogies can be drawn between a recurring online sale (e.g. a cheesemonger who lets you order their goods from an e-commerce site and have them delivered to your door) and a call made frequently to retrieve certain data: every time I need a certain thing I will call you, expecting the object/response to be delivered in a certain way, packaging included.
A contract is made of 2 parts
A contract is made of 2 parts: what’s defined on the piece of paper signed by the 2 parties involved, and how those rules are followed when the interaction described on paper is carried out through concrete actions. In the case of software this maps to the documentation of the API and its software implementation.
A contract matters when the parties decide whether they are interested in establishing a certain interaction, and it is critical in the early stages, when that interaction must be integrated into a process. Once the process is well established, the focus moves to the habits built by observing how business is carried out in practice. The contract usually remains there, untouched, ready to be examined in case of a dispute, or as a reference in case parts of the initial agreement were not used/implemented from the very beginning but become useful later.
This means that what’s written is important only in the early stages and, by exception, later on. Once signed, what determines the success of a contract is how business is carried out: that is what the soundness of a relationship is based on, and what can make it last a long time.
Contracts can never be fully trusted and executed verbatim
Paper contracts and documentation are both usually written by humans, typically in natural language.
Humans are prone to making mistakes, so the deliverables of their work can never be fully trusted and taken verbatim. Furthermore, no matter how much effort is put into avoiding it, natural language always requires a step of interpretation, which means there is a concrete risk of misunderstanding whenever 2 parties give different meanings to the same word. Also, it’s often impossible to predict every possible situation and document it properly.
As a result, contracts usually contain mistakes, are incomplete, and are not 100% clear in all their parts.
In a human contract, when any of the aforementioned problems arise, the solution relies on a combination of common sense, a dialogue between the parties involved to clarify the real intentions, and sometimes the help of judges specifically appointed to deal with these matters.
As Software Developers we usually don’t have the luxury of starting a conversation with the provider of an API, either because we have no time or because there’s no guarantee of an answer. With no impartial judge to turn to, the only thing left for us to do is to be pragmatic: use the documentation just as a guideline and base our implementation on the behaviour we can observe over a number of calls.
The execution of what’s ruled by a contract is what matters the most
As we said, a contract is mostly important when an agreement must be evaluated before being accepted and a new process or habit is to be established. After that point, the only thing that really matters is how that agreement is honored.
In an online sale for something I plan to buy often, I browse the catalogue to get a sense of what’s available and check the selling price, the expected delivery time and the return policy. If everything seems fine I place my first order. If I’m satisfied with it, I will place a second and then a third one. After a while I completely forget about the initial contract and build a personal process around what I can observe of how the other side executes our agreement. I won’t add milk to my daily shopping list because, based on observation, I know it will always be delivered the morning after the order, before I leave my house for work. I feel confident I can always prepare that special French recipe because I see that Reblochon is always available in the catalogue.
In Software Development we will initially check things like the price per call to the API, the format and completeness of the data sent back, and the SLA guaranteed for that service. If we are happy with the promises we can read, we’ll start coding the integration layer between our application and the API we’ve just decided to use. The initial assessment can take anything from minutes to, usually, no more than a couple of days. The work built on top of the information we are buying can last months, sometimes years if you take into account subsequent refinements and extensions.
The key to a long-lasting business relationship is consistency, which leads to habits and in turn builds trust
At this point it shouldn’t raise any eyebrows to say that a contract can define some parameters that we can consider the minimum requirements for our customers to find a sale satisfactory. But it’s through all the other, unregulated details that we build customer loyalty.
If I consistently deliver fresh milk very early in the morning, my customers will know they can have it delivered to their door while they are still at home, with no need to ask the neighbours for favours. If a certain product is always present in my catalogue, customers can build small family rituals around it, like always serving mature Gouda to their guests. If I keep the size of the parcels consistent over time for a given quantity of products, my customers can reliably assess whether a parcel motel is an option for collecting a delivery or whether they need to ask their manager for permission to get the items delivered to their workplace.
Our delivery service becomes part of our customers’ routines, which is the basis for keeping them loyal to our products.
Over time we can change the contract by increasing prices or adding restrictions to the refund rules. The customers will reassess the new clauses and decide whether they want to renew their relationship with us, but the main driver of their decision will be how much our products have become a positive part of their lives.
The same holds true for APIs. Once an undocumented behaviour is observed and software is built upon it, any change to that behaviour risks breaking a customer’s application, with the obvious risk of losing them as a customer, which of course is the last thing we want.
Let’s see some examples of things that developers usually neglect and change light-heartedly, and that should instead be kept consistent over time.
API: Redirections can be blocked
Our APIs are growing and we feel they would be clearer if they were organized in a different way. We know someone has already built software connecting to a certain API endpoint, so to guarantee backward compatibility we cleverly decide to expose the API on a new URL and handle calls from existing users through a simple HTTP redirect to the new URL.
Smart, cheap, elegant, but unfortunately at risk of breaking external systems.
For security reasons, often outside the control of the implementer because they are imposed by regulations or very demanding customers, many big companies keep very strict control over the routes left open to certain applications. It’s very common for access to 3rd party systems (such as our APIs) to go through at least a firewall rule narrowing the allowed calls to a well-known domain. But it’s also not impossible to find firewall rules allowing calls only to specific paths of an application. In both cases a redirect to a new domain or path would simply be blocked.
But even where security constraints are more relaxed (or simply smarter), adding a redirection can break the code calling our APIs.
Imagine, for example, a maintenance script based on cURL, like a daily job to extract the latest batch of data or a monitor on the health of an application. By default cURL doesn’t follow redirects: one has to explicitly tell it to do so by specifying the -L option.
But we are smart kids, right? In 2017 no one would write a mission-critical tool based on cURL. We expect our consumers to build both their applications and their monitoring tools with mature, enterprise-grade languages and libraries like Java and Spring. So what we expect them to do is simply obtain an instance of Spring’s RestTemplate, suitably configured with the right endpoint, and let it do the job. Transparently. Such a pity that the default for all non-GET calls is not to follow redirects automatically.
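To make the redirect problem tangible, here is a minimal sketch in Python (chosen only for brevity; it is not one of the clients discussed above). It starts a throwaway local server whose old endpoint answers with a 307 redirect, then calls it with two clients from the standard library. The MovedApi handler, the probe function and the /v1/orders and /v2/orders paths are all hypothetical names made up for this illustration.

```python
import http.client
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class MovedApi(BaseHTTPRequestHandler):
    """Hypothetical API whose /v1/orders endpoint was moved to /v2/orders."""

    def _respond(self):
        if self.path == "/v1/orders":
            # Old URL: answer with a temporary redirect to the new one.
            self.send_response(307)
            self.send_header("Location", "/v2/orders")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b'{"orders": []}'
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def do_GET(self):
        self._respond()

    def do_POST(self):
        # Consume the request body before answering.
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        self._respond()

    def log_message(self, *args):
        pass  # keep the demo quiet


def probe():
    """Call the old URL with different clients and collect the status codes."""
    server = HTTPServer(("127.0.0.1", 0), MovedApi)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    old_url = f"http://{host}:{port}/v1/orders"
    # Empty ProxyHandler: ignore any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    results = {}
    try:
        # 1. Low-level client: hands back the redirect and never follows it.
        conn = http.client.HTTPConnection(host, port, timeout=5)
        conn.request("GET", "/v1/orders")
        results["raw_get"] = conn.getresponse().status
        conn.close()

        # 2. Higher-level client, GET: the redirect is followed silently.
        with opener.open(old_url, timeout=5) as resp:
            results["urllib_get"] = resp.status

        # 3. Same client, POST: a 307 is not followed and surfaces as an error.
        post = urllib.request.Request(old_url, data=b"{}", method="POST")
        try:
            with opener.open(post, timeout=5) as resp:
                results["urllib_post"] = resp.status
        except urllib.error.HTTPError as err:
            results["urllib_post"] = err.code
    finally:
        server.shutdown()
        server.server_close()
    return results
```

Calling probe() shows three different outcomes for the very same redirect: the low-level client reports 307, the GET through urllib ends with 200, and the POST through urllib fails with 307. Which of these a given consumer gets depends entirely on a client default they probably never thought about.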
API: Adding a field can break deserialization
If the analogy between a human contract and an API holds true, then based on our daily experience we should expect our users to be happy whenever they are given for free something not initially included in the contract.
This is not automatically true in real life (imagine sending literally a ton of products to someone who asked for just 1 kilo).
It’s even less true in Software Development, where flexibility is often reduced to zero.
So, for example, in an API sending back a list of users interested in a certain product category, we may think it a good idea to add, at a later stage, a new “Contact Details” field to every user returned. We are providing very valuable information to any API caller interested in targeting specific users to upsell their products, maximizing their chances of success. API users not interested in, or not ready to leverage, that information can simply ignore the existence of that field.
Imagine the application calling your API delegates the deserialization of JSON API responses to the very popular Jackson library. By default the deserialization will fail whenever a property is found in the JSON that doesn’t have a corresponding getter/setter in the destination object. To prevent that, an explicit annotation, @JsonIgnoreProperties(ignoreUnknown = true), has to be added at class level. Until that is done, the applications of our consumers will continue to break.
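The same failure mode is easy to reproduce outside Jackson. The following is a rough Python analogy, not Jackson itself: a strict mapper that rejects unknown properties next to a tolerant one that ignores them. The User model, the two parse functions and the contactDetails key are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, fields


@dataclass
class User:
    """Client-side model written before the API added any new field."""
    id: int
    name: str


def strict_parse(payload: dict) -> User:
    # Every key must match a field of the model: an unknown key raises
    # TypeError, much like a strict deserializer rejecting the response.
    return User(**payload)


def tolerant_parse(payload: dict) -> User:
    # Keep only the keys the model knows about and ignore everything else.
    known = {f.name for f in fields(User)}
    return User(**{k: v for k, v in payload.items() if k in known})


# Response shape before and after the provider "generously" adds a field.
old_response = {"id": 7, "name": "Ada"}
new_response = {"id": 7, "name": "Ada", "contactDetails": "ada@example.com"}
```

With these definitions strict_parse(old_response) works, strict_parse(new_response) raises a TypeError, and tolerant_parse(new_response) quietly returns the same User as before. As API providers we cannot know which of the two styles our consumers picked.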
API: Removing an undocumented field can break an implementation
If it’s unsafe to add a field to a response, it’s possibly even less safe to remove one.
Sure, since we didn’t include it in the contract we feel it’s perfectly within our rights to remove it whenever it pleases us. But by doing so we are ignoring that the programmers using our APIs are just like us: they are curious (so they WILL have found out about the existence of that field), they believe themselves to be smart (and so to be allowed to use something simply because it’s there, even if officially they shouldn’t), and they won’t even read the documentation (aka the contract), either because they inherently don’t trust it (in this article we’ve just scratched the surface of why this mindset develops) or because the API to call seems too straightforward and well designed to require the additional effort of reading any documentation at all.
When an undocumented field that has been available for a while is removed, 2 bad things can usually happen:
- There is some code using that field and assuming it’s always present (e.g. lastModifiedDate). No check is performed for its absence, so as soon as the first call is made and the field is not found, the equivalent of the much-feared (wrongly, as we’ll see in a future article) NullPointerException is raised. Big red light and a sleepless night for the poor developer on call that day, but the problem is spotted immediately. This is the lucky case.
- There is some code that relies on the possible presence of that field, for example a boolean flag hasBeenShipped in the B2B system of a chain of restaurants that automatically places orders with an automated warehouse whenever a certain product becomes scarce. The field is not there, which the software reads as the goods not being on their way yet, but there is still time and so it won’t do anything. A new check is performed the day after on the same order and the result is the same. After a few days the restaurant software takes the initiative and resubmits the order, because its main goal is to prevent the restaurant from running out of a certain ingredient. And because the hasBeenShipped flag can never be found, it will continue to do so for the next 7 days, until the time comes for the manual check by a human supervisor who will stop that line of orders. In the meantime 8 identical orders of that specific product are on their way from the warehouse to the restaurant.
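The second, silent case can be sketched in a few lines of Python. This is only an illustration under assumptions: the order is a plain dictionary, and the function names, the ContractChanged exception and the 3-day waiting threshold are hypothetical.

```python
class ContractChanged(Exception):
    """Raised when a field the client depends on is no longer in the response."""


def should_reorder_naive(order: dict, days_waiting: int, max_wait_days: int = 3) -> bool:
    # A missing flag is silently read as "not shipped yet", so once the
    # provider drops the field every order eventually looks stuck.
    shipped = order.get("hasBeenShipped", False)
    return not shipped and days_waiting >= max_wait_days


def should_reorder_defensive(order: dict, days_waiting: int, max_wait_days: int = 3) -> bool:
    # Tell "flag is false" apart from "flag is gone": the second case is a
    # change in the provider's behaviour and should fail loudly.
    if "hasBeenShipped" not in order:
        raise ContractChanged("hasBeenShipped is missing from the order response")
    return not order["hasBeenShipped"] and days_waiting >= max_wait_days
```

For a response like {"orderId": 42} on day 3, the naive version returns True and a duplicate order goes out, while the defensive version raises ContractChanged and turns the silent bug into the "lucky" loud one. As providers, though, we should assume that many consumers wrote the naive version.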
In our contract analogy this would be similar to the cheesemonger always sending a complimentary piece of special cheese, offered to customers so they can try new flavours. You build in your customers the expectation of being able to try something different whenever they submit their weekly order, and all of a sudden you stop sending that perk. True, you didn’t break any agreement, but imagine their disappointment when they open the parcel and don’t find the sample, especially if they had started planning their Friday pre-dinner drinks around it.
API: Increasing response time can break poorly written multi-threaded code
OK, I agree with you: here I’m taking a scenario to the extreme and making it our responsibility to deal with very poor code developed by our users. On the face of it this shouldn’t really be our concern, but if you consider the huge number of small businesses that grow every day at the speed of light, and how few resources and experienced developers they can put on an integration task, you may decide that it is worth thinking about their poor practices too.
So let’s say that you have 2 services: API-A, which does something, and API-B, which relies on the effects of the action triggered by API-A. In pursuit of a useless premature optimization, a system integrator decides to artificially complicate things and call those services in parallel, when they really should be called one after the other.
On the other side of the world, completely unaware of how your APIs are being used (a very, very dangerous situation to be in, but this is material for another post), your team has a super smart idea for decreasing the response time of API-B by an order of magnitude. There’s enough capacity to approve this optimization, in the hope of improving the scalability of your system and attracting new users. No change is applied to API-A.
The moment your optimization goes live, the naive system integrator who was making the 2 calls in parallel will start receiving responses from API-B well before the ones coming from API-A. All of a sudden you are changing the logic in your users’ systems. If you are lucky the effect is immediate and results in an evident bug. If you are not, similarly to what happens with fields removed because they were not part of the initial contract, the system integrator could start experiencing subtle, and for this reason even more pernicious, bugs that may become properly evident only months later.
To get a sense of what I’m referring to, imagine a scenario where API-A is called to book a seat and API-B returns the list of seats still available after the booking, which our consumers need in order to refresh their seat map. An end-user of an application built on top of our APIs will experience glitches and start losing confidence in the reliability of the application they are using. They may decide to re-submit the order and book a different seat (resulting in duplicate orders, or in seats being held for longer than required), or run into any other trouble too hard to predict here.
To imagine a more delicate situation (with an admittedly poorly designed API), consider for a second an API-A responsible for submitting the vote of a citizen in a presidential election and an API-B giving back the list of citizens who can still vote. In this case a citizen could potentially submit endless votes.
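The seat-booking scenario can be simulated with a toy in-memory service. Everything in this Python sketch is an assumption made for illustration: the SeatService class, the two client functions, and the latencies, which are simulated with sleep so that the timing difference is large enough to be reliable.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class SeatService:
    """Toy stand-in for the provider; response times are simulated with sleep."""

    def __init__(self, seats, latency_a, latency_b):
        self._available = set(seats)
        self._lock = threading.Lock()
        self._latency_a = latency_a
        self._latency_b = latency_b

    def book_seat(self, seat):
        # Plays the role of API-A: the booking takes effect after latency_a.
        time.sleep(self._latency_a)
        with self._lock:
            self._available.discard(seat)

    def available_seats(self):
        # Plays the role of API-B: reads the seat map after latency_b.
        time.sleep(self._latency_b)
        with self._lock:
            return sorted(self._available)


def book_and_refresh_in_parallel(service, seat):
    # The integrator's "optimization": fire both calls at the same time.
    with ThreadPoolExecutor(max_workers=2) as pool:
        booking = pool.submit(service.book_seat, seat)
        seat_map = pool.submit(service.available_seats)
        booking.result()
        return seat_map.result()


def book_then_refresh(service, seat):
    # What the integrator should have done: wait for API-A, then call API-B.
    service.book_seat(seat)
    return service.available_seats()
```

With a slow API-B (say latency_a=0.2 and latency_b=0.6) the parallel client happens to work: the seat map is read after the booking and seat 1A is gone. Make API-B fast (latency_b=0.0) and the very same client code returns a stale map that still lists 1A, while book_then_refresh is correct in both cases. The client was always wrong; our slowness was hiding it.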
If you are still not convinced, imagine that your cheesemonger guarantees delivery within 24 hours of the order. Simply based on experience you know that the goods will always be delivered the next morning, usually around 7:30 AM. But one night, at 3 AM, once you have finally managed to put your baby to sleep after hours of exhausting attempts, you decide to use that short moment of hard-won peace to order some dairy products. 20 minutes later you hear your doorbell ringing: there’s the delivery guy with a wide smile on his/her face, all proud of how fast (s)he was able to fulfill your order.
API: Adopting a new upgrade policy, even if backward compatible
Most companies don’t have the resources or the discipline to keep their integrations with external APIs continuously up to date. But some of them have engineers cautious enough to build a suite of automated tests to validate those APIs.
This way they feel confident that any time you update your APIs they will be able to spot any breaking change. Because of that, they assume they will always be up to date and will need to do extra adaptation work only by exception. With this in mind, they know they won’t automatically benefit from new features, but they feel safe in the knowledge that they will pick up any fix you introduce to close that very dangerous security hole spotted in your APIs.
Your contract states that users of the API will always get updates for free but doesn’t specify how. At first it seems enough to simply update the code running in production, in a way that proves seamless for your users.
But APIs evolve for many different reasons, and after a while you realize you must introduce a major breaking change to deliver that juicy feature many are asking for. Very careful not to cause problems for your existing users, you plan to guarantee backward compatibility by creating a new versioning system that relies on a new pattern for building URLs, while leaving the old version of the API untouched and always mapped to the old URL.
It seems to make sense, and for the most part it does, apart from the fact that all of a sudden the company described before won’t get updates for free anymore and will not be aware that the situation has changed. They are now accumulating even more uncontrolled technical debt and are exposed to all the security holes they assume you are transparently fixing.
At this point, if you have followed me so far, are you still sure it’s a good idea to fix that bug you just discovered in your public APIs? If you are still not convinced that more often than not the answer is NO, stay tuned for a future article where I’ll try to address it.