01 October 2010

The implicit API

Recently I had a small piece of general functionality to implement at work, in a web application I develop: we wanted to be able to trigger various events on the client side (for example, pop up a lightbox-style window), the catch being that the events were to be triggered on the server side, not directly in response to a client request. It is nothing too fancy to implement; a little queue in the database for the actions to trigger, plus long-polling set up in the client’s browser, suffices as a first attempt. However, I experimented with two or three designs to make it more convenient to use, and in the process I realized something I think may be worth sharing.
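
For readers who have not met the pattern, the first attempt can be sketched roughly as follows. This is a minimal Python illustration, not the code from the project: an in-memory queue stands in for the database table, and the names ClientEventQueue, push and poll are made up for this example.

```python
import threading
from collections import deque


class ClientEventQueue:
    """In-memory stand-in (an assumption) for the database queue of pending client actions."""

    def __init__(self):
        self._events = deque()
        self._cond = threading.Condition()

    def push(self, event):
        """Called by server-side code that wants something to happen in the browser."""
        with self._cond:
            self._events.append(event)
            self._cond.notify_all()

    def poll(self, timeout):
        """Body of a long-polling request handler.

        Blocks until at least one event is queued or the timeout (in seconds)
        expires, then returns all pending events, possibly an empty list.
        """
        with self._cond:
            self._cond.wait_for(lambda: len(self._events) > 0, timeout=timeout)
            pending = list(self._events)
            self._events.clear()
            return pending
```

The browser would call the handler behind poll in a loop, issuing a new request as soon as the previous one returns, so an event pushed on the server reaches the client without the client having asked for it specifically.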

Consider a typical small software project - a few programmers working together on a single project for a prolonged period of time. Requests for new features are coming in from a customer every few weeks. Some of the requests are probably similar in some ways, so people will want to build reusable modules for the most common tasks. Immediately, perhaps the single biggest problem of software development comes into play - communication.

As a typical software project grows, an implicit “design” emerges from every programmer building tiny APIs only with his/her own use in mind, giving rise to two problems:

  • Many different APIs will get created for doing the same things. If you do not explicitly agree to have a library for a particular common task, everybody will create their own little “libraries”.
  • There will be many components sharing parts of the functionality, but each having its own code for achieving it. This code could instead form a shared API and decrease the overall amount of work it takes to build those components. If you don’t explicitly try to design APIs alongside normal project development, you will miss many opportunities for code reuse.

The longer the project is developed, the worse these effects become. In the end, the system takes more time to build than necessary, maintaining and extending it is harder, and the overall design is much more complex, as various aspects of the functionality aren’t separated clearly enough.

Why is it this way? I think there are three keywords that lead to the answer:

  1. Perspective - without conscious effort, one tends to have the “just get the task done” perspective on software. Programmers don’t ordinarily think of the code they write as something meant for other people to use directly, but as a means to make the computer do a particular thing. It is both.
  2. Communication - very often team members simply don’t know each other’s work, do not agree on common practices or on places to look for documentation, and don’t build modules to be used by everyone, even for the most common little tasks.
  3. Technical skills - building modules other people will want to use requires particular skills that can only be acquired by continuous practice and proper training.

The true indicator of a successful programming team is that its members use each other’s code without being continuously urged to. This can only be achieved when everyone has the necessary skills individually and when appropriate procedures and conventions are in place for the whole team. What I want to encourage in this article is a shift of perspective that emphasizes communication, together with a set of best practices to help one acquire the technical skills needed to build clean software as a member of a larger team.

Try the following: the next time you finish a piece of work done in your usual way, look at it as if you were designing an API for other programmers who need to do things similar to the one you just implemented. Try to really get into the skin of another person trying to modify your code or use it for his/her own purposes. Is it clear what the code does? Are the parts well separated? I put together a set of specific guidelines to help one design programs with this “API perspective” in mind:

  1. Look for abstractions. This is the key point: with each given task, one has to decide whether it is a specific instance of some more general problem, how much additional effort solving the general problem takes (sometimes the general one is simpler to solve than the specific one!), and whether it is worth designing an API for tasks of this kind. It takes some experience, which comes from a lot of experimentation, to recognize the situations where generalization is worth it. Sometimes the whole “API” you may extract from the task at hand is a single method to format the date in a consistent way across the whole project.
  2. Split things into pieces. It is surprising how underused the simplest ways of managing complexity are, like splitting modules into separate files, splitting methods into smaller ones, distributing code across classes and so on. To avoid every team member “reinventing the wheel”, the modules for common tasks have to be easy to find by just browsing the source tree. If you have your date function in a “date.c” file, it is much easier to find than if it is in a “utils.c” file. Think about it - if you don’t see an obvious place to look for a method you need, you will most probably assume it doesn’t exist at all and proceed to write your own. This often happens out of laziness, as in many cases it seems easier to write the given functionality once again than to locate it in the code. This can lead to revisiting the same corner-case bugs many times, and to inconsistencies in the code and in the end product (if you have three functions in your code for the same purpose of displaying a price, they will probably differ in some way). Every project has some files or directories that are just there to keep things that people do not know where to put, with names like “utils”, “misc” etc. A good first exercise in API design would be to take such a file and split it into well-named modules, preferably until there is no need for “utils” at all anymore.
  3. Use design methods other than just direct experimentation with code. You don’t have the right perspective to do API-level design when looking at your individual classes or functions. I like text-based approaches - a short spec of the public interface of the code you added, consisting of a list of classes/methods with short descriptions, might already give you some new insights. Pseudocode can be very helpful when doing lower-level design. You may try UML diagrams, informal pictures, mind-maps or anything else you think may work.
  4. Document and communicate your APIs. You have to agree with other team members on a place where all of you will list and document the APIs you create - perhaps a Wiki page. Let each of the members put descriptions of their modules there. Over time, see whether they use your modules or not, whether they have any problems with them and so on. This is invaluable feedback on what is good about your code and what isn’t. Every week, let one person (rotating among all team members) review all the code added during that week and write an e-mail summarizing any issues found.
  5. Keep consistent. Remember you are programming things for other people. They did not build the API you provided them, so they will certainly make guesses about how it works. To make it easier for them to comprehend your API, you have to strive for maximum consistency to make their guesses right as often as possible. This is in the spirit of “The principle of least surprise”.
  6. Clearly state contracts and signal their violations. In a similar way to 4., the pre- and post-conditions that were obvious to you while writing the code may not be obvious to other people, so you have to state them explicitly and check them, to make sure that your users immediately know when they are misusing your API. Don’t be afraid to define and use a few custom exception classes for every larger general module you build; it will pay off very quickly.
  7. Write tests. A module will not really be general and reusable unless it is really well tested. It might be that the API you created works well for your particular problem, but will it also work correctly for a similar one? You have to make sure that what the API promises to do for the user really holds true.
  8. Provide extension points. If you work on a single project for a longer time, you might sometimes see in advance the ways in which you or other people will want to extend something in the future. You might also have some intuition for what pieces of the program are most likely to change. It is possible to take some steps ahead of time and provide explicit extension points, for example by allowing the API user to define callbacks that will be invoked at particular moments of your API’s operation. In fact, if you have striven for a clear design for some functionality, it might well be that some natural extension points are already in place, and if you were to extend your code yourself you would use them. The point is to make them explicit and documented, so that other people who do not know the merits of your design, but have to extend your code, know they exist.
  9. Avoid over-generalization. All of the above points can be misunderstood or misapplied. There are situations where generalization does not make a system cleaner; in fact, it does the opposite. Unfortunately, I do not see a clear rule here. This is again a place where you acquire the necessary intuition by experience and experimentation.
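
To make guidelines 1, 6 and 8 a bit more concrete, here is a minimal Python sketch of the kind of tiny “API” mentioned above: one shared date-formatting function with an explicit contract, a custom exception and a documented extension point. All names in it (format_date, register_formatter, DateFormatError) are invented for this illustration; treat it as one possible shape, not a recommended implementation.

```python
import datetime


class DateFormatError(ValueError):
    """Raised when a caller violates the contract of this module."""


# Extension point: named formatter callbacks. Other team members can add
# their own styles through register_formatter instead of editing this file.
_FORMATTERS = {
    "short": lambda d: d.strftime("%Y-%m-%d"),
}


def register_formatter(name, callback):
    """Register a callback taking a datetime.date and returning a string.

    Contract: name must not be registered yet and callback must be callable.
    """
    if name in _FORMATTERS:
        raise DateFormatError(f"style {name!r} is already registered")
    if not callable(callback):
        raise DateFormatError("callback must be callable")
    _FORMATTERS[name] = callback


def format_date(value, style="short"):
    """Format a date the same way across the whole project.

    Contract: value is a datetime.date (datetime.datetime is a subclass and
    is accepted too) and style is a registered style name.
    """
    if not isinstance(value, datetime.date):
        raise DateFormatError(f"expected a date, got {type(value).__name__}")
    if style not in _FORMATTERS:
        raise DateFormatError(f"unknown style {style!r}")
    return _FORMATTERS[style](value)
```

A caller who needs another style registers it, for example register_formatter("dotted", lambda d: d.strftime("%d.%m.%Y")), and then calls format_date(some_date, "dotted"). Passing a plain string instead of a date fails immediately with DateFormatError rather than producing odd output somewhere else. In the spirit of guideline 7, tests for such a module would cover both the normal results and these contract violations.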

Further resources

If you listen carefully to experienced programmers, you might find traces of what I have written about in what they say. Joshua Bloch makes a similar point in his presentation How to Design a Good API, although in the end he focuses more on “explicit” APIs (on writing libraries). Fred Brooks writes about “conceptual integrity”, which is the high-level goal for the practices just described. There are also two great books I’m aware of that showcase how to write really reusable code: Paradigms Of Artificial Intelligence Programming and C: Interfaces and implementations.
