[Pulp-list] What do you want from an API documentation solution?

Jay Dobies jason.dobies at redhat.com
Mon Nov 14 19:42:45 UTC 2011

I'd warn you guys this is a long read, but if you haven't realized by 
now that I'm annoyingly long-winded, you haven't been paying attention.

I have a sprint story related to an API docs solution and have been 
trying to find something to get in place before everyone runs around 
doing their tasks to document the currently undocumented APIs. I'm 
running into problems defining what it is we're trying to solve and 
wanted to think/talk it out before throwing in another attempt at a 
solution. I also hope you'll suffer through reading it all and we can 
discuss at a deep dive this week; bring your ideas.

= Problem =
The way I see it, there are three problems with our API docs:
* Missing
* Incomplete
* Incorrect

== Missing ==
Not much to say to describe this. This also isn't a blame thing; we knew 
they'd be a priority one day and did what we could. Now, they are a much 
bigger priority and we need to get them done and keep them maintained.

== Incomplete ==
By this I mean inconsistencies between API docs. Some have examples, 
some don't. Very few appropriately describe error codes. Things like that.

== Incorrect ==
This isn't so much about APIs that are themselves done wrong; it's more 
about the fact that things change and the docs need to be updated to 
reflect those changes.

= Solutions =

== Automated ==
We attempted to do some sort of generation of API docs from docstrings 
and there's been renewed interest in this area with the proposal of 
sphinx. I'm not sure either is the answer.

The perfect solution would be to write code and have the API docs 
generated automagically from the code itself. That's never going to 
happen since we're talking about REST APIs, not simply python APIs. The 
REST APIs need to cover things like HTTP status codes and example URLs 
and response bodies. Those things just can't be generated from code.

The last attempt was to use a specific format in the docstrings of the 
methods and have the docs generated from there. I'd argue it failed for 
a few reasons:

1. No clearly defined template (could be that I just never knew where to 
look, but there is clearly a format I had to follow and I never knew how 
to get started).

2. Manual process of generating a .wiki file and then manually copying 
and pasting it into the wiki. It's actually _more_ work than editing the 
wiki directly.
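
To make the docstring approach concrete, here's a minimal sketch of what 
"a specific format in the docstrings" plus a generator could look like. 
This is not the format we actually used; the `@method`/`@path`/`@success`/
`@failure` field names, the example URL and status codes, and the 
`docstring_to_wiki` helper are all made up for illustration.

```python
import inspect
import re


def create_repo(data):
    """
    Create a new repository.

    @method: POST
    @path: /repositories/
    @success: 201 Created
    @failure: 409 Conflict if a repository with the same id exists
    """
    # Hypothetical handler; the fields above are an assumed template,
    # not the real Pulp docstring format.
    return data


# Matches lines such as "@method: POST" in a docstring.
FIELD = re.compile(r"^@(\w+):\s*(.*)$")


def docstring_to_wiki(func):
    """Render a function's docstring as wiki-style markup."""
    doc = inspect.getdoc(func) or ""
    summary = []
    fields = []
    for line in doc.splitlines():
        line = line.strip()
        match = FIELD.match(line)
        if match:
            fields.append((match.group(1), match.group(2)))
        elif line and not fields:
            # Free text before the first field is the summary.
            summary.append(line)
    out = ["== %s ==" % func.__name__, " ".join(summary)]
    out.extend("* '''%s''': %s" % (name, value) for name, value in fields)
    return "\n".join(out)


if __name__ == "__main__":
    print(docstring_to_wiki(create_repo))
```

Even in a sketch this small, the status codes and URL are still typed by 
hand into the docstring; the generator only changes where they live.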

Assuming we fix it so that there's no intermediate manual copy/paste 
step into the wiki, I'm still not convinced it's going to work for a few 
reasons:

1. Most developers don't bother reviewing docstrings when they change 
code. It's very unfortunate, but also very true. In Pulp alone there are 
a number of places where the docstrings reflect an older version of the 
method (PyCharm can highlight this for you, but I digress).

2. It's a new syntax. That means you still have to dig up the template 
to drop in the docstring and figure out how to use it.
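
As an illustration of the stale docstring problem in item 1, here's a 
rough sketch of the kind of check an IDE does: compare the parameters a 
docstring documents against the real signature. It assumes epydoc-style 
`@param name:` lines; the `stale_params` helper and the `sync` example 
function are hypothetical.

```python
import inspect
import re

# Assumes epydoc-style "@param name: description" docstring fields.
PARAM = re.compile(r"@param\s+(\w+)\s*:")


def stale_params(func):
    """Return (documented but gone, present but undocumented) names."""
    documented = set(PARAM.findall(inspect.getdoc(func) or ""))
    actual = set(inspect.signature(func).parameters) - {"self", "cls"}
    return sorted(documented - actual), sorted(actual - documented)


def sync(repo_id, timeout=None):
    """
    Hypothetical method whose docstring was not updated after a change.

    @param repo_id: identifies the repository
    @param skip: packages to skip
    """


if __name__ == "__main__":
    removed, undocumented = stale_params(sync)
    print("documented but no longer in signature:", removed)
    print("in signature but not documented:", undocumented)
```

A check like this only catches renamed or dropped parameters. It can't 
tell you that the description, status codes, or example bodies are out 
of date, which is the part that matters for REST docs.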

So given those two issues, I don't think the question is simply whether 
we edit them in the wiki or in code. Either way you still have to hunt 
around for the template. You still have to remember to actually do it. 
You still have to remember to review it after you've changed something.

I'll talk more about technical solutions later. I'm curious what people 
think about this point though. I haven't dug too much into sphinx yet, 
but I have no reason to think it won't be subject to all of the above 
issues, which would leave the wiki (almost*) as easy.

* The biggest benefit to sphinx is co-locating the docs in the codebase, 
which arguably isn't necessarily a benefit; you can put a browser 
editing a wiki side-by-side with code and not have to scroll up and down 
to hop between impl and docstring. Plus you have more flexibility on a 
wiki; sphinx calls itself WYSIWYG but it's exactly the opposite of that, 
whereas with a wiki it's much simpler to hit "preview".

== Discipline ==
Since we won't be able to derive the API docs purely from code, at the 
end of the day we all just need to be better about stopping for a second 
to think about whether we need to edit them (I'd like to see us use a 
Boy Scout rule* mentality). In the past, it was easier to blow these off 
since people weren't using them. Now, we have both a community and 
Katello, so we need to be aware of the increased priority (and if you 
still don't buy in, hop to the Katello team for a sprint and use an area 
of Pulp you didn't write; you'll hate us).

* http://programmer.97things.oreilly.com/wiki/index.php/The_Boy_Scout_Rule

== Peer Review ==
Let me get this out of the way now: I'm not suggesting everyone has to 
take part in API documentation review.

John has the skill and patience for debugging really obtuse issues. I 
hate that shit, so I never volunteer for it. I leave it for John (who I 
_think_ enjoys those sorts of things).

Same goes for reviewing docs, both user guide and API docs. Some people 
have no interest in it, so there's no reason to force them to have to 
review other people's docs. Personally, I get an odd OCD satisfaction 
knowing we have really clean, readable API docs. So I'm happy to 
volunteer to do a quick proofread as a fresh pair of eyes who doesn't 
know the code itself and who is compulsive enough to make sure the whole 
docs package fits together.

Different skills from different people. Let's take advantage of that and 
ask people who are willing to give your docs a quick once-over when you 
finish them.

== Template ==
Sayli sent out a proposed template a few weeks ago. I don't know where 
it is on the wiki, but I should. Bookmark the template so it feels like 
less of a chore to dig it up when you have to add new APIs.

I can also do another quick review on what pages on the wiki are slurped 
into the look & feel process when the userguide is generated.

We should also add in a dirt simple way of getting the CLI to dump URLs, 
requests, and responses. I already know where it'd go, I just haven't 
done it yet. Then we all have a consistent way of getting that example 
data for the docs.
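
For a sense of what that dump could produce, here's a minimal sketch of 
a formatter that turns one request/response pair into text you can paste 
into the docs. The actual hook in the CLI isn't written yet, so the 
`format_exchange` name, its arguments, and the example URL and bodies 
are all assumptions.

```python
import json


def format_exchange(method, url, request_body, status, response_body):
    """Render one request/response pair as plain text for the docs."""
    lines = ["%s %s" % (method, url)]
    if request_body is not None:
        lines.append("Request:")
        lines.append(json.dumps(request_body, indent=2, sort_keys=True))
    lines.append("Response: %d" % status)
    if response_body is not None:
        lines.append(json.dumps(response_body, indent=2, sort_keys=True))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical call; a real hook would pass along whatever the CLI
    # just sent and received.
    print(format_exchange(
        "POST", "/example/", {"id": "demo"}, 201, {"id": "demo"}))
```

Sorting the keys keeps the dumped bodies stable from run to run, so 
examples in the docs only change when the API does.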

== Technical Approach ==
It's probably clear by now that I'm a fan of just using the wiki like we 
did in the beginning. I'm not convinced an in-code solution gets us 
anything that simple discipline wouldn't; arguably, it's yet another 
syntax we have to learn and frankly, I have a hard enough time just 
remembering one wiki format from another.

I'm still willing to look into sphinx, but my early impressions are that 
we're going to have to invest a reasonably large amount of work getting 
it to look at the places in code we want it to look and format it for 
the web site. Ultimately, even if we do, you still have to find the 
template for what our API docs should look like and remember to 
write/update them, so I'm not convinced it's any better than the wiki -> 
user guide process we already have in place.

= Conclusion =
Actually, there is no conclusion, I just needed an ending. Please give 
it some thought though. Why are we in this position in the first place? 
Is an in-code solution really going to make that much of a difference to 
you when you edit something or write a new API?

Let me know in some way how you want to proceed. If you don't want to 
get into a giant reply-to-list war on this thread and think it'd be 
worth a deep dive, ping me separately and I'll set one up. If you don't 
really care and just want to be told "here's the new process", that's 
fine too, just let me know. Otherwise, I'm at a bit of a pause on how to 
proceed with this story, so I'm waiting to see what others think.

Jay Dobies
Freenode: jdob @ #pulp
http://pulpproject.org | http://blog.pulpproject.org
