The Importance Of Documentation

Book Pile by Paul WatsonRecently I’ve been working a couple of open source projects and as part of them I’ve been using some libraries. In order to use a library though, you need to understand how it is designed, what function calls are available and what those functions do. The two libraries I’ve been using are Qt and libavformat, which is part of FFmpeg and they show two ends of the documentation spectrum.

Now, it’s important to note that Qt is a massive framework owned by Nokia, with hundreds of people working on it full-time including a number of people dedicated to documentation. FFmpeg on the other hand is a purely volunteer effort with only a small number of developers working on it. Given the complicated nature of video encoding to have a very stable and feature-full library such as FFmpeg available as open source software is almost a miracle. Comparing the levels of documentation between these two projects is very unfair, but it serve as a useful example of where documentation can sometimes be lacking across all types of projects, both open and closed source.

So, lets look at what documentation it is important to write by considering how you might approach using a new library.

When you start using some code that you’ve not interacted with before the first thing that you need is to get a grasp on the concepts underlying the library. Some libraries are functional, some object orientated. Some use callbacks, others signals and slots. You also need to know the top level groupings of the elements in the library so you can narrow your focus that parts of the library you actually want to use.

Qt’s document starts in a typical fashion, with a tutorial. This gives you a very short piece of code that gets you up and running quickly. It then proceeds to describe, line by line, how the code works and so introduces you to the fundamental concepts used in the library. FFmpeg takes a similar approach, and links to a tutorial. However, the tutorial begins with a big message about it being out of date. How much do you trust the out of date tutorial?

Once you’ve a grasp of the fundamental design decisions that were taken while building the library, you’ll need to find out what functions you need to call or objects you need to create to accomplished your goal. Right at the top of the menu the QT documentation has links to class index, function index and modules. These let you easily browse the contents of the library and delve into deeper documentation. Doxygen is often used to generate documentation for an API, and it seem to be the way FFmpeg is documented. Their frontpage contains… nothing. It’s empty.

Finally, you’ll need to know what arguments to pass to a function and what to expect in return. This is probably the most common form of documentation to write so you probably (hopefully?) already write it. Despite my earlier criticisms, FFmpeg does get this right and most of the function describe what you’re supposed to pass into the function. With this sort of documentation it’s important to strike a balance. You need to write enough documentation such that people can call your function and have it work first time, but you don’t want to write too much so that it takes ages to get to grips with or replicates what you could find out by reading the code.

Some people hold on to the belief that the code is the ultimate documentation. Certainly writing self-documenting code is a worthy goal, but there are other levels of documentation that are needed before someone could read and the code and understand it well enough for it to be self-documenting. So, next time you’re writing a library make sure you consider:

  • How do people get started?
  • How do people navigate through the code?
  • and, how do people work out how to call my functions?

Photo of Book Pile by Paul Watson.

Deadlock On Exit With PySide And QFileSystemWatcher

Keys by bohmanLast year Nokia started developing their own Python bindings for Qt, PySide, when they couldn’t persuade Riverbank Computing to relicense PyQt under a more liberal license. While developing DjangoDE I made the choice of which library to use configurable. When running under PyQt everything worked fine, but when using PySide the program hung on exit.

Using gdb to see where it was hanging points to QFileSystemWatcher, which has the following comment in the destructor.

Note: To avoid deadlocks on shutdown, all instances of QFileSystemWatcher need to be destroyed before QCoreApplication. Note that passing QCoreApplication::instance() as the parent object when creating QFileSystemWatcher is not sufficient.

The following code will demonstrate the issue.

import sys

#from PyQt4 import QtGui
from PySide import QtGui

app = QtGui.QApplication(sys.argv)

file_browser = QtGui.QTreeView()
file_model = QtGui.QFileSystemModel()


As the comment says, we need to ensure that the QFileSystemWatcher object, which is created by QFileSystemModel, is destroyed before QApplication. To do this we can connect to the lastWindowClosed and ensure that the QFileSystemModel is fully destroyed.

import gc

def app_quit():
     global file_browser
     file_browser = None


It’s not clear why this code would work on PyQt and not PySide, but it is clearly related to the order the objects are deleted. Given the comment in the Qt documentation though you should probably not rely on it working on PyQt and ensure yourself that the QApplication is the last Qt object to be destroyed.

Photo of Keys by bohman.

Unittesting QSyntaxHighlighter

Testing 1, 2, 3 by alisdairI’m using test driven development while building my pet project, DjangoDE. A key part of an IDE is the syntax highlighting of the code in the editor, so that’s one area where I’ve been trying to build up the test suite.

To test the syntax highlighter the obvious approach is to send the right events to write some code into the editor the check the colour of the text. Although the QT documentation is usually excellent, it doesn’t go into enough detail on the implementation of the syntax highlighting framework to enable you to query the colour of the text. In this post I’ll explain how the colour of text is stored, and how you can query it.

A syntax highlighting editor is normally implemented using a QPlainTextEdit widget. This object provides the user interface to the editor and manages the display of the text. The widget contains a QTextDocument instance, which stores the text. To add syntax highlighting you derive a class from QSyntaxHighlighter then instantiate it, passing the document instance as the parameter to the constructor. This is explained in detail in the syntax highlighter example.

The document stores the text as a sequence of QTextBlock objects. These store the text as well as the formatting information used to display it. You might think that you can just call QTextBlock::charFormat to get the colour of the text. Unfortunately it’s not that simple as the format returned by that call is the colour that you’ve explicitly set, not the syntax highlight colour.

Each QTextBlock is associated with a QTextLayout object that controls how the block is rendered. Each layout has a list of FormatRange objects, accessible using the additionalFormats method. It is this list that the QSyntaxHighlighter sets to specify the colour of the text.

Now we know where the colour information is stored, we can find out what colour a particular character will be. Firstly you need to find out which QTextBlock the text you want is. In a plain text document each line is represented by a separate block, so this is quite straightforward. You then get the list of FormatRanges and then iterate through, checking to see if the character you want is between format_range.start and format_range.start + format_range.length

For an example of this you can check out the test file from DjangoDE here

Photo of Testing 1, 2, 3 by alisdair.