Back Garden Weather in CouchDB (Part 4)

In this series of posts I’m describing how I created a CouchDB CouchApp to display the weather data collected by the weather station in my back garden. In the previous post I showed you how to display a single day’s weather data. In this post we will look at processing the data to display it by month.

The data my weather station collects consists of a record every five minutes. This means that a 31-day month will consist of 8,928 records. Unless you have space to draw a graph almost nine thousand pixels wide, there is no point in wasting valuable rendering time processing that much data. Reducing the data to one point per hour gives us a much more manageable 744 data points for a month. A full year’s worth of weather data consists of 105,120 records, and even reducing it to one point per hour gives us 8,760 points. When rendering a year’s worth of data it is clearly worth reducing the data even further, this time to one point per day.
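The record counts above follow directly from the five-minute interval. A minimal sketch of the arithmetic, where the function name `recordCounts` is made up for illustration:

```javascript
// Records per period, assuming one reading every five minutes.
var RECORDS_PER_HOUR = 60 / 5; // 12
var HOURS_PER_DAY = 24;

// Number of data points for a period of the given length in days,
// at each of the three resolutions discussed above.
function recordCounts(days) {
    return {
        raw: days * HOURS_PER_DAY * RECORDS_PER_HOUR,
        hourly: days * HOURS_PER_DAY,
        daily: days
    };
}

var month = recordCounts(31);
var year = recordCounts(365);
console.log(month); // { raw: 8928, hourly: 744, daily: 31 }
console.log(year);  // { raw: 105120, hourly: 8760, daily: 365 }
```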

How do we use CouchDB to reduce the data to one point per hour? Fortunately CouchDB’s map/reduce architecture is perfect for this type of processing. CouchDB will also cache the results of the processing automatically so it only needs to be run once rather than requiring an expensive denormalisation process each time some new data is uploaded.

First we need to group the five-minute weather records together into groups for each hour. We could do this by taking the unix timestamp of each record and rounding it down to the hour. The problem with this approach is that the keys are included in the URLs. If you can calculate unix timestamps in your head then your maths is better than mine! To make the URLs more friendly we’ll use a Javascript implementation of sprintf to build a human-friendly representation of the date and time, excluding the minute component.

function(doc) {
    // !code vendor/sprintf-0.6.js

    // doc.day is assumed here: the format string needs a day value between month and hour
    emit(sprintf("%4d-%02d-%02d %02d", doc.year, doc.month, doc.day, doc.hour), doc);
}
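To see how the key groups records, here is a rough standalone sketch that builds the same style of key without sprintf. The helpers `pad` and `hourKey` are hypothetical, and the document fields `day` and `minute` are assumptions about the record layout.

```javascript
// Hypothetical helper: left-pad a number with zeros to the given width.
function pad(n, width) {
    var s = String(n);
    while (s.length < width) { s = "0" + s; }
    return s;
}

// Build a "YYYY-MM-DD HH" key like the one the map function emits.
function hourKey(doc) {
    return pad(doc.year, 4) + "-" + pad(doc.month, 2) + "-" +
           pad(doc.day, 2) + " " + pad(doc.hour, 2);
}

var docs = [
    { year: 2011, month: 5, day: 3, hour: 9, minute: 0 },
    { year: 2011, month: 5, day: 3, hour: 9, minute: 55 },
    { year: 2011, month: 5, day: 3, hour: 10, minute: 0 }
];

// Group the documents by key, roughly as CouchDB does before calling reduce.
var groups = {};
docs.forEach(function (doc) {
    var key = hourKey(doc);
    (groups[key] = groups[key] || []).push(doc);
});
console.log(Object.keys(groups)); // [ '2011-05-03 09', '2011-05-03 10' ]
```

Because every component is zero-padded, the keys also sort in chronological order, which is what makes range queries over them practical.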

CouchDB will helpfully group documents with the same key, so all the records from the same hour will be passed to the reduce function. What you cannot guarantee, though, is that all the records will be passed in one go. Instead you must ensure that your reduce function can operate on its own output. You can tell whether you are ‘rereducing’ the output of the reduce function by checking the third parameter to the function.

function(keys, values, rereduce) {
    var count = 0;

    var timestamp = values[0].timestamp;
    var temp_in = 0;
    var temp_out = 0;
    var abs_pressure = 0;
    var rain = 0;
    var wind_ave = 0;
    var wind_gust = 0;

    var wind_dir = [];
    for(var i=0; i<8; i++) { wind_dir.push({ value: 0}); }

To combine the multiple records it makes sense to average most of the values. The exceptions are the amount of rain, which should be summed; the wind direction, which should be a count of the gusts in each direction; and the wind gust speed, which should be the maximum value. Because your reduce function may be called more than once, calculating the average value is not straightforward. If you simply calculate the average of the values passed in then you will be calculating the average of averages, which is not the same as the average of the full original data. To work around this we calculate the average of the values and store it along with the number of values it represents. Then, when we rereduce, we multiply each average by its number of values, add the results together and divide by the total number of values.
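A small worked example shows the difference. The numbers are made up, and the two arrays stand for two batches of readings that were reduced separately:

```javascript
function mean(xs) {
    var total = 0;
    for (var i = 0; i < xs.length; i++) { total += xs[i]; }
    return total / xs.length;
}

var first = [10, 20];           // two readings, mean 15
var second = [30, 40, 50, 60];  // four readings, mean 45

// The value we actually want: the mean of all six readings.
var trueMean = mean(first.concat(second));

// Averaging the two averages ignores how many readings each represents.
var naive = mean([mean(first), mean(second)]);

// Weighting each average by its count recovers the true mean.
var weighted = (mean(first) * first.length + mean(second) * second.length) /
               (first.length + second.length);

console.log(trueMean, naive, weighted); // 35 30 35
```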

In the previous, simplified, code snippet we set up the variables that will hold the averages.

    for(var i=0; i<values.length; i++) {
        var vcount;
        if(rereduce) { vcount = values[i].count } else { vcount = 1 }

We now loop through each of the values and work out how many weather records the value we’re processing represents. The initial pass will just represent a single record, but in the rereduce step it will be more.

        temp_in = temp_in + values[i].temp_in * vcount;
        temp_out = temp_out + values[i].temp_out * vcount;
        abs_pressure = abs_pressure + values[i].abs_pressure * vcount;

Here we build up the total values for temperature and pressure. Later we’ll divide these by the number of records to get the average. The next section adds the rain count up and selects the maximum wind gust.

        rain = rain + values[i].rain;

        wind_ave = wind_ave + values[i].wind_ave * vcount;
        if(values[i].wind_gust > wind_gust) { wind_gust = values[i].wind_gust; }

So far we’ve not really had to worry about the possibility of a rereduce, but for wind direction we need to take it into account. An individual record has a single wind direction, but for an hourly record we want to store the number of times each direction was recorded. If we’re rereducing we need to loop through all the directions and combine them.

        if(rereduce) {
            for(var j=0; j<8; j++) {
                wind_dir[j]["value"] += values[i].wind_dir[j]["value"];
            }
        } else if(values[i].wind_ave > 0 && values[i].wind_dir >= 0 && values[i].wind_dir < 16) {
            wind_dir[Math.floor(values[i].wind_dir/2)]["value"] += 1;
        }

        if(values[i].timestamp < timestamp) { timestamp = values[i].timestamp; }
        count = count + vcount;
    }
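The wind direction handling can be tried out on its own. This is a sketch rather than the site's code: it splits the two branches into separate hypothetical functions, `countDirections` and `mergeDirections`, and assumes, as the range check suggests, that a raw `wind_dir` is a number from 0 to 15 that is halved to pick one of eight buckets.

```javascript
function emptyBuckets() {
    var buckets = [];
    for (var i = 0; i < 8; i++) { buckets.push({ value: 0 }); }
    return buckets;
}

// First pass: count raw readings into 8 buckets, skipping calm readings.
function countDirections(records) {
    var buckets = emptyBuckets();
    records.forEach(function (r) {
        if (r.wind_ave > 0 && r.wind_dir >= 0 && r.wind_dir < 16) {
            buckets[Math.floor(r.wind_dir / 2)].value += 1;
        }
    });
    return buckets;
}

// Rereduce: add partial histograms together bucket by bucket.
function mergeDirections(partials) {
    var buckets = emptyBuckets();
    partials.forEach(function (p) {
        for (var j = 0; j < 8; j++) { buckets[j].value += p[j].value; }
    });
    return buckets;
}

var a = countDirections([{ wind_ave: 2, wind_dir: 0 }, { wind_ave: 1, wind_dir: 1 }]);
var b = countDirections([{ wind_ave: 3, wind_dir: 15 }, { wind_ave: 0, wind_dir: 4 }]);
var merged = mergeDirections([a, b]);
var summary = merged.map(function (x) { return x.value; }).join(",");
console.log(summary); // 2,0,0,0,0,0,0,1
```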

The final stage is to build the object that we’re going to return. This stage is very straightforward: we just need to divide the totals we calculated before by the number of records. This gives us the correct average for these values.
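The claim that dividing by the count gives the correct average, even after a rereduce, can be checked with a cut-down version of the function. This sketch keeps only one averaged field, the rain total and the maximum gust, and the sample readings are invented:

```javascript
// Cut-down reduce: averages temp_out, sums rain, keeps the maximum gust.
function reduceWeather(keys, values, rereduce) {
    var count = 0, temp_out = 0, rain = 0, wind_gust = 0;
    for (var i = 0; i < values.length; i++) {
        // A reduced value stands for `count` records; a raw record for one.
        var vcount = rereduce ? values[i].count : 1;
        temp_out += values[i].temp_out * vcount;
        rain += values[i].rain;
        if (values[i].wind_gust > wind_gust) { wind_gust = values[i].wind_gust; }
        count += vcount;
    }
    return { count: count, temp_out: temp_out / count, rain: rain, wind_gust: wind_gust };
}

var records = [
    { temp_out: 10, rain: 0,    wind_gust: 3 },
    { temp_out: 12, rain: 0.5,  wind_gust: 7 },
    { temp_out: 14, rain: 0.25, wind_gust: 5 },
    { temp_out: 20, rain: 0,    wind_gust: 2 },
    { temp_out: 22, rain: 0,    wind_gust: 9 },
    { temp_out: 24, rain: 0.25, wind_gust: 4 }
];

// Reduce everything in one go...
var single = reduceWeather(null, records, false);

// ...or in two batches that are then rereduced.
var partA = reduceWeather(null, records.slice(0, 2), false);
var partB = reduceWeather(null, records.slice(2), false);
var combined = reduceWeather(null, [partA, partB], true);

console.log(single);   // { count: 6, temp_out: 17, rain: 1, wind_gust: 9 }
console.log(combined); // { count: 6, temp_out: 17, rain: 1, wind_gust: 9 }
```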

    return {
            "count": count,
            "status": status,
            "timestamp": timestamp,
            "temp_in": temp_in / count,
            "temp_out": temp_out / count,
            "abs_pressure": abs_pressure / count,
            "rain": rain,
            "wind_ave": wind_ave / count,
            "wind_gust": wind_gust,
            "wind_dir": wind_dir
    };
}

Now that we have averaged the weather data into hourly chunks we can use a list function, like the one described in the previous post, to display the data.
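With keys in this format, asking for a month of hourly data is a matter of choosing a start and end key. The sketch below is only an illustration: the list name `month` and view name `by_hour` are hypothetical, while `group`, `startkey` and `endkey` are standard CouchDB view query parameters.

```javascript
function pad2(n) { return (n < 10 ? "0" : "") + n; }

// Number of days in a month (month is 1-12). Day 0 of the following
// month is the last day of the month we are interested in.
function daysInMonth(year, month) {
    return new Date(Date.UTC(year, month, 0)).getUTCDate();
}

// First and last hourly keys for the given month.
function monthKeyRange(year, month) {
    var prefix = year + "-" + pad2(month) + "-";
    return {
        startkey: prefix + "01 00",
        endkey: prefix + pad2(daysInMonth(year, month)) + " 23"
    };
}

// Hypothetical list and view names; keys must be JSON-encoded in the URL.
function monthUrl(year, month) {
    var range = monthKeyRange(year, month);
    return "_list/month/by_hour?group=true" +
        "&startkey=" + encodeURIComponent(JSON.stringify(range.startkey)) +
        "&endkey=" + encodeURIComponent(JSON.stringify(range.endkey));
}

console.log(monthKeyRange(2011, 5)); // { startkey: '2011-05-01 00', endkey: '2011-05-31 23' }
console.log(monthUrl(2011, 5));
```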

In the next and final post in this series I’ll discuss the records page on the weather site.

Photo of Weather front by Paul Wordingham.

Back Garden Weather in CouchDB (Part 3)

In this series I’m describing how I used a CouchDB CouchApp to display the weather data collected by a weather station in my back garden. In the first post I described CouchApps and how to get a copy of the site. In the next post we looked at how to import the data collected by PyWWS and how to render a basic page in a CouchApp. In this post we’ll extend the basic page to display real weather data.

Each document in the database is a record of the weather data at a particular point in time. As we want to display the data over a whole day we need to use a list function. List functions work similarly to the show function we saw in the previous post. Unlike show functions, list functions don’t have the document passed in. Instead they call a getRow function, which returns the next row to process. When there are no rows left it returns null.

Show functions process an individual document and return a single object containing the processed data and any HTTP headers. Because a list function can process a potentially huge number of rows, it returns data in a different way. Rather than returning a single object containing the whole response, list functions must return their response in chunks. First you need to call the start function, passing in any headers that you want to return. Then you call send one or more times to return parts of your response. A typical list function will look like the code below.

function (head, req) {
    start({ "headers": { "Content-Type": "text/html" }});

    var row;
    while(row = getRow()) {
        // placeholder for the real processing of each row
        var data = JSON.stringify(row.value);
        send(data);
    }
}
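To experiment with this pattern outside CouchDB you can supply your own stand-ins for `start`, `send` and `getRow`. The stubs and sample rows below are invented for illustration and are much simpler than what CouchDB really does:

```javascript
// Stand-ins for the functions CouchDB provides to a list function.
var output = [];
var sentHeaders = null;
var queue = [
    { key: "2011-05-03 09", value: { temp_out: 11.5 } },
    { key: "2011-05-03 10", value: { temp_out: 12.25 } }
];
function start(response) { sentHeaders = response.headers; }
function send(chunk) { output.push(chunk); }
function getRow() { return queue.length > 0 ? queue.shift() : null; }

// A list function in the shape described above: start once, then send per row.
function list(head, req) {
    start({ "headers": { "Content-Type": "text/plain" } });
    var row;
    while ((row = getRow())) {
        send(row.key + ": " + row.value.temp_out + "\n");
    }
    return "";
}

list({}, {});
console.log(output.join(""));
```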

To process the weather data we can’t follow this simple format because we need to split each document up and display the different measurements separately. Let’s look at the code for creating the day page. The complete code is a bit too long to include in a blog post, so check out the first post in this series to find out how to get a complete copy of the code.

To start the function we load the templates and code that we need using the CouchApp macros. Next we return the appropriate Content-Type header, and then we create the object that we’ll pass to Mustache when we’ve processed everything.

function(head, req) {
    // The first template name was lost from the original; templates.day is assumed
    // !json templates.day
    // !json templates.head
    // !json templates.foot
    // !code vendor/couchapp/lib/mustache.js
    // !code vendor/sprintf-0.6.js
    // !code vendor/date_utils.js

    start({ "headers": { "Content-Type": "text/html" }});

    var stash = {
        head: templates.head,
        foot: templates.foot,
        date: req.query.startkey
    };

Next we build a list of the documents that we’re processing so we can loop over the documents multiple times.

    var rows = [];
    var row;
    while (row = getRow()) {
        // assumes the view emits the whole document as the row's value
        rows.push(row.value);
    }

To calculate maximum and minimum values we need to choose a starting value and then run through each piece of data to see whether it is higher or lower than the current record. As the data collector of the weather station is separate from the outside sensors, they occasionally lose their connection. This means that we can’t just pick the value in the first document as our starting value. Instead we must choose the first document where the connection with the outside sensors was made.

    if(rows.length > 0) {
        for(var i=0; i<rows.length; i++) {
            if((rows[i].status & 64) == 0) {
                max_temp_out = rows[i].temp_out;
                min_temp_out = rows[i].temp_out;
                max_hum_out = rows[i].hum_out;
                min_hum_out = rows[i].hum_out;
                break; // stop at the first document with outside data
            }
        }
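A slightly different way to get the same result is to start from null and skip any rows where contact was lost, which avoids the separate search for a starting value. This is an alternative sketch, not the code used on the site; the function name `outsideRange` and the sample rows are made up.

```javascript
var LOST_CONTACT = 64; // status bit tested by the code above

// Find the highest and lowest outside temperature, ignoring rows
// recorded while the outside sensors were out of contact.
function outsideRange(rows) {
    var max = null, min = null;
    for (var i = 0; i < rows.length; i++) {
        if ((rows[i].status & LOST_CONTACT) !== 0) { continue; }
        var t = rows[i].temp_out;
        if (max === null || t > max) { max = t; }
        if (min === null || t < min) { min = t; }
    }
    return { max: max, min: min };
}

var sample = [
    { status: 64, temp_out: null },
    { status: 0, temp_out: 9.5 },
    { status: 0, temp_out: 14 },
    { status: 64, temp_out: null },
    { status: 0, temp_out: 7.25 }
];
var range = outsideRange(sample);
console.log(range); // { max: 14, min: 7.25 }
```

If every row has lost contact, both values stay null, which the template would need to handle.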


Now we come to the meat of the function. We loop through all of the documents and process them into a series of arrays, one for each graph that we’ll draw on the final page.

        for(var i=0; i<rows.length; i++) {
            var temp_out = null;
            var hum_out = null;
            if((rows[i].status & 64) == 0) {
                temp_out = rows[i].temp_out;
                hum_out = rows[i].hum_out;

                total_rain = total_rain + rows[i].rain;
                rainfall.push({ "time": time_text, "rain": rows[i].rain });

                wind.push({ "time": time_text, "wind_ave": rows[i].wind_ave, "wind_gust": rows[i].wind_gust });
            }

            pressure.push({ "time": time_text, "pressure": rows[i].abs_pressure });

            temps.push({ "time": time_text, "temp_out": temp_out, "temp_in": rows[i].temp_in });

            humidity.push({ "time": time_text, "hum_in": rows[i].hum_in, "hum_out": hum_out });
        }
    }

Lastly we take the stash, which in a bit of code I’ve not included here has the data arrays added to it, and use it to render the day template.

    // templates.day is assumed: the template argument was lost from the original
    send(Mustache.to_html(templates.day, stash));

    return "";
}

Let’s look at a part of the day template. The page is a fairly standard use of the Google Chart Tools library. In this first snippet we render the maximum and minimum temperature values, and a blank div that we’ll fill with the chart.


<p>Outside: <b>Maximum:</b> {{ max_temp_out }}<sup>o</sup>C <b>Minimum:</b> {{ min_temp_out }}<sup>o</sup>C</p>
<p>Inside: <b>Maximum:</b> {{ max_temp_in }}<sup>o</sup>C <b>Minimum:</b> {{ min_temp_in }}<sup>o</sup>C</p>

<div id="tempchart_div"></div>

In the following Javascript function we build a DataTable object that we pass to the library to draw a line chart. The {{#temps}} and {{/temps}} construction is the Mustache way of looping through the temps array. We use it to dynamically write out Javascript code containing the data we want to render.

function drawTempChart() {
    var data = new google.visualization.DataTable();
    data.addColumn('string', 'Time');
    data.addColumn('number', 'Outside');
    data.addColumn('number', 'Inside');

    data.addRows([
    {{#temps}}
        ['{{ time }}', {{ temp_out }}, {{ temp_in }}],
    {{/temps}}
    ]);

    var chart = new google.visualization.LineChart(document.getElementById('tempchart_div'));
    chart.draw(data, {width: 950, height: 240, title: 'Temperature'});
}
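If you haven't used Mustache sections before, it may help to see what the loop expands to. The snippet below does not use Mustache at all; it is a plain Javascript stand-in that repeats the inner line once per element of a made-up `temps` array, which is roughly what the section does when the page is rendered.

```javascript
var temps = [
    { time: "09:00", temp_out: 11.5, temp_in: 19 },
    { time: "09:05", temp_out: 11.75, temp_in: 19.25 }
];

// Stand-in for the section: one line of generated Javascript per element.
var lines = temps.map(function (t) {
    return "    ['" + t.time + "', " + t.temp_out + ", " + t.temp_in + "],";
});
console.log(lines.join("\n"));
```

Each generated line becomes one row of the chart's data when the browser runs the rendered page.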

We now have a page that displays all the collected weather data for a single day. In the next post in this series we’ll look at how to use CouchDB’s map/reduce functions to process the data so we can display it by month and by year.

Photo of almost may by paul bica.

Back Garden Weather in CouchDB (Part 2)

In my last post I described the new CouchDB-based website I have built to display the weather data collected from the weather station in my back garden. In this post I’ll describe how to import the data into CouchDB and the basics of rendering a page with a CouchApp.

PyWWS writes out the raw data it collects into a series of CSV files, one per day. These are stored in two levels of nested directories, the first being the year and the second being year-month. To collect the data I use PyWWS’s live logging mode, which consists of a process constantly running, talking to the data collector. Every five minutes it writes a new row into today’s CSV file. Another process then runs every five minutes to read the new row and import it into the database.

Because CouchDB stores its data using an append-only format you should aim to avoid unnecessary updates. The simplest way to write the import script would be to import each day’s data every five minutes. This would cause the database to balloon in size, so instead we query the database to find the last update time and import everything after that. Each update is stored as a separate document in the database, with the timestamp attribute containing the unix timestamp of the update.
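The filtering step might look something like the sketch below. It is not the site's import script: the two-column CSV layout, the ISO-8601 timestamps and the function names are all assumptions chosen to keep the example short.

```javascript
// Parse one CSV line into a record. The column layout here
// (ISO-8601 UTC time, then outside temperature) is hypothetical.
function parseRow(line) {
    var cols = line.split(",");
    return {
        timestamp: Date.parse(cols[0]) / 1000, // unix timestamp in seconds
        temp_out: parseFloat(cols[1])
    };
}

// Keep only the rows that are newer than the last update in the database.
function rowsToImport(lines, lastTimestamp) {
    return lines.map(parseRow).filter(function (r) {
        return r.timestamp > lastTimestamp;
    });
}

var lines = [
    "2011-05-03T09:00:00Z,11.5",
    "2011-05-03T09:05:00Z,11.75",
    "2011-05-03T09:10:00Z,12"
];
var last = Date.parse("2011-05-03T09:00:00Z") / 1000;
var fresh = rowsToImport(lines, last);
console.log(fresh.length); // 2
```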

The map code to get the most recent update is quite simple: we just need to emit the timestamp for each update. The reason the timestamp is emitted as the key is so we can filter the range of updates. It is also emitted as the value so we can use the timestamp in the reduce function.

function(doc) {
    emit(doc.timestamp, doc.timestamp);
}

The reduce function is a fairly simple way to calculate the maximum of the values, which here are the same timestamps as the keys. I’ve mostly included it here for completeness.

function(keys, values, rereduce) {
    if(values.length == 0) {
        return 0;
    }

    var m = values[0];

    for(var i=0; i<values.length; i++) {
        if(values[i] > m) { m = values[i]; }
    }

    return m;
}
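Because the maximum of several maximums is still the overall maximum, this function can safely ignore the rereduce flag. A quick check with invented timestamps, where `maxReduce` is simply the function above given a name so that it can be called:

```javascript
function maxReduce(keys, values, rereduce) {
    if (values.length == 0) { return 0; }
    var m = values[0];
    for (var i = 0; i < values.length; i++) {
        if (values[i] > m) { m = values[i]; }
    }
    return m;
}

var timestamps = [1304413200, 1304413500, 1304413800, 1304414100];

// Reduce two halves separately, then rereduce the partial results.
var partials = [
    maxReduce(null, timestamps.slice(0, 2), false),
    maxReduce(null, timestamps.slice(2), false)
];
var overall = maxReduce(null, partials, true);

console.log(overall === maxReduce(null, timestamps, false)); // true
```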

You’ll find the import script that I use in the directory you cloned in the previous post, when you got a copy of the website.

So, we’ve got some data in our database. How do we display it on a webpage? First, let’s consider the basics of rendering a webpage.

CouchDB has two ways to display formatted data: show and list functions. Show functions allow you to format a single document, for example a blog post. List functions allow you to format a group of documents, such as the comments on a post. Because viewing a single piece of weather data is not interesting, the weather site only uses list functions. To get started let’s create a simple show function, as these are simpler.

CouchApp doesn’t come with a templating library, but a common one to use is Mustache. The syntax is superficially like Django templates, but in reality it is far less powerful. For a simple website like this, Mustache is perfect.

In the shows directory of your CouchApp create a new file, test.js. As with the map/reduce functions, this file contains an anonymous function. In this case the function takes two parameters, the document and the request object, and returns an object containing the response body and any headers.

function (doc, req) {
    // !json templates.records
    // !json templates.head
    // !json templates.foot
    // !code vendor/couchapp/lib/mustache.js

The function begins with some magic comments. These are commands to the CouchApp tool, which includes the referenced code or data in the function before it is uploaded to CouchDB. This allows you to keep shared data separate from the functions that use it.

The first !json command causes the compiler to load the file templates/records.* and add it to a templates object, under the records attribute.

The !code command works similarly, but it loads the specified file and includes the code in your function. Here we load the Mustache library, but I have also used the command to load a Javascript implementation of sprintf. You might want to load some of your own common code using this method.

    var stash = {
        head: templates.head,
        foot: templates.foot
    };

    return { body: Mustache.to_html(templates.records, stash), headers: { "Content-Type": "text/html" } };
}

Firstly we build an object containing the data we want to use in our template. As Mustache doesn’t allow you to extend templates we need to pass the header and footer HTML code in as data.

As mentioned, the return type of a show function is an object containing the HTML and any HTTP headers. We only want to include the content type of the page, but you could return any HTTP header in a similar fashion. To generate the HTML we call the to_html function provided by Mustache, passing the template and the data object we prepared earlier.
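The shape of the return value is easy to try outside CouchDB. This sketch swaps Mustache for plain string concatenation so that it runs on its own; the document passed in and its `title` field are made up.

```javascript
// A show function with the same return shape as the one above, but
// building the body by hand instead of rendering a Mustache template.
function show(doc, req) {
    var body = "<h1>" + doc.title + "</h1>";
    return { body: body, headers: { "Content-Type": "text/html" } };
}

var response = show({ title: "Weather records" }, { query: {} });
console.log(response.headers["Content-Type"]); // text/html
console.log(response.body);                    // <h1>Weather records</h1>
```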

Now that we have data in our database and can create simple pages using a CouchApp, we can move on to showing real data. In the next post I will describe the list functions used to show summarized day and month weather information.

Photo of its raining..its pouring by samantha celera.