CouchDB Performance

I’ve been toying with CouchDB for a short while, and I’m definitely impressed by what I’ve seen. Once I’d upgraded to Erlang R12B and trunk CouchDB any bugs I was seeing disappearing and importing all 1 million documents was straightforward.

With 1 million documents the map/reduce takes a long time, as you would expect. What would be nice is if the maps could be spread across different nodes to speed things up dramatically. Once the map has been calculated and cached, retrieving it is relatively fast. Parsing it in Python does seem to be quite slow, taking a few seconds for a few tens of thousands of results. This is far too slow for a webpage response.

Is there any way to speed up CouchDB? Well aggressive use of memcache will probably help, but too me it seems that CouchDB is not suited to large datasets. I do hope I’m wrong though, and I’m going to investigate further because I really want to find a use for CouchDB in my work.


Author: Andrew Wilkinson

I'm a computer programmer and team leader working at the UK grocer and tech company, Ocado Technology. I mostly write multithreaded real time systems in Java, but in the past I've worked with C#, C++ and Python.

One thought on “CouchDB Performance”

  1. Did you ever use couchdb for a > 1 million dataset? I have a mere 100K dataset yet the view(* a word counter) is taking FOREVER to compute.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s