Science - 7 February 2013

Order in the statistical chaos

Statistical model from Wageningen conquers the world.
New user-friendly version puts Canoco within the layperson's reach too.


Thousands of ecologists the world over use Cajo ter Braak's statistical analysis model Canoco to process their research data. At the end of January he presented the latest version of his software: Canoco 5. Even researchers with little knowledge of statistics can work with this version.
So what does Canoco do exactly? An example. If you want to study the negative impact of pesticides on aquatic life, you will often be dealing with an ecosystem with at least 200 species in it. That means 200 graphs plotting the effect the poison has on all the little water creatures. And that is not what you want; you want an overview of the response of all the creatures at a glance. And you can only get that with Canoco, says ecotoxicologist Paul van den Brink. He uses the relevant method in Canoco, and indeed helped develop it, as a longstanding colleague of Cajo ter Braak's.
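For readers who want to see the idea of "200 graphs in one overview" in miniature, here is a rough sketch. It is not Canoco's algorithm: it uses plain principal component analysis as a stand-in, and the pesticide doses, the species sensitivities and every function name are invented for illustration. It only shows how many species responses can be condensed into a single score per sample.

```python
import math
import random


def first_principal_axis(rows, iterations=200):
    """Leading principal axis of `rows` (samples x species) by power iteration."""
    n_samples, n_species = len(rows), len(rows[0])
    # Centre each species column so the axis describes deviations from its mean.
    means = [sum(r[j] for r in rows) / n_samples for j in range(n_species)]
    centred = [[r[j] - means[j] for j in range(n_species)] for r in rows]

    def project(axis):
        # One score per sample: the weighted sum of all species deviations.
        return [sum(c[j] * axis[j] for j in range(n_species)) for c in centred]

    axis = [1.0 / math.sqrt(n_species)] * n_species
    for _ in range(iterations):
        scores = project(axis)
        # Power-iteration step: axis <- X^T (X axis), then normalise.
        new_axis = [sum(centred[i][j] * scores[i] for i in range(n_samples))
                    for j in range(n_species)]
        norm = math.sqrt(sum(a * a for a in new_axis))
        if norm == 0.0:  # no variation at all in the data
            break
        axis = [a / norm for a in new_axis]
    return axis, project(axis)


def pearson(xs, ys):
    """Pearson correlation between two equally long lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Invented toy experiment: 8 water samples at 4 pesticide doses, 200 species.
rng = random.Random(0)
doses = [0, 0, 1, 1, 2, 2, 4, 4]
species = [(rng.uniform(5, 10), rng.uniform(0, 1)) for _ in range(200)]
abundances = [
    [baseline - sensitivity * dose + rng.gauss(0, 0.1)
     for baseline, sensitivity in species]
    for dose in doses
]

axis, scores = first_principal_axis(abundances)
# One number per sample now summarises the response of all 200 species.
print("correlation of summary score with dose:", round(pearson(doses, scores), 3))
```

In this toy setting the single score per sample follows the dose closely, which is the "overview at a glance" the example above is after; real ecological data are far messier, and Canoco's own methods do considerably more.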
27,000 citations
Ter Braak, who works at PRI Biometris, laid the foundations for this software package back in the nineteen eighties. His publication from 1986, in which he combines several methods of statistical analysis with the prototype for Canoco, unleashed a revolution in the processing of research data. Together with Czech researcher Petr Šmilauer, he went on to develop better and more advanced versions of Canoco. Thousands of scientists make use of Ter Braak's software, as the 4,000 licenses sold for it attest. In the EU, Canoco has become a fixture for the analysis of certain complex datasets. And the analysis model turns up in countless scientific articles, making Ter Braak, with about 27,000 citations, the most cited Wageningen scientist.
More user-friendly
Nevertheless, previous versions of Canoco - up to version 4.5 - were by no means child's play to use. For the connoisseurs: it was based on canonical correspondence analysis - hence Canoco - which Ter Braak combined with a handful of other statistical methods. 'You don't just get that under your belt on a Monday morning; it takes you a week to get to grips with the basics of Canoco,' says Van den Brink. 'But after that you can do something other people can't do.'
Van den Brink loses that advantage now, because Canoco 5 is considerably more user-friendly. Where researchers used to have to call on Ter Braak to advise them on a regular basis, the Canoco Adviser can do that job now. The digital adviser evaluates the research data, chooses suitable analysis methods and tests the research outcomes, complete with a reliability check. 'You no longer need to be an expert to be able to use the software,' says Ter Braak. Canoco has come of age, 'and I am its grandfather.' In recent years, the programming has been done by Šmilauer, with Ter Braak checking his work and helping to write the new manual.
And what is the grandfather of Canoco doing now? Solving crosswords? No: he has been working on genetic algorithms for years with the aim of extracting information from vast and complex datasets, using a Markov chain Monte Carlo version of such an algorithm, with the aid of Bayesian theory. Ah, yes of course.