DATA ANALYSIS

Xlstat 2006.5

8 January 2007

Kovach Computing Services

Reviewed by Felix Grant

With Microsoft Excel 2007 arriving on the scene amid much fanfare, this may seem a strange time to launch or review a new release of a well-established add-in product for Excel versions 97 to 2003. Most Excel users out there have never upgraded from version 2000, however, and from a scientific computing point of view the launch of a new, larger sheet marks a good moment to pause and take stock before the add-in market responds.

Convenience is the most obvious reason for using Excel to do statistical work: it’s there, it’s widely used, and everyone knows it. But building an analysis application from scratch is a time-consuming business, as is repetitive use of basic Excel functions. It’s far easier, quicker and more economical to build on Xlstat’s decade or so of prior experience. Then again, an alarming proportion of in-house Excel applications incorporate fatal design errors. Buying in an existing solution offers the security of knowing that many mistakes will already have been found and ironed out during that same time in the market.

The biggest argument in favour of Xlstat over raw Excel, though, is about the underlying algorithms. I don’t claim to have pushed them to the limit, but Xlstat uses its own, separate, more reliable and better-validated statistical function computation routines to bypass Excel’s decidedly flaky ones.
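To illustrate why the choice of algorithm matters, here is a minimal Python sketch of one well-known pitfall in statistical computing: the one-pass 'calculator' formula for sample variance loses precision when the values are large relative to their spread, whereas Welford's updating method does not. This is an illustration only, not a claim about how either Excel or Xlstat computes variance internally; the function names and the test data are assumptions made for the example.

```python
def naive_variance(values):
    """One-pass 'calculator' formula: (sum of squares - n * mean^2) / (n - 1).

    Subtracts two very large, nearly equal numbers, so rounding error
    in the squares can swamp the true answer.
    """
    n = len(values)
    total = sum(values)
    total_sq = sum(x * x for x in values)
    return (total_sq - total * total / n) / (n - 1)


def welford_variance(values):
    """Welford's updating method: accumulates squared deviations from a
    running mean, avoiding the large-number subtraction above."""
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for count, x in enumerate(values, start=1):
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    return m2 / (len(values) - 1)


# Illustrative data: three values with a large offset and a small spread.
# The exact sample variance is 1.
data = [1e9 + 1, 1e9 + 2, 1e9 + 3]

print("naive:  ", naive_variance(data))
print("welford:", welford_variance(data))
```

On data like this the stable routine recovers the exact variance of 1, while the naive formula drifts visibly away from it. That is the kind of difference a separately validated computation library is meant to remove.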

Once installed, Xlstat adds itself as a menu option and a toolbar, plus a ‘close Xlstat’ button on Excel’s own standard bar. Both menu and toolbar give access to the same set of nine functionally grouped operational routes (preparation, description, visualisation, analysis and modelling), tests (correlation/association, parametric and nonparametric) and five Xlstat modules, of which more in a moment. The approach works well, providing a well-structured environment within which to conduct statistical work.

The selection of tests under each heading, while smaller than would be available in most big freestanding analytic products, is well chosen for the context and considerably extends the range used or known by most spreadsheet users.

The Xlstat modules are aimed at particular purposes, grouping together techniques likely to be useful. Generalised Procrustes analysis and multiple factor analysis, for example, are found in both the ADA and MX modules, but canonical correspondence analysis only in the first. The other modules are Dose (dose effect analysis and four-parameter parallel-line logistic regression), Life (Kaplan-Meier and life tables) and Time (descriptive or spectral analyses, transformation, smoothing, ARIMA and Fourier transform).

Xlstat has a good, fast, detailed help system, a pleasure to use compared to Excel’s. One oversight is the lack of any explanation of some module names - Time is its own clue, but ADA (Advanced Data Analysis) and MX (marketing analytics) are more opaque.

Perhaps the most visceral illustration of how far you are from raw Excel is the options dialogue box. This allows you, for instance, to control explicitly, through tick boxes, the circumstances in which cells will be considered to be missing data.
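The idea of explicit, switchable missing-data rules can be sketched in a few lines of Python. Everything here is hypothetical: the rule names, the default codes and the sample column are invented for illustration and do not reproduce Xlstat's actual options.

```python
from dataclasses import dataclass


@dataclass
class MissingDataRules:
    # Hypothetical switches standing in for tick boxes in an options dialogue.
    empty_is_missing: bool = True      # blank cells count as missing
    text_is_missing: bool = True       # any non-numeric text counts as missing
    codes: tuple = ("NA", "#N/A")      # explicit missing-value markers


def is_missing(cell, rules):
    """Decide whether a single cell should be treated as missing data."""
    if cell is None or (isinstance(cell, str) and cell.strip() == ""):
        return rules.empty_is_missing
    if isinstance(cell, str):
        if cell.strip() in rules.codes:
            return True
        return rules.text_is_missing
    return False  # numeric cells are always usable


# Illustrative column mixing numbers, blanks, a marker code and stray text.
column = [4.2, "", "NA", "n/k", 3.9, None]

strict = [c for c in column if not is_missing(c, MissingDataRules())]
lenient = [c for c in column
           if not is_missing(c, MissingDataRules(text_is_missing=False))]

print(strict)
print(lenient)
```

Making the rules an explicit object, rather than leaving them implicit in each formula, means the same decision is applied consistently across every analysis that reads the sheet.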

Whether Excel is the right place to conduct statistical analysis is a valid question, but the dynamics and economics of the real world mean that a lot of work is, and will be, done there. That being so, regimes that use Excel’s strengths and bypass its flaws are vital to both productivity and quality, and Xlstat is a good, cost-effective, well-implemented, well-documented example.