Statistica 9

4 August 2009

Statsoft

Reviewed by Felix Grant

The most obvious and most radical changes in version 9 of Statistica are, respectively, to interface and to architecture. However, I'll put those off and look first at the core function developments.

The ability to call external code modules is not new to this release, and specific support for material coded in the R language was added to version 8, but full support is still news. R is widespread in academic data-analytic computing, and its close relation to S (Statistica will recognise S script files, and often run them as is) provides a further base of users who can transfer to it with comparative ease. At the simplest, R scripts can be used to extend Statistica's capability into specialised areas already solved in R, and to place Statistica's container, graphic and reporting structures at the service of R. With thought, more complex setups can be built to pass data back and forth as aspects of both are utilised. The Statistica Visual Basic program R.svp, now installed along with Statistica Release 9, does a good job of mediating between the two systems to provide any level of user-transparent hybrid. Since desktop Statistica can shunt scripts to WebStatistica, and WebStatistica can handle multiple script instances, heavy R analyses (particularly data mining projects) can also be split up over multiple processors under overall Statistica control.

Related to this expansibility are a spreadsheet library, ANSI-92 SQL JOIN support and enhanced macro recording.
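
For readers unfamiliar with the term, ANSI-92 join syntax states the join explicitly in the FROM clause rather than burying it in WHERE conditions. The sketch below shows that syntax using Python's built-in sqlite3 module, not Statistica's own query tool; the table and column names (subjects, readings) and the function name are invented purely for illustration.

```python
import sqlite3


def run_join_demo():
    """Build two small in-memory tables and query them with explicit joins."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical tables: one row per subject, several readings per subject.
    cur.execute("CREATE TABLE subjects (id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute("CREATE TABLE readings (subject_id INTEGER, value REAL)")
    cur.executemany("INSERT INTO subjects VALUES (?, ?)",
                    [(1, "A"), (2, "B"), (3, "C")])
    cur.executemany("INSERT INTO readings VALUES (?, ?)",
                    [(1, 4.2), (1, 4.8), (2, 3.9)])

    # INNER JOIN: only subjects that have at least one reading.
    inner = cur.execute(
        "SELECT s.name, r.value FROM subjects AS s "
        "INNER JOIN readings AS r ON r.subject_id = s.id "
        "ORDER BY s.name, r.value"
    ).fetchall()

    # LEFT OUTER JOIN: every subject, with NULL where no reading exists.
    outer = cur.execute(
        "SELECT s.name, r.value FROM subjects AS s "
        "LEFT OUTER JOIN readings AS r ON r.subject_id = s.id "
        "ORDER BY s.name, r.value"
    ).fetchall()

    conn.close()
    return inner, outer


if __name__ == "__main__":
    inner_rows, outer_rows = run_join_demo()
    print(inner_rows)
    print(outer_rows)
```

The practical difference shows up with subject C, which has no readings: the inner join drops it, while the outer join keeps it with a missing value.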

A useful new addition to the statistics menu is 'distributions and simulation', an automated multivariate distribution fitting and utilisation robot still in beta, but very usable and very effective. There is also the usual crop of enhancements and extensions to existing facilities across the range, plus some new options in certain areas, and Data Mining Recipes (previously also a beta) has become a full product.
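
The general idea of fitting a multivariate distribution and then simulating from it can be sketched in a few lines. What follows is not Statistica's algorithm, which the product does not expose here; it is a minimal illustration that assumes a bivariate normal model, and the function names fit_bivariate_normal and simulate are hypothetical.

```python
import math
import random


def fit_bivariate_normal(pairs):
    """Estimate the mean vector and sample covariance of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in pairs) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in pairs) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def simulate(mean, cov, n, seed=0):
    """Draw n correlated pairs from the fitted bivariate normal."""
    rng = random.Random(seed)
    mx, my = mean
    sxx, sxy, syy = cov

    # Cholesky factor of the 2x2 covariance matrix, written out by hand.
    l11 = math.sqrt(sxx)
    l21 = sxy / l11
    l22 = math.sqrt(max(syy - l21 ** 2, 0.0))

    draws = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Transform independent standard normals into correlated values.
        draws.append((mx + l11 * z1, my + l21 * z1 + l22 * z2))
    return draws


if __name__ == "__main__":
    # Made-up observations, for illustration only.
    observed = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
    mean, cov = fit_bivariate_normal(observed)
    simulated = simulate(mean, cov, 1000, seed=42)
    print(mean, cov, simulated[:3])
```

A real fitting tool would also compare candidate distribution families and test goodness of fit; this sketch fixes the family in advance and only preserves the means and covariance of the input.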

Returning to where I started, the architecture now presents two versions: 32- and 64-bit. For many purposes there is no discernible difference, but in applications that push the limits of performance or resources, the advantages of 64-bit become dramatic. In the tasks on which I tested the software, I didn't approach the doubling of speed that the documentation mentions, but intensive mining of very large data sets certainly showed real performance gains. Not that the 32-bit version is a slouch: in some computationally intensive situations it shows visibly significant speed increases over the already impressive performance of its predecessor release.

The interface has been redesigned to mimic Microsoft Office 2007. If (like me) you are not a fully paid-up fan of that approach, the familiar menus are only a click away – though if your data is in Excel 2007 you may be better off biting the bullet and learning to love the ribbon bar.

Whichever option you choose, things look much the same in actual operation (though there are numerous valuable handling and formatting refinements) and you get both sets of controls when you open an XLS or XLSX file. My own personal judgement as a critical user is that the new approach works very well for Statistica and has been far better implemented here than in Office itself.
