There are times when size doesn't matter
There is a strong tendency to be spellbound by the large: broad issues, far horizons, big software packages. Take a close look at what you do with your software. Not what you could do; just what you really do! The results can be illuminating - for what you personally do, is the best tool for the job really a large, all-powerful analysis package, capable of running a factory complex or the economic policy of a small country from your laptop?
'Small is beautiful', Schumacher told my generation. But as a reviewer, I'm as seduced as anyone by the awesome reach delivered into my hands by the large packages crossing my desk; small is all too often ignored. This is a shame. We know that small is not only beautiful, but can also be innovative, responsive, and adaptable to change. Small in size does not mean small in capability.
This 'size-ist' philosophising was brought on by encounters with two data consumers for whom the big, market-leading analysis packages were not the most appropriate solution. I meet a lot of these (usually those for whom data-analysis is just one, often irksome, among several diverse means to an end), but Ishaq and Paige happened along at the same time as the answers to their woes. Both faced the need to analyse and present, within a very short time, large bodies of unfamiliar data for consumption by non-scientists. Each was a newly employed graduate, viewing with dismay the learning curve required for their institution's standard statistical-software suite.
- Two copies of CoStatPro (multiple instances are permitted) run the same dataset imported from an Excel spreadsheet. In the copy on the right, the column options control box is open and a procedure is about to be run from the 'miscellaneous' menu. On the left, a start has been made on building a graph object from variables in the data set, and two control boxes are open for the purpose.
Since this was an ideal opportunity to kill two birds with one stone, we looked through the review list for packages that might lighten their spirits. Not everything that comes through a reviewer's inbox can be recommended for a particular purpose. Some of it cannot be recommended at all and, as my mother used to tell me: 'If you can't say anything nice, better not to say anything at all'. Fortunately, though, there is a lot of good software out there, and two ideal candidates had recently presented themselves ahead of the crowd. Unistat I've reviewed before, in previous versions, though not in this substantial upgrade; Cohort's software I was encountering for the first time.
The very fact of window-shopping for an analysis package recognises that a spreadsheet no longer adequately meets your needs. Where to go next, however, depends on many factors that may be any mix of practical, conceptual and psychological.
Ishaq's raw material was a mass of scientific evidence in a health-related 'class action' litigation case; his task was to provide a series of graded lay summaries for lawyers, jurors, court officials, and expert witnesses. With his need to produce multiple, layered variations with a differing balance of communicative and underpinning material, he liked the look of Cohort Software's spreadsheet-replacement double act, CoPlot and CoStat 6.2. The flagship product here is CoPlot (or CoPlotPro; there are two tiers of product, with differential capacity). CoStat is CoPlot's integrated data editor and analysis program; it can also be used as a stand-alone program, is available separately, and comes in three variants. For convenience, and to avoid confusion, I'll refer from now on to just 'CoStat' (the version supplied for review being CoStatPro, the one bundled with CoPlotPro).
- Unistat running in background mode from Excel, with the toolbar and menu additions visible and a procedure being selected from the 'Stats2' menu. A 3D bivariate histogram has been generated at left, and (with the template selector open ready to change), a star icon plot at right.
Paige is a geophysicist at that stage of her career when progress means doing the donkeywork and shouldering any blame for errors, while the professor takes credit for success. Faced with a disorganised archive of complex, unfamiliar data, stored in thousands of Excel worksheets from disparate sources, which she had to quickly turn into Word-based evidence for a bid to the finance committee, she took immediately to Unistat 5.5. Though the software runs happily and efficiently in stand-alone mode, Unistat has put a lot of time and effort into targeting those who wish to operate more effectively within an existing office suite regime. Designed to run either alone or on call from other programs, it emerges from the box ready to disappear again into Excel (and talk directly to Word) at the click of a mouse. For the purpose of this review, I have assumed this 'add-in' mode unless otherwise specified.
Both products have, of course, a worksheet. CoStat describes this as a database table - an appropriate metaphor, since internal spreadsheet functions are moved out to separate control boxes in favour of efficiency. Unistat refers to its worksheet as a data processor, and offers the usual level of spreadsheet functionality found in such worksheets (column formulae and transforms, for example), although in Excel add-in mode it will remain forever unseen. Both offer variable names but, for cell-referencing, Unistat uses Excel's own settings if in add-in mode, or 'CmRn' notation internally. CoStat manipulates variable names as objects (but does use row and column numbers as well, for current cursor position). There is no right or wrong, better or worse about this; it comes down to your operational priorities. CoStat offers tiny file sizes (four to eight per cent of the disk space taken by Excel equivalents, in the review examples) and concomitant speed benefit; Unistat offers the convenience of direct, in-place Excel processing. Pay your money and make your choice.
When it comes to getting data into the sheet, Unistat can obviously receive anything that Excel can open (or receive). CoStat has an unusually wide range of import filters for spreadsheets, databases, files from other maths or stats packages, and so on (not just for the usual suspects but including, for instance, Genstat, Matlab and Statistica) as well as ASCII, binary, ODBC and 'create from clipboard'.
Under their statistics menus, CoStat and Unistat are both more than competent for the majority of normal requirements. Unistat may have the edge in overall terms, but that really depends on the type of work you are doing. If you are looking for something particularly unusual, both are available as demo downloads (though not Cohort's 'CoPlotPro' version, and Unistat will only let you work on its supplied data sets) so you can make a detailed comparison for your particular needs. Neither Paige nor Ishaq found themselves reaching for a tool that was not available. Each has its highlights, of course. Ishaq was ecstatic about the Boolean 'keep-if' construct, which allows conditional inclusion or exclusion of data from any given process or analysis on a case-by-case basis. Paige topped her positive feedback list with Unistat's handling of cumulative probabilities and critical values, which can be stored or dynamically calculated in, and returned to, columns of Excel cells.
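The two highlights named above can be pictured with a minimal Python sketch. This is not CoStat's 'keep-if' syntax or Unistat's Excel function set, neither of which is reproduced in this review; the `keep_if` helper, the record fields and the sample values are all hypothetical, and the standard normal distribution is assumed purely for illustration.

```python
from statistics import NormalDist

# Hypothetical case records; field names and values are illustrative only.
cases = [
    {"id": 1, "age": 34, "exposed": True, "score": 7.2},
    {"id": 2, "age": 61, "exposed": False, "score": 5.1},
    {"id": 3, "age": 47, "exposed": True, "score": 8.4},
    {"id": 4, "age": 29, "exposed": True, "score": 6.0},
]


def keep_if(rows, condition):
    """Return only the rows for which the Boolean condition is true."""
    return [row for row in rows if condition(row)]


# Include a case only if it was exposed AND is aged 30 or over.
subset = keep_if(cases, lambda r: r["exposed"] and r["age"] >= 30)
kept_ids = [r["id"] for r in subset]

# Cumulative probability and critical value (standard normal assumed).
std_normal = NormalDist(mu=0.0, sigma=1.0)
cum_prob = std_normal.cdf(1.96)        # P(Z <= 1.96)
critical = std_normal.inv_cdf(0.975)   # z such that P(Z <= z) = 0.975
```

The point of the first half is that the condition is evaluated case by case, so the same data can feed several analyses with different inclusion rules. The second half shows the two directions of the lookup: from a value to its cumulative probability, and from a probability back to its critical value.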
'Keep-if' is one example of the subsetting tools in which CoStat is strong, and which are a prime feature of its latest upgrade. Among other new features are progressively more powerful multiple regression subsetting procedures (Forward, Replace1, Replace2, Simons1 and Simons2) for approximate methods. The presence or absence of these defines the three versions of CoStat (see comments on procurement, below): methods up to and including Replace2 are available in all versions, while the other two are added in the CoPlot and CoPlotPro versions respectively.
Unistat's many new developments, beyond those already mentioned, include natural log, probit or logit scaling options for (x,y) plots; extensions to tests (Royston's algorithm applied to large samples under Shapiro-Wilk, for instance, and an option to save test statistic distributions to file); and the introduction of stepwise discriminant analysis. These are backed up by significant improvements across the board. Output options are extended, primarily with page styles for HTML, an option on Excel output blocking, and graphical rich text support. Many dialogues are extended (a means plot checkbox in 2D graphic dialogues, for example), made more intelligent or friendly, or otherwise incrementally developed.
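For readers unfamiliar with the scaling options mentioned above, the following Python sketch shows the standard textbook definitions of the three transforms. It assumes nothing about how Unistat implements them; the function names and the sample proportions are illustrative only.

```python
import math
from statistics import NormalDist


def logit(p):
    """Log-odds of a proportion p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def probit(p):
    """Inverse of the standard normal cumulative distribution at p."""
    return NormalDist().inv_cdf(p)


# Hypothetical proportions to be placed on a transformed axis.
proportions = [0.1, 0.5, 0.9]
log_scaled = [math.log(p) for p in proportions]
logit_scaled = [logit(p) for p in proportions]
probit_scaled = [probit(p) for p in proportions]
```

Both logit and probit stretch the regions near 0 and 1 and are symmetric about a proportion of 0.5, which is why they are commonly used for plotting proportions and dose-response data.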
Although only CoStat has been looked at here, its link to CoPlot should not be disregarded. Unistat and Cohort both dramatically improve on the graphic communication capabilities of Excel, and this was important in both cases. CoPlot is dedicated to the purpose, with great precision of detailed control, while Unistat majors on analysis with quick, easily changed templates.
Ishaq's task involved a much heavier requirement than Paige's for communicating complex content to non-scientists, and CoPlot's dedicated graphics capability exerted a large influence on his selection of CoStat. Paige's audience, although non-specialist, was more familiar with statistical graphics and she had no need for 'creative' presentation. Unistat provided all she needed (including a 3D bivariate histogram, something which is surprisingly uncommon). Both packages give the graphics provision of many high-end products a serious run for their money. Unistat does so in a more accessible manner, while CoPlot provides much greater control for those willing to put in the time on learning to fly it.
Both user interfaces are crisp and professional, but vary in presentation. Unistat presents a face as slick as Excel's own, into which it blends, while CoStat has gone for function first and appearance second. CoStat's conventions also differ in some ways from those of the OS (the programs are available under Mac OS X as well as Windows); most of these cease to matter with familiarity, although single-click response in a generally double-click Windows environment leads to occasional overshoot when selecting files or directories. To complicate matters, CoStat can also be used from a command line. This is another area where personal preference comes into play very strongly.
In the initial selection stage, when my two guinea pigs were deciding what to use, Ishaq found CoStat's 'no nonsense' look and feel a positive reassurance ('I know where everything is. It's all crystal clear - I can see what's going on.'). Paige, by contrast, was intimidated ('I like my computer to be a closed box until I choose to open it up, not a transparent Perspex thing that distracts me with a constant view of its working parts; I feel the same about software').
There are facilities to allow some level of batch control in both products, but they differ. Macros in both cases are basic; don't expect them to echo the rich, high-end programming languages, which are aimed at a different audience. CoStat's macros are fairly conventional recordings: operations are captured as they are performed, stored on toolbar buttons for convenient access, and played back in the same context on different data. Unistat takes a different approach: the actual actions are ignored, and only the final resultant settings are saved.
As a result, CoStat's macros offer a shorter learning ramp but Unistat's are more efficient. Unistat's macros, however, are available only in stand-alone mode, not within Excel, although Unistat does respond to external control from Visual Basic (or any other language; I managed without much trouble to call it as a background engine from a Pascal routine, and from a word processor table).
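The difference between the two macro styles described above can be modelled in a few lines of Python. This is a deliberately simplified sketch, not either vendor's macro format: the class names, the `record` and `play` methods and the sample actions are all hypothetical.

```python
class RecordedMacro:
    """Stores every operation in order and replays the whole sequence."""

    def __init__(self):
        self.steps = []

    def record(self, name, value):
        self.steps.append((name, value))

    def play(self, settings):
        for name, value in self.steps:
            settings[name] = value
        return len(self.steps)  # number of operations replayed


class SettingsMacro:
    """Ignores the route taken; keeps only the final resulting settings."""

    def __init__(self):
        self.final = {}

    def record(self, name, value):
        self.final[name] = value  # a later value overwrites an earlier one

    def play(self, settings):
        settings.update(self.final)
        return len(self.final)  # number of settings applied


# A hypothetical session in which the user changes their mind twice.
actions = [("test", "t-test"), ("alpha", 0.10),
           ("test", "anova"), ("alpha", 0.05)]

recorded, snapshot = RecordedMacro(), SettingsMacro()
for name, value in actions:
    recorded.record(name, value)
    snapshot.record(name, value)

state_a, state_b = {}, {}
steps_replayed = recorded.play(state_a)
settings_applied = snapshot.play(state_b)
```

In this toy model both macros leave the session in the same final state, but the recording replays every intermediate step, false starts included, while the settings snapshot applies each setting only once.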
Procurement options are an issue to consider, depending on context and needs. Unistat 5.5 is straightforward: pay your money, install, obtain an activation code based on your platform ID, and off you go. CoStat 6.2 is more complicated, since it comes in three variants. The stand-alone version of CoStat is a straightforward purchase, as is CoPlot with a slightly beefed-up version of CoStat bundled in; CoPlotPro, however, which includes the most powerful version of CoStat, is available only on an annual licence.
Putting aside differences between the products, both users were clear in end-of-task debriefing that the selection of a small package, which could be quickly learned and easily put to use, was a good way to go. Both felt that the permanent addition of either package to their normal working environment (regardless of their preference) would increase their routine productivity. Also, interestingly, the experience of a short learning curve and immediate productivity on this software had, in both cases, led to more confident use of their large-scale, institutional software for other tasks. Both strongly expressed the view that a smaller personal package on the individual desktop was a complement to, not a competitor for, the larger and more powerful team tools across the organisation.