STATISTICS

GenStat 10th Edition

GenStat 10th Edition - VSNi

9 August 2007

Reviewed by Felix Grant

In a period when I seem to be writing mainly about interfaces and diversification in the data analysis software market, GenStat is different. Not that the tenth edition is deficient in either respect: simply that a dramatic amount of work had already been done on both fronts in recent releases and this one, while continuing that work, is quieter on that score. There are nice interface touches, further broadening, and the groundwork for more in the future, but development this time around focuses primarily on core matters.

Having started that way, I really have to kick off with the new GenStat Server (10·1) which includes, as always, an enhanced payload of directives and procedures.

It goes without saying that a range of procedures add to GenStat’s repertoire of generic or specialised statistical tools, and anyone contemplating adoption or upgrade will make their own investigation of those. One group, interesting not only in its own right but as an instance of the recent trend across this market to extend function by accepting and running external material, relates to MCMC (Markov Chain Monte Carlo) simulations. GenStat will run WinBUGS (a standalone Bayesian program from the Medical Research Council’s Biostatistics Unit in Cambridge, designed to make MCMC available for applied use), accept its output in CODA (Convergence Diagnosis and Output Analysis, a suite of S-Plus routines) form, and produce plots from either source.

Generalised models gain two very different new procedures: one for extending hierarchical GLMs to nonlinear cases by inclusion of calculated variates, the other for generalised linear modelling of survey data. Surveys also gain a procedure that forms a new bootstrap sample, on each call, from one- or two-stage stratified data.
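For readers unfamiliar with the idea, the one-stage case can be illustrated with a minimal Python sketch. This is not GenStat code and makes no claim about how the new procedure is implemented: the function name `stratified_bootstrap`, its parameters and the toy survey data are all hypothetical, and the sketch simply assumes that units are resampled with replacement within each stratum so that stratum sizes are preserved.

```python
import random
from collections import defaultdict


def stratified_bootstrap(records, stratum_of, rng=random):
    """Return one bootstrap sample, resampling with replacement within each stratum.

    Hypothetical helper for illustration only (one-stage case).
    """
    # Group the survey units by the stratum they belong to.
    by_stratum = defaultdict(list)
    for rec in records:
        by_stratum[stratum_of(rec)].append(rec)

    sample = []
    for stratum in sorted(by_stratum):
        units = by_stratum[stratum]
        # Draw as many units as the stratum holds, with replacement,
        # so every stratum keeps its original size in the resample.
        sample.extend(rng.choices(units, k=len(units)))
    return sample


# Invented toy data: (stratum, measurement) pairs.
survey = [
    ("north", 12.1), ("north", 9.8), ("north", 11.4),
    ("south", 7.3), ("south", 8.0),
]

# Each call yields a fresh resample of the same shape.
resample = stratified_bootstrap(survey, stratum_of=lambda r: r[0])
```

Calling the function repeatedly and recomputing a statistic on each resample gives the usual bootstrap distribution; a two-stage version would first resample primary units within strata and then resample within the chosen units.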

A set of three new directives adds scope and power to text handling, extending the flexibility and convenience of GenStat’s reach into the rapidly burgeoning area of text analysis. TXBREAK operates under parameter, option and restriction control to disassemble text in one file into separated words in another. TXPOSITION is a highly flexible seeker for various string types within a text, while TXFIND looks for required GenStat text structures within given others. There is also TXCONSTRUCT, a synthetic tool rather than an analytic one, which assembles a text body from disparate components of various types (factors, pointers, scalars, variates, other texts).
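The three kinds of operation described here (breaking text into words, locating a string, and building text from mixed components) can be loosely illustrated in Python. This is only an analogy: the functions `break_into_words`, `position_of` and `construct_text` are hypothetical names invented for the sketch, and nothing here reflects the syntax, options or behaviour of the GenStat directives themselves.

```python
import re


def break_into_words(text):
    """Split a text into its separate words, dropping punctuation."""
    return re.findall(r"[A-Za-z0-9']+", text)


def position_of(text, target):
    """Return the 0-based position of target within text, or -1 if absent."""
    return text.find(target)


def construct_text(components, separator=" "):
    """Assemble one text from components of mixed type (strings, numbers)."""
    return separator.join(str(c) for c in components)


words = break_into_words("Bar charts have become high resolution.")
where = position_of("contour maps", "maps")
label = construct_text(["plot", 3, 2.5])
```

The first two are analytic, taking a text apart or searching it, while the last is synthetic, which is the distinction drawn above.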

On top of the additions there are modifications to a significant number of existing directives and procedures, most of which provide some level of added value to the function concerned.

Graphic output is an area on which GenStat has in the past placed less emphasis than some others, the priority being analysis, but the signs are that it is now a focus of development attention. A pair of directives address external storage of graphics environments. Bar charts have become high resolution, and three new directives starting ‘LP’ are duplicates for existing graphics types (contour maps, histograms, and line or point plots) in preparation for future conversion to high res equivalents under the existing names. Although the documentation carefully says that high res ‘may’ be introduced in the future, the intent is clear. As an aside, this sort of visible strategic planning for change, providing a clear migration road map for legacy code, exemplifies one of the reasons that GenStat attracts such deep user loyalty.
