In an ideal world, everyone who uses statistics or statistical methods (for whatever purpose) would fully and thoroughly understand them, making the best possible decisions in every instance. The real world, alas, is not like that. We all have to make use of disciplines which are not our own, making the best decisions we can on partial or imperfect knowledge. Data analysis is so thoroughly essential in almost every area of science that it is bound to be widely, though unintentionally, misapplied.


There is, therefore, a great need for software tools which try to guide uncertain users towards best practice. All statistics software has, for a long time now, been exploring and developing ways to do this. Some products have made it their first principle, and AdviseStats is one of them: it is built on the kind of decision tree familiar from user guides, but the tree contains the tools themselves as well as the advice. In comparative tests, the resulting improvement in practice is impressive.


Open AdviseStats and you are confronted by an unusual sight: not a worksheet, not even a menu system, but an empty (apart from branding) grey rectangular window containing one word at the top left: ‘Start’. AdviseStats is a keyboard-free zone (AdviseAnalytics call this a ‘button tree’ interface) in which you work entirely by pointer selection from beginning to end, and it loves a tablet computer. Click on ‘Start’ and you take the first step into the decision tree: there are five choices, of which two are administrative, two are tutorial and one is ‘Access data’. Assuming that you choose to access data, you are offered local sources, web sources or randomly generated data.


To keep things short here, we’ll proceed directly to a CSV file on local disk containing a mixture of continuous, discrete and qualitative variables, and answer a question about missing data values (none, in this case). The next choice is about what we want to do: the options depend upon the data, but in this case we have eight, from ‘anomalies’ to ‘view data’ (this last option opens a spreadsheet-style display grid), each of which offers a little more detail when the mouse pointer hovers over it.


Picking ‘Compare’ presents a list of variables, each with a little distribution graphic next to it: I’ll click on ‘Iron’ and ‘Titanium’, then from the grouping popup I’ll choose ‘Site’. The last task is to click ‘Compute’; another list of fine-tuning options is available, but I’ll ignore it and click ‘Go’ (as the mouse hovers over this, I am informed that it will choose ANOVA, which is appropriate). One final question asks me whether I will allow the program to ‘transform variables if needed to satisfy assumptions’, then the test is run.
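
For readers who want to see what sits behind that ‘Go’ button, the comparison can be sketched by hand. The following is a minimal illustration of a one-way ANOVA F statistic for one variable grouped by site, not a description of how AdviseStats itself computes anything. The function names, the record layout and the iron readings are all hypothetical, invented purely for the example.

```python
from collections import defaultdict
from statistics import mean


def group_by(records, value_key, group_key):
    """Collect the values of one variable into lists, one list per group."""
    groups = defaultdict(list)
    for row in records:
        groups[row[group_key]].append(float(row[value_key]))
    return dict(groups)


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples.

    F is the ratio of the between-group mean square to the
    within-group mean square. Raises ZeroDivisionError if every
    group has zero internal variance.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more values than groups")
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical readings: the values are made up for illustration only.
records = [
    {"Site": "A", "Iron": 1.0}, {"Site": "A", "Iron": 2.0}, {"Site": "A", "Iron": 3.0},
    {"Site": "B", "Iron": 2.0}, {"Site": "B", "Iron": 3.0}, {"Site": "B", "Iron": 4.0},
    {"Site": "C", "Iron": 3.0}, {"Site": "C", "Iron": 4.0}, {"Site": "C", "Iron": 5.0},
]

by_site = group_by(records, "Iron", "Site")
f_stat, df_b, df_w = one_way_anova_f(list(by_site.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

Turning F into a p-value needs the F distribution, which a statistics library would normally supply, and a real analysis would also check the ANOVA assumptions first. That checking is the part the ‘transform variables if needed’ question offers to take over.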


There’s a lot more than that, of course, but I’ve illustrated the central approach. Under the bonnet, so to speak, there is an impressive array of tools ranging, for example, from outlier handling to cluster analysis, composition descriptors to principal components. There is a facility for making very specific interrogations of the data on the basis of natural language constructs to answer particular questions in specific ways.


Is it perfect? Not entirely yet, but it’s getting there and works best for its intended audience. I encountered an ‘Execution error in assembling output’ message a couple of times, for example, but students making fewer assumptions about their own knowledge did not. In comparisons between student groups using AdviseStats and those making their own decisions, the AdviseStats groups consistently produced more correct outcomes and higher quality results.


For anyone who is not a statistician but must use statistics, I would definitely suggest taking a good look at AdviseAnalytics’ explanatory material and downloading the 30-day trial version.

