Planning for the unexpected
Serendipity: I've always found it a delightful word. It sounds like a song, or the flight pattern of a small bird, or the name of a young heroine in an old novel. It evokes by its sound words such as serenity, quiddity, Ceres, and derives from an old Arabic name for the island of Sri Lanka. And, of course, the thing that the word describes is one of the most rewarding moments in life or science: the unexpected and unlooked-for discovery of something new and wonderful. In the historical mythology of science and technology we celebrate the serendipitous: penicillin in a spoilt culture, for example. Being a scientist doesn't exclude appreciation of whimsey; quite the opposite, in fact.
On the other hand, the idea of serendipity doesn't sit well with modern demands upon science and technology. Perhaps earlier times could afford to wait for Archimedes one day to shout 'eureka!' in his bath, but Newton, though best-known for his apocryphally serendipitous apple, was already firmly in a new world. I would not, I fear, get very far with any attempt to sell serendipity in a research budget allocation committee meeting.
Nevertheless, serendipities and other happy coincidences remain an important part of the progress, as well as the pleasure, of science. An important step in the management of this dichotomy was the development of DoE (Design of Experiments), allowing focus of experimental attention to be rigorous and economic, but at the same time nimble enough to pursue the unexpected. From the moment that Ronald Aylmer Fisher started the DoE juggernaut at Rothamsted in the 1920s, serendipity would never be the same again.
Curiously, despite its obvious economic advantages to industry and its computationally intensive nature, DoE was a latecomer to widespread desktop computerisation. A lot of work was going on, and freestanding programs started to appear on the market in the mid-1980s. New computerised methods were progressively added to the traditional range, but percolation out into the mainstream didn't match that of data analysis and visualisation. To a considerable extent this remains true, but DoE tools are now available to most segments of the market: DoE is still widely seen as a dark art, but need no longer be an inaccessible one.
One of the first pioneers in opening up that market was Stat-Ease, a private enterprise emerging from an association between the University of Minnesota's applied statistics school and the quality assurance department of industrial multinational Henkel. The first product was DesignEase in the mid-1980s, followed by DesignExpert three years later. Apart from a brief period after conversion to Windows in the late 90s, when they were merged for a few months, those two products have remained respectively the 'lite' and 'flagship' alternatives ever since.
Happy coincidences can be found everywhere, if you are inclined to see them, and one of them popped up last summer. Much of my work involves extracting juice from bodies of happenstance data, with no opportunity to direct its gathering, but in September several jobs came along that called for some fairly heavy experiment design. At the same time I also received a copy of Design Expert 7 (DX7) from Stat-Ease - the first release for five years. DX7 was installed on the laptop and taken along for the ride.
As I mentioned earlier, DoE is still viewed with some trepidation. I spend a fair amount of time advising on industrial quality assurance, and never cease to be surprised at the nervousness that many people in this area of work exhibit when too close to statistics. That nervousness can escalate to paralysing fear when face to face with DoE: an image problem that is a serious obstacle to wider adoption of its methods. The situation isn't helped by the fact that DoE is not usually a continuous activity. When I was looking at this issue a few years ago, Richard Verseput of S-Matrix commented that many users may only do experimental design work perhaps twice a year, and it's very difficult to develop a proper intuitive feel for procedures that are used so rarely. For many, help and encouragement are at least as important as the facilities themselves.
Responses to this problem by S-Matrix included development of an interface more familiar to office suite software users, and a wizard-driven process that educates them while leading them step by step through the design process. Stat-Ease is known for its jokily accessible examples - a tradition continued in DX7's documentation with, for example, an experiment to discover whether Stat-Ease principal Mark Anderson can sleep longer in the morning without being late for work. Other approaches include placing DoE methods inside Excel, with add-in products such as Matrex, and at least one current web-based DoE engine, WebDOE. The big mainstream generic analytical packages such as JMP, Minitab, Statistica or S-Plus have, one by one, incorporated DoE methods into their toolsets as well; the rationale here is to increase product value, but increased exposure and familiarity are a consequent effect. However it is done, sweetening the pill and easing the learning curve gradient are priority requirements in breaking down resistance.
While not focusing on these, the development of Design Expert has progressively recognised and reflected them. The wizard-driven approach that worked so well for FusionPro has recently been developed strongly in DesignExpert and now leads the uncertain new user confidently through design stages. Part of a contract in October involved introducing those with whom I was working to the basic principles of designed experiment work and, among other tools, I showed them DX in versions five, six and seven; they were, without exception, significantly less nervous with seven.
Most directly engaging are small touches like a new 'screen tips' button to raise instructions (including Flash demonstration tutorials) for the current context, and the annotation of design evaluation to guide the uncertain user towards relevant aspects through explanation of the statistics. External additionality includes the provision of links to further information on the internet.
Incremental improvements to the interface enhance usability too, and (quite apart from clear productivity benefits to experienced users) this also has its part to play in making methods more approachable. Selecting particular effects from a plot by dragging a box across them, rather than clicking on them individually, significantly reduces the amount of user-time diverted from what matters. Discriminatory resolution is improved, with similar benefits: individual bad response cells in an otherwise valid matrix row can be isolated by a right mouse click. Automatic highlighting of layout rows when corresponding diagnostic points are selected provides a more intuitive and rounded perceptual understanding of what is going on; so does immediate updating of 2F factorials to reflect design segmentation changes. The ability to pursue numerical optimisations onward through graphical and point screens similarly encourages confident exploration.
Given the human psychology involved, developments in graphics also become contributions to reduction of user resistance. While contour plots are objectively more useful, for instance, a 3D surface plot (especially if delivered in contoured or graduated colours) is often subjectively valuable - as is colour-coding of points to indicate critical levels of an additional factor.
Other graphic additions are closer to the coal face; tools, rather than assistants. The magnification tool and cross-hairs window are examples, both providing enhanced precision - the first provides a close-in view at a particular site of interest, the second a real-time cursor readout in a superimposed window. Using both together, I was able to make very precise readings from a compound solution spectrum where toxicity rolled over very quickly from mild to fatal.
Moving on from usability to capability, there continues to be a steady extension of both fetch and subtlety. Though I said that the work that arrived coincidentally with DX7 was 'fairly heavy', it didn't make any call on the 50 factor interactions now possible in minimum-run cases (21 for Box-Behnken), nor on the 30 factor by eight block central composite designs. Nor were the 512 by 21 designs available to two-level fractional factorials tested on a live process. Enhanced flexibility, on the other hand, was much appreciated - especially in D-Optimal work.
D-optimal designs fall into the non-classical (post-computerisation) design types that open up new designed experiment territory not available to older methods. They are not restricted to one fit model or experimental objective type, simply maximising the determinant of an information matrix to optimise a chosen criterion. A 'candidate set' of treatment run combinations is provided to the algorithm, which then chooses a subset for inclusion in the design. The arbitrariness of this approach is moderated in DX7 by allowing D-optimal designs to be rebuilt using 'candidate-free' coordinate exchange. Other additions allow runs equalisation and D-optimal addition of blocks to a design. Modification of existing designs is a general theme, not limited to the D-optimal, welcome for its labour-saving value.
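The candidate-set idea can be made concrete with a minimal sketch. This is an illustration only, not how DX7 or any other package does it: real software uses exchange algorithms rather than the brute-force search shown here, and the two-factor candidate grid, the interaction model and the function names are all my own assumptions for the example. It enumerates every four-run subset of a small candidate set and keeps the one that maximises the determinant of the information matrix X'X.

```python
from fractions import Fraction
from itertools import combinations, product


def model_row(x1, x2):
    """Expand one run into the assumed model terms: intercept, x1, x2, x1*x2."""
    return [1, x1, x2, x1 * x2]


def information_matrix(runs):
    """Return X'X, where X is the model matrix built from the given runs."""
    rows = [model_row(*run) for run in runs]
    p = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # singular matrix
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def d_optimal_subset(candidates, n_runs):
    """Brute force: try every subset of n_runs candidates, keep the best det(X'X)."""
    best = max(
        combinations(candidates, n_runs),
        key=lambda runs: determinant(information_matrix(runs)),
    )
    return list(best), determinant(information_matrix(best))


# Hypothetical candidate set: a 3 x 3 grid of coded levels for two factors.
candidates = list(product([-1, 0, 1], repeat=2))
design, d_value = d_optimal_subset(candidates, 4)
print(design, d_value)
```

For this toy case the search settles on the four corner points of the grid, which is the familiar two-level full factorial, with a determinant of 256. Exhaustive enumeration is only feasible for tiny problems (here 126 subsets), which is exactly why practical software relies on exchange methods instead.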
The incorporation of DoE tools into other software raises the question: why buy a dedicated product? It's a question frequently asked, and the answer varies. In essence, though, the arguments for a package that specialises in the task at hand are usually greater reach, greater clarity and a more efficient workflow in work where that task dominates. If your design is a one-off, the investment in a specialist solution may not be justified. If it is always part of a wider analytical structure, DoE options within your usual statistics software may well be the right solution, provided that they can fit your required methods. But if designed experiments are a major part of what you do, and particularly if you stretch the limits, then it makes sense to place yourself in an environment that places them at centre stage. And that raises the opposite question: if designed experiments are what you do, do you need a generic analytic package?
The answer to that second question depends on what you analyse. If you have to analyse freestanding data sets, then the answer is an unequivocal 'yes'. DX7 would never be a replacement for a generic package. Nor does it try to 'function creep' into the territory of such a package. On the other hand, I've frequently encountered cases of a heavyweight analytic package being used to monitor simple descriptive measures for input and output values from a DoE system - and that is definitely using a shotgun to hunt crayfish. DX7 will, to take that specific example, provide you with mean and standard deviation of each input and output via the design output screen.
A more useful question, though, may be: given that both types of software are available, in which environment is peripheral analysis most usefully done? The answer to that one has shifted considerably. Not so long ago, I would have said that the generic package was the way to go; now it's less clear cut, and I would increasingly often recommend staying inside the specialist environment.
The psychology of this was brought home to me strongly during some of the recent work in which I involved DX7. Having induced clients' staff to develop some confidence in DoE through use of various specialist software methods, asking them to then analyse some aspects in a generic alternative (my own subconscious tendency) was counter-productive.
ANOVA, for instance, is handled within DX7, which now offers preference definitions on sums of squares calculations. Though ANOVA is usually best done in a more general setting, if it relates directly to the design then this is now the place for it. Backward stepwise regression and a Cox model for mixtures both make more sense in situ, as does the inboard Pareto charting (especially as it offers a right click instant view of aliases).
The obvious occasion for turning to DoE tools in external generic software is when a large body of repetitive work has to be done. This is where the programmability of large, language-based analytic software comes into its own - if the volume and complexity justify it then Genstat, for instance, has a built-in design system with preconstructed designs, the tools to build new designs, analytics, and a rich control structure to wrap around the whole thing. Another alternative in the right setting would be S-Matrix's Fusion AE, which is purpose-designed for automation. Neither of these would be novice territory, though, nor would they be economic use of time for small stacks. DX7 allows creation of new designs that inherit characteristics and content from another, which would be a better option in such circumstances.
Interaction between DoE software and other components of the larger lab or study setting within which it operates is desirable, and this is another attraction of generic software. On the other hand, specialist worksheets are increasingly standardised and able to read popular file and/or Windows clipboard formats. DX7 is very happy with clipboard data from the likes of Excel, OpenOffice Calc, generic statistics programs, and so on.
Its direct file import options are limited to current or previous generations of Stat-Ease software, with the single exception of XML. Both design and reports can be exported to XML files (AAO output is also available), and XML designs can also be brought back in again. How you view this depends on your context, but for myself I'm inclined to see it as a good thing: I'd rather see the future developing along standards independent of proprietary file formats.
Alongside my enthusiasm for non-proprietary standards, I am a great believer in diversity of provision. In this area of work there is as little chance of finding one monolithic 'best for everyone' solution as in any other - long may the various alternatives thrive alongside each other. That said, it's good to see such a healthy new presence by this particular member of the club: the upgrade has obviously been held until a solid basis of underpinning development justified it, and the market as a whole can only benefit. DX7 should be high on the list of evaluation candidates for anyone looking for an entry to this software genre. As with many products these days there is a downloadable trial version (operational for a generous 45 days), so it can be put through its paces alongside other options.
And serendipity? All the projects into which I took DX7 over the past three months yielded what the clients wanted - plus 107 unexpected but intriguing discoveries, at least a dozen of which look likely to spark new projects. Serendipity is alive and well, thank you very much, as beautiful as ever and happily served by experiment design.
Free downloadable DoE system for Unix, Linux and MacOS X
S-Plus (generic analytical with DoE)
MiniTAB (generic analytical with DoE)
Matrex (Excel add-in)
S-Matrix Corporation: Fusion Pro; Fusion AE
JMP (generic analytical with DoE)
DesignExpert & DesignEase
Statistica (generic analytical with DoE)
WebDOE (free web-based DoE engine): www.webdoe.cc