Escape from the cell block
Conventional spreadsheets are limited by the fact that each cell holds a single scalar value. Felix Grant finds that DADiSP 2002 - nominally a spreadsheet - breaks through these limitations
The conventional spreadsheet represents granular sampling of a conceptual space that is either rectangular or cuboid in form. A worksheet holds data, relational structures, or other information in a two-dimensional array. At the simplest level, data are represented by variable and case, or field and record, in columns and rows respectively. The collection of worksheets - usually known as a workbook, notebook, or something similar - offers limited extension of the same model into a third dimension; how limited depends on the particular product. In practice, of course, this simplest level is normally used only for data storage of the most basic kind; the modern spreadsheet is flexible enough to allow the construction and embodiment of far more complex interrelations. Nevertheless, the fact remains that each cell of a spreadsheet holds a single scalar value. That scalar may represent a text string; it may derive from a web of formulae and conditional tests; but a scalar it remains. And the cell exists within what is still a rectangular or cuboid universe, with its origins in accountancy.
DADiSP remains, conceptually, a spreadsheet - and it still consists of discrete cells - but it breaks apart the structure, and approaches cell relations from a different standpoint. DADiSP cells are like poppet beads. Remember poppet beads? If not, you are perhaps younger than I... poppets were plastic imitation pearls, each with a small hole on one side and a small spherical peg on the other. By plugging the peg of one into the hole in the next, you could assemble them into an imitation pearl necklace of exactly the required length. Or, if you were a child, you could steal them from the parental dressing table and explore the fascinating world of flexible and potentially infinite strings, of various closed and open one-dimensional geometries - until your parents finally gave in and bought you Meccano, Lego, or one of those kits for building molecular models. DADiSP's cells are like that - they come not in a predetermined array, but ready to be strung together in a potentially infinite range of configurations. And each cell can hold not only scalars, but numerous other data structures - including non-reals, vectors, series, or complete data-arrays equivalent to whole worksheets in a conventional model.
I first touched on all this about 18 months ago, fairly briefly, when describing DADiSP as one 'Excel alternative', available for in-line field analysis of convoluted agricultural data ('Looking Beyond Excel', Scientific Computing World, January 2002). The spread of cheap, portable, dispersed computing power has made increasingly common this sort of collapsed working cycle, where data analysis parallels and informs collection rather than following it; and DADiSP arose from just such a need (see 'Coming up from the tracks', below). I've subsequently played with it at greater length, and explored some of its depths, in a miscellany of other contexts, discovering in the process more of the strengths, weaknesses and idiosyncrasies in the version which was available at that time. Since then, the release of DADiSP 2002 at the end of last year has offered some significant developments, to which I'll come later.
DADiSP 2002 is found across the spectrum of scientific and engineering work, from medicine to seismology. Because it arrived during the exam season, with a fairly short time-line to review, I was unable to follow my usual practice of placing products at the heart of a live study. I was, however, able to insinuate it into aspects of an ongoing study where I have been moonlighting, out of interest, for the past 25 years - an examination of wild tortoise behaviour (don't laugh, please - it's a valid, fascinating field of interest, and faster moving than you think). To balance this, I also put it to work on some archival data from past industrial process-control problems. It was equally at home in both contexts.
The basic building block of a DADiSP project (which I have so far called a 'cell') is actually known as a 'window'. This isn't cussedness on the part of DSP's technical writers; once you get in close and start using the program, the cell analogy, which was useful from a distance, becomes a hindrance. You look into (or through) the cell, and find yourself viewing what can appear to be a complete spreadsheet, or a graphic, or something else. One of my colleagues, delightedly exploring the visually simple representation in one window of the complexity within another, exclaimed: 'It's like Horton Hears a Who!' (For those of you deprived of familiarity with the works of Dr Seuss - Horton, a humane and thoughtfully introspective elephant, peers into a dandelion puff and finds a whole miniature civilisation living there.)
At the same time, don't let go of the cell idea, because a window may contain a formula very similar to that in a conventional spreadsheet cell. The big difference lies not in the formula itself, but in the fact that the target window to which it refers may contain anything at all. The target could be a single scalar quantity; it could equally well be a matrix, or a theoretically unlimited number of non-reals (DADiSP pages memory out to disk, so disk store size is the practical limitation on construct sizes). Linking the windows together is simplicity itself; more so, even, than with a conventional spreadsheet, since there is no need to reference structural dimensionality. In Excel, for example, you might (as we did in smoothing tortoise-activity cycles) obtain the arcsine of a value stored in another cell using a formula such as: =ASIN(Sheet3!E9)
In DADiSP, the references to sheet, column and row are all replaced by a single identifier consisting of the letter 'W' (for Window) and a number, something like this: ASIN(W7)
If you wanted to generate a series of such arcsines in a conventional spreadsheet, from a corresponding series of cells, you would then need to replicate the formula the requisite number of times, and repeat this whenever the quantity of data changed. In DADiSP, however, the formula works analogously to those in many stats packages - simply add your new data to W7 (or whatever) and the formula applies to it automatically.
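The difference can be sketched in a few lines of Python. This is only an illustration of the idea, not DADiSP's own SPL or API: the Window class, its append method and its value property are hypothetical names invented for the purpose, and the only assumption carried over from the text is that a formula such as ASIN(W7) applies to whatever its source window holds, whether scalar or series.

```python
import math
from typing import Callable, List, Optional, Union

Number = Union[int, float]
Content = Union[Number, List[Number]]


class Window:
    """Hypothetical stand-in for a DADiSP-style window (not DADiSP's API)."""

    def __init__(self,
                 content: Optional[Content] = None,
                 formula: Optional[Callable[[float], float]] = None,
                 source: Optional["Window"] = None) -> None:
        self._content = content    # raw data: a scalar or a whole series
        self._formula = formula    # function applied to the source window
        self._source = source      # the window this one refers to

    def append(self, *values: Number) -> None:
        """Add new data to a window that holds a series."""
        if not isinstance(self._content, list):
            raise TypeError("append() needs a window that holds a series")
        self._content.extend(values)

    @property
    def value(self) -> Optional[Content]:
        """Return stored data, or evaluate the formula over the whole source."""
        if self._formula is None or self._source is None:
            return self._content
        src = self._source.value
        if isinstance(src, list):
            # One formula covers every point; nothing is replicated per cell.
            return [self._formula(x) for x in src]
        return self._formula(src)


# W7 holds a series; W8 plays the role of a window containing ASIN(W7).
w7 = Window([0.0, 0.5, 1.0])
w8 = Window(formula=math.asin, source=w7)
print(w8.value)     # three arcsines, one per point in w7

w7.append(-1.0)     # new data arrives in w7...
print(w8.value)     # ...and w8 now yields four values, with no formula copied
```

Evaluating the formula lazily, each time the value is read, is simply the shortest way to show the behaviour; a real tool would more plausibly cache results and recalculate only when the source changes.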
The functions available to such worksheet formulae are extensive - over a thousand of them according to DSP, though I haven't counted. They break down into 38 groups, as shown on the panel opposite, which has been extracted from the help file. I explored some of these categories pretty thoroughly; others I just dipped into. All that I sampled worked robustly and well.
Within a window, the content may be viewed as a set of values or as one of several graphical representations, with the opportunity to select the view specifically, or to toggle between views from a button on the toolbar at the top of the screen. The defaults select themselves intelligently, according to the type of content, so in cycling through the views you may see 2D maps from one data set, but line plots from another (though you can override these choices, either on the fly or in the program options panel). Similar cycling buttons offer quick alteration of grids and axes.
DADiSP is primarily intended as an interactive exploratory tool, but does have a perfectly capable programming language, SPL, with a hundred or so prewritten routines. Macros can also be written, to automate frequent actions easily, and a good number of these are provided ready-made as well - including, for example, very useful ones that extract series or points using the cursor.
A very welcome development in this release is the new option to save to a single external worksheet file, just as you would with any other spreadsheet, under a single filename with a DWK extension. Prior to this, DADiSP used its own 'LabBook' metaphor (a specialised database management system in effect) and saved the components of your work into its own directory structure of files.
The LabBook system works well, but it means that you touch the files at your peril; multiple datasets, series and worksheets are related by DADiSP's memory of where it last left them. Now, with DWK files available, you can cheerfully pick up a file and move it, copy it, email it, post it on the Internet, delete it, double click it, view it in a browser, all without trepidation. This makes life much easier and more carefree. Since DADiSP 2002 is also OLE compliant, the worksheets can be embedded in other container documents as well. The LabBook was designed in days when operating systems offered little in the way of native data handling, and so programs had to provide their own. It works well, but times have changed and it is no longer necessary. The older system is still available as an option, and you can save data back and forth between the LabBooks and worksheets if you wish.
There is a free DADiSP Browser Plugin, allowing worksheets to be accessed by anyone. It is downloaded from DSP's site, with a 30-day trial install of the full product. Another encouragement to use the software is a free student edition of DADiSP. I haven't tried this, so can't say whether it is an older release or a cut down version of the present one, but either way I applaud its provision. I intend to explore this combination of free browser and student edition as a tool for science education in the new academic year from October. Perhaps I'll be able to report on the results when reviewing a future release (DADiSP 2003 is rumoured for the last quarter of this year).
DWK files are described as 'the first step in moving DADiSP towards a "document centric" method of managing technical data'. Such a design shift is visible, if not already accomplished, across the gamut of application providers and, in the current computing environment, has to be good news. ActiveX, another arrival in this release, with all that it implies for external third-party linking opportunities, is tied to the new document structure. As an automation client and server, DADiSP 2002 can also act as the analysis and/or visualisation engine of other applications, or benefit from control over those applications, through suitable linking code.
My only real criticism is of one aspect in the documentation; it underestimates the learning curve that the product presents, and doesn't pay enough attention to easing the new user's first few hours. The more advanced aspects are well covered, but it takes a surprising amount of tenacity to find help on little things at the outset. Whatever DADiSP (and numerous testimonials) may say, in my personal opinion and experience, new users do not find it easy to get to know DADiSP. I spent a lot of time and concentration, when I first looked at the program, on getting started, and I've watched those guinea pigs (well - herpetologists actually) to whom I've shown it this time round suffer in the same way. It is a wonderful tool. It is built on simplicity of concepts and, if persevered with, delivers simplicity and transparency to the user. All the effort you put into getting to know it is richly repaid. But - that initial effort and perseverance are considerable.
Despite DADiSP's spreadsheet metaphor, there are a number of initial hurdles to surmount when moving from one to the other. Many of these are to do with habits of thought, developed over long periods of conventional spreadsheet usage; others are to do with terminology. Others again result from the unfamiliarity of most spreadsheet users with detailed file operations; these days, we are used to ready-made solutions and rarely have to think about their operational mechanics. DADiSP is rightly presented as carrying the spreadsheet paradigm forward; but the style of thinking required leans more towards products such as MATLAB. The unwary spreadsheet user is thus brought to the door, then to be faced with an assumption of fairly detailed familiarity with the ways in which data files are handled by a computer.
I believe that DSP would do itself a big favour if it rethought the stage of its documentation that lies immediately beyond initial description. I get the impression that it knows the product so intimately that it forgets how it might seem to a newcomer.
It would also be nice to see native import and export routines simplifying the transfer of data from common application file formats - Excel and Access, for example. This is a richly interactive, exploratory program which delivers intuitive ease in so many ways; it seems a pity that getting data into it harks back to DOS days. For smaller conventional datasets (for smaller, read a thousand or so individual Excel cell values), I copied and pasted via the clipboard; it seems to work well. Such file facilities can be incorporated by the user, in particular by using the provided ActiveX controls 'XLGET' and 'XLPUT' to fetch or place Excel ranges under SPL, but that's not quite the same as a range of import and export filters under 'File Open' in the top menu.
However - despite these last few paragraphs, I'll repeat that the work you put into that first learning-push really is dramatically worth it. For anyone who must handle, explore, analyse and communicate large quantities of scientific data, DADiSP is a fast, immensely powerful, and (once mastered) intuitive tool.
Coming up from the tracks
In the mid-1980s, a British graduate student of mechanical engineering at the Massachusetts Institute of Technology obtained a database of the world's Formula One racetracks. Comparison of the track geometries showed that a driver spends more time negotiating curves than in straight runs, so a car that could hold the road at higher speeds on a bend would have a competitive advantage. A fellow student suggested that the solution was vortex generation - a phenomenon familiar in aeronautics, where sudden pressure changes at the tips of wings and other aerofoils produce a problematic 'downwash' effect behind an aircraft. If this could be harnessed to increase the speed of the airstream flowing underneath a car, the Bernoulli pressure differential should 'suck' the car downward against the track. This in turn would increase the grip available from the tyres, since frictional force rises with downward load, improving road holding and allowing higher speeds when turning.
Wind tunnel tests were encouraging, so the pair moved on to the expensive process of gathering and analysing instrumentation data from a real car. To keep time and costs down, they needed software that could display and reduce data at the trackside, with the flexibility to allow modification in real time as a response to discoveries made. The spreadsheet, relatively new at the time (Lotus 1-2-3 was fuelling the rise of the personal computer), provided exactly the model that the two researchers needed, but not the capacity or the range of data types and structures.
The British mechanical engineering graduate student was Tony Purnell, founder of PI Research and currently a managing member of the Ford-sponsored Jaguar F1 racing team. Vortex generation is now standard practice in the sport. His collaborator was Randy Race, founder in 1984 and now chief technical officer of DSP Development Corporation. The software became its founding product, DADiSP, and now claims 100,000 users worldwide.
3D and 4D Graphics