Writing up is no longer so very hard to do

Once the research has been done, the report has to be written. It used to be boring, but now Felix Grant has found software that makes it bliss

As I first learnt in school, somewhere around age 13, actually doing science is the sexy part, but the mundane task of writing up is equally vital. I started this piece with the intention of investigating ways to improve my participation in the research and report cycle of groups with whom I work on shared projects - without having to give up my own individual word processor preferences. What I ended up with was a love story.

When word processors became reliable, flexible, and portable enough to replace typewriters, they brought the ability to re-use text. 'Paradigm shift' is a grossly overused phrase, but not too strong for the change which was wrought in writing practice, from laundry lists to research papers. Organisation of stored text was a clunky business to start with, though. Notes had to be consciously marshalled in a structure of files, directories (if you were lucky), and tapes or disks. When the word processor ceased to be a piece of hardware and became software on a personal computer, the revolution really took off: now database management software could run on the same machine.

The novelist Len Deighton, not academic writers, first showed what could be done. Databases of characters, locations, descriptions, dialogue, could be retrieved at will for reference or re-use when developing new material. The old card index of notes became not just a container, but a supercharged feed line into the writing and revision cycle. And that, in turn, utterly changed the research and report cycle. In fact, for most practical purposes, it created that cycle as a concept. Up until then, reporting had in effect been the final stage of a linear process which started with hypotheses and was strung out like separate beads on a one-way thread. Now it became imaginable for research and report to grow organically together, informing each other in a productive two-way dialogue. Spreadsheets enriched the mix and blurred the boundaries still further.

It was an exciting time, yet in some new ways very firmly straitjacketed. The new tools imposed a certain way of thinking; much of the card index's intuitiveness had been lost. Another push was needed, to make all this technical power accessible in more transparently human ways. An individual should be able to adopt any solution that works, but a team (or larger co-operation between disparate teams) must be able to interact.

I make no secret of my admiration for tightly integrated research-writing systems like Nota Bene. The most efficient report-writing machine for inline mathematical notation within standard text is WordPerfect. But in the world outside those bubbles of textual productivity, the lingua franca will be Word, RTF, or HTML (and, increasingly, XML). So what personally productive systems could satisfactorily mesh my preferences with other alien environments at any level of co-operation?

The key was to find a way to database all components of the report in a single entity which made them coherently available to any user or application that might require it. Having such a database inboard, in the form of Orbis, is what makes the likes of Nota Bene so successful. My external equivalent was provided by Onfolio, the lynchpin around which everything else fell into place. I freely admit that I fell hopelessly in love with Onfolio as soon as I installed it; three months in, I'm more besotted still; and I've only begun to exploit all the potential that it offers.

Onfolio does several things, any one of which would justify its purchase. It is listed by one UK reseller, for instance, under bibliographic reference management software. It is a site feed manager and reader. It handles book-marking. It is a text base handling system. It does a lot of things; but first and foremost it is the integration of all those things in concert.

On opening, three tabs present themselves, of which 'Collections' is on top. Files in Onfolio are called collections, and I'll use that term here to distinguish them from the files from other applications which can be contained within a collection. Each collection is divided into folders and subfolders in the conventional way, and each folder contains links to information items. An information item may be a web site (or page), an image, a file, a note, or a 'snippet'. In most cases, links can be either to the original source or to a copy stored within the collection itself. A note is a small richly formattable text document typed in an environment resembling Windows WordPad, and is an ideal place to store text; a snippet is usually extracted by highlighting part of an existing document. Onfolio has its own spell checker, and a lot of routine writing can be done directly into the note editor. Each of these components can be sorted, categorised, and otherwise organised in a number of useful ways, and can be tagged with colour coded flags. This summary does not even begin to cover the extent of what can be done - but space is limited.


Two website fragments annotated for demonstration purposes. A link referenced in a report is shown without annotation (top right), with text highlighting and callout icon minimised as they would be seen on visiting the page (top left) and with the callout maximised or as they would appear under the mouse cursor (centre inset). The organise menu has also been opened. In the lower half, annotations for Athens login have been produced, including a voice note. Text holding icons are shown both minimised and maximised.

While very simple to set up and use, Onfolio has a maze of customisation options; one set concerns how the user wants to display and access it. It can be a 'desk bar' - a window like any other (minimise it, maximise to full screen, resize it). Or it can be docked to left or right of the screen and, optionally, set to disappear until popped up by a mouse pointer run off the screen. Or it can be opened as a left hand pane in your browser (Internet Explorer or Firefox, with tabbed browsing well supported). Or do both - I experimentally opened 17 instances of Firefox and 30 of Internet Explorer, plus the desk bar, all showing different collections, before I lost track and had to lie down. Choose whether the detailed content of a selected folder is placed below the tree (to save space), or beside it (to maximise display), how much detail to show, and so on.

Onfolio can be a superior replacement for the bookmarks or favourites lists in a browser. All bookmarks are now backed up properly along with other documents; Firefox is particularly bad at suddenly losing them, and IE is not above doing the same thing occasionally. For users employing more than one browser for their different strengths, a single bookmark list is accessible in all of them (though Onfolio can only appear inside IE and Firefox, the desk bar can launch Navigator, Opera, Amaya, or whatever). Other benefits include greatly enriched book-marking with access to proper citation referencing, descriptive notes, and a properties tray.

I started by just using the textbase as a working notes organiser and in that role, though shamefully underemployed, it performed wonderfully. Notes typed or scanned from paper or microfiche sources came first, alongside snippets of electronic origin. Then came similar material added as complete files - notes jotted on a PDA, photographs and sound files, equations and, crucially, word processor files in multiple formats. Each report section can be generated in a linked file from whatever application I choose, while always remaining within the larger framework; when complete it can be imported and fully integrated. While external files usually open in a designated application, material stored internally is always viewed in a browser.

Files or web pages can be linked (editing them edits the original) or added to a collection as wholly separate entities (editing the original leaves them untouched), and this can be exploited for audit trails. I have found it very useful, for instance, to set up a collection purely for storing incremental versions of a database or spreadsheet as a study progresses. It's surprising how often the process by which a data set grew (a process that is lost in the usual way) can be enlightening later.
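The audit trail idea does not depend on any one product. As a minimal sketch of the same habit outside Onfolio, the following Python function saves a wholly separate, timestamped copy of a working file each time it is called. The function name `snapshot`, the archive folder, and the file naming pattern are all assumptions made for illustration; none of this is Onfolio's own mechanism.

```python
import shutil
from datetime import datetime
from pathlib import Path


def snapshot(source, archive_dir, now=None):
    """Copy `source` into `archive_dir` under a timestamped name.

    Returns the path of the new copy. The copy is a separate file, so
    later edits to the original leave it untouched.
    """
    source = Path(source)
    archive_dir = Path(archive_dir)
    # Create the archive folder on first use.
    archive_dir.mkdir(parents=True, exist_ok=True)
    # `now` can be supplied explicitly; otherwise the current time is used.
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    target = archive_dir / f"{source.stem}_{stamp}{source.suffix}"
    # copy2 also preserves the file's modification time where possible.
    shutil.copy2(source, target)
    return target
```

Calling `snapshot("results.xls", "audit")` after each working session would leave a dated series of copies from which the growth of the data set can be retraced later (the file and folder names here are hypothetical).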

Next up the ladder came a whole structured task: organisation of a conference within Onfolio. Over the weeks I have gradually extended my grasp of the potential until I've now written and assembled within it an entire report on an epidemiological study, from beginning to end.

There are a number of ways to bring in existing electronic material, either as a whole or as selected parts. From within a supported web browser, either a toolbar click or the F9 key will open a short context menu of options.

Site feeds are a separate tab from Collections, but are organised in the same hierarchic folder structure and work in much the same way. The main difference is that links, here, are to 'newspapers' of summaries and thence, through further links, to the source sites. For anyone who uses feeds a lot, this is an excellent way to integrate them into a larger information management strategy - especially as they can be searched from the same Onfolio tab as the collections. Onfolio won me over to maintaining a few feeds even after scrapping the ones set up purely to investigate the potential.

The 'Search' tab allows several levels of sophistication, and offers both simple and advanced interfaces. Searching can be within the current collection, recent collections, 'favourite' collections (you can designate up to nine of these), all collections, or feeds. It can be constrained in a number of other ways as well, including (to take a couple of random examples) by flag and snippet type.


The Onfolio desk bar (left, shown unhidden) has open the (highlighted) data files folder of a collection called 'proteins deficit run' holding both the study and its report. Audit trail copies of a spreadsheet file (visible in the background) have been saved into this folder by dragging to it from the file window just visible at bottom left, triggering the capture options window ('Onfolio – capture item', image centre). The remaining frames (foreground right and lower centre) show aspects of the properties tray.

What happens to all that material, once in? It's all very well having endless notes, or a complete report in separate paragraphs, but to be useful it eventually has to go somewhere. There are many ways to pass it out to other applications - by item, by folder, or by collection, either in full or as metadata. There are also a variety of ways to share it with colleagues on a local network, on the web, by email, through extended wide area networks, and the rest. For most of the team work I do, the best sharing mechanisms seem to be either a multipart HTML page or a blog. Although blogs are usually seen as teenage diaries writ large, they can equally well be collaborative science project notebooks - and Onfolio automates their publication. Whichever method is chosen, transferring material into another setting (word processors included, in the case of a report) couldn't be easier.

As I've worked through the changes, I've discovered how to keep my favourite applications for those parts of the task which they do best, but to let them go when others are just as good. Different sections, all contained within a single Onfolio structure, are written in different word processors best suited to their content. Regardless of origin, all the text can be finally assembled into whatever format seems appropriate - either as a final step, or as an intermediate process. An up-side of this is that a lot of work can be done in separate phases with a concomitant gain in efficiency. In the early days of word processing, it was normal to get all the words down first and then worry about formatting later. We've forgotten that nowadays, when every word processor thinks it is a desk top publisher, but I've been rediscovering the benefits of such an approach - not just in relation to formatting and style sheets but for embedded objects (equations, diagrams, illustrations), spell checking and bibliographies too.

At the final stage, material can be opened in an appropriate MS Office application to take advantage of sciPROOF.

SciPROOF is hard to write about at length because it does its work largely out of view - but it does that work efficiently and well. It goes beyond the normal remit to check a wide range of scientific terms, non-Roman characters including all those Greek symbols of which science is so fond, and style aspects. Multiple dictionaries are supported, and the addition of new terms goes beyond the usual 'add to dictionary' to embrace font format and placement of components within the term. In Word it installs a three button addition to the toolbar; in Access, Excel, FrontPage, Outlook and the rest, it operates as a near invisible extension to the native checker.

The same discipline is equally productive with formatting bibliographic citations. Formatting (or even generating) a bibliography whilst writing is aesthetically satisfying, and can be psychologically comforting while waiting for inspiration to return. But it's not really a prime use of expensive brain time. Unless your word processor handles the whole publication style thing in the background (as Nota Bene does, and OpenOffice Writer aims to do soon, but most do not), the edge in productivity again lies with citation marker codes placed as writing is done, but with generation and formatting on completion. Bidding a tearful goodbye to my faithful copy of EndNote 7, which would run snugly inside WordPerfect, I embraced release 9 which does not - but will happily format an RTF or Word file before despatch. Objectively, the benefits are undeniable: I now get more done in less time, and spend less energy on unformatting or rebuilding bibliographies.
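The principle of marking citations while writing and formatting them only on completion can be illustrated with a few lines of Python. This is a rough sketch under stated assumptions, not EndNote's method: the `{{key}}` marker syntax, the function name `format_citations`, and the numbered output style are all invented for the example.

```python
import re

# Hypothetical marker syntax: {{key}} placed in the text while writing.
MARKER = re.compile(r"\{\{(\w+)\}\}")


def format_citations(text, references):
    """Replace {{key}} markers with numbers and build a bibliography.

    `references` maps each key to its full reference string. References
    are numbered in order of first appearance; markers with unknown keys
    are left in place so that they can be spotted and fixed.
    """
    order = []

    def number(match):
        key = match.group(1)
        if key not in references:
            return match.group(0)  # unknown key: leave the marker untouched
        if key not in order:
            order.append(key)
        return f"[{order.index(key) + 1}]"

    body = MARKER.sub(number, text)
    bibliography = [f"[{n}] {references[key]}" for n, key in enumerate(order, 1)]
    return body, bibliography
```

Because the markers are plain text, they survive a move between word processors, and the numbering and reference list are generated once, at the end, from a single store of records.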

That two-generation move to EndNote 9 brought a number of advantages, including valuable interface refinements and the arrival of XML and Unicode. Particularly useful is having EndNote talk directly to Onfolio's Academic and Scientific edition (communication with version 7 was via tabbed conversion file, usable but not elegant). While I would not personally use Onfolio as my primary bibliographic reference store, having it mirror those EndNote records which relate to a particular project has tremendous value. Any updating in one place is reflected in the other, all formatting is handled from one place, and I don't have to remember which container holds a particular reference. As a further great advantage, it's now possible to run literature analyses using either OmniViz or RefViz from within EndNote on a single button click. What used to be a multistage process is now completely integrated.

Not all web or intranet material needs to be imported. If it may disappear or move, or will be frequently searched, import into Onfolio is the way to go - but if not, why duplicate it? Annotating in place can be more efficient, and there are several ways to do that now. I wouldn't like to guess how the ongoing debate about server-side or client-side storage will play out; URL extension offers the most universally elegant solution, but things need to settle down a little before a clear favourite emerges. In the short term, after trying both types, I've plumped for iMarkup's local authoring client, with a freely distributable reader plug-in which also accesses the same company's server product. So far, colleagues near and far have been quite happy to install plug-ins emailed to them, and the system works well. Large volumes of reference material can be left where it is, with only the annotation mark-up itself passed around or included in the final result. Although iMarkup will only run in Internet Explorer, I've never encountered a collaborator who doesn't have IE lurking on their machine, even if it's not normally used.

Annotation options themselves are rich enough to be productive without being too complex to internalise. Various 'sticky note' icons expand as the mouse hovers over them to reveal useful quantities of annotation text (or a voice recording) and can be re-sized, colour-coded and so on. Text can be emphasised by format or highlight, or struck out. Other options range from point markers to icons which load a whole supplementary file (a PDF perhaps, or a spreadsheet) or hotlink to another URL. All can, at a single mouse click, be minimised or expanded to full view.

Even more important is structure. There's little point in creating a network of superimposed information if it can't be accessed easily and quickly. You can sort, manage, and search your (or someone else's) annotations. Group them in various ways (by origin, category, date, sender, and so on). Move through them sequentially from a toolbar. Search for a particular term throughout all of your annotations, and load the page(s) on which it is found. Email them to a colleague either as pure annotations (applied to the page next time she goes there) or as a snapshot of the page itself with the annotations in place. And so on.

For anyone who usually works in a word processor which generates extended characters without interrupting the flow, one of the most wearing things about Word is having to use the 'Insert, Symbol' menu - and many programs don't even provide that. A good solution is AllChars, a freeware program which mimics the Unix 'compose' function for instant keyboard access to characters in the current font and script. Tap the Ctrl key, then a two character nominal or visual mnemonic, and the character appears at the cursor position in most Windows text environments. By default, for instance, the mnemonic for µ is 'mu' while that for Ø is 'O/'. These mnemonics, and the characters they produce, can be redefined and extended by the user, multiple mnemonics for a single result are allowed, and different definition files can be stored for different purposes, fonts and scripts. Using the right fonts, it will even do some fairly hard core mathematical notation.
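The compose idea amounts to a lookup table from two character mnemonics to characters. The Python sketch below shows the principle only; it is not how AllChars is implemented. Only the 'mu' and 'O/' entries come from the defaults described above. The '/O' alias, the function names, and the use of '^' in running text as a stand-in for tapping the Ctrl key are assumptions made for illustration.

```python
# Illustrative mnemonic table. Extending or redefining it is just a
# matter of editing the dictionary.
COMPOSE_TABLE = {
    "mu": "\u00b5",  # micro sign
    "O/": "\u00d8",  # capital O with stroke
    "/O": "\u00d8",  # hypothetical second mnemonic for the same result
}


def compose(mnemonic, table=COMPOSE_TABLE):
    """Return the character for a two character mnemonic, or None."""
    return table.get(mnemonic)


def expand(text, table=COMPOSE_TABLE, trigger="^"):
    """Replace each trigger plus known mnemonic in `text` with its character.

    A trigger followed by an unknown mnemonic is copied through unchanged.
    """
    out = []
    i = 0
    while i < len(text):
        if text[i] == trigger and text[i + 1:i + 3] in table:
            out.append(table[text[i + 1:i + 3]])
            i += 3  # skip the trigger and both mnemonic characters
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

Separate definition files for different purposes, fonts and scripts would correspond here to passing a different dictionary as `table`.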

All in all, I now have what the personal computer first promised decades ago: individual tools which suit me best, bound into a coherently integrated whole, as a live part of a larger collective work. Research builds its own report in real time - and the report is continually informing the research. Bliss is here. And if that sounds romantic - well, I did warn you that this was a love story.