DATA ANALYSIS

The science of perfection

Felix Grant on the application of statistical software to quality control

Scientific Computing World: December 2007/January 2008

In a long-ago school history lesson, I was told that one factor in the victory of industrial north over agrarian south in the American Civil War lay in the interchangeability of rifle bolts. The bolt of a southern rifle, we were told, was machined to precisely fit the weapon for which it was made; a northern bolt, by contrast, was a sloppier thing, mass produced and matched randomly to other parts made to the same standards. The result, apparently, was that a Confederate soldier with a damaged bolt held a completely inoperative firearm while his Union counterpart could simply replace the damaged part. I have no idea whether or not this account was true, but it did make a point: control of accuracy in the manufacturing process is a strategic issue, with a gamut of consequences to be envisaged and planned against rather than assumed.

A decade or so later, as a maths’n’stats undergraduate working graveyard shifts in a valve factory, fruit cannery or beverage bottling plant to pay my way, I learnt the economics of quality control in a vividly personal form. Any fault which escaped my attention caused a deduction from my wages; so did any losses when I mistakenly pulled a satisfactory item from the line or, worse still, stopped the line itself. There was astonishingly little science in the control of production quality, however. Canned fruit was assessed by eye (through a foggy Pyrex window) before the lid went on. Even in an armaments factory, randomised sampling meant pulling a component from the output box when I happened to feel like it and subjecting it to a crude ‘go, no go’ gauge.

Fast forward to 2008, and huge strides have been made in some ways, but remarkably little in others. We now have industries controlling production quality to tolerances that would have been unthinkable at that time, and others at a level that would not have taxed the manufacturing capabilities of the late 18th century.

Nor is the division always between large, well-resourced organisations and small ones on a tight budget. My freelance work includes a lot of quality-related consultation and I see small workshops run by one person and a dog where production control is as tight as a drum, then facilities of a multinational corporation accepting variations that would not be brooked in a primary school craft lesson. The difference of attitude lies in the balance of cost (real or perceived, for both producer and consumer) between wastage and controls. Since achieved fault incidences are asymptotic to zero, progressive reductions will always be smaller and achieved at greater cost; when the second ceases to be justified by the first, a judgment call needs to be made with great care and in full possession of every analytic insight available. The costs of making sound decisions, however, have been brought down dramatically by spin-off benefits from escalating scientific computing power. You might think that pure science laboratories would therefore be in the vanguard of good practice, but that isn’t always so. As I write this, a quiet scandal is sweeping just below the surface of one important research community over multiple failures of quality control, which call into question years of work, findings and established literature.

NWA Quality Analyst individual and range charts show laboratory processes in (background) and out of (foreground) control.

In that particular case it’s not economics at the root of the problem, but a particular form of professional and intellectual blinkers. Management of quality to a given tolerance level, in whatever sphere and whatever that level may be, has two main parts: controlling the production process itself, and checking the output from that process. Both parts have data analysis at their root, though it’s more obviously visible at the checking stage. It’s surprising, then, to find little call for underpinning statistical knowledge among many quality professionals.

Even some ‘Six Sigma Black Belts’ (see ‘Six Sigma’) have only a black box understanding of the analytic foundations upon which their systems, techniques and approaches are constructed. Reading the quality assurance and control industry’s literature shows a pattern of consultants and columnists who have the background to see activity in terms of rounded theoretical frameworks. They also do a good job of delivering particulate expertise into a readership with localised knowledge of the needs and conditions in their own fields. This apparent dichotomy is made possible by the success of scientific computing in placing powerful analytic methods behind friendly software interfaces under the guidance of intelligently defaulted, knowledge-based systems. As two members of the trade press point out (see ‘The view from the quality industry press’), when properly applied it places powerful control mechanisms in the hands of people who best know the context in which they are applied.

Software for those two parts of the mix does not have to be the same product, or even from the same source. Nor does it have to be purpose-designed. Perhaps the most widely-used model, in fact, is a paper-based system for the first, and generic analytic software for the second – though such systems struggle to satisfy the formal standards.

The trend, nevertheless, is towards software-based systems throughout; there is a […] systems, and most substantial analytic software products have, for some time, been incorporating features specifically aimed at the quality market.

Control of a production process is not limited to the activities directly involved in production itself. Fully implemented, it involves the whole organisation as an entity, from research and input sourcing through to marketing and sales. Auditing such a wide range of information, systems and structures is a mammoth undertaking and takes its toll on organisational capacity unless properly managed and contained. The criteria involved are formalised in the set of ISO 9000 standards covering documentation and harnessing of key organisational aspects towards improvement.

Central computerised databases of documents and data are the most efficient and responsive model for this, and are steadily pushing out older, physically-realised equivalents. Actual implementation of such a standards-based system in practice clearly involves (or, at least, should involve) a continual cycle of data acquisition and analysis feeding back into both standards monitoring and planning review.

Minitab 15 process capability visualisation (right) and zone chart (main) of a process well in control for its defined limits of ±500 grammes variation per tonne.

And, since the analytic procedures need to be embodied in software, direct linkage between database and analysis is an obvious way to go. Probably the best known and most widely applied single system is the Quality Analyst line marketed by Northwest Analytical (NWA).

Quality Analyst itself, the core product, is a charting and analysis package designed (rather than adapted or supplemented) for statistical process control (SPC) use. A companion web server offers distributed reporting, while other options allow expansion to integrate data collection from all components and stages of the process, automate alarms for unacceptable dispersion, centrally supervise multiple processes, data mine the incoming flow, access external databases, and so on. The advantages of such a system don’t stop with software capability, however; the software supplier (NWA in this case) usually has accumulated expertise in the problems and opportunities affecting the industries within which its customers operate, which recycles as value-added benefit to those customers (existing and potential) as they apply the disciplines of SPC to their own circumstances.
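As a rough illustration of the arithmetic behind the individual and range charts mentioned here, the following is a minimal sketch and not a description of how Quality Analyst itself works. It assumes limits are set from a baseline run using the conventional constants for a moving range of two consecutive points (2.66 and 3.267); the function names and the sample readings are hypothetical.

```python
from statistics import mean


def individuals_chart_limits(baseline):
    """Centre lines and control limits for individuals and moving range charts."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline observations")
    # Moving range: absolute difference between consecutive observations
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    centre = mean(baseline)
    mr_bar = mean(moving_ranges)
    # 2.66 = 3 / d2, with d2 = 1.128 for ranges of two points
    half_width = 2.66 * mr_bar
    return {
        "centre": centre,
        "lcl": centre - half_width,
        "ucl": centre + half_width,
        "mr_centre": mr_bar,
        "mr_ucl": 3.267 * mr_bar,  # D4 constant for ranges of two points
    }


def out_of_control(values, limits):
    """Indices of observations falling outside the individuals chart limits."""
    return [i for i, x in enumerate(values)
            if x < limits["lcl"] or x > limits["ucl"]]


# Hypothetical readings, for illustration only
limits = individuals_chart_limits([10, 12, 11, 13, 12])
alarms = out_of_control([11, 16, 7], limits)
```

An automated alarm of the kind described above would, in this simplified picture, amount to running `out_of_control` on each new batch of readings against limits fixed from a period when the process was known to be stable.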

One illustrative example quoted in NWA’s literature is the case of chemical industry raw materials, which are bagged by suppliers to a given specification (subject to its own variability) and then used ‘as-is’ as production input units. This is an area where lack of statistical knowledge or experience on the user’s part can cause things to go wrong, and the normal ±3σ variation control chart (see ‘The father of statistical process control’) may not apply since weight varies either side of a mean value, but impurity levels vary only upward from zero.
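The difficulty with one-sided quantities can be seen in a few lines. This is a sketch under stated assumptions, not NWA’s method: the impurity readings are invented, and the log-scale upper limit is just one possible alternative, which assumes strictly positive and roughly lognormal data.

```python
import math
from statistics import mean, stdev


def symmetric_limits(values, k=3.0):
    """Conventional mean +/- k standard deviations."""
    m, s = mean(values), stdev(values)
    return m - k * s, m + k * s


def log_scale_upper_limit(values, k=3.0):
    """Upper limit computed on the log scale, then transformed back.

    Assumes strictly positive, roughly lognormal data (an assumption
    made for this illustration).
    """
    logs = [math.log(v) for v in values]
    return math.exp(mean(logs) + k * stdev(logs))


# Hypothetical impurity readings in parts per million
impurity_ppm = [0.2, 0.3, 0.25, 0.4, 0.9, 0.15, 0.35, 1.6, 0.3, 0.2]

lower, upper = symmetric_limits(impurity_ppm)
# The symmetric lower limit comes out negative, which is physically meaningless
lower_is_impossible = lower < 0
upper_log = log_scale_upper_limit(impurity_ppm)
```

With skewed, zero-bounded data the symmetric lower limit falls below zero and can never be crossed, so half of the chart does no work; a one-sided limit fitted to the actual shape of the distribution is one way round that.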

NWA’s list of Quality Analyst customers is long and impressive; results bear witness to the effectiveness of this approach, and I only ever see them as a spectator since there is no call for outside help.

A single umbrella solution isn’t appropriate to every case, however. Sometimes the investment in an existing system precludes any thought of starting over from scratch, in-house expertise is more valuable than acquired systems, or particular circumstances require particular methods. Initial cost of a whole system, expended on faith without yet fully understanding the situation which it will be managing, can also be daunting. Then again (especially in science sector enterprises), the core functions of an organisation or its essential ancillaries may already use analytic software that can be extended to the task.

Bending the long-suffering office spreadsheet to SPC work is a widely adopted route, and has obvious interface familiarity advantages, although it eventually runs into the familiar law of diminishing returns. There is a healthy market in add-ins, mainly – though not exclusively – for Excel, some of which replace shaky internal statistical functions with their own more robust versions and many of which do not. With acquisition of add-ins, good professional advice on their selection, and time expended on implementation, the apparent cost advantages of this route can often be illusory. Spreadsheet systems, both at initiation and in ongoing maintenance, are lucrative sources of business for freelance consultants – though they can also, in the right hands and setting, offer an entry point to otherwise inaccessible benefits.

Between the spreadsheet and the all-encompassing, ready-made system lie various levels of solution based on analytic software. Statsoft’s Statistica has a long history of quality module development, taking the necessary tools to a high level of sophistication matched by equally advanced built-in help, guidance and defaults, the latest additions being multivariate quality charting and process analysis extensions.

Statistica also has a well-established track record in external database communication. There is widespread admiration, too, for Minitab’s move from education to successful penetration of the Six Sigma market, where other analytic packages, such as JMP and Statgraphics Centurion, are also finding their future. These are not just the data analysis bases for reliable quality systems; they are also the tools used for process capability studies preceding setup, as well as for parallel checking and firefighting in systems that do not perform to plan.
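For readers unfamiliar with what a process capability study actually computes, here is a minimal sketch of the two most commonly quoted indices. It is not taken from any of the packages named; it assumes a roughly normal process and uses the overall sample standard deviation (some packages report indices computed this way as Pp and Ppk, reserving Cp and Cpk for a within-subgroup estimate). The readings are invented.

```python
from statistics import mean, stdev


def process_capability(values, lsl, usl):
    """Return (cp, cpk) for given lower and upper specification limits."""
    if usl <= lsl:
        raise ValueError("upper limit must exceed lower limit")
    mu = mean(values)
    sigma = stdev(values)  # overall sample standard deviation (assumption)
    # Cp compares the width of the specification with the process spread
    cp = (usl - lsl) / (6 * sigma)
    # Cpk also penalises a process mean that sits off centre
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)
    return cp, cpk


# Hypothetical deviations from target, in grammes per tonne
deviations = [-100, 0, 100]
cp, cpk = process_capability(deviations, lsl=-500, usl=500)
```

A value comfortably above 1 on both indices indicates that the natural spread of the process sits inside the specification with room to spare; a Cpk well below Cp indicates a process that is capable in principle but off centre.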

Often forgotten are the other software types that contribute to quality outcomes by providing information on initial states, to which analytic methods can then be applied. Design of experiments is crucial in efficiently establishing the parameters within which systems will best operate, and symbolic or numeric modelling (the adoption by Toyota of MathWorks tools for controls systems origination, and Maple for model-based design, is a good example) provides stable theoretical platforms upon which subsequent regimes can be founded.
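The simplest form of designed experiment, a two-level full factorial, can be sketched as follows. This is an illustration only: the factor names and the response values are hypothetical, and real design of experiments software offers far more economical designs than running every combination.

```python
from itertools import product
from statistics import mean


def full_factorial(factor_names):
    """Every combination of low (-1) and high (+1) settings for each factor."""
    return [dict(zip(factor_names, levels))
            for levels in product((-1, 1), repeat=len(factor_names))]


def main_effects(runs, responses):
    """Average response at the high setting minus average at the low setting."""
    effects = {}
    for name in runs[0]:
        high = [y for run, y in zip(runs, responses) if run[name] == 1]
        low = [y for run, y in zip(runs, responses) if run[name] == -1]
        effects[name] = mean(high) - mean(low)
    return effects


# Hypothetical two-factor experiment in which only temperature matters
runs = full_factorial(["temperature", "pressure"])
responses = [10 + 3 * run["temperature"] for run in runs]
effects = main_effects(runs, responses)
```

Here the estimated effect of temperature is the full swing in response between its two settings, while pressure shows none, which is the kind of information that tells an analyst where the operating window for a process should be drawn.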

Despite industrialisation of research, pure science sometimes tends to lag behind applied practice in the rigorous application of assurance systems and can learn from its back catalogue of experience. Walter Shewhart commented that ‘applied science... is even more exacting than pure science in certain matters of accuracy and precision’. This may be less true now; both applied and pure science have since progressed to addressing smaller targets and more elusive effects, with the need for ever more precise control rising accordingly.

The view from the quality industry press

Dirk Dusharme, editor of the quality-focused monthly Quality Digest, comments: ‘I think most old timers would say that producing control charts and performing rigorous analysis of process data, in order to understand process capability and how to improve it, was something largely done only by companies who could afford a statistician.

Smaller companies either had no idea of who Shewhart was or, if they did, didn’t have the […] Inexpensive software turned that on its head. While it is still necessary to understand the concepts and goals behind SPC, it is no longer necessary to be a statistician; the software does the hard work. This has enabled SPC to be driven to the shop floor, into the hands of shop workers, where it does the most good... As with most software tools, what SPC software has brought to the game is speed and ease of use.’

Columnist Tom Pyzdek adds: ‘Software has also made the use of statistics for process improvement and new product and process design a lot more widespread. Six Sigma Green Belts and Black Belts use software extensively for exploratory data analysis and hypothesis (dis)confirmation. Design teams use statistical design of experiments and simulation to model new processes and products.’

‘The father of statistical process control’

Statistical quality control methods trace their lineage back to the work of Walter Shewhart in the first half of the 20th century. Shewhart was the first to lay out in the 1920s the now commonplace idea of distinction between chance causes of variation and those which were assignable to controllable factors, and the importance of concentrating on the latter.

Shewhart’s name lives on in the ‘Shewhart chart’, used to track divergence in process control. Means of successive test samples are plotted as dispersion from the intended process mean on the y-axis against time or sequential order on the x-axis. In most practice the dispersion is weighted according to zones one standard deviation wide (the result being known as a zone chart), with a drift beyond the third zone either way commonly given a weight of 8, at which point action is taken. The ±3σ figure represents a decision that a false alarm rate of the order of one per thousand instances (about 1.35 per thousand beyond each limit, if the plotted means are normally distributed) is an acceptable compromise between variation and intervention.
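The zone weighting can be sketched in a few lines. This is a simplified illustration, not the algorithm of any particular package: it assumes weights of 0, 2, 4 and 8 for the successive zones, a running score that resets when the plotted point crosses the centre line, and an alarm at a score of 8 or more. Here `sigma` is the standard deviation of the plotted sample means, and all names are hypothetical. The last function gives the normal tail area beyond ±k standard deviations, which for k = 3 is about 2.7 per thousand in total, or about 1.35 per thousand on each side.

```python
import math

ZONE_WEIGHTS = (0, 2, 4, 8)  # assumed weights: zones 1, 2, 3 and beyond


def zone_weight(z):
    """Weight for a point lying z standard deviations from the target."""
    zone = min(int(abs(z)), 3)
    return ZONE_WEIGHTS[zone]


def zone_chart(sample_means, target, sigma, alarm_score=8):
    """Return a (cumulative score, alarm flag) pair for each sample mean."""
    results = []
    score = 0
    last_side = 0  # +1 above the centre line, -1 below, 0 not yet known
    for m in sample_means:
        z = (m - target) / sigma
        side = (z > 0) - (z < 0)
        if side != 0:
            # Crossing the centre line resets the running score
            if last_side != 0 and side != last_side:
                score = 0
            last_side = side
        score += zone_weight(z)
        results.append((score, score >= alarm_score))
    return results


def two_sided_tail(k):
    """Probability that a normal variable lies more than k sigma from its mean."""
    return math.erfc(k / math.sqrt(2))


# Hypothetical sample means, in units of sigma from a target of zero
history = zone_chart([0.5, 1.5, 2.5, 2.5, -0.5, 3.5], target=0.0, sigma=1.0)
```

The point of the weighting is that a run of moderately displaced means on one side accumulates to an alarm just as a single wild point does, so a slow drift is caught without waiting for a breach of the outer limit.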

Sources

SAS, JMP: http://www.jmp.com/forms/jmp_contact_nonus.shtml

Maplesoft, Maple: info@maplesoft.com

MathWorks, Matlab and associated tools: info@mathworks.co.uk

Minitab Inc, Minitab: sales@minitab.com

Adept Science, NWA Quality Analyst: info@adeptscience.com

Quality Digest, Quality Digest magazine: http://qualitydigest.com

StatPoint Inc, Statgraphics Centurion: info@statgraphics.com

Statsoft, Statistica: info@statsoft.co.uk