In pursuit of problem solving
Michael Resch, director of the Stuttgart High Performance Computing Centre
Michael Resch likes to solve problems. But he has found that, sometimes, to solve a problem you need to get more involved than you planned. His pursuit of problem solving has led him to become director of one of Germany’s largest computer centres. Despite becoming director at a young age, he has turned it from a provincial university computer centre into a major European resource.
At the Stuttgart High Performance Computing Centre he has created a facility that leads research on making high performance computing (HPC) a utility that users barely know they are accessing, rather than on the fundamentals of computer science. At one time he was one of those users, trying to get access to the powerful machines needed to run large simulations. His dream is that users should not even know that they are users: they just tap into the grid and the grid returns the results.
Horst Simon, associate director for computing at Lawrence Berkeley National Laboratory, says that it was a surprise to some people that Resch had risen to such a senior position so early in his career: ‘I came to appreciate his ability to lead a major computing centre afterward. It was an excellent decision to appoint him. He has done many things to boost Stuttgart as a major supercomputing centre. Many centres claim to work with industrial users, but it is Stuttgart that is engaged with the industrial community.
‘Stuttgart has made a much greater effort to integrate its supercomputing resources. He has always been very application-, user- and results-oriented and his selections of platforms and the focus on getting usable cycles to users has always been his approach. A lot of other centres are driven by buying the biggest and highest teraflop machine and not paying attention to what users can get out of the system.
‘Michael is one of the few people in the field who comes from a traditional engineering background, rather than computer engineering. This is why the approach is very hands-on and getting computing capacity to the users.
‘He had a difficult political situation to resolve and he has done this well. Stuttgart is on a par with the other supercomputing centres in Germany. He did a wonderful job, considering he was so junior when he started.
‘Michael has a certain dry humour. He always has a strong presence at meetings and is an active participant in any meeting, even when everyone else at the meeting is American and so, by default, a native English speaker. He has established a strong management culture and has the respect of his team.’
Resch was born in Graz, Austria, but he moved as a baby to Salzburg. He went to school early, but confesses that he was a little bit lazy until he reached his teens. His grandfather encouraged him to do mathematics and he found the subject to be great fun. He was also interested in language and when it came to deciding what to do at university he was torn between the two subjects. But he decided that mathematics was more fun, so he studied applied mathematics with computer science.
He did his studying at a relaxed pace to begin with and spent a lot of time working with a student organisation similar to the US fraternity system, but with a political theme. He went to university at Graz, where he had a lot of relatives. Later he started working as a research assistant in the university.
This is where he started his relationship with high performance computing. His research topic was studying blood flow in large arteries, treating it as a Newtonian fluid. Computer models were used to simulate common diseases of the arteries. Although his family had a long tradition of serving in the army, Resch postponed his service so he could study and, by accident, avoided it totally by moving to Germany after he graduated.
He then took a job with a private research organisation working on ground water flow simulation using finite element code.
He says: ‘At the time this kind of research was not very prominent. But for me it was interesting because it was an environmental problem and it was the time when the Green Party was rising in Austria and I was a keen supporter. There were only one or two other groups in Austria working on these biological problems. My interest in mathematics was always in solving real world problems and problems with a meaning. In blood flow you know people get sick and it is partly a mechanical problem. The same is true with ground water, because people had spoiled it during the 70s and we needed to find ways to clean it up.’
Resch had started working on parallel computing problems and he realised that Austria was too small a country to get the kind of resources needed for parallel supercomputing. He saw a job advertised in Stuttgart and applied. It turned out that he had the right mix of experience and there were not too many people with his background – and he got the job.
He had originally intended to spend a couple of years there but, gradually, he started getting involved in teaching at Stuttgart, then he got involved in some large European projects and eventually they asked him to stay.
In the mid-1990s the government had wanted to create national supercomputing centres and Stuttgart was one of the universities chosen to be a major centre. All this time he was studying for his PhD and with so many distractions it took him a long time. He sat his first interview for the position of professor the day after he was granted his PhD.
By 1998 he was head of parallel computing at the new High Performance Computing Centre in Stuttgart (HLRS). He liked this very much, but soon it was announced that the head of the Computing Division of the university, which included the High Performance Computing Centre, was going to retire. The university decided that it was going to split the division into two parts, and the HLRS was going to be an independent organisation.
He says: ‘They asked me to be acting head of the HLRS for two years and after that there would be a new professor. In Germany the rule is that new professors come from outside the organisation; in fact it is the law that internal candidates cannot apply. Having a new professor would mean that everything would be changing, so I started looking around for a job.
‘I was offered a job in industry when Barbara Chapman at the University of Houston suggested that I apply there, and I immediately jumped on it. When you work in HPC, working in the US is not a bad thing. I worked as an assistant professor.’
Soon after he arrived in Houston his son was born, joining the three-year-old daughter he already had. He found life there completely different, but very exciting. His plan was to stay at Houston for at least three years, even five years, and then look for a full professorship in Germany, Austria or the US.
He says: ‘Before I left, my professor at Stuttgart said that, in order to get some practice at applying for a professorship, I should apply for the job at Stuttgart. I did a presentation and, as I was leaving to go to Houston, they told me I was number one on the shortlist. While I was working at Houston I was waiting for the Government to actually send me the letter confirming that I had the job. I had just started settling down in Houston when I got the letter. It was obviously something I could not possibly refuse. It was a tenured professorship, it was in HPC and it was in Stuttgart which I knew well. The professorship came with the job of director of the HLRS.’
Resch had to face some challenges because, previously, the HLRS was part of the University of Stuttgart computing centre and his team was small. He says: ‘What I had to do first was grow the centre, because it was impossible to keep it going with so few people. I had to bring in third-party funding. I also had to change it to focus on research activity while still keeping the system operations side going. We changed from a German centre to an international centre, because the European Union is prepared to help fund a Europe-wide centre.’
When it came to research, he did not really have the resources to do fundamental research in computer science. He decided to focus on the operational side of grid computing and to look at ways of making grid computing accessible to users who are working on real applications. He realised that there were many issues around ‘customer service’ and support.
He says: ‘You have to move away from an approach that is batch-oriented towards a more general approach. At Stuttgart we have been working a lot with industry. The engineers want to just use the service without having to understand how it works. We needed to give more support and ease of access for users. We started out trying to couple systems which, for us, was very natural, because we had a Cray T3E and a Cray Y-MP and we were able to extend that to working with another Cray on the other side of the Atlantic.
‘What was more interesting for us was thinking about what we could do to improve the quality of the experience for the users.’ Resch says there has always been a spirit of co-operation in the HPC community and many of the ideas developed at Stuttgart have been adopted elsewhere. The same spirit has been adopted in the emerging grid community, of which the HLRS has been an active supporter. Its focus, however, has been not to duplicate the fundamental work but to concentrate on the customer service challenges thrown up by a grid model.
One issue his research team has looked at is managing the interconnections between different grids. If there were just one grid then, one day, someone would come along and flood it with work, slowing everyone down. Resch says that most industrial users have their own grids and then want to be able to move easily in and out of external resources when they need them on a pay-as-you-go basis. They do not want other people coming into their internal grids. There are issues around accounting and ensuring quality of service, while at the same time allowing people to move easily between systems when they need to and to keep the commercial secrets that can be tied up in a simulation.
Resch says: ‘The problems come when you have to decide who is allowed what resources and have to create new rules for occasional users. You need to find out if the person requesting a service is who they say they are and what quality of service they have contracted to have. We have some users who are members of two organisations and only one of them is entitled to use the service. You have to find a way of figuring out who they are working for at the time they request a service.’
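The checks Resch describes can be pictured as a short chain of questions asked of every request: is the user who they claim to be, which organisation are they acting for right now, and what has that organisation contracted to receive? The Python sketch below is purely illustrative and is not taken from any HLRS system; the user, the organisation names, the service tiers and the `authorise` function are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """What one organisation has agreed with the service provider (hypothetical)."""
    entitled: bool            # may members of this organisation use the service?
    quality_of_service: str   # invented tier name, e.g. "priority"


# Invented example data: one user belongs to two organisations,
# and only one of those organisations is entitled to use the service.
CONTRACTS = {
    "acme-engineering": Contract(entitled=True, quality_of_service="priority"),
    "city-university": Contract(entitled=False, quality_of_service="none"),
}
MEMBERSHIPS = {"alice": {"acme-engineering", "city-university"}}


def authorise(user, acting_for, authenticated):
    """Return the contracted quality of service, or None if the request is refused."""
    # 1. Is the person requesting the service who they say they are?
    if not authenticated:
        return None
    # 2. Do they really belong to the organisation they claim to be working for?
    if acting_for not in MEMBERSHIPS.get(user, set()):
        return None
    # 3. Is that organisation entitled, and to what quality of service?
    contract = CONTRACTS.get(acting_for)
    if contract is None or not contract.entitled:
        return None
    return contract.quality_of_service
```

In this sketch the same user is served when acting for the entitled organisation and refused when acting for the other, which is the dual-membership case mentioned in the quote. A real grid would also need accounting and a trustworthy way of establishing the `authenticated` flag, both of which are left out here.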
Resch is mainly involved in management issues, but he still keeps his hand in by joining in with research groups. He has about 40 PhD students working at the centre and he is constantly helping them to develop their ideas rather than trying everything himself.
Resch works very closely with industry and he has learned from that experience that ultimately there comes a time when you have to change jobs. He says he is always looking at options that might involve him going back to research and, while he is not ready to move any time soon, he knows that he may not want to do his present job for the rest of his life.
He says: ‘We are currently a centre with 25 people in permanent positions and about 50 people in research. My real ambition is that we can make HPC disappear. There are so many things in health, environment and climate research, and in industry, which require a lot of computer power to gain real insight. We have a lot of big project work at the moment, but ultimately I want to make HPC ubiquitous so that any researcher can submit a job to a system and get a result back in a reasonable time. The users will not care how it happens; they are just getting the results they need.
‘I started trying to understand blood flow and so I learned to do simulations. Those took many days, so I started doing HPC to speed them up; then I had to understand parallelism. I am still trying to find out the answer to a problem. I am not doing HPC because I want to do HPC; I do it because I want to solve a real problem like blood flow. I understand the issues of the users, because I was originally one of them and I know what the problems are; I also know all their tricks. I am just the same guy who tried to access a computer in Graz.’