Software outstrips grad students in cluster management
Boise State University in Idaho has selected Bright Cluster Manager to provide job scheduling, monitoring and cluster management for R1, its collaborative research cluster.
Although the cluster is used by many disciplines, there is a particular emphasis on advancing novel methods of molecular-targeted therapeutics for cancer research. One of R1’s distinctive capabilities is displaying very high-resolution images across parallel display panels, which brings its complex simulations to life.
However, the cluster has an added layer of complexity: it is located behind a federal firewall at the Idaho National Laboratory, a US Department of Energy nuclear research and development facility. Direct access to the cluster is not possible on a day-to-day basis, so productivity can suffer if nodes malfunction or crash unexpectedly. The university’s researchers needed a way to troubleshoot and resolve issues quickly and remotely.
Ken Blair, HPC Systems Engineer at Boise State, said: ‘At one point, several of our nodes were rebooting for no apparent reason. Bright’s support team advised me how to use secure shell (SSH) to create a tunnel to the web interface of my IPMI controller so that I could access the console. Within a very short period of time, our cluster was up and running again, and a situation that could have presented a major issue was averted.’
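The article does not give the exact command Blair used, but the general technique he describes, an SSH local port forward to a management controller's web interface, can be sketched as follows. The host names, address, user name and port numbers are hypothetical placeholders, not details from Boise State's setup; only the standard OpenSSH options -N and -L are used.

```python
import shlex


def build_tunnel_command(gateway, bmc_host, local_port=8443, bmc_port=443, user=None):
    """Build the argument list for an SSH local port forward.

    gateway  -- a host reachable by SSH that can itself reach the controller
    bmc_host -- address of the IPMI/BMC web interface as seen from the gateway
    All default values are illustrative assumptions.
    """
    target = f"{user}@{gateway}" if user else gateway
    return [
        "ssh",
        "-N",  # do not run a remote command; only forward the port
        "-L", f"{local_port}:{bmc_host}:{bmc_port}",  # local -> controller, via gateway
        target,
    ]


if __name__ == "__main__":
    # Hypothetical gateway, controller address and user, for illustration only.
    cmd = build_tunnel_command("head-node.example.org", "10.0.0.50", user="admin")
    print(shlex.join(cmd))
```

With a tunnel like this running, a browser pointed at https://localhost:8443 on the administrator's own machine would reach the controller's web console through the SSH connection, assuming the controller serves HTTPS on port 443.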
He noted that in the past: ‘I’ve tasked graduate students with cluster management using open source toolkits. This approach was low-cost but time-consuming on my part, and somewhat risky. We don’t have the bandwidth to write scripts for node installation, synchronisation or for ongoing cluster maintenance, all extremely time-intensive tasks. Bright makes tackling these tasks easy, and lets us automate a lot of important but tedious procedures.’