Faces of HPC: Tony Travis
Tony Travis is a post-doctoral researcher at the University of Aberdeen where he uses HPC in his work on GWAS (Genome-Wide Association Studies) to analyse millions of SNPs (Single Nucleotide Polymorphisms). Tony is particularly interested in the democratisation of HPC using COTS (Commodity Off The Shelf) hardware and in emerging 'many-core' MIMD (Multiple Instruction, Multiple Data) low-latency mesh-connected processors such as the 'Epiphany' used in Parallella, a Kickstarter project to develop a low-cost parallel SBC (Single Board Computer) like the Raspberry Pi.
Introduction
On leaving school at the age of 17, Tony Travis initially went to Bangor University to study Forestry and Wood Science. During his Ph.D. on ion-transport mechanisms he became frustrated by the tedious nature of manually measuring cells on microscope images and turned to computational image analysis for his first post-doc. This has led to a career consulting in the field of informatics and its applications, with a continuing interest in how HPC can contribute to science.
Biography
Tell us a bit about yourself.
I was born in Preston, Lancashire and left when I was 17 to study Forestry and Wood Science at Bangor University, where I did a lot of rock-climbing. However, after three years of Forestry I changed courses to study Agricultural Botany and did a PhD on the ion-transport mechanism in stomatal guard cells at Lancaster University. My PhD involved a lot of tedious work measuring stomata down a microscope, and I became interested in image analysis as a way of making these measurements automatically, which I pursued during my first post-doc at Plymouth Polytechnic. My second post-doc, at NIAB in Cambridge, was about developing an image analysis system to identify weed seeds in samples of commercial seed. I went on from there to do my third post-doc, contributing to the development of a system for automatic karyotyping for the MRC at the Western General Hospital in Edinburgh. I then worked for 25 years at the Rowett Institute in Aberdeen.
I ran Bio-Linux with 32-bit OpenMOSIX and Kerrighed Linux SSI kernels on the Beowulf cluster that I built at the Rowett Institute, which I later upgraded to run 64-bit Linux using Sandia National Laboratories' "oneSIS". Most of the bioinformatics work I did on this Beowulf cluster was 'embarrassingly' parallel. We did run MPI versions of bioinformatics programs, of course, but bandwidth was restricted even with separate Gb network fabrics, as in the EPCC BOBCAT design, which uses two isolated network fabrics: one for 'system' and another for 'application' node interconnects, plus a separate LAN connection. This design is very good with limited bandwidth because it allows you to retain control of the Beowulf cluster even if the 'application' network fabric is saturated.
After being made redundant in December 2011, I set up a company to do consultancy work on informatics and did contract work for the BiGCaT bioinformatics department at Maastricht University. Since then, I've done more post-doc work at the University of Edinburgh and the University of Aberdeen. I'm a member of the 57North Hackspace in Aberdeen and I'm interested in the politics of IT in relation to academic freedom, in particular the way that restrictive University IT policies can be used as an instrument of management.
What is your current job?
I'm now doing my sixth post-doc, working on the molecular genetics of drought tolerance in rice in Adam Price's research group at the University of Aberdeen. My work involves GWAS (Genome-Wide Association Studies) on millions of SNPs (Single Nucleotide Polymorphisms), which is computationally intensive and, like much of the bioinformatics work I've done, benefits from the use of HPC. I would describe myself as a computer-literate biologist, and I know enough about HPC to build my own systems. I now do this commercially through my company Minke Informatics Limited.
How did you become interested in HPC?
I became interested in image analysis as a way of reducing the huge amount of tedious work involved in making manual measurements of stomatal aperture during my PhD. At the time, in the late 1970s, personal computers were just emerging in kit form and I built several computers from scratch. I learned a lot from building my own computers, which I still do today. Later, at Plymouth Polytechnic, I got on a priority educational list for the BBC Micro and started to develop image analysis software for it.
I first became aware of HPC when I worked at NIAB, where they had bought a turn-key image analysis system based on a systolic array processor built around 6809s running PolyFORTH. I reverse-engineered the PolyFORTH system and added the vocabulary necessary to compile FORTH and use it as a development system. I then ported a Small-C compiler for the 8086 to the BBC Micro and changed it to generate FORTH run-time code for the 6502, which I used to bootstrap a self-hosting Small-C on the BBC Micro, and used it to develop image analysis software for seed testing at NIAB.
After NIAB, I went to the MRC Research Unit at the Western General Hospital in Edinburgh and joined Denis Rutovitz's pattern recognition group, who were developing systems for automatic karyotyping. My role in the group was to commission a CLIP4R (Cellular Logic Image Processor v4, built by RAL, the Rutherford Appleton Laboratory) array processor with 96 x 96 = 9216 SIMD (Single Instruction, Multiple Data) custom LSI processors designed by Terry Fountain and Michael Duff at UCL. I replaced the existing PDP-11/34 array host with a VAX 750 running BSD 4.1 and wrote a Unibus-level device driver for the processor array. I also debugged the CLIP4R instruction set as implemented by RAL and developed image analysis software for CLIP4R.
When I started my job at the Rowett Institute, I did quite a lot of image analysis work using "Torch" Quad-X 68020 workstations running Unix (interestingly, Torch was set up in partnership with Acorn employees who did not join ARM, and it later went into receivership). In time, we replaced these with Sun workstations, until the day I benchmarked them against Intel PCs! Soon after that, I was building a Beowulf cluster based on EPCC BOBCAT (Budget Optimised Beowulf Constructed using Affordable Technology). Although the original EPCC BOBCAT is no more, one of its progeny is alive and well at the Mario Negri non-profit pharma research institute in Milan, where it is the head node of their LSI (Life Science Informatics) HPC cluster.
Although my main interest in HPC was originally for image analysis, I built the BOBCAT cluster when I worked in Andrew Chesson's biological chemistry group at the Rowett Institute with Bruce Milne, who used it for DFT (Density Functional Theory) chemical modelling. I then became interested in bioinformatics and have used HPC in various ways to analyse DNA/RNA sequences of different organisms. Most recently my work concerns selecting rice cultivars that grow well under drought.
As part of this project we want to celebrate the diversity of HPC, in particular to promote equality across the nine "protected" characteristics of the UK Equality Act, which are replicated in equality legislation world-wide. Do you feel an affiliation with this matter, and if so how has this interacted with or impacted your job in the HPC community?
I've worked with many different people from many different countries and I've learned that whatever stereotypes you might imagine, people are people and the best way to find out about a country or culture is to listen to what people from there have to say and what their perspective is on the way we lead our lives.
Is there something about you that’s given you a unique or creative approach to what you do?
My grandfather was Irish and he gave me an attitude to mindless authority that has stayed with me all my life. He was a soldier and familiar with the way that people in authority often behave in a way that belies the fact that they do not actually understand the purpose or objectives of the organisation they work for. All too often in academia, I see the administrative tail wagging the dog...
Were there any challenges when you first entered the field?
The main challenge I've encountered is that there are not enough hours in the day to do all the things I'm interested in. However, the main obstacle to progress I've encountered when 'deviating' from the now standard-issue 'WinTel' PCs issued by University IT departments is obstructive IT managers who are more interested in enforcing the rules than in understanding why we want to do things differently. An entire ecosystem of 'enterprise' IT culture has grown up around those few of us remaining who have always built our own computers. Sadly, the 'enterprise' mentality has seeped into HPC administration too.
What’s the best thing about working in HPC?
I see HPC, and computers in general, as a way of increasing the power of what you can do with your mind. It is incredible how quickly the performance of computers has increased during my career. I'm very excited about the Adapteva "Epiphany" MIMD mesh-connected processor used in the Parallella SBC. This was a Kickstarter project to democratise HPC in the same way that the Raspberry Pi (and the BBC Micro before it) democratised access to conventional computing. One of the most depressing eras of IT education was when children were taught how to use Microsoft Office, to make them employable, instead of being educated to learn what computing was all about, and that girls make very good programmers!
If there’s one thing about HPC you could change, what would it be?
The cost of electricity.
What’s next for you in HPC?
I'm involved in the NERC/EOS Bio-Linux project and have used this to teach bioinformatics running VMs on the CyVerse "Atmosphere" OpenStack cluster at the University of Arizona. I've also been using "Devstack" and learning how to use MAAS (Metal as a Service) to deploy nodes to a Beowulf cluster we've built at the Mario Negri Institute in Milan for cancer bioinformatics. I'm interested in HPC resources like CyVerse that are free, as in beer, at the point of use and free, as in speech, to develop. I'm also interested in using Parallella to learn about 'many-core' processors and to encourage other people to think about how to write better parallel programs.
Tony Travis was interviewed by Toni Collis in June 2017.