Image shows breast cancer cells with the enlarged cell to the left dying. © Dr David Becker, Wellcome Images, under CC BY-NC-ND 4.0. Image cut and resized.

Using machine learning, researchers have developed a method to identify cancer cells, which may help speed up medical diagnoses.

While artificial intelligence may be making headlines in terms of robotics, its possibilities are also being explored in fields such as medicine, which, although perhaps not Hollywood movie-friendly, are likely to have a significant impact on people’s lives. Take, for example, the research conducted by an international team of scientists that uses machine learning to identify different kinds of cells. The new method offers significant benefits over previous methods such as fluorescent staining, in which a stain binds to components of the cells. Staining enables the cells to be seen using microscopy, but it would also alter the cells’ behaviour.

Professor Paul Rees of Swansea University’s College of Engineering, who was an author of the research paper, explains the evolution of the research. ‘Our project started when I took a sabbatical to go to the Broad Institute, which is a part of MIT and Harvard. The project I was working on then was to take lots and lots of cell images and then measure different information about those cells.’ They would look at the size of the nucleus or the cell body, and at how round, textured or granular it is. ‘These are called features.’


The team realised, like others in the research community, that measuring all these features had a parallel in the field of face recognition. ‘Imagine if you are trying to identify a person’s face in a crowd,’ says Professor Rees. ‘What the computer program would do would be to measure things like how separated are the eyes? How long is the nose? How thin is the mouth?’ The algorithms they would develop would work in a similar way to those for face recognition. ‘So we would measure hundreds, maybe a thousand different features or properties of each cell and then we would store those into our computer.’ They could then train the computer algorithm to recognise those features in new cells.
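The idea of measuring features and then training a program on them can be sketched in a few lines of code. What follows is a toy illustration, not the researchers' method: it uses a simple nearest-centroid classifier, and the feature names (nucleus area, roundness, granularity), the labels `type_a` and `type_b`, and all the numbers are made up for the example.

```python
from statistics import mean

# Hypothetical, made-up measurements: (nucleus_area, roundness, granularity).
# In practice a human expert would have labelled each example cell.
TRAINING = {
    "type_a": [(10.0, 0.90, 0.20), (11.0, 0.85, 0.25), (9.5, 0.92, 0.18)],
    "type_b": [(18.0, 0.55, 0.70), (17.0, 0.60, 0.65), (19.5, 0.50, 0.75)],
}


def centroid(vectors):
    """Average each feature across a list of equal-length feature vectors."""
    return tuple(mean(column) for column in zip(*vectors))


def train(labelled):
    """Build one 'typical cell' (centroid) per label from labelled examples."""
    return {label: centroid(vectors) for label, vectors in labelled.items()}


def squared_distance(a, b):
    """Sum of squared differences between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(model, features):
    """Return the label whose centroid is closest to the new cell's features."""
    return min(model, key=lambda label: squared_distance(model[label], features))


model = train(TRAINING)
print(classify(model, (10.5, 0.88, 0.22)))  # prints type_a
```

A real pipeline would measure hundreds of features per cell and would normally rescale them first, so that a feature with large numbers, such as area, does not swamp the others.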

One of the elements driving the acceleration of this kind of research over the last five years is new computer technology. ‘We are starting to have computers which are big enough to store these large databases,’ says Professor Rees. ‘We could be thinking of storing a typical run on one of the instruments that could deliver a million cells and we could have five hundred features measured for each one of those cells. To have algorithms that can crunch that kind of amount of numbers is just about coming online now.’
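To give a sense of scale, the figures Professor Rees quotes can be turned into a rough storage estimate. The sketch below assumes each measurement is stored as an 8-byte floating-point number; that storage format is an assumption made for the arithmetic, not something stated by the researchers.

```python
CELLS_PER_RUN = 1_000_000   # "a million cells" from one instrument run
FEATURES_PER_CELL = 500     # "five hundred features" per cell
BYTES_PER_VALUE = 8         # assumption: one 64-bit float per measurement

values = CELLS_PER_RUN * FEATURES_PER_CELL
total_bytes = values * BYTES_PER_VALUE

print(f"{values:,} values, about {total_bytes / 1e9:.1f} GB per run")
# prints: 500,000,000 values, about 4.0 GB per run
```

Under that assumption, a single run produces half a billion numbers, and every training pass has to read all of them.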

Training the algorithms is still at the research stage and, as Professor Rees says, that means there still has to be a human who understands that a given cell is actually a cancer cell before the computers can be trained. ‘There is quite a lot of work that goes into preparing a set of samples where we know what each cell is to be able to train,’ says Professor Rees. ‘And there is another branch of mathematics which is unsupervised learning, which is also quite cool in that the computer can have educated guesses as to how many different cell types it could find. It doesn’t necessarily have to have been trained before.’ The biggest challenge they are exploring at the moment is getting the training algorithms good enough to consider using them for a diagnosis, and they are working on this with the Leukemia group at Newcastle University.
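Unsupervised learning can also be illustrated with a small sketch. The example below is not the team's algorithm: it uses a deliberately simple technique sometimes called leader clustering, in which a cell joins the nearest existing group if it is close enough and otherwise starts a new group. The `radius` threshold and the two-feature measurements are made up for the illustration. No labels are supplied, yet the program still arrives at a guess for how many cell types are present.

```python
import math


def leader_clustering(points, radius):
    """Group unlabelled points: each point joins the nearest existing cluster
    whose centre lies within `radius`, otherwise it starts a new cluster."""
    centres = []   # running mean of each cluster
    members = []   # points assigned to each cluster
    for point in points:
        best, best_dist = None, radius
        for i, centre in enumerate(centres):
            d = math.dist(point, centre)
            if d <= best_dist:
                best, best_dist = i, d
        if best is None:
            centres.append(point)
            members.append([point])
        else:
            members[best].append(point)
            n = len(members[best])
            # Recompute the centre as the mean of all points in the cluster.
            centres[best] = tuple(sum(col) / n for col in zip(*members[best]))
    return centres, members


# Made-up, unlabelled measurements for seven cells: (size, granularity).
cells = [(1.0, 1.1), (0.9, 1.0), (5.0, 5.2), (1.1, 0.9),
         (5.1, 4.9), (9.0, 1.0), (8.8, 1.2)]

centres, members = leader_clustering(cells, radius=1.5)
print(len(centres))  # prints 3
```

The answer depends on the chosen radius, which is why the result is best treated as an educated guess rather than a diagnosis.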


The scale of the project and the different expertise involved is demonstrated by the research network: Swansea University’s College of Engineering; the Broad Institute of MIT and Harvard in Cambridge, Massachusetts, USA; Helmholtz Zentrum München in Munich, Germany; The Francis Crick Institute in London; and Newcastle University. Professor Rees’ own scientific career has moved from laser physics, to non-linear science in a physics department, to engineering, where he has worked on lasers and nanotechnology. Now he reflects that while he is in an engineering department, all his collaborators are in medicine.

The future of research is likely to favour scientists who are comfortable collaborating across different disciplines. For example, the current project, says Professor Rees, ‘requires people who are very skilled in computer science, people who are mathematicians who understand the algorithms, and the high-end medics who are generating this data. It is a very multidisciplinary field.’