Meet Etch a Cell: The citizen science project advancing our understanding of subcellular activity
Advances in technology have led to great leaps in our understanding of life sciences, such as biology. However, with the increasing power of technology comes increasing amounts of data. Scientists now have microscopes that can create images at nanometre scale, generating a terabyte of data every day (10¹² bytes = 1000 gigabytes). This poses challenges in terms of data processing, analysis and storage. But how are these data processed? This challenge led Martin Jones, a physicist at the Francis Crick Institute, to decide to harness the power of the public and create Etch a Cell.
Etch a Cell is a project that takes a new approach to processing data: it invites citizen scientists, that is, members of the public who take part in science, to log on and identify the nucleus membrane (i.e., the wall separating the nucleus from the rest of the cell). This allows scientists to build 3-D models of the cells’ architecture and gain a greater understanding of cells’ role in disease. I chatted with Martin Jones to find out more about Etch a Cell, what type of data they collect and why he decided to get help from citizen scientists.
What data is collected?
The data processed in the Etch a Cell project is generated by electron microscopes hidden four storeys under the Francis Crick Institute. These microscopes can capture cell architecture at nanometre scale. This allows researchers to understand how a disease or pathogen may disrupt the formation of the nucleus or avoid detection by the immune system, Martin explained.
The electron microscope fires a beam of electrons at the surface of the sample, and the electrons that bounce back create an image. A slice is then peeled off the top of the sample and another beam of electrons is fired; repeating this generates a stack of cross-sectional images of the cell. The nucleus of each cell must be traced on a computer screen to digitally build a 3-D image. These automated electron microscopes can collect vast amounts of data within a few days; even just one 3-D image of a cell nucleus can be up to several terabytes.
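To illustrate how traced cross-sections become a 3-D model, here is a minimal Python sketch. It assumes each traced slice has already been turned into a small grid of 0s and 1s (1 meaning "inside the nucleus"); the function names and the voxel size used in the example are made up for illustration and are not taken from the Etch a Cell pipeline.

```python
def stack_slices(slices):
    """Stack equally sized 2-D masks (lists of rows) into a volume indexed [z][y][x]."""
    if not slices:
        raise ValueError("need at least one slice")
    height, width = len(slices[0]), len(slices[0][0])
    for mask in slices:
        if len(mask) != height or any(len(row) != width for row in mask):
            raise ValueError("all slices must share the same shape")
    # Copy the rows so the volume does not share lists with the input.
    return [[list(row) for row in mask] for mask in slices]


def nucleus_volume(volume, voxel_size_nm):
    """Count nucleus voxels and multiply by the size of one voxel (in nm^3)."""
    dz, dy, dx = voxel_size_nm
    voxels = sum(value for mask in volume for row in mask for value in row)
    return voxels * dz * dy * dx


# Three toy 3x3 cross-sections through a tiny "nucleus".
slices = [
    [[0, 0, 0], [0, 1, 0], [0, 0, 0]],
    [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 1, 0], [0, 0, 0]],
]
volume = stack_slices(slices)
# Hypothetical voxel size: 50 nm between slices, 10 nm x 10 nm pixels.
total_nm3 = nucleus_volume(volume, (50, 10, 10))
```

Real image stacks are far larger than this toy example, which is why a single 3-D image can run to terabytes.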
How are these data processed?
Usually, a microscopist traces the cell structure. However, “there aren’t enough electron microscopists in the world to get through this data”, according to Martin. That’s why the Etch a Cell team decided to recruit citizen scientists to speed up this stage of the research process. The citizen scientists log on to Zooniverse, an online platform for citizen scientists, and carefully trace the outline of the nucleus membrane displayed in the electron microscopes’ detailed images.
Martin says citizen scientists are motivated to contribute to a project which, down the line, could help disease prevention and treatment. The Etch a Cell team call it “advanced colouring-in”, with some likening it to adult colouring books designed for meditative relaxation. However, real cells are not the clear-cut spherical shapes shown in textbooks.
Why can’t an algorithm do it?
With the astonishing advances in computer vision, I was surprised that an algorithm can’t detect the nucleus membrane. The reason is that the microscope image is created from the electrons that bounce back from the sample’s surface, which produces a greyscale image (almost the Etch a Sketch shades of grey). As a result, an algorithm can’t easily distinguish the nucleus membrane from the surrounding cell architecture. Consequently, human vision is required to meet the standards of molecular biology research. Looks like we’re still ahead… for now at least!
Martin was very appreciative of the citizen scientists for their help, but mentioned that “at the current rate, citizen scientists can process the data within a reasonable amount of time, a few months or so, but with one more technological advance, purely human processing will be too slow.” The next advance in electron microscopes will give 80 times the current data output. Data processing is the key issue hindering advancement of scientific knowledge, rather than the measurement tools themselves.
Feeding the machine
But there is hope for automated processing tools: the data created by citizen scientists, that is, the traces of the nucleus membrane, can be used as training data for supervised algorithms. Although deep learning can require millions of training data points, this figure has been greatly reduced by advancements such as data augmentation techniques, whereby a training image is tweaked so that the algorithm processes it as a new image. This allows researchers to get more out of a single human-processed image. Martin estimates the amount of training data required for an algorithm to detect a nucleus membrane is within reach, but even larger human-processed training data will be required for more complex shapes, such as mitochondria or endoplasmic reticulum.
It looks like the invaluable work of citizen scientists will be needed for some time to come. Etch a Cell has over 5,000 citizen scientists volunteering their time to help advance our understanding of how diseases and pathogens disrupt the functioning of healthy cells. And their effort is paying off: Etch a Cell celebrated 100,000 nucleus membrane classifications in October 2018! If you would like to get involved, visit the Etch a Cell Zooniverse page.
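As a rough illustration of data augmentation, the Python sketch below turns one traced image into eight training examples using only flips and quarter-turn rotations. The same transform is applied to the image and to its traced mask so the two stay aligned. This is one simple form of augmentation chosen for illustration; the function names are hypothetical and the article does not say which techniques the Etch a Cell team uses.

```python
def flip_horizontal(grid):
    """Mirror a 2-D grid (list of rows) left to right."""
    return [row[::-1] for row in grid]


def rotate_90(grid):
    """Rotate a 2-D grid clockwise by 90 degrees."""
    return [list(row) for row in zip(*grid[::-1])]


def augment(image, mask):
    """Return the 8 flip/rotation variants of an (image, mask) pair."""
    variants = []
    img = [list(row) for row in image]
    msk = [list(row) for row in mask]
    for _ in range(4):
        variants.append((img, msk))
        variants.append((flip_horizontal(img), flip_horizontal(msk)))
        # Rotate both grids together so the trace still matches the image.
        img, msk = rotate_90(img), rotate_90(msk)
    return variants


# Toy greyscale image and its human-drawn trace (1 marks the traced pixel).
image = [[1, 2], [3, 4]]
mask = [[0, 1], [0, 0]]
variants = augment(image, mask)
```

Each variant looks like a new image to the algorithm, yet it cost the volunteer no extra tracing.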