Biological Thumb Drives

by Reed Stubbendieck (@bactereedia)

“Every cell in your body contains seven hundred and fifty megs of data,” the engineer said. “For comparison, one of your fingers holds as much information as the entire internet. Of course, your information is repeated and redundant, but the fact remains that cells are capable of great storage.”

Legion: Skin Deep, Chapter 5

Legion: Skin Deep, cover

Note: This post contains no plot spoilers for Legion: Skin Deep.

In Brandon Sanderson’s novella Legion: Skin Deep, Stephen Leeds is tasked with finding a corpse and the information it “knows”. Before his death, eccentric engineer Panos Maheras encoded crucial information in his cells. Leeds needs this information to prevent a pandemic caused by a new virus. There are a ton of fun ideas to explore in this story, but today I want to focus on the idea of storing information inside of cells. As referenced above, an engineer states that each cell contains 750 megabytes (MB) of data. So, in my first post, I want to explore the following two questions:

  1. How much information is contained within a single cell?
  2. Can we store the entire internet in a human finger?

Before we begin, let’s have a brief refresher on DNA. Inside most animal and plant cells, there is a tiny organ (an “organelle”) called the nucleus. The nucleus contains the cell’s DNA in the form of chromosomes, which together are known as the nuclear genome. Each chromosome contains a double-stranded DNA molecule wrapped tightly around many different proteins. Information is encoded within a double-stranded DNA molecule via the nucleotide base pairs. A single strand of DNA is made from a sequence of the four nucleotides: adenine (A), cytosine (C), guanine (G), and thymine (T), which pair with T, G, C, and A, respectively, on the complementary strand of DNA. In a human’s haploid (i.e., single) set of 23 chromosomes, there are ~3 billion base pairs of DNA. To determine the storage capacity of 3 billion base pairs, we will need to take a brief trip from the world of carbon to the world of silicon.

In computing, the basic unit of information is called a bit, and can have a binary value of 0 or 1. If we translate the language of nucleotides into the language of bits, it takes 2 bits to encode each base pair uniquely (AT, CG, GC, and TA as 00, 01, 10, and 11, respectively). This conversion will allow us to directly determine how much information is contained in our DNA. For a single set of 23 chromosomes, we calculate: 3 billion base pairs × 2 bits per base pair ÷ 8 bits per byte = 750 million bytes, or ~715 MB (taking 1 MB as 2²⁰ bytes; see update, below).
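The conversion from base pairs to bytes can be checked with a few lines of Python. This is a minimal sketch, assuming the 2-bits-per-base-pair encoding described above and a binary megabyte of 2²⁰ bytes; the constant and function names are illustrative.

```python
BITS_PER_BASE_PAIR = 2   # AT, CG, GC, TA -> 00, 01, 10, 11
BITS_PER_BYTE = 8
BYTES_PER_MB = 2**20     # assumption: binary megabyte (1,048,576 bytes)


def genome_megabytes(base_pairs):
    """Storage needed for a genome of the given length, in binary megabytes."""
    total_bytes = base_pairs * BITS_PER_BASE_PAIR / BITS_PER_BYTE
    return total_bytes / BYTES_PER_MB


haploid_mb = genome_megabytes(3_000_000_000)  # one set of 23 chromosomes
print(f"haploid genome: {haploid_mb:.0f} MB")
```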


It is important to note that estimates of the total length of the haploid human genome vary from 2.9 to 3.2 billion base pairs. Thus, the information content of the haploid human genome is between 691 and 763 MB, a range that brackets the value of 750 MB (see update, below) given by the engineer in the novella. However, most human cells contain a diploid (i.e., double) set of chromosomes. We inherit 23 chromosomes each from our father and mother for a total of 46 chromosomes. Therefore, each cell contains ~1430 MB of information stored in the nuclear genome. In addition, human cells also contain organelles called mitochondria, which have their own genomes. The mitochondrial genome is much smaller than the nuclear genome at ~16,000 base pairs. Unlike the nucleus, mitochondria are present in many copies per cell. For our calculation we’ll estimate that the average cell contains 1000 molecules of mitochondrial DNA: 16,000 base pairs × 2 bits per base pair × 1000 copies ÷ 8 bits per byte = 4 million bytes, or ~3.8 MB.


That’s not a lot compared to the nuclear genome. Therefore, the total information stored in each cell is ~1.4 GB. This data could be stored on a USB drive that costs < $3!
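The per-cell total can be reproduced the same way. The sketch below is a rough check, assuming 2 bits per base pair, a binary megabyte, and the round figures used in this post (3 billion base pairs per haploid genome, 16,000 base pairs and 1000 copies for mitochondrial DNA); `dna_megabytes` is an illustrative helper, not a standard function.

```python
BYTES_PER_MB = 2**20  # assumption: binary megabyte, as in the rest of the post


def dna_megabytes(base_pairs, copies=1):
    """Binary megabytes needed for `copies` of a DNA molecule at 2 bits per base pair."""
    return base_pairs * 2 * copies / 8 / BYTES_PER_MB


# Haploid range from the low and high genome-length estimates.
low = dna_megabytes(2_900_000_000)
high = dna_megabytes(3_200_000_000)

nuclear = dna_megabytes(3_000_000_000, copies=2)    # diploid nuclear genome
mitochondrial = dna_megabytes(16_000, copies=1000)  # ~1000 mtDNA molecules
total = nuclear + mitochondrial

print(f"haploid range: {low:.0f} to {high:.0f} MB")
print(f"nuclear: {nuclear:.1f} MB, mitochondrial: {mitochondrial:.1f} MB")
print(f"total per cell: {total / 1024:.2f} GB")
```

The unrounded diploid figure comes out at about 1430.5 MB, which this post rounds to 1430 MB; the mitochondrial contribution is well under 1% of the total.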

But can we store the entire internet on a finger? If we estimate the number of cells in a human finger and then use our value of 1.4 GB per cell, we can calculate a finger’s DNA information content. For a male, the average finger width and length are 20 and 110 mm, respectively. Assuming that a finger is roughly cylindrical in shape, the density of human tissue is close to that of water (1 g/cm³), a 70 kg human body contains 10¹³ cells, and that the distribution of cells in the body is uniform across all body sites, we can estimate the data storage of a human finger as follows: a cylinder 20 mm wide and 110 mm long has a volume of π × (1 cm)² × 11 cm ≈ 35 cm³ and so weighs ~35 g; at 10¹³ cells per 70 kg, that is ~5 billion cells; and 5 billion cells × 1.4 GB per cell ≈ 7 billion GB (7 exabytes).
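The finger estimate is easy to reproduce in code. The following sketch uses only the assumptions listed above (a cylindrical finger, tissue as dense as water, 10¹³ cells spread evenly through a 70 kg body) plus one assumption of its own: it takes 1.4 GB per cell as 1.4 × 10⁹ bytes. The names are illustrative.

```python
import math

FINGER_DIAMETER_CM = 2.0        # 20 mm average width
FINGER_LENGTH_CM = 11.0         # 110 mm average length
TISSUE_DENSITY_G_PER_CM3 = 1.0  # close to water
BODY_MASS_G = 70_000
BODY_CELLS = 1e13
BYTES_PER_CELL = 1.4e9          # assumption: ~1.4 GB per cell, decimal bytes


def finger_cells():
    """Estimated number of cells in a cylindrical finger."""
    volume_cm3 = math.pi * (FINGER_DIAMETER_CM / 2) ** 2 * FINGER_LENGTH_CM
    mass_g = volume_cm3 * TISSUE_DENSITY_G_PER_CM3
    return mass_g * BODY_CELLS / BODY_MASS_G


def finger_storage_bytes():
    """Estimated DNA storage capacity of a finger, in bytes."""
    return finger_cells() * BYTES_PER_CELL


print(f"cells in a finger: {finger_cells():.2e}")
print(f"storage: {finger_storage_bytes() / 1e18:.1f} EB")
```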


7 exabytes (EB) is an extremely large amount of information. In fact, a single EB can hold the contents of nearly 43 million Blu-ray discs. However, the internet contains much more information: in 2012 it was estimated that 1 EB of data was created on the internet daily. Therefore, though the potential information stored in a finger is impressive, it could not store the entire internet. If we use every cell in the human body, we can store 14 zettabytes (ZB), the equivalent of 615 billion Blu-ray discs. With this much storage, we can hold the entire internet. At least, for a little while longer, because global internet traffic is estimated to reach 3.3 ZB per year by 2021!

In conclusion, even though the engineer was a little enthusiastic about the storage potential of a human “thumb drive”, he was correct that cells have a great capacity to store large amounts of data! In a future post, I will explore how scientists have already engineered cells and DNA to store music, movies, and books.

Update 05/22/2018: In the calculations for this post, I used the classic definition of 1 MB as 2²⁰ (1,048,576) bytes instead of 10⁶ (1,000,000) bytes. In modern parlance, the former is now called a mebibyte (MiB) and the latter a MB (see here for a discussion of the historical differences). If we use the modern definition of a MB, then the calculation matches the 750 MB value given in the novella. Note that this value still refers to a haploid genome.
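The difference between the two definitions is quick to see in code. This short sketch assumes the round figure of 3 billion base pairs at 2 bits each; `MB` and `MIB` are just local names for the two unit sizes.

```python
MB = 10**6    # megabyte, modern decimal definition
MIB = 2**20   # mebibyte, the "classic" binary megabyte

# 3 billion base pairs at 2 bits each, 8 bits per byte
haploid_bytes = 3_000_000_000 * 2 // 8

print(f"{haploid_bytes / MB:.0f} MB")
print(f"{haploid_bytes / MIB:.0f} MiB")
```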

No, CRISPR will not Lead to a World of Genetically Manipulated Criminals

by David Green (@GradDavid_Green)

Lately, an article posted in the Daily Mail has been making its way through the social media spheres. Titled “Criminals could manipulate their own DNA to avoid detection on police databases with £150 online gene repair kits”, the article has been met with a mix of concern from the public and well-deserved derision from the scientific community. While I think it is important that the ethical concerns raised by new scientific discoveries are discussed among everyone, not only scientists, this is increasingly difficult when articles like this one are written to incite emotion instead of to inform. So why is it so ridiculous that CRISPRs could be used to create a class of forensically invisible criminals?

To answer this question, we must first discuss what a CRISPR is. CRISPRs are an element of a bacterial antiviral defense system that can target and cut DNA at specific sites. When a CRISPR is used to target a piece of DNA in another organism, the cell will attempt to repair the break in its DNA, and this repair mechanism runs the risk of introducing a mutation at the site where the CRISPR cut. Realizing that this would be incredibly useful for both research and medicinal purposes, scientists have taken this natural system from bacteria and isolated it so that we can use it across many different organisms, including humans! This discovery has been hugely impactful and there is no doubt that a Nobel Prize is forthcoming for it. While CRISPRs can manipulate DNA, there are major hurdles that would make using them to evade forensic identification very difficult.

It is true that CRISPRs can edit the DNA of an individual cell; however, what the article obfuscates is that there are serious challenges that would make using such a technique to cover up a crime nearly impossible. For one, the number of cells that an individual would have to alter to successfully dodge forensic scientists is massive. Our body is composed of trillions of cells. Even focusing on the cells most likely to be left behind at a crime scene, such as skin and blood, the task would be daunting. The most likely solution would be to target the stem cell populations, groups of cells whose role is to divide and replace cells as they die. Our would-be super criminal would have to alter the DNA in stem cell populations across their entire body. Now, if you can get the CRISPR system into a cell, it can perform the task. However, it is incredibly difficult to get large molecules, such as the machinery that runs the CRISPR system, into a cell. It is in fact one of the major challenges to the use of the system.

The second major problem requires an understanding of how forensic scientists identify individuals. Forensic scientists do not check a single site of the genome for similarities; they look at hundreds. Effectively cloaking an individual would require a stunning number of mutations, to a level that would significantly risk generating diseases (and, even more unlikely, Spider-Man). It is more likely that an enterprising thief would unintentionally give themselves cancer before successfully cloaking their DNA.

Articles like the one in the Daily Mail are frustrating not only because they sensationalize scientific discoveries, but also because they waste valuable opportunities to engage with real issues that arise with these technologies. CRISPR technology raises real ethical considerations: it unlocks the potential to significantly tailor an individual’s DNA, if not in adults, then in embryos. These concerns need to be discussed, and boundaries need to be set before they are tested. These boundaries must be formed not only by the scientific community, but also with input from all members of the population. As part of this system, it is our responsibility to set the record straight: not only to call out sensational articles like this one, but also to engage with the public and explain these technologies and their uses.


Signals from Noise

by Scott Mattison (@FoolsPizza)

Imagine paying almost $20 to go to see the newest Marvel movie. The previews finally end and your movie starts, only for random parts of the screen to be dark. Likely, you would be upset and would either ask the theatre to fix the error or demand your money back. What if I told you the lasers that provide the light for imaging technologies like confocal microscopy and two-photon fluorescence microscopy had that exact problem?

Lasers enable scientists to easily capture incredibly detailed images of biological tissues and cells that were previously challenging, if not impossible, to achieve. One of the earliest challenges that had to be solved when using lasers for biological imaging was how to reduce an effect referred to as “speckle”.

If you have a laser pointer at home, you can observe speckle just by shining it at the wall; if you look at the spot made by the laser, you will see some areas that are bright and other areas that are dim. This is speckle!

Speckle is the result of two paths of light interacting with one another. In some cases, the two paths combine to make a brighter light whereas in other cases the two paths combine and cancel each other out. More specifically, this interaction is called interference. Interference causes speckles that appear as a grainy pattern of bright and dark spots.
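To put numbers on the two extremes described above, here is a minimal sketch that adds two equal-strength light paths as complex waves and reports the combined brightness. Treating each path as a unit-amplitude wave described only by its phase is a simplifying assumption, and `combined_intensity` is an illustrative name.

```python
import cmath
import math


def combined_intensity(phase_difference):
    """Brightness where two unit-amplitude light paths meet.

    `phase_difference` is how far out of step the second path is, in radians.
    """
    wave_a = cmath.rect(1.0, 0.0)
    wave_b = cmath.rect(1.0, phase_difference)
    return abs(wave_a + wave_b) ** 2


# In step: the paths reinforce each other (a bright speckle).
print(combined_intensity(0.0))
# Half a wavelength out of step: the paths cancel (a dark speckle).
print(combined_intensity(math.pi))
```

Each path on its own has a brightness of 1, so the in-step case is brighter than the two paths simply added together, while the out-of-step case is essentially black.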

A simulated example of speckle originating from a laser beam illuminating a wall

As you can imagine, when you are trying to capture a detailed image of the inner workings of a cell, speckle is not desirable as this grainy pattern can degrade your image quality. In this regard, we consider speckle to be noise within our images. Luckily, scientists and engineers have worked over many years to find creative ways to reduce and remove speckle in imaging applications. As cool and interesting as a lot of these methods are, I am not here to talk about how we can reduce speckle in our images; I am here to talk about how we can utilize it. However, before we can discuss how to use speckle to our advantage, we need to know a bit more about it.

When a beam of light hits a rough surface, the light bounces off the surface along many different paths, and these paths interfere with one another, creating a random speckle pattern. If neither the light source nor the rough surface moves, the speckle pattern will remain constant. Any movement of the rough surface will cause corresponding changes to the speckle pattern that is generated. Now, we can start to see how we can utilize speckle to our advantage.
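A toy simulation makes this behavior concrete. The sketch below is not a model of any particular microscope: it assumes a one-dimensional detector, a surface made of randomly placed point scatterers with random heights, and a red laser wavelength, and it uses a completely re-drawn surface as an extreme stand-in for a surface that has moved. All names and dimensions are illustrative.

```python
import cmath
import math
import random

WAVELENGTH_UM = 0.65  # assumption: red laser pointer, in micrometres


def make_surface(n_scatterers, rng):
    """Random (position, height) pairs, in micrometres, for a rough surface patch."""
    return [(rng.uniform(-1000.0, 1000.0), rng.uniform(0.0, 5.0))
            for _ in range(n_scatterers)]


def speckle_pattern(surface, n_pixels=64, pixel_pitch_um=500.0, distance_um=1.0e5):
    """Intensity at each pixel of a 1D detector facing the surface."""
    k = 2 * math.pi / WAVELENGTH_UM
    pattern = []
    for p in range(n_pixels):
        pixel_x = (p - n_pixels / 2) * pixel_pitch_um
        field = 0j
        for x, height in surface:
            # Path from scatterer to pixel, plus the extra round trip
            # caused by the local surface height (toy model).
            path = math.hypot(pixel_x - x, distance_um) + 2 * height
            field += cmath.exp(1j * k * path)
        pattern.append(abs(field) ** 2)
    return pattern


def correlation(a, b):
    """Pearson correlation between two intensity patterns."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(0)
surface = make_surface(200, rng)
changed_surface = make_surface(200, rng)  # stand-in for a surface that has moved

still = correlation(speckle_pattern(surface), speckle_pattern(surface))
moved = correlation(speckle_pattern(surface), speckle_pattern(changed_surface))
print(f"unchanged surface: {still:.2f}, changed surface: {moved:.2f}")
```

An unchanged surface reproduces the same pattern, so the correlation stays at 1, while a changed surface gives a pattern that no longer matches the original. Comparing patterns over time in this way is the basic idea behind using speckle as a signal.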

By tracking changes in speckle patterns, researchers can determine the movement of a sample over time. This technique has been used to monitor how tissue reacts when a specific force is applied. From this, properties related to the tissue such as its strength or how well it recovers after being changed can be determined. This approach has the potential to allow doctors to differentiate between healthy and cancerous tissues or identify unhealthy regions of blood vessels.

Tracking movement of tissues isn’t all speckle can do. By simply observing changes in speckle over time, researchers have demonstrated that we can actually tell the difference between the movements of liquids and solids. This has led to amazing techniques for mapping out small networks of blood vessels within the body, and has even allowed researchers to image blood flow in the brain!

To me, speckle is an awesome example of what makes research so powerful. We had this noise source in our images that was really slowing down progress in research. Instead of just finding a way to solve this problem (which we did), researchers found a way to take that noise and make something useful out of it: a signal.

Who’s watching the kids?

by Andrew Anderson (@AndersonEvolve)

I am an evolutionary biologist and, as such, I find the diversity of life to be amazing and love pondering how divergences between species/populations occurred.  But I have a confession: fishes are the best, hands down. Sexual selection and parental behavior are often intertwined, and fishes cover a wide breadth of behaviors, traits, and systems, especially when compared to other vertebrates.  I am sure that there are probably a lot of entomologists and other invertebrate biologists shaking their fists at the screen right now. I’ll concede that those taxa are also amazing in their range of adaptations, but fishes are just cooler, so I’ll focus on them.

Throughout my time on this blog, I hope to point out many different features and adaptations of fishes as well as what processes may have caused them.  I hope to touch on everything from males that look like females to sneak mates, to males who steal eggs from rival nests to make it seem like more females have chosen them, to a species that loses 20% of its genome from its somatic cells (that is, cells that won’t make eggs/sperm).  For this post, though, I’ll touch on a fascinating evolutionary outcome that I really hope to delve into more as my career progresses: which sex watches the kids.

Male Gulf pipefish, Syngnathus scovelli, exhibiting male brood care.  He’s ready to pop!

Fishes are unique because, in over half of the species that care for their offspring, the male engages in parental behaviors instead of the female.  The lab I entered studies Syngnathids, a.k.a. pipefishes and seahorses, which are known for their male pregnancy. What I found out is that the Solenostomids, the family of fishes most closely related to Syngnathids, also have brood care.

The ghost pipefish on the bottom is a female with larger fins on its underside that form a pouch to hold eggs until they develop into juveniles.  The top fish is male.

You can see some similarities between the two families, but in Solenostomids, a.k.a. ghost pipefish, the females have evolved brood care.  So we have two closely related families that evolved a pouch to hold developing offspring, but the sex responsible is flipped. What caused that?  A good starting point would be to figure out which sex cared for offspring in the ancestor of both groups. Without a strong fossil record, we can only infer what that ancestor might have looked like by comparing the traits of the next closest families to Syngnathids and Solenostomids. It turns out the next families are trumpetfish, cornetfish, and shrimpfish, none of which engages in brood care. As a result, there are three possibilities: 1) the ancestor had male brood care, 2) the ancestor had female brood care, or 3) the ancestor had no brood care and Syngnathids and Solenostomids independently evolved it with a different sex.

Samurai gourami, Sphaerichthys vaillanti, with a secondary sex trait of bands.  Female is on the top with two males below

In order to tease apart what might have happened, I needed to know if there were other groups of fishes that changed which sex took care of the offspring.  Sure enough, I was able to study this using a group of gouramis. Most species of gouramis have male brood care, and in one species, the samurai gourami, females evolved a secondary sex trait (a trait that is different between the sexes that isn’t directly involved in reproduction that may be used to attract or compete over mates).  

Chocolate gourami, Sphaerichthys osphromenoides, whose sex is not determined.

The closest relative to the samurai gourami is the chocolate gourami, which has female brood care and is monomorphic between the sexes. I have sequenced the genomes of both gourami species and I am working on acquiring the transcriptome (which genes are turned on and how many times a particular gene is activated). My hope is to piece together what happened at the genomic level to cause such a wholesale behavioral change.

While this work is personal and I’m excited to share it, my goal is to show readers of this page some other peculiar results of evolution, especially in that most extraordinary group: the fishes.  Until next time!

Welcome to Scientific Sidequests

Scientific Sidequests was started by a group of biological scientists from diverse fields who shared common interests in games, movies, and shows.  One of the many things we enjoyed when playing a game or watching a movie together was hearing and discussing what the others were working on in fields just beyond our own.  We each also brought our own perspectives from our disciplines, which led to many interpretations of shared experiences. The hope we all have with Scientific Sidequests is to bring those different perspectives to one place and let anyone with a curious mind view the amazing breadth of biology.