How will we know if we’ve found Martian microbes?

By Reed Stubbendieck (@bactereedia)

Figure 1. Curiosity Mars Rover taking a selfie. [Source]
Drew’s recent blog post spurred a conversation between the two of us about the first extraterrestrial life that humans will encounter. In the end, we both agreed that when/if humans discover aliens, they’ll most likely be microbial.

We are not alone in this assertion. At this very moment, the NASA Mars rover Curiosity (Fig. 1) is roaming the surface of the red planet, using its suite of instruments to detect and characterize organic molecules that could be indicative of life from ancient aqueous environments. Intriguingly, data already collected by Curiosity indicate that there are environments on Mars that may once have been habitable for microbial life!

Unfortunately, Curiosity is unable to directly detect living microbes, which raises the question: how will we really know that we’ve found genuine alien microbes?

To address this question, we first need to review the seven fundamental characteristics of life. All living organisms 1) are composed of cells, 2) are ordered, 3) grow, 4) reproduce, 5) pass down genetic information, 6) possess homeostasis, and 7) possess metabolism. In this post, we will consider how growth, reproduction, genetic information, and metabolism are currently used by scientists to detect life both on Earth and in the greater universe.

First and foremost, I am a microbiologist and prefer to follow an old proverb that states, “seeing is believing”. Thus, I would personally be most convinced of life on Mars if I saw an alien microbial colony emerge from a sample cultured on a Petri dish, which would demonstrate the necessary characteristics of growth and reproduction. However, this level of evidence is most likely unattainable for the foreseeable future.

As an example, on Earth if you directly count the number of bacterial cells from an environmental sample, such as soil or ocean water, using a microscope and then culture that sample on a Petri plate, only ~1% of those bacterial cells will form a visible colony. This phenomenon is known as “The Great Plate Count Anomaly” and has plagued microbiology since its inception. The anomaly is partially caused by how bacteriological medium is prepared, but is primarily a result of our limited understanding of the nutritional requirements of different bacterial cells. Put another way, if we don’t know what Martian microbes like to eat, then we’ll be unable to coax them to reveal themselves.

Figure 2. The iChip after being removed from the ground. [Source]
On Earth, we’ve developed methods that circumvent and accommodate these picky eaters. For instance, the isolation chip (iChip, Fig. 2) is a relatively recent technology that allows microbes to be cultured in situ (at the place of their origin). This device works by trapping individual microbial cells in tiny wells that are sandwiched between semipermeable membranes. The membranes allow the passage of molecules between the trapped cells and their environment. Thus, the iChip allows microbes to access the nutrients they need without scientists having to determine specific requirements and formulate special media. This approach has been used to cultivate up to 50% of the microbes in a soil sample, which led to the discovery of a new antibiotic scaffold from a previously uncultivable bacterium!

Alternatively, as direct culture is a bottleneck for identifying living microbes, culture-independent approaches based on DNA sequencing have exploded in the field of microbial ecology. The primary approach is called amplicon sequencing, which allows us to use specific DNA sequences as barcodes to identify different microbes. An alternative approach is to sequence all of the DNA present in a sample. This approach is called metagenomics and has been used to characterize the genes that are present in different environments on Earth. An advantage of metagenomics over amplicon sequencing is the ability to assemble entire intact genome sequences from environmental samples!

Figure 3. Astronaut Kate Rubins using the Oxford Nanopore MinION Sequencer in Space [Source]
Though once prohibitively expensive and technically challenging, advancements have rapidly decreased the cost of DNA sequencing and shrunk sequencers from the size of a refrigerator to a device that can fit in your hand (which has even been used in space, Fig. 3)! Thus, it may soon be possible to equip our future rovers with their own tiny sequencers. However, there are still important hurdles to overcome before implementing DNA sequencing technology. First, direct DNA sequencing can’t distinguish between living and dead microbes. Further, contamination with microbes or microbial DNA from Earth may confound our analyses. Finally, a practical consideration is sampling throughput (the number of samples that can be processed in a given period of time) and reagent usage.

Whether we attempt to directly culture microbes or sequence their DNA from Martian samples, there’s another practical consideration to discuss: where do we sample? As mentioned above, our rovers will likely carry only a limited amount of reagents for bacterial culture or DNA sequencing. Thus, we need to determine a method to narrow our search space for microbes: we need to identify biosignatures of life.

Fortunately, Curiosity is already equipped with instruments that detect organic molecules. Remember, all living organisms possess metabolism, so perhaps we can follow molecules like methane as biosignatures and locate microbes. Unfortunately, because Mars has only a thin atmosphere, the planet’s surface is bombarded with ultraviolet light, which may destroy volatile biosignatures.

Figure 4. Carl Sagan posing next to a model of a Viking Lander in Death Valley, California. [Source]
As an alternative, we can attempt to identify biosignatures under controlled conditions. One such method was employed in the 1970s during the Viking program. As part of this program, two landers (Fig. 4) were sent to Mars with the mission to search for evidence of life. One experiment performed by the landers was called “Labeled Release”. A Martian soil sample was combined with a mixture of seven radioactive ¹⁴C-labeled nutrients and monitored for production of labeled carbon dioxide (¹⁴CO₂) gas, which would suggest that living organisms had consumed the nutrients and produced the gas as a waste product. The experiment gave mixed results: though both landers initially produced positive results, repeated injections of the labeled nutrients failed to yield additional ¹⁴CO₂. Though controversial, it is now believed that the ¹⁴CO₂ observed in these experiments was produced abiotically. Perhaps future experiments will combine radioactive labeling with culturing or sequencing to identify microbes.

The above list of approaches to identify life is by no means exhaustive. However, I hope this post has highlighted the difficulties that scientists face not only in our search for extraterrestrial life in the universe, but also in characterizing the vast diversity of terrestrial microbes that inhabit our own planet!

Mitochondria: more essential than Midichlorians?

By Reed Stubbendieck (@bactereedia)

Midichlorians are a microscopic lifeform that resides within all living cells… And we are symbionts with them… Lifeforms living together for a mutual advantage. Without the midichlorians, life could not exist, and we would have no knowledge of the Force.

Qui-Gon Jinn, Star Wars Episode 1: The Phantom Menace

Note: this post contains a minor spoiler for Star Wars: Episode VIII – The Last Jedi

Image Credit: Wookieepedia

In one conversation between Jedi Master Qui-Gon Jinn and Anakin Skywalker, the origin of the Force shifted from the mystical to the microbiological. And while I love all things microbiology, I can’t say that even a ten-year-old Reed appreciated the introduction of midichlorians into the Star Wars canon (though let’s be fair, not many people did).

Qui-Gon Jinn says that life cannot exist without midichlorians, that the Force is conducted through them, and that they are microscopic lifeforms living inside our cells. It’s well documented that Star Wars director George Lucas derived inspiration for midichlorians from organelles that exist within most of our own cells called mitochondria.

Similarly to midichlorians, all human life is dependent upon mitochondria. However, while midichlorians connect living things to the Force, our mitochondria connect us to an even more powerful force: aerobic metabolism!

In today’s post, we will explore what would happen if a human being were suddenly cut off from the force of aerobic metabolism (much like how Luke cut himself off from the Force in The Last Jedi). More specifically, we will determine how long a human could survive if they were to suddenly lose all of their mitochondria.

Transmission electron micrograph of a mitochondrion. Image credit: G. Angus McQuibban

Before we begin, I will briefly describe the function of our mitochondria (pictured above), which are (in)famous for being “the powerhouse of the cell”. That is, most of our energy generation occurs due to biochemical reactions that take place within the mitochondria.

Learning about metabolism is one of the banes of introductory biochemistry courses, but for our purposes, we can condense the many enzyme-catalyzed reactions, substrates, and cofactors into a single equation, where C6H12O6 is glucose (a sugar) that we consume and adenosine triphosphate (ATP) is the energy currency of our cells:

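$$\mathrm{C_6H_{12}O_6} + 6\,\mathrm{O_2} \rightarrow 6\,\mathrm{CO_2} + 6\,\mathrm{H_2O} + {\sim}36\,\mathrm{ATP}$$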

This process, which is called oxidative phosphorylation, cannot occur without our mitochondria. Without mitochondria, our cells can still metabolize glucose in an oxygen-independent process called anaerobic glycolysis. However, anaerobic glycolysis is much less efficient than oxidative phosphorylation and causes a buildup of lactic acid. Under conditions of oxygen deprivation (e.g., asphyxiation), our brains rapidly suffer damage due to a combination of lack of energy and acid buildup in brain tissue.

Similar to my first post, I will use back-of-the-envelope calculations to estimate how long a human will survive without their mitochondria. I will make the following two assumptions:

1) The first organ to suffer irreparable damage from complete loss of mitochondria is the brain.

2) Death will occur due to lack of energy, in the form of ATP molecules.

To determine how long it would take for our brains to deplete their total ATP, we need to determine how much ATP our brains contain and the rate of ATP consumption. First, we will calculate the amount of ATP in our brains. A rat neuron contains 2.6 mM ATP, an average human cell has a volume of 4000 µm³, and the brain contains ~240 billion cells. Using these numbers we can estimate the ATP content in a human brain:

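$$\left(2.6\times10^{-3}\ \tfrac{\text{mol}}{\text{L}}\right)\left(4\times10^{-12}\ \tfrac{\text{L}}{\text{cell}}\right)\left(2.4\times10^{11}\ \text{cells}\right) \approx 2.5\times10^{-3}\ \text{mol} = 2500\ \mu\text{mol ATP}$$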

This number corresponds to ~1.4 grams of ATP contained in our brains. For comparison, a paperclip has a mass of ~1 gram.

Next, we need to determine how much ATP our brains generate. Under normal conditions, over the course of a single day, we produce ~60 kg of ATP! Using this rate, we can calculate how much ATP our bodies generate per second:

latex_2169373ff5f1f0b374f1a33ea30a1053

But, recall that if we suddenly lose all of our mitochondria, then we can only generate ATP from anaerobic glycolysis, which only yields 2 molecules of ATP for every molecule of glucose, which is 18× less efficient than oxidative phosphorylation:

eq4

Our brains use roughly 20-25% of the total oxygen that we consume, so we will assume that about one-quarter of our ATP production occurs in our brains. With this, we can calculate how much ATP generation occurs per second in the brain:

eq5

Finally, we need to calculate how much ATP our brains use. The human brain consumes 80 µM of ATP per second and averages 1.5 liters in volume. Using simple multiplication, we calculate:

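$$80\ \tfrac{\mu\text{mol}}{\text{L}\cdot\text{s}} \times 1.5\ \text{L} = 120\ \tfrac{\mu\text{mol}}{\text{s}}$$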

Now we have all the numbers to determine how long it takes for our brains to deplete their ATP after losing all our mitochondria. We know our brains contain 2500 µmol of ATP, produce 20 µmol of ATP per second, and consume 120 µmol of ATP per second. We calculate:

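$$120\ \tfrac{\mu\text{mol}}{\text{s}} - 20\ \tfrac{\mu\text{mol}}{\text{s}} = 100\ \tfrac{\mu\text{mol}}{\text{s}}\ \text{(net ATP loss)}$$

$$\frac{2500\ \mu\text{mol}}{100\ \mu\text{mol/s}} = 25\ \text{s}$$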

Thus, it will take our brains less than half a minute to deplete their energy stores, which is on the same order as the clinically established time frame of about 1 minute before brain cells begin to die due to lack of oxygen.
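For readers who want to check the arithmetic, here is a minimal Python sketch of the whole estimate. It uses the round numbers quoted in this post; the one value that does not come from the post is the molar mass of ATP (`ATP_MOLAR_MASS`, taken as about 507 g/mol for the free acid), which is an assumption needed to convert 60 kg per day into moles.

```python
# Back-of-the-envelope ATP budget for a brain without mitochondria.
# Inputs are the post's round numbers, except ATP_MOLAR_MASS, which is an
# assumption added for this sketch (free-acid ATP, roughly 507 g/mol).

ATP_MOLAR_MASS = 507.18        # g/mol (assumed, not from the post)
SECONDS_PER_DAY = 24 * 60 * 60

# ATP store: concentration x volume per cell x number of cells
atp_concentration = 2.6e-3     # mol/L (2.6 mM)
cell_volume = 4000 * 1e-15     # L per cell (1 cubic micrometer = 1e-15 L)
brain_cells = 240e9
atp_store_umol = atp_concentration * cell_volume * brain_cells * 1e6

# ATP production once only anaerobic glycolysis is left
body_production_umol_s = 60_000 / ATP_MOLAR_MASS / SECONDS_PER_DAY * 1e6
glycolysis_only_umol_s = body_production_umol_s / 18   # 18x less efficient
brain_production_umol_s = glycolysis_only_umol_s * 0.25  # brain's share

# ATP consumption: 80 micromolar per second across 1.5 liters
brain_consumption_umol_s = 80 * 1.5

net_loss_umol_s = brain_consumption_umol_s - brain_production_umol_s
seconds_to_depletion = atp_store_umol / net_loss_umol_s

print(f"ATP store: {atp_store_umol:.0f} umol")
print(f"Brain production: {brain_production_umol_s:.1f} umol/s")
print(f"Time to depletion: {seconds_to_depletion:.1f} s")
```

With these inputs the sketch lands at roughly 25 seconds. A different choice of molar mass shifts the production term slightly but does not change the conclusion.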

Therefore, if you suddenly lost all of your mitochondria, your brain would begin to die 21 seconds before Qui-Gon Jinn could finish ruining the mystique of the Force (for reference, 46 total seconds)!

 

Look forward to an upcoming post about the origins of organelles including mitochondria, chloroplasts, and maybe even midichlorians.

 

Microbiology beyond the micrometer

by Reed Stubbendieck (@bactereedia)

Figure 1. Light Microscope from the late 1800s. Photo credit: Reed Stubbendieck, Personal Collection.

If I asked you what the signature tool of a microbiologist is, you would likely respond: the microscope (Fig. 1). Not only do the field and instrument share the prefix micro (from Greek mikrós, meaning “small”), but the development of microscopy established the field of microbiology. In the mid-seventeenth century, Robert Hooke and Antonie van Leeuwenhoek built and used some of the earliest microscopes. Using his microscope, Hooke provided the first description of a microorganism when he reported on the fruiting bodies of fungi and coined the term “cell” in reference to the structures that he observed in cork. Meanwhile, van Leeuwenhoek was the first individual to observe protozoa and bacteria, which he called “wee animalcules”.

Building on the observations of Hooke and van Leeuwenhoek, generations of microscopists have used the microscope to study the structure and substructures of cells, observe proteins in motion, and diagnose diseases, among many more applications (some of which I’m sure Scott will explore in future posts). Thus, since the inception of microbiology, it has been difficult to disentangle the field and the instrument. However, microbes are not confined to the micrometer scale and hidden beyond our eyesight. In this post, I would like to highlight some examples of microbes that you likely encounter in your daily life and are visible with your naked eye.

Figure 2. Mushroom fruiting bodies. Photo taken in East-Central Texas by Reed Stubbendieck.

Some of the most obvious examples of macroscopic microbes are fungi and include the beautiful mushrooms that bloom near the trunks of trees in our yards and forests (Fig. 2). But given that antibiotics are one of my primary research interests, I’d be remiss if I didn’t highlight the humble bread mold Penicillium. Who among us hasn’t kept a loaf of bread for a little too long, only to discover an unexpected fuzzy blue-green growth (Fig. 3A)? Likely, we write off the loss, toss the loaf, and head to the grocery store to acquire more bread. However, this contaminant is among the most important microbes ever discovered! In 1928, spores from this mold contaminated a Petri plate in Alexander Fleming’s laboratory and produced a region where bacteria were unable to grow (called a “zone of inhibition”) (Fig. 3B). Later, in 1940, Fleming’s observations inspired the scientists Ernst Chain, Howard Florey, and Norman Heatley to develop methods to mass produce the active agent from the mold. The result of their work was the antibiotic penicillin (Fig. 3C), which was arguably the single most important advancement in modern medicine. Not a bad trade for a missed sandwich.

Figure 3. The bread mold Penicillium produces antibiotics. (A) Moldy bread covered in Penicillium. Note: this is a stock photo and not a picture of my own bread. (B) Penicillium inhibits the growth of the pathogen Staphylococcus aureus on Petri plates. (C) Core structure of penicillin. The R group is variable and distinguishes different types of penicillins. Photo credits: Panel A: Wikipedia user Henry Mühlpfordt under the GNU Free Documentation License, Version 1.2. Panel B: Christine L. Case, Ed.D.

While antibiotics treat our infections, the yeast Saccharomyces cerevisiae helps feed the body and soul. To simplify what you would learn in an introductory biochemistry course: when yeast consumes sugars but is starved for oxygen, it produces carbon dioxide gas and ethanol as byproducts of its metabolism. We use the former to make our bread rise and the latter to give our booze its buzz (Fig. 4A). But that’s not all: together, the interactions of yeast and bacteria give sourdough bread its signature tangy acidic taste (Fig. 4B), form the biofilms that make up the rinds of cheese (Fig. 4C), and provide starters for kombucha tea (Fig. 4D) (see here for more information).

Figure 4. Fermented beverages and foods. (A) Homebrewed wheat beer in fermenter showing Krausen, the foamy head that consists of yeasts and wort protein, at the air-liquid interface. (B) Sourdough starter showing carbon dioxide bubbles after feeding. (C) The rinds of cheese wheels are a mixed bacterial and fungal biofilm. (D) Kombucha tea with pellicle at the air-liquid interface. Photo credits: Panels A and B: Reed Stubbendieck. Panel C: Wikipedia user Myrabella under a CC BY-SA 3.0 license. Panel D: Graciously provided by anonymous.

Outside the kitchen, microbes also live together symbiotically and produce amazing structures. For example, the lichens that cover rocks, tree trunks (Fig. 5A) and gravestones (Fig. 5B) are made up of algae, cyanobacteria, ascomycete fungi, and even yeasts living together! Within nature, symbiosis is the name of the game.

Figure 5. Lichens. (A) Lichen covering a tree trunk in Wisconsin. (B) Lichens growing on a gravestone. Photo credits: Panel A: Reed Stubbendieck. Panel B: Brian Robert Marshall under a CC BY-SA 2.0 license.

One of my favorite symbiotic systems is studied by my current laboratory and involves fungus-farming ants, the fungus they cultivate (called the cultivar), and a symbiotic bacterium called Pseudonocardia. These ants cut leaves and other plant material (depending on the ant species), but the ants do not eat the leaves. Instead, the ants feed the leaves to their fungal crop, which they consume as their sole food source. Because the fungal cultivar is maintained asexually with limited opportunities for genetic recombination, it is not genetically diverse and is susceptible to pathogen infection. One pathogenic fungus is called Escovopsis and it is only found within the fungal gardens. Escovopsis can overgrow and consume the fungal crop. To prevent this infection, the ants use grooming behaviors and form a symbiotic relationship with Pseudonocardia bacteria, which produce antifungals that inhibit the growth of Escovopsis but are not harmful towards the fungal cultivar. The ants have evolved specialized structures to house and feed the Pseudonocardia on their exoskeletons and some ants, such as Acromyrmex sp. cf. octospinosus, can become totally covered by their symbiotic bacteria (Fig. 6)!

(Note, I will almost certainly cover the relationship between fungus-farming ants and Pseudonocardia in more detail in a future post.)

Figure 6. Symbiosis between ants and bacteria. Acromyrmex sp. cf. octospinosus worker ant covered in white symbiotic bacteria. Note the chewed leaves in the fungal crop. Photo credits: Alex Wild, purchased for use on this website.

Finally, should you find yourself in orbit over Earth, please find comfort in the knowledge that you can still revel in the beauty of our microbial world. Satellite photographs from NASA have captured images of massive blooms of algae in our planet’s oceans and seas!

Figure 7. Algal bloom in the Barents Sea. This photograph showing a large algal bloom was taken by the NASA Aqua satellite in 2011.

This post has barely covered the diversity of microbes in our world, but I hope it has convinced you to leave the microscope behind and look for examples of microbes in your daily life. Check your showers for the pink-colored Serratia marcescens, observe a lovely mushroom on your lawn, or overturn a rock and find a lichen. I would love for you to share your pictures with me! Use the hashtag #macromicrobiology and be sure to tag @bactereedia and @SciSidequests in your post.

 

If you’re interested in checking out more beauty in the microbial world, I highly recommend the book Life at the Edge of Sight by Scott Chimileski and Roberto Kolter and the corresponding museum exhibit at the Harvard Museum of Natural History.

Will bacteria become the next thumb drives?

by Reed Stubbendieck (@bactereedia)

The text from the Dispilio tablet [Source]
In 1994, a wooden tablet was unearthed from the swamps near Dispilio, a neolithic settlement in modern-day Greece. The Dispilio tablet was carbon-dated to ~5200 BC and is considered to be among the world’s oldest examples of recorded information, having lasted for >7000 years (see image above). For comparison, without intervention the lifespan of most modern digital storage media ranges from 5 to 20 years (side note: have you backed up your data recently?). However, while the longevity of the Dispilio tablet is impressive, living cells have been storing information in DNA for 3.8 billion years. In my previous post, I discussed the potential for cells to store large amounts of information. Today, I want to cover some recent examples of how scientists and engineers are tapping into this immense storage potential.

My favorite example of using cells to store information comes from a paper published last year (2017) in Nature. In this paper, the authors used CRISPR-Cas technology to introduce DNA into cells and store images. Recently, CRISPR-Cas technology has gained fame for its applications in genome engineering, including a dubiously alleged ability to hide genetically modified criminals from law enforcement. However, in its natural context, the CRISPR-Cas system functions as an adaptive immune system for archaea and bacteria. It’s this feature that the authors co-opted for information storage, which I will discuss below.

Though we often think of viruses as disease-causing agents of humans and other eukaryotes, bacteria suffer from a far greater number of viral infections. In fact, viruses of bacteria, also known as bacteriophages (or simply phages), are the most abundant biological entities on Earth. Estimates place the global number of phages at 10³⁰, which collectively cause 10²³ infections of bacteria each second. For comparison, Avogadro’s number is 6.022×10²³, meaning that there is nearly one mole of phage infections globally every six seconds (or one round of combat in D&D)!
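The mole comparison is quick to verify. The short Python sketch below uses only the two figures quoted above (the infection rate and Avogadro’s number); the variable names are mine.

```python
# How long does it take phages to rack up one mole of infections?
AVOGADRO = 6.022e23            # entities per mole
infections_per_second = 1e23   # global estimate quoted above

seconds_per_mole = AVOGADRO / infections_per_second
moles_per_six_seconds = infections_per_second * 6 / AVOGADRO

print(f"One mole of infections every {seconds_per_mole:.2f} s")
print(f"Moles of infections per six seconds: {moles_per_six_seconds:.3f}")
```

So six seconds of global phage activity comes to just under one mole of infections.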

Bacteria are not powerless to stop phage infections. One mechanism that bacteria use to prevent infections is the CRISPR-Cas system. Though the specific molecular details are beyond the scope of this article (see here, if interested), I would like to take a brief moment to explain how the CRISPR-Cas system functions in bacterial cells. During infection, a bacterial cell may capture small pieces of the phage genome and insert them into a region of the chromosome called the CRISPR array. Subsequently, if the bacterium survives, it uses these captured DNA sequences to generate an immune response against future infections from the same phage. Importantly, the cell inserts new DNA sequences into the CRISPR array in a predetermined position. Thus, the CRISPR array stores a history of infection in linear order, which is passed to both daughter cells when the bacterium divides.

By taking advantage of the ability of the CRISPR array to store new DNA sequences, one research group stored the information to reconstruct images inside of Escherichia coli cells. Instead of infecting E. coli cells with phages, the researchers generated large numbers of synthetic DNAs called oligonucleotide protospacers and tricked the cells into incorporating the custom DNAs into the CRISPR arrays. At the beginning of each of the protospacers was a 4-base-pair sequence the authors called a “pixet”. The pixet defined the set of pixels described by the following 28 base pairs of the protospacer, where each of the nucleotides (A, T, G, and C) corresponded to a different shade of gray. By introducing 112 protospacers into the population of E. coli cells (112 × 28 = 3,136 pixels at 2 bits each), the authors were able to store a 56 × 56 pixel, 784-byte grayscale image of a human hand in the bacteria. To access the data, the researchers used high-throughput DNA sequencing technology and determined the DNA sequences of many different CRISPR arrays from the population of bacteria. By using a custom algorithm, the researchers were able to decode the information from the CRISPR arrays and digitally reassemble the original image (see image below).
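To make the pixet idea concrete, here is a small Python sketch of one way such an encoding could work. It is an illustration, not the authors’ algorithm: the shade-to-nucleotide order (`BASES`) and the choice to write the pixet as a base-4 index are assumptions of mine, and the test image is synthetic.

```python
# Illustrative pixel-to-protospacer encoding (hypothetical scheme).
# Assumptions: shades 0-3 map to A, C, G, T in that order, and the 4-nt
# pixet is simply the protospacer's index written in base 4.

BASES = "ACGT"
PIXET_LEN = 4            # nucleotides identifying which pixels follow
PIXELS_PER_SPACER = 28   # one nucleotide per 2-bit grayscale pixel


def to_base4(n, width):
    """Write integer n as a fixed-width string of nucleotides."""
    digits = []
    for _ in range(width):
        n, d = divmod(n, 4)
        digits.append(BASES[d])
    return "".join(reversed(digits))


def from_base4(seq):
    """Inverse of to_base4."""
    n = 0
    for ch in seq:
        n = n * 4 + BASES.index(ch)
    return n


def encode(pixels):
    """Split a flat list of 2-bit pixels into pixet + payload strings."""
    spacers = []
    for i in range(0, len(pixels), PIXELS_PER_SPACER):
        pixet = to_base4(i // PIXELS_PER_SPACER, PIXET_LEN)
        payload = "".join(BASES[p] for p in pixels[i:i + PIXELS_PER_SPACER])
        spacers.append(pixet + payload)
    return spacers


def decode(spacers):
    """Rebuild the pixel list; the order of the spacers does not matter."""
    pixels = [0] * (len(spacers) * PIXELS_PER_SPACER)
    for s in spacers:
        start = from_base4(s[:PIXET_LEN]) * PIXELS_PER_SPACER
        for j, ch in enumerate(s[PIXET_LEN:]):
            pixels[start + j] = BASES.index(ch)
    return pixels


# A synthetic 56 x 56 image with four gray levels
image = [(x + y) % 4 for y in range(56) for x in range(56)]
spacers = encode(image)
restored = decode(spacers[::-1])   # read back in a scrambled order
n_bytes = len(image) * 2 // 8

print(len(spacers), "protospacers,", n_bytes, "bytes")
```

Because each protospacer carries its own address, the reads can come back in any order, which matters when the data are spread across a whole population of cells.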

Retrieval of an image of a hand stored in bacterial DNA [Source]
This research group was not satisfied by encoding a single image. Instead, they wanted to store a movie. Specifically, the researchers encoded five frames of Plate 626 from Animal Locomotion: An Electro-Photographic Investigation of Consecutive Phases of Animal Movements by Eadweard Muybridge, dating from between 1872 and 1875. To store this animation, the researchers split each frame into protospacer sequences as above, but instead of introducing all of the information at once, the DNA encoding each individual movie frame was successively introduced into the population of E. coli cells. Recall that the CRISPR array stores a history of infection in linear order. Using this approach, each cell stored a piece of each of the five frames. By sequencing the entire CRISPR array from the population of bacteria and splitting the spacer sequences by order of appearance, the authors were able to reconstruct each frame from the movie (see .gif below).
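The “split by order of appearance” step can be sketched in a few lines of Python. This is a deliberately idealized toy: it assumes every sequenced array holds exactly one spacer per frame and has already been oriented oldest spacer first, and the spacer labels are hypothetical stand-ins for real sequences.

```python
# Toy version of sorting spacers into movie frames by array position.
# Assumptions: each array is listed oldest spacer first and contains one
# spacer per frame; the labels below are placeholders, not real DNA.

def split_by_position(arrays, n_frames):
    """Group spacers so that frames[k] holds every spacer found at position k."""
    frames = [[] for _ in range(n_frames)]
    for array in arrays:
        for position, spacer in enumerate(array[:n_frames]):
            frames[position].append(spacer)
    return frames


arrays = [
    ["f1-a", "f2-c", "f3-b", "f4-a", "f5-d"],   # CRISPR array from one cell
    ["f1-b", "f2-a", "f3-d", "f4-c", "f5-a"],   # CRISPR array from another
]
frames = split_by_position(arrays, 5)

for k, pieces in enumerate(frames, start=1):
    print(f"frame {k}: {pieces}")
```

Each group could then be decoded like a single still image, one frame at a time.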

Movie of a galloping horse stored in bacterial DNA [Source].
One caveat of the above examples is that the images decoded from the E. coli genomes were not perfect reproductions, which is evident from several spurious pixels in the reconstructed movie. The authors found that the differences between the encoded and reproduced frames were most often due to changes in the protospacer sequence by DNA synthesis errors, DNA sequencing errors, or mutation. This latter finding highlights a limitation of storing information inside of cells. In the opening, I mentioned that cells have been using DNA to store information for 3.8 billion years. But, unlike the information encoded in the inscriptions on the Dispilio tablet, this information storage is imperfect. DNA mutates and cells evolve. This process is essential for continuing life but is inconvenient for perfect information archival.

Engineers at Microsoft have recently developed their own form of DNA storage technology. Instead of using cells, the engineers store information in isolated DNA molecules and, under special conditions, these molecules are predicted to last for >2000 years. Though etchings on preserved wood have still exceeded the current longevity estimations of DNA storage, I think we’ll find a more effective solution for perfect information archival before those DNA molecules degrade in the year 4000!

Biological Thumb Drives

by Reed Stubbendieck (@bactereedia)

“Every cell in your body contains seven hundred and fifty megs of data,” the engineer said. “For comparison, one of your fingers holds as much information as the entire internet. Of course, your information is repeated and redundant, but the fact remains that cells are capable of great storage.”

Legion: Skin Deep, Chapter 5

Legion: Skin Deep, cover

Note: This post contains no plot spoilers for Legion: Skin Deep.

In Brandon Sanderson’s novella Legion: Skin Deep, Stephen Leeds is tasked with finding a corpse and the information it “knows”. Before his death, eccentric engineer Panos Maheras encoded crucial information in his cells. Leeds needs this information to prevent a pandemic caused by a new virus. There are a ton of fun ideas to explore in this story, but today I want to focus on the idea of storing information inside of cells. As referenced above, an engineer states that each cell contains 750 megabytes (MB) of data. So, in my first post, I want to explore the following two questions:

  1. How much information is contained within a single cell?
  2. Can we store the entire internet in a human finger?

Before we begin, let’s have a brief refresher on DNA. Inside of most animal and plant cells, there is a tiny organ (an “organelle”) called the nucleus. The nucleus contains the cell’s DNA in the form of chromosomes, which together are known as the nuclear genome. Each chromosome contains a double-stranded DNA molecule wrapped tightly around many different proteins. Information is encoded within a double-stranded DNA molecule via the nucleotide base pairs. A single strand of DNA is made from a sequence of the four nucleotides: adenine (A), cytosine (C), guanine (G), and thymine (T), which pair with T, G, C, and A, respectively, on the complementary strand of DNA. In a human’s haploid (i.e., single) set of 23 chromosomes, there are ~3 billion base pairs of DNA. To determine the storage capacity of 3 billion base pairs, we will need to take a brief trip from the world of carbon to the world of silicon.

In computing, the basic unit of information is called a bit, and can have a binary value of 0 or 1. If we translate the language of nucleotides into the language of bits, it takes 2 bits to encode each base pair uniquely (AT, CG, GC, and TA as 00, 01, 10, and 11, respectively). This conversion will allow us to directly determine how much information is contained in our DNA. For a single set of 23 chromosomes, we calculate:

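$$3\times10^{9}\ \text{bp} \times \frac{2\ \text{bits}}{\text{bp}} \times \frac{1\ \text{byte}}{8\ \text{bits}} \times \frac{1\ \text{MB}}{2^{20}\ \text{bytes}} \approx 715\ \text{MB}$$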

It is important to note that estimates of the total length of the haploid human genome vary from 2.9 to 3.2 billion base pairs. Thus, the information content of the haploid human genome is between 691 and 763 MB, a range that brackets the value of 750 MB (see update, below) given by the engineer in the novella. However, most human cells contain a diploid (i.e., double) set of chromosomes. We inherit 23 chromosomes each from our father and mother for a total of 46 chromosomes. Therefore, each cell contains 1430 MB of information stored in the nuclear genome. In addition, human cells also contain organelles called mitochondria, which have their own genomes. The mitochondrial genome is much smaller than the nuclear genome at ~16,000 base pairs. Unlike the nucleus, each cell contains many mitochondria. For our calculation we’ll estimate that the average cell contains 1000 molecules of mitochondrial DNA:

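$$16{,}000\ \text{bp} \times 1000 \times \frac{2\ \text{bits}}{\text{bp}} \times \frac{1\ \text{byte}}{8\ \text{bits}} \times \frac{1\ \text{MB}}{2^{20}\ \text{bytes}} \approx 3.8\ \text{MB}$$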

That’s not a lot compared to the nuclear genome. Therefore, the total information stored in each cell is ~1.4 GB. This data could be stored on a USB drive that costs < $3!
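The per-cell figures above can be reproduced with a few lines of Python. This sketch follows the post’s conventions (2 bits per base pair, 1 MB taken as 2^20 bytes, 1000 mitochondrial genomes per cell); the function and variable names are my own.

```python
# Information content of the DNA in one human cell, using the post's
# conventions: 2 bits per base pair and 1 MB = 2**20 bytes.

BITS_PER_BASE_PAIR = 2
BYTES_PER_MB = 2 ** 20


def base_pairs_to_mb(base_pairs):
    """Convert a number of base pairs into megabytes of information."""
    return base_pairs * BITS_PER_BASE_PAIR / 8 / BYTES_PER_MB


haploid_mb = base_pairs_to_mb(3e9)
low_mb = base_pairs_to_mb(2.9e9)
high_mb = base_pairs_to_mb(3.2e9)
diploid_mb = 2 * haploid_mb
mito_mb = base_pairs_to_mb(16_000 * 1000)   # 1000 copies of ~16,000 bp
total_gb = (diploid_mb + mito_mb) / 1024

print(f"haploid: {haploid_mb:.0f} MB (range {low_mb:.0f} to {high_mb:.0f} MB)")
print(f"diploid: {diploid_mb:.0f} MB, mitochondrial: {mito_mb:.1f} MB")
print(f"total per cell: {total_gb:.1f} GB")
```

The mitochondrial contribution is a rounding error next to the nuclear genome, which is why the total stays at about 1.4 GB.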

But can we store the entire internet on a finger? If we estimate the number of cells in a human finger and then use our value of 1.4 GB per cell, we can calculate a finger’s DNA information content. For a male, the average finger width and length are 20 and 110 mm, respectively. Assuming that a finger is roughly cylindrical in shape, that the density of human tissue is close to that of water (1 g/cm³), that a 70 kg human body contains 10¹³ cells, and that the distribution of cells in the body is uniform across all body sites, we can estimate the data storage of a human finger as follows:

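$$V = \pi r^{2} h = \pi\,(1\ \text{cm})^{2}\,(11\ \text{cm}) \approx 34.6\ \text{cm}^{3} \;\Rightarrow\; m \approx 34.6\ \text{g}$$

$$\frac{34.6\ \text{g}}{70{,}000\ \text{g}} \times 10^{13}\ \text{cells} \approx 4.9\times10^{9}\ \text{cells}$$

$$4.9\times10^{9}\ \text{cells} \times 1.4\ \tfrac{\text{GB}}{\text{cell}} \approx 6.9\times10^{9}\ \text{GB} \approx 7\ \text{EB}$$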

7 exabytes (EB) is an extremely large amount of information. In fact, a single EB holds as much data as nearly 43 million Blu-ray discs. However, the internet contains much more information: in 2012 it was estimated that 1 EB of data was created on the internet daily. Therefore, though the potential information stored on a finger is impressive, it will not store the entire internet. If we use every cell in the human body, we can store 14 zettabytes (ZB) (615 billion Blu-ray discs). With this much storage, we can hold the entire internet. At least, for a little while longer, because global internet traffic is estimated to reach 3.3 ZB per year in 2021!
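Here is the finger estimate as a short Python sketch, using the same assumptions listed above (cylindrical finger, density of water, 10¹³ cells in a 70 kg body, 1.4 GB per cell). One choice is mine: the sketch converts GB to EB and ZB with decimal prefixes (10⁹ GB per EB, 10¹² GB per ZB), which is what reproduces the rounded 7 EB and 14 ZB figures.

```python
import math

# Finger-as-thumb-drive estimate. Decimal prefixes for the GB -> EB and
# GB -> ZB conversions are an assumption of this sketch.

finger_radius_cm = 2.0 / 2     # 20 mm wide
finger_length_cm = 11.0        # 110 mm long
tissue_density_g_cm3 = 1.0     # close to water

body_mass_g = 70_000
body_cells = 1e13
gb_per_cell = 1.4

finger_volume_cm3 = math.pi * finger_radius_cm ** 2 * finger_length_cm
finger_mass_g = finger_volume_cm3 * tissue_density_g_cm3
finger_cells = body_cells * finger_mass_g / body_mass_g

finger_eb = finger_cells * gb_per_cell / 1e9
body_zb = body_cells * gb_per_cell / 1e12

print(f"finger: {finger_volume_cm3:.1f} cm^3, {finger_cells:.2e} cells")
print(f"finger storage: {finger_eb:.1f} EB; whole body: {body_zb:.0f} ZB")
```

Changing the cell count or the per-cell figure scales both answers linearly, so the sketch is also a convenient way to try other estimates.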

In conclusion, even though the engineer was a little enthusiastic about the storage potential of a human “thumb drive”, he was correct that cells have a great capacity to store large amounts of data! In a future post, I will explore how scientists have already engineered cells and DNA to store music, movies, and books.

Update 05/22/2018: In the calculations for this post, I used the classic definition that defined 1 MB as 2²⁰ (1,048,576) bytes instead of 10⁶ (1,000,000) bytes. In modern parlance, the former is now defined as a mebibyte (MiB) and the latter is a MB (see here for a discussion of the historical differences). If we use the modern definition of a MB, then the calculation matches the 750 MB value given in the novella. Note, this value is still in reference to a haploid genome.
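The difference between the two definitions is easy to see side by side. This small Python sketch recomputes the haploid genome size both ways; the variable names are mine.

```python
# Haploid genome size under the binary (MiB) and decimal (MB) definitions.
genome_bytes = 3e9 * 2 / 8       # 3 billion bp at 2 bits per bp

size_mib = genome_bytes / 2 ** 20    # mebibytes
size_mb = genome_bytes / 10 ** 6     # megabytes

print(f"{size_mib:.0f} MiB versus {size_mb:.0f} MB")
```

Same number of bytes, two different labels: about 715 MiB or 750 MB.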