PDF Summary: The Gene, by Siddhartha Mukherjee


Below is a preview of the Shortform book summary of The Gene by Siddhartha Mukherjee. Read the full comprehensive summary at Shortform.

1-Page PDF Summary of The Gene

Every part of our bodies, from toenails to hair and everything in between, is built based on the instructions coded into our genes. Therefore, understanding what genes are and how they work is crucial to understanding our bodies, our health, and even our identities. Siddhartha Mukherjee wrote The Gene to give the average reader a basic grounding in the history and the science of genetics.

Mukherjee is an immunologist and geneticist known for his work in cancer research, a field closely linked to genetics. The Gene explores scientists’ efforts to learn about people by studying the genes that create us: The book traces the history of genetics from Darwin’s Origin of Species to modern gene sequencing technology, as well as taking a brief look at what genetic engineering might mean for humanity’s future. Our commentary will provide background information and explanations to help readers understand the science behind The Gene. We’ll also examine advancements in genetics since The Gene’s publication in 2016.

(continued)...

How Genes Create Proteins

The process in which DNA creates functional proteins is intricate and involves many different enzymes (proteins that aid chemical reactions). However, Mukherjee describes the process in two broad steps:

1. Transcription. Enzymes read the DNA “blueprint” and create a matching RNA molecule of the genes to be translated into proteins.

(Shortform note: This step is crucial because only a small part of your genome gets transcribed into RNA at a time. If your cells read your DNA directly and made proteins based on that, they could end up trying to create an entire genome’s worth of proteins at once.)

2. Translation. Other enzymes read instructions encoded in the RNA molecule, collect the needed amino acids (simple organic compounds that make up proteins) from within the cell, and assemble them into proteins.

(Shortform note: There are a total of 20 amino acids that, when put together in various combinations and shapes, create countless proteins. Of those 20, nine are considered essential amino acids because our bodies can’t produce them. In other words, they’re essential parts of our diet, because the only way we can get those amino acids is by breaking down proteins from other organisms that produce them. Perhaps the best-known essential amino acid is tryptophan, which is found in turkey (among other sources), and is supposedly responsible for the post-Thanksgiving drowsiness many people experience.)
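The two steps above can be sketched in a few lines of Python. This is a toy illustration rather than a model of real cell biology: the gene is hypothetical, the codon table is deliberately partial (the real genetic code has 64 codons), and transcription is simplified to copying the DNA coding strand with U in place of T.

```python
# Partial codon table: only the codons used in this toy example.
CODON_TABLE = {
    "AUG": "Met",   # methionine, also the usual start signal
    "GCU": "Ala",   # alanine
    "UGG": "Trp",   # tryptophan
    "AAA": "Lys",   # lysine
    "UAA": "STOP",  # stop signal: ends the protein
}


def transcribe(coding_strand: str) -> str:
    """Step 1: produce the RNA copy of a DNA coding strand (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Step 2: read the RNA three letters at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


dna = "ATGGCTTGGAAATAA"  # hypothetical toy gene, not a real sequence
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna)     # AUGGCUUGGAAAUAA
print(protein)  # ['Met', 'Ala', 'Trp', 'Lys']
```

Each three-letter "word" in the RNA picks out one amino acid, which is how an alphabet of only four letters can describe proteins built from 20 different building blocks.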

Parts 3 and 4: Writing and Reading Genes

Scientists had discovered that the language of biology is encoded in DNA, and it consists of only four letters. The next step for geneticists was to figure out how to read and write in that language.

Editing Genomes

Mukherjee tells us that, in 1970, Stanford biochemists Paul Berg and David Jackson successfully created recombinant DNA—DNA containing genes from multiple different sources—by inserting the genome of a virus called SV40 into the DNA of a bacteriophage (a type of virus that infects bacteria).

Combining the genomes of two species was an exceptional feat in itself, but it also hinted at a way to quickly and efficiently create drugs such as insulin and certain antibiotics—substances that are normally produced inside of living organisms. For example, inserting an insulin-creating gene into a virus’s genome and allowing that virus to replicate would naturally mean that the insulin gene gets replicated as well. In other words, by editing viruses’ genomes, scientists could turn them into microscopic medicine factories.

(Shortform note: Today, recombinant DNA technology has many uses even outside of medicine, particularly in agriculture. Most notably, by altering plant genomes, scientists can create crops that resist diseases, require less water or less fertilizer, and have greater yields than their unaltered counterparts.)

Reading Genes With Gene Sequencing

As we’ve said, it isn’t nucleotides themselves that encode genetic information—four chemicals aren’t nearly enough to account for the enormous array of proteins that our bodies produce—but rather the order in which those nucleotides are arranged. Therefore, Mukherjee tells us that in order to decode genetic instructions, scientists first had to learn how to sequence genes—in other words, to determine exactly which nucleotides are present and in what order.

In 1977, Cambridge biochemist Frederick Sanger fully sequenced a genome for the first time: that of a virus called Phi X174. Using specially tagged nucleotides, he was able to follow along as the virus replicated itself, painstakingly copying down its approximately 5,400 base pairs. By doing so, he was able to match genes with the proteins they created. In essence, he learned how to read the virus’s genetic code.

Innovations in DNA Sequencing

“Sanger Sequencing” is still the method of choice for small-scale sequencing projects, such as finding a mutation in a single gene. Geneticists favor Sanger Sequencing partly due to its low cost and high accuracy, and partly because it’s a well-established procedure that most geneticists will already be familiar with.

However, Sanger’s method isn’t efficient or cost-effective for large projects like whole genome sequencing. That’s because Sanger Sequencing can only sequence one fragment of DNA at a time, and those fragments are relatively short—anywhere from 300 to 1,000 nucleotide pairs. An improved method called next-generation sequencing (NGS) can sequence millions of those fragments simultaneously, making it much more suitable for large-scale projects.

Intergenic DNA and Introns: Genetic “Filler”

Mukherjee adds that, as scientists continued sequencing genomes of different species, they found something very odd: Animal genomes contain long stretches of DNA that don’t actually code for proteins. These noncoding zones can be found both between genes (where they’re called intergenic DNA) and within genes (called introns).

In fact, in humans, roughly 98% of the genome doesn’t code for proteins. Mukherjee says that even geneticists aren’t sure why that is, and he explains the three competing theories:

  1. The noncoding DNA regulates genes—the extra space helps control when they’re activated and deactivated.
  2. The noncoding DNA serves some other purpose that we haven’t yet discovered.
  3. The noncoding DNA is genetic junk left over from millions of years of evolution, and it serves no purpose at all.

(Shortform note: Contrary to what Mukherjee writes here, scientists do know at least one purpose of introns: Noncoding sections of DNA get removed from the transcribed genetic instructions, effectively breaking up genetic “sentences” into individual “words.” This is significant because it allows for alternative splicing: essentially, rearranging the remaining pieces into different combinations. This process allows a single gene to code for multiple different proteins.)
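To make the idea of splicing concrete, here is a minimal Python sketch. The segment sequences and the `splice` helper are hypothetical, and the sketch shows only one simple form of alternative splicing (skipping an exon); real splicing is carried out by dedicated cellular machinery and is far more varied.

```python
# Hypothetical transcribed gene, written as labelled segments in order.
PRE_MRNA = [
    ("exon", "AUGGCU"),
    ("intron", "GUAAGU"),
    ("exon", "UGGAAA"),
    ("intron", "GUCCAG"),
    ("exon", "GAUUAA"),
]


def splice(segments, skip_exons=()):
    """Drop every intron, plus any exon whose 0-based index is in skip_exons."""
    kept = []
    exon_index = 0
    for kind, sequence in segments:
        if kind == "exon":
            if exon_index not in skip_exons:
                kept.append(sequence)
            exon_index += 1
    return "".join(kept)


full_length = splice(PRE_MRNA)                   # all three exons
exon_skipped = splice(PRE_MRNA, skip_exons={1})  # middle exon left out
print(full_length)   # AUGGCUUGGAAAGAUUAA
print(exon_skipped)  # AUGGCUGAUUAA
```

The same stretch of DNA yields two different sets of instructions, which is the sense in which one gene can code for more than one protein.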

Gene Sequencing in Medicine

Mukherjee says that nowadays, improved gene sequencing technology allows doctors and researchers to find and diagnose genetic diseases.

A doctor of internal medicine named Victor McKusick led the charge to bring genetics to medicine. He first became interested in genes in 1947, when he found that a certain disease (now called Peutz-Jeghers syndrome) ran in families and concluded that it must be the result of a defective gene.

By 1998, McKusick had discovered some 12,000 disease-causing gene variants. He’d also found that some disorders are the result of a single mutation—such as sickle-cell anemia—while others are much more complex. For example, Down’s syndrome is the result of someone inheriting an entire extra chromosome, while conditions like cancer and heart disease can be influenced (though not directly caused) by numerous different genes.

In many cases, genetic testing even allows doctors to diagnose diseases and disorders in utero, so that the mother can make informed decisions about whether to proceed with the pregnancy. The first such case occurred in 1968, when a woman known only as J.G. decided to terminate her pregnancy rather than give birth to a child who was likely to live a very short and painful life.

Types of Mutation

There are so many disease-causing variants of genes because DNA codes for complex and highly specific proteins, and mutations often result in those proteins being made incorrectly (or not being made at all). Since proteins carry out thousands of different tasks within the body, those mutations can interfere with bodily functions in countless ways.

Broadly speaking, there are three types of mutation:

  • A silent mutation has no effect; despite the mutation, the gene ends up coding for the same protein. It’s like replacing one word in a sentence with another word that means the same thing. For example, “I’m driving to the store” becomes “I’m going to the store.”

  • A missense mutation causes the gene to produce a different protein than usual. Proteins with missense mutations are often less effective or efficient in carrying out their tasks, if they can perform them at all. For example, in sickle-cell anemia, a change in the protein hemoglobin causes red blood cells to become deformed and rigid, making it more difficult for those cells to carry oxygen throughout the body. Again, imagine replacing one word in a sentence, but this time you replace it with something that changes the meaning—“I’m driving to the store” becomes “I’m dancing to the store.”

  • A nonsense mutation makes it so the gene’s instructions are cut off early, usually making the protein stunted and nonfunctional. For example, in some cases of cystic fibrosis, a nonsense mutation leaves a cell membrane protein missing or nonfunctional, which prevents moisture from passing through and entering the lungs. As a result, mucus that would normally get cleared out of the lungs becomes too thick and sticky to expel. A nonsense mutation is like replacing a word with a period, so “I’m driving to the store” becomes just “I’m.”
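The three categories above can be told apart mechanically by comparing what a three-letter codon meant before and after a change. The following Python sketch assumes a partial codon table containing only the codons it uses, and the function name `classify_point_mutation` is made up for illustration. The GAG to GUG change is the one associated with sickle-cell anemia, where glutamic acid is replaced by valine in hemoglobin.

```python
# Partial codon table: only the codons needed for the examples below.
CODON_TABLE = {
    "GAG": "Glu",   # glutamic acid
    "GAA": "Glu",   # a second spelling of the same amino acid
    "GUG": "Val",   # valine
    "UAG": "STOP",  # stop signal
}


def classify_point_mutation(original: str, mutated: str) -> str:
    """Label a single-codon change as silent, missense, or nonsense."""
    before = CODON_TABLE[original]
    after = CODON_TABLE[mutated]
    if after == before:
        return "silent"    # different spelling, same amino acid
    if after == "STOP":
        return "nonsense"  # the instructions are cut off early
    return "missense"      # a different amino acid is used


print(classify_point_mutation("GAG", "GAA"))  # silent
print(classify_point_mutation("GAG", "GUG"))  # missense
print(classify_point_mutation("GAG", "UAG"))  # nonsense
```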

The Human Genome Project

In 1989, a group of biologists began the massive undertaking of sequencing the entire human genome. A council of 12 advisers, chaired by American geneticist Norton Zinder, led the effort.

Mukherjee tells us that the human genome contains over three billion base pairs—for a sense of scale, remember that the first fully sequenced genome was a virus consisting of about 5,400 base pairs. Completing the project would take an estimated 50,000 person-years and cost three billion dollars—about a dollar per base pair.
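For a rough sense of those numbers, the arithmetic can be checked directly. The figures below are the rounded estimates quoted above, not precise measurements, and the variable names are our own.

```python
HUMAN_GENOME_BP = 3_000_000_000  # "over three billion" base pairs, rounded down
PHI_X174_BP = 5_400              # approximate size of the first sequenced genome
BUDGET_USD = 3_000_000_000       # estimated project cost
PERSON_YEARS = 50_000            # estimated effort

cost_per_bp = BUDGET_USD / HUMAN_GENOME_BP           # dollars per base pair
scale_factor = HUMAN_GENOME_BP / PHI_X174_BP         # human genome vs. Phi X174
bp_per_person_year = HUMAN_GENOME_BP / PERSON_YEARS  # implied pace of work

print(cost_per_bp)          # 1.0
print(round(scale_factor))  # 555556
print(bp_per_person_year)   # 60000.0
```

In other words, the human genome is more than half a million times the size of the virus Sanger sequenced, and the estimate implies each person-year of work would cover about 60,000 base pairs.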

However, in spite of its massive scope, the Human Genome Project (HGP) released a first draft of the complete human genome just over a decade later, in 2000. Then, in 2003, the HGP’s chair officially declared it complete: Every human gene had been accurately sequenced and mapped. The Project uploaded its final results to the internet, where the genome map is still publicly available today.

However, Mukherjee says that even with all of this understanding of human genetics (where every gene is, what it codes for, and how), we still understand very little about how all these different genes coordinate and cooperate to build and maintain our bodies. Mukherjee therefore believes that the next step for scientists should be a deeper study of human genomics: how the genome as a whole works.

Side Benefits of the Human Genome Project

The Human Genome Project had benefits far beyond simply improving our understanding of human genetics.

  • Medical benefits: The data from the HGP helps doctors identify and treat genetic diseases, and find genetic markers that may put someone at risk for conditions ranging from cancer to heart disease.

  • Scientific benefits: Over the course of the HGP, scientists developed improved gene sequencing technology and techniques that continue to be used today.

  • Economic benefits: Every dollar the US government invested in the HGP returned an estimated $141 to the US economy, largely through creating new jobs that only exist because of a mapped human genome.

However, there are also legal and ethical concerns about how the information in the HGP could be misused. In particular, there’s the chance that genetic data could be used to discriminate against people who carry certain genes or come from certain backgrounds.

Part 5: Genetics and Identity

We’ve now had a brief overview of the history of the gene up to the present day. The remainder of this guide will focus on the current state of genetics, how genes impact us personally, and what the future might hold for both the field of genetics and the human race.

As we’ve said, our genes contain the blueprints for our bodies. Therefore, in a very real sense, our genes determine who we are. According to Mukherjee, each of us has crucial elements of who we’ll become—our ability to learn, to use language, and even our physical appearance—encoded in our DNA.

Our Genetic Identities Are Very Similar

Mukherjee says that genetically speaking, humans are all much more similar than we are different. People who believe in significant differences between “races”—for instance, that people of Asian descent are naturally good at math, or that those of African descent are more athletic—are mistaken; there simply isn’t enough genetic variation to account for such differences.

Mukherjee adds that every human alive today can trace his or her lineage down the maternal line to one woman who lived in Africa about 200,000 years ago. The fact that we have a common ancestor, especially such a recent one (by evolutionary standards), also suggests that we’re much more alike than people think.

Furthermore, scientists now believe that the first humans left Africa less than 100,000 years ago. Mukherjee tells us that it would take several times that long, at least, for any significant genetic differences to arise—in other words, for us to split into different “races.”

(Shortform note: If we all came from a common ancestor, and are still almost genetically identical as Mukherjee states, how do we explain the differences that do exist between ethnicities? The most obvious difference between “races” is skin color, which has changed more quickly than other traits because of natural selection. Populations that live near the Earth’s equator tend to have darker skin because it protects them from the intense sun and UV rays. Conversely, people who live far from the Earth’s equator, especially people in the northern hemisphere, tend to have pale skin, which lets them produce vitamin D more efficiently from the limited amount of sunlight they get.)

Genetic Differences Between Individuals

While Mukherjee is correct that there’s not much genetic variation between races (so to speak), there can be a great deal of genetic variation between individuals.

The most obvious example of genetic differences between individuals is biological sex (male versus female). This is most commonly explained as a difference of a chromosome—females have matching XX chromosomes, while males have one X and one smaller Y chromosome—but Mukherjee says the difference is even smaller than that.

Mukherjee says that in 1989, a geneticist named Peter Goodfellow narrowed “maleness” to a single gene on the Y chromosome, simply called SRY. To test his theory, Goodfellow genetically altered female mice to carry a copy of the SRY gene. Some of the offspring, though chromosomally female (XX chromosomes), seemed male in both anatomy and behavior. In other words, by altering a single gene, Goodfellow completely changed the identities of those mice.

This is just one example of how our genetics play a role in who we are.

(Shortform note: While a single gene may explain both sex and behavior in mice, humans are quite a bit more complex. Scientists have, to date, identified 19 separate genes that help determine the masculinity or femininity of the human brain—in other words, whether that person will “feel like” and identify as a man, a woman, or neither. When a person’s brain and biological sex don’t match, it can result in a condition called gender dysphoria, where the person feels trapped in the wrong body.)

Environment Plays a Large Role in Identity

Very few traits are purely genetic because most of them are also influenced by the environment. For example, identical twins (who, by definition, have all the same genes) could look and act very differently if one becomes a professional athlete while the other becomes an office worker.

Also, Mukherjee says that genes often create tendencies or predispositions toward certain kinds of behavior, but those behaviors still won’t appear unless the environment draws them out. For example, someone who’s genetically predisposed to alcoholism could go his or her entire life without ever drinking, and therefore without becoming addicted to alcohol.

Different Types of Environment

A major part of a person’s identity is how he or she acts, but, as Mukherjee says, our traits and behaviors are heavily influenced by our environment. In fact, in Behave, Robert Sapolsky—a professor of biology and neurology—explains that there are actually three different types of environment that interact with our genetics to determine how we act in any given moment:

  • Physical environment. Our surroundings influence how we act in numerous ways. For instance, if the area is clean and well-maintained, we’re more likely to be polite and follow the rules because it seems like the sort of place where that’s expected.

  • Moral environment. How close we are to a situation, both physically and emotionally, determines how we’ll respond to it. For example, if we see a child in danger, we’re likely to jump into action; if we read a story about a child in danger, we’re more likely to do nothing. Similarly, if the child is someone we know personally, we’re much more likely to try to help than if the person is a stranger.

  • Social environment. We’ll often factor who’s nearby into our decisions about how to act. For instance, men tend to be more decisive and aggressive when women are nearby. Also, people tend to be more likely to put themselves in danger if others are around—perhaps because there’s the chance that they’ll help, or maybe simply for the chance to be seen as a hero.

The Environment Permanently Changes Our Genes

We’ve said before that genes switch on and off as you grow from a single cell into an infant, and that’s why you have so many different types of cells even though they all have the same DNA. However, genes also switch on and off throughout our lives in response to environmental factors. For example, when you’re exercising, your body will activate genes that burn extra nutrients in order to boost your energy.

Even more interestingly, Mukherjee says that those repeated activations and deactivations leave permanent marks. Molecules called methyl tags attach themselves to genes during this process, and enough methyl tags on a gene can affect how it works. For example, researchers believe that some cases of cancer are due to these methyl tags blocking the “off switch” for cell division, causing potentially deadly tumors to form.

(Shortform note: Environmental effects on genes may be even more widespread and impactful than Mukherjee suggests. In Lifespan, geneticist David Sinclair explains his theory that these long-term changes in gene function are the reason why we age. Even more astoundingly, Sinclair believes that it’s possible to undo these changes with genetic engineering—to remove the methyl tags and return cells to their original functions. In short, Sinclair believes that someday it will be possible to reverse the aging process, and that it might even happen in our lifetimes.)

Eugenics: The Misuse of Genetics

In 1883, biologist Francis Galton published a book called Inquiries into Human Faculty and Its Development. Galton, inspired by his cousin Darwin’s work, theorized that selective breeding programs could improve the human race much more quickly than natural selection would: Those with desirable traits like high intelligence, health, and physical strength would be encouraged (or forced) to breed, while those with undesirable traits like chronic illness would be prevented from breeding. Mukherjee points out that Galton’s ideas were deeply immoral from the start, and implementing them would severely infringe on people’s reproductive freedoms.

Galton’s ideas reached their terrifying conclusion decades later. In 1933, Adolf Hitler became chancellor of Germany. Hitler dreamed of using eugenics to create a “perfect” human race, and so his followers began massacring undesirables—a label that included Jews, Roma, and disabled people, among others. By 1934, they were forcibly sterilizing some 5,000 people every month, and by the time of Hitler’s death in 1945, the Nazis had killed an estimated 11 million people in pursuit of Hitler’s ideal human race. The subject of eugenics has been largely off-limits in the scientific community ever since.

Mukherjee says that, if any good can be said to have come from the Holocaust, it came from making eugenics taboo.

Iatrogenics: Hurting by Trying to Help

Eugenics and the Holocaust are some of the most extreme and horrifying examples of iatrogenics—harm caused by an attempt to make things better. Statistician Nassim Nicholas Taleb discusses iatrogenics in Antifragile, saying that people almost invariably make matters worse when they try to improve on what nature has created.

The term iatrogenics comes from medicine. Until relatively recently, when doctors discovered the existence of germs and started using proper antiseptic techniques, people commonly got sick and died from infections they picked up in the very offices and hospitals where they sought treatment. Doctors trying to help ended up causing more harm. Similarly, Taleb believes that events ranging from wars to climate change are the result of people forcing their will upon the world when it isn’t needed or wanted.

According to Taleb, this damage happens because people put their trust in science and mathematics—predictions, models, and the like—when those processes are inevitably flawed. The natural world is the result of millions of years of competition and evolution, while human ideas and interventions are based on extremely limited studies and tests.

Therefore, Taleb believes that we need to fundamentally change how we think about taking action. Currently, the burden of proof rests on the naysayers: People who are against an idea have to prove that it’s dangerous or harmful. Instead, Taleb believes that people in favor of an idea should have to prove that it’s not harmful, or at least that the benefits will be worth whatever harm it could cause. So, for example, if scientists want to move forward with genetic engineering projects, Taleb would first want them to prove that their work can’t reasonably be used to harm people or damage the environment.

Part 6: Genetic Engineering—the Future of Genetics?

Mukherjee believes we now understand genetics well enough that our next step forward is to start manipulating genes. In this section, we’ll discuss how scientists have already begun to explore the possibilities of gene manipulation with new technologies like stem cell research. However, progress is slow due to ethical and legal concerns, especially when it comes to modifying human genes.

Gene Therapy Could Be the Future of Medicine

Mukherjee says that gene therapy—using genetic engineering to fix damaged or disease-causing genes—offers promising treatments for diseases ranging from hemophilia and cystic fibrosis to cancer.

(Shortform note: Safe and effective gene therapy is still a work in progress. In the US, a handful of gene therapies have received FDA approval, but most are still available only to patients who agree to participate in clinical trials.)

One major area of study is in pluripotent stem cells: immature cells that can be genetically manipulated to grow into any type of adult cell. While there are obvious ethical issues with harvesting immature cells from human embryos, doctors now believe it’s possible to manipulate the genomes of adult cells so that they revert to stem cells. From there, the cells can grow into whatever’s needed. In other words, doctors may be able to treat patients using stem cells harvested from their own bodies.

Theoretically, doctors could use these stem cells to regenerate damaged nerves and organs, helping people to heal from injuries and diseases that are otherwise untreatable.

(Shortform note: Aside from exciting new treatment possibilities, stem cells are also useful to model how diseases progress and to test new drugs. For example, a researcher could take a cell sample from a patient, grow that sample into new tissue, and observe how the disease affects it. The researcher could also use that sample to test experimental treatments without putting the patient’s health at risk.)

Mukherjee also discusses CRISPRs: “Clustered Regularly Interspaced Short Palindromic Repeats”—in simple terms, repetitive and easily identifiable sequences of nucleotides.

A gene editing technique called CRISPR-Cas9 targets those sequences using an enzyme called the Cas9 nuclease, allowing scientists to make precise cuts to DNA. That, in turn, allows specific sequences of DNA to be removed and other sequences to be inserted. In short, scientists can use this technique to make precise, controlled edits to a cell’s DNA, thereby changing the genetic instructions encoded in it. Editing genes this way could potentially cure a wide range of genetic diseases, correct harmful mutations, and possibly even treat cancer.
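As a loose analogy, the cut-and-replace idea resembles a find-and-replace operation on a string of letters. The Python sketch below is only that analogy: the sequences and the `edit_sequence` helper are hypothetical, and a plain text search stands in for the guide RNA that directs Cas9 in a real cell. It also ignores the cell's own repair machinery, which is what actually completes an edit.

```python
def edit_sequence(dna: str, target: str, replacement: str) -> str:
    """Swap the first occurrence of `target` in `dna` for `replacement`."""
    position = dna.find(target)  # stand-in for locating the cut site
    if position == -1:
        raise ValueError("target sequence not found")
    # Keep everything before and after the target, and insert the new sequence.
    return dna[:position] + replacement + dna[position + len(target):]


genome = "TTACGGATCCGTTAGC"  # hypothetical stretch of DNA
edited = edit_sequence(genome, target="GGATCC", replacement="GGTTCC")
print(edited)  # TTACGGTTCCGTTAGC
```

Only the targeted letters change; the sequence on either side is left exactly as it was, which is what "precise, controlled edits" means here.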

Problems With CRISPR

While CRISPR-Cas9 is a powerful gene editing technique, it has several serious limitations that scientists are still working to overcome:

  • Hard to scale. It’s difficult to edit DNA in large numbers of cells at once, which severely limits CRISPR’s usefulness as a treatment for widespread or systemic issues throughout the body.

  • Not 100% effective. Even cells that are targeted by CRISPR may not end up with the intended edit.

  • Not 100% precise. Though rare, CRISPR can affect genes other than the intended targets. These unintended edits could, in theory, cause the sorts of harmful mutations that CRISPR-Cas9 is intended to correct—for example, an uncontrolled mutation could cause cancer.
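The precision problem can be pictured with a small Python sketch. It assumes, purely for illustration, that a near-match differing by one letter is close enough to be cut by mistake; the sequences, the one-mismatch threshold, and the helper names are all hypothetical and are not drawn from how real off-target prediction tools work.

```python
def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_candidate_sites(dna: str, target: str, max_mismatches: int = 1):
    """Return (position, sequence, mismatches) for each close match to target."""
    sites = []
    width = len(target)
    for i in range(len(dna) - width + 1):
        window = dna[i:i + width]
        mismatches = count_mismatches(window, target)
        if mismatches <= max_mismatches:
            sites.append((i, window, mismatches))
    return sites


genome = "TTACGGATCCGTTAGCAAGGATGCTT"  # hypothetical stretch of DNA
sites = find_candidate_sites(genome, "GGATCC")
print(sites)  # [(4, 'GGATCC', 0), (18, 'GGATGC', 1)]
```

The exact match at position 4 is the intended target, while the near-match at position 18 is the kind of look-alike site where an unintended edit could occur.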

Germline Editing: Inheritable Changes

Mukherjee says that gene therapy currently only affects the person it's performed on and doesn’t get passed to that person’s children. However, it’s theoretically possible to create a human embryo using genetically modified stem cells, if those stem cells can be converted into gametes (sperm and eggs). While that should be possible—stem cells should be able to turn into any type of cell—the technique is still unproven.

But if scientists could create genetically modified embryos in this way, it would mean all of that person’s cells, including his or her gametes, would carry the modifications. Therefore, those changes would be passed down to any children the person had. At that point, Mukherjee says, we would have gone from editing a person’s genes to editing a person’s genome; in doing so, we’d have created an entirely new type of organism, and potentially changed the gene pool forever.

(Shortform note: The first, and so far the only, known instance of germline editing happened in 2018: Biologist He Jiankui used gene-editing techniques on human embryos with the goal of creating people who were immune to HIV. Three gene-edited babies were born from He’s work. While those three people are apparently healthy children today, many scientists agree that He crossed an ethical line by performing the procedure on people who couldn’t consent to it—meaning both the embryos and any children they might have in the future. Those scientists also argue that the children could suffer unintended side effects, such as harmful mutations or cancer. He served a three-year prison sentence for violating medical regulations, which ended in April of 2022.)

Currently, germline modifications (changes that will be passed on to future generations) are illegal in many countries, and Mukherjee says that’s a wise policy for several reasons. On a practical level, scientists’ understanding of genomics is still limited; we simply don’t know enough about how genes interact with each other and with environmental factors. That means that even a seemingly beneficial change to a gene could have unforeseen and devastating consequences.

Furthermore, on an ethical level, genetically modifying the human race raises uncomfortable echoes of eugenics and the Holocaust, and poses many difficult questions. For example, should we engineer away undesirable traits if we can do it without killing living people, or would that be giving medical treatment without consent? Should parents be able to choose what traits their children will have, thereby creating “designer babies”? If we’re able to “improve” the human genome, would that change what it means to be human? In other words, would people who don’t have those changes be considered somehow less than human?

These are deep moral questions without easy answers, but they’re questions that Mukherjee believes we’ll have to face before we push genetics much further.

Counterpoint: In Support of Germline Editing

Dr. John Harris—one of the world’s leading authorities on bioethics—believes that scientists have not only the right, but the moral responsibility to use germline editing to prevent genetic diseases and thereby reduce human suffering.

Harris argues that there’s no evidence backing up concerns about side effects, and that those who talk about human rights, dignity, and personal identity can’t adequately explain how germline engineering will infringe on any of those things.

Furthermore, Harris points out that many people argued strongly against in-vitro fertilization (IVF) using many of the same points: that it’s unnatural, that there could be side effects, or that people born from IVF would be inherently different somehow; not quite human. However, today, IVF is a widely used and accepted treatment for infertility—the children born from IVF are perfectly healthy, and there’s no sense of exclusion or “otherness” because of how they were born. Harris believes that someday the arguments against germline editing will prove to be just as unfounded as those against IVF were.
