This article is an excerpt from the Shortform book guide to "Everybody Lies" by Seth Stephens-Davidowitz. Shortform has the world's best summaries and analyses of books you should be reading.
What are the most memorable quotes from Everybody Lies? How can these quotes help you understand data science?
Everybody Lies, by Seth Stephens-Davidowitz, is about big data’s potential to revolutionize social science research. The book’s central premise is that people reveal more about themselves when making web searches than they would ever reveal in public or in a traditional survey.
Read on for a few Everybody Lies quotes that explain Stephens-Davidowitz’s argument.
Everybody Lies Quotes
Stephens-Davidowitz has a Ph.D. in economics and has worked as a data scientist at Google and as a contributor to The New York Times. Everybody Lies draws on his research using Google search results as well as data from Pornhub, Wikipedia, and more. Though the book contains many surprising and interesting findings from this research, its real purpose is to explain the benefits and drawbacks of big data research. Stephens-Davidowitz says he hopes to inspire readers to enter the field of data science, much as Steven Levitt and Stephen J. Dubner’s Freakonomics inspired him to do the same.
Here are three Everybody Lies quotes with explanations.
“The next Freud will be a data scientist. The next Marx will be a data scientist. The next Salk might very well be a data scientist.”
In addition to improving our natural intuition, data studies can make the social sciences more rigorous. Stephens-Davidowitz notes that traditionally, there’s a divide between hard sciences (such as physics and chemistry) and soft sciences (such as psychology and sociology). That divide boils down to differences in method and types of evidence, with critics accusing the social sciences of advancing theories that can’t be falsified.
Stephens-Davidowitz gives the example of Freud’s theories of sexuality, which Freud based on his own observations and interpretations rather than on experimental evidence. Stephens-Davidowitz shows how Google and Pornhub search data let us test these previously untestable ideas (he finds no evidence for Freud’s claim that phallic symbols in dreams reveal latent desires; on the other hand, he finds a surprising number of searches for parent-child incest videos, suggesting some truth to Freud’s Oedipal theory).
“I am now convinced that Google searches are the most important dataset ever collected on the human psyche.”
Google searches and other internet activity reveal truths that might never come out in traditional data-gathering methods like surveys. For example, Stephens-Davidowitz shows that in states whose laws oppose gay marriage, the percentage of self-reported gay men is much lower than the estimated average across the whole population. But, he says, if you look at searches on Google and porn sites, the percentage of male users looking for gay porn (or asking how to tell if they’re gay) is much closer to that average. Also, the percentage of gay men as defined by search results is roughly stable from state to state.
This suggests that search data is a more accurate—and honest—measure of gay male sexuality than traditional surveys. Similarly, Stephens-Davidowitz says that search results reveal truths about all kinds of topics that we have an incentive to lie about or hide in real life, such as:
1) Sexuality: Stephens-Davidowitz says that, in addition to the data on gay men, search results cut against common stereotypes about sexuality. For example, women are just as likely to ask Google why their husbands or boyfriends don’t want sex as men are to ask the same about their wives and girlfriends.
2) Prejudice: Stephens-Davidowitz argues that the prevalence of searches for racist terms and phrases reveals that there is a lot more explicit prejudice (as opposed to unconscious bias or systemic inequity) than traditional surveys suggest.
3) Child Abuse: During the 2007-08 financial crisis, experts predicted a rise in child abuse and neglect, only to be surprised by a downturn in reported cases. Stephens-Davidowitz shows that searches like “my mom beat me” went up in heavily affected areas, suggesting that abuse and neglect increased but that cases went unreported or uninvestigated because of reduced resources.
4) Abortion: Stephens-Davidowitz explains that searches about self-induced abortion are more common in states with restrictive abortion laws.
“If you can’t understand a study, the problem is with the study, not with you.”
Stephens-Davidowitz argues that when used well, data science is an extension of our natural intuition (though its findings often defy our intuitive expectations and assumptions). He says that one of our basic activities as humans is spotting patterns and cause-effect relationships to make predictions. Good data science, he says, is just an expanded, more rigorous version of this activity. However, data science has two advantages over our natural intuition: It can consider much bigger sample sizes, and it doesn’t get distracted by compelling stories.
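To make the sample-size advantage concrete, here is a minimal Python sketch. It is not taken from the book. It uses the standard formula for the standard error of a sample proportion, sqrt(p(1 - p) / n), and the 5% proportion and the three sample sizes are hypothetical values chosen only for illustration. The formula also assumes independent random sampling, which real search data may not satisfy, so treat the output as a rough guide to how uncertainty shrinks as samples grow.

```python
import math


def standard_error(p: float, n: int) -> float:
    """Standard error of a sample proportion p estimated from n independent observations."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return math.sqrt(p * (1.0 - p) / n)


# Hypothetical proportion and sample sizes (assumptions, not figures from the book):
# a small survey, a large survey, and a search-log-scale dataset.
ASSUMED_PROPORTION = 0.05
for n in (1_000, 100_000, 10_000_000):
    se = standard_error(ASSUMED_PROPORTION, n)
    print(f"n={n:>10,}  standard error={se:.6f}")
```

Because the standard error falls with the square root of the sample size, a dataset 100 times larger gives an estimate roughly 10 times more precise, all else being equal.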