SAHARA BYRNE: Good evening, everyone. My name is Sahara Byrne. And it is my great pleasure to welcome you here tonight for the Annual Distinguished Lecture in the Social Sciences. This series was started by Dan Lichter, who-- Dan, could you please stand. I saw you somewhere. Dan Lichter right here.
Dan was the director of the Institute for Social Sciences. So thank you, Dan, for your years of service in social sciences and for bringing this important event here to Cornell. Thank you so much.
So tonight's lecture is now sponsored by the new Cornell Center for the Social Sciences. The CCSS was established earlier this year by the provost. Peter Enns, who is sitting right there, and I were named as the inaugural co-directors just this summer. I'm a faculty member in the Department of Communication in CALS. And Peter is a faculty member in the Government Department and the director of the Roper Center for Public Opinion Research.
Together Peter and I have been working closely with Chris Wildeman and Emmanuel Giannelis in the Office of the Vice Provost for Research to envision a center that will enhance the social sciences all across Cornell. We'd like to take this opportunity to thank Provost Mike Kotlikoff, Chris, and Emmanuel for their support and enthusiasm for the new center. We're developing a huge variety of new initiatives, and we look forward to announcing those in the coming months.
Peter and I would also like to quickly thank everyone at CCSS for their help in planning the vision of the new center. And, of course, tonight's events could not have happened without them. Anneliese Truame is the CCSS administrative manager. Megan Pillar is the program and communications coordinator. And our incredible student staff members: Tiamen Montgomery is right here. Karen [? Zwan, ?] who I believe is out putting signs up to direct you to the reception. And Laura, you, right here.
So after this talk, we have copious amounts of food and drink-- if you're over 21, lots of glasses of wine and beer-- in the Statler Ballroom. And there are many white signs that say Annual Lecture Reception in the Statler Ballroom. And you don't need to leave the building. The signs will lead you there. And I believe our two student staff members and all of us will be leading the way over there too, so you can quickly get through the building. So please join us over there to help us eat all this food and drink all this drink and to talk about the work you're about to hear.
And now, Peter Enns will introduce our speaker.
PETER ENNS: Thank you, Sahara. Before I introduce our speaker, I want to note that after the talk there will be an opportunity for audience members to ask questions. And then, as Sahara just mentioned, we'll adjourn, and everyone is invited to the reception.
We're thrilled to have Professor David Lazer for this year's Cornell Center for Social Sciences Distinguished Lecture. Perhaps fittingly, Professor Lazer is a distinguished professor in the Department of Political Science and the College of Computer and Information Science at Northeastern University. His research focuses on the nexus of network science, computational social science, and collaborative intelligence, with applications to some of the most important social and behavioral issues of our time.
What I want to impress upon you is how field-defining Professor Lazer's research is. The field-defining nature of his work is easy to identify. It's made evident by hundreds of publications in the most prestigious academic journals across multiple disciplines, made evident by the fact that his research has been cited more than 16,000 times and that he has collaborated with 125 different co-authors, and made evident by all of you, who came to hear his talk today.
But I also want to highlight some of the reasons I think Professor Lazer's work has become field defining. First, he identifies critical social issues and questions before most people are even aware they exist or need to be studied. These are issues like DNA in the criminal justice system, fake news on social media, or how technology influences social relationships.
He then considers these issues from a novel, even radical perspective, studying the issues with theoretical insight and methodological rigor at a massive scale. For example, to better understand the nuanced effects of unemployment, he and his collaborators linked cell phone call record data to unemployment data at the individual, community, and province level for a 15-month period. To understand how web search engines tailor their responses to individuals, he used GPS coordinates to see how location influences what information search engines provide.
Professor Lazer also stands out for the ways he supports others. In addition to extensive collaboration with students and postdocs, he supports those outside his personal network. One example comes from Volunteer Science, which Professor Lazer founded. Volunteer Science is a web laboratory that accelerates behavioral research by creating a community of active and engaged participants, lowering researcher costs and promoting citizen engagement with interest in and attention to scientific research.
Not only does Professor Lazer identify big problems, study them in rigorous and radical ways, and support the research of others-- he does this at a scale and frequency that is unparalleled. Simply put, the breadth, depth, and impact of Professor Lazer's research is awe-inspiring. I can't wait to hear what he has to teach us tonight. Please join me in welcoming David Lazer.
DAVID LAZER: Well, thank you for the all too generous introduction. And I want to thank you all for having me here. This has been a wonderful day. And actually, it's just postcard perfect out there in Ithaca. So thank you for that as well. Our timing was perfect.
So I'm actually going to begin with a brief digression. I hope you'll forgive me. My mother actually went to Cornell many years ago. This is her yearbook. And a picture of her from a yearbook.
And the story I want to tell, a brief story I want to tell, involves her life a few decades later when my grandfather was diagnosed with prostate cancer. And this is a picture of my mother about that time. And she went in with my grandfather to talk to the doctor. I don't actually have a picture of the doctor. But in my mind's eye he looks something like this.
I should note this is a very distinguished doctor. He was recommending a surgical approach, a radical surgical approach, for my grandfather, which distressed my grandfather, unsurprisingly. And my mother pointed out that there had been a randomized controlled trial comparing hormone-based approaches with surgery that had come out a few years prior.
The doctor did not take this suggestion of my mother's well. You know, she was not a doctor, not a medical doctor. She was not a man. She was short in stature and strong in conviction. She was also an MIT-trained economist who understood the power of an RCT.
And in the end, they walked out and my grandfather did not have the surgery. And the doctor, his closing comment was, this is why lay people should not be allowed to read medical journals.
Now, I think there are a number of takeaways here. One is how much the informational ecosystem has changed, because the amount of information you can access about, let's say, prostate cancer is just radically different from what you could have accessed 50 years ago. I'm sure this doctor did not encounter many people who were coming in critiquing the methodology.
It also highlights the role of mediation between people and information, and in particular the privileged role that experts used to have and the power that came with that mediation role. And part of what we need to think about is how our information structures have changed radically, changing that mediation between people and the information that's relevant to them.
It also, by the way, highlights the role that ego perhaps plays. The surgical procedure is actually named after the doctor, which is interesting. Which is also to say, by the way-- even though, the way I tell it, I should note my grandfather lived for another 25 years and did not die of prostate cancer-- that there isn't an objective reality here of what the right thing was. There are different elements of expertise.
There's just one person, my grandfather. And we don't know what would have happened in his case. We don't know sort of what the odds were. There were different bases for expertise and knowledge. Hers was around research methods and thinking about what RCT means and what the power of that was. His was informed in part by clinical practice.
But that, in some ways, is also a metaphor for how democracy works, which is that there isn't necessarily an objective truth about exactly what works and what does not work. So that brings us to "Democracy, today," such as it is. And I want to ask you a question. Does Finland exist?
And the reason I ask is that there is this very well-developed Reddit thread debating this question. It began actually with a spoof post on Reddit, but then there's a vigorous debate that some of the people clearly take pretty seriously. And the core of the argument is that after the war between Russia and Japan in the early 20th century, there was a settlement that gave Japan special fishing privileges in the Baltic Sea, and the area that we think of as Finland is really sea, and there are Japanese fishing boats there. And there is this vast conspiracy involving forged GPS and satellite imagery and so on. And the clincher is that no country in the world could possibly be that good. So the reality of Finland is just utterly implausible.
Now, I'm going to get back to Finland in a moment, but like generally when we think about democracy, I think there's some premise that we have choices, meaningful choices, and that we have the knowledge to understand the consequences of those choices. And we can debate this, but this is sort of the premise in some ways of the talk that we know something about the world, we make some choices informed by what we know about the world.
And that's layered on top of the structures of democracy. And there are, of course, many structures, whether we're talking about the electoral college, who's eligible to vote, or how far people are from voting stations. But, today, I want to talk mostly about the informational structures-- how we find out about our choices and the consequences of those choices. Voting for the presidency is one of the key choices we have, as we've seen in the last few years.
So I would view knowledge as a network phenomenon, meaning that-- there's so little that we really know, right? I mean, how many of us have actually seen Finland? Raise your hands if you've been to Finland. More than most crowds. But still, the vast majority haven't been to Finland.
Now, I'm not going to ask for a show of hands for who does and does not believe in Finland. But I'm going to guess that most of you believe in Finland. But you don't have direct knowledge of Finland. But you do have some knowledge of Finland. And that knowledge is derived from the people we know, the sources we trust.
But we do have diverse understandings of the world. There are the people out there who don't believe in Finland, and they get to vote too. And democracy in some ways is what happens in the spaces in between those. If we all believed and thought the same thing, then democracy would be pretty simple. We would just have unanimous results in our elections, which we don't.
So why do I believe in Finland? You know, I've seen it on maps. It's been mentioned in history books and newspapers. And I co-authored with people who claim to be Finnish. And it seems far more plausible that Finland exists than to believe that there's this vast conspiracy to fool me and fool everyone that Finland actually exists. My guess is that you all have similar experiences as to why you believe in Finland.
But, you know, Finland exists-- and the reason you believe that, the reason you know that, is that you've been subject to this network of informational influences ever since your childhood saying Finland exists, Finland exists, Finland exists. You've seen it on a map. You see references to Helsinki. You've met people from Finland and so on.
And, in all likelihood, you can't even identify why you believe in Finland. It's just been in the air. And that's actually pretty typical of how we all know things that aren't immediately accessible to our perception.
So when we think of democracy and the structures of information, I think there are two key elements-- these aren't the only elements-- to think about in all the bits of information out there: mediation and curation. And I have a picture of a librarian there to capture a sort of canonical role in the mediation and curation of information. And those informational structures, mediating between us and the realities of the world, have changed dramatically over the last generation.
And part of what I want us to ponder today is, what are the meanings, and what are the choices we have about the shape of our democracy in the 21st century? So I want to begin by looking a little backwards. I'm going to do this quickly. I'm going to present some stylized facts that are approximately correct, but not perfectly correct, just for economy of time, to talk about one of the systems that still shapes how we learn about the world: the 20th-century media system.
And there are a few elements here that I want to highlight. One was the emergence of professional standards of reporting. That really emerged in the early 20th century-- some norm of objectivity, of trying to report how things really are in the world. That was not the case, let's say, in the 19th century, where, at least in the US, early newspapers were often more partisan organs than anything else.
We talk of attentional economies today-- that is, Google and Facebook monetize our attention. But, of course, newspapers did and do that too. In fact, advertising was what they relied on to exist. You flip a page, you see an ad, and people paid for that because your attention on that ad was worth something.
And there was very limited regulation, especially on the airwaves. There wasn't much regulation of print.
And there were economies of scale, so that having the infrastructure to be able to print and circulate and report and so on required some substantial fixed costs. So the emerging characteristics of this were local news oligopolies and national ones-- so if you were in Boston, there would be a couple of papers in Boston, there would be a few local news shows, and then, of course, nationally there were the news networks. And that's sort of roughly what the news ecosystem looked like 20, 30 years ago.
It meant that there was very limited catering to local tastes. There's some research suggesting that news media tend to tilt ideologically a bit towards their local audiences, but more or less, because these were small numbers of competitors, they were all shooting for the general audience and were all pretty mainstream. Again, there are some exceptions.
There was gatekeeping by a pretty homogeneous class of people, both in terms of educational background and socioeconomic status. This is a picture of The New York Times newsroom, circa 1983. And diversity meant wearing a red cardigan. Otherwise, you have middle-aged white guys, which is fine. You know, that's my tribe. But maybe they were missing some elements. I want to be careful about how much nostalgia we have for the ecosystem status quo ante.
I'm going to talk about the vulnerabilities to manipulation of social media and the like. But it's important to note that mainstream media are highly vulnerable to manipulation as well. There are vastly more people involved in public relations than there are reporters-- on the order of tenfold. And their jobs are to manipulate what appears in mainstream media.
There's this wonderful government report that Britain published-- they have a nice bureaucracy that issues reports, and this one became public decades later-- talking about their manipulations of US media during the early 1940s to sway public opinion. Like giving reporters stories or trying to manipulate the finances of some of the major newspapers at that time. And all with substantial success.
And so when we think about potential foreign interference in the news ecosystem today, we have to realize that this is not new. The mechanisms have changed, but the old systems were not invulnerable. We might think that professional standards and so on guard against some of these things. But some of those standard operating procedures actually make the news media subject to manipulation, just as we saw, let's say, with WikiLeaks and the provision of certain kinds of newsworthy information that then required reporting in the last election cycle.
So there are definitely still vulnerabilities to certain kinds of manipulation, as well as bargains for access rights-- there's this standard network of relations between the elites who are being reported on and the news media. And all of this is to say-- maybe it's a little bit of a critique of mainstream media, but let's also say that's the way it's been. And that's actually, in certain ways, the way it still is with the news ecosystem, with some strengths and some weaknesses as well.
That model has been in decline. That advertising-reliant model has been undermined by more efficient systems of advertising, whether it's Craigslist or Google or Facebook. And so the resources have been sucked out of many of these canonical media. If I updated the numbers here right up to yesterday, they would look even worse. But newspaper circulation dropped 30% between 1990 and 2012, even though our population grew substantially during that period of time. Employment in newspapers has dropped 40%. Ratings for network news have dropped by more than half.
And trust in the media has plummeted, but especially on the right. I mean essentially around the year 2000, around half of the Democrats and half of Republicans trusted the media. And if we fast forward to 2016, basically it's still half of Democrats. But almost no Republicans trust the mainstream media anymore. And this is based on, I think, Pew survey data. So there's been a complete cratering of trust in the media on the right in particular.
At the same time, we've seen the rise of a synergistic and/or parasitic system. I'll call them-- I need to come up with a better term-- the meta mediators. This would, I think, qualify as the worst movie title ever.
And what do I mean by meta mediators? Well, in some sense newspapers and the television and so on are telling you about the world, because you're not directly experiencing that. So they're mediating between you and, let's say, what's happening in Syria or what's happening in the Ukraine and what's happening in Washington.
But now we have Google and Facebook and other social media that are providing pathways to those media. So you find out about a New York Times article or a CNN story through Twitter or because someone shared something on Facebook. They're even mediating your relationships with your friends, which is in some ways mind-blowing. Of course, telephone companies in some sense mediated your connections with other people in a very narrow, technical way, because the connection ran over their wires. As a kid, you know, we'd call someone, and the telephone company was standing between you and the other party.
But there wasn't that degree of agency in the actual infrastructure, in choosing what you saw. If I called my cousin, I would just get on the phone, dial the number, and hopefully my cousin would pick up.
Now imagine if the phone company did something like this: maybe today isn't a good day for you to talk to your mother, so the phone is not going to ring. Can you imagine if Ma Bell-- if your telephone company-- said, not good for you to talk to this person today? Of course, they do that with spam, but imagine them evaluating this friend, this familial relationship, and saying, not a good day.
But Facebook, Facebook does that. Your Facebook may literally be saying, no, it's not good for you to see that post from your mother today. And so there is this sort of agency in the structure-- and I'm using the word agency liberally here, because it's sort of funny to use it that way-- that is choosing, for some reason, for you to see this post from this friend and not that post from that friend.
But first let me just talk about the rise of these meta mediators, by which I mean search and social media. And these are data from Pew. Around 2/3 of Americans report sometimes getting news on social media, by which they usually mean real news, often mainstream news about the world. This is from a couple of years ago, so these numbers are probably a little higher today. And Facebook, YouTube, and Twitter seem to be the top three in that regard.
Search is particularly important. In this one survey, almost 2/3 of people report using search engines to search for news online more than once a day. So search in particular is incredibly important, although we don't understand exactly what the psychological import of these things is.
But in any case, the empirical reality is that we're all heavily reliant on Facebook and search and so on to access news. But typically that news is then other institutions that are producing news.
So then the question is what is the emergent logic of this hybrid media system? And I have five points. So this isn't meant to be like comprehensive. Like there are other things to say. But I think these are some important things.
The first thing is just sort of obvious. I'm going to belabor it. But we have a lot more choices for news than, let's say, when I was a kid or when I was in my 20s. Like in the 1980s, you have the local paper. Maybe you subscribe to a weekly magazine like Time or Newsweek. You could watch local news or the network news. But that was pretty much it.
And, today, I think we all access vastly more options and have vastly more options, because the internet pretty much makes it costless to access anything. So I think that's sort of obvious.
But then concomitant with that are a number of other things that are less obvious. One is that there's increased concentration of the media, I think in part substantially driven by this hybrid nature with the internet.
There's an interesting permeability that social media have introduced into mainstream media-- those 20th-century media pay attention to social media, and that then seeps into reporting. And that's the synergistic element.
And then there's the question of polarization and vulnerability, whether the media sources today support polarization of various kinds and whether they're vulnerable to manipulation. So I'm going to go through these last four in a little bit more detail. I'm going to talk about concentration.
But first, I'm going to do a little digression on an experiment that my collaborators and I did a few years ago, which I think is on point. It was a marketing experiment. We were working with a large telephone company, Telenor, which operates cell networks in 13 or 14 countries.
And we were working with them in one of their Asian markets. And they wanted to do a marketing experiment on the circulation of discount codes for data. The beautiful thing about this, for studying the spread of stuff in social networks, is that they actually have a pretty good picture of the social network of an entire country, because they can see who calls whom and who texts whom.
And so the idea here was to provide something of value that could circulate in the population among their users. And so what they did was circulate 70,000 discount codes. These were unique alphanumeric codes that each provided 15 megabytes of free data. And you could then share that code with other people.
And so the question was, how were these 70,000 codes going to spread in the population? And this is something of real potential value within this particular market. A lot of people would find this valuable.
So each code was different in terms of its alphanumerics, but identical in terms of the value provided to the user. And the codes could be shared, during the term of the experiment, which lasted two weeks, with an unlimited number of people. There was no cap.
Now, the metaphor I want you to think of here is that, in some sense, these are 70,000 ideas that we have deposited with people, ideas that have value to other individuals. They just vary in their starting points in the social network. And the question is, how do they spread? What are the processes of spreading?
And so what happened? Here were our two core questions, and I'm going to ask you to think about each of them.
The first focal question is, what is the distribution of spreading among these codes? There were 70,000 codes. Some of them are going to spread a lot, some of them not as much. So what's the inequality in the spreading process amongst these 70,000?
And the second question is what's the temporal nature of the spread? Like does it sort of start slow and pick up speed and so on?
So to develop this a little bit more, the question I want you to think about here is, of those 70,000 codes, what proportion of adoptions will be for the top 10% of codes? That's 7,000 codes. It could be that spreading is entirely from the top 10%. And it couldn't be lower than 10%-- the top 10% has to account for at least their share. So somewhere between 10% and 100% of all spreading is accounted for by the top 7,000 codes.
And then think about the same question for the top 1% of codes, and for the single top code: what proportion of adoptions will they account for?
And then the second question is, what is the likely rate of spreading? We could imagine, in one kind of world, that it just spreads the same amount every day-- it's sort of linear, and whatever the rate of spread is on Day 1 is the same as Day 2 is the same as Day 3. We could imagine it starts fast and slows down. Or you could imagine more of an S-shaped curve, where it starts slow, but then those two people tell those two people and so on.
And I'll tell you, the prior expectation of most social network scholars would be, and certainly my expectation was, that it would be S-shaped, because you spread it to a bunch of people, and then they start telling lots of other people, and it starts taking off, right?
Well, to summarize the elements of the experiment, there were almost 900,000 adopters of these 70,000 codes. And then the company said, that's enough. But 94% of the codes were never adopted. 5% of the codes were adopted once. And about 1% of the codes were adopted more than once. The top code was adopted by around 80% of all the adopters. And then the next code was adopted by like 10% or 12%, and the next code 5%.
And so there was this unbelievable inequality in spreading: 1 out of 70,000 codes, all of equal value, accounted for 80% of the spreading. This was not at all what I expected. To be fully frank, I thought there was a screw-up in the data collection. And I went back to the company and said, you screwed it up.
And then they showed me the next thing, which further contradicted my intuition: the spreading was linear. And I can tell you, diffusion is one of the most studied things ever in the social sciences. If there appears to be bumpiness, it's a diurnal bumpiness-- people do not adopt in the middle of the night. But other than that, it's almost identical numbers of people every day.
And I said linear spreading? This has not occurred in any of the thousands and thousands and thousands of papers on diffusion. But it turned out that what was broken wasn't the data. It was my expectations. And that there was a very simple answer for why this had occurred, which was that someone had posted these codes online and that there was a background level of search.
Like, how many people here have searched for discount codes? Raise your hands. OK. It's a funny thing, by the way-- I've presented this in Europe, and no one raises their hands there. So this is a bit of a culture-specific thing. But in the US, people search for deals. And, I guess, in that country as well.
And so the thing that drove this was that there was a pretty steady background rate of searching for discounts. So there's no reason why Day 3 would have way more searching going on than Day 5. And so that's why it was pretty linear.
And there were a bunch of reasons why we realized this was the case. One was that we found these codes being posted online. Another is that for the ones that were shared a ton, there was almost no spatial component, whereas there were a bunch that were shared a little bit, and those were all very geographically concentrated in people's social networks.
So there was an asocial spreading and there was a social spreading. I was thinking this was going to be a social process, and so I was thinking there was going to be this nice S-shape. But instead it was an asocial process-- it was Google, or whatever search process was occurring. And the reason there was a winner-take-all system was that it's a lot better to be number 1 on Google than number 2.
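The contrast between the S-shaped curve I expected and the linear curve we observed is easy to see in a toy simulation. To be clear, this is not the actual Telenor analysis-- the population size, contagion rate, and number of daily searchers below are all invented for illustration:

```python
DAYS = 14          # the experiment ran two weeks
POPULATION = 100_000

# Model 1: social contagion -- each day's new adoptions are driven by
# existing adopters telling their contacts, which produces the classic
# S-shaped cumulative curve (here we see its early, accelerating part).
def social_curve(rate=0.6, seed_adopters=10):
    curve = [seed_adopters]
    for _ in range(DAYS):
        adopters = curve[-1]
        new = rate * adopters * (1 - adopters / POPULATION)
        curve.append(adopters + new)
    return curve

# Model 2: on-demand diffusion -- a steady background rate of people
# searching for discount codes, independent of how many have already
# adopted, which produces a linear cumulative curve.
def search_curve(daily_searchers=500):
    curve = [0]
    for _ in range(DAYS):
        curve.append(curve[-1] + daily_searchers)
    return curve

social = social_curve()
search = search_curve()

# Daily increments: growing for contagion, flat for on-demand search.
social_daily = [b - a for a, b in zip(social, social[1:])]
search_daily = [b - a for a, b in zip(search, search[1:])]
print("social daily adoptions:", [round(x) for x in social_daily])
print("search daily adoptions:", search_daily)
```

The social model's daily adoptions grow each day because adopters recruit adopters; the on-demand model's daily adoptions are flat because the background rate of searching doesn't depend on who has already adopted-- which is exactly the signature we saw in the data.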
And the other thing I want to note is about the codes that spread to tens of people. These are our pictures of invasion trees. What does that mean? This is a kind of network picture where this was, let's say, the initial seed. Then we could see it spread to some of those contacts, and then it spread to some of those contacts, and so on.
Now, for something that spreads virally, that infects a large part of the population, invasion trees spread out. So they start out with a small number. But then it gets bigger and bigger.
But all these invasion trees are narrow and deep. What that means is that the spreading tendency of this was below what's called a critical threshold: these 70,000 codes would spread to a few people and then just stop, unless there was some other pathway to spread them. And so what was remarkable is that even though this was desirable, the tendency to share that information with other people was pretty low-- not high enough to sustain virality.
So what are the key takeaways here? I think there are actually some profound takeaways that have broader implications. One is that there is this enormous winner take all effect. Again, it's amazing, 70,000 "ideas," quote unquote, with equal virtue and one of them wins. It's completely arbitrary because someone got to this website and posted it first.
We call this on-demand diffusion, because it's really a novel kind of diffusion-- it's based on individuals searching for information, literally through search. And from those narrow invasion trees, we sort of have an idea of what the world would have been like without search, without Google: this is stuff that would have died out. It would not have spread through the population absent some channel to mass broadcast it. Now, it could be that in the status quo ante, maybe someone would have put it in a newspaper and a lot of people would have adopted it that way. But word-of-mouth spreading alone would not have been enough to spread the information.
But this is the profound thing that if there's a bit of information out there that is identifiable through search that is a value to people that are just sort of routinely searching for it, they'll find it, that the world is this unbelievable haystack and there could be that one needle and we can find it. And I'll just give you a little example here from a story years ago.
My wife had a watch, and she could not figure out how to change the time. I deal with tech stuff in the household for whatever reason. And so I was like, sure, I can fix this. I can set the time, right. It's just going to be find the mode and so on.
And a couple hours later, I was so pissed off, and I got on Google. And some woman had posted deep in some forum somewhere saying, my husband had this watch. He couldn't figure it out. He gave it to me. I worked on it for a few hours. It drove me crazy. I figured it out. And I know someone else is going to be driven crazy by this. So I'm posting the answer here. And I found it.
And, you know, this is so routine for us that we don't realize what a miracle it is: there was some woman somewhere in the world, I don't know whether in Australia or wherever, who had had this watch, who had come up with the answer before, posted it somewhere, and I found it, and I could change the time. It's like some weird sequence of button presses.
And that's a miraculous thing. Like this was an idea. She had figured something out and it spread to me. It was a kind of innovation. And that's something that's very special about today's ecosystem, information ecosystem. And search is changing all the time. And an increasing fraction of information of the world is going to come under this kind of regime as search technology becomes more powerful.
So sorry for the long digression, but this is one of the papers I've written that I'm most excited about, ever. And it's been cited twice, which just seems unjust to me. But it's also actually on point here, right, because the point is that we know search is important, but it's really, really important. It's about the tyranny of the list.
It is really, really good to be number one in Google. Most of our presentation of information in the world of these technologies is somehow linear. There's an order to it. You get on Facebook, there are some things on top. And then you can keep scrolling and scrolling and scrolling. But the thing on top is just much, much more likely to get your attention. The same thing goes for Twitter.
And, of course, I should note newspapers have their own ways of prioritizing. There's the front page, above the fold, and so on. But that prioritization of information is incredibly important. And part of the question in thinking about technologies like Google is what are the principles driving that prioritization.
And this is something I'm not going to show you a lot of slide on. But my collaborators and I-- and I'll highlight Christo Wilson, Aniko Hannak, Ron Robertson, among others, at Northeastern have been doing all this work auditing Google in particular.
And so, for example-- well, first of all, one interesting thing: Google does not particularly personalize what information you'll see. So if I search for something and you search for something, we're going to get the same thing generally. But they will geolocate certain kinds of things for commercial establishments. Like if you search for pizza, you'll find a bunch of pizza places in Ithaca, and right now I will as well since I'm in Ithaca. But if I go back to Boston, I'll get Boston pizza places.
But it won't do that for politics. So if I search for my member of Congress, we'll get identical results here, in Los Angeles, Boston, Miami. So there's an interesting question of why pizza is geolocated and politics is not, even though politics also has a strong geographic component. And that's a value decision. But in any case, the key, big takeaway here is the notion of concentration.
I want to show you just one thing here. This requires a little unpacking. We did a study, which I'm going to refer to in a few minutes around fake news on Twitter.
We came up with this way of-- we identified 16,000 people who we followed during the election. And we extracted all the domains that they tweeted about and were exposed to regarding news about the election. And so I'm going to show you the results of how much fake news they were exposed to.
But the other thing-- and this is actually pretty fresh-- is to think about how concentrated their exposure to different news domains was. And it turns out it was pretty concentrated. The axis here is domain rank on a log scale-- so 10 to the 0, then 10, 100, 1,000-- and the question is how much of all domain exposure was accounted for by the top domain versus the top 10 versus the top 100 domains.
And what we should read here, for example, is that the very top domain people were exposed to, which was The New York Times, accounted for roughly 8% of all news exposure for people in the country, which is a lot. And if we go up to 10, the top 10 sources accounted for roughly 40% of all news exposure about the national election. And the top 20 accounted for the majority of exposure to news during the 2016 election. That's pretty darn concentrated.
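As a sketch of the arithmetic behind a plot like that, with invented exposure counts rather than the panel's actual data, the cumulative share captured by the top-k domains is just a ranked tally:

```python
from collections import Counter

# Hypothetical exposure log: one entry per exposure to a news domain.
exposures = (["nytimes.com"] * 800 + ["cnn.com"] * 500 + ["foxnews.com"] * 400
             + ["wapo.com"] * 300
             + [f"site{i}.com" for i in range(200) for _ in range(10)])

counts = Counter(exposures)
total = sum(counts.values())
ranked = [n for _, n in counts.most_common()]  # exposure counts, largest first

def top_k_share(k):
    """Fraction of all exposures accounted for by the k most popular domains."""
    return sum(ranked[:k]) / total

for k in (1, 10, 100):
    print(k, round(top_k_share(k), 3))
```

A long tail of small sites barely moves the cumulative share, which is exactly what a concentrated curve of this kind shows.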
And I haven't produced the equivalent plots for how it would have been, let's say, 30 years ago if you were looking at newspaper circulations. But I can guarantee you it would have been a lot less concentrated. And I think that's in part because-- and we also have stuff on search and browsing; it's not quite as concentrated, but it's pretty darn concentrated-- it reflects the fact that if you have a national market for information, it's going to naturally drift towards reinforcing the top few. And I should note others have noted this; I'll highlight especially the work of Matt Hindman around Google. It's just that this logic has really been taking hold with a vengeance in the last decade.
Permeability-- and I want to give a shout out to a book from three colleagues of mine, Sarah Jackson, Moya Bailey, and Brooke Foucault Welles, Hashtag Activism: Networks of Race, Gender, and Justice, which is coming out from MIT Press shortly. And the neat thing about their book is it looks into a number of cases-- #Ferguson and Black Lives Matter are some of the examples they unpack-- where the news coverage of an event was in part informed by people on the ground who were tweeting about it, often fairly peripheral people who were very close to the event.
And one of the things I think has occurred with social media and this hybrid media system is that there is a sort of permeability between social media and media, where part of what the media is talking about is being driven by what's happening in social media. It also goes the other way. But this has created some degree of the periphery being able to reach into media coverage in a way that really wasn't possible a generation ago.
Polarization-- I'm not going to go deeply into this, but to note that one of the most influential hypotheses out there about the media is the filter bubble hypothesis. That is the idea that it's increasingly easy to select compatible sources of information online and that this will help drive polarization. On its face, the core arguments seem right.
Actually, though, it's sort of clear that they're substantially-- not totally, but substantially-- wrong, in part because, on the one hand, you do see more ideologically diverse sources of information; on the other hand, you also have this sort of centralization in certain ways, and there's just a lot of incidental information exposure. On social media, you may see a lot of compatible information, but you also incidentally get exposed to a lot of very diverse media-- in some ways likely a larger variety of information than you might have been exposed to a generation ago.
And even in our social networks, the research I've done-- multiple papers-- suggests that we do not choose our friends principally based on ideological compatibility. That said, one of the other things that's clear is that the people we could choose as friends are much more likely to be ideologically compatible with us, because there has been a geographic polarization around politics occurring over the last 40 years. But that 40-year pattern is not being driven by social media, obviously. It's being driven by other social forces grinding away in our country.
So, vulnerability to manipulation. I'm going to give you an example here, not from politics, but from health. There's a great deal of misinformation about health.
I mean, I think people are familiar with anti-vax arguments. But here I'm not talking about vaccines; I'm talking about apricot seeds and cancer. If you could see the top of the screen, you would see that the search was for cancer apricot seeds. One of the myths out there is that apricot seeds will cure cancer.
They don't, by the way. So don't start eating apricot seeds. They can actually kill you, because they contain cyanide.
But if you search for apricot seeds cancer, a number of interesting things happen. So, first of all, you see some products, like on the right, that you can buy. So immediately it pops up products.
This looks a little different now-- there's actually more of it. There's a lot of stuff on the side about buying things, which implicitly signals that this might be something worth buying, given that you're interested in it.
Now, there's a mix of quality of views here. There's something here debunking the myth, from Dana-Farber in Boston. On the other hand, there's this large excerpt that, if you read all of it-- if you go to the website-- says, yeah, this is going to kill you.
The large excerpt up here, interestingly, from this website, Medical News Today, actually says apricot seeds may have some health benefits, and some people suggest they may help fight cancer. That sounds pretty positive. But then a few sentences later it says, actually no, the science is pretty clear: it can kill you. But you wouldn't get that from the excerpt, which makes you wonder in part what they're aiming at with it.
But then you dig deeper, and there are some other elements here. You see an ad from Amazon, which is keyed in to apricot seeds cancer. You see apricot seeds cancer at Amazon-- oh, my gosh. And look at that rating: 4.7 out of 5. That's not bad.
If you click on some of the video options at the bottom, all the YouTube videos are apricot kernels kill cancer cells, seed cured his cancer, Jason Vale cured his cancer with apricot kernels. I'm not, well, cherry picking-- I'm not apricot picking here. This is just what immediately pops up on Google pointing to YouTube.
Here, if you go to Amazon: for years this has been deemed a cancer fighter, mostly in Mexico. I was diagnosed with stage 4 cancer and had been given-- anyway, for almost two years it has remained dormant. However, after I started taking this, along with other items I found at last on Amazon, on my last MRI doctor appointment my tumors are finally shrinking. And this is why this gets a 4.7 rating.
I should note, it's not just Amazon. If you go to WebMD, which is generally a pretty good source of information on health, apricot seeds also get something like a 4.6 rating, because WebMD has a crowdsourced part of their website on products. And so the pro-apricot-seeds folks have invaded WebMD. And WebMD is not doing anything to check the validity of these reviews.
Here's my favorite one, which is on NIH. You can go to PubMed, where they pull in publications automatically. And there's a publication from an actually very suspect publisher, which is highlighted, saying amygdalin-- the compound in apricot seeds-- blah, blah, blah, may basically be a great anticancer agent. So even the NIH has been invaded, through the back door of taking in suspect journals, to promote misinformation about apricot seeds and cancer.
All this is to say, to highlight, that there clearly is vulnerability in our crowdsourced and automated systems for information. Interestingly, NIH is in principle more of an elite kind of system, but it's all automated.
So I want to talk a little bit about fake news and then the issue of robustness. And then I'm going to pull it all to a point.
So we have these potential vulnerabilities. And one of the interesting places we've looked at in my lab is Twitter as a vector for misinformation. And prior research highlights the potential for Twitter to rapidly spread misinformation, in part because novel content is a lot more viral and fake news tends to be more novel. So this particular paper argued that misinformation spread faster, further, and to more people than true information.
In our paper, we wanted to evaluate the prevalence of fake news on Twitter during the 2016 election. And we defined fake news by its publishers: publishers that do not adhere to epistemic standards for news production. That is, they don't have the kinds of professional standards for news production that would be standard in the news media.
And so the way we did this-- I'm just going to give a very quick feel for it; I'm not going to go deeply into methods-- was that we matched Twitter handles to voter data. So we had a roughly representative sample of 16,000 voters matched to Twitter accounts.
And we looked at fact checkers. And we found certain publishers were repeat sources of misinformation. We called those publishers fake news-- Infowars would be a classic example. And then we evaluated how much fake news was being shared and how much people were being exposed to it. And we also looked at co-exposure levels.
And so again, we had these three categories of fake news-- black, red, and orange. But for now, they're all fake news. I can go more into this if people have questions.
What were our findings? Well, the first finding was that there was a non-trivial amount of fake news people were being exposed to: roughly 5% of the political content people were exposed to during the 2016 election was fake news by our criterion.
But it was very concentrated among a very small number of people. So again, this sort of gives a sense of the distribution. Blue is just general news; red, black, and orange are different flavors of fake news. The 0.1% most exposed people accounted for roughly half of all exposure to fake news. And if we get to the top 1%, the top 1% accounted for roughly 80% of exposure to fake news.
But then if you look at the sharing patterns, we found that in our sample of 16,000 people, 16 people-- 0.1%-- accounted for almost all the fake news being shared. Again, going from the 0.1% on up, roughly 80% of the fake news being shared came from these 16 people.
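The supersharer claim rests on the same kind of ranked tally, this time over accounts rather than domains. Here is a toy version with invented counts (16 hypothetical heavy sharers in a panel of 16,000), not the study's actual data:

```python
def share_by_top_fraction(counts, frac):
    """Fraction of all shared items accounted for by the top `frac` of
    accounts, ranked by how many items each account shared."""
    ranked = sorted(counts, reverse=True)
    k = max(1, round(len(ranked) * frac))
    return sum(ranked[:k]) / sum(ranked)

# Hypothetical: a handful of accounts post thousands of fake-news links
# each, while nearly everyone else posts one or zero.
shares = [2000] * 16 + [1] * 8000 + [0] * 7984
print(share_by_top_fraction(shares, 0.001))  # top 0.1% of 16,000 accounts
```

With a distribution this skewed, the top 0.1% of accounts carry most of the total, which is the shape of the sharing pattern described above.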
What was happening here was that these people were just flooding the platform with misinformation. They were just sharing a lot more in general-- also sharing regular news more. But they had a vastly higher share of fake news in what they were sharing than other people.
And so we asked, what's going on here? Maybe bots slipped into our sample. But it was pretty clear, for various reasons, that these were real people. We could see they had websites. These were people who had a big virtual footprint.
But then there were other clues that they were likely using automation to help them post a high volume of misinformation. And so we would call them sort of cyborgs. But the thing that made them potentially compelling is that they're actually embedded in real social networks. Bots may have a hard time getting real people to follow them. Real people are embedded in real social networks.
And this gives you a sense of the co-consumption patterns. Each dot is a publisher. The biggest one is The New York Times-- it's really much larger; dots are scaled so that bigger dots account for more exposure, though we haven't made it quite proportional. They're colored red or blue depending on how right- or left-leaning they are. And they're filled in dark if they're fake news.
And what we see here is that most of the exposure is occurring with these connected mainstream media sites. And then there's this very dense territory of fake news sites, where if you consume one of them, you consume others of them. But the takeaway, in some sense, is that fake news isn't really a broad systemic problem on Twitter. It's really a question of a very, very seedy, but small, neighborhood of Twitter.
And it does present an interesting puzzle. This is sort of, I guess, a glass half full kind of puzzle, which is, why isn't it worse? Like you have these people pushing all this misinformation. But then we don't see this vast amount of exposure.
And we don't really have the answer here. But it could be that people unfollow super sharers of misinformation. If you get exposed to one bit of misinformation, maybe you think it's true; but if you follow someone who's just cranking out this stuff, you figure out this is a crazy person, and you unfollow them. I don't know. But that would be a mechanism of robustness for Twitter: that people are actually good, given a large enough sample, at figuring out which are high quality sources of information.
Another hypothesis is that maybe the Twitter algorithm reduces the impact of super sharing. Maybe they say, you know, it's OK, that first tweet of yours gets promoted in timelines, but that hundredth tweet of yours of the day, no one sees. I don't know. We don't really know how the Twitter algorithm works. And we haven't audited it. And it's not so easy, actually, to audit for various reasons.
But this gets to the broader question of curation robustness in all the things we've talked about here. Like Amazon-- how are they curating what reviews get included? Yelp has its magical ways of trying to pick out reviews it thinks are fraudulent. How good are they at that?
How good are people at curating their own news feeds? You have friends on Facebook; you may unfriend people on Facebook. You follow people on Twitter; you may unfollow people on Twitter. So how effective is that combination of the technological and the social?
So this highlights a general challenge. In some sense, the dominant logic of the 20th century information ecology was this sort of artisanal curation-- you had editors and you had reporters-- with scalable circulation, especially locally scalable circulation, newspapers and the like.
And the dominant logic of the 21st century information ecology is crowd-based, very scalable curation, with globally scalable circulation, built with the objective of monetizing attention. And so the challenge that confronts us is, how do we make this latter system work for democracy? We need to think about how to build these systems with democracy as a goal.
And I'm going to walk very quickly through, and then conclude with, an experiment we did to try to rewire our democracy in a certain way. I'll point to our book, Politics with the People, which involved these online town halls, where we had citizens meeting with their members of Congress. So we had small groups of citizens randomly assigned-- this was a randomized controlled trial. We did this also with a larger group with senators. And these were real members of Congress, real citizens. These are the various members, and this is the senator who was involved.
And we tried to develop a well-designed way of getting people information, but also of structuring the interaction. So we informed constituents ahead of time, focused mainly on the issue of immigration. There was an authentic interaction with the member-- they could hear the member's voice, so they knew it wasn't just a staffer or someone. There was a neutral moderator, who was fair in distributing questions among participants. It was a diverse set of attendees, and it was focused, like I said, on that one issue.
And like I said, it was conducted like a drug trial-- a randomized controlled trial. We had people who didn't participate and people who did. We then did a pre-test, a post-test, and then a post-election survey.
And the results were really quite wonderful. The attendees were diverse. They were more representative of eligible voters than actual voters are. Think about that. That's in part because younger people were showing up to these.
They were high quality discussions by standard deliberative criteria. Participation was well distributed. There were huge increases in knowledge after the event. And there was also rational persuasion, in terms of movement towards the member's positions in various ways.
And so I'm highlighting this both because I think it was a successful experiment in democracy, but also because it gets to the notion of how we design systems with democracy in mind, rather than ultimately with the objective of channeling people to buy products-- which is ultimately the point of the design of, not the entire internet, but many of these systems, like Google and Facebook and the like. And also, is it scalable? Because it would have to spread through the social network.
So I want to conclude-- this is my last slide-- with a thought on the role of the university in the 21st century. I think the modern university does have increased burdens and duties to inform our democracy, to inform public discourse-- which has not actually been a traditional duty of a university. We need to strengthen our mediation and curatorial capacities for public access to knowledge, and we need to be engaged in producing knowledge that is relevant to public discourse.
Here I have-- this didn't get saved, and I have their names but don't remember them offhand-- the three grad students from West Virginia University. They had a project evaluating how Volkswagen did this miraculous thing with their diesel automobiles, that they could reduce emissions so much.
And what they uncovered was that, in fact, Volkswagen was lying and gaming the system. It has resulted in tens of billions of dollars of losses for Volkswagen, but also dramatically improved health for people who live in areas with cars. I think this is a nice example of the amazing role that a university can play. I'm not saying this is the only thing that should go on in a university.
But this was a very powerful, very consequential thing that these three grad students at West Virginia University did. And we should think about this as a model for how we might engage-- for example, in supporting local media. There was a wonderful article in The New York Times on how university newspapers are suddenly becoming the local newspapers for a number of communities where local newspapers have died.
But there's also the question of partnering with news media-- doing a good job, a better job, honestly, than just putting out press releases about science, and instead engaging in informing the public about science and about society. We need to take a more proactive view, in part because many of our other institutions are failing at this. And we actually have, arguably, a business model that is self-sustaining.
We have a lot of slack capacity-- capacity where students and faculty could put effort into this in a way that would help society. So I wanted to end on that note, because I think it's a prescription, generally, for the modern university. With that, I want to thank you for being a patient audience.
Before people leave I'm open to answering questions if people have questions.
AUDIENCE: Hi, thanks so much for coming. My name [INAUDIBLE]. I'm an undergrad here studying information systems engineering and government. So I'm really interested in this broad spectrum of work.
So I guess, while it's still very unclear how sincere these social media platforms are about combating fake news, one pressing issue that has frustrated both fact checkers and social media platforms is gauging the impact-- the metrics for measuring reductions in the impact of fake news.
So you touched on this briefly. I was wondering if you could talk more about how these platforms are looking into assessing how much effect they're having in reducing fake news. What would be some factors going into the metrics for that?
DAVID LAZER: Great question. So just let me restate it: obviously the platforms are putting in a lot of effort to tackle misinformation and fake news. And one of the conundrums they face is how to evaluate impact-- how do they measure whether their interventions are having an effect?
And I have a few thoughts on this. I do know the platforms are, I think, sincerely investing-- ever since roughly November 2016, they've been putting a lot of effort into this issue. And I think there are a few metrics. Of course, one of the challenges with metrics is that whenever you have a metric, there's the question of how close it gets to the real thing you care about. Because, honestly, we don't even know whether the problem is better or worse than it was 20 years ago.
We actually do have some measures of how informed people are. There are standard survey questions-- do you know your member of Congress?-- and science questions that have been asked over the years, and they suggest people are actually better informed than they were a generation ago.
But there's no easy way to ask how misinformed people are, because there are a lot of ways to be misinformed. And the ways we're misinformed today are different from the ways we were misinformed 10 years ago. So there are a few easy metrics, but then there are much harder metrics, which require a bit more work on the part of the platforms.
So an easy metric would be to take a representative sample of what people were exposed to a year ago, two years ago, three years ago, and evaluate what proportion of it was misinformation and what proportion was informative. And then you could say whether people are being exposed to less misinformation and more healthy information. That would be one metric, and a reasonable one.
Now, ultimately, though, we probably do care about what ends up in people's heads. And that's why you actually also have to ask people. Because you can imagine a situation where you decrease exposure and yet it has no impact, because the people who were looking at it were very skeptical, and now it's just funneling to people who are being persuaded, or what have you.
And so ultimately I think it does require a mixed approach. Some of it will be metrics of things you can measure-- clicks and what people are exposed to. But then you're actually going to have to interview people, survey people, and so on.
And that, interestingly, is alien to platforms. I've gone to Silicon Valley, and I've said, you've got to talk to social scientists. I also have a thing I say to social scientists: you've got to talk to computer scientists.
Ultimately, you have to think about what we really care about. And we really care about what's in people's heads. And just seeing what people are exposed to, what they click on is not a good way of getting at that.
It's a partial way. It gets you part way. And what you'd like to do is put the entire package together.
And part of what I would argue, though, is that if you have a trillion dollars in market cap, you can probably afford a few studies in the field. I really think the platforms historically have vastly underinvested in that. And so my advice to them-- the advice I've given to them-- is that they should do more of those kinds of in-the-field studies, in addition to seeing what people are exposed to. Though there's a lot you can do even without that-- I mean, we didn't survey people, but there's a lot you can do just with click data and exposure data.
But it's not the whole package. Facebook has a lot more resources than my lab, though.
AUDIENCE: Hi. I'm Soham. I'm a CS Ph.D. student. I have just a couple of questions about the online town hall experiment you were talking about. The first is, you mentioned that you randomly sampled constituents, but you also talked about making sure you had a representative sample. And then you said that the people who showed up were representative. Was it randomization within a representative sample kind of setting?
DAVID LAZER: So the way we did it was we used a company called-- well, back then it was Knowledge Networks; now they've ultimately been absorbed into Ipsos. But Knowledge Networks maintained a very high quality panel of subjects.
And so what we did was, let's say we had member of Congress X. We would take a sample of so many hundreds of people from that congressional district-- this high quality sample-- and randomly sort everyone into these three categories.
But then the challenge we confronted, and the first thing we had to deal with methodologically, was the fact that while you can invite people to the online town hall, you can't actually force them to come. This turns out to be a standard issue in medical research, for example: you have to model compliance.
So in our first paper out of this, we said, well, it's actually interesting: who showed up? We invite 100 people; 30 of them show up. What is predictive of showing up?
And it turned out that, on the demographic side, for most of the things that predict participation in politics-- being older, being wealthier, that kind of thing-- the sign in predicting participation went the opposite way. The only thing that went in the conventional direction was education.
And so that's what I mean: the people who might be less likely to vote, like younger people, were actually more likely to participate in these online town halls. So that was the basis of my statement.
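The reason you have to model compliance can be seen in a toy simulation, with all numbers invented (this is not the book's actual model): if attendance is self-selected on engagement, naively comparing attendees with controls overstates the town hall's effect, while the intention-to-treat contrast by random assignment stays unbiased but diluted by no-shows.

```python
import random
from statistics import fmean

rng = random.Random(0)
TRUE_EFFECT = 15  # assumed knowledge gain from actually attending

def person(invited):
    """One simulated constituent: engaged people are both more knowledgeable
    at baseline and more likely to accept the invitation (self-selection)."""
    engaged = rng.random() < 0.5
    attends = invited and rng.random() < (0.5 if engaged else 0.15)
    knowledge = 50 + 10 * engaged + TRUE_EFFECT * attends + rng.gauss(0, 5)
    return attends, knowledge

treat = [person(True) for _ in range(10_000)]   # invited to the town hall
ctrl = [person(False) for _ in range(10_000)]   # not invited

itt = fmean(k for _, k in treat) - fmean(k for _, k in ctrl)
naive = fmean(k for a, k in treat if a) - fmean(k for _, k in ctrl)
print("intention-to-treat estimate:", round(itt, 1))    # diluted by no-shows
print("naive attendees-vs-controls:", round(naive, 1))  # inflated by selection
```

The naive contrast exceeds the true attendance effect because attendees are disproportionately the already-engaged, which is why drug-trial-style compliance modeling matters here.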
AUDIENCE: I see. And my second question: you said that people moved closer to the member on these issues.
DAVID LAZER: That's right.
AUDIENCE: So do you mean that across all of the town halls you held, where people presumably had different positions depending on their party affiliation, there was a secular movement towards the members' positions?
DAVID LAZER: That's right, and across parties. And part of our argument-- there's an interesting question here-- we argued that this was evidence of rational persuasion. I mean, there's a question of whether people were being bamboozled and fooled by the member. But putting it together with the high quality of the discussion, we would argue this was actually evidence of high quality deliberation. It's a debatable point, but that was our argument. Yes.
AUDIENCE: Hello. My name's Sam. I'm an undergrad here, studying computer science and public policy.
DAVID LAZER: Great.
AUDIENCE: So my question is, what do you see to be the role of social media in the 21st century? And what steps do you think social media companies should take to be more conducive to a healthy democracy? If they even can be.
DAVID LAZER: That is an awesome question. And then there's another layer of question, which is we can imagine the hypothetical company, which is like let's make democracy work versus like companies that have shareholders and are trying to make money and so on. So there are two layers of elements here.
Like one is if I was in one of these companies and I cared-- I mean, I've talked to a lot of people from these companies. They have very capable people. They have teams that are devoted to civics and thinking about our democracy. I think a lot of these teams were, again, you know, created a little late. I mean, created in early 2017, as everyone was freaking out. So I think a lot of these companies realized, belatedly, that they're in the democracy business. That said, better late than never.
I think that there are a number of things. Honestly, actually, I'm less-- although at the moment I'm a little distressed about the state of our democracy, I'm actually less worried about their impact on US democracy than on other democracies that have less robust information ecosystems more generally.
And so my first recommendation honestly would be to invest in substantial capacity in the various countries that they're operating in to understand those countries, to understand those contexts, and to be measuring what's happening. What's the impact of those platforms in those countries? That these companies are rich enough where they should have a substantial team of people looking at what's happening in the Philippines or Indonesia or Myanmar.
And so these companies obviously have very much a Silicon Valley orientation. They at least begin with some naive but more than zero knowledge about the US-- not a representative sample, mind you. But I think if you asked Facebook or Google or whomever, how much capacity do you have in Sri Lanka? Do you have researchers who are experts in Sri Lanka? Do you have people monitoring what's going on in Sri Lanka to understand the local context? You'd actually find they have almost no capacity there.
And so, honestly, I mean, it's a little tangential to the main gist of my point, but watching what's happening, watching the impact, and reflecting on it is like step one. It's like step one is admitting you have a problem. Step two is learning about it.
And I actually think that, much as we have our problems here right now, there are much bigger concerns elsewhere, and that there's a lot more ignorance and a lot less worry-- I mean, we're very worried about the US, because we're such a huge market. There is a lot less concern in other markets. And, in fact, we may end up imposing certain kinds of solutions that are US-oriented that are not appropriate in other contexts.
And so that's ironically my biggest concern. It highlights also, when we think about regulatory solutions, how we think about this in a global way. And I've studied global regulation. And I can't wrap my mind around this, because how do we think about a shared information infrastructure that is literally global in scale? And what kinds of solutions? How do we make solutions local for such a global scale? Where the whole logic of the business model is scalability. It's to not customize things. It's to minimize the humans in the loop.
And so there is a basic conundrum here. And it's not clear to me, when we talk about antitrust solutions, for example, like government interventions, that that really solves the core problems, because it doesn't change the core incentives of the platforms. It may actually reduce their capacities in certain ways to address these things, because there are a lot of fixed costs in understanding different contexts.
So it solves some problems-- the concentration problem on the social media side, it may reduce that. So Facebook doesn't loom as large if Instagram is a competitor. But if neither of them cares about misinformation, have we really solved the problem?
But in any case, if I was within the platform, I'd say let's think about information quality. Let's have a mix of human and computation. Think about how we develop fairly scalable solutions, but have some degree of human judgment in the loop, more judgment in the loop than is currently the case. And to some extent we do see companies, like Facebook, moving in that way with fact checkers and so on.
And all of this is to say that none of this is easy, because the notion of fact checking buys into the notion that you and I can agree on what a fact is. And at the end of the day, a lot of people wouldn't agree on what a fact is. There are legitimately some things that are crazy, but there are also a lot of non-crazy things that people disagree about. So it's just really hard.
Of course, newspapers have been dealing with this in certain kinds of ways, having to make truth determinations. But again, that was the artisanal model that I was talking about. So I haven't really given you a great solution. Sorry.
PETER ENNS: We have time for one more quick question before we adjourn to the reception.
AUDIENCE: Great. My name is Chris Sperry. I'm a co-director of a media literacy organization that works with K-12 teachers around, among other things, fake news. And I was really intrigued by your challenge to my understanding that social media really reinforces the filter bubble, the echo chamber. So I'm interested in that.
Some of the research that's important in our work is around confirmation bias and around understanding that we think as teachers just give people the truth, that in fact confirmation bias plays a very, very strong role in how young people, and adults for that matter, understand what's true and what's not true. Can you speak to the role of social media in reinforcing or maybe perhaps not reinforcing confirmation bias through the filter bubble?
DAVID LAZER: Great question. And I think there are a few elements there. So just again, the question is about the notion of confirmation bias-- that we have a tendency to pay attention to things that are consistent with our prior views and to ignore and discount things that are inconsistent. And, of course, there are a couple of challenges. Like one argument out there-- and companies like Twitter have been saying this-- is that we need to expose people to more diverse information. So what we will do in our news feed is make sure that your news feed is more diverse and that you get exposed to some counter-attitudinal information.
There's this lovely study out last year by Chris Bail and collaborators in PNAS that asked some people to follow an account which then did exactly this: exposed them to counter-attitudinal content. And what it found was that people actually polarized and moved away from that content. It actually pissed them off and made them go the opposite direction.
This is in addition to, I'll note, a study by a colleague of mine, Dong [? Qi ?] Zhou, who had an app which did a similar thing, which said, here's a curated list. And some people got ideologically consistent content. Some people got ideologically inconsistent content. And it found that there was more polarization when you exposed people to inconsistent content, because it repelled them.
Like let's imagine a few of you in the room are, let's say, progressively oriented and you were forced to watch Fox News and read Breitbart. It might actually push you away from that reporting, even though a lot of it might be fairly accurate-- there's some basic grounding in reality in, let's say, the news being reported there. But you're just totally skeptical of it. Whereas if you read something that was more liberal-leaning and it reported, yeah, there's stuff going on in Syria, you would be more likely to believe it.
And so it's tricky, because I think a lot of the tech companies have really taken this notion of the filter bubble very seriously, and yet they may be producing some very perverse outcomes. And I wish I could then say, well, here's the easy thing to do. It's not clear to me that reinforcing the filter bubble is the answer-- I mean, reinforcing a bubble where you only hear from like-minded people. But it could be that the only people who could persuade you to change your mind in the direction of reality are people who are credible to you.
And so again, even here, a lot of the conventional wisdom, which is being implemented by Silicon Valley because they're still reading the stuff on the filter bubble, actually doesn't hold up to academic scrutiny. And, again, this is part of the role of academia, I hope, in this space: to rigorously test these propositions and engage in a conversation with the powers that be to say, you know, actually, this might be counterproductive. And then to think about what other kinds of interventions there are.
Another question is, what's the role of K through 12? And the time that we get most informed in life is earlier. But even there, we don't actually have a rigorous evaluation of: is there some way to teach people so that they're skeptical of the stuff they should be skeptical of but trusting of the stuff they should be trusting of? You don't want them all to be solipsists who don't believe anything. But we don't actually have any rigorous evaluations of what kinds of interventions might work and what might be counterproductive, because it may make you skeptical of everything.
So, again, I'm better at identifying problems here. I mean, part of the question here is also to what extent labeling of content works. Maybe, maybe not. For egregious content, does de-prioritizing low-quality information work? Is there an implicit endorsement by Google when something is number one, or in the top 10, that this is good information? Does Google then need to take information quality into account when they put something in the top 10? And I'd probably say, yeah, at least somewhat.
I mean, if you want to search for the apricot-seeds-and-cancer stuff and really find it, that should be somehow findable. But search should prioritize the high-quality information, so that if you're a desperate person with cancer, you don't come up with very credible-looking websites, ranked high on Google, that seem to suggest apricot seeds will cure cancer.
But like I said, in political domains, it's a little trickier still than health. So, again, I haven't offered the answer to everything. I wish I could have.
David Lazer, professor of political science and computer and information science at Northeastern University, delivered the Cornell Center for Social Sciences’ Distinguished Lecture in the Social Sciences on Oct. 24, 2019.