
The way images, videos or concepts can abruptly spread
across the web, using e-mail and social websites, is one of online
culture's most distinctive phenomena.
Spanish researchers claim to have found a way to accurately
predict how quickly and widely new pieces of information, or
memes as they are called, will spread. The ability to forecast this viral
behaviour would be of great interest to sociologists and
marketeers, among others.
The secret, they say, is to recognise the fact that people vary
in how "infectious" they are when it comes to sharing content
online. While some people pass on things they receive right away,
others do so after some delay, or not at all.
Medical models
The viral spread of information online has conventionally been
modelled using epidemiological tools developed to analyse the
spread of biological viruses. One of the concepts borrowed is that
of an infection's R0, or basic reproduction number, which describes
how many other people someone with the virus can be expected to infect.
Knowing the R0 number helps predict the likelihood and
extent of real-life epidemics, such as H1N1 swine
flu. But models that apply the idea to online information can
only indicate whether an internet meme is likely to be successful
or to die out quickly, says
Esteban Moro at the Carlos III
University of Madrid, Spain.
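The distinction between a meme that takes off and one that dies out can be illustrated with a textbook branching process. The sketch below is not the researchers' model: it assumes a single starting sender and that every person passes the item on to R0 others on average, and the function names are hypothetical. Under those assumptions the expected reach stays finite only when R0 is below 1.

```python
import math


def expected_generation_sizes(r0, generations):
    """Expected number of new recipients in each generation, from one seed.

    Generation 0 is the seed itself, so the list starts at 1 and each
    later entry is the previous one multiplied by r0.
    """
    return [r0 ** g for g in range(generations + 1)]


def expected_total_reach(r0):
    """Expected total number of people reached, starting from one seed.

    The geometric series 1 + r0 + r0**2 + ... sums to 1 / (1 - r0)
    when r0 < 1; for r0 >= 1 the expectation grows without bound.
    """
    if r0 < 0:
        raise ValueError("r0 must be non-negative")
    return 1.0 / (1.0 - r0) if r0 < 1 else math.inf


if __name__ == "__main__":
    # Illustrative R0 values only; none of these come from the study.
    for r0 in (0.5, 0.9, 1.2):
        print(r0, expected_generation_sizes(r0, 3), expected_total_reach(r0))
```

A calculation of this kind says whether a cascade is expected to fizzle or grow, but nothing about how long each step takes.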
Moro, working with José Luis Iribarren at IBM in Madrid, used
IBM's company e-mail newsletter to show the importance of
variations between people's infectiousness in propagating memes
online.
E-mail trail
They started a reward scheme offering prize draw tickets for
recommending the newsletter by providing e-mail addresses of other
people and tracked how widely and quickly the recommendations
spread. After two months it had reached 31,000 people.
But while people took 1.5 days to respond to a recommendation
e-mail on average, there was a huge variation at the individual
level: some users responded within minutes, others in months, says
Moro.
And only by combining some expectation of that variation with
the R0 number is it possible to build a model able to
predict the meme's spread. The team use a small chunk of the
initial data on the content's spread to predict how many people it
will reach in total, and how fast. "Our model can give predictions
within 1 per cent error once secondary reproductive number and
human activity are estimated," Moro says.
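Why the variation in response times matters can be seen in a toy simulation. The sketch below is an illustration under stated assumptions, not the team's method. It assumes each recipient contacts three people who each respond with probability 0.3, and it assumes log-normally distributed delays with a spread of 1.5; only the 1.5-day average comes from the article. All names are hypothetical. The same cascade is replayed twice, once with a fixed delay and once with widely varying delays.

```python
import math
import random

MEAN_DELAY_DAYS = 1.5  # average response time reported in the article
SIGMA = 1.5            # assumed spread of log-delays; not from the article


def fixed_delay(rng):
    """Everyone responds after exactly the average delay."""
    return MEAN_DELAY_DAYS


def lognormal_delay(rng):
    """Assumed heavy-tailed delay with the same mean as fixed_delay."""
    mu = math.log(MEAN_DELAY_DAYS) - SIGMA ** 2 / 2
    return rng.lognormvariate(mu, SIGMA)


def simulate_cascade(p_forward, contacts, draw_delay, structure_seed,
                     delay_seed, seeds=100, max_people=100_000):
    """Return the sorted times (in days) at which each person is reached.

    Who responds is decided by one random stream and how long they take
    by another, so two runs with the same structure_seed reach the same
    people and differ only in timing.
    """
    structure_rng = random.Random(structure_seed)
    delay_rng = random.Random(delay_seed)
    times = [0.0] * seeds
    pending = list(times)
    while pending and len(times) < max_people:
        sent_at = pending.pop()
        for _ in range(contacts):
            if structure_rng.random() < p_forward:
                arrival = sent_at + draw_delay(delay_rng)
                times.append(arrival)
                pending.append(arrival)
    return sorted(times)


def time_to_fraction(times, fraction):
    """Day by which the given fraction (0 < fraction <= 1) was reached."""
    index = max(0, math.ceil(fraction * len(times)) - 1)
    return times[index]


if __name__ == "__main__":
    for name, delay in (("fixed", fixed_delay), ("varied", lognormal_delay)):
        reached = simulate_cascade(0.3, 3, delay, structure_seed=1, delay_seed=2)
        print(name, len(reached), time_to_fraction(reached, 0.5),
              time_to_fraction(reached, 1.0))
```

Both runs reach the same number of people by construction, so any difference in the printed times comes purely from how the delays are distributed.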
The model cannot predict whether a piece of content will go
viral before it has been released; only its likely reach once it
starts spreading. And the researchers think their approach to
modelling should apply to information spreading via social
networking sites and other online services as well as e-mail.
Remarkable result
Statistician Claudio Castellano, at the Sapienza University of Rome,
calls the match between prediction and real result remarkable. He adds
that there is other evidence to back up the idea that people vary in
online infectiousness.
For instance, David Liben-Nowell at Carleton College in Northfield,
Minnesota, and colleague Jon Kleinberg at Cornell University last year
traced an 11-year-old e-mail chain letter to show the differences
between the spread of real viruses and viral information.
Moro's study agrees with his own results, says Liben-Nowell.
"Many models of information propagation discount both the role of
time and [differences between] people." But there is more to
discover, he says, such as how people may vary in
infectiousness depending on the type of content they receive.
Journal references:
Moro and Iribarren study - Physical Review Letters (DOI: 10.1103/PhysRevLett.103.038702)
Liben-Nowell and Kleinberg study - Proceedings of the National Academy of Sciences (DOI: 10.1073/pnas.0708471105)
First published on NewScientist.com