Friday, February 28, 2014

Strength in numbers

I have a feature in Nature on developments in crowdsourcing science, looking in particular at the maths project Polymath on its fifth anniversary. Here’s the long version pre-editing. I also wrote an editorial to accompany the piece.

____________________________________________________________________________

Researchers are finding that online, crowd-sourced collaboration can speed up their work — if they choose the right problem.

When, last April, the hitherto little-known mathematician Yitang Zhang of the University of New Hampshire announced a proof that there are infinitely many prime numbers differing by no more than 70 million, it was hailed as a significant advance in a famous outstanding problem in number theory. In its simplest form, the twin primes conjecture states that there are infinitely many pairs of prime numbers differing by 2, such as (41, 43). Zhang’s gap of 70 million was much bigger than 2, but until then there was no proof of any persistent limiting gap at all.
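To make the terms concrete, here is a minimal Python sketch that lists consecutive primes below a small limit and picks out the pairs separated by a given gap. The function names and the limit of 50 are illustrative assumptions, not anything from Zhang's proof, and a finite search of this kind says nothing about whether such pairs go on forever, which is the hard part.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: return every prime <= limit, in order."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            # Cross out every multiple of n, starting from n * n.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def twin_primes(limit):
    """Pairs of primes <= limit that differ by exactly 2."""
    primes = primes_up_to(limit)
    return [(p, q) for p, q in zip(primes, primes[1:]) if q - p == 2]


def pairs_within_gap(limit, max_gap):
    """Pairs of consecutive primes <= limit that differ by at most max_gap."""
    primes = primes_up_to(limit)
    return [(p, q) for p, q in zip(primes, primes[1:]) if q - p <= max_gap]


print(twin_primes(50))
print(len(pairs_within_gap(50, 6)))
```

The twin primes conjecture is the claim that the list from `twin_primes` keeps growing without end as the limit rises; Zhang's result is the corresponding claim for `pairs_within_gap` with a maximum gap of 70 million.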

But perhaps as dramatic as the reclusive Zhang’s unanticipated proof, published in May, was what happened next. “One could easily envisage that there would be a flood of mini-papers in which Zhang's bound of 70 million was whittled down by small amounts by different authors racing to compete with each other”, says Terence Tao, a mathematician at the University of California at Los Angeles. But instead of such an atomized race, this challenge to reduce the bound became the eighth goal for a ‘crowdsourcing’ maths project called Polymath, which Tao helped to set up and run. Mathematicians all around the world pitched in together, and the bound dropped from the millions to the thousands in a matter of months. By the end of November it stood at 576.

There is nothing new about the notion of crowdsourcing to crack difficult problems in science. Six years ago, the Galaxy Zoo project recruited volunteers to classify the hundreds of thousands of galaxies imaged by the Sloan Digital Sky Survey into distinct morphological types: information that would help understand how galaxies form and evolve. Galaxy Zoo has now gone through several incarnations and incorporates data on the earliest epochs of the visible universe from the Hubble Space Telescope. It provided a template for other projects needing human judgement to sort data, and has itself evolved into Zooniverse, which hosts several online data-classifying projects in space science and other areas. Participants can, for example, classify craters and other surface features on the Moon, tropical cyclone data from 30-year records, animals photographed by automated cameras on the Serengeti, cancer data, and even humanities projects such as tagging the diaries of soldiers from the First World War. Almost a million people have registered with Zooniverse to lend their help.

Expert opinion

But Polymath, which had its fifth anniversary in January this year, is rather different. Although anyone can join in to help solve its problems, you’re unlikely to make much of a contribution without highly specialized knowledge. This is no bean-counting exercise, but demands the most advanced mathematics. The project began when Cambridge mathematician Timothy Gowers asked on his own blog “Is massively collaborative mathematics possible?”

“The idea”, Gowers explained, “would be that anybody who had anything whatsoever to say about the problem could chip in. And the ethos of the forum would be that comments would mostly be kept short… you would contribute ideas even if they were undeveloped and/or likely to be wrong.” Gowers suspected there could be a benefit to having many different minds with different approaches and styles working on a problem. What’s more, sometimes a solution requires sheer luck – and the more contributions there are, the more likely you’ll get lucky.

His first challenge was a problem called the Hales-Jewett theorem, which posits that any sufficiently high-dimensional collection of number sequences must exhibit some correlated structure – it must be combinatorial – rather than being entirely random. Gowers’ blog sought a solution for one particular form of the theorem, known as the density version. Gowers had hoped for new insights into the problem, but even he was surprised that by March, after nearly 1,000 comments, he was able to declare the theorem proved. He called that period “one of the most exciting six weeks of my mathematical life”, and adds that “the quite unexpected result – an actual solution to the problem – added an extra layer of excitement to the whole thing”. The proof was described in a paper attributed to “D. H. J. Polymath”.

Tao was drawn into that challenge, and has since hosted other projects on Polymath. Mathematics is perhaps a surprising discipline in which to find this sort of collaboration, as traditionally it has been viewed as a solitary enterprise, exemplified by the lonely and often secretive work of the likes of Zhang or Andrew Wiles, who proved Fermat’s Last Theorem in seclusion in the 1990s. But that image is misleading – or perhaps projects like Polymath are playing an active role in changing the culture. “One strength of a Polymath collaboration is in gathering literature and connections with other fields that a traditional small collaboration might not be aware of without a fortuitous conversation with the right colleague”, says Tao. “Simply having a common place to discuss and answer focused technical questions about a paper is very useful.” He says that such online “reading seminars” helped researchers get to grips quickly with Zhang’s original proof.

Refining that proof – Polymath 8 – produced another paper for D. H. J. Polymath. One of the big leaps came from James Maynard, a postdoctoral researcher at the University of Montreal in Canada, who last November showed how to reduce Zhang’s bound of 70 million to just 600. Maynard, however, had already been working on the problem before Zhang’s results were announced, and he says his work was essentially independent of Polymath.

All the same, he sees this as an appropriate problem for such an approach. “Zhang's work was very suitable for many participants to work on”, Maynard says. “The proof can be split into separate sections, with each section more-or-less independent of the others. This allowed different participants to focus on just the sections which appealed to them.”

The success of Polymath has been mixed, however. “Polymath 4 and 7 led to interesting results”, says Gil Kalai of the Hebrew University of Jerusalem, who has administered some of the projects. “Polymath 3 and 5 led to interesting approaches but not to definite results, and Polymath 2, 6 and 9 did not get much off the ground.” And Gowers admitted that for at least some of the challenges the “crowd” was rather small – just a handful of real experts. Partly this might be just a matter of time: after Polymath 1, he remarked that “the number of comments grew so rapidly that merely keeping up with the discussion involved a substantial commitment that not many people were in a position to make.” And perhaps some of the experts who might have contributed were simply not a part of the active blogosphere.

Polymath “hasn't turned out to be a game-changer”, says Tao, “but it’s a valid alternative way of doing mathematical research that seems to be effective in some cases. One nice thing though is that we can react rather quickly to ‘hot’ events in mathematics such as Zhang's work.” He says that the crowdsourcing approach works better for some problems than others. “It helps if the problem is broadly accessible and of interest to a large number of mathematicians, and can be broken up into parts that can be worked on independently, and if many of these parts lie within reach of known techniques.”

“Projects which seem to require a genuinely new idea have so far not been terribly successful”, he adds. “The project tends to assemble all the known techniques, figure out why each one doesn't work for the problem at hand, throw out a few speculative further ideas, and then get stuck. We're still learning what works and what doesn't.”

It’s with such pitfalls in mind that Kalai says “it will be nice to have a Polymath devoted to theory-building rather than to specific problem solving.” He adds that he would also like to see Polymath projects “that are on longer time scale than existing ones but perhaps less intensive, and that people can get in or spin off at will.”

Gowers recognized from the outset that collaboration won’t always eclipse competition. He admits that “it seems highly unlikely that one could persuade lots of people to share good ideas” about a high-kudos goal like the Riemann hypothesis, which relates to the distribution of prime numbers. This, after all, is one of the seven Millennium Problems for the solution of which the privately funded Clay Mathematics Institute in Providence, Rhode Island, has offered prizes of $1m.

All the same, that didn’t deter Gowers from launching Polymath 9 last November, which set out to find proofs for three conjectures that would solve another of the remaining six Millennium Problems: the so-called NP versus P problem. This asks whether the class of hard problems whose solutions can be quickly verified by a computer (denoted NP) coincides with the class of problems that can be solved equally quickly (denoted P). Gowers did not expect all three of his conjectures to be solved by Polymath 9, but admitted he would be pleased if just one of them could be. However, the results were initially disappointing, and Gowers was about to declare Polymath 9 a failure when he was contacted by Pavel Pudlak of the Mathematical Institute of the Czech Academy of Sciences with a proof that one of the three statements he was hoping would be proved false was in fact true, apparently cutting off this avenue for attacking the problem. Gowers is philosophical. “It’s never a disaster to learn that a statement you wanted to go one way in fact goes the other way”, he wrote. “It may be disappointing, but it’s much better to know the truth than to waste time chasing a fantasy.” In that regard, then, Polymath 9 did something useful after all.
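The difference between verifying and solving can be seen in a toy example. The Python sketch below uses the subset-sum problem (does some selection from a list of numbers add up to a target?), a standard NP-complete problem chosen here purely for illustration; it has nothing to do with the specific conjectures Gowers posed, and the function names are hypothetical. Checking a proposed answer takes one pass over it, while the naive search may have to try every subset, and the number of subsets doubles with each number added to the list.

```python
from itertools import combinations


def verify(numbers, target, candidate):
    """Quick check: candidate is a tuple of distinct, valid indices
    whose corresponding numbers add up to target."""
    return (
        len(set(candidate)) == len(candidate)
        and all(0 <= i < len(numbers) for i in candidate)
        and sum(numbers[i] for i in candidate) == target
    )


def solve_by_brute_force(numbers, target):
    """Slow search: try every non-empty subset of indices, smallest first.
    Returns the first subset that verifies, or None if there is none."""
    indices = range(len(numbers))
    for size in range(1, len(numbers) + 1):
        for candidate in combinations(indices, size):
            if verify(numbers, target, candidate):
                return candidate
    return None


numbers = [3, 34, 4, 12, 5, 2]
answer = solve_by_brute_force(numbers, 9)
print(answer, verify(numbers, 9, answer))
```

Whether every problem with a fast `verify` also has a fast solver, rather than something like the exhaustive loop above, is exactly what the NP versus P question asks.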

Polymath now functions as a kind of elite open-source facility. People can post suggestions for new projects on a dedicated website maintained by Gowers, Tao, Kalai and open-science advocate Michael Nielsen, and these are then discussed by peers and, if positively received, launched for contributions. “The organization is still somewhat informal”, Tao says. Setting up and sustaining a Polymath project is a big commitment. “It needs an active leader who is willing to spend a fair amount of effort to organise the discussion and keep it moving in productive directions”, says Tao. “Otherwise the initial burst of activity can dissipate fairly quickly. Not many people are willing or able to do this.” “It’s quite difficult to get people interested,” Gowers agrees; so far, he and Tao have initiated all but two of the projects.

Although surprised by Polymath’s success, Kalai says that the trend toward more collaborative efforts started earlier, as signaled by a rise in the average number of coauthors on maths papers. “Polymath projects do not have enough weights to make a substantial change. But they add to the wealth of mathematical activities, and, for better or for worse, their impact on the community is larger than their net scientific impact.” It’s not clear that this is a good way to do maths, he concludes – “but we can certainly explore it.”

Cash or glory

Some other “expert” crowdsourcing ventures are being run as commercial ventures by companies that aim to link people with a problem to solve with people who might have the skills and ideas needed to solve it. These generally charge fees and offer financial rewards for participants. Other initiatives are government-led, such as the NASA Tournament Lab, which seeks “the most innovative, most efficient, and most optimized solutions for specific, real-world challenges being faced by NASA researchers”, and the US-based Challenge.gov, which offers cash prizes for solutions to a whole range of engineering and technological problems.

One of the most prominent commercial enterprises is InnoCentive, which hosts a variety of scientific or technological challenges that are open to all of its millions of registered “solvers”. These range from the seemingly banal, if important (developing economical forms of latrine lighting in emergencies, or “keeping hair clean for longer without washing”), to the esoteric (“seeking 4-hydroxy-1H-pyridin-2-one analogues”, or ways of stabilizing foamed emulsions). InnoCentive’s founder Alph Bingham says that their approach “has produced solutions to problems that had been previously investigated for years and even decades.” Good challenges, he says, “are ones where the space of possible solutions is immense and therefore hard to search on a serial basis”.

In contrast to that broad portfolio, other crowdsourcing companies such as Kaggle and CrowdFlower specialize in data analysis. Kaggle has been used, for example, in bioinformatics to predict biological behaviours of molecules from their chemical structure, and in energy forecasting. It has been recently used by a team of astronomers seeking algorithms for mapping the distribution of dark matter in galaxies based on its gravitational-lensing effects on background objects. Through Kaggle, the researchers set up a competition called “Observing Dark Worlds”, which offered cash prizes (donated by the financial company Winton Capital) for the three best algorithms. The winning entries improved the performance, relative to standard algorithms, by about 30 percent.

While this was valuable, astronomer David Harvey of the University of Edinburgh, an author of that study, admits that it’s not always straightforward to apply potential solutions to the problem you’ve set. “Many of the ideas that came out of the competition were great, and provided really interesting insights into the problem”, he says. “But none of the algorithms are ready to be used on real data – they need to be fully tested and developed. And it’s very hard to take some algorithm from someone not in your field and develop it.”

Harvey says that indeed the winning algorithm for “Observing Dark Worlds” still hasn’t been fully developed. “However, the advantages of these competitions is not always obvious”, he adds. For example, the second-place entry was written by informatics specialist Iain Murray of the University of Edinburgh, who is continuing to collaborate with Harvey, and now with other astronomers too. “This wouldn’t have happened if it wasn't for Kaggle”, Harvey says. That experience shows how “it’s vital that the winners of the competition work in collaboration post-competition on the problem and develop the initial idea all the way through to a final package.” But Harvey admits that “often these are just side projects for participants, and while they may have a sincere interest in the problem, they do not have the time to commit.”

Harvey points out that the call for such projects might nevertheless be increasing, especially in astronomy. “With new telescopes such as the Square Kilometer Array, the large synoptic survey and Euclid on the horizon, astronomers will be facing real problems of data processing, handling and analysing”, he says. However, Thomas Kitching of University College London, who was the lead scientist on the Dark Worlds project, admits to having mixed feelings about what ultimately such efforts might achieve. In part this is because real expertise might be hard to harness this way. “Most people are not experts, but might have a bit of time”, he says. “There may be some experts, but they have very little time.”

While Polymath relies on unpaid efforts of researchers whose sole reward is professional prestige, InnoCentive and Kaggle recognize that harnessing a broader community requires more tangible incentives, typically in the form of cash prizes. “In academia, people are willing to spend a lot of time for ‘kudos’ or for the sake of science – but only up to a point”, says Kitching. “Once the problem requires a lot of time, like coding in Kaggle, then monetary incentives or prizes seem to be required. No one is going to spend seven days a week trying to win unless it’s already their job, so money offsets time.”

InnoCentive’s 300,000 solvers stand to gain rewards of between $5,000 and $1m. Kaggle now hosts some of the efforts of Galaxy Zoo for a prize of $16,000 (also provided by Winton Capital). This sort of funding is not necessarily just philanthropic for the donors – Winton Capital, for example, were themselves able to recruit new analysts via the Observing Dark Worlds initiative for a fraction of their usual advertising and interviewing costs.

But it’s not all about lucre. “Winning solvers rarely list the cash among their top motivations”, says Bingham. “Their motivations are frequently more intrinsic, such as intellectual stimulation or curiosity to explore where an idea might lead.” InnoCentive aims to encourage non-cash incentives, such as prospects for further collaboration or joint press releases. Yet Bingham adds that “dollar amounts also serve as a kind of score-keeping.” Some of Kaggle’s projects have no cash prizes, and Harvey says that “a lot of the time computer scientists will go there because they want to work on something new and exciting, and not for financial gain.” Indeed, the company invites participants to “compete as a data scientist for fortune, fame and fun.”

“A competition can help to advertise a problem to people who have not thought about it before, a prize can attract them to spend time, and a metric can help to sort signal from noise”, says Kitching. “So in this sense competition, if well posed, can help in science. But a poorly posed problem may just increase noise.”

But as Kalai points out, there can be as much value in identifying important questions, and tools to tackle them, as in finding solutions. Kitching recalls a computer called Multivac that appeared in several of Isaac Asimov’s short stories, which was very good at answering questions but still required human scientists to pose them in the first place. Kitching suspects that the crowdsourcing pool will act more like Multivac than like its interrogators. “In the crowdsourcing approach the key to successful science is working out the correct questions to ask the crowd”, he says.
