I have been participating in CASP since 1998 in the ab‐initio/free modeling (FM) category. Our physics‐based method was very computationally intensive, and I was as convinced then as I am now that using human knowledge and intuition to guide the search process could significantly shorten the time to solution. We began developing a tool to support human‐computer interaction in 2001, and by 2003 we had a prototype ready to try the human steering approach.

Although we found that the combination of human intuition and computer power was an improvement over the original method, we realized that it was not good enough. The main reason was that the interaction process was focused on our physics‐based method and, although this had strengths, it also had many weaknesses. What we actually needed to crack this difficult puzzle was the aggregation of different methods and ideas. However, a large‐scale collaboration in the context of CASP didn’t seem feasible then.

In 2009, the mathematician Timothy Gowers created a project called Polymath. The Polymath project had one ambitious goal: doing math research in an open way. Using blogs and wikis to mediate a fully open collaboration, this project let anyone follow along and contribute ideas to the solution. And it worked! The contributors found a proof of the density Hales‐Jewett theorem and published two papers about their research. Currently, there are eight active Polymath projects.

Since I couldn’t attend the CASP9 meeting and I wanted to prepare a review of the results in the free modeling category, I read the assessors’ report available at the Prediction Center site (http://predictioncenter.org/casp9/doc/presentations/CASP9_FM.pdf). This report includes some discussion points from the FM roundtable. The following comments from roundtable participants suggest that the community is finally ready for a collaborative effort:

“We are stuck in a very deep local minimum.”

“The present dead‐lock situation in CASP comes from the fact that almost all participants apply the same methods, there are no innovators.”

“Get these top predictors to work as a group to solve these tough problems rather than perfecting one method of their own.”

“The less impossible scenario is to have one open‐source platform for the whole community, like SBML or Cytoscape, where developers in the field contribute to it without any reservation.”

 

After reading these comments I felt encouraged to write an email message to about 20 representatives of different groups to find out how receptive they were to the idea of a collaborative effort during CASP. The feedback was very positive and encouraged me to scale up the conversation.

 

The proposal

Tim Gowers started the Polymath conversation with a post on his blog entitled “Is massively collaborative mathematics possible?” and, based on the results, the answer is: yes, it is!

The question for us is: Is a massively collaborative group effort possible during CASP?

We don’t know the answer yet, but a number of CASP participants think we should try it, and that by itself is a very positive step forward!

The idea is to have a forum for online discussion and collaboration during CASP10 (or even sooner, with the Rolling CASP experiment). The invitation would be open to groups and individuals who want to participate in this experiment by sharing their insights and observations, their data, their most recent code, or whatever else they’d like to contribute. This collaborative experiment will let different groups or individuals work on different components of the prediction pipeline, making it possible to leverage expertise like never before. For example, one group posts the best alignment, another group models the loops, one person proposes ideas for a scoring function, and another group implements those ideas and applies the scoring function to the current set of structures, and so on.
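To make the division of labor concrete, here is a minimal Python sketch of how stages contributed by different groups could be chained together. Everything in it is hypothetical and only for illustration: the `Candidate` class, the stage functions, the group names and the target ID are assumptions of mine, and the stage bodies are placeholders rather than real alignment, loop modeling or scoring methods.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Candidate:
    """Working state for one target as it moves through the pipeline."""
    target_id: str
    data: Dict[str, object] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)  # contributors, in order


# A stage is any function that takes a candidate and returns an updated one.
Stage = Callable[[Candidate], Candidate]


def post_alignment(c: Candidate) -> Candidate:
    # Placeholder: a real stage would compute an actual alignment.
    c.data["alignment"] = "placeholder alignment"
    return c


def model_loops(c: Candidate) -> Candidate:
    # Placeholder: only records whether an alignment was available to build on.
    c.data["loops_modeled"] = "alignment" in c.data
    return c


def apply_score(c: Candidate) -> Candidate:
    # Placeholder score, not a real energy or quality function.
    c.data["score"] = 1.0 if c.data.get("loops_modeled") else 0.0
    return c


def run_pipeline(c: Candidate, stages: List[Tuple[str, Stage]]) -> Candidate:
    """Apply each contributed stage in order and record its contributor."""
    for contributor, stage in stages:
        c = stage(c)
        c.history.append(contributor)
    return c


result = run_pipeline(
    Candidate(target_id="T0000"),  # hypothetical target ID
    [("group A", post_alignment), ("group B", model_loops), ("group C", apply_score)],
)
print(result.history, result.data["score"])
```

The point of the sketch is only that each stage has the same simple interface, so any group could swap in its own method for one step without touching the others, and the `history` list keeps track of who contributed what.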

What would be the goal? The immediate goal would be to submit predictions for a small group of targets. The long‐term goal would be to shake up the field and push it out of its “very deep local minimum”.

What would be the advantage of proceeding this way? There are several potential advantages. Michael Nielsen, co‐author of the Polymath project paper published in Nature, addresses this question very eloquently in his blog, so I will paraphrase his words here.

First, you can think of this forum as a way of scaling up the scientific conversation, so that conversations can become widely distributed in both time and space. Today, only a small group of people have the opportunity to listen as D. Baker, Y. Zhang, or J. Skolnick (fill in the name of your favorite protein scientist here) brainstorm with the members of their labs or at the post‐CASP meeting; why not have hundreds of talented people listening in? Why not enable people who have specific expertise to contribute their insights, then combine all that insight and test the combined knowledge during CASP?

The exchange should be informal and rapid fire: let’s shoot ideas, let’s combine ideas, let’s test these ideas!

Second, as Gaurav Chopra put it, a collaborative effort makes perfect sense for dividing up a prediction pipeline, so that each part (alignment, scoring, refinement, etc.) is handled by the best methods and by the people who specialize in it.

Finally, another benefit of this forum is to make the conversations searchable so that future protein modelers can also benefit from this insight.

In summary, advantages of this open collaboration include:

  • Aggregate methodologies and knowledge
  • Leverage knowledge and strengths of different groups
  • Encourage CASP outsiders to contribute their insight and perspectives
  • Encourage groups that focus on specific aspects of protein structure prediction to contribute
  • Make expertise searchable

 

The resources

We are very fortunate to have the support of the National Energy Research Scientific Computing Center (NERSC). They will help us get the infrastructure (web services, wikis, groupware, file sharing, etc.) that we need to run this collaborative effort.

 

The ground rules

Of course, there is a lot to be discussed about how to implement this collaboration, how to choose the targets, how to choose the models for submission, etc. Let’s start with some of the ground rules set by Gowers for his Polymath project and then add more specific rules for this project.

1. Comments should be concise.

2. Comments should be easy to understand so others get the idea and can build on it.

3. Stupid comments are welcome. Not stupid as in “unintelligent”, but stupid as in “not fully thought through”.

4. If you can see why somebody else’s comment is stupid, point it out in a polite way. And if someone points out that your comment is stupid, do not take offense.

5. Don’t actually use the word “stupid”.

6. If you are convinced that you could answer a question but it would just need a couple of weeks to polish your thoughts and try a few things out, then resist the temptation to do that. Instead, explain concisely why you think it is feasible to answer the question and see if the collective approach gets to the answer more quickly. Only go off on your own if there is a general consensus that that is what you should do.

7. If you think of an approach that is completely different from the current one then you should suggest starting a different track. We need to decide how we’re going to choose which models to submit.

8. If the experiment results in something publishable then the paper should be submitted under a pseudonym with a link to the people who contributed and the entire online discussion.

9. The collaborative experiment should focus on a small number of targets, leaving plenty of time for the individual groups that want to participate in CASP to do so as usual.

 

A final invitation

I hope you’re interested in joining me in these discussions. We’re going to discuss this collaboration at the Zing Conference on Protein and RNA Structure Prediction next week. I’ll post a summary here so that everyone knows what we discussed and can comment further.

We can’t guarantee that this experiment will produce groundbreaking results. However, no matter what the short‐term results may be, let’s hope that it will create the spark to fire up the field and increase the rate at which discoveries are made.
