What is folding? What does it do? Why should I?

Captain Newbie

Senior Django-loving Member
This guide is in no way official, but rather a compilation of various sources including the F@H FAQ, papers by the Pande Group, and the experiences of this forum.

Academic review is *absolutely* welcome -- feedback to doug at iclick-fantastica dot net.


Folding@Home: Stanford University Distributed Computing
What it does, how it works, and why I should contribute

What is protein folding? Proteins are fundamental biological molecules that must assemble themselves into a specific three-dimensional shape before they can do their job. That self-assembly process is called folding, and it happens quickly (on the order of milliseconds or faster). Protein misfolding is either a cause or a symptom of diseases such as Alzheimer's disease and bovine spongiform encephalopathy (BSE or, colloquially, mad cow disease) [1]. Because folding happens so quickly, it is difficult to observe directly, so simulations (computational biology) are used to study the process instead.

Reference 2 said:
The overarching research goal of the Folding@Home project is a quantitative, predictive model of the folding process.

So, why is this goal so difficult? Chemically speaking, proteins are large structures [2], so simulating every last atom requires a tremendous amount of computing power. It takes approximately one CPU-day to simulate one nanosecond of folding on a single processor. As one might guess, even the world's most powerful supercomputers need a LOT of time to approach the millisecond level. In fact, it took several months of supercomputer time to simulate a single microsecond of folding [3]. The F@H algorithms model a folding protein as a series of free-energy barrier crossings [2]. This is the key breakthrough that allows F@H to work.
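To put the one-CPU-day-per-nanosecond figure in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch: the function names are mine, and it assumes the cost scales linearly with simulated time on a single processor.

```python
# Rough cost estimate, assuming the guide's figure of about one CPU-day
# per simulated nanosecond and perfectly linear scaling (an assumption).
CPU_DAYS_PER_NS = 1.0

# Number of nanoseconds in each unit of simulated time.
NS_PER_UNIT = {"ns": 1, "us": 1_000, "ms": 1_000_000}


def cpu_days(duration, unit):
    """CPU-days needed to simulate `duration` units of folding time."""
    return duration * NS_PER_UNIT[unit] * CPU_DAYS_PER_NS


def cpu_years(duration, unit):
    """Same estimate expressed in CPU-years (365-day years)."""
    return cpu_days(duration, unit) / 365.0


if __name__ == "__main__":
    for unit in ("ns", "us", "ms"):
        print(f"1 {unit}: {cpu_days(1, unit):,.0f} CPU-days, "
              f"about {cpu_years(1, unit):,.1f} CPU-years")
```

Under these assumptions a single millisecond works out to a million CPU-days, or roughly 2,700 CPU-years on one processor, which is why a single machine is not enough.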

And this helps how? A simulation spends most of its time waiting to cross a free-energy barrier, and the length of that wait is largely a matter of chance. If you have M simulations running and the first one crosses into an intermediate state (that is, undergoes a transition), you can reset all the other simulations to that state and minimize the amount of time wasted [2].
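The payoff from running M simulations at once can be illustrated with a toy model. This is a sketch, not the Pande Group's code: it assumes each run's waiting time before a barrier crossing is exponentially distributed with the same rate, and the function names are hypothetical.

```python
import random


def first_crossing_time(m, rate, rng):
    """Time until the first of m independent runs crosses the barrier.

    Each run's waiting time is drawn from an exponential distribution
    with the given rate (a simplifying assumption for this sketch).
    """
    return min(rng.expovariate(rate) for _ in range(m))


def mean_first_crossing(m, rate, trials=20_000, seed=0):
    """Average first-crossing time over many repeated trials."""
    rng = random.Random(seed)
    total = sum(first_crossing_time(m, rate, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    rate = 1.0  # one crossing per unit time for a single run, on average
    for m in (1, 10, 100):
        print(f"M = {m:3d}: mean wait is about {mean_first_crossing(m, rate):.4f}")
```

In this toy model the average wait for the first crossing shrinks in proportion to 1/M, so a hundred machines reach the next state about a hundred times sooner than one machine would. Real folding kinetics are messier, so treat this as an idealized picture.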

Distributed Computational Biology. With the proliferation of the personal computer and the Internet, distributed computing en masse has become a practical reality. The F@H project essentially uses every participating computer as one 'processor' in a large, asynchronous, heterogeneous supercomputer [2], with the Internet as the bus. Each F@H client runs part of a single trajectory (on the 100-picosecond scale). The client completes the picosecond-scale steps, calculates the total energy variance across that time period, and submits the data to Stanford's servers. The server, in turn, looks at the total energy variance. If it exceeds a certain value, the server identifies that trajectory as having gone through a transition and resets the other simulations to the new state.
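The client/server exchange described above can be sketched roughly as follows. Everything here is hypothetical (the `WorkResult` record, the `coordinate` function, the threshold value); the real F@H client and server are far more involved, and this only shows the variance check and the reset.

```python
from dataclasses import dataclass
from statistics import pvariance


@dataclass
class WorkResult:
    """What one client sends back after finishing a trajectory segment."""
    client_id: int
    energies: list      # energy samples recorded during the segment
    final_state: str    # label for the conformation at the end of the segment


def energy_variance(energies):
    """Variance of the energy samples over one segment (client side)."""
    return pvariance(energies)


def coordinate(results, current_state, threshold):
    """Return the state all clients should start their next segment from.

    If any segment's energy variance exceeds the threshold, that segment is
    treated as a transition and every client restarts from its final state.
    Otherwise all clients carry on from the current state.
    """
    for result in results:
        if energy_variance(result.energies) > threshold:
            return result.final_state
    return current_state


if __name__ == "__main__":
    results = [
        WorkResult(0, [1.0, 1.1, 0.9, 1.0], "A"),   # quiet segment
        WorkResult(1, [1.0, 3.0, 5.0, 2.0], "B"),   # large energy swing
    ]
    print(coordinate(results, current_state="A", threshold=1.0))
```

The second client's energies swing far more than the first's, so in this made-up example the server would treat its segment as a transition and restart everyone from its final state.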

So, Why Should I Help? Even though the Folding@home network has a total power of somewhere near 175 TFLOPS [4], there are a huge number of proteins to simulate and a huge amount of data to be crunched, regardless of how much power is currently assembled. The number of computations involved would require a tremendous amount of time (and therefore money spent on renting that time) on many of the world's supercomputers. This data will benefit all of humanity, and donating your spare CPU time in any amount will help the project tremendously [5]. The client seamlessly uses idle time without interrupting normal usage.

How can I help? Simple. Download the folding client from Stanford's Website, configure it as you desire, and fold away!

References:
[1] Pande, Vijay et al (Stanford Pande Group). "Atomistic Protein Folding on the Submillisecond Time Scale Using Worldwide Distributed Computing." March 2002.
[2] Pande, Vijay et al (Stanford Pande Group). "Folding@home and Genome@home: Using Worldwide Distributed Computing to Tackle Previously Intractable Problems in Computational Biology". No date given.
[3] Duan et al. "Pathways to a protein folding intermediate observed in a 1-microsecond simulation in aqueous solution". Science, 1998.
[4] Wikipedia's Folding At Home Page. Wikimedia Foundation. 1 Nov 2005
[5] Shirts, M.S. and Pande, Vijay. "Mathematical Analysis of Coupled Parallel Simulations". Physical Review Letters. 2000
 
This should be converted into a printable poster, then we can use it to convince people to borg :)
 
I'll get around to doing the actual citations (the [#]s) probably tomorrow, so that you all have information on where this information comes from :)

Thanks for the PDF version; since this is a 'revisable' document, I'd appreciate it if a link to the forum were included.
 