Frequently Asked Questions (FAQ)
Shutdown of the Genome@Home project
What will happen to the stats?
If no team changes are made in the config, will the existing G@H teams
become F@H teams, and will the individual results show on the F@H stats?
What about machines
that have to be turned off on nights and weekends (in which case
it should still be set on Genome@home)?
Unlike other distributed computing projects, Genome@home is run by an academic institution (specifically the Pande Group, at Stanford University's Chemistry Department), which is a non-profit institution dedicated to science research and education.
The results from Genome@home will be made available on several levels. First, we put statistics and information about the protein sequences being designed on the web for everyone to see. These are updated daily, and include information about which users contributed which sequences. Second, analysis of the sequences will be submitted to scientific journals for publication, and these journal articles will be posted on the web page after publication. Third, after publication of the scientific articles that analyze the data, the raw data will be available for everyone, including other researchers, here on this web site.
We keep many types of statistics of users and work accomplished on our web page. Check out our main statistics page here. We also keep track of how much work has been done, how many users are signed up and who's currently running here. You can see how many units you have processed so far (as well as all other users) on this page. There's a lot of information, so browse around and see what you can find. Note that not all of the statistics are updated automatically, so there may occasionally be some discrepancies.
Genome@home studies real genomes and proteins directly, by designing new sequences for existing 3-D protein structures, which come from real genomes. The protein structure files that are sent out as work contain the Cartesian atomic coordinates of a protein. This data was obtained experimentally through X-ray crystallography or NMR techniques. Note that this was not done by us; thousands of scientists have spent decades compiling this data, which is generously made freely available to the public. By designing new sequences that could form these specific protein structures, we're setting the stage to attack a number of significant contemporary issues in structural biology, genetics, and medicine. For example, the Genome@home data will be used to:
Both the Windows and Linux versions are new.
i) Caching - If you can't connect to the Genome@home server (it's down, or you're offline) when it's time to deposit finished work, you'll get a few PutWork error messages. After twenty minutes, Genome@home will give up, store the results on disk, and rerun the work units you've already got. Since the results are different each time Genome@home runs, this is completely equivalent to downloading brand new work units. The client will continue rerunning work units (potentially for weeks, if you're on vacation or something) until it is able to connect to the server, at which time it will upload all the results to the server, and you will get credit for the total of all the work units processed.
ii) Checkpointing - Genome@home now checkpoints itself after each of the 30 sequence design iterations. If you complete the design of 23 sequences and your computer crashes, or you Ctrl-C G@H, or there's a California rolling blackout :-), or whatever, the client will continue on and start designing the 24th sequence once it's restarted. However, even if you were 90% done designing sequence 24 when Genome@home was stopped, it still has to start over again from the beginning with sequence 24. Thus, the most you'll lose is the time it takes to design one of the 30 sequences (roughly an hour or so for most machines and proteins).
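The checkpointing behaviour described in (ii) can be sketched roughly as follows. This is a minimal illustration in Python, not the actual client code: the checkpoint file format (JSON), the function names, and the `design_one` callback are all assumptions made for the example. Only the count of 30 iterations comes from the description above.

```python
import json
import os
import tempfile

TOTAL_ITERATIONS = 30  # each work unit runs 30 sequence design iterations


def load_checkpoint(path):
    """Return the list of sequences finished so far (empty if no checkpoint)."""
    if not os.path.exists(path):
        return []
    with open(path) as f:
        return json.load(f)


def save_checkpoint(path, finished):
    """Write to a temp file, then rename, so a crash cannot leave a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(finished, f)
    os.replace(tmp, path)


def run_work_unit(path, design_one, total=TOTAL_ITERATIONS):
    """Design `total` sequences, resuming after the last completed one.

    `design_one(i)` is a hypothetical callback that designs sequence i.
    """
    finished = load_checkpoint(path)
    for i in range(len(finished), total):
        # Progress inside this call is not saved; a crash here restarts sequence i.
        finished.append(design_one(i))
        save_checkpoint(path, finished)
    return finished
```

If the process dies while designing the 24th sequence, the next call to `run_work_unit` finds 23 saved results and starts again at the 24th, so at most one iteration is lost.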
Both the Windows and Linux versions are new for version 0.93.
The Linux version is recompiled to accommodate older processors, such as Pentium II and AMD K-6, which were not supported in earlier Linux versions.
The file-naming error (causing some results to not be sent in) found in version 0.91 has been resolved. Also, a bug in the loop counting and reseeding of the random number generator in the protein design algorithm itself has been resolved.
Both versions will now attempt to send back any finished results upon (re)start-up of the client. The client will also get new work, save it, allow any unfinished work to finish, and then start new work. If both a new and an old work unit are already present, no new work will be downloaded.
Both the Windows and Linux versions are new for version 0.98. This version of the client will not process pre-0.98 work units. When upgrading, it is best to allow an old client to finish a work unit, then install the new version.
The client has a number of additional data integrity checks which verify the integrity of the work unit before it's processed and before the results are returned. The identifying information for each work unit is more closely tied to the work unit and maintained by checksum verifications. The client will reseed a rerun work unit with a random, rather than incremented, 32-bit seed. All these changes were designed to eliminate the possibility of duplicating work units or fabricating false results. The client will also warn the user if it was shut down improperly or if another instance is already running in the same directory.
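As a rough illustration of the two ideas above (checksum verification and random reseeding), here is a minimal Python sketch. It is not the client's actual code: the choice of CRC-32 and all function names are assumptions for the example, since the specific checksum the client uses is not stated here.

```python
import secrets
import zlib


def checksum(payload: bytes) -> int:
    """CRC-32 of the work unit bytes (CRC-32 is an assumed choice of checksum)."""
    return zlib.crc32(payload) & 0xFFFFFFFF


def verify_work_unit(payload: bytes, expected: int) -> bool:
    """True only if the payload still matches the checksum recorded with it."""
    return checksum(payload) == expected


def fresh_seed() -> int:
    """Random 32-bit seed for a rerun, rather than the previous seed plus one."""
    return secrets.randbits(32)
```

A client built along these lines would call `verify_work_unit` once before processing and once before returning results, and refuse to continue if either check fails.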
The client maintains a rudimentary screen log of its progress (scrlog.gah), with timestamps at each step. This logging system may be changed/augmented in the future.
A number of new features have been added in the form of command-line flags; these also appear separately in the Windows Start menu.
The client will attempt to get new work more often than previous versions. After three failed attempts, it will try to rerun any current work units. The wait-time between get work attempts and put work attempts has been reduced to two minutes.
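The retry behaviour just described (three attempts, two minutes apart, then fall back to rerunning current work) might look something like the Python sketch below. The function names and the `fetch`/`rerun` callbacks are hypothetical, and whether the real client also waits after the final failed attempt is not stated, so this sketch does not.

```python
import time

MAX_ATTEMPTS = 3     # "After three failed attempts"
WAIT_SECONDS = 120   # wait-time "reduced to two minutes"


def get_work_or_rerun(fetch, rerun, sleep=time.sleep,
                      attempts=MAX_ATTEMPTS, wait=WAIT_SECONDS):
    """Try `fetch` up to `attempts` times, then fall back to `rerun`.

    `fetch` should raise OSError (or a subclass such as ConnectionError)
    when the server cannot be reached.
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError:
            # Wait between attempts, but not after the last one.
            if attempt < attempts - 1:
                sleep(wait)
    return rerun()
```

Passing `sleep` as a parameter keeps the function easy to exercise without actually waiting.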
Yes, the Genome@home client will work with most modem setups. It will give an error message if it tries to connect when you are not online, but it will continue re-trying every five minutes. Once you go online, it will be able to connect to the Genome@home server.
If you are behind a firewall, please answer yes at the "firewall" dialog box, and then give the client some info about your firewall.
Not all firewalls are supported. Also, please make sure that SOCKS is running.
Occasionally the server goes down, but the clients (console & screen saver) are designed to wait for the server to come back up and then go from there. You don't need to do anything; this should happen automatically. The client does wait several minutes each time it tries to connect, so don't worry if it sits there for 5 or 10 minutes (or occasionally even longer).
Sometimes, things will just plain go wrong with the client. Usually, all you need to do is delete the file "input.inp" from the Genome@home directory on your computer, and restart the client. This will force it to get rid of the bad work unit and get a new one, which almost always solves the problem.
It hasn't stopped; this second stage of the algorithm just takes a while. It could take up to an hour on slow machines.
Genome@home requires at least 32 MB of RAM. Weird things happen under Windows with less memory.
Microsoft has these DLLs on their site. In particular, you need DLLs for Winsock2. These are built into most copies of Windows NT, 98, and 2000. However, many copies of Windows 95 do not have these.
The Windows socket 2 update for Microsoft Windows 95 resolves a number of Winsock2 issues. This update also resolves a number of TCP/IP stack issues.
If you get something like:
Network Recv Timeout
then don't worry. It is having problems connecting to the server, and is waiting to try again. If it fails to connect for a day or so, it might be best to start it over again or reinstall. Hit Ctrl-C to exit gracefully, and start it again.
Genome@home tells you how it's progressing through your work unit. It starts off with a huge variety of possibly good sequences, and iteratively searches through and refines these sequences, until a well-designed sequence is found. The core of the design algorithm repeats itself thirty times, each time producing one "best" sequence. After thirty iterations, Genome@home will send the data back to the server and get more work.
Yes. Genome@home supports dual processor machines. You just need to run two copies of Genome@home, each installed into its own directory.
We are constantly and rapidly improving the Genome@home software. We release new versions to fix bugs reported by the users to help make the project run as smoothly as possible.
Your computer will automatically upload the results to the Genome@home server each time it finishes a work unit, and download a new job at the same time.
We define a work unit as the complete design of one 100-amino acid protein sequence. This generally takes about a day or two on an average computer. The protein you get sent may well be shorter or longer than 100 amino acids, and we calibrate for the size of your protein sequence when we calculate work statistics.
To see what data has been reported back, check our user stats page. If your computer is returning data, you should see your username there along with the number of work units completed. If your name isn't there yet, you probably just haven't sent back any work units yet.
Yes. Genome@home should run fine while SETI@home and/or Folding@home is running, assuming that you have enough RAM for both.
The e-mail addresses collected by Genome@home are never distributed to any other organization. We are a non-profit research group at Stanford University, and we have no commercial interests. A confirmation email will be sent to the address when you first download Genome@home, or if you start a Genome@home team. Infrequently, we may send a message about new version upgrades or exciting news about the project. If you don't wish to receive further emails, you can opt out on your user page (not yet available).
The simplest way to change your username is to uninstall, then download and reinstall Genome@home. This is also an opportunity to upgrade to our latest version, which should run more smoothly and interact with our servers better. Give the install program the desired username, and all new work units completed will be associated with that username.
To start a team, go here. To join a team, you need to enter your team's account number in the "account" dialog box that appears the first time you run Genome@home. If you're already running Genome@home, you should uninstall, download a new version, and re-install. To see how your team is doing, check here.
Only work units "labelled" with your team's account number will add to the team total. Make sure that your team's account number appears in the ghclient.cfg file on your computer(s). Any work units that you completed before joining a team will not count towards the team total.
Yes. The Genome@home server will assign each machine a unique cpu id. You can enter the same user name in the "group name" dialog box on every machine onto which you install Genome@home.
You can use anything except whitespace (space, tab, etc.). If you want a space in your user name, use an underscore ("_") instead.
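For anyone scripting installs across many machines, the naming rule above can be checked with a few lines of Python. This is only a sketch of the stated rule: the function names are made up for the example, and treating an empty name as invalid is an assumption.

```python
import re


def is_valid_username(name: str) -> bool:
    """A name is acceptable if it is non-empty and contains no whitespace."""
    return bool(name) and not any(ch.isspace() for ch in name)


def normalize_username(name: str) -> str:
    """Trim the ends, then replace each run of whitespace with one underscore."""
    return re.sub(r"\s+", "_", name.strip())
```

For example, `normalize_username("Jane Doe")` gives `Jane_Doe`, which then passes `is_valid_username`.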
Because the new client allows caching and checkpointing, the calculation of the rate at which a user processed work units becomes difficult, buggy, and slightly irrelevant. In the interests of a stable client and server, and smooth user interface for the client and website, we've stopped calculating these statistics.
The Genome@home logo is a combination of three elements. The name itself is presented in bold, colourful letters, floating slightly above the rest of the logo. The two-colour helix represents a string of DNA, the molecule which makes up genes and genomes. This element also appears in the logo of our partner project, Folding@home.
The Genome@home client software is available for download only from this web site - we do not support Genome@home software obtained elsewhere, and in fact would appreciate it if you would notify us if other people are offering the software for download. This software will upload and download data only from our data server here at Stanford. The data server doesn't download any executable code to your computer. In fact, the Genome@home client is much safer than the browser you're running to read this!
We're anticipating requests for other versions for various operating systems. Porting the client to other platforms should be easier now that we've had some practice with Folding@home.