Message boards : Number crunching : Discussion on increasing the default run time
Author | Message |
---|---|
Matthew Lei Send message Joined: 5 Jun 06 Posts: 4 Credit: 258,058 RAC: 0 |
|
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Hi. The sooner they bring this in the better, I say: fewer people hammering the servers. pete. |
6dj72cn8 Send message Joined: 18 Apr 06 Posts: 5 Credit: 207,684 RAC: 0 |
My preference is for a minimum of two hours. |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Harry, could you please talk a bit about WHY that is your preference. What is it about how you use your machine that makes this better for you? Rosetta Moderator: Mod.Sense |
6dj72cn8 Send message Joined: 18 Apr 06 Posts: 5 Credit: 207,684 RAC: 0 |
After some thought I am unable to justify my preference in a fashion likely to be helpful or meaningful to the project. I therefore withdraw my previous comment and ask you to ignore it. |
LizzieBarry Send message Joined: 25 Feb 08 Posts: 76 Credit: 201,862 RAC: 0 |
It may be that some users choose a short run-time to match the time they typically spend on a computer before shutting down, so that a WU can complete without keeping the computer on longer than wished. While checkpointing has been infrequent, this may be a way of preventing the same WU from restarting over and over. But if checkpointing is to be made more frequent with the new Mini Rosetta version being tested soon, that part of the problem may disappear. Going back through this thread, the idea of raising the minimum from 1 to 2 hours and the default from 3 to 4 hours as a first interim change makes sense. It is not as drastic as doubling everything at once, and it can be assessed for unexpected results before the default goes from 4 to 5 hours, maybe a month later, then to 6 hours for a while before the minimum changes from 2 to 3 hours. Each change would help the server load, just in smaller steps. |
Virtual Boss* Send message Joined: 10 May 08 Posts: 35 Credit: 713,981 RAC: 0 |
For those who may be interested in the effect of runtime changes, below is a list showing credit vs down/up traffic for the 3 weeks before and after the runtime change. Week: Credit / Down MB / Up MB. 26Oct08-01Nov08: 2079 / 150.1 / 6.0; 02Nov08-08Nov08: 2519 / 235.9 / 8.7; 09Nov08-15Nov08: 3222 / 211.2 / 29.1; 16Nov08-22Nov08: 3894 / 118.2 / 6.4; 23Nov08-29Nov08: 4348 / 120.0 / 12.3; 30Nov08-06Dec08: 2839 / 117.3 / 4.9. After this thread started I changed my runtime preferences. Before 15Nov all my hosts were on the default 3hrs. On 15Nov I changed runtimes as follows: 10hrs - 1 host (~80% of RAC); 6hrs - 2 hosts (~11% of RAC); 4hrs - 3 hosts (~9% of RAC). The list shows the obvious drop in internet traffic and increased credit output due to changing the runtime (which was the only change made). [EDIT] The increase in credit shown here is more likely due to variation in project crunching ratio - but overall shows ~10-15% increase since change. [/EDIT] Bruce |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Yes virtual, I wouldn't "sell it" as a RAC improvement. It should be basically unmeasurable. And, as you can imagine, if a new protein comes in to study during your test, then you had an extra couple of 2-3MB files to download. It really varies. So, it isn't even really intended to be much of a bandwidth saving. It really boils down to the number of hits on the scheduler. After that, the specific file transfers are not the main focus of changing these values. You can also reduce file transfer bandwidth (and scheduler hits) if you keep more days of work on your machine. If you set it to connect about every 3 days, for example, rather than the 0.1 days default, that can make a nice reduction in hits on the servers. Not that setting it to 0.1 days means you actually always do 10 hits per day, but the higher value will reduce the number of requests. Rosetta Moderator: Mod.Sense |
Virtual Boss* Send message Joined: 10 May 08 Posts: 35 Credit: 713,981 RAC: 0 |
[quote]Yes virtual, I wouldn't "sell it" as a RAC improvement. It should be basically unmeasurable. And, as you can imagine, if a new protein comes in to study during your test, then you had an extra couple of 2-3MB files to download. It really varies. So, it isn't even really intended to be much of a bandwidth saving. It really boils down to the number of hits on the scheduler. After that, the specific file transfers are not the main focus of changing these values.[/quote] Hi Mod.Sense. I agree that there are a large number of variables, but they do tend to average out over the longer term. Below are the figures for 2 months pre/post, which still show a significant decrease in traffic. This is despite the fact that during the post period I noticed more new proteins, several series which repeatedly 'crashed out' on my hosts, and problems with credit generation, all of which would shrink the traffic reduction I have seen. These figures indicate roughly a 35% increase in credit per MB of download (28.57 vs 21.21). Date range: Credit / Down MB / Ratio (credit per MB). 16Sep08-14Nov08: 25373 / 1196.2 / 21.21; 15Nov08-13Jan09: 30202 / 1057.2 / 28.57. Simple maths will tell you that for a particular protein, if you double your runtime then you will roughly double the number of models completed, thereby roughly doubling your credit per MB downloaded. If you still crunch for the same number of hours per day, this means your traffic is roughly halved. I believe in the longer term my stats will approach that figure. I was also wondering where the total credit increase came from, and suspect it may partially be due to less CPU time 'wasted' by 1 - network traffic and 2 - loading and initialising the work unit before it can start actually crunching any useful data. I guess more time will give more accurate findings. And yes - my overall server hits have reduced considerably (maybe by 30-40%, guesstimate). Bruce |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
I was referring to RAC. You are using a credit per MB of bandwidth measure. So, yes, I'd expect that the more factors you can combine that reduce MB of download, the better your credit per MB will be. But it will not make a material difference in credit per day of crunching, which is what RAC amounts to. I'm curious too, how did you measure your bandwidth? Are you using a proxy server that recorded that? I'm not questioning your figures. Just looking for more ways to measure it myself :) Rosetta Moderator: Mod.Sense |
Virtual Boss* Send message Joined: 10 May 08 Posts: 35 Credit: 713,981 RAC: 0 |
[quote]I was referring to RAC. You are using a credit per MB of BW system. So, yes, I'd expect that the more factors you can play together that reduce MB of download will improve your credit per MB. But, it will not make a material difference in credit per day of crunching, which is what RAC amounts to.[/quote] I am using a commercial program called BWMeter, primarily to control bandwidth allocations to each host on my network and stop any host 'hogging' the internet. It also has quite good statistics, among many other features. |
]{LiK`RangerS` Send message Joined: 27 Oct 08 Posts: 39 Credit: 6,552,652 RAC: 0 |
I'm going quad :D |
Feet1st Send message Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
So, what was the decision on increasing the minimum and default runtime? Did you decide to upgrade the DB server instead? Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ |
mike46360 Send message Joined: 21 May 07 Posts: 10 Credit: 18,011 RAC: 0 |
[quote]We are planning to increase the default run time from 3 hours to 6 hours and the minimum from 1 to 3 hours to reduce the load on our servers.[/quote] I increased the run time from 3 hours to 6 hours last night. Does this help the folding at all or is it just to ease the pain on the servers? |
ByRad Send message Joined: 12 Apr 08 Posts: 8 Credit: 15,846,101 RAC: 439 |
But there will also be a problem... Just as a test, I changed my runtime from the default (3 hours) to the maximum value of 24 hours for a couple of days. The effect was that only 4 of 14 tasks finished without errors (I tried 2 days on WinXP x86 and 2 days on Win7 64-bit, so it doesn't depend on the version of Rosetta; I mean x64 / x86, not v.1.74). In that period I was running my PC all the time (24h a day), restarting it once or twice a day. So increasing the runtime will also reduce the number of Work Units finishing properly. Because of this I think it would be a really nice idea if the result of every finished WU (valid or erroneous) were sent to the server, because the error can occur in the first model but also after 100 models have finished properly. And if some information about the error were sent as well, it would give the developers some debug information (without a huge increase in traffic). |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
mike, it just keeps your machine busy with less overhead on the project servers. ByRad, the Rosetta application does send partial successes. If you complete 50 models and then number 51 fails, the task is reported back and should show as a success. It also sends some information back to help diagnose what caused the problems with the 51st model. So, the system may not always work perfectly, but the suggestions you have made are already in the code. Rosetta Moderator: Mod.Sense |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Hi. I see that nothing has been done about this. Putting the default to at least four hours might help with the type of server problems you're having at the moment. I.M.H.O. |
Warped Send message Joined: 15 Jan 06 Posts: 48 Credit: 1,788,185 RAC: 0 |
I live in a bandwidth-impoverished part of the world, with high prices and low speed. Consequently, I have selected 16 hours run time. However, I find this thread as well as the others discussing long-running models to be of little interest when I have work units running for about 4 hours. Is the preferred run time really applied? Warped |
dcdc Send message Joined: 3 Nov 05 Posts: 1831 Credit: 119,500,523 RAC: 11,167 |
I'd happily change my run-time prefs so that computers that are on a lot have a high run-time and the others have a low run-time, but I find this really difficult as they're tied to the BOINC work/home/school settings (which I think are poor, but not the project's fault ;) ). I also use BAM but that doesn't allow changes to the run-time, so I'm left with the default. Being able to select a run-time preference per machine would be useful, but probably only for a minority I guess... (just noticed the project haven't posted on this for a while!) |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
[quote]I live in a bandwidth-impoverished part of the world, with high prices and low speed. Consequently, I have selected 16 hours run time.[/quote] I have noticed that on my faster machine, the limit of 99 decoys is usually reached before the 12-hour expected runtime I've requested. You might want to check the report visible on the Rosetta@home site of how well the workunit succeeded, to see if your workunits also often stop at the 99 decoys limit instead of near the requested run time. |
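
For anyone wanting to recheck the back-of-the-envelope figures discussed in this thread, here is a minimal Python sketch. It recomputes the credit-per-MB ratios from the two-month figures Virtual Boss* posted, the relative change between them (which comes out at roughly a third), and the simple scaling arguments about run time and connect interval. The function names are hypothetical, and the two scaling functions rest on simplifying assumptions that are not confirmed by the project: that download size per work unit is fixed regardless of run time, and that a host contacts the scheduler at most once per connect interval.

```python
# Rough checks of figures quoted in the thread. All function names are
# hypothetical; the input numbers come from the posts by Virtual Boss*.


def credit_per_mb(credit: float, download_mb: float) -> float:
    """Credit earned per MB downloaded over some period."""
    return credit / download_mb


def relative_change(before: float, after: float) -> float:
    """Fractional change from `before` to `after` (0.25 means +25%)."""
    return (after - before) / before


def traffic_scale(old_runtime_hours: float, new_runtime_hours: float) -> float:
    """Expected download traffic relative to the old setting.

    Assumption (not confirmed by the project): each work unit costs the
    same download regardless of run time, and the host crunches the same
    number of hours per day, so traffic scales inversely with run time.
    """
    return old_runtime_hours / new_runtime_hours


def max_contacts_per_day(connect_interval_days: float) -> float:
    """Upper bound on scheduler contacts per day, assuming at most one
    contact per connect interval. Real client behaviour may differ."""
    return 1.0 / connect_interval_days


if __name__ == "__main__":
    before = credit_per_mb(25373, 1196.2)   # 16Sep08-14Nov08
    after = credit_per_mb(30202, 1057.2)    # 15Nov08-13Jan09
    print(f"credit per MB before: {before:.2f}")
    print(f"credit per MB after:  {after:.2f}")
    print(f"relative change:      {relative_change(before, after):.1%}")
    print(f"traffic at 6h vs 3h:  {traffic_scale(3, 6):.2f}x")
    print(f"max contacts/day at 0.1 days: {max_contacts_per_day(0.1):.1f}")
    print(f"max contacts/day at 3 days:   {max_contacts_per_day(3):.2f}")
```

The measured change is well short of what `traffic_scale` would suggest for hosts moved to 6 or 10 hours, which fits the point made above that new proteins and failed series add downloads the simple model ignores.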
©2024 University of Washington
https://www.bakerlab.org