Problems and Technical Issues with Rosetta@home

Sid Celery

Joined: 11 Feb 08
Posts: 2373
Credit: 45,041,428
RAC: 25,234
Message 112763 - Posted: 5 Jun 2025, 2:04:29 UTC - in response to Message 112760.  

But there are now six Rosetta tasks stuck in Ready to Report. Two of them have been there since Sunday and four all day Monday.

Requesting updates provokes the message that "Communication is deferred" for two hours or more. Their deadline is the next day.

This is all very odd. It seems like your PC can't connect to the bakerlab servers again.
I presume you've tried a reboot over the last couple of days? It might be worth a try.
Someone who knows more about servers than me (almost everyone) might be able to suggest something. I'm useless at this kind of thing tbh. Sorry.

Six Rosetta tasks still stuck in Ready to Report, four days now... Update always results in "communication deferred"

I've rebooted several times. Other projects are running fine.

I'm thinking of re-setting the project. Maybe that will jar the system into action.

I'm not inclined to think a project reset is going to make the bakerlab server any more reachable.
Can you confirm the few lines in your event log that result in communication being deferred? I'm assuming the server isn't reachable, but just to be sure.
And re-check your hosts file (without any extension - make doubly sure) is still as it should be.

I just don't get how you were able to contact the server to grab and return a few dozen tasks, then it becomes unreachable without something changing in between.
Anyone else with any ideas, pipe up.
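For the hosts file re-check suggested above, a rough Python sketch like the one below could list any entries that mention the project's domain. The file locations are the usual defaults (an assumption, so adjust for your system), and `hosts_path` and `entries_for` are names made up for this illustration. It only prints what it finds; whether an entry is "as it should be" is still for you to judge.

```python
import os
import platform
from pathlib import Path


def hosts_path(system=None):
    """Return the usual hosts file location (assumed defaults, no extension)."""
    system = system or platform.system()
    if system == "Windows":
        root = os.environ.get("SystemRoot", r"C:\Windows")
        return Path(root) / "System32" / "drivers" / "etc" / "hosts"
    return Path("/etc/hosts")


def entries_for(hosts_text, domain):
    """Return (ip, names) pairs whose names equal or end with the given domain."""
    domain = domain.lower()
    matches = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if any(n.lower() == domain or n.lower().endswith("." + domain) for n in names):
            matches.append((ip, names))
    return matches


if __name__ == "__main__":
    path = hosts_path()
    try:
        text = path.read_text(encoding="utf-8", errors="replace")
    except OSError as exc:
        print(f"Could not read {path}: {exc}")
    else:
        found = entries_for(text, "bakerlab.org")
        if not found:
            print(f"No bakerlab.org entries in {path}")
        for ip, names in found:
            print(ip, " ".join(names))
```

If nothing is printed for the domain, the hosts file is not overriding name resolution for it.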
ID: 112763 · Rating: 0
Bill Swisher

Joined: 10 Jun 13
Posts: 66
Credit: 54,655,306
RAC: 144,131
Message 112764 - Posted: 5 Jun 2025, 2:30:09 UTC - in response to Message 112763.  

Anyone else with any ideas, pipe up.

ping -c5 bakerlab.org
ID: 112764 · Rating: 0
Bill Swisher

Joined: 10 Jun 13
Posts: 66
Credit: 54,655,306
RAC: 144,131
Message 112765 - Posted: 5 Jun 2025, 5:05:25 UTC - in response to Message 112764.  

The "-c5" depends on the flavor of your OS.
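To spell that out: Linux, macOS and the BSDs take the packet count with `-c`, while the Windows `ping` uses `-n`. A minimal Python sketch that picks the flag for whatever OS it runs on might look like this (`ping_command` is just a name invented for the example, and it only builds the command, it doesn't run it):

```python
import platform


def ping_command(host, count=5, system=None):
    """Build a ping command with the count flag that fits the OS."""
    system = system or platform.system()
    # Windows ping uses -n for the count; Unix-like systems use -c.
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]


# Print the command suited to this machine.
print(" ".join(ping_command("bakerlab.org")))
```

The returned list can be passed straight to `subprocess.run` if you'd rather have the script run the ping itself.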
ID: 112765 · Rating: 0
William Albert

Joined: 22 Mar 20
Posts: 28
Credit: 2,111,050
RAC: 5,469
Message 112766 - Posted: 5 Jun 2025, 18:31:00 UTC - in response to Message 112762.  

Your AMD Ryzen 7 5800X machine has so many error'd/invalid WUs that I would strongly suspect failing hardware.

It is. It's a repeated disk failure that's kind of described here
I don't trust myself fixing this, so I'm waiting for my hardware guy to get back.
He's out of the country atm - and my fingers are permanently crossed that it doesn't become irretrievable before he returns.


If you have hardware that you know to be unstable, I would suggest not using it for crunching WUs.

Not only is it a waste of electricity when WUs error out or are invalid, but since Rosetta@Home doesn't validate results with wingmen, the WUs that don't outright fail can still contain bogus results.
ID: 112766 · Rating: 0
Stevie G

Joined: 15 Dec 18
Posts: 127
Credit: 1,028,210
RAC: 1,053
Message 112767 - Posted: 5 Jun 2025, 20:15:12 UTC - in response to Message 112759.  
Last modified: 5 Jun 2025, 20:25:58 UTC

Two more Validate errors tonight, meaning 2x12hr tasks not being awarded credit.
Another unheard appeal for the daily job that cleans this up to be reinstated.

Probably caused by some disk errors I'm getting locally, but annoying nonetheless :(

More disk errors, 5 more Validation errors (likely more to come). All lost credits again.
I'm going to have to do something about this...

More did come - 8 in all. A temporary fix is in, but it'll return until I can clone onto a new drive :(

Another 8 validation errors and 4 compute errors on top <sigh>

Up to 22 now. So demoralising...


If you knowledgeable people are having so much trouble with Rosetta, I don't feel so stupid when I can't get it to work. That's why I said this project was too unreliable and quirky for a person with my limited abilities.

One would think the project administrators and researchers might have a concern about the problems their crunchers encounter with their project. No? They might get better results and happier volunteers if these issues were resolved.

But nobody cares if you are demoralized?

S. Gaber
Oldsmar, FL
ID: 112767 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2373
Credit: 45,041,428
RAC: 25,234
Message 112768 - Posted: 6 Jun 2025, 3:00:50 UTC - in response to Message 112766.  

Your AMD Ryzen 7 5800X machine has so many error'd/invalid WUs that I would strongly suspect failing hardware.

It is. It's a repeated disk failure that's kind of described here
I don't trust myself fixing this, so I'm waiting for my hardware guy to get back.
He's out of the country atm - and my fingers are permanently crossed that it doesn't become irretrievable before he returns.

If you have hardware that you know to be unstable, I would suggest not using it for crunching WUs.

Not only is it a waste of electricity when WUs error out or are invalid, but since Rosetta@Home doesn't validate results with wingmen, the WUs that don't outright fail can still contain bogus results.

I know you're right
I've finally got round to whatsapp'ing my guy to see when he's free
It's only a 1TB HDD so it shouldn't take too long once he can get to it
ID: 112768 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2373
Credit: 45,041,428
RAC: 25,234
Message 112769 - Posted: 6 Jun 2025, 3:15:42 UTC - in response to Message 112767.  

Two more Validate errors tonight, meaning 2x12hr tasks not being awarded credit.
Another unheard appeal for the daily job that cleans this up to be reinstated.

Probably caused by some disk errors I'm getting locally, but annoying nonetheless :(

More disk errors, 5 more Validation errors (likely more to come). All lost credits again.
I'm going to have to do something about this...

More did come - 8 in all. A temporary fix is in, but it'll return until I can clone onto a new drive :(

Another 8 validation errors and 4 compute errors on top <sigh>

Up to 22 now. So demoralising...

If you knowledgeable people are having so much trouble with Rosetta, I don't feel so stupid when I can't get it to work. That's why I said this project was too unreliable and quirky for a person with my limited abilities.

One would think the project administrators and researchers might have a concern about the problems their crunchers encounter with their project. No? They might get better results and happier volunteers if these issues were resolved.

To be fair, the cause of the problem is that I've moved this hard drive across 2 or even 3 machines, and it's probably a dozen years old by now. It's been thrashed by Rosetta 24/7/365 and is finally coming to the end of its life.
It's entirely my own laziness that's causing my issues to persist for so long.
And my appeal for the project to reinstate their cleanup job is just me asking them to immunise me from the consequences of my own inaction.
I have plenty of things I'm dumb as a rock over and the benefit of these forums is there's seemingly always someone who knows their way around every issue, and is generous enough to offer the benefit of their expertise, however baffling things appear.
ID: 112769 · Rating: 0



©2025 University of Washington
https://www.bakerlab.org