Unread 07-13-2007, 11:40 PM   #1
TheShawnThomas
Cooling Neophyte
 
Join Date: Jul 2007
Location: N Palm Beach, FL
Posts: 9
Default 4100 problem - HDD bad?

Can someone help me with this? I'm new to dealing with crashes on Snaps. Log attached.
Snap_Server:
Model Software Hardware Server # BIOS
4100 4.0.860 (US) 2.2.1 509534 2.4.437

Possible power outage in the middle of the night, around the time I run a backup off the Snap (around midnight). When workers showed up they could not access the server. They rebooted the server by powering it off around 9:30 AM since they could not get hold of me (my lucky day off). That explains the 2 main events in the log. After messing with different things all day I took it down to switch the drives into my other Snap, but that one has the older OS:
4100 3.4.803 (US) 2.2.1 552109 2.4.437
and I see that it is not compatible with the larger drives. Is this correct? The server that is down has 4 x 160GB Seagates.
The log mentions drive 10018, which from other posts I see is drive 4, but I don't understand whether it is telling me that just drive 4 is bad or there is some other situation.

Please help ASAP! I need to try to have the server back up by Monday and if I have to get a drive that matches the others then that cuts down on my time...

Shawn Thomas
If it is something that I need to be walked through, email me and I can give someone a call back.
Thank you!!
Attached Files
File Type: txt Snap_Log.txt (8.3 KB, 7 views)
Unread 07-14-2007, 06:39 AM   #2
blue68f100
Thermophile
 
 
Join Date: Jul 2005
Location: Plano, TX
Posts: 3,135
Default Re: 4100 problem - HDD bad?

Please save the log or the results of any cmd as HTML next time. It makes it a lot easier to view. Text strips all of the formatting out.

Please do not install the drives into the other Snap. With a different OS you can overwrite the OS.

10018 is drive 4, as you determined, which appears to be having a problem during the rebuild. You should still be able to access the data if it does not go into panic mode.

I need more info, can you post the results of "co de info". I need to make sure the starting point is the same on all drives.

If you have a copy of SpinRite by GRC, remove drive 4 and run SpinRite on it. It can correct a lot of drive problems, and many users have had success with it. SpinRite is not OS dependent and will not damage the drive. If it hangs up in the 4-5% area, a new HD will need to be installed.
__________________
1 Snap 4500 - 1.0T (4 x 250gig WD2500SB RE), Raid5,
1 Snap 4500 - 1.6T (4 x 400gig Seagates), Raid5,
1 Snap 4200 - 4.0T (4 x 2gig Seagates), Raid5, Using SATA converts from Andy

Link to SnapOS FAQ's http://forums.procooling.com/vbb/showthread.php?t=13820
Unread 07-14-2007, 12:27 PM   #3
TheShawnThomas
Cooling Neophyte
 
Join Date: Jul 2007
Location: N Palm Beach, FL
Posts: 9
Default Re: 4100 problem - HDD bad?

Thank you. I hope this log is how you needed it. co de info attached.

I do have SpinRite 6. I will give it a try when I get back to the office in 2 hours.
Attached Files
File Type: txt Snap Server [Server Debug] - co de info.htm.txt (5.4 KB, 6 views)
Unread 07-14-2007, 02:58 PM   #4
TheShawnThomas
Cooling Neophyte
 
Join Date: Jul 2007
Location: N Palm Beach, FL
Posts: 9
Default Re: 4100 problem - HDD bad?

And here is the weird thing: disks 1, 2, 3 are a stripe. Disk 4 is separate. Yet both of them show offline and "Fatal error during disk check."
Attached Files
File Type: txt Snap Server [Server Log].htm.txt (58.2 KB, 5 views)
File Type: txt Snap Server [View Disk Status].htm.txt (7.2 KB, 5 views)
Unread 07-14-2007, 05:19 PM   #5
blue68f100
Thermophile
 
 
Join Date: Jul 2005
Location: Plano, TX
Posts: 3,135
Default Re: 4100 problem - HDD bad?

You need to save the files in a format that can be viewed. All you need to do is use "Save Page As" and enter a file name with no suffix.

If this is striped (RAID 0), there is no redundancy. If you lose one HD you lose all data. Your only hope at this point is SpinRite; if that does not work, a file recovery service. If SpinRite fails in the 4-5% range, you need a recovery service. Douglas at Frontline recovery service is very knowledgeable about the Snap file system. He used to work for Quantum, in the HD division, before Snap Appliance (now Adaptec) bought them. His prices are normally below the competition. He goes by Snap-tech here on the forum.
Unread 07-15-2007, 05:51 AM   #6
TheShawnThomas
Cooling Neophyte
 
Join Date: Jul 2007
Location: N Palm Beach, FL
Posts: 9
Default Re: 4100 problem - HDD bad?

I just changed the extension on the htm files since I could not upload .htm to the forum. I will reupload the logs in zip files since that seems to be allowed.

Updated info: I ran SpinRite on all 4 drives and it showed no "problems" with any when completed. So I put the drives back in the same order, and now when I plug in the server with all drives attached it cycles the fans momentarily and goes off like normal, but then does it again and again until I unplug it. I disconnected each drive one at a time, and when I disconnect drive 3 it will allow me to power on like normal. Does this mean that drive 3 is hosed? Or is there something wrong with the power supply? Motherboard? Something else?

Without drive 3 I realize my stripe is hosed. I have a backup from last weekend so we only lose 1 week of data... Now individual disk 4 still showed up as offline, so I ran "Repair all errors" and it seems to have become available again with only 6 files missing. They are all Mac .rf files so it should not be an issue. Should I assume that disk 4 is perfectly fine for use? Is this something you can tell from the log? Is there a different tool I should run besides SpinRite?

Thanks again for all your assistance.
Attached Files
File Type: zip SnapLogs2.zip (12.1 KB, 5 views)

Last edited by TheShawnThomas; 07-15-2007 at 06:10 AM. Reason: changed attachment
Unread 07-15-2007, 06:30 AM   #7
blue68f100
Thermophile
 
 
Join Date: Jul 2005
Location: Plano, TX
Posts: 3,135
Default Re: 4100 problem - HDD bad?

Logical device 2 is drive 3, the one you have out. Drive 4 shows as good in the log. I'm not sure if you are having a power supply problem or a bad drive. A weak power supply will prevent the Snap from booting due to the high inrush current. This is easy to test if you have another PS, like the one from your other unit. But since SpinRite did not have a problem, I would bet on a weak power supply. You can also use a power supply from a PC; you just need to tie the grounds to the same point (for reference) and jumper the pins to start. The 4100 does not require the top to be installed to cool properly, unlike some models.

At this point I think it's a hardware problem: a weak power supply.
Unread 07-15-2007, 04:48 PM   #8
TheShawnThomas
Cooling Neophyte
 
Join Date: Jul 2007
Location: N Palm Beach, FL
Posts: 9
Default Re: 4100 problem - HDD bad?

Sounds about right. So before taking apart both servers to swap power supplies I tried swapping power leads between drives, and the system didn't keep spinning the fans continuously, so I hooked everything up like normal and the system decided to boot... I swear I reconnected the stupid cable 3 times to make sure it was plugged in correctly. Sigh.

So anyway it boots and drive 4 still mounts and is available, but now it shows "Logical set member 2 not found", which you say is disk 3. So I'm leaning towards a bad disk, but I will still try swapping power supplies.

If it is the same with the other power supply I will order another drive that is as close to the same specs as possible. If I can get this working again I will reconfigure for RAID 5. Can I start it with 3 drives and when I get the 4th just add it to the set? Or should I wait until I get the replacement drive?

Thanks again.
Unread 07-15-2007, 05:22 PM   #9
blue68f100
Thermophile
 
 
Join Date: Jul 2005
Location: Plano, TX
Posts: 3,135
Default Re: 4100 problem - HDD bad?

You can't recover from a RAID 0 failure, so everything will be lost unless you use a recovery service. You cannot expand a RAID 5 on a Snap, but you can add a spare. You will only be using 1/2 the capacity that way, though. Depending on the age of the drives, I would be tempted to replace them all if they are over 5 yrs old. Try to get drives that are server rated for RAID; they have a min seek time. Snaps can be picky when it comes to seek times. But I think the smallest WD makes is 160gig, and the last time I bought drives the 250gig were cheaper ($71). But they were going into a 4500 with plenty of power. Installing 250gig drives in a 4100 takes a lot of extra power for spinning up the second platter, so I would stick with 160's. I did test the 250's in my 4100 and had no problem with them. If you happen to power down and a drive does not spin back up, I would cool the drive off in the freezer for an hour, in a ziploc bag to keep it dry. Sometimes that will kick start it.
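
As a rough check on the "1/2 the capacity" figure, here is a minimal sketch of usable space for the layouts discussed in this thread. It assumes four 160 GB drives and the textbook definitions of RAID 0 and RAID 5; SnapOS likely reserves some space of its own, so real numbers would come out a little lower. The function names are made up for illustration.

```python
DRIVE_GB = 160  # size of each drive in this thread


def raid0_usable(n_drives, size_gb=DRIVE_GB):
    """Stripe: every drive holds data, and no drive failure is tolerated."""
    return n_drives * size_gb


def raid5_usable(n_drives, size_gb=DRIVE_GB, spares=0):
    """RAID 5: one drive's worth of parity; hot spares hold no data."""
    active = n_drives - spares
    if active < 3:
        raise ValueError("RAID 5 needs at least 3 active drives")
    return (active - 1) * size_gb


raw_total = 4 * DRIVE_GB
print("raw total:           ", raw_total)
print("RAID 0, 4 drives:    ", raid0_usable(4))
print("RAID 5, 4 drives:    ", raid5_usable(4))
print("RAID 5, 3 + 1 spare: ", raid5_usable(4, spares=1))
```

With 3 active drives plus a spare, the usable space is 2 x 160 = 320 GB out of 640 GB raw, which is where the one-half figure comes from. A 4-drive RAID 5 with no spare would give 480 GB.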

Good luck.
Unread 07-15-2007, 06:06 PM   #10
TheShawnThomas
Cooling Neophyte
 
Join Date: Jul 2007
Location: N Palm Beach, FL
Posts: 9
Default Re: 4100 problem - HDD bad?

Okay, so I've been copying files off of drive 4 since it was still functional, but then all of a sudden it was disconnected. Then I got an email from the server saying
"FATAL ERROR PANIC : ifree: freeing free inode"
I'm trying to reconnect now but cannot. The System light on the front is blinking rapidly and the Link light stays lit if I have the network cable connected, but I cannot connect in any way.

I'm going to try to switch the power supplies now to see if that makes a difference, but if that does nothing, should I just consider this Snap hosed?
Unread 07-15-2007, 07:27 PM   #11
TheShawnThomas
Cooling Neophyte
 
Join Date: Jul 2007
Location: N Palm Beach, FL
Posts: 9
Default Re: 4100 problem - HDD bad?

What the freak is happening with this stupid machine!??!

So I swapped the power supplies (which are MPW-6150F 150W) and the first thing I notice is that disk 4 is offline again. The log for disk "Drive4 - Single disk" shows:
FSCK fatal error = 27
and
Partially allocated inode I=71452

NOW disk 3 is available again, it goes through the check and says:
Clean flag not set in superblock (Fixed)
Modified flag set in superblock (Fixed)
and the data seems to be there!
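
For anyone sifting through a saved Snap log for messages like these, here is a minimal sketch that pulls out the lines quoted in this thread. The message strings come from the posts above; the labels, the pattern list, and the assumption that each message sits on its own line are mine, and a real log will have other messages this does not catch.

```python
import re

# Patterns built only from messages quoted in this thread; labels are made up.
PATTERNS = [
    ("panic", re.compile(r"FATAL ERROR PANIC\s*:\s*(.+)")),
    ("fsck_fatal", re.compile(r"FSCK fatal error = (\d+)")),
    ("missing_member", re.compile(r"Logical set member (\d+) not found")),
    ("fixed", re.compile(r"(.+?) \(Fixed\)")),
]


def triage(log_text):
    """Return (label, detail) pairs for every line matching a known pattern."""
    hits = []
    for line in log_text.splitlines():
        for label, pattern in PATTERNS:
            match = pattern.search(line)
            if match:
                hits.append((label, match.group(1).strip()))
                break  # one label per line is enough for a quick scan
    return hits


sample = """FSCK fatal error = 27
Partially allocated inode I=71452
Clean flag not set in superblock (Fixed)
Logical set member 2 not found
FATAL ERROR PANIC : ifree: freeing free inode"""

hits = triage(sample)
for label, detail in hits:
    print(label, "->", detail)
```

Lines tagged "fixed" were repaired by the disk check, while the panic, FSCK fatal, and missing-member lines are the ones worth posting when asking for help.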

So now I'm going to try to pull off the data from the last week that is not on the backup. But what should I do now???

Is the OS messed up? something on the motherboard? These problems and subsequent miraculous recoveries are WAY too random.

Last edited by TheShawnThomas; 07-15-2007 at 09:07 PM.
Unread 07-16-2007, 02:14 AM   #12
Phoenix32
Thermophile
 
 
Join Date: May 2006
Location: Yakima, WA
Posts: 1,282
Default Re: 4100 problem - HDD bad?

Power or heat related most likely.
__________________
~
6 x Snap 4400 (SATA Converted)
2 x Snap 4500 (SATA Converted)

1 x Snap 110
5 x Snap 410
3 x Snap 520

2 x Sanbloc S50

Drives from 250GB to 2TB (PATA, SATA, and SAS)

GOS v5.2.067

All subject to change, day by day......
Unread 07-16-2007, 06:40 AM   #13
TheShawnThomas
Cooling Neophyte
 
Join Date: Jul 2007
Location: N Palm Beach, FL
Posts: 9
Default Re: 4100 problem - HDD bad?

Well, I thank you both for your input. As it stands now, I was able to get the last week's files off. The stripe, drives 1-3, is still up. I got the go-ahead to order a new server, so I'm not messing with it until that comes in and I can move everything over. Hopefully it lasts that long!

At that point I can frankenstein it and wipe everything and stuff.
Thanks again.
Unread 07-16-2007, 02:34 PM   #14
blue68f100
Thermophile
 
 
Join Date: Jul 2005
Location: Plano, TX
Posts: 3,135
Default Re: 4100 problem - HDD bad?

When it's in panic mode, you need to do a reset to factory defaults to clear it if a reboot does not correct the problem. Then on boot you will get a message in the log saying which drive failed, which you already know.

The 4100s are normally pretty stable, unless you have a drive or CPU fan that is flaky.

Apparently you have the drives spanned and not striped (RAID 0), from what I have gathered.

I think you have a heat problem like Phoenix said. You can fix the CPU fan so it runs 24/7; that way it won't be running hot. Normally it is controlled by the firmware. But my tests show that it reaches 140-150F at times before the fan starts, which is too high for my taste. The fan shuts off in the 110F range.
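
The fan behavior described here (starts hot, stops much cooler) is a hysteresis loop. A minimal sketch of that idea follows, assuming a turn-on point of 145F (the middle of the 140-150F range above) and a turn-off point of 110F. These thresholds and the function names are my assumptions for illustration; the actual firmware logic is not documented in this thread.

```python
FAN_ON_F = 145.0   # assumed: midpoint of the 140-150F range quoted above
FAN_OFF_F = 110.0  # assumed: the "110F range" quoted above


def f_to_c(temp_f):
    """Convert Fahrenheit to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def next_fan_state(fan_on, temp_f):
    """Hysteresis: switch on when hot, off when cool, otherwise hold state."""
    if temp_f >= FAN_ON_F:
        return True
    if temp_f <= FAN_OFF_F:
        return False
    return fan_on


readings_f = [100, 130, 146, 130, 115, 109, 120]
state = False
history = []
for temp in readings_f:
    state = next_fan_state(state, temp)
    history.append(state)

print(history)
print(round(f_to_c(FAN_ON_F), 1), "C on /", round(f_to_c(FAN_OFF_F), 1), "C off")
```

Between the two thresholds the fan simply keeps whatever state it had, which is why the CPU can sit well above 110F for a long time with the fan off. Wiring the fan to run 24/7 removes that window.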

The Snap 4500s are good, but the newer models use SATA. But beware of Adaptec. I read a bulletin that the latest models were going to require the client to buy drives from them. Apparently clients are buying lower end models then replacing the drives with higher capacity ones. But think about who wants to pay Adaptec $600 for a 250gig HD that can be purchased for <$100. Not me.

I & others have moved up to the Guardian OS units, which are more robust and Vista compatible with the v4.1.061 Guardian OS, but you need v4.2 for the new DST fix. These have a bandwidth of 300MB/sec with dual gigE ports.
Unread 07-16-2007, 07:33 PM   #15
TheShawnThomas
Cooling Neophyte
 
Join Date: Jul 2007
Location: N Palm Beach, FL
Posts: 9
Default Re: 4100 problem - HDD bad?

Quote:
Originally Posted by blue68f100
Apparently you have the drives spanned and not striped (raid0). From what I have gathered.
According to the Configure Disks screen it says that "Drive4" = Individual and "Stripe" = Stripe. So I assume it actually is, especially when disk 3 was not available, the whole Logical drive was offline. My understanding of a span is that if disk 3 died then the data on 1 & 2 would still be accessible. Of course that could be different with proprietary RAIDs...
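
To illustrate the stripe vs. span difference being discussed, here is a toy sketch of which disk holds each block of a file under the two layouts. It uses the generic textbook definitions with a one-block stripe unit and a tiny made-up disk size; how SnapOS actually lays out data is not something this thread confirms.

```python
def stripe_disk(block, n_disks):
    """RAID 0: consecutive blocks rotate across the disks."""
    return block % n_disks


def span_disk(block, blocks_per_disk):
    """Span: fill the first disk completely, then the next, and so on."""
    return block // blocks_per_disk


N_DISKS = 3          # disks 1-3 in this thread, numbered 0-2 here
BLOCKS_PER_DISK = 4  # toy size so the example stays readable
file_blocks = range(6)  # one file occupying logical blocks 0..5

stripe_layout = [stripe_disk(b, N_DISKS) for b in file_blocks]
span_layout = [span_disk(b, BLOCKS_PER_DISK) for b in file_blocks]

print("stripe:", stripe_layout)
print("span:  ", span_layout)
print("file touches last disk (stripe):", 2 in stripe_layout)
print("file touches last disk (span):  ", 2 in span_layout)
```

In the stripe, even a small file has pieces on every disk, so losing any one disk damages nearly everything. In the span, this file never touches the last disk, although whether a span stays mountable with a member missing depends on where the file system keeps its metadata.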

Quote:
Originally Posted by blue68f100
I think you have a heat problem like Phoenix said. You can fix the CPU fan so it runs 24/7; that way it won't be running hot. Normally it is controlled by the firmware. But my tests show that it reaches 140-150F at times before the fan starts, which is too high for my taste. The fan shuts off in the 110F range.
WOW! That is horrible heat control. I think I will leave it going. So I switch it to J12 on the other side of the RAM? Can I hot swap that? I really don't want to power down until I get the replacement server.

Quote:
Originally Posted by blue68f100
But beware of Adaptec. I read a bulletin that the latest models were going to require the client to buy drives from them.
Yeah, I read that in another post. Greedy, greedy...

BTW, if I decided to purchase a replacement power supply for this, any pointers on where and what model to get? Another poster said that the one adaptec is selling has a power switch on the back, which makes it not fit into the case properly. Maybe even one with a little more power. And noise is not a concern since it sits in the closet. The only one that hears it is me.

Last edited by TheShawnThomas; 07-16-2007 at 07:40 PM.
Unread 07-19-2007, 06:05 AM   #16
blue68f100
Thermophile
 
 
Join Date: Jul 2005
Location: Plano, TX
Posts: 3,135
Default Re: 4100 problem - HDD bad?

Quote:
WOW! That is horrible heat control. I think I will leave it going. So I switch it to J12 on the other side of the RAM? Can I hot swap that? I really don't want to power down until I get the replacement server.
Now you know why users run the fans 24/7. The 1100 is worse; it's at a higher level. The v1 2000 runs 24/7; the v2 does not.

Yes, you can hot swap the fan, but do it when the fan cycles off. If the fan is running continuously, which I doubt, use some compressed air to help cool it off while you switch it.
All times are GMT -5.


Powered by vBulletin® Version 3.7.4
Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
(C) 2005 ProCooling.com
If we in some way offend you, insult you or your people, screw your mom, beat up your dad, or poop on your porch... we're sorry... we were probably really drunk...
Oh and dont steal our content bitches! Don't give us a reason to pee in your open car window this summer...