05-19-2008, 05:28 AM   #4
blue68f100
Thermophile
Join Date: Jul 2005
Location: Plano, TX
Posts: 3,135
Re: SnapServer 110 reliabilty?

Continued corruption can be caused by a couple of things. Hardware is the most common. If you have a copy of SpinRite, run it in maintenance mode and let it check the whole HD. Andy and I do this with all HDs. It will locate and correct any problems found. Since HD manufacturers no longer check the media themselves, this is a good thing to run. All manufacturers rely on SMART to repair on the fly, and that can cause timing issues with RAIDs. SpinRite will check and update ALL of the table info, besides moving data off bad areas. All of the WD RE drives I have checked have been super clean.

But don't void the warranty. Ever since Adaptec bought SnapAppliance, I have gotten several older refurb models that were bad. I would see if the warranty is still good and opt for a replacement. The symptoms indicate that you may have a bad MB. BUT since you have done this and still have the same problem, that's not likely.

How many users do you have, and how much RAM is in the unit?

You may want to try a routine reboot every week and see if the problem goes away. If so, I think you may be short on RAM. The GuardianOS needs a min of 512 MB to run, with the optimum in the 1-2 GB range. My 4500 was upgraded from 512 MB to 1.5 GB. Andy runs 2 GB in his units. We had a user a while back who discovered his FTP clients were not being released, so the users just kept multiplying till it hit a limit. A routine reboot will clear out the cache. I have not been in any of the newer units, but I think the RAM is upgradable, ECC most likely.

I do not know if this applies, but are you running on a UPS? And do you have the unit set to auto restart on power restore? If you are not on a UPS, I would suggest using one. Dirty power can cause a lot of problems. I would recommend an APC Smart-UPS over the others. It has the capability to trim and record all power problems. 1 unit with the network card can remotely shut down 20 devices.

Are you allowing root access to all of your users (the default)? Users with root access can browse the system files if they are running Linux workstations.

The GuardianOS does a filesystem check on startup. The logs will confirm this. Do you have the 110 set to send SNMP traps? If so, you may see where the problem is occurring, like path names that are too long. As far as reliability goes, the older unit I have has been rock solid, and Andy's units have been too. Andy is a hardware tech and repairs units. I think most of the problems he finds are RAM related. But he would have to answer this, since there are so many things that can cause problems.
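
If you suspect long path names, you can check for them from a workstation with the share mounted. This is only a rough sketch, not anything that comes from the Snap itself: the function name is made up for illustration, and the 255-character default is my assumption, so set it to whatever limit your logs or clients actually complain about.

```python
import os

def find_long_paths(root, max_len=255):
    """Return (length, relative_path) pairs for entries under root whose
    path, measured relative to root, is longer than max_len characters.
    max_len=255 is an assumed threshold, not a documented GuardianOS limit."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Check directories as well as files, since either can be too long.
        for name in dirnames + filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            if len(rel) > max_len:
                hits.append((len(rel), rel))
    # Longest paths first, so the worst offenders are at the top.
    return sorted(hits, reverse=True)
```

Point it at wherever the share is mounted (for example a hypothetical /mnt/snap110) and print the result. Anything it lists is a candidate to rename or move up a level before the next filesystem check.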

I do know of one issue with the AV, related to restarts. Let's just say it has issues with system reboots and the AV startup.

Are you using SnapShots?
__________________
1 Snap 4500 - 1.0T (4 x 250gig WD2500SB RE), Raid5,
1 Snap 4500 - 1.6T (4 x 400gig Seagates), Raid5,
1 Snap 4200 - 4.0T (4 x 2gig Seagates), Raid5, Using SATA converts from Andy

Link to SnapOS FAQ's http://forums.procooling.com/vbb/showthread.php?t=13820