Snap Server / NAS / Storage Technical Goodies The Home for Snap Server Hacking, Storage and NAS info. And NAS / Snap Classifieds
#1 |
Cooling Neophyte
Join Date: May 2006
Location: Chicago, IL
Posts: 14
|
hello -
I have a Dell 705N, which is actually a 4100. One hard drive went out, indicated by a flashing light. I tried reformatting and then rebuilding the RAID 5, and that worked fine for a while, but then the drive apparently went out again. This prompted me to find a replacement drive. I purchased an exact replacement, the Quantum Fireball 60GB, installed it, and formatted it, but the formatted size is not large enough to allow the RAID to rebuild. I did try the /nocore format, and that did not appear to change the formatted size.

So I then took that drive out of the Snap, put it into a standard PC, and ran Maxtor's disk formatting utility to do a low-level format. My thinking was that some bad blocks were keeping the drive from formatting to its full capacity. That did not work either; nothing changed. I thought, OK, there must be something wrong with this drive. I found a second replacement drive, put it into the Snap, executed the /nocore format, and this second replacement drive formats to the exact same size as the first one (again, too small to use to rebuild the RAID).

If I run an info-device, it tells me the formatted size of the 3 original drives to be 6023884, and BOTH of the replacement drives formatted to exactly 58633216. I cannot see any difference in the physical drives themselves. Could anyone please help me figure out what to do? I would greatly appreciate it!
#2 |
Cooling Savant
Join Date: Apr 2006
Location: Tennessee
Posts: 157
|
If it's not formatting to the same capacity then it might not be an exact replacement. Over time manufacturers change the internals of their drives, so even if a drive has the same name and size on the label, it might be very different on the inside.
When you do an "info log t" in the command line, what model and firmware revision does it report for all the drives? For example, I had four Western Digital drives in my 705N, all WD1200JB, and when one of them went out I bought a new one as a replacement. The label has the same model and drive size, but when I check the info in the log it reports:
Code:
10/07/2006 15:13:49 45 D SYS | Intf: 0, dev: 0: Model: WDC WD1200JB-00REA0
10/07/2006 15:13:49 45 D SYS | Firmware Rev: 20.00K20 Serial #: WD-WMANN1132794
10/07/2006 15:13:49 45 D SYS | Intf: 1, dev: 0: Model: WDC WD1200JB-75CRA0
10/07/2006 15:13:49 45 D SYS | Firmware Rev: 16.06V16 Serial #: WD-WMA8C1305039
10/07/2006 15:13:49 45 D SYS | Intf: 2, dev: 0: Model: WDC WD1200JB-75CRA0
10/07/2006 15:13:49 45 D SYS | Firmware Rev: 16.06V16 Serial #: WD-WMA8C1309021
10/07/2006 15:13:49 45 D SYS | Intf: 3, dev: 0: Model: WDC WD1200JB-75CRA0
10/07/2006 15:13:49 45 D SYS | Firmware Rev: 16.06V16 Serial #: WD-WMA8C1321357
All I can suggest is that you may want to get a drive with the next step up in size, like maybe an 80GB, and see if that works.
#3 |
Thermophile
Join Date: Jul 2005
Location: Plano, TX
Posts: 3,135
|
You might see if swapping the controller board (or firmware chip) corrects the problem. I use this technique to recover data from bad HDs. Some manufacturers have utilities that allow you to change the parameters. You may need to contact Quantum, which is owned by who now???
You can also install a larger drive, and the Snap should adjust the size down to what it needs, as suggested by rpmurray.
__________________
1 Snap 4500 - 1.0T (4 x 250gig WD2500SB RE), Raid5, 1 Snap 4500 - 1.6T (4 x 400gig Seagates), Raid5, 1 Snap 4200 - 4.0T (4 x 2gig Seagates), Raid5, Using SATA converts from Andy Link to SnapOS FAQ's http://forums.procooling.com/vbb/showthread.php?t=13820 |
#4 | |
Cooling Savant
Join Date: Aug 2004
Location: UK
Posts: 909
|
Quote:
Seagate have recently bought Maxtor....
__________________
Snap Server Help Wiki - http://wiki.procooling.com/index.php/Snap_Server Snap Server 2200 v3.4.807 2x 250GB Seagate Barracuda 7200.9 w/ UNIDFC601512M Replacement Fan "Did you really think it would be that easy??" Other NAS's 1x NSLU2 w/ 512mb Corsair Flash Voyager Running Unslung 6.8b 1x NSLU2 w/ 8Gb LaCie Carte Orange Running Debian/NSLU2 Stable 4.0r0 250GB LaCie Ethernet Disk Running Windows XP Embedded |
|
#5 |
Thermophile
Join Date: May 2006
Location: Yakima, WA
Posts: 1,282
|
You might try a larger drive (as long as it is not drive 0)... Back up the data, then put the new 60 GB drive in the drive 0 slot, wipe them all, and build new.
|
#6 | |
Cooling Savant
Join Date: Apr 2006
Location: Tennessee
Posts: 157
|
Quote:
|
|
#7 |
Thermophile
Join Date: Jul 2005
Location: Plano, TX
Posts: 3,135
|
Drive 0 is the main boot drive. Most OSes handle the boot drive a little differently.
Snaps determine or calculate the size of a RAID based on drive 0. Some users may have run into this when they upgraded drives: the Snap kept indicating the original capacity, not that of the new drives. I don't think the Snap will accept a different size for drive 0.
#8 |
Thermophile
Join Date: May 2006
Location: Yakima, WA
Posts: 1,282
|
My mistake, I was not very clear in my intentions in that last post, let me try again here (I was trying not to write a book).
Let's look back a moment. His problem is a bad drive within a RAID 5 array, and he wants to recover his data. The replacement hard drives he is trying to use are x number of sectors too small for it to rebuild. The key here is he does not want to lose his data. With me so far?

Now, in typical RAID 5 arrays, the size is based on the first drive in the array, drive 0. Usually, no matter the size of the drives after drive 0, only the size of drive 0 will be used on the other drives. As an example, if you have a RAID 5 array with 3 x 60 GB and 1 x 80 GB drives (the 80 not being drive 0), you will end up with a RAID 5 array that is 4 x 60 GB, with only 60 GB of the 80 being used and the remainder unavailable. This is where I got my "as long as it is not drive 0" point. Meaning, as long as his bad drive is not drive 0, there may be an option here, as I will explain.

The solution I am offering here is not perfect by any means. It is a pain in the arse and there is a small chance it won't even work, but it should recover his data for him and get his system back up and running in the end. It is the best I can offer short of what a few others have said about acquiring a drive to match the old drives, which could be a difficult and expensive adventure.

Step 1 - As long as the bad drive is NOT drive 0 (explained above), replace it with a larger drive you may have lying around (more than 60 GB in this case). Let the system format the drive.
Step 2 - Attempt to add the new larger drive to the RAID 5 array. If it adds it in, great, let it finish rebuilding the array. If not, then this solution is not going to work, but I suspect it will let you add it in.
Step 3 - When the array has finished rebuilding (this may take some time), your data should now be available. Back up the now available data to another location. Make several copies if possible for safety of the data.
Step 4 - Pull the larger drive out of the SNAP; it is no longer required.
Step 5 - Place one of the newer, smaller 60 GB drives in the drive 0 position (so the smaller size is used for the array). If the 2 new drives (he said he bought 2) are different in size, use the smaller of the 2. Smaller here means the lower sector count. The reason for this is so that if he has this problem again in the future, he will not have to go through all this again (since the size is based on drive 0). Since the other drives in the array are most likely the same age and have the same up/use time as the now failed drive, it is assumed the other drives may not be too far behind in failing.
Step 6 - Optional - Replace one of the still good, but older, drives in the array with the second new drive. Might as well while you're in there working.
Step 7 - Format all 4 drives in the SNAP (2 new and 2 older, with one of the newer drives in the drive 0 position).
Step 8 - Build a new RAID 5 array with the now freshly formatted drives. This new RAID 5 array should be sized based on one of the newer/smaller drives now, and thus if another of the drives fails, a replacement should be a simple swap.
Step 9 - Put your backed up data back onto the SNAP with the new, freshly built RAID 5 array.

This should have you back where you wanted to be. Again, it is a pain in the arse, but it should work. The data should now be saved, the SNAP back up and running, and as a bonus, easier to repair should another drive fail. It's not perfect, and there is a small chance it won't work, but it is the best I can offer if an exact replacement drive cannot be acquired. Just another idea for the pool of ideas. I hope I cleared up what I was trying to say now.
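The sizing rule described in the post above can be sketched in a few lines of Python. This is only an illustration of the poster's description (every member is trimmed to the size of drive 0, and one member's worth of space goes to parity); it has not been checked against what SnapOS actually does, and the function names `raid5_plan` and `can_replace` are made up for the example. Sizes are in whatever unit you like, GB here.

```python
def raid5_plan(member_sizes):
    """Size a RAID 5 set the way the post describes: based on drive 0.

    member_sizes[0] is drive 0. Returns a tuple of
    (size used on each member, usable array capacity, unused space per member).
    """
    if len(member_sizes) < 3:
        raise ValueError("RAID 5 needs at least three members")
    per_member = member_sizes[0]
    if any(size < per_member for size in member_sizes):
        # A member smaller than drive 0 cannot hold its share of the array.
        raise ValueError("a member is smaller than drive 0")
    usable = per_member * (len(member_sizes) - 1)  # one member's worth is parity
    unused = [size - per_member for size in member_sizes]
    return per_member, usable, unused


def can_replace(per_member, replacement_size):
    """A replacement only fits if it is at least as large as the per-member size."""
    return replacement_size >= per_member


# The 3 x 60 GB + 1 x 80 GB example from the post (the 80 is not drive 0).
print(raid5_plan([60, 60, 60, 80]))  # (60, 180, [0, 0, 0, 20])
```

With the smaller drive in the drive 0 position, any later replacement that is at least that size passes `can_replace`, which is the point of Step 5.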
#9 |
Cooling Savant
Join Date: Apr 2006
Location: Tennessee
Posts: 157
|
OK, I didn't know about the business with it determining raid size based on the first drive. None of that information was in instructions I saw about replacing a failed drive. It just said to make sure that the drive wasn't smaller.
I'd always assumed that the raid, once it was built, stored redundant information on what the size of each of the drives should be and then used only that amount of space when a drive was replaced (even if it was larger). My thinking here was that drive sizes will always go up, and a well designed raid solution should take into account the fact that finding a drive of the same size might be difficult several years down the road. So I guess the copy it makes of the drive 0 configuration data on drive 1 is useless. |
#10 | ||
Thermophile
Join Date: May 2006
Location: Yakima, WA
Posts: 1,282
|
Quote:
Quote:
|
||
#11 |
Cooling Neophyte
Join Date: May 2006
Location: Chicago, IL
Posts: 14
|
Hello - thank you everyone for your responses over the weekend! This is fascinating stuff. Let me first point out that the raid is still operating in degraded mode (in other words, it acted exactly as a raid-5 configuration should after losing one member.) There has been no loss of data, and in addition we have complete backup copies.
So yes, the main goal was to install a new drive and simply rebuild the RAID to full redundancy. Here is the result of the info log t command as suggested by rpmurray:
10/21/2006 11:59:31 41 D SYS | Intf: 0, dev: 0: Model: QUANTUM FIREBALLP AS60.0
10/21/2006 11:59:31 41 D SYS | Firmware Rev: A1Y.1300 Serial #: 196103937962
10/21/2006 11:59:31 41 D SYS | Intf: 1, dev: 0: Model: QUANTUM FIREBALLP AS60.0
10/21/2006 11:59:31 41 D SYS | Firmware Rev: A1Y.1300 Serial #: 196103934228
10/21/2006 11:59:31 41 D SYS | Intf: 2, dev: 0: Model: QUANTUM FIREBALLP AS60.0
10/21/2006 11:59:31 41 D SYS | Firmware Rev: A1Y.1500 Serial #: 196104536529
10/21/2006 11:59:31 41 D SYS | Intf: 3, dev: 0: Model: QUANTUM FIREBALLP AS60.0
10/21/2006 11:59:31 41 D SYS | Firmware Rev: A1Y.1300 Serial #: 196102535993
So yes, it does look like the replacement drive has a newer Firmware Rev (A1Y.1500 instead of .1300). What I do not understand is that I thought hard drive capacity was a simple calculation of bits and bytes; in other words, if the geometry of the drives is the same, the formatted capacity should also be the same.

I am wondering: does anyone know what a "nocore" format actually does, compared to the regular automatic formatting that the Snap server does? And how would I know if this command actually worked? After comparing the log files for the automated format and the "nocore" format, I did not notice ANY difference. I was under the impression that this "nocore" format would free up more space compared to the regular automated formatting, because it is spelled out in the field service guide as the solution to this exact problem, the replacement drive not formatting to its full capacity. I wonder what it actually does (or is supposed to do) that would make the formatted capacity different.

I realize that I can switch the drives around and rebuild this thing from scratch; I am just dreading that because of the time involved in moving all of the data around!

Thanks a bunch to all of you. By the way, we did upgrade the RAM from the 128MB it came with to 256MB (based on suggestions from other threads). I have not noticed any specific improvement as of yet. I have another empty 4100 in which I am about to upgrade the RAM and install 160GB drives to see what will happen. I realize we will not see the full 160GB in this model Snap. Are there any hard drive experts out there who can shed light on why the formatted capacity of these drives would be different?
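Spotting the odd drive in that log by eye is easy with four drives, but the comparison can also be scripted. The sketch below is only an illustration: it assumes the log lines look exactly like the ones pasted above (a "Model:" line followed by a "Firmware Rev: ... Serial #: ..." line per drive), and the helper names `parse_drives` and `odd_ones_out` are made up for the example, not part of any Snap tool.

```python
import re
from collections import Counter

# Log lines copied from the "info log t" output quoted in the post above.
LOG = """\
10/21/2006 11:59:31 41 D SYS | Intf: 0, dev: 0: Model: QUANTUM FIREBALLP AS60.0
10/21/2006 11:59:31 41 D SYS | Firmware Rev: A1Y.1300 Serial #: 196103937962
10/21/2006 11:59:31 41 D SYS | Intf: 1, dev: 0: Model: QUANTUM FIREBALLP AS60.0
10/21/2006 11:59:31 41 D SYS | Firmware Rev: A1Y.1300 Serial #: 196103934228
10/21/2006 11:59:31 41 D SYS | Intf: 2, dev: 0: Model: QUANTUM FIREBALLP AS60.0
10/21/2006 11:59:31 41 D SYS | Firmware Rev: A1Y.1500 Serial #: 196104536529
10/21/2006 11:59:31 41 D SYS | Intf: 3, dev: 0: Model: QUANTUM FIREBALLP AS60.0
10/21/2006 11:59:31 41 D SYS | Firmware Rev: A1Y.1300 Serial #: 196102535993
"""

MODEL_RE = re.compile(r"Intf:\s*(\d+),\s*dev:\s*(\d+):\s*Model:\s*(.+?)\s*$")
FIRMWARE_RE = re.compile(r"Firmware Rev:\s*(\S+)\s+Serial #:\s*(\S+)")


def parse_drives(log_text):
    """Collect one dict per drive from paired Model / Firmware Rev lines."""
    drives = []
    current = None
    for line in log_text.splitlines():
        model = MODEL_RE.search(line)
        if model:
            current = {
                "intf": int(model.group(1)),
                "dev": int(model.group(2)),
                "model": model.group(3),
            }
            drives.append(current)
            continue
        firmware = FIRMWARE_RE.search(line)
        if firmware and current is not None:
            current["firmware"] = firmware.group(1)
            current["serial"] = firmware.group(2)
    return drives


def odd_ones_out(drives, key):
    """Return the drives whose value for `key` differs from the most common one."""
    counts = Counter(d.get(key) for d in drives)
    most_common_value, _ = counts.most_common(1)[0]
    return [d for d in drives if d.get(key) != most_common_value]


drives = parse_drives(LOG)
for drive in odd_ones_out(drives, "firmware"):
    print(drive["intf"], drive["firmware"])  # 2 A1Y.1500
```

All four drives report the same model string, so only the firmware comparison singles out the replacement on interface 2.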
#12 |
Cooling Savant
Join Date: Apr 2006
Location: Tennessee
Posts: 157
|
Hmmm, I was thinking that the model number or firmware rev would show the drives as being different enough to account for the problem. Could you also do an:
info dev
so we can see what it thinks all the sizes are?
#13 |
Cooling Savant
Join Date: Oct 2001
Location: Dallas, Tx
Posts: 469
|
A newer drive might have fewer platters and a different cluster size.
__________________
Snap Servers: 1100 - 1x300gb Seagate Baracuda (SnapOS Version 3.4.807) 2200 - 2x80gb Maxtor (one dead) (SnapOS 4.0.860) |
#14 | |
Thermophile
Join Date: May 2006
Location: Yakima, WA
Posts: 1,282
|
Quote:
|
|
#15 | |
Cooling Savant
Join Date: Apr 2006
Location: Tennessee
Posts: 157
|
Something that blue68f100 posted when I was having trouble with my 705N and was formatting the new drive:
Quote:
And don't take this the wrong way, but the field service guide mentions that you have to type the command exactly, with the spaces and whatnot in it. Have you checked to make sure you left the space between the Logical Device ID and the /reinit, and another before the /nocore? I just say this because I *cough* have been known to typo a command now and again.
Last edited by rpmurray; 10-23-2006 at 12:42 PM.
|
#16 | |||
Thermophile
Join Date: May 2006
Location: Yakima, WA
Posts: 1,282
|
Quote:
Quote:
Quote:
|
|||
#17 | |
Thermophile
Join Date: Jul 2005
Location: Plano, TX
Posts: 3,135
|
Quote:
The "co de info" command will give you the needed info; verify before doing anything.
Last edited by blue68f100; 10-23-2006 at 01:21 PM.
|
#18 |
Thermophile
Join Date: Jul 2005
Location: Plano, TX
Posts: 3,135
|
Different firmware may allocate more sectors for the SMART data to use when errors are reported. All of this activity happens at the controller level, and we never see it. In the old days a sheet came with each drive showing where the bad tracks/sectors were located, and you had to enter this data manually. Nowadays manufacturers do not check for bad sectors. They just allocate a bunch to be used by SMART data, which is probably the difference in the capacity.
#19 | |
Cooling Savant
Join Date: Apr 2006
Location: Tennessee
Posts: 157
|
Quote:
|
|
#20 | |
Thermophile
Join Date: May 2006
Location: Yakima, WA
Posts: 1,282
|
Quote:
|
|
#21 | |
Cooling Savant
Join Date: Apr 2006
Location: Tennessee
Posts: 157
|
Quote:
![]() |
|
#22 |
Cooling Neophyte
Join Date: May 2006
Location: Chicago, IL
Posts: 14
|
OK guys, I have some news. I have not done anything with the 705; I still have the problem where my replacement 60GB hard drives are not large enough to tag as hot spares to rebuild the RAID 5.
I decided that I would try building my "new improved" 4100 using a 256MB RAM chip and four 160GB Maxtor drives that I had. My idea is that once I get this one on its feet, I can move the data over from the 705 and then try to rebuild the 705 using the "smaller" 60GB drive in the first slot.

So back to the 4100. Installed the RAM, no problem. Installed the four 160s, set all jumpers to Master, turned it on, and I could see the Snap formatting each drive. GREAT! As everyone knows, it did not give me 160GB, but more like 128GB formatted. Once it was done, it gave me four separate drives with no error messages.

So then I began to try changing the disk configuration to RAID 5. Everything seemed to be going fine, and the process does complete, but I have a problem that I have seen posted on here in other threads with no real solution. Once the rebuild is complete, under "View Disk Status" I am seeing:
RAID5 - Large data protection disk OK Unknown disk operation error.
The shares do mount and seem to be usable, but I am concerned about this error message. In addition, once restarted, it tries to rebuild the RAID again. In the disk log, I have this message:
Failed to resynchronize logical set 60000, error -1
I wonder if it is really a practicable idea to use the 160GB hard drives. Has anyone actually done this successfully?

One last question: would ANYONE be willing to email me the files needed to upgrade to either 3.4.805 or 3.4.807? Perhaps a more recent OS would be more robust and help solve the issue. I can be emailed for this purpose at: printperfectinc AT aol dot com

I will be happy to report back to you guys if I can get this to work. If anyone has successfully installed 160s into a 4100, I would like to hear from you also! Thanks much!
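The thread does not say why a 160GB drive shows up as roughly 128GB. One widely documented cause on older ATA hardware is 28-bit sector addressing (LBA28); assuming that is what applies to the 4100, which nobody here has confirmed, the arithmetic works out as in this small sketch. The constant and function names are made up for the example.

```python
SECTOR_BYTES = 512            # traditional ATA sector size
LBA28_MAX_SECTORS = 2 ** 28   # sectors addressable with a 28-bit address


def lba28_limit_bytes():
    """Largest capacity reachable with 28-bit addressing and 512-byte sectors."""
    return LBA28_MAX_SECTORS * SECTOR_BYTES


def visible_bytes(drive_bytes):
    """Capacity a 28-bit-only controller could use from a drive (assumed model)."""
    return min(drive_bytes, lba28_limit_bytes())


limit = lba28_limit_bytes()
print(limit)                 # 137438953472 bytes
print(limit / 2 ** 30)       # 128.0 (binary gigabytes)
print(limit / 10 ** 9)       # about 137.4 (decimal gigabytes, as on drive labels)
print(visible_bytes(160 * 10 ** 9) == limit)  # True: a 160GB drive is capped
```

Under that assumption, "more like 128GB" and the "137GB limit" people sometimes mention are the same ceiling expressed in binary and decimal units.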
#23 |
Thermophile
Join Date: Jul 2005
Location: Plano, TX
Posts: 3,135
|
I am not a fan of updating when problems exist. It just seems to compound the problem. If you are at 3.4.803 you should be fine. And since yours started out as a Dell and now has the Snap OS loaded, I do not know if it presents different problems.
I know of just a couple of users that went that way with 160s, with no reported problems that I recall. You may try searching the threads for 160 or 4100 and see what shows up. If you have any 120gig drives, you could test to see if the 160s are the reason for the warnings. It may be because it knows a larger drive exists and is not using all the sectors.
#24 |
Cooling Neophyte
Join Date: May 2006
Location: Chicago, IL
Posts: 14
|
Just to be clear, the snap I have put the 160's into IS an actual Quantum Snap. However the motherboards are absolutely identical, down to every digit of every number stamped on every chip and bar code, etc.
If anyone can send me the updated software, I would like to try it. I have no data on the server so I have no worries about losing anything. Anyone with 3.4.805 or 3.4.807 please email to printperfectinc AT aol dot com Thanks! |
#25 |
Cooling Savant
Join Date: Apr 2006
Location: Tennessee
Posts: 157
|
I can tell you that I was able to put four 160 Seagate drives into a couple of 705Ns (4100) and it works. One of the units has 3.4.790, so I don't think it's the OS that's causing the problem.
My guess is that there's something that it doesn't like about the Maxtors. Are they all the same model? Check using info dev and see if they all formatted to the same size.