Comments on The Scale-Out Blog: When SANs Go Bad
Blog author: Robert Hodges (http://www.blogger.com/profile/05379726998057344092)
Updated: 2023-11-16T03:16:54.746-08:00

Documentaries (http://www.humanrestore.com/), 2010-11-05T06:29:40.250-07:00

Among the myriad of problems around SANs and their causes:
- Companies use SANs as a backup strategy.
- The SAN word has an implicit "enterprise" label on it, so it must be good.
- In many people's minds, SANs can't/won't fail. I kid you not.
- SANs are often difficult to monitor; they tend to insist on you just receiving SNMP traps. So, when the SAN dies, it'll let you know. All extra foo in there aside, there's something fundamentally wrong about that logic.

Robert Hodges (https://www.blogger.com/profile/05379726998057344092), 2009-06-21T09:03:26.913-07:00

@anonymous
You are most welcome.

Not to pile on the horror stories, but here is another one I experienced with my applications: we had a power failure on the fabric that took down the Fibre Channel switches. Solaris applications not only did not notice the problem but kept writing to the fabric. We lost 20 minutes of data on Oracle before anybody noticed a problem. This was a while back and may not be a problem for newer technology (we were using Brocade switches at the time), but it's another illustration of the surprises that await the unwary.

Anonymous, 2009-06-21T08:40:33.006-07:00

Huh. Thanks for scaring me, then! We're doing fairly textbook architecture on decent hardware (MDS switches, NetApps, etc.). Even in my testing, we never saw a case where all hell didn't break loose when we ripped out fiber cabling or power from the switches or the NetApps.

It's nice to know that there are some cases where the OS freaks out and DTWT.

Thanks for the reply!

Robert Hodges (https://www.blogger.com/profile/05379726998057344092), 2009-06-21T08:15:57.143-07:00

@anonymous
I don't have full technical details, as I was diagnosing the software problems, but both systems were (as far as I know) dual-pathed using Fibre Channel switches. In the first case the failure was on Solaris, in the second on Linux. In both cases the problems seem to have arisen as a result of improperly handled failures on the SAN.

I have also used RAID but never seen this type of behavior, though obviously RAID fails as well. The problem, as I mentioned, is that for non-specialists like most of us the failures are sufficiently rare that it's somewhat hard to generalize.

Arjen Lentz (http://openquery.com/), 2009-06-21T02:45:45.056-07:00

Among the myriad of problems around SANs and their causes:
- Companies use SANs as a backup strategy.
- The SAN word has an implicit "enterprise" label on it, so it must be good.
- In many people's minds, SANs can't/won't fail. I kid you not.
- SANs are often difficult to monitor; they tend to insist on you just receiving SNMP traps. So, when the SAN dies, it'll let you know. All extra foo in there aside, there's something fundamentally wrong about that logic.

@anonymous if it's no better, why does it cost more ;-)

Anonymous, 2009-06-20T15:50:18.417-07:00

Can you explain a bit more about your fabric? Did you do multipathing? What kind of cards, switches, and on what OS?

I've generally found a well-architected SAN to be no better or worse than RAID hanging off of a local controller.