
04 November 2007


Listed below are links to weblogs that reference Availability, MTBF, MTTR and other bedtime tales:

Comments


Samantha

I work with a company that is just beginning to dive into the world of IT automation. In actuality they had little choice, as their new software applications have wreaked havoc on the company's network. They are desperate to improve application availability (http://www.stratavia.com) throughout the system, mainly because the software they recently implemented is what their clients use for their websites, and those sites have become extremely slow when they're even up and running. The time for change has come. I'm part of a team that's been looking into new automation tools, and I am compiling a report that's due by the end of this week. So far Opalis and Stratavia are looking good, but I've got to dig up more info on both companies.

Alan R.

Interesting. I'm not familiar with either company, or their products, but I'll go look them up and see what they're up to.

Automation is a very hard thing to do right over a broad scope - there are many opportunities to make things worse rather than better. As you have probably gathered, my personal approach is to come at things from the availability management perspective. Not that this is the only way, or somehow the best way, but it does have the advantage of relying on technologies that are largely well proven.

C.P. Gupta

Is it possible to find the probability of failure of a device at any time t in terms of only known parameters like MTTR and MTBF? If not, can you suggest a reference? I want to use this for my doctoral research.
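One common way to approach this question, offered as a sketch and not a definitive answer: if you are willing to assume a constant failure rate (exponentially distributed time to failure), the probability that a device has failed by time t depends only on MTBF, while MTTR enters through the steady-state availability. That assumption is not something MTBF and MTTR guarantee by themselves, and real devices often violate it. The function names and the sample numbers below are hypothetical, chosen only for illustration.

```python
import math


def failure_probability_by(t_hours: float, mtbf_hours: float) -> float:
    """Probability that a device has failed at or before time t.

    ASSUMPTION: constant failure rate (exponential lifetime), so the
    failure rate is 1 / MTBF. This is a modelling choice, not a given.
    """
    return 1.0 - math.exp(-t_hours / mtbf_hours)


def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Long-run fraction of time the device is up: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical example values, not taken from any real device.
mtbf = 10_000.0  # hours
mttr = 4.0       # hours

print(f"P(failed within 1000 h) = {failure_probability_by(1000.0, mtbf):.4f}")
print(f"Steady-state availability = {steady_state_availability(mtbf, mttr):.6f}")
```

Note that under this model the chance of having failed by t = MTBF is about 63%, not 50%, which is a frequent source of confusion.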

Wes Tafoya

I'm not sure about laptops or PCs (although I've heard Apple's Macs and PowerBooks are very stable), but I wonder why people still talk about availability as if it were a new technology. I know that NEC has a server that is 100% redundant, and only because they have to cover their legal back ends do they say it has 99.999% uptime. Oh, and this includes 0% downtime for Windows updates, which as we know should be calculated into the downtime equation. I know some companies prefer spending a small fortune on cluster software, and I guess that's fine if 99.9% uptime is good enough (over 8 hours of downtime a year!!) and you don't mind paying for all the licenses, etc. I just figure that buying one server that has a money-back guarantee against crashes, one copy of the OS, etc. would be a better bargain.
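The downtime figures behind those percentages are simple arithmetic. A minimal sketch, assuming a 365-day year and treating the uptime percentage as the fraction of the whole year the system is available (the function name is made up for this illustration):

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Expected yearly downtime in hours for a given availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% uptime: {hours:.2f} hours ({hours * 60:.1f} minutes) down per year")
```

So 99.9% works out to roughly 8.76 hours a year, and 99.999% to a little over 5 minutes.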

Another good company that I have run into, though I've never tried their product personally, is Marathon (marathontechnologies.com). They have unique software that is really cheap and does a fantastic job as a redundancy solution.

Please understand, while cluster software has its purposes, IT Directors need to do better research to find completely redundant systems that are not so darn expensive and that can ensure the internal components (the CPU, RAM, whatever) are 100% redundant.

Alan R.

I spent the first 20 years of my career working for Bell Labs on exactly those kinds of highly redundant systems. They've been largely abandoned because they are too expensive, and because getting the benefit from them requires special software. Ditto for the Tandem systems - abandoned as too expensive.

Everything fails. EVERYTHING. You just have to wait long enough. Eventually the sun will burn out. The only question is what you're going to do when it fails...

Quite frankly, I think all HA cluster software (as it's been traditionally understood) is doomed. Virtualization makes redundancy and failover simple, and eventually it will make it easy - probably mainly through cloud computing.

The comments to this entry are closed.
