When More Is Too Much

The London Stock Exchange suffered an outage today and while the details are a little sketchy, it seems that excessive trading brought the system down, or at least some part of the system - and here lies the problem. Systems are increasing in complexity and the interrelationships are becoming less and less understood. The result - an outage in a minor application can bring the whole system to a standstill.

Companies cannot afford to overlook the infrastructure on which their IT runs. If the IT becomes more complex, for example through virtualization, Service Oriented Architecture or third party service providers, then these have to be included in Disaster Recovery / Business Continuity plans. DR/BC plans are often not tested frequently enough - with one of the reasons being given that ‘it will affect the customer’… well, at least in a test you get to choose the date and time. Friday at 8pm is a good time to start, or Saturday at 10am after the end-of-week tasks have completed… that way, when it doesn’t go to plan, you have another 24 hours to sort it out.

So… once more, revisit those DR/BC plans and schedule a test and make sure it includes all the 3rd parties as well. How well would your infrastructure cope if you had a 100% increase in demand… twice the business or none at all?

(Virtual) Disaster Recovery

As virtualization takes the IT world by storm, disaster recovery (DR) plans are only just beginning to catch up. The problem is this… you have a server and it runs an application. Traditional DR looks at the server and says ‘we need another one of those’ - this is a little simplistic, but in essence it’s true, and more to the point, it works. Now let’s bring in virtualization: one machine is no longer one machine, it is multiple machines. Each could well be at a different patch level, so it is not just a case of duplicating the hardware but also of ensuring that the virtual environments are up to date.

A recent survey found that 55% of people are revisiting their DR plans because of virtualization - which is good. BUT… it also highlighted that only 37% of respondents back up their virtual systems! Before virtualization that would have been seen as a travesty and an accident (or disaster) waiting to happen, so why is this the case now? Lack of tools is the basic problem. But if you haven’t got the tools, why go with the technology? The tools are out there, including those to back up virtual systems - it’s time to look at the risks and avoid the hype. Virtualization offers great benefits, but treat it with the respect it deserves or it will come back to bite you.
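As a starting point for that risk review, the two checks described above (is every virtual machine backed up, and are the guests on one host at the same patch level) can be sketched in a few lines of Python. This is only an illustration: the `VirtualMachine` record, its field names and the one-day backup threshold are all assumptions made up for the example, not the output of any particular virtualization product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class VirtualMachine:
    """Hypothetical inventory record; field names are illustrative only."""
    name: str
    host: str                         # physical machine the guest runs on
    patch_level: str
    last_backup: Optional[datetime]   # None means never backed up


def find_unprotected(vms: List[VirtualMachine], now: datetime,
                     max_age: timedelta = timedelta(days=1)) -> List[str]:
    """Return names of guests with no backup, or a backup older than max_age."""
    return [
        vm.name for vm in vms
        if vm.last_backup is None or now - vm.last_backup > max_age
    ]


def patch_drift(vms: List[VirtualMachine]) -> Dict[str, List[str]]:
    """Return hosts whose guests run more than one patch level."""
    levels: Dict[str, set] = {}
    for vm in vms:
        levels.setdefault(vm.host, set()).add(vm.patch_level)
    return {host: sorted(found) for host, found in levels.items()
            if len(found) > 1}
```

Feeding this from a real inventory export is the hard part; the point is simply that ‘one machine’ now means a list of guests, each needing its own backup and patch check.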

When Is A Cloudy Day Better Than A Sunny One?

Cloud 999
It happened again, the cloud went away. Of course we are not talking about clouds in the sky, but one of those on the Internet. The outage was 8 hours this time - so a ‘working’ day. It was a Sunday, but that doesn’t mean that people aren’t working - we live in a 24×7 world, so 8 hours is 8 hours.
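To put 8 hours in perspective, here is a rough back-of-envelope sketch in Python. It assumes, purely for illustration, that this outage was the only one in a 365 day year and that the service is measured against a hypothetical 99.9% availability target; neither figure comes from the provider.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def availability_percent(downtime_hours: float,
                         period_hours: float = HOURS_PER_YEAR) -> float:
    """Percentage of the period for which the service was up."""
    return 100.0 * (1 - downtime_hours / period_hours)


def allowed_downtime_hours(target_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime permitted in the period by an availability target."""
    return period_hours * (1 - target_percent / 100.0)


if __name__ == "__main__":
    print(f"Availability after an 8 hour outage: {availability_percent(8):.3f}%")
    print(f"Downtime allowed at 99.9%: {allowed_downtime_hours(99.9):.2f} hours")
```

On those assumptions a single 8 hour outage uses up almost the whole yearly allowance of a 99.9% target (about 8.76 hours), which is worth knowing before you sign up.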

(Some) customers were quick to come to the defence of the service this time - but perhaps they wouldn’t have been if it had lasted a week… or maybe if it had been a Tuesday…

Choosing a service provider is not as easy as it appears - you do need to ask about their Disaster Recovery / Business Continuity plans and ensure their plans meet your needs, otherwise you could end up with no service and no business.

One Man, One Password, One Cell

So just how important can one person be? If they happen to be the IT administrator and they have a grudge, then perhaps the answer will scare you. In a recently reported incident one employee locked a whole city out of its computer system - and then refused to hand over the password. Implicit Trust fails once more. If that had been your company what would you have done? In this case they threw the individual in jail and are waiting… and trying to crack the password themselves!

More to the point, what could you do to prevent it from happening? This is a tough one - obviously you could have audit trails (but if you can’t log in, how can you find the information?), perhaps you could have a secret backdoor (not such a good idea - some cyber-criminal will find it), perhaps you could have policies and procedures (not that they help when you are locked out)… so what to do? Maybe the best thing is to ask your IT administrators how they would solve the problem - they will no doubt come up with a solution that would work for you and your network. If you think using this case might be a little close to the bone, then how about framing it as an ‘accident’ in which everyone gets locked out - its own form of ‘disaster’.

It’s The Summer… It’s Raining… Disaster?

So, we’ve had some serious rain here in the UK this week - not yet as bad as last year, but it’s only the start of July, so plenty of time for more. In the US they have also had some serious rain, thousands of forest fires and other natural disasters. This has prompted a number of companies to re-evaluate their disaster recovery plans - we should be doing the same thing over here… just in case. One interesting comment was that the plans need to take into account the possibility that staff will need to take the time off to deal with family issues… while this may seem obvious, it is worth being reminded about!

Perhaps having a well thought out DR plan that deals with the likes of floods will ensure that we have a summer of sunshine… a bit like the hope that carrying an umbrella will prevent the rain… on the other hand… P5 rules… (Proper Planning Prevents Poor Performance)