Does it really matter if the software applications supporting your security systems are only available for 99% of the time? Well, it probably doesn’t if you’ve installed a video surveillance system primarily to deter shoplifters, but the loss of what equates to more than 90 minutes of unplanned downtime per week will be significant if you’ve invested in an integrated mission critical security solution. Duncan Cooke expands on the detail.
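The 99% figure above can be checked with some simple arithmetic. The short Python sketch below is illustrative only: the function name `weekly_downtime_minutes` and the extra availability levels are assumptions made for the example, not figures from any vendor. It converts an availability percentage into expected downtime per week.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080 minutes in a week


def weekly_downtime_minutes(availability_percent: float) -> float:
    """Return the expected downtime per week, in minutes.

    Hypothetical helper for illustration: it assumes downtime is simply
    the unavailable fraction of a full seven-day week.
    """
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_WEEK * (100.0 - availability_percent) / 100.0


# Compare 99% with some higher (assumed) availability levels.
for level in (99.0, 99.9, 99.99, 99.999):
    minutes = weekly_downtime_minutes(level)
    print(f"{level}% available: about {minutes:.2f} minutes down per week")
```

At 99% availability this works out to roughly 100 minutes a week, which is consistent with the "more than 90 minutes" quoted above.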
There’s no shortage of solutions available to ensure minimal disruption if a server fails or the business is put in a position whereby it must recover from a cyber attack. What follows is a jargon-busting overview of the best of them.
Back-Up and Restores
A standard x86-based server typically stores data on RAID (Redundant Array of Independent Disks) storage devices. The capabilities of x86 servers vary from vendor to vendor, and they support a variety of operating systems and processors. However, a standard x86 server may have only basic back-up, data replication and failover procedures in place, which means it would be susceptible to catastrophic server failures.
A standard server isn’t designed to prevent downtime or data loss. In the event of a crash, the server stops all processing and users lose access to their applications and information. In this scenario, data loss is likely. Standard servers don’t provide protection for data in transit, which means that if the server goes down, this data is also lost. Though a standard x86 server doesn’t come from its vendor as highly available, there’s always the option to add availability software following initial deployment and installation.
Traditional high availability solutions which can bring a system back up quickly are typically based on server clustering: two or more servers that are running with the same configuration and connected with cluster software to keep the application data updated on both/all servers.
Servers (nodes) in a high availability cluster communicate with each other by continually checking for a heartbeat which confirms other servers in the cluster are up-and-running. If a server fails, another server in the cluster – designated as the failover server – will automatically take over, ideally with minimal disruption to users.
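The heartbeat check described above can be sketched in a few lines of Python. This is a simplified, single-process illustration and not how any particular cluster product works: the class name `ClusterMonitor`, the node names and the three-second timeout are all assumptions, and timestamps are passed in by hand so the behaviour is easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterMonitor:
    """Hypothetical monitor tracking the last heartbeat seen from each node."""

    timeout_seconds: float
    primary: str
    failover: str
    last_heartbeat: dict = field(default_factory=dict)

    def record_heartbeat(self, node: str, now: float) -> None:
        # Remember when this node last reported that it was up and running.
        self.last_heartbeat[node] = now

    def is_alive(self, node: str, now: float) -> bool:
        # A node counts as alive only if it has reported within the timeout.
        last = self.last_heartbeat.get(node)
        return last is not None and (now - last) <= self.timeout_seconds

    def active_node(self, now: float) -> str:
        # Prefer the primary; fall back to the designated failover server.
        if self.is_alive(self.primary, now):
            return self.primary
        if self.is_alive(self.failover, now):
            return self.failover
        raise RuntimeError("no healthy node available")


monitor = ClusterMonitor(timeout_seconds=3.0, primary="node-a", failover="node-b")
monitor.record_heartbeat("node-a", now=0.0)
monitor.record_heartbeat("node-b", now=0.0)
monitor.record_heartbeat("node-b", now=4.0)  # node-a has gone silent
print(monitor.active_node(now=5.0))  # node-b takes over
```

Note that the failover is only noticed once the timeout has expired, which is one reason why a cluster reduces downtime rather than removing it altogether.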
Computers in a cluster are connected by a LAN or WAN and managed by cluster software. Failover clusters require a storage area network (SAN) to provide the shared access to data required to enable failover capabilities. This means that dedicated shared storage or redundant connections to the corporate SAN are also necessary.
While high availability clusters improve availability, their effectiveness is highly dependent upon the skills of specialist IT personnel. Clusters can be complex and time-consuming to deploy and they require programming, testing and continuous administrative oversight. As a result, the total cost of ownership is often high.
It’s also important to note that downtime isn’t eliminated with high availability clusters. In the event of a server failure, all users who are currently connected to that server lose their connections, and any data not yet written to the database is lost.
Fault-tolerant solutions are also referred to as continuous availability solutions. A fault-tolerant server provides the highest availability because it has system component redundancy with no single point of failure. This means that end users never experience an interruption in server availability as downtime is pre-empted.
67% of Best in Class organisations use fault-tolerant servers to provide high availability to at least some of their most critical applications. Fault tolerance is achieved in a server by having a second set of completely redundant hardware components in the system architecture. The server’s software automatically synchronises the replicated components, executing all processing in lockstep such that ‘in flight’ data is always protected. The two sets of CPUs, RAM, motherboards and power supplies are all processing the same information at the same time. Therefore, if one component fails, its companion component is already there and running and the system keeps functioning.
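The lockstep idea can be illustrated with a loose software analogy. Real fault-tolerant servers do this in hardware and system software, so the Python sketch below is only a toy model under stated assumptions: the class names `Replica` and `LockstepPair`, and the running-total workload, are invented for the example. Two replicas process every input at the same time, so when one is marked as failed the other already holds the same state and carries on.

```python
class Replica:
    """Hypothetical stand-in for one set of redundant components."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.healthy = True
        self.total = 0  # in-memory state kept identical on both sides

    def apply(self, amount: int) -> int:
        self.total += amount
        return self.total


class LockstepPair:
    """Feeds every input to both replicas so their state stays in step."""

    def __init__(self) -> None:
        self.replicas = [Replica("side-a"), Replica("side-b")]

    def apply(self, amount: int) -> int:
        # Every healthy replica processes the same input at the same step.
        results = [r.apply(amount) for r in self.replicas if r.healthy]
        if not results:
            raise RuntimeError("both replicas have failed")
        if len(set(results)) != 1:
            raise RuntimeError("replicas have diverged")
        return results[0]


pair = LockstepPair()
pair.apply(10)
pair.replicas[0].healthy = False  # simulate a component failure on one side
print(pair.apply(5))  # the surviving side continues from the same state
```

Because the surviving replica has already processed every earlier input, nothing has to be replayed or recovered when its companion fails, which is the property the text describes as protecting ‘in flight’ data.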
Fault-tolerant servers also have built-in, fail-safe software technology that detects, isolates and corrects system problems before they cause downtime. This means that the operating system, middleware and application software are protected from errors. In-memory data is also constantly protected and maintained.
A fault-tolerant server is managed exactly like a standard server, making the system easy to install, use and maintain. No software modifications or special configurations are necessary and the sophisticated back-end technology runs in the background, invisible to anyone administering the system.
In today’s business environments where downtime needs to be kept to the absolute minimum, ensuring that you have fault-tolerant systems will provide you with peace of mind that crucial data isn’t lost.
Duncan Cooke is Business Development Manager (UK & Europe) for Stratus Technologies