Making the case for block redundancy in pursuit of superior reliability, flexibility, and uptime.
There are few subjects as important in the data center space as reliability. In an age where even a single hour of downtime or interruption can cost an enterprise as much as $5 million before related fines or penalties, companies have never faced more pressure to maintain uptime and recover quickly from a disruption.
But while most conversations around redundancy tend to focus on network redundancy, both as a necessary component of a robust business continuity/disaster recovery (BC/DR) strategy and as a contributor to network availability objectives, there’s another crucial element that factors into overall uptime: electrical distribution.
Modern data centers are literal powerhouses, collectively consuming tens of billions of kilowatt-hours per year. Everything they are, and everything they can do (like supporting thousands of physical servers and other IT infrastructure), is predicated on how much power they can ingest, how efficiently they can distribute it, and how reliably they can maintain that distribution.
Data center developers and operators most commonly use two options for building redundancy into their electrical distribution infrastructure: distributed redundant or block redundant systems.
Here’s a look at how the two compare and a deeper dive into why STACK INFRASTRUCTURE has made a conscious decision to standardize on one of them in many of our current and all of our future data centers.
The Great Debate: Distributed vs. Block Redundancy
There’s no such thing as an ideal choice or one without any drawbacks. Power redundancy design for uninterruptible power systems (UPSs) is no different, as both distributed and block redundant systems have their respective pros and cons.
Distributed Redundant Systems (DRS)
What is it?
A DRS features multiple independent UPSs (an A-UPS, B-UPS, and C-UPS), each capable of carrying the entire critical electrical load for a prescribed set of breakers and connected hardware. If one system goes offline, whether through an unplanned failure or for maintenance and repair, this “three-to-make-two” (3M2) design automatically switches power distribution to the remaining systems until full power distribution can be restored. In a DRS, loads are evenly distributed: each system supports one-third of the total load and is loaded to no more than two-thirds of its equipment rating, which leaves headroom to absorb its share of a failed system’s load during failover.
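The loading limits for a design like this follow from simple arithmetic. The following is a minimal sketch, assuming identically rated systems that share load evenly and must ride through the loss of any one system; the function names are illustrative and do not come from any STACK tooling.

```python
from fractions import Fraction


def normal_load_limit(total_systems: int) -> Fraction:
    """Largest fraction of its rating a system may carry in normal operation.

    Assumes identical ratings and even sharing, and that the surviving
    systems must absorb the full load when any one system is lost.
    """
    if total_systems < 2:
        raise ValueError("redundancy needs at least two systems")
    return Fraction(total_systems - 1, total_systems)


def failover_loading(normal_loading: Fraction, total_systems: int) -> Fraction:
    """Per-system loading (fraction of rating) after one system drops out."""
    return normal_loading * total_systems / (total_systems - 1)


if __name__ == "__main__":
    for n in (3, 4):
        limit = normal_load_limit(n)
        print(f"{n}-to-make-{n - 1}: normal limit {limit}, "
              f"after one failure {failover_loading(limit, n)}")
```

With three systems the normal limit works out to two-thirds of rating, and a system running at that limit reaches its full rating once one peer is lost.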
Why it’s great
DRS designs are highly resilient with no single point of failure, which dramatically reduces the risk of downtime. The independent underground power feeds from power distribution units (PDUs) help ensure consistent, predictable power, and in the case of a UPS failure, the switchover is automatic. In many cases, a DRS is an affordable design option, making it ideal for smaller deployments.
What are some drawbacks?
For all the simplicity and affordability of the design, DRSs do require a significant amount of complex and precise load planning. Electrical load must be properly balanced between the A, B, and C systems to ensure the critical load is supported without overloading any single system. That’s especially important for larger deployments, such as 20 kW implementations: if one system is oversubscribed under failover, there’s a significant risk of a PDU or UPS failing and triggering a cascading failure in which multiple UPSs go down, which could adversely impact other customers.
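The balancing problem can be illustrated with a short check. This is a simplified sketch, not a planning tool: it assumes every load is dual-corded to a pair of UPSs, draws half from each feed in normal operation, and moves entirely to the surviving feed when one UPS fails. The ratings and loads are made-up figures.

```python
from collections import defaultdict


def system_loads(pair_loads, failed=None):
    """Return the kW carried by each UPS.

    pair_loads maps a (feed_1, feed_2) pair to the dual-corded load in kW.
    Each feed normally carries half; if one feed's UPS has failed, the
    surviving feed carries the whole load.
    """
    loads = defaultdict(float)
    for (a, b), kw in pair_loads.items():
        if failed == a:
            loads[b] += kw
        elif failed == b:
            loads[a] += kw
        else:
            loads[a] += kw / 2
            loads[b] += kw / 2
    return dict(loads)


def overloads_by_failure(pair_loads, rating_kw):
    """Map each single-UPS failure to the survivors pushed past their rating."""
    systems = sorted({s for pair in pair_loads for s in pair})
    problems = {}
    for failed in systems:
        loads = system_loads(pair_loads, failed)
        over = sorted(s for s, kw in loads.items() if kw > rating_kw)
        if over:
            problems[failed] = over
    return problems


if __name__ == "__main__":
    balanced = {("A", "B"): 400, ("B", "C"): 400, ("A", "C"): 400}
    skewed = {("A", "B"): 600, ("B", "C"): 300, ("A", "C"): 300}
    print(overloads_by_failure(balanced, rating_kw=600))  # no overloads
    print(overloads_by_failure(skewed, rating_kw=600))    # A and B at risk
```

Both layouts carry the same 1,200 kW total and stay under rating in normal operation, yet the skewed layout overloads a surviving UPS when A or B fails. Only a per-failure check reveals that.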
It’s also worth noting that expanding within a deployed 3M2 system is often quite challenging to plan for and even harder to implement. Accommodating growth usually means deploying multiple 3M2 systems at once, even if they’re not all going to be used right away, which brings additional planning and costs.
Block Redundant Systems
What is it?
Like a four-to-make-three (4M3) DRS, a block redundant system will also have four systems, but instead of using all four at once, three of them supply the primary power and one remains in reserve for failover. In these systems, a failure at the A, B, or C block switches that block’s load to the reserve system rather than spreading it out over the remaining operational systems.
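The failover behavior is simple enough to model in a few lines. The sketch below assumes that a failed block’s entire load moves to a single shared reserve and that only one block fails at a time; the block names and kW figures are hypothetical.

```python
def block_failover(block_loads_kw, reserve_rating_kw, failed_block):
    """Return the load on every source after one active block fails.

    The failed block's load moves to the reserve in full; the other
    active blocks keep exactly the load they had before.
    """
    if failed_block not in block_loads_kw:
        raise KeyError(f"unknown block: {failed_block}")
    transferred = block_loads_kw[failed_block]
    if transferred > reserve_rating_kw:
        raise ValueError("reserve block cannot carry the failed block's load")
    after = {
        name: (0.0 if name == failed_block else kw)
        for name, kw in block_loads_kw.items()
    }
    after["reserve"] = transferred
    return after


if __name__ == "__main__":
    loads = {"A": 500.0, "B": 450.0, "C": 480.0}
    print(block_failover(loads, reserve_rating_kw=600.0, failed_block="B"))
```

Because the surviving blocks are untouched, the only sizing question is whether the reserve can carry the largest single block. That is what keeps capacity planning for this design comparatively simple.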
Why it’s great
Unlike DRSs, block redundant systems are generally much easier to set up and manage. It is simpler to define how much capacity each system can or should handle, and where that capacity can go. That eliminates a significant amount of human intervention and the risk of human error that comes with it, which is the most common cause of outages. These systems also tend to be easier to scale and to plan future deployments around, because they can be deployed individually in stages as they’re needed, rather than all at once.
What are some drawbacks?
The downsides to block redundant systems are usually related to cost and efficiency. Block redundant systems require load transfer equipment, which can add to material and installation costs. That said, as the number of active blocks supported by a reserve block increases, the cost difference relative to a DRS becomes less significant. At the same time, the reserve system sits idle during normal operation, which lowers resource utilization and can reduce efficiency.
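The effect of sharing one reserve across more active blocks can be put in numbers. This is a rough sketch that assumes equally rated blocks and counts only installed UPS capacity; it ignores the cost of load transfer equipment, and the function names are illustrative.

```python
from fractions import Fraction


def reserve_overhead(active_blocks: int) -> Fraction:
    """Idle reserve capacity as a fraction of active capacity."""
    if active_blocks < 1:
        raise ValueError("need at least one active block")
    return Fraction(1, active_blocks)


def usable_share(active_blocks: int) -> Fraction:
    """Share of all installed capacity that serves load in normal operation."""
    return Fraction(active_blocks, active_blocks + 1)


if __name__ == "__main__":
    for n in (2, 3, 5, 8):
        print(f"{n} active + 1 reserve: overhead {reserve_overhead(n)}, "
              f"usable share {usable_share(n)}")
```

Under these assumptions, a reserve behind three active blocks adds one-third extra capacity, while the same reserve behind eight blocks adds only one-eighth.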
The STACK case for block redundant systems
Site reliability for customers is one of STACK INFRASTRUCTURE’s core values. Our aim to maintain five-nines (99.999%) availability across our portfolio of facilities has led us to make the conscious and calculated decision to make block redundant systems our basis of design.
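To put that availability target in perspective, it can be converted into a yearly downtime budget. The sketch below assumes a 365-day year and treats availability as a simple fraction of total time; the function name is illustrative.

```python
def downtime_budget_minutes(availability: float, days_per_year: int = 365) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * days_per_year * 24 * 60


if __name__ == "__main__":
    for label, target in (("three nines", 0.999), ("five nines", 0.99999)):
        minutes = downtime_budget_minutes(target)
        print(f"{label}: about {minutes:.2f} minutes per year")
```

Five nines leaves a budget of a little over five minutes per year, which is why the design has to tolerate both failures and routine maintenance without dropping load.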
Though some of our longer-standing facilities feature distributed redundant systems, more recent projects and all STACK-constructed facilities in the future will be designed with block redundant systems to minimize operational complexity and risk, while improving reliability, long-term cost certainty, and the ability to grow confidently as our customer base and implementations do.
Choosing block redundant designs helps eliminate the continuous, months-long capacity calculations common to DRSs. Fewer resources are required and reliability is built into the design, enabling STACK to reallocate human and financial resources toward other customer support and engagement initiatives aimed at constantly improving customers’ experience and long-term success.
At the same time, the block redundant approach simplifies customer expansion and growth, as there’s more PDU size and configuration flexibility than in distributed systems. With block systems, it’s easier to help individual customers create a roadmap for powering future deployments, which is vital as they begin implementing power-hungry applications like artificial intelligence and machine learning platforms.
Systems availability is a business imperative in the modern data center space, and there are myriad opinions and arguments to be made about the best way to approach power redundancy and reliability. After carefully comparing how each approach affects our ability to support the evolving needs of our customers across our national footprint, the simplicity, predictability, and flexibility of block redundant systems made them the clear winner for us, now and in the future.