The Challenge
A Tier III data center providing critical services to financial and healthcare clients was experiencing more frequent power disturbances than their 99.982% uptime SLA allowed. The facility had experienced three unplanned outages in the previous year, each resulting in significant customer impact and financial penalties. Initial analysis revealed aging UPS batteries, inadequate generator maintenance, protection coordination gaps, and power quality issues affecting sensitive IT equipment. The client needed to enhance reliability without extensive downtime for the facility.
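To put the 99.982% figure in perspective, the short sketch below converts an availability target into a yearly downtime budget. It assumes the SLA is measured over a 365-day year, which the case study does not state; the function name is illustrative only.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; assumes a non-leap year


def downtime_budget_minutes(availability_pct: float,
                            period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the downtime, in minutes, that an availability target permits."""
    unavailability = 1.0 - availability_pct / 100.0
    return unavailability * period_hours * 60.0


if __name__ == "__main__":
    budget = downtime_budget_minutes(99.982)
    print(f"Allowed downtime per year: {budget:.1f} minutes")  # about 94.6 minutes
```

Under that assumption the whole year's budget is roughly an hour and a half, so even one prolonged unplanned outage can use it up.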
Our Solution
We conducted comprehensive power system analysis and implemented strategic improvements to address identified vulnerabilities. Our approach prioritized issues by risk and impact, implementing quick wins first while planning longer-term enhancements.
Implementation Phases
Phase 1: Assessment and Analysis (Months 1-2)
Comprehensive evaluation of power system reliability and identification of vulnerabilities.
- Detailed single-line diagram verification
- Fault tree analysis identifying single points of failure
- UPS system assessment including battery testing
- Generator inspection and load bank testing
- Power quality monitoring at critical loads
- Protection coordination study
- Transfer switch timing analysis
- Prioritized improvement recommendations with risk assessment
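As an illustration of how fault tree analysis exposes single points of failure, here is a minimal sketch that expands a tree of AND/OR gates into minimal cut sets and reports the cut sets of size one. The example tree and its component names are hypothetical and do not describe the client's actual system.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Basic:
    """A basic event, e.g. the failure of one component."""
    name: str


@dataclass(frozen=True)
class Gate:
    """An AND or OR gate combining child events."""
    kind: str
    children: tuple


def cut_sets(node) -> set[frozenset[str]]:
    """Return every set of basic events that triggers `node`."""
    if isinstance(node, Basic):
        return {frozenset([node.name])}
    child_sets = [cut_sets(child) for child in node.children]
    if node.kind == "OR":   # any one child is enough
        return set().union(*child_sets)
    if node.kind == "AND":  # one cut set from every child is needed
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unknown gate kind: {node.kind}")


def minimal(sets: set[frozenset[str]]) -> set[frozenset[str]]:
    """Drop any cut set that strictly contains another cut set."""
    return {s for s in sets if not any(other < s for other in sets)}


def single_points_of_failure(top) -> list[str]:
    """Basic events that cause the top event on their own."""
    return sorted(next(iter(s)) for s in minimal(cut_sets(top)) if len(s) == 1)


# Hypothetical tree for "loss of power to a critical load".
EXAMPLE_TREE = Gate("OR", (
    Basic("static_transfer_switch"),
    Gate("AND", (Basic("ups_a"), Basic("ups_b"))),
    Gate("AND", (
        Basic("utility_loss"),
        Gate("OR", (Basic("generator_fail"), Basic("ats_fail"))),
    )),
))

if __name__ == "__main__":
    print(single_points_of_failure(EXAMPLE_TREE))  # ['static_transfer_switch']
```

In this toy tree the shared static transfer switch is the only event that takes the load down by itself; every other path needs two coincident failures.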
Phase 2: Quick Wins and Critical Fixes (Months 3-5)
Implementation of improvements addressing highest-risk issues with minimal downtime.
- UPS battery replacement with improved monitoring
- Generator maintenance and control system upgrades
- Protective relay settings optimization
- Transfer switch mechanical maintenance
- Critical connection retorquing
- Improved preventive maintenance procedures
- Staff training on emergency procedures
Phase 3: System Enhancements (Months 6-9)
Implementation of longer-term improvements enhancing overall reliability.
- Additional UPS module for true N+1 redundancy
- Upgraded automatic transfer switches with improved reliability
- Installation of power quality mitigation equipment
- Enhanced monitoring and alerting system
- Redundant cooling electrical improvements
- Utility service entrance enhancements
- Updated emergency procedures and documentation
Phase 4: Validation and Optimization (Month 10)
Comprehensive testing to validate improvements and optimize performance.
- Integrated system testing simulating various failure scenarios
- Generator-UPS transfer testing under full load
- Verification testing for single points of failure
- Power quality validation measurements
- Updated reliability calculations
- Final documentation and training
- Ongoing monitoring dashboard implementation
Results & Impact
Exceeded SLA target of 99.982% with zero unplanned outages in 18 months following improvements.
Mean time between failures increased from 122 days to 538 days through systematic reliability improvements.
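The MTBF figures can be turned into an expected yearly event count with simple arithmetic. The sketch below assumes failures are spread evenly over time and that both MTBF values count the same kind of event; neither assumption is confirmed by the case study.

```python
def expected_failures(mtbf_days: float, period_days: float = 365.0) -> float:
    """Average number of failures expected in a period for a given MTBF."""
    return period_days / mtbf_days


def improvement_factor(mtbf_before: float, mtbf_after: float) -> float:
    """How many times longer the system now runs between failures."""
    return mtbf_after / mtbf_before


if __name__ == "__main__":
    print(f"Before: {expected_failures(122):.2f} events per year")  # 2.99
    print(f"After:  {expected_failures(538):.2f} events per year")  # 0.68
    print(f"MTBF improvement: {improvement_factor(122, 538):.2f}x")  # 4.41x
```

An MTBF of 122 days works out to about three events per year, which lines up with the three unplanned outages recorded in the year before the project.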
Reduction in power quality events affecting IT equipment, measured by sensitive monitoring equipment.
Annual savings from avoided SLA penalties and reduced emergency response costs.
Client Testimonial
"ClarkTE's methodical approach to reliability improvement gave us confidence in every recommendation. They didn't just identify problems—they quantified risks and helped us prioritize investments for maximum impact. Our clients have noticed the difference, and we've eliminated the SLA penalties that were costing us millions."
Technical Details
Power System Configuration
- Dual utility feeds from separate substations (2 × 5 MW)
- 2N UPS configuration: 8 × 500 kVA modules
- N+1 diesel generators: 4 × 2.5 MW
- Static transfer switches for critical loads
- Battery autonomy: 15 minutes at full load
- Tier III topology with no single points of failure
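The redundancy labels above translate into firm capacity, meaning the capacity still available with the redundant units out of service. The sketch below works through the listed ratings. It assumes the eight UPS modules are split evenly into two independent sides of four and that either utility feed can carry the load alone; the helper name is illustrative.

```python
def firm_capacity(unit_count: int, unit_rating: float, redundant_units: int) -> float:
    """Capacity that remains when all redundant units are unavailable."""
    if not 0 <= redundant_units < unit_count:
        raise ValueError("redundant_units must be between 0 and unit_count - 1")
    return (unit_count - redundant_units) * unit_rating


# N+1 generators: four 2.5 MW sets, one of them redundant.
GENERATOR_FIRM_MW = firm_capacity(4, 2.5, 1)

# 2N UPS: eight 500 kVA modules, half of them redundant (assumed 4 + 4 split).
UPS_FIRM_KVA = firm_capacity(8, 500, 4)

# Dual utility feeds: two 5 MW feeds, one of them redundant (assumption).
UTILITY_FIRM_MW = firm_capacity(2, 5.0, 1)

if __name__ == "__main__":
    print(f"Generator firm capacity: {GENERATOR_FIRM_MW} MW")  # 7.5 MW
    print(f"UPS firm capacity:       {UPS_FIRM_KVA} kVA")      # 2000 kVA
    print(f"Utility firm capacity:   {UTILITY_FIRM_MW} MW")    # 5.0 MW
```

Comparing these figures with the actual critical load, which the case study does not state, shows how much margin each layer keeps after losing a unit.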
Improvements Implemented
- Complete UPS battery replacement (1,280 cells)
- Additional UPS module for true N+1 redundancy
- Generator control system modernization
- Three automatic transfer switch replacements
- Active harmonic filter installation (750 kVA)
- Real-time power monitoring at 47 locations
- Automated alert system for anomalies
Testing and Validation
- Generator load bank testing to 110% rated capacity
- UPS battery discharge testing at design load
- Transfer switch operation timing verification
- Single point of failure testing (12 scenarios)
- Parallel generator operation validation
- Power quality measurements per IEEE 1159
- Thermal imaging of all connections
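For readers unfamiliar with power quality classification, the sketch below labels an RMS voltage reading by magnitude. It uses the commonly cited IEEE 1159 bands (interruption below 0.1 pu, sag from 0.1 to 0.9 pu, swell above 1.1 pu). It ignores the duration categories the standard also defines, and its handling of the exact boundary values is an assumption. It is not the monitoring logic deployed on this project.

```python
def classify_rms_event(voltage_pu: float) -> str:
    """Label an RMS voltage (per unit of nominal) by magnitude band only."""
    if voltage_pu < 0:
        raise ValueError("RMS voltage cannot be negative")
    if voltage_pu < 0.1:
        return "interruption"
    if voltage_pu < 0.9:
        return "sag"
    if voltage_pu <= 1.1:
        return "normal"
    return "swell"


if __name__ == "__main__":
    for reading in (0.05, 0.72, 1.00, 1.15):
        print(f"{reading:.2f} pu -> {classify_rms_event(reading)}")
```

A real implementation would also track how long each excursion lasts, since the standard distinguishes instantaneous, momentary, and temporary events.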
Lessons Learned
- Power quality issues can cause reliability problems even with perfect uptime numbers
- Battery monitoring systems provide early warning of failures before capacity tests reveal problems
- Generator transfer testing under actual load conditions reveals issues not found in commissioning
- Real-time monitoring with intelligent alerting enables proactive intervention
- Regular preventive maintenance is more cost-effective than reactive emergency repairs
- Documentation of failure modes and contingency plans improves emergency response
