A Journey to Reliability Excellence: The Effects of Making Work Go Away

Bruce Hawkins

Senior Maintenance and Reliability Consultant, Total Resource Management (TRM), Inc.

June 30, 2026

As I mentioned in the first blog in this series, The Business Case and Executive Support, we achieved some very significant business results. My plant reduced maintenance spend by $22 million per year just by making work go away. This journey demonstrates how disciplined practices drive reliability excellence across complex operations.

Building the Foundation: Cross-Site Reliability Collaboration

One of the things we did right was to create inter-site Reliability teams, where the rotating equipment folks and the instrument and electrical folks got together periodically to share lessons learned and failure history. So that we would continue to learn, we maintained corporate reliability support. We had a Rotating Equipment Engineer who provided support as needed to the site Reliability Engineers, and a corporate E&I engineer who did the same. Our ongoing SAP support was a critical element as well.

Early Wins Through RCM and Operator Knowledge 

We were able to find some low-hanging fruit early in the journey using RCM. For example, we had a chronic issue with a spray condensing system that was supposed to help pull a vacuum on a process vessel. We had trouble getting down to the right level of vacuum. During the RCM analysis, our experienced operators told us they used to clean out the spray nozzles on that condenser periodically, but the practice had been dropped. The result was poor spray efficiency, and vapor was carrying over to the steam ejectors, which were sized to remove only the non-condensables, not excess vapor. We put the nozzle cleaning practice back in place and the problem went away.

During our journey, four of our plants were purchased by another company that brought the mentality that no failure is acceptable. They challenged us to do some level of root cause analysis on every single failure. We developed a three-level process in which the rigor of the analysis was based on the business impact of the failure. The high-impact failures had a cross-functional team assigned to do the analysis. Reliability engineers were assigned to perform root cause analysis on the next tier of failures. Finally, we got the maintenance technicians engaged in the routine failures that happened day to day.
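
The three-level process above can be sketched as a simple triage rule. This is only an illustration: the article does not say how business impact was measured or where the tiers were split, so the dollar thresholds and the names `RcaLevel` and `assign_rca_level` are hypothetical.

```python
from enum import Enum


class RcaLevel(Enum):
    """Who performs the root cause analysis, from most to least rigor."""
    CROSS_FUNCTIONAL_TEAM = 1
    RELIABILITY_ENGINEER = 2
    CRAFT_TECHNICIAN = 3


# Hypothetical thresholds (USD of business impact); the article gives no figures.
HIGH_IMPACT_USD = 250_000
MEDIUM_IMPACT_USD = 25_000


def assign_rca_level(business_impact_usd: float) -> RcaLevel:
    """Pick the analysis tier from the estimated business impact of a failure."""
    if business_impact_usd >= HIGH_IMPACT_USD:
        return RcaLevel.CROSS_FUNCTIONAL_TEAM
    if business_impact_usd >= MEDIUM_IMPACT_USD:
        return RcaLevel.RELIABILITY_ENGINEER
    return RcaLevel.CRAFT_TECHNICIAN


if __name__ == "__main__":
    for impact in (1_000_000, 60_000, 3_000):
        print(impact, assign_rca_level(impact).name)
```

The point of the design is that every failure gets some analysis, while the expensive cross-functional effort is reserved for the few failures that justify it.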

Empowering the Craft Workforce Through Root Cause Analysis 

We implemented root cause analysis at the craft level, adapted from the “seven cause category” approach described in the book Machinery Failure Analysis and Troubleshooting by Heinz Bloch and Fred Geitner (Bloch, H. and Geitner, F., 1999, Machinery Failure Analysis and Troubleshooting, 3rd Edition, Elsevier, Houston, TX, Ch 10). Their philosophy was that there are only seven ways that equipment will fail unexpectedly. By a process of elimination (looking at the failed components, talking to the operator, and looking at equipment history), and spending only about 20 to 30 minutes of craft resource time, you can generally rule out five or six of those cause categories and zero in on the most likely one. We also set up all our failure coding standards in SAP to fit those cause categories. Then we trained the crafts at each of the sites in using that method.
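
The process of elimination described above amounts to crossing categories off a fixed list. The sketch below assumes category labels paraphrased from how the Bloch and Geitner list is commonly summarized; check the wording against the book, and note that the article does not show the actual SAP failure codes. The function name `remaining_causes` is hypothetical.

```python
# Labels paraphrased from the commonly cited Bloch and Geitner list (an assumption;
# the site's real SAP cause codes are not given in the article).
CAUSE_CATEGORIES = (
    "faulty design",
    "material defect",
    "fabrication or processing error",
    "assembly or installation defect",
    "off-design or unintended service conditions",
    "maintenance deficiency",
    "improper operation",
)


def remaining_causes(ruled_out):
    """Return the categories still in play after the technician's checks."""
    ruled_out = set(ruled_out)
    unknown = ruled_out - set(CAUSE_CATEGORIES)
    if unknown:
        raise ValueError(f"unknown cause categories: {sorted(unknown)}")
    return [c for c in CAUSE_CATEGORIES if c not in ruled_out]


if __name__ == "__main__":
    # Example: inspection, operator interview, and history rule out five categories.
    eliminated = [
        "faulty design",
        "material defect",
        "fabrication or processing error",
        "off-design or unintended service conditions",
        "maintenance deficiency",
    ]
    print(remaining_causes(eliminated))
```

Keeping the list fixed and small is what makes the coded results comparable across sites and usable for pattern analysis later.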

One of the sites was a chemical plant that had dozens of distillation columns, tanks, heat exchangers, fired heaters, miles of piping and a whole bunch of centrifugal pumps – over 1,100 centrifugal pumps.  

A Pump Population in Crisis: Understanding the Scale of the Problem 

When the reliability improvement program began, they started measuring mean time between failure for these pumps and found that the average life was nine months. That means they were going through about 1,400 pump replacements a year, which is about four a day. Day shift was installing the spare pumps and night shift was rebuilding them (and neither was being done very well).
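
The arithmetic behind those figures can be checked with a short calculation. This is a rough sketch that assumes a steady state in which every pump in the population fails once per MTBF interval; the function name is hypothetical, and the inputs are the article's rounded numbers.

```python
PUMP_COUNT = 1_100   # "over 1,100 centrifugal pumps"
MTBF_MONTHS = 9      # measured average life


def replacements_per_year(pump_count: int, mtbf_months: float) -> float:
    """Expected replacements per year if each pump fails once per MTBF interval."""
    return pump_count * 12 / mtbf_months


per_year = replacements_per_year(PUMP_COUNT, MTBF_MONTHS)
per_day = per_year / 365

print(round(per_year), round(per_day, 1))  # 1467 4.0
```

With these rounded inputs the estimate comes out slightly above the "about 1,400" quoted, and it agrees with the figure of roughly four replacements a day.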

Engineering, Installation, and Design Improvements That Changed Everything 

They studied the failure cause category data in SAP and found some patterns. They found that they had some design issues, such as the wrong mechanical seal for the service. Some pumps were of a flimsy design when the service required a more robust one. They also found assembly and installation defects. With that many pump replacements a day, mistakes are bound to be made in both assembly and installation. They implemented a centralized precision pump repair facility to raise the quality of the rebuilds. They also ratcheted up expectations on field installation, attacking things like corroded foundations, correcting bolt-bound situations, eliminating pipe strain and doing precision installations in the field.

We also found that there were a lot of failures caused by improper operation. In some cases, the operators were doing some things wrong around startup and shutdown that resulted in pump failures. The site implemented an operator training program to teach operators how pumps work and how to properly care for them.

To summarize, Reliability personnel at the site analyzed the data provided by the crafts and designed and implemented a programmatic solution for each one of those issues. These solutions affected the entire pump population rather than a single installation. The chart below illustrates the dramatic improvement in pump reliability that was experienced. 

An 83% Reduction in Pump Failures: The Impact of Making Work Go Away 

As I mentioned above, we drove a lot of maintenance cost savings by making work go away. Our success with pump failures illustrates how much improved reliability can impact work. Pump failures and repairs decreased by more than 83 percent as a result of the program. Our ability to operate reliably also improved dramatically.
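
To put the 83 percent figure in terms of workload, the sketch below applies it to the roughly 1,400 replacements a year reported at the start of the program. These are implied values, not results reported in the article: they assume the pump population stayed the same size and that the reduction was exactly 83 percent. The function names are hypothetical.

```python
BASELINE_PER_YEAR = 1_400   # approximate replacements per year at the start
BASELINE_MTBF_MONTHS = 9    # average pump life at the start
REDUCTION = 0.83            # "more than 83 percent", taken here as exactly 83%


def failures_after_reduction(baseline_per_year: float, reduction: float) -> float:
    """Annual failures remaining after a fractional reduction."""
    return baseline_per_year * (1 - reduction)


def mtbf_multiplier(reduction: float) -> float:
    """Factor by which MTBF grows when failures fall and the population is constant."""
    return 1 / (1 - reduction)


remaining = failures_after_reduction(BASELINE_PER_YEAR, REDUCTION)
multiplier = mtbf_multiplier(REDUCTION)

print(round(remaining))                              # 238
print(round(multiplier, 2))                          # 5.88
print(round(BASELINE_MTBF_MONTHS * multiplier, 1))   # 52.9
```

Under those assumptions the shops would go from about four pump replacements a day to fewer than one, with an implied average pump life of more than four years.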

Dramatic Improvement in Mean Time Between Failures for Pumps

Driving Accountability Through Site Reliability Audits 

Corporate leadership from the new owners also drove increased accountability, especially at the management level. We conducted regular Site Reliability audits. By this time, I was in a corporate reliability leadership role, and I reported to the same person the plant managers reported to. He was a bit impatient with the progress some of the sites were making in improving business performance. He pushed me to dig into each site’s SAP system and uncover the reliability and data-quality deficiencies that were holding them back. Together, we visited the sites, walked through the audits, and made those gaps impossible to ignore, setting clear expectations for improvement and the accountability needed to achieve it. These efforts delivered significant maintenance optimization and reduced avoidable work across the organization, creating the foundation for the broader reliability transformation that followed and setting the stage for the kind of sustained performance improvement TRM helps clients achieve today.

How TRM Helps Organizations Make Work ‘Go Away’ 

TRM brings the structure, discipline, and hands-on expertise that help organizations turn stories like this into repeatable, scalable outcomes. We work alongside maintenance, operations, and engineering teams to build the same foundations that enabled Bruce’s results: strong reliability governance, clean and actionable data, aligned processes, and a culture that treats failures as preventable. From root cause analysis frameworks to SAP/EAM optimization, from workforce enablement to population-wide equipment strategies, TRM helps clients make work “go away” by eliminating the underlying drivers of failure. The result is the same transformation described in this journey: fewer breakdowns, lower maintenance spend, and a more reliable, predictable operation that performs at its best.

Start reducing avoidable work and improving reliability performance—engage TRM to assess where your biggest gains are waiting. 

Connect with Bruce Hawkins on LinkedIn
