Reliability: How Useful is R(t) Really?

John Q. Todd

Sr. Business Consultant/Product Researcher, Total Resource Management (TRM), Inc.

December 22, 2025

Reliability – How useful is R(t) really to our daily life? Is there a better measure? 

The concept of reliability is certainly not new. It has become one of those words we toss around, assuming it is important or fits our equipment no matter the context. We are told that the reliability of our equipment is naturally quite high because of the attention manufacturers pay to producing a quality product. All of this is well and good, but is knowing R(t) for any piece of equipment really that useful? In fact, without knowing what is needed to come up with a “good” R(t), you can set yourself up for a wild goose chase.

Other measures, such as Availability, might be more useful or practical to you, and are more likely to deliver useful results.

Let’s start with the definition of reliability, courtesy of the American Society of Quality (ASQ):

“The probability that a product, system, or service will perform its intended function adequately for a specified period of time, in a defined environment, without failure.” 

Our goal for this article is not to convince you to abandon your quest for higher reliability with your equipment, but rather to put R(t) in proper perspective – in the light of your operations, and perhaps look to Availability as a more useful measure. 

“Probability” – we already have a problem 

Determining the probability of something happening (or not) requires lots and lots of data to calculate a meaningful result. Take the example of flipping a coin. Assuming the coin is perfectly balanced, you will need to flip it the same way hundreds or thousands of times before you see the expected 50/50 probability of heads or tails. The more times you flip the coin, the closer to 50/50 you get. If you are only allowed to flip the coin a few times, getting heads each time, you might question the 50/50 eventuality.
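To see that convergence in action, here is a small simulation. This is only a sketch in Python; the seed and flip counts are arbitrary choices for illustration:

```python
import random

def heads_fraction(n_flips, seed=42):
    """Flip a simulated fair coin n_flips times; return the fraction of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(rng.random() < 0.5 for _ in range(n_flips)) / n_flips

# With only a handful of flips the estimate can be far from 0.5;
# with thousands of flips it settles close to the true probability.
for n in (10, 100, 10_000):
    print(n, round(heads_fraction(n), 3))
```

With 10 flips the result can easily come out 0.3 or 0.7; only the large run is trustworthy, which is exactly the problem with a handful of failure records.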

If you have only a few data points, for example the failures a piece of equipment has experienced, then any probability or average calculation will deliver rather coarse and not very useful results.

“Intended functions” and “failures” 

Part of arriving at reliability is the definition of what a failure is in the context of the equipment. Obviously, equipment can fail in several ways, but you must develop a clear picture of not only what functions you are expecting, but also how those functions might fail. This is an area where a good Failure Modes and Effects Analysis (FMEA) effort is necessary. Further, failures are not always binary. A failure in the context of how the equipment is operating could be along the lines of reduced output. While the equipment is still operating, the lack of output could be deemed a failure. 

This means that as your maintenance teams are out repairing or correcting equipment, they need to be very aware of how to report a failure. The lists of failures they can select from in their work order application need to be well thought out. If the failure the equipment is actually showing is not in that list, then you will get the next best guess, and in the end your failure reports will not be very useful.

Keep in mind too that “downtime” for things like preventive maintenance and reconfigurations does not equate to failure. You need a distinction between reporting downtime due to PM, reconfiguration, and other work, and downtime because of an actual failure.

“Defined environment” – more things to consider 

Let’s say that you have a motor that has been humming along at 1000 RPM for the last 5 years. You have some data so far that shows it is reasonably reliable. Then, unbeknownst to you, Operations has increased the speed of the motor to 2000 RPM. Your past reliability results, while useful for comparisons in the future, are no longer valid. The environment has changed, so you need to draw a distinct line between when the motor was running at 1000 RPM and its new speed of 2000 RPM. 

In general, the environment or conditions that a piece of equipment is operating under, or that have any kind of impact on its reliability, do not change very often. Seasonal changes can be accounted for in your calculations rather easily. For the most part, equipment just hums along doing its thing until it is not. 

One final thing to consider is whether the equipment we are talking about is deemed repairable. If the maintenance approach is to remove and replace the equipment when it fails, then Mean Time To Failure (MTTF) is the appropriate measure. Mean Time Between Failures (MTBF) is used for repairable equipment.

Can we calculate something already? 

Ok, thanks for being patient. Let’s start out with the commonly known measure of Mean Time Between Failures (MTBF) for repairable equipment. Sorry, but we once again have a problem. <sigh> 

Mean (or average) is a calculated number, and very often a mean value is not a value that actually appears in your data set. One could argue that a mean is a “fake” number for exactly that reason. (Who said statistics is not also a philosophical endeavor!) It takes lots and lots of data points to produce a mean that is… well… meaningful. Let’s try this:

Specific equipment failure #1 after 30 days of operation

After repair, failure #2 after another 180 days of operation

What is your MTBF?

MTBF = (30 + 180)/2 = 105 days

Given this very limited data set, does it seem reasonable to expect, on average and at a constant rate, a failure of this piece of equipment every 105 days? No. This is the trouble with trying to use R(t) based upon an MTBF calculated from a limited failure data set.
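The arithmetic above can be written as a tiny helper. This is only a sketch; it assumes the numbers you feed it are successive uptime intervals between failures:

```python
def mtbf(uptime_intervals):
    """Mean time between failures, given uptime intervals (e.g. in days)."""
    if not uptime_intervals:
        raise ValueError("need at least one interval")
    return sum(uptime_intervals) / len(uptime_intervals)

# The two intervals from the example: 30 days, then another 180 days.
print(mtbf([30, 180]))  # 105.0
```

Note that the function is perfectly happy to average two points; nothing in the math warns you that two points are nowhere near enough.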

Let’s now actually calculate R(t) for our increasingly silly example: 

R(t) = e^(–t/MTBF) (stated: reliability at a point in time, t, is equal to e raised to the negative power of t/MTBF)

So, let’s make t = 365 days 

R(365) = e^(–365/105) ≈ 0.031

This is a really low reliability. You “know” that your equipment is far more “reliable” than this. Given the input of a coarse MTBF, we get a coarse R(t) result.

You can think of the numbers like this: if your MTBF (105 days) is lower than the time you are looking at (365 days), then your Reliability will be low, because failures occur before the 365 days are up. Since in our silly example the equipment failed twice well before the one-year mark, reliability is low. If your MTBF is greater than the time you are looking at, then your reliability will be higher. Try this example:

MTBF = 1886 hours 

t = 1000 hours 

R(1000) = e^(–1000/1886) ≈ 0.59

So, at 1000 hours you have a 59% probability that the equipment is up and running. Just over a 50% chance… not too bad, really. Slightly better than a coin flip.
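Both results come from the same exponential model. A minimal sketch, assuming the constant-failure-rate model used above:

```python
import math

def reliability(t, mtbf):
    """R(t) = e^(-t/MTBF): probability of surviving to time t without failure."""
    return math.exp(-t / mtbf)

print(round(reliability(365, 105), 3))    # the coarse-MTBF example: ~0.031
print(round(reliability(1000, 1886), 2))  # the second example: ~0.59
```

Swap in your own MTBF and see how sensitive R(t) is to it; that sensitivity is exactly why a coarse MTBF gives a coarse R(t).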

So forget chasing Reliability… let’s look at Availability. 

Let’s move on to a slightly easier metric, Availability. One does not need to work with exponentials and averages to calculate it. Availability is simply the ratio of the time the equipment is actually available to you to the total time it could be available.

If there are 8760 operating hours available in a single year (24/7 operation), and the piece of equipment has been down for 100 of those hours (remember those two failures?), then your Inherent Availability is: 

Inherent Availability = 1 – (100/8760) ≈ 0.99

Not too bad! For 99% of the operational time made available by the Earth faithfully orbiting the Sun, the equipment is ready to perform its tasks. This seems to fit better with our view of reality. Yes, the equipment failed a couple of times, but there was not very much downtime as a result.

Unplanned downtime is never good, but when put into the perspective of the operating time over an entire year, those few hours might not be such a big deal. However, if that downtime resulted in significant financial losses or even a safety issue, we are getting into the criticality of the equipment to the operation, where any downtime might be of great importance. A discussion for a later time.

If one includes the downtime due to preventive maintenance, reconfiguration, etc. then you have an Operational Availability measure. Let’s say each year the equipment is down an additional 100 hours for such things. 

Operational Availability = 1 – (200/8760) ≈ 0.977

So, that extra 100 hours of planned downtime does have an impact. That roughly 1% reduction in Availability costs your company money, and it is yet another calculation you can make.
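Both availability figures follow directly from downtime over total time. A sketch, assuming 24/7 operation over a calendar year:

```python
HOURS_PER_YEAR = 8760  # 24/7 operation

def availability(downtime_hours, total_hours=HOURS_PER_YEAR):
    """Fraction of total time the equipment was available."""
    return 1 - downtime_hours / total_hours

print(round(availability(100), 3))  # inherent: failure downtime only
print(round(availability(200), 3))  # operational: failures plus planned work
```

No exponentials, no averages: just two numbers your downtime records can supply directly.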

What to do? Stop collecting failure data? 

Sorry, but you do not get off that easy. Collecting failure details from the field on your equipment is valuable whether you use them for reliability calculations or for simpler quantitative analysis. Knowing from the experienced folks in the field what failures they are seeing is useful for optimizing preventive maintenance activities, designing new systems, and even managing safety plans. Don’t lose sight of the value of failure reporting, even if it is not helping your hard-core calculations very much.

Rather, being disciplined to capture the different types of downtime in your context (separated at least between planned/maintenance time and due to actual failures time) leads to perhaps a more important measure, Availability. You expect the equipment to be available to you X hours per year. Failures take that equipment away from you and cost you money in repair time and lost production. All those necessary tasks such as preventive maintenance activities also take the equipment away from you for production. Knowing how to slice and dice the notion of “downtime” is very valuable. 
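Slicing downtime by category can start as simply as a tally. The log entries and category names below are hypothetical, invented for illustration; in practice they would come from your work order system:

```python
from collections import defaultdict

# Hypothetical downtime log: (category, hours) per event.
downtime_log = [
    ("failure", 60),
    ("preventive_maintenance", 80),
    ("failure", 40),
    ("reconfiguration", 20),
]

# Sum the hours recorded against each downtime category.
totals = defaultdict(float)
for category, hours in downtime_log:
    totals[category] += hours

HOURS_PER_YEAR = 8760
for category, hours in sorted(totals.items()):
    print(f"{category}: {hours:.0f} h ({hours / HOURS_PER_YEAR:.1%} of the year)")
```

Keeping planned work and actual failures in separate buckets is what makes both Inherent and Operational Availability computable later.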

A final argument… 

What if you have a piece of equipment that fails quite often, but is “repaired” and put back into service very quickly? From the MTBF numbers, its reliability is very low, but its Availability is very high because there is very little downtime. The decision you must make is: is it worth the cost to replace the equipment with something new or different, or is the low cost in time and money of just banging on the housing when it fails acceptable?

No matter the measures, or even the formulas you use for those measures, what matters is improvement over time. If you have a measure that has not changed over an extended period, one could argue that the measure is not helping you. Whatever you are doing is either maintaining an acceptable level, or not doing anything to improve the situation. You will never achieve 100%, but even small changes towards that goal can be quite valuable. 

Be sure you understand what the numbers are really telling you so you can trust them, but always verify the source(s) of the data and the formulas used. Thoughtful skepticism is a powerful tool.

TRM assists clients across multiple industries in deploying technology solutions that improve operational performance and equipment reliability. Our team stays ahead of available technologies while applying hands-on experience to determine their appropriate use. The result is a practical, well-defined roadmap that is achievable and delivers measurable organizational benefits.

Contact one of our Senior Business Consultants at askTRM@trmnet.com 

Follow John Q. Todd on LinkedIn for more insights on reliability, availability, and practical asset management.
