Run-to-Fail
Calculated risk or costly oversight?
Sometimes, facility managers let equipment break on purpose. And before anyone clutches their PM schedules in horror, this is not about being lazy. It is about being smart with limited resources and even more limited budgets.
Run-to-fail (RTF) is not typically viewed as “being responsible.” Skeptical? Good, because this perspective challenges much of what is taught about asset management.
Most facilities stumble into RTF mode like someone walking into a glass door — unexpected, painful and witnessed by way too many people. This is costly oversight. It is that awkward moment when FM teams realize they have been in reactive mode so long that their entire strategy consists of crossed fingers and emergency purchase orders.
The true price of break, fix, repeat
Here is what costly oversight actually looks like: it is not just the repair invoice that makes the CFO's eye twitch. It is the domino effect that follows — a cascading series of oh-no moments.
Picture this: An aging air handler finally gives up on the hottest day of the year. Now the organization is not just dealing with a repair bill. It must deal with lost productivity, expedited shipping costs, overtime labor, secondary damage and a reputation that is now somewhere between "unreliable" and "actively sabotaging operations."
One minute, the FM is sipping coffee and reviewing work orders; the next, they are speed-dialing vendors while calculating overtime costs and wondering whether their resume is up to date.
Strategic neglect: The new secret weapon
Now for the plot twist: sometimes letting assets fail is not negligence — it is strategy. This is calculated risk, and it is perfect for noncritical assets for which monthly preventive maintenance costs more than just replacing the asset when it fails.
Think of it like this: nobody takes a pen in for regular maintenance. They write with it until it stops working, then grab another one. This is the same principle, but with bigger equipment. That bathroom exhaust fan that nobody notices? The one that costs US$200 to replace, but US$500/year in PM labor? Let it live its best life until it gives out.
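The exhaust fan arithmetic can be written out as a quick check. This is only a sketch: the function name, the 10-year expected life and the optional failure-consequence cost are assumptions added for illustration. Only the US$200 replacement cost and the US$500/year PM labor come from the example above.

```python
def rtf_is_cheaper(replacement_cost, annual_pm_cost, expected_life_years,
                   failure_consequence_cost=0.0):
    """Return True when running to failure costs less per year than PM.

    Simplifying assumption: the comparison is annualized replacement cost
    (plus any cost caused by the failure itself) versus annual PM labor.
    """
    if expected_life_years <= 0:
        raise ValueError("expected_life_years must be positive")
    annualized_rtf_cost = (
        (replacement_cost + failure_consequence_cost) / expected_life_years
    )
    return annualized_rtf_cost < annual_pm_cost


# Bathroom exhaust fan from the example; the 10-year life is hypothetical.
print(rtf_is_cheaper(replacement_cost=200, annual_pm_cost=500,
                     expected_life_years=10))  # True
```

The `failure_consequence_cost` argument is where criticality enters: if the same failure would cause, say, US$10,000 in secondary damage, the answer flips, and the asset is no longer an RTF candidate.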
The keyword here is calculated. This is not about abandoning all maintenance. It is about being intentional with where teams spend their time, money and sanity.
Asset sorting
Time to strategically sort assets:
- High criticality assets: These get the VIP treatment: predictive maintenance, constant monitoring, the works.
- Low criticality assets: Perfect candidates for calculated RTF. Let them run until they stop.
Seems straightforward enough, right? Yes, but it can go much further than this simplistic overview.
How does an organization rate something as a “high criticality asset”? Start by looking at the organization’s mission and/or purpose. Why is this important?
Because criticality is contextual. An HVAC system in a standard office building is a high criticality asset for comfort and productivity. However, that same HVAC system in a Level 4 bio-containment lab is a catastrophic failure point. If the organization's mission is "Don't let the plague escape," the asset is not just important — it is existential.
To rate an asset, one must ask: "If this stops working right now, does the mission stop with it?"
Making it real: The criticality method that actually works
Here is where FM teams get into the nitty-gritty of figuring out what actually matters (spoiler: not everything does, despite what that one anxious project manager tells them).
Assets can be rated on various factors and scales depending on the organization’s objectives and uses. For simplicity and common usage, consider a 1-5 scale across several criticality factors, such as safety, mission impact, economics and lead time.
But wait, there's more: The weighting game
Because not all factors are created equal (safety trumps convenience every time), weights can be applied to prioritize decisions, ensuring that the most critical aspects are addressed first and that the overall outcome aligns with the organization's core objectives.
- Safety gets a 1.5x multiplier (safety first)
- Mission gets 1.2x (keeping operations running keeps everyone employed)
- Economics gets 0.8x (money matters)
- Lead time gets 0.5x (inconvenient but rarely catastrophic)
Here is this in action with a real example:
The tale of two chillers:
Same equipment type, vastly different criticality.
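One way to see how the same equipment type can land in different tiers is to run the weights through a small calculation. The sketch below uses the four multipliers listed earlier. Everything else is an assumption made for illustration: the function name, the choice to divide by the sum of the weights so the result stays on the 1-5 scale, and both sets of ratings, which are hypothetical and not taken from the article's chart.

```python
# Multipliers from the weighting list; keys are illustrative names.
WEIGHTS = {"safety": 1.5, "mission": 1.2, "economics": 0.8, "lead_time": 0.5}


def weighted_criticality(ratings):
    """Weighted average of 1-5 factor ratings, kept on the 1-5 scale.

    Assumption: the weighted sum is divided by the total weight so the
    score can be compared directly against 1-5 tier cutoffs.
    """
    for factor in WEIGHTS:
        if not 1 <= ratings[factor] <= 5:
            raise ValueError(f"{factor} rating must be between 1 and 5")
    weighted_sum = sum(WEIGHTS[f] * ratings[f] for f in WEIGHTS)
    return weighted_sum / sum(WEIGHTS.values())


# Hypothetical ratings for two chillers of the same model.
data_hall_chiller = {"safety": 5, "mission": 5, "economics": 4, "lead_time": 5}
office_chiller = {"safety": 2, "mission": 2, "economics": 3, "lead_time": 2}

print(round(weighted_criticality(data_hall_chiller), 1))  # 4.8
print(round(weighted_criticality(office_chiller), 1))     # 2.2
```

With these made-up inputs, the first chiller scores near the top of the scale and the second sits close to the bottom of the middle range, even though the hardware is identical.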
The 80/20 labor strategy (or how to do more with less without losing your mind)
When facilities have more equipment than technicians (welcome to literally every facilities team ever), here is the play:
The new PM schedule
- Top tier (Scores 4.0-5.0): Gets 100 percent of the required PM. No negotiations, no exceptions. If the manufacturer says monthly, it is monthly. If the weighted score says a piece of equipment is a 4.8, the technician had better know it by its first name.
- Middle tier (Scores 2.0-3.9): Quarterly visits instead of monthly dates. Think of it as moving from a committed relationship to "it's complicated." Use junior techs, apprentices or outsourced vendors for these. Senior techs have better things to do than check belt tension on noncritical exhaust fans.
- Bottom tier (Below 2.0): Zero PM hours. Zilch. Nada. Let it ride until it dies. Reclaim those hours.
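The three tiers translate into a simple lookup. This is a sketch, not a prescribed implementation: the function and dictionary names are made up, and treating any score between 3.9 and 4.0 as middle tier is an assumption, since the tier list does not say where such a score belongs.

```python
# PM treatment per tier, paraphrased from the tier list.
PM_PLAN = {
    "top": "100 percent of required PM",
    "middle": "quarterly visits by junior techs, apprentices or vendors",
    "bottom": "zero PM hours (run to fail)",
}


def pm_tier(weighted_score):
    """Map a 1-5 weighted criticality score to a PM tier."""
    if not 1 <= weighted_score <= 5:
        raise ValueError("weighted_score must be between 1 and 5")
    if weighted_score >= 4.0:
        return "top"
    if weighted_score >= 2.0:  # assumption: 3.9 to 4.0 counts as middle
        return "middle"
    return "bottom"


for score in (4.8, 2.2, 1.4):
    print(score, pm_tier(score), "->", PM_PLAN[pm_tier(score)])
```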
The capital planning power move
When the finance department says, "pick two out of ten," do not just hand them a wish list. Give them a risk-weighted reality check that makes their decision easy.
Fund by score, not by who complains loudest or has the best PowerPoint skills.
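Funding by score can be as plain as sorting the request list. The sketch below assumes each capital request already carries a 1-5 weighted criticality score. The function name, project names and scores are all hypothetical.

```python
def fund_by_score(requests, slots):
    """Return the names of the `slots` highest-scoring requests.

    `requests` is a list of (name, weighted_score) pairs. Requests with
    equal scores keep their original order because the sort is stable.
    """
    ranked = sorted(requests, key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:slots]]


# Hypothetical wish list; finance will fund two.
requests = [
    ("Lobby lighting retrofit", 1.8),
    ("Data hall chiller", 4.8),
    ("Parking gate arm", 2.4),
    ("Fire pump controller", 4.6),
]
print(fund_by_score(requests, slots=2))
# ['Data hall chiller', 'Fire pump controller']
```

A real capital plan would also weigh cost and timing, but handing finance a ranked list like this keeps the conversation on risk rather than volume.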
3 hacks for limited resources
1. The critical spares strategy
Streamline RTF asset management by identifying the units most likely to fail in the near future. Rather than scheduling more preventive maintenance, skip the PM and purchase replacement parts while the budget is available. Think of this as a back-pocket strategy: insurance that is actually cost-effective and practical. Storing these parts properly (labeled and accessible rather than buried in a service truck) ensures that when a failure inevitably occurs, it is a quick 20-minute fix rather than a grueling four-week procurement nightmare.
Real example: Those old circulator pumps running to failure? Buy three, stick them on a shelf. When one dies, swap it in within 30 minutes instead of waiting three weeks for delivery, while everyone asks why the heat is not working.
2. Smart monitoring (let the robots do the walking)
Instead of spending technicians' hours on routine checks of equipment that is likely functioning fine, invest in basic automated monitoring. Vibration sensors on rotating equipment or temperature sensors in critical spaces typically cost less than monthly site visits or responding to comfort complaints after the fact. Similarly, power monitors for critical loads and Wi-Fi-enabled leak detectors provide a proactive safety net at a fraction of the cost of recovering from a major power failure or catastrophic water damage.
Techs get alerts when something is actually wrong, not when the calendar says it is time to look. This is like having a maintenance crystal ball, without the mystical nonsense.
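The alert-instead-of-calendar idea boils down to comparing the latest reading against a limit. The sketch below is deliberately generic: the function name, asset IDs, readings and limits are hypothetical, and real limits would come from the sensor vendor or the equipment manufacturer.

```python
def assets_needing_attention(readings, limits):
    """Return the IDs of assets whose latest reading exceeds its limit.

    Assets without a configured limit are skipped rather than flagged.
    """
    return sorted(
        asset_id
        for asset_id, value in readings.items()
        if asset_id in limits and value > limits[asset_id]
    )


# Hypothetical vibration readings and alert limits (same units for both).
latest_vibration = {"AHU-1 fan": 2.1, "CHW pump 2": 7.4, "Exhaust fan 9": 3.0}
vibration_limits = {"AHU-1 fan": 4.5, "CHW pump 2": 4.5}

print(assets_needing_attention(latest_vibration, vibration_limits))
# ['CHW pump 2']
```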
3. The death row list
Create a brutally honest "death row" list for assets reaching their breaking point — like that 25-year-old rooftop unit held together by duct tape and luck. Once the FM team has tracked those assets’ age, condition and replacement costs, stop all preventive maintenance; there is no sense in polishing the Titanic's brass. Instead, pivot by rolling those operations and maintenance funds directly into a dedicated replacement fund and scheduling the swap for the slow season. By keeping vendor quotes ready, the FM team transforms an inevitable mechanical failure into a controlled, well-funded upgrade.
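A death row list can start as a filter over the asset register. In this sketch the field names, the 1-5 condition scale, the "past expected life and condition 2 or worse" rule and all of the sample data are assumptions for illustration, not criteria from the article.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    age_years: int
    expected_life_years: int
    condition: int  # hypothetical scale: 1 = failing, 5 = like new
    annual_pm_cost: float
    replacement_cost: float


def death_row(assets, max_condition=2):
    """Assets past expected life and in poor condition (assumed rule)."""
    return [
        a for a in assets
        if a.age_years >= a.expected_life_years and a.condition <= max_condition
    ]


def redirected_pm_funds(assets):
    """Annual PM dollars freed up by stopping PM on the listed assets."""
    return sum(a.annual_pm_cost for a in assets)


# Hypothetical fleet.
fleet = [
    Asset("RTU-3", 25, 20, 1, 1200.0, 18000.0),
    Asset("RTU-7", 6, 20, 4, 1200.0, 18000.0),
    Asset("Boiler-1", 31, 30, 2, 2500.0, 60000.0),
]
doomed = death_row(fleet)
print([a.name for a in doomed])     # ['RTU-3', 'Boiler-1']
print(redirected_pm_funds(doomed))  # 3700.0
```

The second number is the yearly amount that could roll into the replacement fund once PM on those assets stops.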
Making the transition without mutiny
Implementing a new approach is not about organizations throwing their hands up and declaring, "we're letting stuff break now!" They need a strategic roadmap that addresses the real issues.
The reasoning becomes obvious: skilled technicians are wasting time cleaning noncritical fan coils because the manufacturer said so. Meanwhile, critical assets are begging for attention. That is not just inefficient — it is frustrating some of the best people who see the PM program as about as flexible as a concrete block.
Organizations can fix this by talking to their technicians. They are the ones in the trenches, and they have insights that leadership needs. Companies should run pilot programs to test new practices — think of them as controlled experiments rather than full-scale gambles.
Communication matters. Management should not just drop changes on people like a surprise pop quiz. They must explain what is changing, why it matters and — this is the kicker — how it makes their teams' lives better. And they should provide actual training. Nobody likes fumbling with new procedures while trying to keep operations running.
Companies should set up metrics that focus on what actually matters: their critical assets. They should track whether this new approach is working or if they are just rearranging furniture.
With some planning and consideration for the people doing the work, organizations can optimize their workforce and improve reliability. It is not rocket science, but it does require more thought than "let's just wing it."
The psychology of letting go
For many FM professionals, letting equipment fail feels wrong. They are fixers by nature. They take pride in uptime. But here is the truth bomb: maintaining noncritical assets to perfection while critical assets suffer from neglect is not noble — it is negligent.
Bottom line
Run-to-fail is not the enemy — unplanned run-to-fail is. Make it a calculated choice, not a default setting. The only thing worse than equipment failure is equipment failure that catches teams with budgets depleted and phones ringing at 3 a.m.
This is not ignoring maintenance. It is strategically choosing where to spend time and money. That is not negligence — that is intelligence. It is the difference between being reactive and being smart; between working harder and working smarter; and between maintaining everything poorly and maintaining what matters exceptionally.
Nic Guedenet is an FM Specialist at AkitaBox with over 20 years of experience in facility management. He’s worked his way from maintenance tech to VP of Operations, managed 35+ client campuses, led FCA teams for five years, and was recognized as a green business leader for running a carbon-neutral building. Nic is also a co-host of the Facilities Unfiltered podcast.
Josh Lowe has a degree in architecture, a background in construction and a penchant for pushing the boundaries of new technologies, as he leads the way in the building industry. He is passionate about helping AEC firms, building owners and FM professionals make data-driven decisions for operational efficiency and success. He literally helped write the book on building information modeling (BIM) as a contributing author to “Mastering AutoDesk Revit Architecture: 2011 & 2012,” and has consulted on several BIM and facility management technologies. In 2016, he co-founded AkitaBox, a SaaS application for proactive building management and facility assessments.
Top image via Getty Images. Article charts courtesy of AkitaBox