Most mobile apps provide a service in one form or another, and a disruption of that service is a worst-case scenario for both users and providers. Any amount of disruption to huge apps and services like Facebook, Google, or Instagram now makes breaking news. Not only does the downtime itself cost a lot of revenue, but the lasting damage to the brand carries its own considerable cost. Smaller apps and services can be completely crippled by these disruptions.
You can do a lot to prevent these disruptions, but no system is perfect, and at some point something will go wrong. This is where MTTR comes in. The faster you can repair a disruption in your mobile app’s service on average, the more money and goodwill you save. Apps and systems with a low MTTR limit the impact of each incident on business outcomes. Apps with a high MTTR hurt the user experience and, in turn, business metrics.
In this article, we focus specifically on using the MTTR metric for mobile apps and their performance.
What is MTTR?
MTTR stands for mean time to repair. It is a metric used to measure the average time taken to resolve issues from the moment they were initially detected. MTTR is an incredibly useful metric to assess the maintainability of an app. The acronym itself is also commonly used by DevOps teams for infrastructure-related issues and equipment lifecycles.
Even though it measures time to repair, MTTR covers more than the repair itself. After an issue is identified, diagnosis, repair, testing, and more all go into resolving it completely. The faster and more efficiently you can perform these activities, the better your app’s MTTR will be. The best way to optimize these areas is to measure how long each one takes and then look for faster ways to diagnose and solve issues.
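To make the idea of measuring each activity concrete, here is a minimal sketch in Python. The incident records, phase names, and durations are hypothetical and only illustrate the approach; your own tracking may split the work differently.

```python
# Hypothetical per-incident phase durations, in minutes.
# The phase names and numbers are made up for illustration.
incidents = [
    {"diagnosis": 20, "repair": 15, "testing": 10},
    {"diagnosis": 60, "repair": 40, "testing": 20},
    {"diagnosis": 35, "repair": 25, "testing": 15},
]


def average_phase_durations(records):
    """Return the mean duration of each phase across all incidents."""
    if not records:
        raise ValueError("at least one incident is required")
    phases = records[0].keys()
    return {
        phase: sum(record[phase] for record in records) / len(records)
        for phase in phases
    }


averages = average_phase_durations(incidents)

# The phase with the largest average is the first candidate for optimization.
slowest_phase = max(averages, key=averages.get)

for phase, minutes in averages.items():
    print(f"{phase}: {minutes:.1f} min on average")
print(f"Slowest phase: {slowest_phase}")
```

In this made-up data set, diagnosis takes the largest share of the time, so faster detection and diagnosis would lower MTTR the most.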
How to calculate MTTR
The basic calculation of MTTR is:
MTTR = (Total amount of time the service was unavailable in a period of time) / (Number of disruption incidents in that period of time)
A common way to track these numbers is to use support tickets and measure the time from when each ticket was created to when it was closed or resolved.
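As a rough sketch of that calculation, the Python snippet below computes MTTR from a list of ticket timestamps. The ticket data and the function name are hypothetical; in practice the timestamps would come from your ticketing or incident system.

```python
from datetime import datetime, timedelta

# Hypothetical tickets as (created_at, resolved_at) pairs for one period.
tickets = [
    (datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 9, 45)),       # 45 minutes
    (datetime(2024, 1, 10, 14, 0), datetime(2024, 1, 10, 16, 0)),    # 120 minutes
    (datetime(2024, 1, 21, 22, 30), datetime(2024, 1, 21, 23, 45)),  # 75 minutes
]


def mean_time_to_repair(incidents):
    """Total time spent on incidents divided by the number of incidents."""
    if not incidents:
        raise ValueError("MTTR is undefined when there are no incidents")
    total_downtime = sum(
        (resolved - created for created, resolved in incidents),
        timedelta(),
    )
    return total_downtime / len(incidents)


mttr = mean_time_to_repair(tickets)
print(mttr)  # 1:20:00
```

Here the three incidents add up to 240 minutes of disruption, so the MTTR for the period is 80 minutes.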
By establishing your MTTR you can begin to better understand how you can improve your numbers and address performance issues faster. This will help you improve reliability and user experience as well as reduce costs.
MTTR vs MTTF vs MTBF
MTTR isn’t the only failure metric to be aware of. Organizations track these KPIs to gauge their systems’ reliability, which can cover general functionality issues or component failures, not just service outages. The acronym MTTR is also used for a closely related metric, mean time to recovery. The main failure metrics to know are:
- Mean Time To Recovery (MTTR) measures the time from the point at which the failure is first discovered until the point at which the equipment returns to operation. So, in addition to repair time, testing, and the return to normal operating conditions, it captures failure notification time and diagnosis.
- Mean Time Between Failures (MTBF) measures the predicted time that passes between one failure of a mechanical/electrical system and the next during normal operation. In simpler terms, MTBF helps you predict how long an asset can run before the next unplanned breakdown happens. The expectation that failure will occur at some point is an essential part of MTBF. The term MTBF is used for repairable systems, but it does not take into account units that are shut down for routine scheduled maintenance.
- Mean Time To Failure (MTTF) is a very basic measure of reliability used for non-repairable systems. It represents the length of time that an item is expected to last in operation until it fails. MTTF is what we commonly refer to as the lifetime of a product or device. Its value is calculated by observing a large number of items of the same kind over an extended period and averaging how long each one ran before failing.
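The two reliability metrics above can be sketched with simple averages. The snippet below uses the common formulations: MTBF as total operating time divided by the number of failures, and MTTF as the average observed lifetime of the units. All numbers and function names are hypothetical.

```python
def mean_time_between_failures(total_uptime_hours, failure_count):
    """MTBF for a repairable system: operating time per unplanned failure."""
    if failure_count <= 0:
        raise ValueError("MTBF needs at least one failure in the period")
    return total_uptime_hours / failure_count


def mean_time_to_failure(lifetimes_hours):
    """MTTF for non-repairable items: average lifetime across the units."""
    if not lifetimes_hours:
        raise ValueError("MTTF needs at least one observed lifetime")
    return sum(lifetimes_hours) / len(lifetimes_hours)


# Hypothetical 30-day window (720 hours) with 6 hours of unplanned downtime
# spread across 3 failures. Scheduled maintenance is assumed to be excluded.
uptime_hours = 720 - 6
mtbf = mean_time_between_failures(uptime_hours, 3)

# Hypothetical lifetimes, in hours, of three identical non-repairable units.
mttf = mean_time_to_failure([1000, 1200, 1400])

print(f"MTBF: {mtbf} hours")  # MTBF: 238.0 hours
print(f"MTTF: {mttf} hours")  # MTTF: 1200.0 hours
```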
How Instabug APM can help
APM tools help you monitor the different elements affecting your mobile app’s performance. Service disruptions are usually the result of network issues, and many factors go into your app’s network performance. Some are in the mobile developer’s control, but unpredictable issues can still occur on the users’ side. By monitoring your network performance, you can take a proactive approach to detecting, diagnosing, and fixing network issues, which keeps your MTTR low.
Traditionally, network performance monitoring was done purely from the server side. While it is important to monitor server-side network performance, it doesn’t tell the whole story. Normally, any APM tool will be able to highlight network failures from the server side. But Instabug APM takes it a step further by also highlighting whether the network failure came from the client side. This helps you determine at a glance whether the issue should be assigned to the backend team or the mobile team.
By tracking the full round trip of your network requests as seen on the client side, you can identify and act on issues faster. Instabug also alerts you about common network failures, which helps you detect problems before they grow into considerable outages.
Instabug’s triple threat of Bug Reporting, Crash Reporting, and APM tools will empower you to optimize your app’s performance and deliver the high-quality experience your users expect and deserve.