Tips & Tricks

How to set up effective alerts

July 19, 2021

You’ve joined 100+ operational leaders at companies like Airbnb, Stripe, Oscar Health, and Square, and decided to use alerts to manage business operations. Great!

Getting started

Before you can get started, you need to have a few things in place.

Data

As the famous saying goes, “garbage in, garbage out”. To make your data operational, you want to ensure that it is:

  • Centralized: Warehouses like Snowflake and Redshift increasingly house operational, not just analytical, data.
  • Accessible: Get read permissions to these central sources of information.
  • Clean and reliable: Set up data pipelines to ensure the data is clean, in the right format, and actionable.
  • Fresh: Make sure the data is updated at a cadence that makes sense for the operations you’re tracking.
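As a rough illustration of the freshness point, a check like the one below could run before any alert rule is evaluated. This is a minimal sketch: the `is_fresh` helper, the `last_updated` timestamp, and the one-hour cadence are assumptions made for the example, not part of any specific tool.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def is_fresh(last_updated: datetime, max_age: timedelta,
             now: Optional[datetime] = None) -> bool:
    """Return True if the data was updated within the allowed window."""
    # `now` is injectable so the check can be tested deterministically.
    now = now or datetime.now(timezone.utc)
    return now - last_updated <= max_age

# Hypothetical usage: an orders table that should refresh at least hourly.
checked_at = datetime(2021, 7, 19, 12, 0, tzinfo=timezone.utc)
last_load = datetime(2021, 7, 19, 9, 30, tzinfo=timezone.utc)
if not is_fresh(last_load, timedelta(hours=1), now=checked_at):
    print("orders data is stale; hold alerts that depend on it")
```

Pick `max_age` per data source, based on how quickly the operation it feeds needs to react.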

Business goals

Your data is only as good as your goals, so make sure you’re managing to the right metrics. Business users, product managers and engineering might all need to be kept in the loop. Create a staffing plan for who will monitor and respond to the alerts. You’ll also need a way to maintain the system over time and flexibly update it as your business grows.

Once you have your input data in one place and goals defined, be wary of siloing logic by writing your marketing rules in Marketo and sales rules in Salesforce. Use an application-agnostic tool to manage rules to make sure your rules can run in coordination (no customer wants to receive five emails in one day), are auditable and can be optimized in total.

Don’t let the seams in your company and tools become the seams in your customer’s experience!
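One way to picture rules running in coordination is a single gate that every rule calls before contacting a customer. The sketch below is illustrative only: `MessageGate`, the rule names, and the cap of one message per day are hypothetical choices, and a real system would persist this state rather than keep it in memory.

```python
from collections import defaultdict
from datetime import date

class MessageGate:
    """Central gate that every rule calls before contacting a customer."""

    def __init__(self, daily_cap: int = 1):
        self.daily_cap = daily_cap          # assumed cap, tune per business
        self._sent = defaultdict(int)       # (customer_id, day) -> messages sent
        self.audit_log = []                 # (day, customer_id, rule, allowed)

    def try_send(self, customer_id: str, rule: str, day: date) -> bool:
        key = (customer_id, day)
        allowed = self._sent[key] < self.daily_cap
        if allowed:
            self._sent[key] += 1
        # Every decision is recorded, so the rules stay auditable.
        self.audit_log.append((day, customer_id, rule, allowed))
        return allowed

gate = MessageGate(daily_cap=1)
today = date(2021, 7, 19)
print(gate.try_send("cust-1", "marketing_welcome", today))  # first message goes out
print(gate.try_send("cust-1", "sales_followup", today))     # suppressed by the cap
```

Because marketing and sales rules share one gate, the cap applies across tools, and the audit log shows which rule was suppressed and when.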

Deciding what to monitor

Once you have the prep work in place to start alerting, you’re ready to start deciding what to monitor.

Alerts should be real, urgent and actionable. Here’s a quick diagnostic to know if you should modify your alerts.

How to craft a good alert

Good alerts are created when needed, have clear content and are managed well.

Creation
  • Make sure you’re well calibrated on what’s normal, and experiment to find the best thresholds. Historical data is a good starting point. If you’re setting up a new alert, it’s normal not to have this information upfront, but prioritize getting it.
  • Don’t re-alert for the same issue, whether it’s from the same rule or different rules that alert for the same underlying condition. You’ll add to alert fatigue, and it can be easy to miss non-duplicates.
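The two creation tips can be sketched in a few lines. Both pieces are assumptions for illustration: the "mean plus k standard deviations" rule is just one common way to derive a threshold from history, and `Deduplicator` with its `condition_key` is a hypothetical helper in which rules that detect the same underlying condition share one key.

```python
import statistics

def calibrate_threshold(history, k=3.0):
    """Derive a threshold from historical values: mean + k sample std devs."""
    return statistics.mean(history) + k * statistics.stdev(history)

class Deduplicator:
    """Suppress repeat alerts while the same underlying condition is open."""

    def __init__(self):
        self._open = set()

    def should_alert(self, condition_key: str) -> bool:
        if condition_key in self._open:
            return False            # already alerted; avoid adding to fatigue
        self._open.add(condition_key)
        return True

    def resolve(self, condition_key: str) -> None:
        self._open.discard(condition_key)   # allow future alerts again

# Illustrative numbers only, e.g. failed payments per hour last week.
threshold = calibrate_threshold([10, 12, 11, 9, 13])
dedup = Deduplicator()
if 20 > threshold and dedup.should_alert("payments:failure-spike"):
    print(f"alert: 20 failures exceeds threshold {threshold:.1f}")
```

Revisit `k` as you collect more history; a threshold that fires on normal variation is a sign it needs recalibrating.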
Content
  • Clearly state what is going on (cause) and the impact (symptom).
  • Use emojis or another visual way of making it skimmable.
  • In the alert itself, link to resolution steps or to debugging data.
  • Mention how far the current situation deviates from a normal benchmark, or how severe a reaction is needed.
  • Define the time window in which this needs to be fixed, if available.
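Put together, the content tips might produce a message like the one built below. This is a sketch, not a required template: `format_alert`, its parameters, the sample values, and the `example.com` runbook URL are all made up for illustration.

```python
def format_alert(what, impact, current, baseline, runbook_url, fix_within=None):
    """Build a skimmable alert message covering the content checklist."""
    deviation_pct = (current - baseline) / baseline * 100
    lines = [
        f"🚨 {what}",                                   # what is going on
        f"Impact: {impact}",
        f"Current: {current:g} vs. normal {baseline:g} ({deviation_pct:+.0f}%)",
        f"How to resolve: {runbook_url}",               # link to fix or debug data
    ]
    if fix_within:                                      # only if a window is known
        lines.append(f"Fix within: {fix_within}")
    return "\n".join(lines)

print(format_alert(
    what="Refund requests are spiking",
    impact="Support queue is backing up",
    current=12,
    baseline=4,
    runbook_url="https://example.com/runbooks/refund-spike",  # hypothetical URL
    fix_within="2 hours",
))
```

Keeping the deviation and the link on their own lines lets a reader judge severity and act without opening a dashboard first.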

Management
  • Have a way to track whether alerts were dealt with, and what the resolution was.
  • Create debugging dashboards where specialists can go in to resolve issues speedily.
  • For advanced cases, you might want to set a snooze or reminder schedule for alerts.
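Tracking resolution and snoozing can share one small record per alert. The sketch below assumes a hypothetical `AlertRecord` class kept in memory; in practice this state would live in whatever system you use to track alerts.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class AlertRecord:
    name: str
    fired_at: datetime
    snoozed_until: Optional[datetime] = None
    resolution: Optional[str] = None

    def snooze(self, now: datetime, duration: timedelta) -> None:
        self.snoozed_until = now + duration

    def resolve(self, note: str) -> None:
        self.resolution = note      # record what the resolution was

    def needs_attention(self, now: datetime) -> bool:
        if self.resolution is not None:
            return False            # dealt with
        # Unresolved alerts resurface once the snooze expires.
        return self.snoozed_until is None or now >= self.snoozed_until

fired = datetime(2021, 7, 19, 9, 0)
alert = AlertRecord("refund-spike", fired_at=fired)
alert.snooze(now=fired, duration=timedelta(hours=2))
print(alert.needs_attention(fired + timedelta(hours=1)))  # still snoozed
print(alert.needs_attention(fired + timedelta(hours=3)))  # reminder is due
```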


How to manage non-alert items

You now know how to craft an alert, but knowing what not to alert on is just as important to a good alerting strategy. The two other big misuses of alerts we see are for 1) informational reporting, and 2) timely but not urgent task creation.

Informational reporting

Maybe you want to know your GMV or the number of users that signed up every day. For informational reports, you want to create and send scheduled reports to a predetermined (likely non-rotating) audience.

  • Send reports over non-interrupt-driven channels such as email, rather than Slack.
  • Highlight top trackers and north star metrics.
  • Make it visual and easily skimmable.
  • Call out any abnormalities or firing alerts with a quick human note.

Task creation

You might need to create tasks for which you need a timely resolution, but they are not urgent or do not indicate a broken process. For example, customer support requests and account onboardings may have a 1–2 day SLA. Here, you’ll want to create tickets and assign them to your operations specialists.

  • If you’re creating tickets from data triggers, make sure to thread any related requests into a single ticket.
  • Track tickets in a case management system. It might be tempting to dump them into a Slack channel, but that will create noise and you won’t be able to ensure everything was dealt with.
  • Every month, clear out the backlog to prevent buildup and irrelevance.
  • Be honest about how much time it takes a ticket to go through the queue, and manage those metrics.
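Threading related requests into a single ticket comes down to choosing a grouping key. The sketch below assumes requests arrive as dictionaries and that "related" means same customer and same topic; both the field names and that definition are hypothetical.

```python
from collections import defaultdict

def thread_requests(requests):
    """Group requests by (customer_id, topic) so each group becomes one ticket."""
    tickets = defaultdict(list)
    for req in requests:
        tickets[(req["customer_id"], req["topic"])].append(req["id"])
    return dict(tickets)

# Made-up requests for illustration.
incoming = [
    {"id": 1, "customer_id": "cust-1", "topic": "onboarding"},
    {"id": 2, "customer_id": "cust-1", "topic": "onboarding"},
    {"id": 3, "customer_id": "cust-2", "topic": "billing"},
]
for key, request_ids in thread_requests(incoming).items():
    print(key, request_ids)
```

Each key maps to one ticket in the case management system, so a specialist sees the full history in one place instead of three separate items.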

How to set up a self-managing system

Once you've set up a good system, you want to make sure it continues to serve future business needs with little maintenance. There are a few things to keep in mind.

A common failure mode is alert fatigue. Review the total number of pages per week to ensure the work is manageable. If you have multiple alerting systems, consider consolidating them so you can truly understand their impact.
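Reviewing weekly page volume is easy to automate once pages from all systems land in one list. The sketch below assumes you can export the date of each page; `pages_per_week` is a hypothetical helper that buckets them by ISO week.

```python
from collections import Counter
from datetime import date

def pages_per_week(page_dates):
    """Count pages per (ISO year, ISO week)."""
    counts = Counter()
    for d in page_dates:
        iso = d.isocalendar()           # (ISO year, ISO week, ISO weekday)
        counts[(iso[0], iso[1])] += 1
    return dict(counts)

# Made-up page dates for illustration.
pages = [date(2021, 7, 19), date(2021, 7, 20), date(2021, 7, 26)]
for week, count in sorted(pages_per_week(pages).items()):
    print(week, count)
```

A week whose count keeps climbing is a prompt to refine or consolidate rules before responders start ignoring pages.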

Track accountability of alerts over time:

  • Refine alerts that have >x% false positive rate
  • Consolidate if there’s >y% overlap in the incidents two alerts catch
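Both accountability metrics can be computed from the resolution records you keep for each alert. The definitions below are assumptions for illustration: "false positive rate" is taken as the share of firings that turned out not to be real issues, and "overlap" as the share of the smaller alert's incidents that the other alert also caught. The x% and y% cutoffs are left for you to choose.

```python
def false_positive_rate(outcomes):
    """outcomes: one bool per firing, True if it was a real issue."""
    if not outcomes:
        return 0.0
    return sum(1 for real in outcomes if not real) / len(outcomes)

def overlap(incidents_a, incidents_b):
    """Share of the smaller alert's incidents also caught by the other alert."""
    a, b = set(incidents_a), set(incidents_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))

# Made-up history for illustration.
print(false_positive_rate([True, False, False, True]))
print(overlap({"inc-1", "inc-2", "inc-3"}, {"inc-2", "inc-3", "inc-4", "inc-5"}))
```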

Architect the system to tell you if it’s not able to find data or trigger alerts, since that failure might otherwise be indistinguishable from the case where there are no alerts.
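One simple way to separate "nothing to alert on" from "nothing to look at" is to make every rule evaluation return an explicit status. The sketch below is an assumption about how that could look; `RunStatus` and `evaluate` are hypothetical names, and the rule itself (any value above a threshold) is a placeholder.

```python
from enum import Enum

class RunStatus(Enum):
    ALERT = "alert"
    OK = "ok"
    NO_DATA = "no_data"     # the rule could not be evaluated at all

def evaluate(rows, threshold):
    """Evaluate a rule, reporting missing data instead of silently passing."""
    if rows is None or len(rows) == 0:
        return RunStatus.NO_DATA
    return RunStatus.ALERT if max(rows) > threshold else RunStatus.OK

print(evaluate([3, 5, 4], threshold=10))   # data present, nothing wrong
print(evaluate([], threshold=10))          # query returned nothing: investigate
```

Routing `NO_DATA` to the system's maintainers, rather than treating it as `OK`, keeps a broken pipeline from looking like a quiet day.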

Treat monitoring as code:

  • Version control your changes and rules
  • Make sure only authorized people can set up alerts
  • Ensure updates to alerts go through peer or manager review
  • Test the impact of alerts on a representative dataset
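Treating rules as code means a rule can be exercised against a sample dataset before it goes live, for example in review. The sketch below is illustrative: `refund_spike_rule`, the 10% threshold, and the three sample rows are all made up, and a real test would replay a representative slice of historical data.

```python
def refund_spike_rule(row, threshold=0.10):
    """Hypothetical rule: fire when refunds exceed 10% of orders."""
    return row["orders"] > 0 and row["refunds"] / row["orders"] > threshold

def firing_days(rule, dataset):
    """Replay a rule over a dataset and list the days it would have fired."""
    return [row["day"] for row in dataset if rule(row)]

# Made-up sample data for illustration.
sample = [
    {"day": "2021-07-17", "orders": 200, "refunds": 8},
    {"day": "2021-07-18", "orders": 180, "refunds": 30},
    {"day": "2021-07-19", "orders": 0, "refunds": 0},
]
print(firing_days(refund_spike_rule, sample))
```

Comparing the replayed firings against days you know were real incidents shows a reviewer how noisy a proposed change would be.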

Good operational alerting can transform business processes. Let us know what you do to keep your company running smoothly!

