Oncall Compensation for Software Engineers

This issue is the second and final part of a series about oncall. Part 1 – published last week – covers healthy oncall practices. In this issue, we dive into:

  1. Oncall philosophies across the industry. How do groups of companies approach oncall practices and compensation?
  2. Companies which pay and those that don’t. An overview of more than 120 firms and their approach to paying, versus others which do not.
  3. How much do companies pay? Data points from 80 employers, the largest data set of this kind published to date.
  4. Companies which don’t pay. How do they approach oncall?
  5. Poor oncall cultures. What are examples of places where oncall can be a reason for churn?

In this issue, we’ll go through more than 80 data points on how much different companies pay for oncall. A preview of some of the data we’ll discuss:

How much do companies pay for oncall? 19 of the 80+ data points in this article.

This article was originally sent to subscribers of The Pragmatic Engineer Newsletter - the #1 technology newsletter on Substack, with more than 100,000 readers.

1. Oncall philosophies across the industry

Looking across the industry, there are several different philosophies.

1. “Being oncall is your one and only job.” Some companies hire dedicated tech people whose only job is to be oncall, handle alerts, and improve the oncall infrastructure. This role is called ‘DevOps Engineer’ at some companies, SRE (Site Reliability Engineer) at others, and may also be called ‘Operations Engineer.’

The role is constructed so expectations are clear and shifts are staggered, so the workload is reasonable.

These roles are typical at more traditional companies, many of which are transitioning from an Ops model to one with more continuous delivery (see Part 1 of this series for detail on what an Ops model is). The role is also widespread in highly regulated industries. Companies which prioritize software engineers’ wellbeing might also opt for this model, or at least use dedicated people to be oncall to lighten the burden on the engineering organization.

2. “It’s not part of the job outside business hours.” Plenty of businesses don’t expect engineers to be oncall outside of business hours. These companies are typically:

  • Local businesses serving local customers who use the service during normal business hours. Many B2B services might fall into this category. In these cases, customers are fine with the system not working at times when they don’t usually contact company employees.
  • Businesses where downtime outside core hours is not a major problem. These are either small businesses, or those operating with little to no competition.
  • Startups without many – or any – customers.
  • Developer agencies and consultancies usually don’t cover for oncall, though there might be exceptions depending on client contracts.

3. “It’s not part of the job outside business hours, but we might still try to reach you during those times.”

This is a variation of the above two cases, in which the company does not expect software engineers to be available. They do have someone oncall, and that person might try to reach software engineers during an outage, but the engineer is not expected to answer their phone.

This is a typical setup at small businesses and startups where incidents are too few and far between to warrant making oncall “official.” It’s more common at companies where the founders are hands-on, and can resolve most issues themselves, thereby bearing most of the oncall burden, rather than the engineers.

4. “It’s part of the job for all software engineers and we operate in regions which regulate how it needs to be compensated with pay and time off.”

There are plenty of countries that regulate oncall pay and time off, the list of which we covered in Part 1 already. Most companies operating in these regions structure their oncall compensation to adhere to local regulations.

I say “most” as two groups will technically violate regulations:

  1. Companies with a satellite office and leadership which is unaware of the local rules. In such situations, it might be down to local employees to communicate the regulation and lobby to put compliant oncall compensation in place.
  2. Small companies for whom following regulations unfit for tech companies would do more harm than good. In some countries, regulations for oncall have been drafted for groups like firefighters or police. While such regulation makes sense for many non-digital professions, the requirements of the regulation might be overly rigid for smaller tech companies to follow. So, they might sidestep it and compensate in line with the spirit of the regulation, while not following it to the letter.

5. “It’s part of the job, but we acknowledge the disruption with pay and additional time off.”

Empathetic companies acknowledge that going oncall is an additional burden that needs to be compensated, even when no local regulation mandates it. These companies offer pay, and might also offer the option to take time off instead, or a mix of both. Companies that typically use this approach:

  • Traditional firms which already pay for oncall in other parts of the business. For them, paying software engineers to be oncall is a given.
  • Companies paying at or below the middle of the market. Without paying for people's time, these places would see high attrition, as engineers would seek opportunities elsewhere.
  • Companies aiming to minimize engineer attrition. Unpaid oncall will always be a reason to look for a new job. Employers looking to minimize software engineer attrition will seek to recognize this work at the very least with cash, but perhaps also with time off.
  • Companies investing in healthy oncall practices. These companies look at oncall as a cost that should be quantified, and do so.

6. “It’s voluntary for most people, and we encourage it with pay and time off.”

This group consists of companies which do need people oncall, but manage to structure this so it is voluntary. They get enough volunteers by generously compensating for being oncall. Often, these places have dedicated people oncall along the lines of DevOps engineers or SREs.

What if there are not enough people to cover oncall? The company might reserve the right to mandate that anybody goes oncall until there are enough volunteers, while paying generously for their time. In most cases, the companies I talked to which implement oncall this way manage to recruit enough volunteers.

For example, an engineer at The Guardian – the UK-based newspaper and website – shared that their oncall is voluntary and works well:

“Our oncall is paid at £750 per week. It’s a voluntary rota of 2 engineers each week. On top of the pay, you are encouraged to take rest after your shift. Oncall is popular and we don’t have issues filling the rotation.”

Voluntary oncall can work well if the number of people needed to staff oncall is far lower than the number of engineers at the company.

7. “It’s part of the job for all software engineers and not paid additional.”

This approach is common at many companies. A few which stand out:

  • Companies paying top of the market. Places which sit in the top tier of the trimodal salary distribution usually pay far more, without compensating for oncall, than lower-tier companies with very generous oncall compensation do.
  • Big Tech. Most of Big Tech doesn't pay for oncall with cash compensation. Google is the only exception.
  • Startups low on capital. Oncall is part of the job at startups that don’t have large amounts of funding. This is the case for most Pre-seed and Seed startups, while plenty of Series A ones operate like this, too.
  • Companies which can get away with it. Even for companies not paying top of the market, compensating for oncall is an additional expense. Many will try to avoid it if they can get away with it.

Some firms compensate with time off, instead of cash. This can range from inviting people to start late if they got paged the previous night, all the way to offering a specific number of days as paid vacation for each week worked of oncall. For companies which already offer unlimited time off, offering lieu days is no change to the existing policy, and so it won’t be seen as a supportive measure.

Although I list seven different philosophies, in reality there are two main ways of thinking about oncall:

A: Oncall for software engineers is additional.

1. “Being oncall is your one and only job.”

2. “It’s not part of the job outside business hours.”

3. “It’s not part of the job outside business hours, but we might still try to reach you during those times.”

4. “It’s part of the job for all software engineers and we operate in regions which regulate how it needs to be compensated with pay and time off.”

5. “It’s part of the job, but we acknowledge the disruption with pay and additional time off.”

6. “It’s voluntary for most people, and we encourage it with pay and time off.”

B: Oncall is part of the job.

7. “It’s part of the job for all software engineers and not paid additional.”

2. Companies which pay for oncall, and those that don’t

Being oncall means you need to be on standby and have your laptop with you at all times, so you can respond to pages and start resolving incidents within minutes. Most software engineers who are oncall tend to have this responsibility for a week at a time.

Being oncall can be quite disruptive in two major ways:

  • It disrupts your personal plans, outside of work. Going to the movies with your kid, or perhaps on a date? Bring your phone and your laptop and be ready to exit midway through if you get a page. For any event in the evenings, at weekends or during the holidays, you either need to schedule cover with the secondary oncall – if they are around – or be ready for your private time to be disrupted. Some people move their social events around so their oncall week is clear.
  • It disrupts your sleep. Alerts don’t care what time it is and they can wake you up in the middle of the night. You might also have to do an investigation at 3am. This has happened to me more than once. Disrupting your sleep can also have short and long-term health consequences, according to scientific research.

Several countries have strict regulations around oncall, with some going as far as mandating the right to a certain number of hours of uninterrupted rest per day or per week.

Incident management scaleup incident.io (disclaimer: I am an investor) ran an oncall survey which collected 200 responses. It found that while 70% of respondents said each team was responsible for their oncall rota, about 40% were compensated for oncall. An interesting finding from their summary:

“Interestingly, this was more common in larger organizations (5,000+ people) than in small to mid-sized organizations that participated in the survey.

Where companies did provide compensation, most paid a fixed amount for time spent oncall (e.g., $X per hour, day or week). But the actual dollar amount paid ranged significantly, from $5 to $1,000 per week, with the average weekly rate at $540.”

For this series about oncall, I invited people to share whether their company pays for oncall, and if so, how much? Here is a collection of companies which do not pay for software engineers to be oncall – but still require it – and ones that do. This is a collection of data that people shared with me anonymously. The list is not exhaustive. However, it's the largest data set of this kind published to date.

Note that companies in the “Unpaid oncall” column need to follow local regulation for oncall compensation in countries which mandate this, as discussed in Part 1 of the series.

So which companies pay, and which ones do not?

Companies that pay, and ones that do not pay for oncall - part 1.
Companies that pay, and ones that do not pay for oncall - part 2.

Let’s address the elephant in the room: Big Tech generally does not pay for being oncall. Except for Google, and except for regions where paying for oncall is mandatory, they don’t compensate for this. Why is this?

A major reason is these companies pay top of the market: in the third and highest bracket in the trimodal compensation model. So even when not paying for oncall, they offer higher compensation than most other companies which pay lower, but compensate additionally for oncall.

And it’s reasonable to argue that they have a point. Between these two offers, which one sounds more appealing?

  1. $140,000 in total compensation. Oncall is paid additionally, as $800-1,200/week, meaning about an additional $8,000-12,000 per year.
  2. $310,000 in total compensation ($180,000 base salary + the rest in stock). Oncall is not paid.
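To make the arithmetic in the first offer explicit: $800-1,200 per week only adds up to $8,000-12,000 per year if an engineer is oncall for roughly 10 weeks a year. That 10-week figure is an assumption implied by the numbers above, not something stated outright, and the names in this sketch are purely illustrative:

```python
# Assumption: ~10 oncall weeks per year, implied by $800-1,200/week
# translating to roughly $8,000-12,000/year in the first offer.
ONCALL_WEEKS_PER_YEAR = 10


def annual_oncall_pay(weekly_rate: int, oncall_weeks: int) -> int:
    """Extra yearly pay from a flat weekly oncall rate."""
    return weekly_rate * oncall_weeks


offer_1_total_comp = 140_000  # oncall paid on top of this
offer_2_total_comp = 310_000  # oncall unpaid

# Best and worst case for offer 1 once oncall pay is included.
offer_1_low = offer_1_total_comp + annual_oncall_pay(800, ONCALL_WEEKS_PER_YEAR)
offer_1_high = offer_1_total_comp + annual_oncall_pay(1_200, ONCALL_WEEKS_PER_YEAR)

# How far offer 2 stays ahead even against the best case of offer 1.
gap = offer_2_total_comp - offer_1_high

print(f"Offer 1 with oncall pay: ${offer_1_low:,} to ${offer_1_high:,}")
print(f"Offer 2 still ahead by at least: ${gap:,}")
```

Under these assumptions, even the most generous oncall pay in the first offer closes only a small part of the gap between the two.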

Many well-funded startups which pay closer to the top of the market also make oncall part of expectations.

The only curious data point is Google, which pays top of the market, also pays for oncall, and invests in healthy oncall practices on top of both, as we’ve covered in Part 1 of the series.

3. How much do companies pay?

How much do companies pay which compensate for being oncall? The answer is, it varies a lot. This ranges from about $100 per week, all the way to around $1,250 per week – and even higher for some engineers at Google.

Compensation approaches are split between these three buckets, ordered by frequency:

  1. Flat rate per week or per day of being oncall. The same amount is paid, regardless of whether there are incidents that need attention. Most companies which pay for oncall follow this approach. A common reason for going with this is that it incentivizes quiet oncalls: there’s no financial benefit to working on outages out of hours.
  2. Flat rate for standby, plus pay for hours worked outside core hours. Another approach is to have a flat standby rate. However, when an incident occurs and engineers need to spend time mitigating it, they can claim additional compensation for that time. This is usually a multiple of their regular hourly rate, and is often more for a weekend or a public holiday. Several countries mandate this approach, even if companies pay a decent standby rate already.
  3. Only pay for incidents worked on out-of-hours. Some companies don’t pay for people to stand by, but do if they need to do work during the night or at weekends. This work is usually the mitigating of an incident. Most companies that pay this way do so because of local regulation, which mandates paying for overtime at night or during weekends.
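The three buckets above can be sketched as simple pay functions. Every number here (the weekly rate, standby rate, hourly rate and overtime multiplier) is a made-up placeholder for illustration, not a figure from any company in the data set:

```python
def flat_rate(incident_hours: float, weekly_rate: float) -> float:
    """Bucket 1: same pay whether or not any incidents happen."""
    return weekly_rate


def standby_plus_hours(incident_hours: float, standby_rate: float,
                       hourly_rate: float, multiplier: float) -> float:
    """Bucket 2: flat standby pay, plus out-of-hours incident time
    paid at a multiple of the regular hourly rate."""
    return standby_rate + incident_hours * hourly_rate * multiplier


def incidents_only(incident_hours: float, hourly_rate: float,
                   multiplier: float) -> float:
    """Bucket 3: no standby pay, only out-of-hours incident time."""
    return incident_hours * hourly_rate * multiplier


# Hypothetical rates, chosen only to show how the models diverge.
WEEKLY_RATE = 500.0
STANDBY_RATE = 300.0
HOURLY_RATE = 50.0
MULTIPLIER = 1.5

for label, hours in [("quiet week", 0.0), ("noisy week", 6.0)]:
    pay_1 = flat_rate(hours, WEEKLY_RATE)
    pay_2 = standby_plus_hours(hours, STANDBY_RATE, HOURLY_RATE, MULTIPLIER)
    pay_3 = incidents_only(hours, HOURLY_RATE, MULTIPLIER)
    print(f"{label}: flat={pay_1}, standby+hours={pay_2}, incidents-only={pay_3}")
```

With these placeholder rates, the flat model pays the same in both weeks, which is the incentive for quiet oncalls mentioned above, while the other two models pay more as out-of-hours incident time grows.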

Here is a summary of how various companies compensate. I go into far more detail on each company's compensation philosophy in the subscriber-only article.

Companies paying at or above 1,000 USD/EUR/GBP per week.
Companies paying 600-1,000 USD/EUR/GBP per week.
Companies paying 400-600 USD/EUR/GBP per week.
Companies paying 300-400 USD/EUR/GBP per week.
Companies paying at or below 300 USD/EUR/GBP per week.


A few curious insights into the above rates:

  • Brazil and Spain: regulated on pay. Both countries have clear regulations for oncall pay that every company with a local subsidiary needs to follow.
  • Germany: frequently paying despite no such regulation. German companies frequently pay for oncall, even though local German regulation only mandates rest time, and not pay for standby oncall.
  • UK: some companies pay despite no such regulation. Even though the UK does not mandate paying for oncall, it’s more common to see companies pay for oncall duties here than, for example, in the US. This might be due to UK companies often hiring remotely in other European countries where oncall pay is mandated.
  • US: few companies pay. However, those that do tend to pay globally. Google, Intercom, Spotify, LaunchDarkly, CircleCI and PayPal are a few places worth mentioning that do compensate for standby oncall, even in the US.

4. Rewarding oncall at companies that don’t pay

At companies which don’t compensate for oncall, is there a different approach? As I talked with software engineers oncall at these companies, several shared that they do get various benefits, mostly related to time off. Here’s a summary of approaches:

Companies with a culture of offering some time off, instead of cash.
Companies without a culture of offering time off for oncall.

5. Poor oncall cultures

During my research, I came across a few examples of poor oncall culture.

Twilio is a company about which I received an unusually high number of complaints to do with oncall. All contributors had a negative view of how the current approach to oncall works at the company.

Here is how a current software engineer detailed the oncall expectations at Twilio:

“Shift duration is also freely discussed within teams. Generally, it is 7 days, but inside teams with few members like 2 or 3, they can also do 3 days’ rotations. If you are a team of 4, you're almost in ‘oncall prison’ for 2 out of 4 weeks (secondary + primary week.)

Lately, there have been strong discussions internally within the R&D teams (mostly involved in oncall) because of this.

General rules – again, the team can slightly vary them:
1. Expected to engage within 10 minutes of an alert.
2. Expected to engage within 5/10 minutes on POC (point of contact – paged by other teams or people.)
3. No drugs or excessive alcohol while oncall.
4. Escalate early if you encounter problems.
5. You can travel, if you can ensure all the above.”

Several engineers working at the company told me oncall operational load is high, teams are understaffed, oncall is not paid, and someone even used the term “oncall prison,” as quoted above. A few additional quotes from software engineers and engineering managers at this company:

  • “Twilio does not pay anything additional for being oncall, nor do I get breaks after a rough shift. I hate oncall life and once it's time to move on, I would prefer no oncall. I'm oncall every 4 weeks.”
  • “There are critical services that are staffed at way too low levels, with the managers of those services denied headcount/backfills because ‘they aren’t moving the needle.’”
  • “People are starting to vote with their feet for the high oncall load and quitting. So now, added to the stress is a high level of IC attrition from those services’ lack of institutional knowledge.”

What is the reason for the high oncall load? I talked with a few engineers who had two or more years of tenure. This is what they shared:

  • Lots of ‘Amazon DNA.’ Much of the previous engineering leadership came from Amazon, and focused a lot on operational excellence. This means strict oncall rules, as well.
  • Growing too fast in a short amount of time. Engineers I talked with shared how the company grew aggressively in 2020 and 2021, more than doubling the size of the engineering organization in 2021, while growing 5x over the past three years from ~1,500 employees in 2019, to more than 7,000 full time staff by the start of 2022.
  • Too many custom systems. These custom systems make it very hard to fix the root causes of noisy oncalls. Also, because so many systems are custom, it’s not practical to merge oncalls between teams.
  • Attrition for experienced people. Twilio was hit badly in 2021 with attrition, thanks to the hot market. Engineers told me several people with in-depth knowledge of custom systems have left. These people leaving was a setback in resolving the root causes of high oncall loads.
  • No backfills. From an engineer: “My org lost more than half of its engineers in two months. No backfills are coming and the oncall load has increased significantly. People will most likely keep leaving and it’s a massive operational risk to stay here.”
  • A barely acknowledged tech debt problem. Engineers tell me they feel the company has not acknowledged just how bad tech debt has been, until recently. There are now replatforming efforts which – once complete – should also address oncall issues.
  • Light at the end of the tunnel. Several people mention how new leadership joining from the likes of Google is putting more focus on healthy oncall practices, and arguing that the current setup is unsustainable.

The good news at Twilio is that leadership seems to be aware of the unsustainable oncall setup, and is working on addressing this problem. Engineers I talked with shared how while burnout is hitting many of them, they do feel that leaders are trying to fix things and empower engineers. Someone I talked with shared how they feel the thing holding the company back is the mindset which many tenured engineers have; they assume things are fine just the way they are – including the high oncall load.

Amazon is another company where oncall stress is high and oncall is not compensated – unless regulated locally, like in Brazil or Spain. Within Prime Video in the UK, oncall compensation is historic: when Amazon acquired startup LoveFilm in 2011 – a startup which transformed into Prime Video – oncall compensation was part of all current and future contracts, and it has stayed that way ever since.

Oncall is one of the main negatives of working at Amazon, as I write in Inside Amazon’s Engineering Culture:

“The bad: Heavy oncall and operations load in many cases. Teams are expected to operate their systems well, and this is often done in a way that results in stressful oncalls. Operational issues need to be addressed immediately instead of letting them escalate.”

Different teams and organizations, of course, have varying oncall loads. People who work in the Alexa and Redshift organizations have shared stories of extreme oncall loads.

Engineering managers – called SDMs (Software Development Managers) – at Amazon have been known to step in and try to ease the oncall load by offering days off. As a current Amazon engineer shares:

“After rough oncalls in the Alexa Devices organization, several SDMs have been lenient with PTO to compensate (if up late, then come in late, if working the weekend, then take an extra day off, etc.) This was not an official policy and any change-ups in the manager (like a re-org which happened somewhat often) and any accumulated off-the-books PTO time would vanish.”

However, I have heard several stories of staff suffering extreme burnout due to Amazon’s relentless oncall load, and this burnout persisting with them to their next position. A software engineer I talked with shared how they burnt out so badly because of Amazon’s oncall culture, that they struggled during the first several months of their next job at a startup, as they were recovering mentally and physically from the oppressive operations load at Amazon.

Other examples. A few engineers have cautionary tales which I’m sharing as warnings:

  • “Some teams have had dramatically different pager volumes for years at my company. There are teams which people like to work on thanks to their great mission, but those same teams have a noisy oncall. There is no way I would move to those teams, where moving would come with so much unpaid overtime at unsociable hours.” From a publicly traded tech company valued at $13B.
  • “I had never done oncall before I worked at this company, and it was my only real hesitation about joining it. As it turns out, it was one of the main reasons I ended up leaving. The oncall for my team didn't seem terrible, but I have less than zero interest in having to be ready to jump online at all hours of the day. If I were paid for my time, my feelings might've changed.” From a publicly traded tech company valued at $3B.
  • “Oncall duty was expected to be part of the job, but it was not clearly mentioned in the contract. Your annual review could be impacted for not fulfilling certain expectations during oncall.” From a financial data company in the US.
  • “We have a culture of still expecting to ship updates when you have been up all night handling a SEV-1 outage… we have problems because of this.” From a warehouse automation scaleup.

Why are poor oncall practices painful? They can directly impact software engineer attrition and wellbeing. Simply put, poor oncall practices will lead to more engineers quitting, more people getting burnt out and fewer people recommending a company.

Twilio’s example can be a cautionary tale in my view. While the company has been a business success, it has seen large attrition struggles, and not getting a handle fast enough on healthy oncall means they might see things get worse in terms of attrition, before they get better.

Takeaways

Based on my in-depth, anecdotal research of the industry right now, it appears there are two main philosophies when it comes to rewarding oncall:

  1. Oncall for software engineers is part of the job. Many companies operate like this, most notably Big Tech – save for Google – and many high-growth startups. The more an employer compensates software engineers, the more likely they expect oncall to be a given.
  2. Oncall for software engineers is additional. Companies which care about healthy oncall practices, or want to minimize attrition among software engineers, make it clear oncall is additional and offer some sort of compensation. Compensation may be cash, or it could be time, or it could be lightening the load with dedicated SREs or DevOps people, or making the rotations voluntary.

I can safely say I was living in an “oncall bubble” before researching this topic that has such a big impact on many people’s work and personal lives. Having mostly worked at companies where oncall is part of the job, is unpaid, and comes with high pressure, I assumed this is how it works everywhere.

However, this article and its underlying research reveal that oncall being taken as a given is far from universal across the industry. Many companies – small and big, startups and traditional – do pay extra compensation for it. Some places don’t pay cash, but they do offer time off in return for being on oncall duty.

We’ve explored healthy oncall practices in Part 1 of this series. I hope the data points on which companies pay, and how much, help shape your oncall policy.

And don’t forget that engineers can shape oncall practices, not just managers; as software engineer Anna Baker at LaunchDarkly shares in Part 1:

“We had a former Intercom engineer take part in developing the new oncall process here, and they advocated for some of the practices they'd seen work well. Pretty incredible to have an engineer championing their former company's oncall. Although we ended up with a slightly different model, we definitely took note and I want to make sure to acknowledge that.”

Update on 5th Aug 2022: corrected oncall data rates for Amazon (Germany), Auth0, and added Atlassian.
