What Is Error Rate? — Uptime Monitoring Glossary

Definition

Error rate is the proportion of requests (or monitoring checks) that result in an error instead of a successful response, usually expressed as a percentage over a period. A 2% error rate means 2 out of every 100 requests failed.

Errors include unexpected status codes (like 500 or 503), timeouts, and connection failures. Unlike a binary up/down view, error rate captures partial degradation — a service can be "up" while quietly failing a meaningful slice of requests.

Why It Matters

Error rate often reveals problems before they become full outages. A creeping error rate can signal an overloaded database, a flaky dependency, or a bad deploy that affects only some users. Watching it lets you act on degradation early, while it is still a warning rather than a crisis.

How It Works

Error rate = (failed requests ÷ total requests) × 100, measured over a fixed time window. In monitoring, each check is a sample: the share of checks that returned an unexpected status, timed out, or failed to connect is your monitored error rate. Setting an error-rate threshold lets you alert on degradation, not just total failure.
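
As a rough illustration of the formula, the sketch below computes an error rate from a window of check results. The Check class, the is_error and error_rate helpers, and the default expected status of 200 are all assumptions made for this example; they are not part of any particular monitoring tool.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Check:
    """One monitoring check result (hypothetical shape, for illustration only)."""
    status: Optional[int]     # HTTP status code, or None if no response arrived
    timed_out: bool = False   # True if the check exceeded its time limit


def is_error(check: Check, expected: frozenset = frozenset({200})) -> bool:
    """A check fails if it timed out, never connected, or returned an unexpected status."""
    if check.timed_out or check.status is None:
        return True
    return check.status not in expected


def error_rate(checks: Iterable[Check], expected: frozenset = frozenset({200})) -> float:
    """Return (failed / total) * 100 for the window; 0.0 when the window is empty."""
    checks = list(checks)
    if not checks:
        return 0.0
    failed = sum(is_error(c, expected) for c in checks)
    return 100 * failed / len(checks)


# Five checks in the window: two succeed, one returns 503,
# one never connects, and one times out.
window = [
    Check(200),
    Check(200),
    Check(503),
    Check(None),
    Check(None, timed_out=True),
]
print(error_rate(window))  # 3 of 5 checks failed
```

Passing a different expected set (for example, one that includes 404 for an endpoint where "not found" is a normal answer) keeps expected non-2xx responses out of the error count.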

Real-World Example

An API handles 10,000 requests in an hour, and 150 of them return HTTP 500, so the error rate is 150 ÷ 10,000 × 100 = 1.5%. The rate normally sits near 0.1%, so this roughly 15-fold spike triggers an investigation. It turns out a dependency is timing out under load, and the problem is caught well before it would have caused a full outage.
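
The arithmetic in this example can be checked with a few lines of code. This is only a sketch: the 0.1% baseline comes from the example, while the error_rate_pct helper and the 5x alert multiplier are assumptions chosen for illustration.

```python
def error_rate_pct(failed: int, total: int) -> float:
    """Error rate as a percentage: (failed / total) * 100."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * failed / total


BASELINE_PCT = 0.1     # normal error rate from the example above
ALERT_MULTIPLIER = 5   # hypothetical: alert when the rate exceeds 5x the baseline

rate = error_rate_pct(failed=150, total=10_000)
should_alert = rate > BASELINE_PCT * ALERT_MULTIPLIER

print(f"error rate: {rate:.1f}%")
print(f"times baseline: {rate / BASELINE_PCT:.0f}")
print(f"alert: {should_alert}")
```

Comparing against a multiple of the baseline, rather than a fixed number, is one way to tell an abnormal spike from the service's normal background rate.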

Best Practices

Alert on error-rate thresholds, not only on total downtime
Define clearly which responses count as errors
Watch error-rate trends to catch slow degradation
Segment error rate by endpoint to localize problems
Correlate error-rate spikes with deploys and traffic changes
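
To show why segmenting by endpoint matters, here is a minimal sketch that groups request outcomes by endpoint. The error_rate_by_endpoint function, the endpoint paths, and the sample data are all made up for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def error_rate_by_endpoint(requests: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Take (endpoint, failed) pairs and return the error rate in percent per endpoint."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for endpoint, failed in requests:
        totals[endpoint] += 1
        if failed:
            failures[endpoint] += 1
    return {ep: 100 * failures[ep] / totals[ep] for ep in totals}


# Hypothetical traffic: /login fails 5 of 100 requests, /search fails 0 of 900.
sample = (
    [("/login", False)] * 95
    + [("/login", True)] * 5
    + [("/search", False)] * 900
)

rates = error_rate_by_endpoint(sample)
overall = 100 * sum(failed for _, failed in sample) / len(sample)

print(rates)    # per-endpoint error rates
print(overall)  # aggregate error rate across all endpoints
```

In this made-up data the aggregate rate looks small, while the per-endpoint view shows that every failure comes from /login.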

Common Mistakes

Only monitoring up/down and missing partial failures
Counting expected non-2xx responses (like 404s) as errors
Ignoring small but rising error rates until they become outages
Aggregating all endpoints so a localized problem is hidden
Setting no baseline, so you can't tell normal from abnormal

In Monitoristic

Monitoristic records each check as a success or failure against your expected status code, so the share of failing checks is effectively your monitored error rate. Watch it move over time to spot degradation before it becomes a full outage.

Frequently Asked Questions

What counts as an error?

Typically unexpected status codes (5xx, sometimes 4xx), timeouts, and connection failures. Expected non-2xx responses should be excluded from your definition.

What is a good error rate?

As close to zero as practical. Many teams treat anything sustained above a fraction of a percent on critical paths as worth investigating.

How is error rate different from uptime?

Uptime is a coarse up/down measure; error rate captures partial failure, showing when a service is up but failing a portion of requests.

Can error rate predict outages?

Often, yes. A rising error rate frequently precedes a full outage, giving you early warning to intervene.