Uptime Monitoring Glossary
Plain-English definitions of every uptime monitoring term — explained in the context of keeping your sites and APIs online.
Alert
A notification sent the moment monitoring detects a problem (or a recovery) so you can act fast.
API Monitoring
Monitoring an API endpoint's availability, correctness, and response time — not just whether the host is up.
Availability
The degree to which a service is usable when people need it, usually expressed as an uptime percentage.
Check Interval
How often a monitoring tool tests your service — and the biggest factor in how fast you detect downtime.
Downtime
Any period when a service is unavailable or not responding correctly to users.
Error Rate
The percentage of requests or checks that fail rather than succeed.
Expected Status Code
The HTTP status code a monitor treats as healthy — anything else marks the check as down.
Heartbeat (Cron) Monitoring
Alerting when a scheduled job fails to check in on time. Instead of requesting a URL, it watches for events that are supposed to happen.
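As a rough illustration of the check-in idea, the sketch below decides whether a job is overdue. The function name, the grace period, and the example schedule are assumptions made for this example, not the behavior of any particular monitoring tool.

```python
from datetime import datetime, timedelta, timezone


def is_heartbeat_overdue(last_ping, expected_every, grace, now):
    """Return True when the job has not checked in within interval + grace.

    All parameter names are hypothetical; a real tool may model this differently.
    """
    deadline = last_ping + expected_every + grace
    return now > deadline


# Hypothetical hourly job that last checked in at 02:00 UTC.
last_ping = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
overdue = is_heartbeat_overdue(
    last_ping,
    expected_every=timedelta(hours=1),
    grace=timedelta(minutes=5),
    now=datetime(2024, 1, 1, 3, 10, tzinfo=timezone.utc),
)
```

The grace period is a design choice: it tolerates jobs that run slightly late without raising an alert.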
HTTP Methods
The type of HTTP request a monitor sends — GET, HEAD, or POST — depending on what the endpoint needs.
HTTP Monitoring
Monitoring a service by sending HTTP(S) requests and checking the response status, time, and content.
Incident
A tracked record of a disruption — from detection through resolution — with a timeline of what happened.
Keyword Monitoring
Checking that a page's response contains (or doesn't contain) specific text, not just that it returns a 200.
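A minimal sketch of that rule, assuming the response has already been fetched and that 200 is the expected status. The function and parameter names are hypothetical.

```python
def keyword_check(status_code, body, must_contain=None, must_not_contain=None):
    """Return True only if the status is 200 and the text rules pass.

    must_contain / must_not_contain are illustrative parameter names.
    """
    if status_code != 200:
        return False
    if must_contain is not None and must_contain not in body:
        return False
    if must_not_contain is not None and must_not_contain in body:
        return False
    return True


# A page can return 200 and still be broken, for example an error rendered in HTML.
healthy = keyword_check(200, "<h1>Welcome back</h1>", must_contain="Welcome")
broken = keyword_check(200, "<h1>Database error</h1>", must_not_contain="Database error")
```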
Maintenance Window
A scheduled period of planned downtime where alerts are suppressed and your status page shows maintenance.
Monitoring Location
Where in the world a check runs from — which affects latency readings and how outages are detected.
MTBF (Mean Time Between Failures)
The average time a service runs normally between one failure and the next.
MTTR (Mean Time to Recovery)
The average time it takes to restore a service after an outage begins.
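The two averages above can be worked out from a list of outage durations. The sketch below uses one common convention (MTTR as total downtime divided by the number of failures, MTBF as total uptime divided by the number of failures). Other definitions exist, and the function name and sample figures are made up for illustration.

```python
def mttr_and_mtbf(window_minutes, outage_minutes):
    """Return (MTTR, MTBF) in minutes for an observation window.

    window_minutes: length of the observed period.
    outage_minutes: duration of each outage within that period.
    """
    failures = len(outage_minutes)
    if failures == 0:
        raise ValueError("no failures in the window; MTTR and MTBF are undefined")
    downtime = sum(outage_minutes)
    mttr = downtime / failures
    mtbf = (window_minutes - downtime) / failures
    return mttr, mtbf


# Hypothetical 30-day window (43,200 minutes) with three outages.
mttr, mtbf = mttr_and_mtbf(43_200, [10, 20, 30])
```

With these sample numbers, MTTR is 20 minutes and MTBF is 14,380 minutes, a little under ten days.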
Outage
A specific event where a service becomes unavailable or stops functioning correctly.
Ping (ICMP) Monitoring
Checking whether a host is reachable on the network using ICMP echo (ping) requests.
Port (TCP) Monitoring
Checking whether a specific network port is open and accepting TCP connections.
Response Time
How long a service takes to respond to a request — a key signal of performance and early degradation.
Retry / Confirmation
Re-checking a failing endpoint to confirm a real outage and avoid alerting on a single transient blip.
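One way to picture confirmation is a loop that alerts only when the first check and every retry fail. This is a sketch under assumed defaults (two retries, a configurable delay); `check` stands for any hypothetical function that returns True when the endpoint looks healthy.

```python
import time


def confirm_outage(check, retries=2, delay_seconds=0.0, sleep=time.sleep):
    """Return True only if the initial check and all retries fail."""
    if check():
        return False  # healthy on the first attempt
    for _ in range(retries):
        sleep(delay_seconds)
        if check():
            return False  # recovered on a retry: treat it as a transient blip
    return True  # every attempt failed: confirmed outage


# Simulated results: one failure followed by a success is not an outage.
results = iter([False, True])
is_down = confirm_outage(lambda: next(results))
```

More retries reduce false alarms but delay the alert, so the count is a trade-off between noise and detection time.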
SLA (Service Level Agreement)
A commitment to a level of service — most often an uptime percentage — with consequences if it is missed.
SLI (Service Level Indicator)
The actual measured metric — like uptime or success rate — that you compare against your SLO and SLA.
SLO (Service Level Objective)
An internal reliability target you set for a metric like uptime — usually stricter than your public SLA.
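An uptime target translates directly into an amount of downtime you can afford. The sketch below does that arithmetic for a 30-day period; the function name and the period length are assumptions for the example.

```python
def allowed_downtime_minutes(target_percent, period_days=30):
    """Return how many minutes of downtime fit within an uptime target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - target_percent) / 100


# A 99.9% target over 30 days leaves roughly 43.2 minutes of downtime.
budget = allowed_downtime_minutes(99.9)
```

Setting an SLO stricter than the SLA means this internal budget runs out before the public commitment is breached.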
SSL Certificate Monitoring
Watching your HTTPS certificates so that an expired or misconfigured certificate does not take your site down.
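The expiry part can be sketched as a days-remaining calculation. The example assumes you already have the certificate's `notAfter` string in the format Python's `ssl` module reports; the function name and the sample date are illustrative.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now):
    """Return whole days left before the certificate's notAfter time."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


# Sample notAfter value in the "Mon DD HH:MM:SS YYYY GMT" format.
remaining = days_until_expiry(
    "Jan  5 09:34:43 2018 GMT",
    now=datetime(2018, 1, 1, tzinfo=timezone.utc),
)
```

A monitor would typically alert once the remaining days drop below a chosen threshold, well before the certificate actually expires.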
Status Page
A public page that shows the current and historical status of your services to users and customers.
Synthetic Monitoring
Proactively simulating requests to a service on a schedule to detect problems before real users do.
Timeout
How long a single check waits for a response before giving up and marking the endpoint as down.
Uptime
The percentage of time a service is available and responding correctly.
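As a worked example, uptime can be estimated from check results. This sketch assumes check-based counting, where each check stands in for an equal slice of time; the function name and figures are hypothetical.

```python
def uptime_percent(total_checks, failed_checks):
    """Return uptime as a percentage of checks that succeeded."""
    if total_checks <= 0:
        raise ValueError("total_checks must be positive")
    return 100 * (total_checks - failed_checks) / total_checks


# One check per minute for a day (1,440 checks) with 3 failures.
uptime = uptime_percent(1440, 3)
```

With these sample numbers the result is roughly 99.79%. A shorter check interval gives a finer-grained estimate, since each failed check represents less time.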
Uptime Monitoring
Automatically checking whether a service is online and responding, and alerting you when it isn't.