Monitoring Guide

Version: 1.0.0 | Last Updated: 2026-03-22 | Status: Active

Monitoring Objectives

- Detect outages quickly.
- Detect degraded behavior before it reaches customers.
- Distinguish liveness and readiness probe failures from slow deep-diagnostic responses.

Core Health Endpoints

| Endpoint | Use Case | Notes |
| --- | --- | --- |
| `/api/v1/health/ping` | Network/process reachability | Fastest signal |
| `/api/v1/health/live` | Liveness checks | Process health only |
| `/api/v1/health/ready` | Readiness gate | Validates critical dependencies |
| `/api/v1/health/celery` | Celery operational summary | Response should report `mode: summary` |
| `/api/v1/health/celery/deep` | Operator diagnostics | Use on demand; can be slower |
| `/api/v1/health` | Full stack diagnostics | Comprehensive; not for tight probe intervals |

Recommended Probe Strategy

- Fast checks (high frequency): `/api/v1/health/ping`, `/api/v1/health/live`
- Readiness checks (medium frequency): `/api/v1/health/ready`
- Operator diagnostics (manual or low frequency): `/api/v1/health/celery/deep`, `/api/v1/health`
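
One way to keep the tiers apart is to give each its own timeout, so a slow deep check is never mistaken for a failed liveness probe. The following is a minimal sketch, not the deployment's actual probe configuration: the `probe_timeout` and `probe` helpers, the `BASE_URL` variable, and all timeout values are assumptions chosen for illustration. The host and the `-sk` flags are taken from the triage commands in this guide.

```shell
#!/bin/sh
# Assumed default; override BASE_URL for other environments.
BASE_URL="${BASE_URL:-https://dev-backend.mightybox.site}"

# Hypothetical helper: map an endpoint path to a timeout in seconds.
# The tiers follow the probe strategy above; the numbers are illustrative.
probe_timeout() {
    case "$1" in
        /api/v1/health/ping|/api/v1/health/live)   echo 2 ;;   # fast tier
        /api/v1/health/ready)                      echo 5 ;;   # readiness tier
        /api/v1/health|/api/v1/health/celery/deep) echo 30 ;;  # operator diagnostics
        *)                                         echo 10 ;;  # anything else
    esac
}

# Probe one endpoint and print "<path> <http_code> <seconds>".
# -k skips TLS verification, mirroring the triage commands in this guide.
probe() {
    path="$1"
    out=$(curl -sk -o /dev/null --max-time "$(probe_timeout "$path")" \
        -w '%{http_code} %{time_total}' "${BASE_URL}${path}") || out="000 failed"
    echo "$path $out"
}
```

Calling `probe /api/v1/health/ready` prints the path, the HTTP status code, and the total request time, or `000 failed` when curl cannot complete the request within the tier's timeout.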

What to Watch

Apache / mod_wsgi

- `AH10159` or `AH00484` in the error log, either of which signals worker pressure
- The daemon process and thread settings currently active in `wsgi.conf`
- Response latency spikes on probe endpoints
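
To get a quick count of the worker-pressure messages listed above, a small helper like the one below can be run against the error log. This is a sketch: the function name `count_worker_pressure` is hypothetical, and the default path is the Apache error log location given in this guide.

```shell
# Count lines in an Apache error log that mention AH10159 or AH00484.
# Usage: count_worker_pressure [logfile]
count_worker_pressure() {
    log="${1:-/var/log/httpd/error_log}"
    # grep -c prints the number of matching lines. It exits 1 when that
    # number is 0, so "|| true" keeps a clean log from looking like an error.
    grep -cE 'AH10159|AH00484' "$log" || true
}
```

A count that keeps rising between runs suggests the worker limits are being hit repeatedly rather than during a single burst.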

Celery

- Online worker count
- Queue consumers and pending message count
- Repeated worker exits or SIGKILLs in logs
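
A queue with pending messages and no consumers is the combination most worth flagging. The sketch below assumes RabbitMQ is the Celery broker and that `rabbitmqctl` is available on the host. The `stalled_queues` filter is a hypothetical helper that reads tab-separated `name`, `messages`, `consumers` rows on standard input.

```shell
# Print the names of queues that have pending messages but no consumers.
# Input rows: name<TAB>messages<TAB>consumers. Header or banner lines are
# skipped because their second column is not a number.
stalled_queues() {
    awk -F '\t' '$2 ~ /^[0-9]+$/ && $2 > 0 && $3 == 0 { print $1 }'
}
```

One way to feed it is `rabbitmqctl -q list_queues name messages consumers | stalled_queues`. Empty output means every queue with a backlog has at least one consumer attached.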

Dependencies

- Postgres availability
- Redis ping/connectivity
- RabbitMQ connectivity and consumer health
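
When the readiness endpoint itself is unavailable, the same dependencies can be checked directly from the host. The sketch below assumes the standard client tools (`pg_isready`, `redis-cli`, `rabbitmq-diagnostics`) are installed and that their default connection settings point at the right instances; neither is confirmed by this guide. The `check` and `check_dependencies` helpers are hypothetical.

```shell
# Run a command quietly and report "<name> ok" or "<name> FAIL"
# based on its exit status.
check() {
    name="$1"; shift
    if "$@" >/dev/null 2>&1; then
        echo "$name ok"
    else
        echo "$name FAIL"
    fi
}

# Assumed defaults: local services on their standard ports.
check_dependencies() {
    check postgres pg_isready -q
    check redis    redis-cli ping
    check rabbitmq rabbitmq-diagnostics -q ping
}
```

Running `check_dependencies` prints one line per dependency. A missing client tool also reports `FAIL`, so confirm the tool is installed before treating that as an outage.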

Log Locations

- Apache: `/var/log/httpd/error_log`, `/var/log/httpd/access_log`
- Celery (systemd): `journalctl -u celery`

Quick Triage Commands

systemctl is-active httpd celery
curl -sk https://dev-backend.mightybox.site/api/v1/health/ready
curl -sk https://dev-backend.mightybox.site/api/v1/health/celery
tail -n 100 /var/log/httpd/error_log
journalctl -u celery --since "30 min ago" --no-pager

Escalation Guidance

- If liveness fails, treat it as an immediate service incident.
- If readiness fails, treat it as a dependency or system incident.
- If only the deep checks are slow, treat it as diagnostic overhead or a performance issue.
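
The escalation rules above can be expressed as a small decision function. This is a sketch under stated assumptions: it treats HTTP 200 as the success status for the liveness and readiness endpoints, and it uses a 10-second default threshold for "slow" deep checks. Neither value comes from this guide, and the `classify` helper and its output labels are hypothetical.

```shell
# Map probe results to an escalation class.
# Args: live_http_code ready_http_code deep_seconds [slow_threshold_seconds]
classify() {
    live="$1"; ready="$2"; deep="$3"; limit="${4:-10}"
    if [ "$live" != "200" ]; then
        echo "service-incident"        # liveness failed
    elif [ "$ready" != "200" ]; then
        echo "dependency-incident"     # readiness failed
    elif awk -v d="$deep" -v l="$limit" 'BEGIN { exit !(d > l) }'; then
        echo "diagnostic-overhead"     # only the deep check is slow
    else
        echo "ok"
    fi
}
```

Liveness is checked first, so a failing process is never downgraded to a dependency issue just because readiness also fails.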

Related Runbook

Dev Server Runbook
