
Slack and Teams alerts without the noise

Notification channels, alert rules, and thresholds that survived our own staging hosts.

DockLog alerts are useful when they fire once and mean something. They're useless when staging spam wakes you at 2am.

Two places in the UI: Admin → Notifications (where messages go) and Admin → Alerts (what triggers them). Channels first, rules second. Get delivery working before you tune thresholds.

Hook up a channel

Slack

  1. Open the channel you want (#docklog-alerts or similar).
  2. Channel name → Integrations → Incoming Webhooks → Add.
  3. Copy the webhook URL. It looks like https://hooks.slack.com/services/T.../B.../....
  4. In DockLog: Admin → Notifications → Add channel → Slack → paste URL → Save.
  5. Hit Test. You should see a message in Slack within a few seconds.

If Test fails, the URL is most likely wrong or the webhook was revoked in Slack. Regenerate the webhook and update the channel in DockLog.
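To rule out DockLog when Test fails, you can post to the webhook yourself. This is a minimal sketch using only the Python standard library. The `{"text": ...}` body is the standard Slack incoming-webhook format; the function names are ours, and the URL you pass in is whatever you copied in step 3.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 5.0) -> int:
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError on a non-2xx response,
    which is what you would expect to see for a bad or revoked webhook.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Call `send(build_slack_request(url, "webhook check"))` from a shell on the DockLog host. If that works and DockLog's Test does not, the problem is the URL stored in DockLog rather than Slack.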

Microsoft Teams

  1. Open the target channel.
  2. Channel name → Connectors → Incoming Webhook → Configure.
  3. Name it (e.g. "DockLog"), copy the URL.
  4. Same flow in DockLog: pick Teams, paste, save, Test.

Teams webhooks stop working if someone deletes the connector. The symptom usually sounds like "it worked last week."

Discord

  1. Channel settings → Integrations → Webhooks → New Webhook.
  2. Copy webhook URL.
  3. Add channel in DockLog, pick Discord, paste, Test.

Discord rate-limits aggressively. If you fire 50 alerts in a minute during a deploy, some may drop. That's Discord, not DockLog.

Custom HTTPS endpoint

Anything that accepts POST JSON works: n8n, PagerDuty Events API, your own script, a Zapier catch hook.

DockLog sends a JSON payload with rule name, severity, container/pod scope, and a short message body. Point a test channel at webhook.site first if you want to see the exact shape before wiring production.
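If you would rather inspect the payload on your own machine than on webhook.site, a throwaway receiver is enough. The sketch below uses only the standard library and makes no assumption about field names: it prints whatever JSON arrives. The port and the names `parse_alert`, `AlertHandler`, and `serve` are illustrative. It speaks plain HTTP, so put it behind a TLS-terminating proxy before pointing a real channel at it.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_alert(raw: bytes) -> dict:
    """Decode a JSON request body; return {} if it is not a JSON object."""
    try:
        data = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return {}
    return data if isinstance(data, dict) else {}


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        alert = parse_alert(self.rfile.read(length))
        # Print whatever arrived so you can see the real field names.
        print(json.dumps(alert, indent=2))
        self.send_response(204)  # No Content: acknowledge without a body
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Listen on all interfaces until interrupted. The port is an arbitrary choice."""
    HTTPServer(("", port), AlertHandler).serve_forever()
```

Run `serve()`, hit Test in DockLog, and read the printed JSON before writing any code that depends on specific keys.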

Channel toggles

For rule-based alerts, enable Intelligent alerts on the channel. Without it, only the simpler event toggles fire (container start/stop, healthcheck failures, blocked actions).

| Toggle | When to enable |
| --- | --- |
| Intelligent alerts | Log, metric, and K8s event rules |
| Container started/stopped | Noisy on staging; useful on prod if restarts matter |
| Healthcheck failed | Good early signal before OOM |
| Blocked action | Catches RBAC mistakes; see RBAC post |

Turn on only what you need. A channel that pings on every container start on a dev host gets muted in a day.

Rule types

Logs

Match text in stdout/stderr. "15 lines containing ERROR in 3 minutes on prod-api-*" is a common starting point.

Regex works when you need case-insensitivity or structured patterns:

(?i)(exception|fatal|panic)
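
Before saving a pattern, it is worth trying it against a few real log lines. A quick sketch, assuming DockLog's regex dialect treats this pattern the same way Python's `re` module does; the sample lines are made up:

```python
import re

# Same pattern as above: (?i) makes the whole match case-insensitive.
PATTERN = re.compile(r"(?i)(exception|fatal|panic)")

sample_lines = [
    "INFO request ok",
    "java.lang.NullPointerException at Foo.bar",
    "FATAL: could not connect",
    "goroutine 1 [running]: Panic in handler",
]

# search() matches anywhere in the line, not just at the start.
hits = [line for line in sample_lines if PATTERN.search(line)]
print(len(hits), "of", len(sample_lines), "lines match")
```

Paste in lines from your own containers and check that the noisy-but-harmless ones do not match.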

Scope with the same patterns as RBAC: prod-api-*, staging/*, ^worker-\d+$.

Events

Restart loops, OOM kills, unhealthy healthchecks. Catches things you'd otherwise docker inspect for.

Example: a container restarts 5 times in 10 minutes. That's usually a deploy gone wrong or a missing env var, not a flaky network.
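
The "5 restarts in 10 minutes" idea is a sliding-window count. The sketch below shows the logic in Python; it illustrates the technique and is not DockLog's implementation. The class and parameter names are hypothetical.

```python
from collections import deque


class RestartLoopDetector:
    def __init__(self, max_restarts: int = 5, window_seconds: float = 600.0) -> None:
        self.max_restarts = max_restarts
        self.window_seconds = window_seconds
        self._times = deque()  # restart timestamps, oldest first

    def record_restart(self, timestamp: float) -> bool:
        """Record a restart; return True once the window holds max_restarts or more."""
        self._times.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop restarts that are window_seconds old or older.
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()
        return len(self._times) >= self.max_restarts
```

Five restarts a minute apart trip the detector on the fifth; five restarts spread over most of an hour never do, which is the behaviour you want from this kind of rule.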

Metrics

CPU or memory above X for Y minutes. Useful when logs stay quiet but the process is thrashing.

Start high (90% CPU for 5 minutes) and tighten after a week of baseline. A rule at 50% CPU on a bursty API will lie to you: it fires on ordinary traffic bursts, not real trouble.
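
"Above X for Y minutes" means a continuous run above the threshold, not a single high sample. Here is a small sketch of that check. The function name and the sample format, a list of `(timestamp_seconds, value)` pairs, are assumptions for illustration.

```python
def sustained_above(samples, threshold, duration_seconds):
    """Return True if the value stayed above threshold for a continuous
    run of at least duration_seconds.

    samples: iterable of (timestamp_seconds, value), sorted by time.
    """
    run_start = None
    for timestamp, value in samples:
        if value > threshold:
            if run_start is None:
                run_start = timestamp  # a new run above the threshold begins
            if timestamp - run_start >= duration_seconds:
                return True
        else:
            run_start = None  # any dip resets the run
    return False


# One sample per minute: steady load vs. a bursty API alternating 95% and 20%.
steady = [(t, 95.0) for t in range(0, 301, 60)]
bursty = [(t, 95.0 if (t // 60) % 2 == 0 else 20.0) for t in range(0, 601, 60)]

print(sustained_above(steady, 90.0, 300))  # True
print(sustained_above(bursty, 90.0, 300))  # False
```

The bursty series peaks at 95% over and over but never holds it, so a duration-based rule stays quiet while a single-sample rule would fire every other minute.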

Kubernetes events

Crash loop backoff, image pull failures, scheduling failures. Scope with namespace patterns: production/*, staging/api-*.

Pair with K8s log tailing so on-call can jump from alert to live pod logs in one UI.

Starter rules (enable one at a time)

Fresh installs ship six starter rules, all disabled:

| Rule | What it catches | Suggested first? |
| --- | --- | --- |
| OOM kill | Kernel killed the container | Yes on prod |
| Restart loop | Too many restarts in a window | Yes on prod |
| Error spike | Log pattern threshold | After you tune pattern |
| High CPU | Sustained CPU over limit | After baseline week |
| High memory | Sustained memory over limit | After baseline week |
| Unhealthy container | Failed healthcheck | Yes if you use healthchecks |

Enable one, assign a channel, trigger a test (restart a container, print ERROR lines), confirm delivery, then add the next. Turning them all on at once on a busy host is how you mute the channel forever.

Staging vs prod on one instance

Common on a single VPS: staging and prod containers share one DockLog.

| Approach | Pros | Cons |
| --- | --- | --- |
| Two Slack channels | Clean on-call signal | More setup |
| Tighter prod scope only | One channel to watch | Staging rules still need tuning |
| Disable staging rules entirely | Zero noise | Miss staging regressions |

What we do: #docklog-staging with loose cooldowns, #docklog-prod with strict scope (prod-* only) and 10+ minute cooldowns on log rules.

Tuning so people don't mute the channel

  • Scope tight at first (prod-api-*, not *)
  • Cooldown and max-per-hour: staging can be looser, prod should be stricter
  • Separate channels for staging and prod if both run on the same DockLog instance
  • Recovery notifications are nice for "it stopped happening" but optional
  • Log rules: count hits in a time window, not single-line triggers on every stack trace line

Example prod log rule:

| Field | Value |
| --- | --- |
| Pattern | ERROR or (?i)exception |
| Threshold | 15 hits in 3 minutes |
| Scope | prod-api-* |
| Cooldown | 10 minutes |
| Max per hour | 3 |

That catches a real spike without paging on one stray ERROR during a deploy.
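
To see how threshold, cooldown, and max-per-hour interact, here is a sketch of the logic using the values from the example rule. It models the idea and is not DockLog's code; the class, method, and parameter names are hypothetical, and the exact order in which DockLog applies the limits is an assumption.

```python
from collections import deque


class LogRule:
    def __init__(self, threshold=15, window=180.0, cooldown=600.0, max_per_hour=3):
        self.threshold = threshold        # hits needed inside the window
        self.window = window              # seconds
        self.cooldown = cooldown          # minimum seconds between alerts
        self.max_per_hour = max_per_hour  # hard cap on alerts per rolling hour
        self._hits = deque()              # timestamps of matching log lines
        self._fired = deque()             # timestamps of alerts already sent

    def on_hit(self, now: float) -> bool:
        """Record one matching log line at time `now`; return True if an alert should go out."""
        self._hits.append(now)
        while self._hits and self._hits[0] <= now - self.window:
            self._hits.popleft()
        if len(self._hits) < self.threshold:
            return False  # not a spike yet
        while self._fired and self._fired[0] <= now - 3600.0:
            self._fired.popleft()
        if self._fired and now - self._fired[-1] < self.cooldown:
            return False  # still cooling down from the last alert
        if len(self._fired) >= self.max_per_hour:
            return False  # hourly cap reached
        self._fired.append(now)
        return True
```

Feed it one ERROR per second for an hour and it alerts three times (at seconds 14, 614, and 1214), then goes quiet for the rest of the hour. A single stray ERROR never alerts at all.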

Step-by-step: first prod alert

  1. Create #docklog-prod in Slack, add Incoming Webhook.
  2. Admin → Notifications → add channel, enable Intelligent alerts, Test.
  3. Admin → Alerts → enable "Restart loop" starter rule.
  4. Set scope to prod-* (or tighter).
  5. Assign the Slack channel as destination.
  6. Save. Restart a prod container 3-4 times quickly in a test window (or use staging first).
  7. Check History tab: fired? delivered? throttled?

Repeat for the OOM kill rule before adding log-based rules. Events are easier to reason about than log noise.

When nothing arrives

Check in order:

| Step | Question |
| --- | --- |
| 1 | Global delivery enabled? (Admin → Notifications, top-level toggle) |
| 2 | Intelligent alerts on the channel? |
| 3 | Rule enabled with a destination channel assigned? |
| 4 | Scope actually matches container names? Copy-paste from the UI; names lie. |
| 5 | Cooldown or max-per-hour suppressing repeats? |
| 6 | Webhook URL still valid? (regenerate in Slack/Teams) |

History tab under Admin → Alerts shows what fired, delivered, or was suppressed. If History says "delivered" but Slack is quiet, the webhook is dead or the channel was archived.

Scope debugging

Container named prod-api-1 but rule scoped to production-api-*? Zero matches, zero alerts, no error. Always verify names in the container list before blaming DockLog.
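
You can sanity-check a scope offline against a list of names. The sketch below assumes patterns starting with `^` are regexes and everything else is a shell-style glob. That is our guess from the examples in this post, not documented DockLog behaviour, and `matches_scope` is a hypothetical helper.

```python
import re
from fnmatch import fnmatchcase


def matches_scope(name: str, pattern: str) -> bool:
    """Assumption: a leading ^ means regex; anything else is a glob."""
    if pattern.startswith("^"):
        return re.search(pattern, name) is not None
    return fnmatchcase(name, pattern)  # case-sensitive glob match


names = ["prod-api-1", "prod-api-2", "worker-12", "staging-web"]

for pattern in ("production-api-*", "prod-api-*", r"^worker-\d+$"):
    matched = [n for n in names if matches_scope(n, pattern)]
    print(pattern, "->", matched)
```

Swap in your real names (for Docker, `docker ps --format '{{.Names}}'` lists them) and any pattern that prints an empty list is a rule that will never fire.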

For K8s, remember K8S_NAMESPACES is a hard ceiling. A rule on kube-system/* does nothing if that namespace isn't in the instance env. See RBAC guide.

Security events

Worth a separate low-traffic channel if compliance cares. Blocked stop/delete attempts show up when someone hits a button their user account doesn't allow.

Good for catching permission mistakes or someone poking at buttons they shouldn't have. Wire this up before handing out client logins: RBAC patterns.

If you expose DockLog on the internet, put it behind TLS first so webhook URLs and admin sessions aren't the weak link: reverse proxy post.

Email and mobile

Email isn't built yet. Slack mobile notifications are the usual workaround. Pin #docklog-prod; if you add a bot username later, you can set notification preferences to mentions only.

For on-call without Slack, custom webhook → PagerDuty or Opsgenie is the path most teams take.
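
If you put a small relay between DockLog and PagerDuty, the translation step might look like the sketch below. The output shape follows PagerDuty's Events API v2 (`routing_key`, `event_action`, and a `payload` with `summary`, `source`, `severity`). The input keys `rule`, `severity`, `scope`, and `message` are placeholders: check the real DockLog payload before relying on them.

```python
import json

PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"
# Severity values accepted by the Events API v2.
ALLOWED_SEVERITIES = {"critical", "error", "warning", "info"}


def to_pagerduty_event(alert: dict, routing_key: str) -> dict:
    """Map an alert dict (hypothetical key names) to an Events API v2 trigger event."""
    severity = str(alert.get("severity", "")).lower()
    if severity not in ALLOWED_SEVERITIES:
        severity = "error"  # fallback for values PagerDuty would reject
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {
            "summary": f"{alert.get('rule', 'DockLog alert')}: {alert.get('message', '')}",
            "source": alert.get("scope", "docklog"),
            "severity": severity,
        },
    }


example = to_pagerduty_event(
    {"rule": "Restart loop", "severity": "HIGH", "scope": "prod-api-1", "message": "5 restarts"},
    routing_key="YOUR_INTEGRATION_KEY",
)
print(json.dumps(example, indent=2))
```

POST the resulting JSON to `PAGERDUTY_EVENTS_URL` with a `Content-Type: application/json` header. The routing key is the integration key from the PagerDuty service you want to page.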

More in the alerts guide. Compose baseline with auth and DB_PATH: docker-compose setup. Why alerts matter on a self-hosted viewer: why self-hosted.
