June 20, 2026 | 7 min read | DockLog

DockLog vs Grafana and Loki

Loki keeps history. DockLog gets you to a live line before you've installed Promtail.

Tags: comparison, monitoring, grafana, loki, docklog

Loki is for when you need months of logs in object storage, LogQL across services, Grafana dashboards, and the LGTM stack beside Mimir and Tempo. Recent architecture work pushes Kafka-backed ingestion and columnar storage for scale-out installs. That's the right tool when retention, compliance, and fleet-wide search are the mandate.

DockLog is for "what is this container printing right now?" on one host or a small cluster, before anyone has time to stand up Promtail, object storage, queriers, and Grafana.

Not a replacement

We won't pretend to index petabytes or run {app="api"} |= "timeout" across hundreds of nodes. If that's the job, deploy Loki or use your cloud vendor's log store.

Many teams run both:

Phase	Tool
Incident (tail now, scoped access, phone)	DockLog
Postmortem (search 90 days, join metrics/traces)	Loki + Grafana

DockLog is the low-latency human layer Loki was never trying to be.

What Loki gives you

Retention: weeks to years in S3/GCS/MinIO
Cross-service LogQL: filter by labels, grep content
Alerting: a mature ecosystem through Grafana
Fleet scale: provided a platform team operates the agents, ingesters, and storage
Correlation: logs joined to metrics and traces in one Grafana UI

If you have that infrastructure, use it.

Why people still install DockLog first (or alongside)

Socket to tail in under a minute

Mount docker.sock or kubeconfig, one container, WebSocket stream. No Promtail, Alloy, Kafka cluster, or Grafana install before the first line appears. On a $20 VPS the ops tax matters.

Loki's distributed path in 2026 can add Kafka as a dependency for scale-out. DockLog's dependency is Docker (or Kubernetes API access).

RBAC on the daemon, not in LogQL

Each user gets an allowed_containers list that accepts wildcards, regex, and K8s namespaces. A contractor sees staging/*; on-call sees prod-api-*; an admin sees everything. No label cardinality design session, no waiting for a platform team to model tenants in Grafana.
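To make the scoping idea concrete, here is a minimal shell sketch of wildcard matching against a per-user pattern list. It is an illustration only, not DockLog's code: the `can_see` helper is hypothetical, and it assumes the patterns behave like shell globs, which may differ from how DockLog evaluates wildcards and regex.

```bash
#!/bin/sh
# Hypothetical helper (not part of DockLog): succeed if NAME matches
# any of the glob patterns passed after it.
can_see() {
  name=$1
  shift
  for pattern in "$@"; do
    # An unquoted pattern in `case` is matched as a shell glob.
    case $name in
      $pattern) return 0 ;;
    esac
  done
  return 1
}

# Example scopes taken from the roles described above.
can_see "prod-api-2" 'prod-api-*' && echo "on-call: prod-api-2 allowed"
can_see "prod-db-1" 'staging/*' || echo "contractor: prod-db-1 denied"
```

The point of the sketch is that access is decided from the container name alone, so nothing has to be modeled as labels or tenants first.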

Live tail without the pipeline in the middle

Explore tail in Grafana works. It still flows through agents, ingesters, and queriers. DockLog reads the container buffer on the host, closer to docker logs -f when prod is loud.

Actions when logs aren't enough

Restart the worker, open a shell, or inspect labels and health. Each action is gated by ALLOW_* settings and can_* permissions, and audit-logged when auth is on. Loki tells you what happened; DockLog lets permitted humans do the next step in the same UI (or app).

Alerts without a logging schema

Signal	DockLog
Log line regex	Yes
Container die / OOM / health fail	Yes
K8s warning events	Yes
CPU/memory threshold	Yes
Throttling + severity	Yes
Slack, Teams, Discord, webhook	Yes

Stand this up on one host before you've designed label schemas for Loki.
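The "log line regex" and "throttling" rows can be pictured with a few lines of shell. This is a rough sketch of the general technique, not how DockLog implements alerts: `alert_on_match` is a hypothetical function that reads log lines on stdin and prints at most one alert per window for lines matching an extended regex.

```bash
#!/bin/sh
# Hypothetical sketch (not DockLog's implementation).
# Usage: <log stream> | alert_on_match REGEX WINDOW_SECONDS
alert_on_match() {
  regex=$1
  window=$2
  last=0  # epoch seconds of the last alert; 0 means none sent yet
  while IFS= read -r line; do
    if printf '%s\n' "$line" | grep -Eq -- "$regex"; then
      now=$(date +%s)
      # Throttle: only alert if the window has elapsed since the last one.
      if [ $((now - last)) -ge "$window" ]; then
        printf 'ALERT: %s\n' "$line"
        last=$now
      fi
    fi
  done
}
```

Assuming a container named prod-api-1, it could be fed with `docker logs -f --tail 0 prod-api-1 2>&1 | alert_on_match 'ERROR|timeout' 60`. Delivery to Slack or a webhook is left out of the sketch.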

Metrics history without Prometheus

~30 days of host, container, and pod CPU/memory in SQLite. "When did memory step-change?" without a second stack.

Native apps for the person getting paged

Grafana mobile targets cloud dashboards and power users. DockLog's Android, Windows, and Linux apps connect to your own instance: tail, pause, catch up, export logs, pod view, and local notifications, with credentials kept on the device and no vendor cloud in the path.

The person at dinner often isn't the one who lives in LogQL.

Comparison

	Loki + Grafana	DockLog
Setup time	Hours to days	Often under a minute
RAM on small VPS	Significant	~30MB typical
Cross-cluster search	Core strength	Out of scope
Per-user container scope	Complex	Built-in
Restart from same UI	No	Yes (with RBAC)
Native self-hosted mobile	No	Yes
Long-term retention	Yes	No (live + buffer)

Real scenarios

Scenario: greenfield VPS, prod is noisy tonight

DockLog first. Mount socket, add users, tail. Design Loki label schemas next quarter when retention becomes a real ask.

Scenario: compliance needs 90-day log retention

Loki (or cloud logging) is the answer. Keep DockLog for incident tails and scoped human access during the fire.

Scenario: contractor needs staging logs, not LogQL training

Grafana permissions and tenant modeling take time. In DockLog, setting allowed_containers to staging-* is one admin form. See the RBAC guide.

Scenario: postmortem needs "show me every timeout in March"

LogQL across indexed labels wins. DockLog's buffer is live and short; it will not replace search-at-scale.
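For a sense of what that search looks like outside Grafana, here is a hedged sketch against Loki's HTTP API. It assumes Loki is reachable at `$LOKI_URL` (defaulting to localhost on port 3100) and that streams carry an `app` label; `loki_timeouts` is a hypothetical wrapper name, and the LogQL expression is the one quoted earlier in this post.

```bash
#!/bin/sh
# Hypothetical wrapper around Loki's query_range endpoint.
# Usage: loki_timeouts START END   (RFC3339 timestamps)
loki_timeouts() {
  # -G sends the url-encoded pairs as a query string on a GET request.
  curl -sG "${LOKI_URL:-http://localhost:3100}/loki/api/v1/query_range" \
    --data-urlencode 'query={app="api"} |= "timeout"' \
    --data-urlencode "start=$1" \
    --data-urlencode "end=$2" \
    --data-urlencode 'limit=1000'
}
```

A call such as `loki_timeouts 2026-03-01T00:00:00Z 2026-03-08T00:00:00Z` covers one week. Loki caps the number of lines per request and can limit the time range of a single query, so a whole month is usually walked in slices like this.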

Scenario: small k3s cluster, no platform team

Full LGTM stack on 4GB RAM hurts. DockLog plus optional Prometheus later is a common ladder. The "K8s without Loki" guide walks the minimal path.

Decision table

Requirement	Loki + Grafana	DockLog	Both
Live tail during incident	Via pipeline	Native	Yes
90+ day retention	Yes	No	Loki stores, DockLog tails
Per-user container scope	Hard	Built-in	DockLog for humans
Fleet-wide LogQL	Yes	No	Loki
Restart/shell with audit	No	Yes	DockLog for actions
Mobile on-call tail	Weak	Android app	DockLog
Small VPS RAM budget	Tight	~30MB	DockLog now, Loki later

Config examples

DockLog today (socket, auth, alerts)

yaml

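services:
  docklog:
    image: aimldev/docklog:latest
    ports:
      # Publish on localhost only; expose it through your reverse proxy.
      - "127.0.0.1:8888:8000"
    volumes:
      # Docker socket: how DockLog reads containers and their logs.
      - /var/run/docker.sock:/var/run/docker.sock
      # Named volume for the SQLite database (declared at the bottom).
      - docklog-data:/data
    environment:
      DB_PATH: /data/docklog.db
      # Supplied from the shell environment or an .env file.
      SECRET_KEY: ${SECRET_KEY}
      # Turns on the restart action in the UI.
      ALLOW_RESTART: "true"

# Compose rejects a named volume that is not declared at the top level.
volumes:
  docklog-data: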

Add alert rules in the UI for log regex ERROR on prod-api-*, container die events, or CPU thresholds. See the alerts setup guide.

Minimal Loki path (when you are ready)

Typical small setup: Promtail or Alloy on the host, Loki single-binary or Grafana Cloud, Grafana for Explore. Label design matters early:

yaml

# promtail snippet: keep labels low-cardinality
scrape_configs:
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
    relabel_configs:
      # Docker reports names as "/name"; the regex strips the leading slash.
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: container
      # Compose service name, taken from the com.docker.compose.service label.
      - source_labels: ['__meta_docker_container_label_com_docker_compose_service']
        target_label: service

DockLog does not compete with this layer; it sits beside it for humans who need tails before the pipeline exists.

Dual-layer mental model

Layer	Tool	Job
Human incident	DockLog	Tail, scope, restart, phone
Analytics / retention	Loki + Grafana	Search, dashboards, compliance

Ship logs to Loki with the Docker logging driver plugin or Promtail once retention becomes a real requirement. Keep DockLog for the humans fixing prod tonight.

Troubleshooting

"Grafana Explore tail is seconds behind DockLog"

Expected. Loki ingests in batches; DockLog reads the daemon buffer. Use DockLog during the incident, Loki for historical grep.

"Loki stack eats our 4GB VPS"

Run Loki on dedicated infra or Grafana Cloud; keep DockLog on the edge box. Or defer Loki until retention is a documented requirement. See "Self-hosted monitoring on a budget".

"We duplicated alerts in Grafana and DockLog"

Pick ownership: DockLog for container-level die/OOM/log-match near the host; Grafana for SLO burn rates and cross-service LogQL. Overlap causes alert fatigue.

"DockLog user can't see prod but Loki shows everything"

Different permission models. Narrow Grafana org roles or Loki tenant filters separately. DockLog allowed_containers does not govern Loki.

"WebSocket tails drop behind nginx"

Same fix as any DockLog deploy: pass the WebSocket upgrade headers, raise the proxy's idle timeout, and route the /ws path. See the production proxy guide.
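One way to check whether the proxy forwards the upgrade at all is to send a bare WebSocket handshake with curl and look at the status code. This is a generic sketch, not a DockLog tool: `ws_handshake_status` is a hypothetical helper, the hostname in the usage note is a placeholder, and the key is the sample nonce from RFC 6455.

```bash
#!/bin/sh
# Hypothetical helper: print the HTTP status returned for a WebSocket
# handshake sent to URL. Not part of DockLog.
ws_handshake_status() {
  # --http1.1: the Upgrade header mechanism belongs to HTTP/1.1.
  # --max-time: an accepted upgrade keeps the connection open, so bound the wait.
  curl -s -o /dev/null -w '%{http_code}\n' --http1.1 --max-time 5 \
    -H 'Connection: Upgrade' \
    -H 'Upgrade: websocket' \
    -H 'Sec-WebSocket-Version: 13' \
    -H 'Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==' \
    "$1"
}
```

Running `ws_handshake_status https://logs.example.com/ws` and getting 101 means the upgrade made it through the proxy. A different code can also come from authentication on the endpoint, so treat it as a hint about where to look rather than proof that nginx is at fault.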

When it's obviously Loki

Retention SLAs and legal hold
Hundreds of nodes, many clusters
Joining logs to traces at fleet scale
Dedicated SRE / platform team

When it's obviously DockLog

Single VPS or small cluster, need tails today
Team boundaries on who sees prod
Incident response before the logging stack exists
Mobile on-call without Grafana expertise

When it's obviously both

Platform team operates Loki; app teams and contractors use DockLog for scoped live access
Incidents start in DockLog; postmortems query Loki for timeline reconstruction

bash

docker run -d \
  --name docklog \
  -p 8888:8000 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v docklog-data:/data \
  -e DB_PATH=/data/docklog.db \
  aimldev/docklog:latest

Related: K8s without Loki, alerts, docker log management tools, DockLog vs Portainer if the question is control plane vs tail layer.