Blog
2026
- Full Observability on k3s: kube-prometheus-stack + Loki + Grafana OIDC
  04 Jul 2026
  Deploy a production-grade monitoring stack on bare-metal k3s: Prometheus, Loki with Garage S3 storage, Promtail on edge nodes via Ansible, SNMP monitoring for MikroTik, and Grafana SSO via Authelia OIDC, all GitOps-managed.
- HA DNS for Homelab: Unbound + AdGuard Home + Keepalived on Raspberry Pi
  27 Jun 2026
  A two-node recursive DNS stack with ad filtering, automatic config sync, and transparent failover, fully managed with Ansible.
- I Hardened Pod securityContext and Broke 9 Containers in Production
  25 Jun 2026
  capabilities.drop: [ALL] and runAsNonRoot: true passed schema validation cleanly. Within minutes of merge, nine containers, including both Postgres instances backing Paperless and Nextcloud, were down. Here's the failure analysis, why a manual kubectl fix got silently undone, and the lesson for any blanket securityContext change.
- kubectl Said Everything Was Correct. Traefik 404'd Anyway.
  25 Jun 2026
  Migrating Jellyfin off k3s onto a GPU-passthrough LXC meant pointing a Service at an external IP. The EndpointSlice looked completely correct via kubectl (Service existed, endpoint listed right), but Traefik 404'd every request. A second, unrelated gotcha surfaced in the same migration: a PVC silently shared by reference across two unrelated files.
- ArgoCD Gotchas: Cache Staleness and the SharedResourceWarning Nobody Explains
  22 Jun 2026
  kubectl apply succeeds, the field reverts within seconds, and there's no error anywhere. Two ArgoCD debugging patterns that hit the same homelab three times in one day: repo-server cache staleness reverting live edits, and two Applications silently fighting over the same resource.
- I Ran Gitleaks Against My Own Repo and Found 12 Real Secrets
  22 Jun 2026
  A full-history gitleaks scan of a homelab repo that had been running for months turned up 12 distinct plaintext secrets, including an OIDC signing key. Here's the scanning setup, the baseline strategy that doesn't block on pre-existing leaks, and the remediation plan.
- My Firewall Had 77 Rules. Terraform Knew About 22 of Them.
  22 Jun 2026
  Multiple rounds of 'reconstruct the firewall' work each added a fresh generation of rules without removing the old one. Because RouterOS evaluates rules in order and stops at the first match, the oldest, broadest generation was silently winning over the newest, narrower one, undoing a security tightening that looked complete in Terraform.
- Hardening Unattended Raspberry Pi Edge Nodes: Watchdog, fail2ban, nftables, and the Mistakes That Take Down DNS
  22 Jun 2026
  Two Raspberry Pis run DNS for an entire network with no one watching them most of the time. A hardware watchdog, fail2ban, an additive nftables host firewall that doesn't conflict with Docker, log size caps, and an alerting path that works even when the rest of the monitoring stack is down.
- k3s Backup Without the Complexity: Velero + Garage S3 on Longhorn
  20 Jun 2026
  Replace MinIO with Garage (a single 50MB binary) as the Velero backup target. Full daily cluster backups with Longhorn volume snapshots, deployed via ArgoCD.
- External Secrets Operator + HashiCorp Vault: GitOps Secret Lifecycle in Kubernetes
  18 Jun 2026
  Kubernetes Secrets are base64-encoded, not encrypted. Moving secrets out of the cluster into Vault, and syncing them back via External Secrets Operator, gives you rotation, audit logging, and compliance without changing how applications consume secrets.
- How a 1 GiB Memory Limit Took Down My Entire k3s Cluster
  18 Jun 2026
  A single misconfigured resource limit triggered a cascade: OOMKill on the control-plane, load average of 90, 1.2M DNS queries per day, and kubelet reporting the wrong allocatable memory. Here's the full post-mortem.
- Kyverno: Supply Chain Security as Admission Control on Kubernetes
  18 Jun 2026
  Most Kubernetes clusters accept any container image, any privilege level, and any resource configuration by default. Kyverno lets you enforce policies at admission time, before anything runs. Here's how to build a supply chain security baseline with Audit-first rollout.
- IPv6 NAT66 Behind a FritzBox: The RouterOS 7 Bug That Broke WiFi Clients
  18 Jun 2026
  Setting up IPv6 on MikroTik behind a FritzBox with CGN should be straightforward: ULA prefix, NAT66 masquerade, done. Instead, RouterOS 7 started advertising router advertisements on the WAN interface, turning MikroTik into an uninvited IPv6 gateway for FritzBox WiFi clients. Here's the full setup and fix.
- SLO Burn-Rate Alerting with Prometheus: Beyond Threshold Alerts
  18 Jun 2026
  Most teams alert when availability drops below a threshold. Burn-rate alerting tells you how fast you're spending your error budget, so you page on trajectory, not just current state. Here's how to implement the Google SRE Workbook approach on a bare-metal k3s cluster.
- Self-Hosted Tailscale Control Plane: Headscale on k3s with Authelia OIDC
  13 Jun 2026
  Deploy Headscale on a bare-metal k3s cluster with Longhorn persistence, Traefik ingress, and Authelia OIDC authentication, fully GitOps-managed via ArgoCD.
- Wildcard TLS Certificates on K3s with cert-manager and Cloudflare DNS
  22 May 2026
  How to automate wildcard Let's Encrypt certificates on a bare-metal K3s cluster using cert-manager's DNS-01 challenge with Cloudflare, and why HTTP-01 won't work for internal services.
- GitOps on K3s: Managing a Complete Homelab with ArgoCD
  20 May 2026
  How to manage an entire Kubernetes homelab (MetalLB, Traefik, Longhorn, Authelia, and more) as a Git repository using ArgoCD's App-of-Apps pattern.
- Bare-Metal LoadBalancer on K3s: MetalLB + Traefik with ArgoCD
  18 May 2026
  How to get a real external IP on a bare-metal Kubernetes cluster using MetalLB L2 mode, and wire it up with Traefik for automatic HTTPS, fully GitOps-managed with ArgoCD.
- NIS2 Article 21 in Azure: Implementing Network Security Controls with Terraform
  17 May 2026
  A technical deep-dive into the network security requirements of NIS2 Article 21 and how to implement them in Azure using Terraform, with concrete code, not legal theory.
- Zero-Trust RAG: Defeating the Shared Private Link Deadlock in Azure Terraform
  16 May 2026
  How to programmatically approve Azure AI Search Shared Private Links using AzAPI, and why your AI architecture will fail an audit without proper Identity Chaining.
- Enterprise Homelab: K3s, Authelia & Longhorn on Proxmox with Terraform
  16 May 2026
  How to build a production-grade Kubernetes homelab with K3s, Authelia SSO, Longhorn storage, and ArgoCD, and the five painful mistakes that will cost you hours if you don't know about them.
- Breaking the Loop: Solving Circular Dependencies in Azure Firewall Routing
  07 May 2026
  How to implement Azure Firewall Forced Tunneling in Terraform without triggering cycle errors, and why a simple 0.0.0.0/0 route will instantly break your Windows VMs.
- Architecting an Enterprise-Grade Homelab: My Ansible Master Playbook
  06 May 2026
  Take a tour of a fully automated, segmented, and highly available homelab architecture orchestrated entirely via Ansible and GitOps.
- Automating MikroTik WireGuard VPN with Role-Based Access via Terraform
  05 May 2026
  Deploy a WireGuard VPN on MikroTik using Terraform. Learn how to implement role-based network access, isolating mobile devices from full admin laptops.
- Automating MikroTik Bridge VLAN Filtering & Proxmox Trunks with Terraform
  04 May 2026
  Master MikroTik's notoriously complex Bridge VLAN Filtering. Learn how to automate dynamic VLAN matrices, Proxmox trunk ports, and edge devices using Terraform.
- Surviving Azure Policies: Zero-Trust Hub & Spoke with Terraform
  03 May 2026
  How to build an enterprise-grade Azure network architecture that blocks internet traffic by default and survives aggressive DeployIfNotExists (DINE) policies, without breaking your CI/CD pipeline.
- Implementing a Zero-Trust MikroTik Firewall with Terraform
  03 May 2026
  Learn how to enforce strict VLAN isolation, fast-track traffic, and build a default-deny firewall for MikroTik RouterOS using Infrastructure as Code.
- Deploying Gemma 4 26B on Proxmox: IaC Setup with Terraform, Ansible & AMD iGPU
  02 May 2026
  A complete guide to automating a local AI stack on Proxmox LXC using Terraform and Ansible, including Open-WebUI and AMD Radeon Vega iGPU workarounds.
- Hardening Azure Acmebot for ISO 27001 & NIS2 Compliance
  01 May 2026
  A deep dive into architecting a Zero-Trust Let's Encrypt automation using Terraform, Azure Private Link, and VNet Integration.