Skip to main content
kubernetes-metrics_1_lgknap.webp

Canary Checker Kubernetes Native Health Checks Beyond Monitoring

Vishal Anarase

Vishal Anarase


Picture this: You’ve just rolled out a critical update to your Kubernetes-based web app. Everything looks fine. Prometheus says all pods are healthy. Dashboards are green. PagerDuty? Silent.

Then your support channel lights up.

Checkout’s broken. Third-party payments are timing out. An external API is quietly serving stale data. And your status page still says everything’s “up.”

The problem? Traditional monitoring checks if containers are running—not whether your business workflows are actually working.

The solution? Canary Checkers; they don’t just ask “is the system alive?”. They ask “is it behaving the way users expect?”.

Read further to know more about Canary Checkers and how it is a game changer for your business.

Introduction

Modern distributed applications rely on dozens of services, APIs, and external integrations. Traditional monitoring tools like Prometheus or Datadog detect when something is wrong after it happens, but wouldn’t it be better to proactively verify that your systems and integrations are working as expected

This is where Canary Checker comes in. It is a Kubernetes native health check platform designed to continuously validate services, infrastructure, and integrations through proactive canaries. Think of it as automated tests running inside your cluster, validating not only pods, but also databases, APIs, and external SaaS services.

Curious to see how this would look like for your business? Book a free consultation with Our Kubernetes Experts.

What is Canary Checker

Canary Checker is a declarative health check system for Kubernetes, built to ensure your workloads and dependencies behave as expected.

  • Runs synthetic, passive, infrastructure, and integration checks inside Kubernetes.
  • Provides a unified dashboard and Prometheus metrics for observability.
  • Designed for GitOps workflows, you define canary CRDs alongside other manifests.

Unlike simple readiness/liveness probes, Canary Checker validates end-to-end functionality across your entire stack.

Health checks components
Health checks components

Image source: canarychecker.io

Types of Checks Supported

Synthetic (Active) Checks

  • Actively probe endpoints or services.
  • Examples: HTTP requests, SQL queries, MongoDB connections, LDAP logins.
  • Catch issues before real users do.

Passive Checks

  • Collect signals from existing monitoring/observability systems.
  • Sources: Prometheus alerts, Datadog, CloudWatch, and Elasticsearch queries.
  • Useful to consolidate alerts in one place.

Infrastructure Checks

  • Validate cluster or cloud resource provisioning.
  • Examples: create a pod or EC2 instance, then test readiness.
  • Ensures autoscaling or infra provisioning pipelines actually work.

Integration Checks

  • Run test suites for higher-level validation.
  • Supports Playwright (UI tests), JUnit, Postman/Newman, K6 load tests.
  • Ensures critical workflows (e.g., login → checkout → payment) still function.

Core Capabilities & Differentiators

Prometheus Metrics Exporter Built-In

Every canary automatically exports metrics. No need to wire separate exporters.

Scriptable Logic

Use CEL, JavaScript, or Go templates to customize pass/fail conditions. Example: Alert of API latency > 500ms and SSL cert expires within 7 days.

Dashboard & Visualization

Built-in dashboard shows real-time canary results. Works seamlessly with Grafana dashboards.

Canary Dashboard
Canary Dashboard

Seamless Kubernetes Integration

Canary checks are declarative CRDs, GitOps-friendly. Works with FluxCD, ArgoCD

Secret Management

Securely reference secrets via Kubernetes Secrets or ConfigMaps

Want to upgrade your business’s technology and start monitoring what really matters—but not sure where to start? Let’s make it simple. Click here to get started.

Getting Started

Install via Helm

❯ helm repo add flanksource https://flanksource.github.io/charts "flanksource" has been added to your repositories
bash
❯ helm install canary-checker flanksource/canary-checker -n canary-checker --create-namespace NAME: canary-checker LAST DEPLOYED: Fri Aug 29 17:57:06 2025 NAMESPACE: canary-checker STATUS: deployed REVISION: 1 TEST SUITE: None
bash
❯ helm list -n canary-checker NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION canary-checker canary-checker 1 2025-08-29 17:57:06.359401 +0530 IST deployed canary-checker-1.1.1 1.1.1
bash

Check Results

  • List services
❯ kubectl get svc -n canary-checker NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE canary-checker ClusterIP 10.96.61.129 <none> 8080/TCP 4d22h canary-checker-ui ClusterIP 10.96.148.84 <none> 80/TCP 4d22h
bash
  • Port-forward the UI: kubectl port-forward svc/canary-checker 8080:8080
❯ kubectl -n canary-checker port-forward svc/canary-checker-ui 8080:80 Forwarding from 127.0.0.1:8080 -> 3000 Forwarding from [::1]:8080 -> 3000 Handling connection for 8080 Handling connection for 8080 Handling connection for 8080 Handling connection for 8080 Handling connection for 8080 Handling connection for 8080
bash
  • Or check metrics at /metrics endpoint.

Use Cases

HTTP Check

Verify website returns 200 within latency limits

apiVersion: canaries.flanksource.com/v1 kind: Canary metadata: name: http-check spec: schedule: '@every 30s' http: - name: basic-check url: https://httpbin.flanksource.com/status/200 - name: failing-check url: https://httpbin.flanksource.com/status/500
yaml
❯ k apply -f canary.yaml canary.canaries.flanksource.com/http-check created
bash

Custom Metrics

Monitor external data (e.g., exchange rates) and expose them as Prometheus metrics.

apiVersion: canaries.flanksource.com/v1 kind: Canary metadata: name: exchange-rates spec: schedule: 'every 30 @minutes' http: - name: exchange-rates url: https://api.frankfurter.app/latest?from=USD\&to=GBP,EUR,ILS metrics: - name: exchange\_rate type: gauge value: json.rates.GBP labels: - name: 'from' value: 'USD' - name: to value: GBP - name: exchange\_rate type: gauge value: json.rates.EUR labels: - name: 'from' value: 'USD' - name: to value: EUR - name: exchange\_rate type: gauge value: json.rates.ILS labels: - name: 'from' value: 'USD' - name: to value: ILS - name: exchange\_rate\_api type: histogram value: elapsed.getMilliseconds()
yaml
❯ k apply -f exchange-rates-exporter.yaml canary.canaries.flanksource.com/exchange-rates created
bash

Kubernetes

Check dns pod in kubernetes cluster

apiVersion: canaries.flanksource.com/v1 kind: Canary metadata: name: checks-dns spec: replicas: 1 schedule: '@every 30s' kubernetes: - name: Monitor DNS pods kind: Pod healthy: true resource: labelSelector: k8s-app=kube-dns namespaceSelector: name: kube-system display: {}
yaml
❯ k apply -f checks-dns.yaml canary.canaries.flanksource.com/checks-dns created
bash

Infrastructure Validation

Test new pod/namespace creation after autoscaler events.

List all the canries and verify

❯ kubectl get canaries.canaries.flanksource.com NAME INTERVAL STATUS LAST CHECK UPTIME 1H LATENCY 1H LAST TRANSITIONED checks-dns exchange-rates http-check-new 30
bash

Ready to maximize your Kubernetes reliability and take Canary Checker to production? Click here to know more.

We’ll help you design advanced checks, automate best practices, and scale your health monitoring with confidence.

Comparison: Canary Checker vs Alternatives

Feature / ToolCanary CheckerPrometheus + AlertmanagerFlaggerArgo Rollouts
Kubernetes-native✅ CRDs, GitOps❌ (needs exporters)
Synthetic Checks✅ HTTP, DB, LDAP❌ Only metrics-based
Passive Checks✅ Ingest from Prom, Datadog, CloudWatch✅ Prometheus only
Integration Tests✅ JUnit, Newman, Playwright
Infrastructure Checks✅ Pods, EC2, Cloud
Metrics Exporter✅ Built-in✅ via exporters
FocusEnd-to-end healthMetrics collection & alertsProgressive deliveryProgressive delivery
Best ForFull-stack validationMetrics-based monitoringCanary deploymentsCanary deployments

When to Use Canary Checker vs Alternatives

  • Use Canary Checker if you want simple, Kubernetes-native, declarative checks.
  • Use Flagger/Argo Rollouts if your focus is progressive delivery (not just monitoring).
  • Use Kayenta/Spinnaker if you need large-scale automated analysis.
  • Use Prometheus DIY/SaaS if you want flexibility or managed monitoring.

Conclusion

Canary Checker brings a new layer of reliability to Kubernetes by proactively checking not only your workloads but also your dependencies, infrastructure, and business-critical workflows. It does not replace existing monitoring and deployment tools.

For businesses, this means fewer outages, faster incident response, and improved customer trust—teams can focus on strategic initiatives instead of firefighting unseen failures. By standardizing proactive checks and automating best practices, organizations gain actionable insights that reduce downtime and optimize resources.

Ready to maximize reliability and unlock smarter Kubernetes health monitoring? Schedule your reliability blueprint session today.

Enjoying this post?

Get our posts directly in your inbox.