Airbyte is a fast-growing ELT tool that helps acquire data from multiple sources, and it is particularly useful for building data lakes. Airbyte offers pre-built connectors to over 300 sources and dozens of destinations, and it also allows custom connectors to be built quickly using its language SDKs.
Airbyte recently released OpenTelemetry-based metrics; however, the documentation has been spotty and incomplete. You can check it out here. In this blog, I will document what I learned while integrating Airbyte Open Source, running in GKE, with Grafana, using GCP's managed Prometheus service. The available metrics can be seen here.
Airbyte to Grafana – via OpenTelemetry & Prometheus
The design looks as follows.

Implementation
Step 1 – Deploy Airbyte
Install Airbyte on Kubernetes. This is pretty straightforward. Follow the instructions on this page.
git clone https://github.com/airbytehq/airbyte.git
cd airbyte
kubectl apply -k kube/overlays/stable
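Once the pods come up, it is worth confirming the base install is healthy before wiring up metrics. A quick check, assuming you deployed into the default namespace (the overlay does not pin one; adjust -n if you customized it):

# The Airbyte pods (server, worker, webapp, etc.) should all reach Running
kubectl get pods -n default
kubectl get deployments -n default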
You can customize the namespace to fit your needs. Add the following two variables to the worker in the .env file (they will be translated into the ConfigMap and eventually into environment variables on the worker pod).
METRIC_CLIENT=otel
OTEL_COLLECTOR_ENDPOINT=http://otel-collector:4317
We will deploy the OpenTelemetry collector pod in a subsequent step. Note that if you set PUBLISH_METRICS=true, the worker currently looks for Datadog configuration instead.
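To confirm the two variables actually reached the worker, you can inspect the generated ConfigMap and the running pod. A minimal sketch, assuming the resource names airbyte-env and airbyte-worker from the standard kustomize install (adjust names and namespace to your deployment):

# The .env entries should appear in the airbyte-env ConfigMap...
kubectl -n airbyte-dev get configmap airbyte-env -o yaml | grep -E 'METRIC_CLIENT|OTEL_COLLECTOR_ENDPOINT'
# ...and in the worker pod's environment
kubectl -n airbyte-dev exec deploy/airbyte-worker -- env | grep -E 'METRIC_CLIENT|OTEL_COLLECTOR_ENDPOINT'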
Step 2 – Deploy the Metric Reporter
The metric reporter queries metrics from the Airbyte database in batches and pushes them to the OpenTelemetry collector. Use the following YAML as an example.
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: airbyte-metrics
  namespace: airbyte-dev
  labels:
    app: airbyte-metrics
spec:
  replicas: 1
  selector:
    matchLabels:
      app: airbyte-metrics
  template:
    metadata:
      labels:
        app: airbyte-metrics
    spec:
      serviceAccountName: airbyte-admin
      automountServiceAccountToken: true
      containers:
        - name: metrics
          image: airbyte/metrics-reporter:0.39.31-alpha
          env:
            - name: METRIC_CLIENT
              value: "otel"
            - name: OTEL_COLLECTOR_ENDPOINT
              value: "otel-collector:4317"
            - name: PUBLISH_METRICS
              value: "true"
Note: the metric reporter needs access to the Airbyte database. Copy all of the Airbyte worker's configuration (its env: section) in addition to the key-value pairs above.
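Rather than copying each database variable by hand, the worker's shared configuration can usually be pulled in wholesale with envFrom. The fragment below is a sketch of the container spec only, assuming the ConfigMap airbyte-env and Secret airbyte-secrets created by the standard kustomize install; the object names may differ in your deployment:

      containers:
        - name: metrics
          image: airbyte/metrics-reporter:0.39.31-alpha
          # Assumption: these objects exist from the standard install and
          # carry the DATABASE_* settings the reporter needs.
          envFrom:
            - configMapRef:
                name: airbyte-env
            - secretRef:
                name: airbyte-secrets
          env:
            - name: METRIC_CLIENT
              value: "otel"
            - name: OTEL_COLLECTOR_ENDPOINT
              value: "otel-collector:4317"
            - name: PUBLISH_METRICS
              value: "true"

Values set explicitly under env take precedence over the same keys coming from envFrom, so METRIC_CLIENT stays otel even if the ConfigMap defines it differently.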
Step 3 – Create the OpenTelemetry Collector
The OpenTelemetry collector receives metrics from the metric reporter and writes them to Prometheus. It is a fairly standard and well-documented setup.
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-conf
  namespace: airbyte-dev
  labels:
    app: opentelemetry
    component: otel-collector-conf
data:
  otel-collector-config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
      memory_limiter:
        limit_mib: 1500
        spike_limit_mib: 512
        check_interval: 5s
    extensions:
      zpages: {}
      memory_ballast:
        size_mib: 683
    exporters:
      logging:
        loglevel: debug
      prometheusremotewrite:
        endpoint: "http://prometheus-test.airbyte-dev.svc:9090/api/v1/write"
    service:
      extensions: [zpages, memory_ballast]
      pipelines:
        metrics:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [logging, prometheusremotewrite]
---
apiVersion: v1
kind: Service
metadata:
  name: otel-collector
  namespace: airbyte-dev
  labels:
    app: opentelemetry
    component: otel-collector
spec:
  ports:
    - name: otlp-grpc # Default endpoint for OpenTelemetry gRPC receiver.
      port: 4317
      protocol: TCP
      targetPort: 4317
    - name: otlp-http # Default endpoint for OpenTelemetry HTTP receiver.
      port: 4318
      protocol: TCP
      targetPort: 4318
    - name: metrics # Default endpoint for querying metrics.
      port: 8888
  selector:
    component: otel-collector
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
  namespace: airbyte-dev
  labels:
    app: opentelemetry
    component: otel-collector
spec:
  selector:
    matchLabels:
      app: opentelemetry
      component: otel-collector
  minReadySeconds: 5
  progressDeadlineSeconds: 120
  replicas: 1 # TODO - adjust this to your own requirements
  template:
    metadata:
      labels:
        app: opentelemetry
        component: otel-collector
    spec:
      containers:
        - command:
            - "/otelcol"
            - "--config=/conf/otel-collector-config.yaml"
          image: otel/opentelemetry-collector:0.54.0
          name: otel-collector
          resources:
            limits:
              cpu: 1
              memory: 2Gi
            requests:
              cpu: 200m
              memory: 400Mi
          ports:
            - containerPort: 55679 # Default endpoint for ZPages.
            - containerPort: 4317 # Default endpoint for OpenTelemetry receiver.
            - containerPort: 14250 # Default endpoint for Jaeger gRPC receiver.
            - containerPort: 14268 # Default endpoint for Jaeger HTTP receiver.
            - containerPort: 9411 # Default endpoint for Zipkin receiver.
            - name: metrics
              protocol: TCP
              containerPort: 8888
          volumeMounts:
            - name: otel-collector-config-vol
              mountPath: /conf
      volumes:
        - configMap:
            name: otel-collector-conf
            items:
              - key: otel-collector-config
                path: otel-collector-config.yaml
          name: otel-collector-config-vol
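Because the logging exporter is set to debug, the easiest way to confirm metrics are flowing end to end is to tail the collector once a sync or the metric reporter has run:

# Metrics received over OTLP are echoed by the logging exporter
kubectl -n airbyte-dev logs deploy/otel-collector --tail=100
# The service should expose 4317 (gRPC), 4318 (HTTP) and 8888 (the collector's own metrics)
kubectl -n airbyte-dev get svc otel-collector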
Step 4 – Deploy the Prometheus Proxy
Though we could deploy a full-fledged Prometheus, I chose to use Google's managed Prometheus (GMP) service. However, managed Prometheus requires a proxy to provide an endpoint for the OpenTelemetry collector. The documentation is here. Here is my YAML for it.
---
apiVersion: v1
kind: Service
metadata:
  namespace: airbyte-dev
  name: prometheus-test
  labels:
    prometheus: test
spec:
  type: ClusterIP
  selector:
    app: prometheus
    prometheus: test
  ports:
    - name: web
      port: 9090
      targetPort: web
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  namespace: airbyte-dev
  name: prometheus-test
  labels:
    prometheus: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
      prometheus: test
  serviceName: prometheus-test
  template:
    metadata:
      labels:
        app: prometheus
        prometheus: test
    spec:
      automountServiceAccountToken: true
      nodeSelector:
        kubernetes.io/arch: amd64
        kubernetes.io/os: linux
      containers:
        - name: prometheus
          image: gke.gcr.io/prometheus-engine/prometheus:v2.28.1-gmp.7-gke.0
          args:
            - --config.file=/prometheus/config_out/config.yaml
            - --storage.tsdb.path=/prometheus/data
            - --storage.tsdb.retention.time=24h
            - --web.enable-lifecycle
            - --enable-feature=remote-write-receiver
            - --storage.tsdb.no-lockfile
            - --web.route-prefix=/
          ports:
            - name: web
              containerPort: 9090
          readinessProbe:
            httpGet:
              path: /-/ready
              port: web
              scheme: HTTP
          resources:
            requests:
              memory: 400Mi
          volumeMounts:
            - name: config-out
              mountPath: /prometheus/config_out
              readOnly: true
            - name: prometheus-db
              mountPath: /prometheus/data
        - name: config-reloader
          image: gke.gcr.io/prometheus-engine/config-reloader:v0.4.1-gke.0
          args:
            - --config-file=/prometheus/config/config.yaml
            - --config-file-output=/prometheus/config_out/config.yaml
            - --reload-url=http://localhost:9090/-/reload
            - --listen-address=:19091
          ports:
            - name: reloader-web
              containerPort: 8080
          resources:
            limits:
              cpu: 100m
              memory: 50Mi
            requests:
              cpu: 100m
              memory: 50Mi
          volumeMounts:
            - name: config
              mountPath: /prometheus/config
            - name: config-out
              mountPath: /prometheus/config_out
      terminationGracePeriodSeconds: 600
      volumes:
        - name: prometheus-db
          emptyDir: {}
        - name: config
          configMap:
            name: prometheus-test
            defaultMode: 420
        - name: config-out
          emptyDir: {}
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: airbyte-dev
  name: prometheus-test
  labels:
    prometheus: test
data:
  config.yaml: |
    global:
      scrape_interval: 30s
    scrape_configs:
      - job_name: otel-collector
        static_configs:
          - targets: ['otel-collector.airbyte-dev.svc:8888']
There are two key points. First, in this setup the OpenTelemetry collector writes metrics to Prometheus rather than Prometheus pulling them, hence the --enable-feature=remote-write-receiver argument. Second, for a pull model Prometheus would need to be configured for scraping, which is not implemented here even though a scrape config is included.
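To sanity-check that remote write is actually landing in the proxy, port-forward the service and query the Prometheus HTTP API (a sketch; run the curl from a second shell):

# Expose the GMP proxy locally
kubectl -n airbyte-dev port-forward svc/prometheus-test 9090:9090
# List the metric names that have been written so far
curl -s 'http://localhost:9090/api/v1/label/__name__/values'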
Step 5 – Install Grafana
This is fairly straightforward. Deploy Grafana as below and point a Prometheus data source at the GMP proxy, http://prometheus-test.airbyte-dev.svc:9090.
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
  namespace: airbyte-dev
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: airbyte-dev
  labels:
    app: grafana
  name: grafana
spec:
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      securityContext:
        fsGroup: 472
        supplementalGroups:
          - 0
      containers:
        - name: grafana
          image: grafana/grafana:8.4.4
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: http-grafana
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /robots.txt
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            tcpSocket:
              port: 3000
            timeoutSeconds: 1
          resources:
            requests:
              cpu: 250m
              memory: 750Mi
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: grafana-pv
      volumes:
        - name: grafana-pv
          persistentVolumeClaim:
            claimName: grafana-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: airbyte-dev
spec:
  ports:
    - port: 3000
      protocol: TCP
      targetPort: http-grafana
  selector:
    app: grafana
  sessionAffinity: None
  type: LoadBalancer
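If you prefer configuration over clicking through the UI, Grafana can also provision the data source from a file. A minimal sketch, assuming you mount it (for example via a ConfigMap) at /etc/grafana/provisioning/datasources/gmp.yaml; the file and data source names are arbitrary:

apiVersion: 1
datasources:
  # Points Grafana at the GMP proxy service from Step 4
  - name: GMP-Proxy
    type: prometheus
    access: proxy
    url: http://prometheus-test.airbyte-dev.svc:9090
    isDefault: true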
The steps may not be in strict order, but once everything is deployed, each component should discover its endpoints and function.
The metrics can now be queried either in GMP or Grafana.
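As an example, once a few syncs have completed you can start graphing job-level metrics in Grafana. The exact metric names depend on your Airbyte version and on how the OpenTelemetry exporter maps them, so treat the queries below as illustrative and cross-check them against the metric list referenced earlier (or the label values returned by the API call above):

# Illustrative PromQL - confirm these metric names exist in your setup first
num_pending_jobs
num_running_jobs
oldest_running_job_age_secs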