MDAI OpenTelemetry Collector Sample Config

This sample OpenTelemetry Collector (OTEL Collector) configuration defines an MDAI Gateway Collector, which receives, processes, and exports telemetry data (logs, traces, and metrics) within the MDAI ecosystem. The pipelines in this sample handle logs only.


OpenTelemetry Collector (OTEL Collector) configuration

For additional examples and information, see the OpenTelemetry documentation.


Summary

The MDAI Gateway OTEL Collector:

  • Receives logs via the Fluent Forward protocol and OTLP.
  • Filters and processes logs using memory limits, batching, grouping, and attribute transformations.
  • Exports logs to the MDAI Watcher Collector and to the collector's local debug output.
  • Exposes a health endpoint for readiness and liveness probes through the health_check extension.

Key Components of the Configuration

| Component | Required | Description |
| --- | --- | --- |
| API Version & Kind | YES | Uses opentelemetry.io/v1beta1, defining an OpenTelemetryCollector resource. |
| Metadata | YES | Labels the collector with mdaihub-name: mdaihub-sample and deploys it in the mdai namespace. |
| Image | YES | Defines the container image used by the collector (otel/opentelemetry-collector-contrib:0.117.0). |
| Environment Variables (envFrom) | ❌ NO | Loads environment variables from a ConfigMap (mdaihub-sample-variables). |

Example Configuration:

apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  labels:
    mdaihub-name: mdaihub-sample
  name: gateway-3
  namespace: mdai
spec:
  image: otel/opentelemetry-collector-contrib:0.117.0
  envFrom:
    - configMapRef:
        name: mdaihub-sample-variables
  config:
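
The envFrom entry exposes every key of the referenced ConfigMap to the collector as an environment variable, which the collector configuration can then read with the ${env:NAME} syntax. As a rough sketch, assuming this ConfigMap is where SERVICE_LIST_REGEX (used by the filter/service_list processor) comes from, it might look like the following. The regex value is purely illustrative, and in an MDAI deployment this ConfigMap may be created and managed for you.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mdaihub-sample-variables
  namespace: mdai
data:
  # Hypothetical value: a regex matched against each log record's service.name attribute.
  SERVICE_LIST_REGEX: "service-a|service-b"
```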

Configuration Breakdown

Receivers

Receivers define how the collector ingests telemetry data.

| Receiver | Required | Description |
| --- | --- | --- |
| fluentforward | ❌ NO | Receives logs via the Fluent Forward protocol on port 8006. |
| otlp | YES | Accepts telemetry in OpenTelemetry Protocol (OTLP) over gRPC (4317) and HTTP (4318). |
| CORS Settings | ❌ NO | Allows any http:// or https:// origin on the OTLP HTTP endpoint, for web-based telemetry ingestion. |

Example Configuration:

    receivers:
      fluentforward:
        endpoint: '${env:MY_POD_IP}:8006'
      otlp:
        protocols:
          grpc:
            endpoint: '${env:MY_POD_IP}:4317'
          http:
            endpoint: '${env:MY_POD_IP}:4318'
            cors:
              allowed_origins:
                - "http://*"
                - "https://*"
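
Each endpoint above binds to the address held in the MY_POD_IP environment variable. This page does not show where that variable is set. If your deployment does not already provide it, one common approach is the Kubernetes Downward API. The fragment below is a sketch under that assumption, using the env field of the OpenTelemetryCollector spec.

```yaml
spec:
  env:
    # Assumption: MY_POD_IP is not already injected by your tooling.
    - name: MY_POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP   # the pod's own IP, filled in by Kubernetes
```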

Extensions

Extensions add functionality to the collector beyond the telemetry pipelines.

| Extension | Required | Purpose |
| --- | --- | --- |
| health_check | YES | Ensures the collector passes readiness & liveness probes at 13133. (Mandatory in MDAI Helm Chart) |

Example Configuration:

    extensions:
      health_check:
        endpoint: "${env:MY_POD_IP}:13133"

Processors

Processors modify and filter telemetry data before exporting.

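| Processor | Required | Purpose |
| --- | --- | --- |
| memory_limiter | YES | Caps collector memory at 75% of available memory, with a 15% spike allowance, checked every 23 seconds. |
| batch | YES | Sends telemetry in batches of 10,000 records, or after 13 seconds if a batch has not filled by then. |
| groupbyattrs | ❌ NO | Groups logs by service.name to enable per-service aggregation. |
| resource/watcher_receiver_tag | ❌ NO | Adds a watcher_direction: received resource attribute to received telemetry. |
| resource/watcher_exporter_tag | ❌ NO | Adds a watcher_direction: exported resource attribute to exported telemetry. |
| filter/severity | ❌ NO | Drops log records whose log_level attribute equals INFO (the filter processor drops records that match a condition). Defined in the sample but not used by either pipeline. |
| filter/service_list | ❌ NO | Drops log records whose service.name attribute matches the regex in ${env:SERVICE_LIST_REGEX}, so the affected services can be changed without editing the collector configuration. |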
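
To make the memory_limiter percentages concrete: the processor treats limit_percentage as a hard limit and begins refusing data at a soft limit equal to the hard limit minus the spike limit, here 75% - 15% = 60% of available memory. As a worked example, assuming a hypothetical container with 4000 MiB of memory, the percentage settings correspond roughly to these absolute values:

```yaml
processors:
  memory_limiter:
    check_interval: 23s
    # Assumed total memory: 4000 MiB (hypothetical, for the arithmetic only).
    limit_mib: 3000        # hard limit: 75% of 4000 MiB
    spike_limit_mib: 600   # 15% of 4000 MiB, so the soft limit is 3000 - 600 = 2400 MiB
```

The sample itself uses the percentage form, which adapts to whatever memory the container is given; the MiB form is shown here only to illustrate the arithmetic.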

Example Configuration:

    processors:
      memory_limiter:
        check_interval: 23s
        limit_percentage: 75
        spike_limit_percentage: 15
      batch:
        send_batch_size: 10000
        timeout: 13s
      groupbyattrs:
        keys:
          - service.name
      resource/watcher_receiver_tag:
        attributes:
          - key: watcher_direction
            value: "received"
            action: upsert
      resource/watcher_exporter_tag:
        attributes:
          - key: watcher_direction
            value: "exported"
            action: upsert
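      filter/severity:
        error_mode: ignore
        logs:
          log_record:
            # Log records that match a condition are dropped.
            - 'attributes["log_level"] == "INFO"'
      filter/service_list:
        error_mode: ignore
        logs:
          log_record:
            # Drops records whose service.name matches the regex from the environment.
            - 'IsMatch(attributes["service.name"], "${env:SERVICE_LIST_REGEX}")'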
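
Because the filter processor drops every record that matches one of its conditions, keeping only INFO logs requires the opposite comparison. A minimal sketch, where filter/keep_info_only is a hypothetical processor name that does not appear in the sample:

```yaml
processors:
  filter/keep_info_only:   # hypothetical name, not part of the MDAI sample
    error_mode: ignore
    logs:
      log_record:
        # Drop everything whose log_level is not INFO, so only INFO records remain.
        - 'attributes["log_level"] != "INFO"'
```

With this condition, records that have no log_level attribute are expected to be dropped as well. As with any processor, it only takes effect once a pipeline lists it.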

Exporters

Exporters send processed telemetry to external destinations.

| Exporter | Required | Destination |
| --- | --- | --- |
| debug | ❌ NO | Outputs telemetry to local logs for debugging. |
| otlp/watcher | YES | Sends telemetry to mdaihub-sample-watcher-collector-service via OTLP (4317). TLS is disabled (insecure: true). |

Example Configuration:

    exporters:
      debug: { }
      otlp/watcher:
        endpoint: mdaihub-sample-watcher-collector-service.mdai.svc.cluster.local:4317
        tls:
          insecure: true
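
The sample disables TLS toward the Watcher Collector and leaves the debug exporter at its defaults. If the receiving side is set up to serve TLS, a variant could look like the sketch below. The certificate path is hypothetical, and the verbosity setting is included only to show how to get more detailed debug output.

```yaml
exporters:
  debug:
    verbosity: detailed   # default is basic; detailed prints full record contents
  otlp/watcher:
    endpoint: mdaihub-sample-watcher-collector-service.mdai.svc.cluster.local:4317
    tls:
      insecure: false
      ca_file: /etc/otel/certs/ca.crt   # hypothetical path to the CA that signed the Watcher's certificate
```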

Service & Pipelines

The service section defines how telemetry flows through the collector.

| Pipeline | Required | Receivers | Processors | Exporters |
| --- | --- | --- | --- | --- |
| logs/customer_pipeline | YES | otlp, fluentforward | filter/service_list, memory_limiter, batch, groupbyattrs, resource/watcher_exporter_tag | debug, otlp/watcher |
| logs/watch_receivers | YES | otlp, fluentforward | memory_limiter, batch, groupbyattrs, resource/watcher_receiver_tag | debug, otlp/watcher |

  • The customer pipeline applies filter/service_list first, then groups logs and tags them with watcher_direction: exported before exporting.
  • The watch receivers pipeline applies no filter; it groups all received logs and tags them with watcher_direction: received before exporting.

Example Configuration:

    service:
      telemetry:
        metrics:
          address: ":8888"
      extensions:
        - health_check
      pipelines:
        logs/customer_pipeline:
          receivers: [ otlp, fluentforward ]
          processors: [ filter/service_list, memory_limiter, batch, groupbyattrs, resource/watcher_exporter_tag ]
          exporters: [ debug, otlp/watcher ]

        logs/watch_receivers:
          receivers: [ otlp, fluentforward ]
          processors: [ memory_limiter, batch, groupbyattrs, resource/watcher_receiver_tag ]
          exporters: [ debug, otlp/watcher ]
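
A processor only takes effect when a pipeline lists it, and processors run in the order given. In the sample, filter/severity is defined but neither pipeline references it. As an illustration only, not part of the MDAI sample, it could be enabled in the customer pipeline like this:

```yaml
service:
  pipelines:
    logs/customer_pipeline:
      receivers: [ otlp, fluentforward ]
      # filter/severity added after filter/service_list; the rest is unchanged.
      processors: [ filter/service_list, filter/severity, memory_limiter, batch, groupbyattrs, resource/watcher_exporter_tag ]
      exporters: [ debug, otlp/watcher ]
```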

Custom Config to Copy

apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  labels:
    mdaihub-name: <your-mdaihub-name>
  name: <your-collector-name>
  namespace: <your-namespace>
spec:
  image: <your-otel-collector-image>
  envFrom:
    - configMapRef:
        name: <your-configmap-name>
  config:
    receivers:
      fluentforward:
        endpoint: '${env:MY_POD_IP}:<fluentforward-port>'
      otlp:
        protocols:
          grpc:
            endpoint: '${env:MY_POD_IP}:<otlp-grpc-port>'
          http:
            endpoint: '${env:MY_POD_IP}:<otlp-http-port>'
            cors:
              allowed_origins:
                - "http://*"
                - "https://*"

    extensions:
      health_check:
        endpoint: "${env:MY_POD_IP}:<health-check-port>"

    processors:
      memory_limiter:
        check_interval: <time-interval>
        limit_percentage: <memory-limit-percent>
        spike_limit_percentage: <spike-limit-percent>

      batch:
        send_batch_size: <batch-size>
        timeout: <batch-timeout>

      groupbyattrs:
        keys:
          - <your-grouping-key>

      resource/custom_tag:
        attributes:
          - key: <your-key>
            value: <your-value>
            action: upsert

      filter/severity:
        error_mode: ignore
        logs:
          log_record:
            # Log records that match this condition are dropped.
            - 'attributes["log_level"] == "<your-log-level>"'

      filter/custom_filter:
        error_mode: ignore
        logs:
          log_record:
            # Log records whose attribute matches the regex are dropped.
            - 'IsMatch(attributes["<your-attribute>"], "${env:<your-env-variable>}")'

    exporters:
      debug: { }
      otlp/watcher:
        endpoint: <your-otlp-endpoint>
        tls:
          insecure: <true|false>

    service:
      telemetry:
        metrics:
          address: ":<metrics-port>"
      extensions:
        - health_check
      pipelines:
        logs/custom_pipeline:
          receivers: [ <your-receivers> ]
          processors: [ <your-processors> ]
          exporters: [ <your-exporters> ]