OpenTelemetry Collector Configuration
Overview
This guide covers OpenTelemetry Collector configuration for capturing and exporting Agent 365 telemetry to multiple destinations including Application Insights, Azure Monitor, and third-party SIEM systems.
Architecture
flowchart LR
A[Agent 365 SDK] -->|OTLP| B[OTel Collector]
B -->|Azure Monitor Exporter| C[Application Insights]
B -->|Azure Monitor Exporter| D[Log Analytics]
B -->|OTLP Exporter| E[Splunk/Datadog]
B -->|File Exporter| F[WORM Storage]
Prerequisites
- Azure subscription with monitoring resources
- Application Insights workspace
- Log Analytics workspace
- Network connectivity from agent hosting environment
Step 1: Deploy OpenTelemetry Collector
Azure Container Apps Deployment
Deploy the collector as an Azure Container App for scalability:
// otel-collector.bicep
resource containerApp 'Microsoft.App/containerApps@2023-05-01' = {
name: 'otel-collector-agents'
location: resourceGroup().location
properties: {
managedEnvironmentId: containerAppEnv.id
configuration: {
ingress: {
external: false
targetPort: 4317
transport: 'http2'
}
secrets: [
{
name: 'appinsights-connection'
value: appInsightsConnectionString
}
]
}
template: {
containers: [
{
name: 'otel-collector'
image: 'otel/opentelemetry-collector-contrib:latest'
resources: {
cpu: json('0.5')
memory: '1Gi'
}
env: [
{
name: 'APPLICATIONINSIGHTS_CONNECTION_STRING'
secretRef: 'appinsights-connection'
}
]
volumeMounts: [
{
volumeName: 'config'
mountPath: '/etc/otelcol-contrib'
}
]
}
]
volumes: [
{
name: 'config'
storageType: 'AzureFile'
storageName: 'otel-config'
}
]
scale: {
minReplicas: 2
maxReplicas: 10
}
}
}
}
Kubernetes Deployment
For AKS environments:
# otel-collector-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: otel-collector-agents
namespace: agent-governance
spec:
replicas: 2
selector:
matchLabels:
app: otel-collector
template:
metadata:
labels:
app: otel-collector
spec:
containers:
- name: otel-collector
image: otel/opentelemetry-collector-contrib:0.91.0
ports:
- containerPort: 4317 # OTLP gRPC
- containerPort: 4318 # OTLP HTTP
- containerPort: 8888 # Metrics
volumeMounts:
- name: config
mountPath: /etc/otelcol-contrib
env:
- name: APPLICATIONINSIGHTS_CONNECTION_STRING
valueFrom:
secretKeyRef:
name: otel-secrets
key: appinsights-connection
volumes:
- name: config
configMap:
name: otel-collector-config
---
apiVersion: v1
kind: Service
metadata:
name: otel-collector
namespace: agent-governance
spec:
selector:
app: otel-collector
ports:
- name: otlp-grpc
port: 4317
targetPort: 4317
- name: otlp-http
port: 4318
targetPort: 4318
Step 2: Configure Collector
Base Configuration
# otel-collector-config.yaml
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
http:
endpoint: 0.0.0.0:4318
processors:
# Batch processing for efficiency
batch:
timeout: 10s
send_batch_size: 1000
send_batch_max_size: 1500
# Memory limiter to prevent OOM
memory_limiter:
check_interval: 5s
limit_percentage: 80
spike_limit_percentage: 25
# Attribute enrichment for FSI governance
attributes:
actions:
- key: fsi.governance.version
action: upsert
value: "1.2.51"
- key: fsi.data.classification
action: upsert
from_attribute: agent.zone
# Resource detection
resourcedetection:
detectors: [env, azure]
timeout: 5s
# Filter sensitive data
filter:
error_mode: ignore
traces:
span:
# Remove PII from span names
- 'attributes["user.email"] != nil'
exporters:
# Azure Monitor / Application Insights
azuremonitor:
connection_string: ${env:APPLICATIONINSIGHTS_CONNECTION_STRING}
maxbatchsize: 100
maxbatchinterval: 10s
# Log Analytics for Sentinel integration
azuremonitor/logs:
connection_string: ${env:APPLICATIONINSIGHTS_CONNECTION_STRING}
instrumentation_key: ${env:APPINSIGHTS_INSTRUMENTATIONKEY}
# Debug logging (development only)
logging:
verbosity: detailed
sampling_initial: 5
sampling_thereafter: 200
# File export for WORM compliance
file:
path: /var/log/otel/agent-telemetry.json
rotation:
max_megabytes: 100
max_days: 1
max_backups: 7
localtime: true
service:
pipelines:
traces:
receivers: [otlp]
processors: [memory_limiter, batch, attributes, resourcedetection]
exporters: [azuremonitor, file]
metrics:
receivers: [otlp]
processors: [memory_limiter, batch, attributes]
exporters: [azuremonitor]
logs:
receivers: [otlp]
processors: [memory_limiter, batch, attributes, filter]
exporters: [azuremonitor/logs, file]
telemetry:
logs:
level: info
metrics:
address: 0.0.0.0:8888
Step 3: Zone-Specific Configuration
Zone 2 Configuration
# Zone 2: Standard telemetry with 90-day retention focus
processors:
attributes/zone2:
actions:
- key: fsi.zone
action: upsert
value: "Zone2"
- key: fsi.retention.days
action: upsert
value: "90"
# Sample to reduce volume (90% retention)
probabilistic_sampler:
sampling_percentage: 90
exporters:
azuremonitor/zone2:
connection_string: ${env:APPINSIGHTS_ZONE2_CONNECTION}
Zone 3 Configuration
# Zone 3: Full telemetry capture, no sampling
processors:
attributes/zone3:
actions:
- key: fsi.zone
action: upsert
value: "Zone3"
- key: fsi.retention.days
action: upsert
value: "2555" # 7 years
- key: fsi.regulatory.finra
action: upsert
value: "true"
# No sampling for Zone 3 - capture everything
# Remove probabilistic_sampler from pipeline
exporters:
# Primary: Application Insights
azuremonitor/zone3:
connection_string: ${env:APPINSIGHTS_ZONE3_CONNECTION}
# Secondary: Blob storage for WORM
azureblob:
container: agent-telemetry-archive
connection_string: ${env:STORAGE_CONNECTION}
partition: minute
file_prefix: zone3-
# Tertiary: Sentinel workspace
azuremonitor/sentinel:
connection_string: ${env:SENTINEL_WORKSPACE_CONNECTION}
Step 4: Agent SDK Integration
Copilot Studio Agent Configuration
Configure the Agent 365 SDK to export telemetry:
// ILLUSTRATIVE PSEUDOCODE - Verify against current Microsoft documentation
// Last verified: January 2026 | Status: Preview
// agent-telemetry-config.js
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-grpc');
const { OTLPMetricExporter } = require('@opentelemetry/exporter-metrics-otlp-grpc');
const { Resource } = require('@opentelemetry/resources');
const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions');
// Configure SDK
const sdk = new NodeSDK({
resource: new Resource({
[SemanticResourceAttributes.SERVICE_NAME]: process.env.AGENT_NAME,
[SemanticResourceAttributes.SERVICE_VERSION]: process.env.AGENT_VERSION,
'fsi.zone': process.env.GOVERNANCE_ZONE,
'fsi.blueprint.id': process.env.BLUEPRINT_ID,
'fsi.sponsor.id': process.env.SPONSOR_ID,
'fsi.environment': process.env.ENVIRONMENT_NAME
}),
traceExporter: new OTLPTraceExporter({
url: process.env.OTEL_COLLECTOR_ENDPOINT
}),
metricExporter: new OTLPMetricExporter({
url: process.env.OTEL_COLLECTOR_ENDPOINT
})
});
// Custom spans for agent operations
const tracer = sdk.trace.getTracer('agent-365-telemetry');
function instrumentAgentInteraction(interactionId, userId) {
return tracer.startActiveSpan('agent.interaction', {
attributes: {
'agent.interaction.id': interactionId,
'agent.user.id': userId,
'agent.timestamp': new Date().toISOString()
}
}, (span) => {
return {
recordToolCall: (toolName, duration, success) => {
span.addEvent('tool.invocation', {
'tool.name': toolName,
'tool.duration.ms': duration,
'tool.success': success
});
},
recordRaiFilter: (filterType, action) => {
span.addEvent('rai.filter', {
'rai.filter.type': filterType,
'rai.filter.action': action
});
},
end: (status) => {
span.setStatus({ code: status });
span.end();
}
};
});
}
module.exports = { sdk, instrumentAgentInteraction };
PowerShell Agent Instrumentation
For PowerShell-based agents:
# ILLUSTRATIVE PSEUDOCODE - Verify against current Microsoft documentation
# Last verified: January 2026 | Status: Preview
# Agent telemetry helper functions
function Initialize-AgentTelemetry {
param(
[string]$AgentName,
[string]$Zone,
[string]$CollectorEndpoint
)
$script:TelemetryConfig = @{
AgentName = $AgentName
Zone = $Zone
CollectorEndpoint = $CollectorEndpoint
SessionId = [Guid]::NewGuid().ToString()
}
}
function Send-AgentTrace {
param(
[string]$OperationName,
[hashtable]$Attributes,
[timespan]$Duration
)
$trace = @{
resourceSpans = @(
@{
resource = @{
attributes = @(
@{ key = "service.name"; value = @{ stringValue = $script:TelemetryConfig.AgentName } }
@{ key = "fsi.zone"; value = @{ stringValue = $script:TelemetryConfig.Zone } }
)
}
scopeSpans = @(
@{
scope = @{ name = "agent-powershell-telemetry" }
spans = @(
@{
traceId = [Convert]::ToBase64String([Guid]::NewGuid().ToByteArray())
spanId = [Convert]::ToBase64String([Guid]::NewGuid().ToByteArray()[0..7])
name = $OperationName
startTimeUnixNano = ([DateTimeOffset]::UtcNow.AddMilliseconds(-$Duration.TotalMilliseconds)).ToUnixTimeMilliseconds() * 1000000
endTimeUnixNano = [DateTimeOffset]::UtcNow.ToUnixTimeMilliseconds() * 1000000
attributes = $Attributes.GetEnumerator() | ForEach-Object {
@{ key = $_.Key; value = @{ stringValue = $_.Value.ToString() } }
}
}
)
}
)
}
)
}
try {
Invoke-RestMethod -Uri "$($script:TelemetryConfig.CollectorEndpoint)/v1/traces" `
-Method Post `
-ContentType "application/json" `
-Body ($trace | ConvertTo-Json -Depth 10)
} catch {
Write-Warning "Failed to send telemetry: $_"
}
}
Step 5: Verify Telemetry Flow
Health Check Queries
// Verify traces are flowing to Application Insights
traces
| where timestamp > ago(1h)
| where customDimensions.fsi_zone != ""
| summarize count() by bin(timestamp, 5m), tostring(customDimensions.fsi_zone)
| render timechart
// Check for telemetry gaps
traces
| where timestamp > ago(24h)
| summarize Count = count() by bin(timestamp, 1h)
| where Count < 10
| project GapTime = timestamp, Count
PowerShell Verification
# Test OTLP endpoint connectivity
function Test-OTelCollector {
param([string]$Endpoint)
try {
$response = Invoke-RestMethod -Uri "$Endpoint/health" -Method Get -TimeoutSec 5
Write-Host "Collector healthy: $($response.status)" -ForegroundColor Green
return $true
} catch {
Write-Host "Collector unreachable: $_" -ForegroundColor Red
return $false
}
}
# Verify telemetry in Application Insights
function Get-RecentAgentTelemetry {
param(
[string]$AppInsightsId,
[string]$AgentName,
[int]$Minutes = 60
)
$query = @"
traces
| where timestamp > ago($($Minutes)m)
| where customDimensions.service_name == '$AgentName'
| summarize TotalTraces = count(),
Zones = make_set(customDimensions.fsi_zone)
| project TotalTraces, Zones
"@
$result = Invoke-AzOperationalInsightsQuery -WorkspaceId $AppInsightsId -Query $query
return $result.Results
}
Troubleshooting
Common Issues
| Issue | Cause | Resolution |
|---|---|---|
| No telemetry in App Insights | Connection string incorrect | Verify env variable |
| High latency in traces | Batch size too large | Reduce send_batch_size |
| Missing attributes | Processor order wrong | Check processor pipeline order |
| OOM errors | No memory limiter | Add memory_limiter processor |
| Sampling artifacts | Wrong sampler | Use probabilistic_sampler for consistency |
Debug Mode
Enable verbose logging temporarily:
service:
telemetry:
logs:
level: debug
output_paths: [stdout, /var/log/otel/collector.log]
Related Resources
- Overview - Observability architecture
- Application Insights Workbooks - Dashboard templates
- Alerting Configuration - Alert rules
- Microsoft Learn: OpenTelemetry in Azure
FSI Agent Governance Framework v1.2.51 - January 2026