Channel Ingestion Setup

Conversation Analytics depends on two inputs: conversation content (audio or text) and consistent metadata (participants, timestamps, channel, queue/team, and so on).

This chapter describes how a platform operator should set up and validate ingestion across channels.

Ingestion goals (operator)

Ingest conversations reliably and securely.
Normalize metadata so reporting, filtering, and AI Tasks behave consistently.
Detect ingestion gaps early (monitoring/alerts).

Supported channel types (conceptual)

Voice calls (speech)

Typical inputs:
- audio recording(s)
- call metadata (direction, start/end time, agent ID, phone numbers, queue/campaign)

Outputs to downstream pipelines:
- raw audio + normalized metadata record

Text channels (omni-channel)

Typical inputs:
- chat transcript / message thread
- email thread (messages + headers)
- ticket thread (comments, updates, resolution status)
- metadata (agent/customer IDs, timestamps, channel, queue, tags)

Outputs to downstream pipelines:
- normalized text thread + metadata record

Note: Even in text channels, a “conversation” is treated as a single analyzable unit made up of a thread of messages/events.
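As a rough illustration of the "single analyzable unit" idea, the sketch below collapses raw messages into one time-ordered thread. The class names, the `build_thread` helper, and the input keys (`role`, `sent_at`, `text`) are hypothetical and are not part of any MiaRec API; timestamps are assumed to be ISO 8601 strings with an explicit UTC offset.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Message:
    sender_role: str   # e.g. "agent" or "customer" (assumed values)
    sent_at: datetime  # timezone-aware, normalized to UTC
    text: str


@dataclass
class TextConversation:
    conversation_id: str
    channel: str       # "chat", "email", or "ticket"
    messages: list[Message] = field(default_factory=list)

    @property
    def started_at(self) -> datetime:
        return self.messages[0].sent_at

    @property
    def ended_at(self) -> datetime:
        return self.messages[-1].sent_at


def build_thread(conversation_id, channel, raw_messages):
    """Collapse raw message dicts into one time-ordered conversation."""
    messages = [
        Message(
            sender_role=m["role"],
            # Assumes an explicit offset; a naive timestamp would be
            # interpreted as local time by astimezone().
            sent_at=datetime.fromisoformat(m["sent_at"]).astimezone(timezone.utc),
            text=m["text"].strip(),
        )
        for m in raw_messages
    ]
    messages.sort(key=lambda m: m.sent_at)
    return TextConversation(conversation_id, channel, messages)
```

Deriving the conversation start and end from the first and last message keeps the thread-level timestamps consistent with the message-level ones.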

Metadata normalization (recommended)

Define and enforce a standard schema so everything downstream can rely on it.

At minimum, capture:

Tenant ID
Conversation ID (stable)
Channel type (call/chat/email/ticket)
Start/end timestamps
Participants: agent identifier; customer identifier (if available)
Direction (inbound/outbound), where applicable
Queue/team/campaign (optional but strongly recommended)
Language (optional; transcription or AI may infer it)
Tags / dispositions (optional)
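One way to pin down such a schema is a typed record plus a small mapping function. The sketch below is illustrative only: the field names, the `Channel` and `Direction` enums, and the `normalize` helper are assumptions made for this example, not MiaRec's actual metadata contract.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class Channel(str, Enum):
    CALL = "call"
    CHAT = "chat"
    EMAIL = "email"
    TICKET = "ticket"


class Direction(str, Enum):
    INBOUND = "inbound"
    OUTBOUND = "outbound"


@dataclass(frozen=True)
class ConversationMetadata:
    # Minimum fields from the list above
    tenant_id: str
    conversation_id: str
    channel: Channel
    started_at: datetime
    ended_at: datetime
    agent_id: str
    # Optional or conditional fields
    customer_id: Optional[str] = None
    direction: Optional[Direction] = None
    queue: Optional[str] = None
    language: Optional[str] = None
    tags: tuple[str, ...] = ()


def normalize(raw: dict) -> ConversationMetadata:
    """Map a source-specific dict onto the standard record.

    Raises KeyError for a missing required field and ValueError for an
    unknown channel or direction value.
    """
    direction = raw.get("direction")
    return ConversationMetadata(
        tenant_id=raw["tenant_id"],
        conversation_id=str(raw["conversation_id"]),
        channel=Channel(raw["channel"].lower()),
        started_at=datetime.fromisoformat(raw["started_at"]),
        ended_at=datetime.fromisoformat(raw["ended_at"]),
        agent_id=str(raw["agent_id"]),
        customer_id=raw.get("customer_id"),
        direction=Direction(direction.lower()) if direction else None,
        queue=raw.get("queue"),
        language=raw.get("language"),
        tags=tuple(raw.get("tags", ())),
    )
```

Failing loudly on an unknown channel or a missing required field at ingestion time is usually cheaper than discovering inconsistent values later in reports and filters.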

Validation checks (before enabling analytics)

For each channel:
- ✅ content is present (audio or text thread)
- ✅ metadata fields are populated and consistent
- ✅ IDs are stable and unique (no duplicates)
- ✅ timestamps are correct (time zone handling)
- ✅ sample conversations appear in the UI for the intended tenant
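Most of these checks can be scripted against a sample batch before analytics is enabled. The following is a minimal sketch under stated assumptions: records are plain dicts, and the keys `audio_uri` and `messages` are hypothetical names for the content fields. The final check (conversations visible in the UI for the intended tenant) still has to be done by hand.

```python
from datetime import datetime

REQUIRED_FIELDS = ("tenant_id", "conversation_id", "channel", "started_at", "ended_at")


def validate_batch(records: list[dict]) -> list[str]:
    """Return human-readable problems found in a batch of ingested records."""
    problems = []
    seen = set()
    for i, rec in enumerate(records):
        label = rec.get("conversation_id", f"record #{i}")

        # Metadata fields are populated
        missing = [f for f in REQUIRED_FIELDS if not rec.get(f)]
        if missing:
            problems.append(f"{label}: missing {', '.join(missing)}")

        # Content is present (audio or text thread)
        if not (rec.get("audio_uri") or rec.get("messages")):
            problems.append(f"{label}: no content")

        # IDs are unique within a tenant
        key = (rec.get("tenant_id"), rec.get("conversation_id"))
        if key in seen:
            problems.append(f"{label}: duplicate conversation ID")
        seen.add(key)

        # Timestamps carry a time zone and are in order
        start, end = rec.get("started_at"), rec.get("ended_at")
        if isinstance(start, datetime) and isinstance(end, datetime):
            if start.tzinfo is None or end.tzinfo is None:
                problems.append(f"{label}: timestamp has no time zone")
            elif end < start:
                problems.append(f"{label}: ends before it starts")
    return problems
```

An empty result means the batch passed every scripted check; any returned message identifies the conversation and the check that failed.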

Operational best practices

Backfill support: Ensure you can ingest historical conversations for onboarding and reprocessing.
Deduplication: Protect against duplicate ingestion events.
PII handling: Decide whether redaction happens pre-ingestion or inside MiaRec pipelines (document the chosen approach).
Error handling: Use retries for transient failures and a dead-letter strategy for poison messages (events that keep failing no matter how often they are retried).
Observability: Monitor ingestion throughput, latency, and error rates per tenant and channel.
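The deduplication and error-handling practices above can be sketched together as a small in-memory pipeline. Everything here is hypothetical: the `IngestPipeline` class, its parameters, and the choice to skip duplicates are assumptions for illustration. A real deployment would use a durable key store and a real dead-letter queue, and its duplicate policy may differ.

```python
import time


class IngestPipeline:
    """In-memory sketch: dedup by (tenant, conversation ID), retry, dead-letter."""

    def __init__(self, handler, max_attempts=3, base_delay=0.5, sleep=time.sleep):
        self.handler = handler        # callable that stores one event
        self.max_attempts = max_attempts
        self.base_delay = base_delay  # seconds; doubled after each failure
        self.sleep = sleep            # injectable so tests need not wait
        self.seen = set()             # stand-in for a durable dedup store
        self.dead_letters = []        # stand-in for a dead-letter queue

    def ingest(self, event: dict) -> str:
        key = (event["tenant_id"], event["conversation_id"])
        if key in self.seen:
            return "duplicate"        # assumed policy: skip repeats
        for attempt in range(1, self.max_attempts + 1):
            try:
                self.handler(event)
            except Exception as exc:
                if attempt == self.max_attempts:
                    # Park the poison message instead of blocking the stream
                    self.dead_letters.append({"event": event, "error": repr(exc)})
                    return "dead-lettered"
                # Exponential backoff before the next attempt
                self.sleep(self.base_delay * 2 ** (attempt - 1))
            else:
                self.seen.add(key)
                return "ingested"
```

Recording the key only after the handler succeeds means a failed event can be replayed from the dead-letter store later without being mistaken for a duplicate.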

Implementation notes

MiaRec supports several voice ingestion methods, including:

SIPREC / PBX recordings
CCaaS integrations (Genesys, NICE, Five9, Amazon Connect, etc.)
Upload via API

For text channels (chat, email, tickets), support varies by deployment. Contact MiaRec for details on specific connector availability.

Document a minimum required metadata schema and a "recommended schema"
Strongly recommend stable conversation IDs and tenant IDs
Provide sample payloads (voice + text) and validation steps

EDITOR NOTE: fill in with product specifics

Purpose of this section

Operators need concrete connector/setup steps and the required metadata contract.

Missing / unclear (confirm with Engineering)

Ingestion methods
  A) Real-time streaming ingestion
  B) Batch ingestion (file drops)
  C) REST API ingestion
  D) Multiple of the above (specify)
Voice sources supported
  A) SIPREC / PBX recordings
  B) CCaaS integrations (e.g., Genesys, NICE, Five9, Amazon Connect)
  C) Zoom/Teams calling recordings
  D) Upload API only
  E) Other (list)
Text channel sources supported
  A) Native connectors (Zendesk / Salesforce / Intercom / etc.)
  B) Webhooks ingestion
  C) API upload
  D) Not supported yet
Required metadata fields
  A) Minimum required set is only tenant + conversation ID + content
  B) Direction and agent ID are required
  C) Queue/team is required
  D) Other (specify)
Deduplication strategy
  A) Conversation ID is unique key; duplicates overwrite
  B) Duplicates are rejected
  C) Duplicates are versioned