Message Reference

This document provides a complete reference for all WebSocket message types used in Dasha BlackBox agent communication. All messages are JSON objects sent over the WebSocket connection.

Message Structure

All messages follow this base structure:
{
  type: string;              // Message type identifier (required)
  timestamp: string;         // ISO 8601 timestamp (required)
  channelId?: string | null; // Channel identifier (optional)
  // ... type-specific fields
}
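As a minimal sketch of checking the base structure on an incoming frame, the helper below parses raw JSON and verifies the required `type` and `timestamp` fields. The names `BaseMessage` and `parseBaseMessage` are hypothetical and not part of the Dasha BlackBox API. Using `Date.parse` as the timestamp check is also an assumption; it is looser than strict ISO 8601 validation.

```typescript
// Base fields shared by every message, as described above.
interface BaseMessage {
  type: string;
  timestamp: string;
  channelId?: string | null;
  [key: string]: unknown; // type-specific fields
}

// Hypothetical helper: parse a raw WebSocket frame and check the base fields.
// Returns null when the frame is not JSON or a required field is missing.
function parseBaseMessage(raw: string): BaseMessage | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return null; // messages must be JSON objects
  }
  const msg = value as Record<string, unknown>;
  const type = msg["type"];
  const timestamp = msg["timestamp"];
  const channelId = msg["channelId"];
  if (typeof type !== "string") return null;
  if (typeof timestamp !== "string" || Number.isNaN(Date.parse(timestamp))) return null;
  if (channelId !== undefined && channelId !== null && typeof channelId !== "string") return null;
  return msg as BaseMessage;
}
```

A frame that passes this check can then be routed on `type` to a stricter, type-specific validator.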

Client → Server Messages

Messages sent from your application to the Dasha BlackBox server.

Initialize (initialize)

Start a new conversation with an agent.
Message Type: initialize
Schema:
{
  type: "initialize";
  timestamp: string; // ISO 8601
  channelId?: string | null;
  request?: {
    callType: "chat" | "webCall" | "onPhone";
    additionalData?: Record<string, unknown>;
    endpoint?: string | null;
  };
  devRequest?: SingleCallEnqueueRequestDto; // For dev endpoint only
}
Example:
{
  "type": "initialize",
  "timestamp": "2025-01-20T10:00:00Z",
  "request": {
    "callType": "chat",
    "additionalData": {
      "userId": "user123",
      "sessionId": "session456"
    }
  }
}
When to Use: Send immediately after WebSocket connection opens.
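One way to send `initialize` as soon as the socket opens is sketched below. `buildInitialize`, `initializeOnOpen`, and the `SocketLike` interface are hypothetical names introduced for illustration. `SocketLike` only models the two socket members the sketch needs, so it does not depend on any particular WebSocket library.

```typescript
type CallType = "chat" | "webCall" | "onPhone";

interface InitializeMessage {
  type: "initialize";
  timestamp: string;
  request: { callType: CallType; additionalData?: Record<string, unknown> };
}

// Hypothetical builder: stamps the message with an ISO 8601 timestamp.
function buildInitialize(
  callType: CallType,
  additionalData?: Record<string, unknown>,
  now: Date = new Date(),
): InitializeMessage {
  return {
    type: "initialize",
    timestamp: now.toISOString(),
    request: additionalData === undefined ? { callType } : { callType, additionalData },
  };
}

// Minimal structural view of a socket (an assumption for this sketch).
interface SocketLike {
  send(data: string): void;
  addEventListener(event: "open", listener: () => void): void;
}

// Registers a listener that sends `initialize` when the connection opens.
function initializeOnOpen(
  socket: SocketLike,
  callType: CallType,
  additionalData?: Record<string, unknown>,
): void {
  socket.addEventListener("open", () => {
    socket.send(JSON.stringify(buildInitialize(callType, additionalData)));
  });
}
```

Nothing is sent until the `open` event fires, which matches the guidance above.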

Incoming Chat Message (incomingChatMessage)

Send a text message to the agent.
Message Type: incomingChatMessage
Schema:
{
  type: "incomingChatMessage";
  content: string; // Message text
  timestamp: string; // ISO 8601
  channelId?: string | null;
}
Example:
{
  "type": "incomingChatMessage",
  "content": "Hello, agent!",
  "timestamp": "2025-01-20T10:00:00Z"
}
When to Use: Send user text messages during chat conversations.

SDP Answer (sdpAnswer)

Respond to a WebRTC SDP offer for voice calls.
Message Type: sdpAnswer
Schema:
{
  type: "sdpAnswer";
  timestamp: string; // ISO 8601
  channelId?: string | null;
  data: {
    sdpAnswer: string; // SDP answer string
  };
}
Example:
{
  "type": "sdpAnswer",
  "timestamp": "2025-01-20T10:00:00Z",
  "data": {
    "sdpAnswer": "v=0\r\no=- 1234567890 1234567890 IN IP4 0.0.0.0\r\n..."
  }
}
When to Use: Respond to sdpInvite messages during voice call setup.

WebSocket Tool Response (websocketToolResponse)

Return the result of a tool execution requested via WebSocket.
Message Type: websocketToolResponse
Schema:
{
  type: "websocketToolResponse";
  timestamp: string; // ISO 8601
  channelId?: string | null;
  content: {
    id: string; // Tool request ID
    result: unknown; // Tool execution result
  };
}
Example:
{
  "type": "websocketToolResponse",
  "timestamp": "2025-01-20T10:00:00Z",
  "content": {
    "id": "tool-request-123",
    "result": {
      "status": "success",
      "data": { "customerId": "12345" }
    }
  }
}
When to Use: Respond to websocketToolRequest messages.

Terminate (terminate)

End the conversation and close the connection.
Message Type: terminate
Schema:
{
  type: "terminate";
  timestamp: string; // ISO 8601
  channelId?: string | null;
}
Example:
{
  "type": "terminate",
  "timestamp": "2025-01-20T10:00:00Z"
}
When to Use: When user ends the conversation or you want to close the connection gracefully.

Server → Client Messages

Messages received from the Dasha BlackBox server.

Event (event)

System events indicating connection state and lifecycle.
Message Type: event
Schema:
{
  type: "event";
  timestamp: string; // ISO 8601
  channelId?: string | null;
  name: "connection" | "opened" | "closed" | "failedOpen";
  data?: {
    endpoint?: string | null;
    recordId?: string | null;
    startEstablishConnectionTimeISO?: string | null;
    endEstablishConnectionTimeISO?: string | null;
  };
}
Event Types:
  • connection: WebSocket connection established and agent session ready
  • opened: Conversation opened successfully
  • closed: Conversation closed normally
  • failedOpen: Failed to open conversation
Example:
{
  "type": "event",
  "name": "connection",
  "timestamp": "2025-01-20T10:00:00Z",
  "data": {
    "endpoint": "sip:user@example.com",
    "recordId": "record-123"
  }
}
When Received: Throughout the connection lifecycle to indicate state changes.
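A client can fold these events into a single state value. The sketch below is one possible reducer. The `SessionState` labels are illustrative and not part of the protocol, and treating `closed` and `failed` as terminal is an assumption.

```typescript
type EventName = "connection" | "opened" | "closed" | "failedOpen";

// Client-side labels; these names are illustrative, not part of the protocol.
type SessionState = "connecting" | "ready" | "open" | "closed" | "failed";

function reduceState(current: SessionState, name: EventName): SessionState {
  // Assumption: once closed or failed, later events do not revive the session.
  if (current === "closed" || current === "failed") return current;
  switch (name) {
    case "connection":
      return "ready";
    case "opened":
      return "open";
    case "closed":
      return "closed";
    case "failedOpen":
      return "failed";
  }
}

// Fold a sequence of event names into the resulting state.
const observedEvents: EventName[] = ["connection", "opened", "closed"];
const finalState = observedEvents.reduce(reduceState, "connecting" as SessionState);
console.log(finalState); // "closed"
```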

Text (text)

Text messages from the agent or the user (echo of sent messages).
Message Type: text
Schema:
{
  type: "text";
  timestamp: string; // ISO 8601
  channelId?: string | null;
  content: {
    source: "assistant" | "user";
    text: string | null;
    name?: string | null;
    // For user messages:
    type?: "potential" | "final" | "confident";
    segmentId?: string;
    startISOTimes?: string;
    endISOTimes?: string | null;
    // For assistant messages:
    synthStartISOTimes?: string | null;  // TTS synthesis start time
    synthEndISOTimes?: string | null;    // TTS synthesis end time
    playStartISOTimes?: string | null;   // Audio playback start time
    playEndISOTimes?: string | null;     // Audio playback end time
  };
}
Example - Agent Message:
{
  "type": "text",
  "timestamp": "2025-01-20T10:00:00Z",
  "content": {
    "source": "assistant",
    "text": "Hello! How can I help you today?",
    "name": null,
    "synthStartISOTimes": "2025-01-20T10:00:01Z",
    "synthEndISOTimes": "2025-01-20T10:00:03Z",
    "playStartISOTimes": "2025-01-20T10:00:03Z",
    "playEndISOTimes": "2025-01-20T10:00:05Z"
  }
}
Example - User Message (Echo):
{
  "type": "text",
  "timestamp": "2025-01-20T10:00:00Z",
  "content": {
    "source": "user",
    "text": "Hello, agent!",
    "name": null,
    "type": "final",
    "segmentId": "segment-123",
    "startISOTimes": "2025-01-20T10:00:00Z",
    "endISOTimes": "2025-01-20T10:00:01Z"
  }
}
When Received:
  • Agent responses to user messages
  • Echo of user messages you sent (for transcript)
  • Streaming/partial messages (with type: "potential")
Timing Fields Explained:
For assistant messages, timing fields help correlate audio events with the transcript:
  • synthStartISOTimes: When TTS synthesis began (text-to-speech generation started)
  • synthEndISOTimes: When TTS synthesis completed
  • playStartISOTimes: When audio playback began (user started hearing the response)
  • playEndISOTimes: When audio playback finished (user finished hearing the response)
For user messages, timing fields capture speech recognition timing:
  • startISOTimes: When user speech was first detected
  • endISOTimes: When user speech ended
  • segmentId: Unique identifier for the speech segment
  • type: Recognition confidence level (potential, confident, or final)
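The sketch below shows one way to build a transcript from `text` messages. It assumes that successive updates for the same utterance (for example a `potential` followed by a `final`) share a `segmentId`, so a later update replaces the earlier line. That behaviour is an assumption based on the field descriptions above. `TranscriptLine` and `applyText` are hypothetical names.

```typescript
interface UserText {
  source: "user";
  text: string | null;
  type: "potential" | "confident" | "final";
  segmentId: string;
}
interface AssistantText {
  source: "assistant";
  text: string | null;
}
type TextContent = UserText | AssistantText;

interface TranscriptLine {
  speaker: "assistant" | "user";
  text: string;
  final: boolean;
  segmentId?: string;
}

// Returns a new transcript with the message applied; the input is not mutated.
function applyText(lines: readonly TranscriptLine[], content: TextContent): TranscriptLine[] {
  const text = content.text;
  if (text === null) return [...lines]; // nothing to display
  if (content.source === "assistant") {
    return [...lines, { speaker: "assistant", text, final: true }];
  }
  const segmentId = content.segmentId;
  const updated: TranscriptLine = {
    speaker: "user",
    text,
    final: content.type === "final",
    segmentId,
  };
  // Assumption: a later update for the same segment replaces the earlier one.
  const index = lines.findIndex((line) => line.segmentId === segmentId);
  if (index === -1) return [...lines, updated];
  return lines.map((line, i) => (i === index ? updated : line));
}

let transcript: TranscriptLine[] = [];
transcript = applyText(transcript, { source: "user", text: "Hel", type: "potential", segmentId: "segment-123" });
transcript = applyText(transcript, { source: "user", text: "Hello, agent!", type: "final", segmentId: "segment-123" });
transcript = applyText(transcript, { source: "assistant", text: "Hello! How can I help you today?" });
console.log(transcript.map((line) => `${line.speaker}: ${line.text}`));
```

After the three updates the transcript holds two lines: the final user text and the assistant reply.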

SDP Invite (sdpInvite)

WebRTC SDP offer for voice call setup.
Message Type: sdpInvite
Schema:
{
  type: "sdpInvite";
  timestamp: string; // ISO 8601
  channelId?: string | null;
  data: {
    invite: string; // SDP offer string
  };
}
Example:
{
  "type": "sdpInvite",
  "timestamp": "2025-01-20T10:00:00Z",
  "data": {
    "invite": "v=0\r\no=- 1234567890 1234567890 IN IP4 0.0.0.0\r\n..."
  }
}
When Received: When starting a voice call (callType: "webCall"). Respond with sdpAnswer.
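The invite/answer exchange can be sketched with the standard WebRTC calls `setRemoteDescription`, `createAnswer`, and `setLocalDescription`. `PeerLike` is a hypothetical interface modeled on the subset of `RTCPeerConnection` used here, so the sketch can be exercised with a stand-in object. `answerInvite` is likewise a hypothetical name. Whether the server expects ICE candidates to be gathered before the answer is sent is not stated in this reference, so the sketch sends the answer immediately. Echoing `channelId` from the invite is also an assumption.

```typescript
interface SdpInviteMessage {
  type: "sdpInvite";
  timestamp: string;
  channelId?: string | null;
  data: { invite: string };
}
interface SdpAnswerMessage {
  type: "sdpAnswer";
  timestamp: string;
  channelId?: string | null;
  data: { sdpAnswer: string };
}

// Subset of RTCPeerConnection used by this sketch (an assumption for testability).
interface PeerLike {
  setRemoteDescription(desc: { type: "offer"; sdp: string }): Promise<void>;
  createAnswer(): Promise<{ type?: string; sdp?: string }>;
  setLocalDescription(desc: { type?: string; sdp?: string }): Promise<void>;
}

async function answerInvite(
  peer: PeerLike,
  invite: SdpInviteMessage,
  send: (message: SdpAnswerMessage) => void,
): Promise<void> {
  // Apply the server's offer, then create and apply the local answer.
  await peer.setRemoteDescription({ type: "offer", sdp: invite.data.invite });
  const answer = await peer.createAnswer();
  await peer.setLocalDescription(answer);
  if (!answer.sdp) throw new Error("createAnswer returned no SDP");

  const message: SdpAnswerMessage = {
    type: "sdpAnswer",
    timestamp: new Date().toISOString(),
    data: { sdpAnswer: answer.sdp },
  };
  // Assumption: echo the invite's channelId when one was provided.
  if (invite.channelId != null) message.channelId = invite.channelId;
  send(message);
}
```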

Tool Call (toolCall)

The agent is calling a tool or function.
Message Type: toolCall
Schema:
{
  type: "toolCall";
  timestamp: string; // ISO 8601
  channelId?: string | null;
  name: string; // Tool name
  args: Record<string, unknown>; // Tool arguments
  startISOTimes: string; // ISO 8601
  callId: string; // Unique call ID
}
Example:
{
  "type": "toolCall",
  "timestamp": "2025-01-20T10:00:00Z",
  "name": "get_customer_info",
  "args": {
    "customerId": "12345"
  },
  "startISOTimes": "2025-01-20T10:00:00Z",
  "callId": "call-123"
}
When Received: When agent decides to call a tool during conversation.

Tool Call Result (toolCallResult)

Result of tool execution.
Message Type: toolCallResult
Schema:
{
  type: "toolCallResult";
  timestamp: string; // ISO 8601
  channelId?: string | null;
  name: string; // Tool name
  result: unknown; // Tool execution result
  startISOTimes: string; // ISO 8601
  endISOTimes: string; // ISO 8601
  callId: string; // Unique call ID
}
Example:
{
  "type": "toolCallResult",
  "timestamp": "2025-01-20T10:00:05Z",
  "name": "get_customer_info",
  "result": {
    "customerId": "12345",
    "name": "John Doe",
    "email": "john@example.com"
  },
  "startISOTimes": "2025-01-20T10:00:00Z",
  "endISOTimes": "2025-01-20T10:00:05Z",
  "callId": "call-123"
}
When Received: After tool execution completes (via webhook or WebSocket).
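Because `toolCall` and `toolCallResult` share a `callId`, a client can pair them to see which calls are still in flight and how long each one took. The sketch below is one way to do that. `ToolCallTracker` is a hypothetical class, and the duration is computed from the `startISOTimes` and `endISOTimes` fields of the result message.

```typescript
interface ToolCall {
  type: "toolCall";
  name: string;
  callId: string;
  startISOTimes: string;
}
interface ToolCallResult {
  type: "toolCallResult";
  name: string;
  callId: string;
  startISOTimes: string;
  endISOTimes: string;
  result: unknown;
}
interface CompletedCall {
  name: string;
  durationMs: number;
  matched: boolean; // false when no earlier toolCall with this callId was seen
}

class ToolCallTracker {
  private pending = new Map<string, ToolCall>();

  // Returns null for a toolCall and a summary for a toolCallResult.
  onMessage(msg: ToolCall | ToolCallResult): CompletedCall | null {
    if (msg.type === "toolCall") {
      this.pending.set(msg.callId, msg);
      return null;
    }
    const matched = this.pending.delete(msg.callId);
    const durationMs = Date.parse(msg.endISOTimes) - Date.parse(msg.startISOTimes);
    return { name: msg.name, durationMs, matched };
  }

  pendingCallIds(): string[] {
    return Array.from(this.pending.keys());
  }
}
```

With the example messages above (start `10:00:00Z`, end `10:00:05Z`), the reported duration is 5000 ms.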

WebSocket Tool Request (websocketToolRequest)

Tool execution request via WebSocket (alternative to webhooks).
Message Type: websocketToolRequest
Schema:
{
  type: "websocketToolRequest";
  timestamp: string; // ISO 8601
  channelId: string | null; // Channel identifier
  content: {
    id: string; // Request ID (use in response)
    toolName: string; // Tool name
    args: unknown; // Tool arguments
  };
}
Example:
{
  "type": "websocketToolRequest",
  "timestamp": "2025-01-20T10:00:00Z",
  "channelId": null,
  "content": {
    "id": "tool-request-123",
    "toolName": "get_customer_info",
    "args": {
      "customerId": "12345"
    }
  }
}
When Received: When agent calls a tool configured for WebSocket execution. Respond with websocketToolResponse.
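A minimal sketch of the request/response round trip follows. It looks up a handler by `toolName`, runs it, and builds a `websocketToolResponse` that reuses the request `id`. `handleToolRequest` and the handler registry are hypothetical. The `{ status: "error", message }` shape used for unknown tools and thrown errors is an assumption, since this reference does not define an error format for tool results. Echoing `channelId` is likewise an assumption.

```typescript
interface WebsocketToolRequest {
  type: "websocketToolRequest";
  timestamp: string;
  channelId: string | null;
  content: { id: string; toolName: string; args: unknown };
}
interface WebsocketToolResponse {
  type: "websocketToolResponse";
  timestamp: string;
  channelId: string | null;
  content: { id: string; result: unknown };
}

// A handler may return a plain value or a promise; both are awaited below.
type ToolHandler = (args: unknown) => unknown;

async function handleToolRequest(
  request: WebsocketToolRequest,
  handlers: Map<string, ToolHandler>,
): Promise<WebsocketToolResponse> {
  const { id, toolName, args } = request.content;
  const handler = handlers.get(toolName);
  let result: unknown;
  try {
    result = handler
      ? await handler(args)
      : { status: "error", message: `Unknown tool: ${toolName}` }; // assumed error shape
  } catch (err) {
    result = { status: "error", message: err instanceof Error ? err.message : String(err) };
  }
  return {
    type: "websocketToolResponse",
    timestamp: new Date().toISOString(),
    channelId: request.channelId, // assumption: echo the request's channel
    content: { id, result },
  };
}
```

Always answering, even on failure, avoids leaving the agent waiting on a request that will never complete.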

Conversation Result (conversationResult)

Final conversation summary and metadata.
Message Type: conversationResult
Schema:
{
  type: "conversationResult";
  timestamp: string; // ISO 8601
  channelId?: string | null;
  result: Record<string, unknown>; // Conversation result payload
}
Example:
{
  "type": "conversationResult",
  "timestamp": "2025-01-20T10:05:00Z",
  "result": {
    "status": "completed",
    "duration": 300,
    "transcript": "...",
    "summary": "Customer inquired about product features",
    "metadata": {
      "callId": "call-123",
      "agentId": "agent-456"
    }
  }
}
When Received: When conversation ends (after terminate or natural completion).

Error (error)

Error message from server.
Message Type: error
Schema:
{
  type: "error";
  timestamp: string; // ISO 8601
  channelId?: string | null;
  data: {
    message: string; // Required error message
  };
}
Example:
{
  "type": "error",
  "timestamp": "2025-01-20T10:00:00Z",
  "channelId": null,
  "data": {
    "message": "Invalid token"
  }
}
When Received: When errors occur (authentication, validation, server errors, etc.).
Common Error Messages:
  • "Invalid token" - Authentication failed
  • "Agent not found" - Invalid agent reference
  • "Rate limit exceeded" - Too many requests
  • "Connection timeout" - Connection took too long
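As a rough sketch, a client might map the error messages listed above to a follow-up action. The action names and the choice of which errors are worth retrying are assumptions, not documented server behaviour. Matching on exact message strings is also fragile if the server wording changes.

```typescript
interface ErrorMessage {
  type: "error";
  timestamp: string;
  channelId?: string | null;
  data: { message: string };
}

// Hypothetical client-side actions.
type ErrorAction = "reauthenticate" | "fixConfiguration" | "retryLater" | "report";

function classifyError(msg: ErrorMessage): ErrorAction {
  switch (msg.data.message) {
    case "Invalid token":
      return "reauthenticate";
    case "Agent not found":
      return "fixConfiguration";
    case "Rate limit exceeded":
    case "Connection timeout":
      return "retryLater"; // assumption: these are transient
    default:
      return "report"; // anything not listed above
  }
}
```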

Message Flow Examples

Chat Conversation Flow

Voice Call Flow

Tool Execution Flow

TypeScript Types

Define types based on the message schemas:
// Define message types based on schemas
type AssistantTextContent = {
  source: 'assistant';
  text: string | null;
  name: string | null;
  synthStartISOTimes?: string | null;  // TTS synthesis start
  synthEndISOTimes?: string | null;    // TTS synthesis end
  playStartISOTimes?: string | null;   // Audio playback start
  playEndISOTimes?: string | null;     // Audio playback end
};

type UserTextContent = {
  source: 'user';
  text: string | null;
  name: string | null;
  type: 'potential' | 'confident' | 'final';
  segmentId: string;
  startISOTimes: string;
  endISOTimes?: string | null;
};

type TextHistoryMessage = {
  type: 'text';
  timestamp: string;
  channelId?: string | null;
  content: AssistantTextContent | UserTextContent;
};

type ToolCallHistoryMessage = {
  type: 'toolCall';
  timestamp: string;
  channelId?: string | null;
  name: string;
  args: Record<string, unknown>;
  startISOTimes: string;
  callId: string;
};

type AgentWebSocketMessage = 
  | TextHistoryMessage
  | ToolCallHistoryMessage
  | { type: 'event'; name: string; /* ... */ }
  | { type: 'sdpInvite'; data: { invite: string }; /* ... */ }
  | { type: 'error'; /* ... */ }
  // ... other message types
  ;

// Type guard example
function isTextMessage(msg: AgentWebSocketMessage): msg is TextHistoryMessage {
  return msg.type === 'text';
}

// Handle messages
function handleMessage(msg: AgentWebSocketMessage) {
  switch (msg.type) {
    case 'text':
      // msg is typed as TextHistoryMessage
      console.log(msg.content.text);
      break;
    case 'toolCall':
      // msg is typed as ToolCallHistoryMessage
      console.log(msg.name, msg.args);
      break;
    // ... other cases
  }
}
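When the union of message types is closed, a `never`-typed default branch makes the compiler flag any message type the switch forgets to handle. The sketch below uses a reduced, hypothetical `Message` union rather than the full `AgentWebSocketMessage` type. `assertNever` and `describeMessage` are illustrative names.

```typescript
type Message =
  | { type: "text"; content: { text: string | null } }
  | { type: "toolCall"; name: string; args: Record<string, unknown> }
  | { type: "error"; data: { message: string } };

// Compile-time exhaustiveness check; at runtime it throws if an
// unexpected value reaches it (for example an unvalidated server message).
function assertNever(value: never): never {
  throw new Error(`Unhandled message: ${JSON.stringify(value)}`);
}

function describeMessage(msg: Message): string {
  switch (msg.type) {
    case "text":
      return `text: ${msg.content.text ?? ""}`;
    case "toolCall":
      return `toolCall: ${msg.name}`;
    case "error":
      return `error: ${msg.data.message}`;
    default:
      // Adding a new member to Message without a case makes this line a type error.
      return assertNever(msg);
  }
}
```

Since the server may add message types over time, it is safer to validate and filter unknown types before they reach a switch like this.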

Message Validation

Validate every message before processing it. The example below uses Zod schemas:
import { z } from 'zod';

// Base text content schema
const TextHistoryContent = z.object({
  source: z.string().nullish(),
  name: z.string().nullable(),
  text: z.string().nullable(),
});

// Assistant message content with timing fields
const AssistantTextContent = TextHistoryContent.extend({
  source: z.literal('assistant'),
  synthStartISOTimes: z.string().nullish(),
  synthEndISOTimes: z.string().nullish(),
  playStartISOTimes: z.string().nullish(),  // Audio playback start
  playEndISOTimes: z.string().nullish(),    // Audio playback end
});

// User message content with speech recognition fields
const UserTextContent = TextHistoryContent.extend({
  source: z.literal('user'),
  type: z.enum(['potential', 'confident', 'final']),
  startISOTimes: z.string().min(1),
  endISOTimes: z.string().nullish(),
  segmentId: z.string().min(1),
});

// Complete text message schema
const TextMessageSchema = z.object({
  type: z.literal('text'),
  timestamp: z.string().datetime({ offset: true }),
  channelId: z.string().nullish(),
  content: z.union([AssistantTextContent, UserTextContent]),
});

// Validate message (only `text` is covered here; add one schema per message type)
function validateMessage(data: unknown): AgentWebSocketMessage | null {
  // safeParse reports failures as a result object instead of throwing
  const parsed = TextMessageSchema.safeParse(data);
  if (!parsed.success) {
    console.error('Invalid message:', parsed.error.issues);
    return null;
  }
  return parsed.data;
}
