Monitor Concurrent Call Capacity - Dasha BlackBox Documentation

Know how many calls you can handle simultaneously. Monitor utilization in real time, set up alerts for capacity limits, and scale before calls start queueing. What you’ll learn: Concurrency metrics, utilization thresholds, queue status, alerting setup, and scaling strategies.

Screenshots may differ from current UI version.

Understanding concurrency

Concurrency is the number of simultaneous active calls across your organization or agent. Each active call consumes one slot from your concurrency limit until it completes.

Key metrics

Metric	Definition	Calculation
Active calls	Calls currently in progress (Running status)	Count of calls where status = Running
Concurrency limit	Maximum simultaneous calls allowed	Set by subscription tier
Utilization	Percentage of limit currently in use	(Active Calls / Limit) × 100
Queue depth	Calls waiting for available slots	Count of calls where status = Queued
Available capacity	Slots available for new calls	Limit - Active Calls

Utilization thresholds

Definition: Percentage of concurrency capacity currently in use.Calculation method:

Utilization % = (Active Calls / Concurrency Limit) × 100

Performance thresholds:

Utilization	Status	Interpretation	Recommended action
0-70%	Healthy	Normal operations, ample headroom	Monitor normally
70-85%	Elevated	Approaching capacity	Monitor closely, plan scaling
85-95%	Warning	Limited headroom for spikes	Consider scaling soon
95-100%	Critical	Near or at capacity	Calls may queue, scale immediately
100%	At limit	All slots in use	New calls queue until slots free

Running at 100% utilization for extended periods degrades service quality. Calls queue longer, deadlines may expire, and inbound callers may abandon. Maintain at least 15-20% buffer capacity above typical peak usage.

View concurrency status

Dashboard
API

Navigate to Dashboard home page
View Active Calls section showing current utilization
Color indicators:
- 🟢 Green: Under 70% utilization
- 🟡 Yellow: 70-90% utilization
- 🔴 Red: Over 90% utilization

Get current concurrency status:

const { active, concurrency } = await fetch(
  'https://blackbox.dasha.ai/api/v1/misc/concurrency',
  { headers: { 'Authorization': 'Bearer YOUR_API_KEY' } }
).then(r => r.json());

const utilization = (active / concurrency * 100).toFixed(1);

console.log(`Active: ${active}/${concurrency} (${utilization}%)`);

Queue status

Monitor calls waiting in queue:

// Get queued calls count
const queueResponse = await fetch(
  'https://blackbox.dasha.ai/api/v1/calls?status=Queued&take=1',
  { headers: { 'Authorization': 'Bearer YOUR_API_KEY' } }
).then(r => r.json());

console.log(`Calls waiting in queue: ${queueResponse.totalCount}`);

Combined status check

async function getConcurrencyStatus() {
  const [concurrencyData, queued, pending] = await Promise.all([
    fetch('https://blackbox.dasha.ai/api/v1/misc/concurrency', {
      headers: { 'Authorization': 'Bearer YOUR_API_KEY' }
    }).then(r => r.json()),

    fetch('https://blackbox.dasha.ai/api/v1/calls?status=Queued&take=1', {
      headers: { 'Authorization': 'Bearer YOUR_API_KEY' }
    }).then(r => r.json()),

    fetch('https://blackbox.dasha.ai/api/v1/calls?status=Pending&take=1', {
      headers: { 'Authorization': 'Bearer YOUR_API_KEY' }
    }).then(r => r.json())
  ]);

  const { active, concurrency } = concurrencyData;

  return {
    activeCalls: active,
    queuedCalls: queued.totalCount,
    pendingCalls: pending.totalCount,
    concurrencyLimit: concurrency,
    utilization: ((active / concurrency) * 100).toFixed(1) + '%',
    availableSlots: Math.max(0, concurrency - active)
  };
}

const status = await getConcurrencyStatus();
console.log('Concurrency Status:', status);

Concurrency metrics

Active calls

Definition: Number of calls currently in Running status (actively connected and processing).Calculation method:

Active Calls = Count of calls where status = 'Running'

Typical range: 0 to concurrency limit. What high values indicate:

Peak usage periods
Bulk campaign in progress
Long average call durations

Queue depth

Definition: Number of calls waiting for an available slot to start processing.Calculation method:

Queue Depth = Count of calls where status IN ('Created', 'Queued', 'Pending')

Performance thresholds:

Depth	Status	Interpretation
0	Optimal	No waiting calls
1-10	Acceptable	Brief waits expected
10-50	Elevated	Monitor for growth
50-100	High	Extended wait times likely
> 100	Critical	Significant delays, scale needed

Wait time

Definition: Time calls spend in queue before processing begins.Estimated calculation:

Estimated Wait Time = (Queue Depth / Active Calls) × Average Call Duration

What long wait times indicate:

Insufficient concurrency for volume
Long call durations consuming slots
Bulk campaigns overwhelming capacity

Concurrency limits by plan

Limits vary by subscription plan:

Plan	Concurrent Lines	Minutes/Month
Developer	1	1,000
Growth	Multiple	Unlimited

Check your specific limits in Account Settings → Billing & Usage. Contact sales for Growth plan pricing.

What happens at capacity

When all concurrency slots are in use:

New calls enter queue — Status changes to “Queued”
Queue processed by priority — Higher priority calls processed first
Calls start as slots free — First queued call gets next available slot
Deadlines enforced — Calls exceeding deadline are auto-canceled

Impact on service quality

Metric	At capacity impact	Mitigation
Wait time	Increases with queue depth	Scale concurrency
Deadline expirations	Queued calls may timeout	Set longer deadlines
Inbound abandonment	Callers may hang up waiting	Reserve capacity for inbound
Campaign velocity	Outbound campaigns slow down	Stagger campaign scheduling

Priority during capacity constraints

Calls are processed by priority value (lower values processed first), then by deadline:

Priority	Behavior at capacity
0-1 (Highest)	Processed first when slots free
2-4 (Normal)	Processed after higher priority calls
5+ (Lower)	May wait longer during high utilization

Within the same priority level, calls closer to their deadline are processed first.

Monitor utilization

Continuous monitoring script

async function monitorConcurrency(intervalMs = 60000) {
  async function check() {
    const { active, concurrency } = await fetch(
      'https://blackbox.dasha.ai/api/v1/misc/concurrency',
      { headers: { 'Authorization': 'Bearer YOUR_API_KEY' } }
    ).then(r => r.json());

    const utilization = (active / concurrency * 100);
    const timestamp = new Date().toISOString();

    console.log(`[${timestamp}] ${active}/${concurrency} (${utilization.toFixed(1)}%)`);

    // Alert on high utilization
    if (utilization >= 95) {
      console.error('CRITICAL: At capacity limit');
    } else if (utilization >= 85) {
      console.warn('WARNING: High utilization - consider scaling');
    } else if (utilization >= 70) {
      console.warn('NOTICE: Elevated utilization');
    }

    return { timestamp, active, concurrency, utilization };
  }

  // Initial check
  await check();

  // Continuous monitoring
  setInterval(check, intervalMs);
}

// Monitor every minute
monitorConcurrency(60000);

Per-agent utilization

Track utilization for specific agents:

async function getAgentUtilization(agentId) {
  const response = await fetch(
    `https://blackbox.dasha.ai/api/v1/calls?agentId=${agentId}&status=Running&take=1`,
    { headers: { 'Authorization': 'Bearer YOUR_API_KEY' } }
  ).then(r => r.json());

  return {
    agentId,
    activeCalls: response.totalCount
  };
}

// Check multiple agents
const agentIds = ['agent-1', 'agent-2', 'agent-3'];

const agentStats = await Promise.all(
  agentIds.map(id => getAgentUtilization(id))
);

console.table(agentStats);

Historical utilization analysis

Analyze patterns over time:

async function analyzeUtilizationHistory() {
  // Get recent call results to analyze duration patterns
  const results = await fetch(
    'https://blackbox.dasha.ai/api/v1/call-results?take=500',
    { headers: { 'Authorization': 'Bearer YOUR_API_KEY' } }
  ).then(r => r.json());

  // Group by hour
  const byHour = {};
  results.items.forEach(call => {
    const hour = new Date(call.createdTime).getHours();
    if (!byHour[hour]) byHour[hour] = { count: 0, totalDuration: 0 };
    byHour[hour].count++;
    byHour[hour].totalDuration += call.durationSeconds;
  });

  // Calculate peak hours
  const sortedHours = Object.entries(byHour)
    .map(([hour, data]) => ({
      hour: parseInt(hour),
      callCount: data.count,
      avgDuration: (data.totalDuration / data.count / 60).toFixed(1) + ' min'
    }))
    .sort((a, b) => b.callCount - a.callCount);

  console.log('Peak hours by call volume:');
  console.table(sortedHours.slice(0, 5));

  return sortedHours;
}

Set up alerts

Polling-based alerting

async function capacityAlert(thresholds = { warning: 85, critical: 95 }) {
  const { active, concurrency } = await fetch(
    'https://blackbox.dasha.ai/api/v1/misc/concurrency',
    { headers: { 'Authorization': 'Bearer YOUR_API_KEY' } }
  ).then(r => r.json());

  const utilization = (active / concurrency) * 100;

  if (utilization >= thresholds.critical) {
    await sendAlert({
      severity: 'critical',
      message: `Concurrency at ${utilization.toFixed(1)}%`,
      active,
      limit: concurrency,
      action: 'Calls are queueing. Consider immediate scaling.'
    });
  } else if (utilization >= thresholds.warning) {
    await sendAlert({
      severity: 'warning',
      message: `Concurrency at ${utilization.toFixed(1)}%`,
      active,
      limit: concurrency,
      action: 'Monitor closely. Plan scaling if trend continues.'
    });
  }
}

// Check every minute
setInterval(() => capacityAlert(), 60000);

Prometheus metrics exporter

Export metrics for Prometheus monitoring:

const express = require('express');
const app = express();

app.get('/metrics', async (req, res) => {
  const [concurrencyData, queued] = await Promise.all([
    fetch('https://blackbox.dasha.ai/api/v1/misc/concurrency', {
      headers: { 'Authorization': 'Bearer YOUR_API_KEY' }
    }).then(r => r.json()),

    fetch('https://blackbox.dasha.ai/api/v1/calls?status=Queued&take=1', {
      headers: { 'Authorization': 'Bearer YOUR_API_KEY' }
    }).then(r => r.json())
  ]);

  const { active, concurrency } = concurrencyData;

  const metrics = `
# HELP blackbox_concurrency_limit Maximum concurrent calls allowed
# TYPE blackbox_concurrency_limit gauge
blackbox_concurrency_limit ${concurrency}

# HELP blackbox_active_calls Current number of active calls
# TYPE blackbox_active_calls gauge
blackbox_active_calls ${active}

# HELP blackbox_queued_calls Current number of queued calls
# TYPE blackbox_queued_calls gauge
blackbox_queued_calls ${queued.totalCount}

# HELP blackbox_utilization_percent Current utilization percentage
# TYPE blackbox_utilization_percent gauge
blackbox_utilization_percent ${(active / concurrency * 100).toFixed(2)}
  `.trim();

  res.set('Content-Type', 'text/plain').send(metrics);
});

app.listen(9090, () => {
  console.log('Prometheus metrics available at http://localhost:9090/metrics');
});

Alert thresholds reference

Condition	Threshold	Severity	Response
Utilization spike	> 95%	Critical	Immediate investigation
Sustained high	> 85% for 15 min	Warning	Plan scaling
Queue buildup	> 50 calls	Warning	Review call durations
Unexpected drop	< 10% of normal	Info	Verify system health

Scaling strategies

Increase concurrency limits

Upgrade subscription tier:

Navigate to Account Settings → Billing
Click Upgrade Plan
Select tier with higher concurrency
New limits active immediately

Enterprise custom limits: Contact support for limits above standard tiers.

Multi-agent load distribution

Distribute load across multiple agents:

async function selectLeastLoadedAgent(agentIds) {
  const loads = await Promise.all(
    agentIds.map(async (agentId) => {
      const response = await fetch(
        `https://blackbox.dasha.ai/api/v1/calls?agentId=${agentId}&status=Running&take=1`,
        { headers: { 'Authorization': 'Bearer YOUR_API_KEY' } }
      ).then(r => r.json());

      return {
        agentId,
        activeCalls: response.totalCount
      };
    })
  );

  // Sort by active calls (ascending)
  loads.sort((a, b) => a.activeCalls - b.activeCalls);

  console.log('Agent loads:', loads);
  return loads[0].agentId; // Return least loaded agent
}

// Use when enqueuing calls
const agents = ['agent-support', 'agent-sales', 'agent-general'];
const targetAgent = await selectLeastLoadedAgent(agents);

await fetch(`https://blackbox.dasha.ai/api/v1/calls/enqueue/${targetAgent}`, {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    endpoint: '+1-555-123-4567'
  })
});

Optimize call duration

Shorter calls free slots faster:

// Agent configuration for efficient calls
{
  "config": {
    "llmConfig": {
      "temperature": 0.6       // Focused outputs
    },
    "features": {
      "maxCallDuration": 300,  // 5 minute maximum
      "silenceTimeout": 15     // End on extended silence
    }
  }
}

Stagger bulk campaigns

Avoid overwhelming capacity with large campaigns:

async function staggeredEnqueue(agentId, calls, options = {}) {
  const { batchSize = 50, delayMs = 5 * 60 * 1000 } = options;

  for (let i = 0; i < calls.length; i += batchSize) {
    const batch = calls.slice(i, i + batchSize);

    // Enqueue batch
    await fetch(`https://blackbox.dasha.ai/api/v1/calls/enqueue/${agentId}/batch`, {
      method: 'POST',
      headers: {
        'Authorization': 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json'
      },
      body: JSON.stringify(batch)
    });

    console.log(`Enqueued batch ${Math.floor(i / batchSize) + 1}: ${batch.length} calls`);

    // Wait before next batch (unless last batch)
    if (i + batchSize < calls.length) {
      console.log(`Waiting ${delayMs / 1000}s before next batch...`);
      await new Promise(resolve => setTimeout(resolve, delayMs));
    }
  }
}

// Example: Stagger 500 calls in batches of 50, 5 minutes apart
const callList = Array.from({ length: 500 }, (_, i) => ({
  endpoint: `+1-555-${String(i).padStart(7, '0')}`
}));

await staggeredEnqueue('agent-campaign', callList, {
  batchSize: 50,
  delayMs: 5 * 60 * 1000
});

Best practices

Capacity planning

Practice	Recommendation
Buffer capacity	Maintain 20% headroom above typical peak
Monitor trends	Track daily and weekly utilization patterns
Plan ahead	Scale before campaigns, not during
Test limits	Verify behavior at capacity before production

Load management

Practice	Recommendation
Priority tiers	Reserve high priority for time-sensitive calls
Stagger campaigns	Batch large campaigns with delays
Balance agents	Distribute load across multiple agents
Monitor duration	Investigate unusually long calls

Alert configuration

Alert	Threshold	Check interval
Critical capacity	> 95% utilization	Every 1 minute
High utilization	> 85% for 15 minutes	Every 5 minutes
Queue buildup	> 50 queued calls	Every 2 minutes
Duration anomaly	Calls > 2× average duration	Every 10 minutes

Troubleshooting

Calls stuck in queue

Symptoms: Calls remain in Queued status for extended periods.Causes:

All concurrency slots in use
Agent disabled or outside business hours
Long-running calls consuming all slots

Solutions:

Check current utilization — verify you’re at capacity
Review active call durations — identify unusually long calls
Verify agent is enabled and within schedule
Increase concurrency limit or add agents
Check for stuck calls in Running status that should have ended

Unexpected utilization spikes

Symptoms: Sudden increase in active calls not matching expected volume.Causes:

Bulk campaign started
Multiple systems scheduling simultaneously
Webhook retry storms
Inbound call surge

Solutions:

Review recent call scheduling patterns
Check if bulk campaigns started unintentionally
Implement rate limiting in scheduling systems
Stagger scheduled calls with delays
Add capacity checks before scheduling

Frequently hitting limits

Symptoms: Regularly reaching 100% utilization.Causes:

Insufficient capacity for volume
Long average call durations
Poor load distribution

Solutions:

Upgrade to higher tier for more concurrency
Add agents for load distribution
Optimize prompts for shorter conversations
Set maximum call duration limits
Implement application-level rate limiting

Uneven agent utilization

Symptoms: Some agents at capacity while others idle.Causes:

Fixed agent assignment
Uneven scheduling distribution
Different agent schedules

Solutions:

Implement load balancing when scheduling
Route calls to least-loaded agent
Align agent schedules with expected volume
Consider using agent pools for similar use cases

What’s next

Call History

Manage scheduled call queue

Outbound Calls

Schedule outbound campaigns

​Understanding concurrency

​Key metrics

​View concurrency status

​Concurrency metrics

​Concurrency limits by plan

​What happens at capacity

​Impact on service quality

​Priority during capacity constraints

​Monitor utilization

​Set up alerts

​Alert thresholds reference

​Scaling strategies

​Increase concurrency limits

​Best practices

​Capacity planning

​Load management

​Alert configuration

​Troubleshooting

​What’s next

Call History

Outbound Calls

Understanding concurrency

Key metrics

View concurrency status

Concurrency metrics

Concurrency limits by plan

What happens at capacity

Impact on service quality

Priority during capacity constraints

Monitor utilization

Set up alerts

Alert thresholds reference

Scaling strategies

Increase concurrency limits

Best practices

Capacity planning

Load management

Alert configuration

Troubleshooting

What’s next