Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.blackbox.dasha.ai/llms.txt

Use this file to discover all available pages before exploring further.

Your agent’s language model determines how it understands context, generates responses, and makes decisions. Dasha BlackBox supports multiple LLM providers, each with different strengths for voice conversations. What you’ll learn: Choosing a provider, configuring temperature, using custom endpoints, and optimizing for voice.
LLM latency typically accounts for 40–60% of total response time. We’ve measured this across millions of production calls — choosing the right provider and settings directly impacts conversation quality and caller satisfaction.

ProviderBest for
Reflex-1Low-latency voice conversations, native Dasha optimization
OpenAIProduction reliability, industry-leading quality
Reflex-1 is Dasha’s in-house model, built specifically for real-time voice. OpenAI remains our primary third-party provider, fully tested and optimized for Dasha BlackBox voice conversations.
Priority Tier: Enable priority tier for OpenAI to reduce latency at the cost of higher LLM pricing. This is recommended for latency-sensitive production deployments where response speed directly impacts conversation quality.

Alternative providers

The following providers are supported through OpenAI-compatible APIs. While functional, they have not been extensively tested on our platform. We recommend thorough testing before production use.
ProviderNotes
GroqHigh throughput with open-source models
Grok (xAI)Advanced reasoning capabilities
DeepSeekCost efficiency with high quality
Alternative providers may exhibit different latency characteristics, response formats, or edge-case behaviors compared to OpenAI. Test extensively with your specific use cases before deploying to production.

Configuration

  1. Go to the LLM Config tab
  2. Select your vendor from the dropdown
  3. Choose a model
  4. Configure temperature
  5. Save your agent

LLM parameters

All LLM vendors support standard configuration parameters that control response behavior.

Temperature

Controls randomness and creativity in responses.
RangeBehaviorUse when
0.0–0.5Focused, deterministic, consistentFAQs, factual information, structured workflows
0.6–0.9Balanced creativity and consistencyGeneral conversation, customer support
1.0–2.0Creative, varied, unpredictableRarely used for voice agents
For production voice agents, use temperature between 0.6–0.8. In our testing across thousands of calls, this range produces consistent yet natural-sounding responses. Lower values feel robotic; higher values risk hallucinations and off-topic tangents.

Top P (nucleus sampling)

Alternative to temperature for controlling randomness via probability mass.
ValueEffect
0.9Only considers tokens in top 90% probability — more focused
1.0Considers all tokens — standard behavior
OpenAI recommends using either temperature or topP, not both. If you set both, temperature takes precedence.

Custom compatible provider

Use any OpenAI-compatible API endpoint, including self-hosted models or alternative providers.

When to use custom providers

  • Self-hosted models for data privacy
  • Alternative providers with OpenAI-compatible APIs
  • Custom fine-tuned models
  • On-premise deployments

Configuration

  1. Select Custom Compatible as your vendor
  2. Enter the Endpoint URL (e.g., https://api.yourprovider.com/v1)
  3. Enter your API Key (minimum 10 characters)
  4. Enter the Model ID as recognized by your provider
  5. Configure standard LLM parameters
  6. Save your agent
Custom providers must implement the OpenAI Chat Completions API format. Incompatible APIs cause agent failures. Test thoroughly before production use.

Testing and optimization

A/B testing LLMs

Compare different vendors for your specific use case:
  1. Create identical agents with different LLM configs
  2. Run parallel test calls with the same scenarios
  3. Measure response quality, speed, length, and success rate
  4. Compare costs over 100–1000 calls

Parameter tuning

Temperature tuning:
  1. Start at 0.7 (balanced)
  2. Test with real conversation scenarios
  3. Adjust based on observations:
    • Too robotic/repetitive → Increase to 0.8–0.9
    • Too creative/inconsistent → Decrease to 0.5–0.6
    • Hallucinating information → Decrease to 0.3–0.5

Next steps

Voice & Speech

Configure text-to-speech providers

Tools & Functions

Enable agents to call external APIs

Test Your Agent

Validate agent responses

Production Checklist

Pre-deployment verification