Creates a new custom voice by cloning a speaker from audio samples uploaded as multipart form data. The service analyzes the audio characteristics and generates a synthetic voice that mimics the original speaker. The cloned voice can then be used for text-to-speech synthesis in agent configurations.
Voice Cloning Process:
Provider-Specific Features:
Audio Requirements:
Form Fields:
Processing Time:
Quality Considerations:
Common Use Cases:
Post-Processing:
Name identifier for the cloned voice
Allowed length: 1 - 100
Description of voice characteristics
Allowed length: 1 - 1000
Primary language for the voice model
Voice cloning service provider
Allowed values: ElevenLabs, Cartesia, Dasha, Inworld, Lmnt
Whether to remove background noise from audio
Cloning mode: Stability or Similarity
Whether to enhance audio quality
Transcript for voice cloning
Custom metadata labels
Audio files for voice cloning
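As a rough illustration of how the form fields above might be assembled on the client side, the following Python sketch validates the documented constraints (name length, description length, provider and cloning mode values) and returns the parts of a multipart upload. The form-field keys (`name`, `provider`, `files`, and so on) are hypothetical placeholders, since this section does not list the actual field names. Treating the description as optional, requiring at least one audio file, and sending booleans as lowercase strings are likewise assumptions.

```python
# Allowed values taken from the field descriptions in this section.
PROVIDERS = {"ElevenLabs", "Cartesia", "Dasha", "Inworld", "Lmnt"}
MODES = {"Stability", "Similarity"}


def build_clone_form(name, provider, audio_paths, description=None,
                     language=None, mode=None, remove_background_noise=None,
                     enhance=None, transcript=None):
    """Validate inputs and return (data, files) for a multipart upload.

    All form-field keys used here are hypothetical placeholders.
    """
    if not 1 <= len(name) <= 100:
        raise ValueError("name must be 1 - 100 characters long")
    if description is not None and not 1 <= len(description) <= 1000:
        raise ValueError("description must be 1 - 1000 characters long")
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    if mode is not None and mode not in MODES:
        raise ValueError(f"unknown cloning mode: {mode!r}")
    if not audio_paths:
        # Assumption: the service needs at least one audio sample.
        raise ValueError("at least one audio file is required")

    data = {"name": name, "provider": provider}
    optional = {
        "description": description,
        "language": language,
        "mode": mode,
        "remove_background_noise": remove_background_noise,
        "enhance": enhance,
        "transcript": transcript,
    }
    for key, value in optional.items():
        if value is None:
            continue  # leave unset fields out of the form entirely
        # Assumption: booleans travel as the strings "true" / "false".
        data[key] = str(value).lower() if isinstance(value, bool) else value

    # One (field name, path) pair per audio sample; the caller opens the files.
    files = [("files", path) for path in audio_paths]
    return data, files
```

The returned pair lines up with the `data=` and `files=` arguments of an HTTP client such as `requests.post`, once each path has been opened in binary mode. The endpoint URL and authentication scheme are not given in this section, so they are left out here.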
Returns the created cloned voice details
Response DTO for TTS voice cloning operations
Unique identifier for the voice
Minimum length: 1
TTS provider
Allowed values: ElevenLabs, Cartesia, Dasha, Inworld, Lmnt
Voice category
Allowed values: Public, Cloned
Display name of the voice
Voice ID used for synthesis
Description of voice characteristics
Primary language for the voice
Custom metadata labels
URL for voice preview audio
Timestamp when voice was created
Timestamp when voice was last updated
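The response fields listed above could be modeled on the client roughly as follows. This is only a sketch: the JSON key names (`id`, `voice_id`, `preview_url`, `created_at`, and so on) are hypothetical, the labels are assumed to be a key/value object, and the timestamps are assumed to be ISO 8601 strings. None of these details are confirmed by this section.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

# Allowed values taken from the response field descriptions.
PROVIDERS = {"ElevenLabs", "Cartesia", "Dasha", "Inworld", "Lmnt"}
CATEGORIES = {"Public", "Cloned"}


@dataclass
class ClonedVoice:
    id: str
    provider: str
    category: str
    name: str
    voice_id: str
    description: Optional[str] = None
    language: Optional[str] = None
    labels: dict = field(default_factory=dict)
    preview_url: Optional[str] = None
    created_at: Optional[datetime] = None
    updated_at: Optional[datetime] = None


def _parse_time(value):
    """Parse an ISO 8601 timestamp (assumed format); None passes through."""
    if value is None:
        return None
    # Older Python versions reject a trailing "Z", so normalize it first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_voice(payload):
    """Build a ClonedVoice from a decoded JSON object (hypothetical keys)."""
    if payload["provider"] not in PROVIDERS:
        raise ValueError(f"unknown provider: {payload['provider']!r}")
    if payload["category"] not in CATEGORIES:
        raise ValueError(f"unknown category: {payload['category']!r}")
    return ClonedVoice(
        id=payload["id"],
        provider=payload["provider"],
        category=payload["category"],
        name=payload["name"],
        voice_id=payload["voice_id"],
        description=payload.get("description"),
        language=payload.get("language"),
        labels=payload.get("labels") or {},
        preview_url=payload.get("preview_url"),
        created_at=_parse_time(payload.get("created_at")),
        updated_at=_parse_time(payload.get("updated_at")),
    )
```

Checking the provider and category against the documented value lists up front makes an unexpected response fail loudly instead of propagating into agent configuration code.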