Voice and Speech

Use speech transcription and text-to-speech providers for voice input, generated narration, and channel delivery.

Voice and speech features help agents work with audio. Depending on configured providers, the app can transcribe audio, synthesize speech, and deliver voice-ready outputs through tasks or channels.

Configure Speech Providers

Open Settings
Find the speech or provider configuration area
Add the required API key or provider credentials
Save the settings
Run a small test task before using speech in a production workflow

Provider availability depends on your build and configured credentials.

Transcribe Audio

Use transcription when a task includes meetings, voice notes, interviews, or media files.

Attach an audio or video file to a task
Ask the agent to transcribe it
Specify whether you need a verbatim transcript, summary, action items, or timestamps
Review the transcript before using it as source material

For long files, ask for sections or summaries first, then request detail where needed.
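
If you ask for timestamps, the "sections first" approach can also be applied to the transcript after the fact. The sketch below assumes the transcript is available as (start time in seconds, text) pairs; that shape and the function name are illustrative assumptions, not the app's export format.

```python
from collections import defaultdict

def group_by_window(segments, window_seconds=600):
    """Group (start_seconds, text) pairs into fixed-length time windows.

    Returns a list of (window_start_seconds, joined_text), ordered by time.
    Windows that contain no segments are omitted.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    windows = defaultdict(list)
    for start, text in segments:
        windows[int(start // window_seconds)].append(text)
    return [
        (index * window_seconds, " ".join(texts))
        for index, texts in sorted(windows.items())
    ]

# Hypothetical segments from a 30-minute recording.
segments = [
    (0.0, "Welcome everyone."),
    (595.5, "First item."),
    (610.0, "Second item."),
    (1805.0, "Wrap up."),
]
sections = group_by_window(segments)  # ten-minute sections
```

Each section can then be summarized on its own, and only the sections that matter need a detailed pass.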

Generate Speech

Use text-to-speech when the output should become narration, a voice message, or spoken media.

Useful requests include:

"Create a 45-second narration for this product demo."
"Generate a friendly voiceover script and speech file for the onboarding clip."
"Turn this summary into a short audio update for the team."

Review the script before synthesis when the message is customer-facing.
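
When a request names a target length, such as the 45-second narration above, a rough word budget helps during script review. The sketch below assumes a speaking rate of 150 words per minute, which is only a ballpark; synthesized voices vary, so treat the result as an estimate and check the generated file.

```python
def word_budget(duration_seconds, words_per_minute=150):
    """Approximate number of words that fit in the duration (rounded down)."""
    return int(duration_seconds * words_per_minute / 60)

def estimated_seconds(script, words_per_minute=150):
    """Approximate spoken length of a script, counting whitespace-separated words."""
    return len(script.split()) * 60 / words_per_minute

budget = word_budget(45)  # word budget for a 45-second narration
```

If `estimated_seconds(script)` comes out well above the target, trim the script before synthesis rather than asking for a faster voice.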

Speech in Channels

Channel delivery can include audio or media outputs when the target platform supports the file type. Use this for:

Daily spoken summaries
Incident update voice notes
Generated narration for media reviews
Accessibility-friendly recap formats

Check platform file limits before sending large audio or video files.
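
A size check before delivery can be as simple as the sketch below. The channel names and limits are placeholders, not real platform values; look up the actual limits in each platform's documentation.

```python
import os

# Hypothetical limits in megabytes; replace with documented platform values.
CHANNEL_LIMITS_MB = {"chat": 25, "email": 10}

def fits_channel(size_bytes, channel, limits=CHANNEL_LIMITS_MB):
    """Return True if a file of size_bytes is within the channel's limit."""
    if channel not in limits:
        raise KeyError(f"no limit configured for channel {channel!r}")
    return size_bytes <= limits[channel] * 1024 * 1024

def file_fits_channel(path, channel, limits=CHANNEL_LIMITS_MB):
    """Same check, reading the size from a file on disk."""
    return fits_channel(os.path.getsize(path), channel, limits)
```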

Privacy and Consent

Audio can contain sensitive personal data. Follow these rules:

Transcribe only files you are allowed to process
Remove or redact sensitive excerpts before sharing
Do not send private recordings to public channels
Confirm provider policy before uploading regulated or confidential audio
Keep source files in the workspace only as long as needed
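
As one illustration of redacting before sharing, the sketch below masks email addresses and phone-like numbers in a transcript. The patterns are deliberately simple assumptions: they will miss names, addresses, and many number formats, so they support a human review rather than replace it.

```python
import re

# Simplified patterns for illustration; not exhaustive.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact(text):
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

sample = "Reach Dana at dana@example.com or +1 (555) 010-2000."
redacted = redact(sample)
```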

Troubleshooting

If transcription fails, confirm the file format and size are supported
If speech output sounds wrong, revise the script and voice instructions
If provider calls fail, check API keys and quota
If channel delivery fails, export the audio file and upload manually
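
For the first item, a quick extension check can rule out an obviously unsupported file before retrying. The allow-list below is a hypothetical example; the formats your provider accepts may differ, and a matching extension does not guarantee the file is valid.

```python
from pathlib import Path

# Hypothetical allow-list; confirm the real one in your provider's documentation.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".mp4"}

def has_supported_extension(filename, supported=SUPPORTED_EXTENSIONS):
    """Return True if the file's extension (case-insensitive) is in the allow-list."""
    return Path(filename).suffix.lower() in supported
```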
