Synchronous Speech Synthesis (HTTP)

Authorization

Authorization

string

header

required

HTTP: Bearer Auth

Security Scheme Type: http
HTTP Authorization Scheme: Bearer API_key. The API key is used to verify account information and can be viewed under Account Management > API Keys.

Request headers

Content-Type

enum<string>

Default: application/json

required

Media type of the request body. Set it to application/json so that the request data is formatted as JSON.

Available options:

application/json
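
The two required headers described above can be assembled as in the following minimal sketch. The helper name `build_headers` and the placeholder key are illustrative assumptions, not part of the API.

```python
def build_headers(api_key: str) -> dict:
    """Return the Authorization and Content-Type headers for a request.

    build_headers is a hypothetical helper name used only in this sketch.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        # Bearer scheme: the literal word "Bearer", a space, then the API key.
        "Authorization": f"Bearer {api_key}",
        # The request body must be JSON.
        "Content-Type": "application/json",
    }


# "YOUR_API_KEY" is a placeholder, not a real key.
headers = build_headers("YOUR_API_KEY")
```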

Request body

application/json

model

enum<string>

required

Model version to request. Available values: speech-2.8-hd, speech-2.8-turbo, speech-2.6-hd, speech-2.6-turbo, speech-02-hd, speech-02-turbo, speech-01-hd, speech-01-turbo.

Available options:

speech-2.8-hd,

speech-2.8-turbo,

speech-2.6-hd,

speech-2.6-turbo,

speech-02-hd,

speech-02-turbo,

speech-01-hd,

speech-01-turbo

text

string

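required

Text to synthesize. It must be fewer than 10,000 characters; for text longer than 3,000 characters, streaming output is recommended.

Paragraph breaks: mark them with a newline character.
Pause control: you can set a custom pause between pieces of text. Insert a <#x#> marker in the text, where x is the pause duration in seconds, in the range [0.01, 99.99], with at most two decimal places. A pause marker must sit between two pieces of pronounceable text, and multiple pause markers cannot be used consecutively.
Inline pronunciation override: wrap Mandarin pinyin (with tone digits 1–5), IPA, or Cantonese romanization (with tone digits 1–6) in ASCII parentheses to temporarily override the pronunciation of a problematic word or a polyphonic Chinese character.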
- "The word live is pronounced (lɪv) as a verb and (laɪv) as an adjective."
- "This is (he2)平, not (huo4)面."
- "去街市買啲(sung3)。"
Interjection tags: supported in the text only when the model is speech-2.8-hd or speech-2.8-turbo. Supported tags: (laughs): laughter; (chuckle): chuckle; (coughs): cough; (clear-throat): throat clearing; (groans): groan; (breath): normal breath; (pant): panting; (inhale): inhale; (exhale): exhale; (gasps): gasp; (sniffs): sniff; (sighs): sigh; (snorts): snort; (burps): burp; (lip-smacking): lip smacking; (humming): humming; (hissing): hissing; (emm): "emm" filler; (sneezes): sneeze
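
The pause-marker rules above (range, two decimal places, no consecutive markers) can be checked on the client side before sending a request. The following is a minimal sketch; the helper names `pause`, `join_with_pause`, and `has_consecutive_pauses` are hypothetical, and the service may validate markers differently.

```python
import re


def pause(seconds: float) -> str:
    """Format a pause marker such as <#1.50#> for the given duration."""
    if not 0.01 <= seconds <= 99.99:
        raise ValueError("pause must be within [0.01, 99.99] seconds")
    if round(seconds, 2) != seconds:
        raise ValueError("pause may have at most two decimal places")
    return f"<#{seconds:.2f}#>"


def join_with_pause(parts: list, seconds: float) -> str:
    """Place one pause marker between each pair of spoken text parts."""
    return pause(seconds).join(parts)


def has_consecutive_pauses(text: str) -> bool:
    """Return True if two pause markers appear with no text between them."""
    return re.search(r"<#[^#]*#>\s*<#[^#]*#>", text) is not None


text = join_with_pause(["Hello", "world"], 0.5)
```

Here `text` places a single half-second pause between the two words, so it satisfies the rule that a marker must sit between two pieces of pronounceable text.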

stream

boolean

Controls whether output is streamed. Defaults to false, meaning streaming is disabled
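
The length guidance given for the text field (fewer than 10,000 characters, streaming recommended above 3,000) can be turned into a simple default for stream. This is a sketch under the assumption that the service counts characters the same way Python's `len` does; the function name `choose_stream` is hypothetical.

```python
MAX_TEXT_CHARS = 10_000            # text must be shorter than this
STREAM_RECOMMENDED_ABOVE = 3_000   # streaming is recommended above this


def choose_stream(text: str) -> bool:
    """Return a suggested value for the stream field based on text length."""
    n = len(text)  # assumption: the service counts characters like len()
    if n >= MAX_TEXT_CHARS:
        raise ValueError(f"text has {n} characters; limit is below {MAX_TEXT_CHARS}")
    return n > STREAM_RECOMMENDED_ABOVE


body = {"text": "Hello", "stream": choose_stream("Hello")}
```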

stream_options

object

voice_setting

object

audio_setting

object

pronunciation_dict

object

timbre_weights

object[]

language_boost

enum<string>

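Whether to enhance recognition of the specified minority languages and dialects. Defaults to null; set it to auto to let the model decide on its own.

Note: the speech-01 and speech-02 series models do not currently support Persian, Filipino, or Tamil.

Available options: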

Chinese,

Chinese,Yue,

English,

Arabic,

Russian,

Spanish,

French,

Portuguese,

German,

Turkish,

Dutch,

Ukrainian,

Vietnamese,

Indonesian,

Japanese,

Italian,

Korean,

Thai,

Polish,

Romanian,

Greek,

Czech,

Finnish,

Hindi,

Bulgarian,

Danish,

Hebrew,

Malay,

Persian,

Slovak,

Swedish,

Croatian,

Filipino,

Hungarian,

Norwegian,

Slovenian,

Catalan,

Nynorsk,

Tamil,

Afrikaans,

auto
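
The note about Persian, Filipino, and Tamil can be enforced before a request is sent. The sketch below assumes that the "speech-01 and speech-02 series" are exactly the models whose names start with speech-01 or speech-02; the function name `check_language_boost` is hypothetical.

```python
from typing import Optional

# Languages the note says are unavailable on the older model series.
UNSUPPORTED_ON_OLDER_SERIES = {"Persian", "Filipino", "Tamil"}


def check_language_boost(model: str, language_boost: Optional[str]) -> None:
    """Raise ValueError if language_boost is not supported by the model."""
    if language_boost is None:
        return  # null is the documented default
    # Assumption: series membership is determined by the model-name prefix.
    older_series = model.startswith(("speech-01", "speech-02"))
    if older_series and language_boost in UNSUPPORTED_ON_OLDER_SERIES:
        raise ValueError(f"{model} does not support language_boost={language_boost}")


check_language_boost("speech-2.8-hd", "Tamil")  # passes silently
```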

voice_modify

object

Voice effects settings. Audio formats supported by this parameter:

Non-streaming: mp3, wav, flac
Streaming: mp3
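
The format restriction above amounts to a small lookup keyed by whether streaming is on. The sketch below is illustrative only: `voice_modify_allowed` and its `audio_format` argument are hypothetical names, and the request field that actually carries the audio format is not shown in this section.

```python
# Audio formats that voice_modify supports, keyed by the stream flag.
VOICE_MODIFY_FORMATS = {
    False: {"mp3", "wav", "flac"},  # non-streaming
    True: {"mp3"},                  # streaming
}


def voice_modify_allowed(audio_format: str, stream: bool) -> bool:
    """Return True if voice_modify can be used with this format and mode."""
    return audio_format in VOICE_MODIFY_FORMATS[stream]
```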

subtitle_enable

boolean

Default: false

Controls whether the subtitle service is enabled. Defaults to false. Only takes effect for the speech-2.8-hd, speech-2.8-turbo, speech-2.6-hd, speech-2.6-turbo, speech-02-hd, speech-02-turbo, speech-01-hd, and speech-01-turbo models

subtitle_type

enum<string>

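Default: sentence

Subtitle granularity. Defaults to sentence. Possible values:

sentence: sentence-level timestamps
word: word-level timestamps
word_streaming: word-level timestamps optimized for streaming; only valid when stream=true

Available options: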

sentence,

word,

word_streaming
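
Since word_streaming is only valid when stream=true, a client may want to reject that combination early. The following sketch builds the subtitle-related body fields; `subtitle_fields` is a hypothetical helper, and raising an error is a choice made for this sketch rather than documented service behavior.

```python
SUBTITLE_TYPES = {"sentence", "word", "word_streaming"}


def subtitle_fields(subtitle_type: str = "sentence", stream: bool = False) -> dict:
    """Return the subtitle fields of a request body after basic validation."""
    if subtitle_type not in SUBTITLE_TYPES:
        raise ValueError(f"unknown subtitle_type: {subtitle_type}")
    # This sketch rejects the combination instead of sending it.
    if subtitle_type == "word_streaming" and not stream:
        raise ValueError("word_streaming requires stream=true")
    return {"subtitle_enable": True, "subtitle_type": subtitle_type}
```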

output_format

enum<string>

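Default: hex

Controls the form of the output. Possible values are [url, hex]; defaults to hex. This parameter takes effect only in non-streaming mode; streaming mode only returns hex. A returned url is valid for 24 hours

Available options: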

url,

hex
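
When the output arrives in hex form, it has to be decoded into raw bytes before it can be saved or played. The sketch below shows that step and the streaming rule for output_format; the helper names are hypothetical, and the response field that holds the hex string is not shown in this section, so the functions take plain values.

```python
def effective_output_format(output_format: str, stream: bool) -> str:
    """Return the output form that applies: streaming only returns hex."""
    if output_format not in {"url", "hex"}:
        raise ValueError(f"unknown output_format: {output_format}")
    return "hex" if stream else output_format


def hex_to_audio_bytes(hex_audio: str) -> bytes:
    """Decode a hex-encoded audio string into raw bytes."""
    return bytes.fromhex(hex_audio)


# "494433" is a stand-in hex string for this sketch, not real API output.
audio = hex_to_audio_bytes("494433")
```

The resulting bytes can then be written to a file opened in binary mode.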

aigc_watermark

boolean

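Default: false

Controls whether an audio rhythm marker is appended to the end of the synthesized audio. Defaults to false. This parameter only takes effect for non-streaming synthesis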

Response

data

object

The returned synthesis data object. It may be null, so check for null before using it
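
Because data may be null, a client should check it before reading any child field. A minimal sketch of that check, assuming the response has already been parsed from JSON into a Python dict; `require_data` is a hypothetical helper name.

```python
from typing import Any, Dict


def require_data(response: Dict[str, Any]) -> Dict[str, Any]:
    """Return response["data"], or raise if it is missing or null."""
    data = response.get("data")
    if data is None:
        # Include trace_id so the failure can be reported to support.
        trace_id = response.get("trace_id")
        raise RuntimeError(f"response contains no data (trace_id={trace_id})")
    return data
```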

trace_id

string

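ID of this session, used to help locate the problem when you ask for support or give feedback

extra_info

object

Additional information about the audio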

base_resp

object

Status code and details of this request
