POST /v1/t2a_async_v2
curl --request POST \
  --url https://api.minimax.io/v1/t2a_async_v2 \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "speech-2.5-hd-preview",
    "text": "Omg, the real danger is not that computers start thinking like people, but that people start thinking like computers. Computers can only help us with simple tasks.",
    "language_boost": "auto",
    "voice_setting": {
      "voice_id": "English_expressive_narrator",
      "speed": 1,
      "vol": 1,
      "pitch": 1
    },
    "pronunciation_dict": {
      "tone": [
        "Omg/Oh my god"
      ]
    },
    "audio_setting": {
      "audio_sample_rate": 32000,
      "bitrate": 128000,
      "format": "mp3",
      "channel": 2
    },
    "voice_modify": {
      "pitch": 0,
      "intensity": 0,
      "timbre": 0,
      "sound_effects": "spacious_echo"
    }
  }'
{
  "task_id": 95157322514444,
  "task_token": "eyJhbGciOiJSUz",
  "file_id": 95157322514444,
  "usage_characters": 101,
  "base_resp": {
    "status_code": 0,
    "status_msg": "success"
  }
}
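
For readers calling the endpoint from Python rather than curl, the request above could be sketched roughly as follows. This is a minimal illustration using only the standard library, not an official client: the helper names `build_payload`, `build_request`, and `create_task` are hypothetical, and only a subset of the optional body fields is included.

```python
import json
import urllib.request

API_URL = "https://api.minimax.io/v1/t2a_async_v2"


def build_payload(text: str) -> dict:
    """Build a request body with the same required fields as the curl example."""
    return {
        "model": "speech-2.5-hd-preview",
        "text": text,
        "language_boost": "auto",
        "voice_setting": {
            "voice_id": "English_expressive_narrator",
            "speed": 1,
            "vol": 1,
            "pitch": 1,
        },
        "audio_setting": {
            "audio_sample_rate": 32000,
            "bitrate": 128000,
            "format": "mp3",
            "channel": 2,
        },
    }


def build_request(token: str, payload: dict) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_task(token: str, payload: dict) -> dict:
    """Send the request and return the decoded JSON response (performs network I/O)."""
    with urllib.request.urlopen(build_request(token, payload)) as resp:
        return json.load(resp)
```

Keeping `build_request` separate from `create_task` makes it possible to inspect the headers and body before anything is sent.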

Returned File Information

The output for a single input file is described below.
If the input is a compressed package containing multiple files, one folder is generated per file, and each folder has the same contents as the output for a single input file.

Input File Type: txt File

  • Output Files:
    • Audio File: Format follows the request body settings.
    • Subtitle File: Sentence-level subtitle information.
    • Extra JSON File: Additional information related to the audio file.

Input File Type: json File

  • title Field Output Files (if this field is empty, no files will be generated)
    • Audio File: Format follows the request body settings
    • Subtitle File: Sentence-level subtitle information
    • Extra JSON File: Additional information related to the audio file
  • content Field Output Files (if this field is empty, no files will be generated)
    • Audio File: Format follows the request body settings
    • Subtitle File: Sentence-level subtitle information
    • Extra JSON File: Additional information related to the audio file
  • extra Field Output Files (if this field is empty, no files will be generated)
    • Audio File: Format follows the request body settings
    • Subtitle File: Sentence-level subtitle information
    • Extra JSON File: Additional information related to the audio file

Authorizations

Authorization
string
header
required

HTTP: Bearer Auth

Headers

Content-Type
enum<string>
default:application/json
required

The media type of the request body. Must be set to application/json to ensure the data is sent in JSON format.

Available options:
application/json

Body

application/json
model
enum<string>
required

Model version to call. The supported models are listed below.

Available options:
speech-2.5-hd-preview,
speech-2.5-turbo-preview,
speech-02-hd,
speech-02-turbo,
speech-01-hd,
speech-01-turbo
text
string
required

Text content to convert to audio, max length 50,000 characters. Mutually exclusive with text_file_id (one is required).
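
Because text and text_file_id are mutually exclusive, a client may want to check the input before sending a request. The sketch below is one possible pre-flight check, not part of the API: `choose_text_source` is a hypothetical helper, and it assumes the 50,000-character limit is counted the way Python's `len` counts characters, which the documentation does not state.

```python
from typing import Optional

MAX_TEXT_CHARS = 50_000  # limit documented for the "text" field


def choose_text_source(text: Optional[str] = None,
                       text_file_id: Optional[int] = None) -> dict:
    """Return the body fragment for exactly one of text / text_file_id."""
    # Exactly one of the two inputs must be supplied.
    if (text is None) == (text_file_id is None):
        raise ValueError("provide exactly one of text or text_file_id")
    if text is not None:
        # Assumption: the limit is measured in Python characters (code points).
        if len(text) > MAX_TEXT_CHARS:
            raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
        return {"text": text}
    return {"text_file_id": text_file_id}
```

The returned dictionary can be merged into the rest of the request body.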

text_file_id
integer
required

ID of the text file to synthesize. Max 100,000 characters. Supported formats: txt, zip. Mutually exclusive with text (one is required).

  • txt file: Supports customizing speech pauses by adding markers in the form <#x#>, where x is the pause duration in seconds. Valid range: [0.01, 99.99], up to two decimal places. Pause markers must be placed between speakable text segments and cannot be used consecutively.

  • zip file: Must contain files of the same type (txt or json).

    • json format: supports the fields ["title", "content", "extra"]. Each non-empty field generates an audio file, a subtitle file, and a metadata file, which are stored in a folder.
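
The pause-marker rules for txt files can be checked locally before uploading. The following is a rough sketch under stated assumptions: `check_pause_markers` is a hypothetical helper, it treats whitespace-only text as not speakable, and it accepts durations written with one or two integer digits and at most two decimals. The service's actual parser may differ.

```python
import re

# Matches a marker of the form <#x#> and captures x.
PAUSE_RE = re.compile(r"<#(.*?)#>")
# One or two integer digits, optionally followed by up to two decimals.
NUMBER_RE = re.compile(r"[0-9]{1,2}(\.[0-9]{1,2})?")


def check_pause_markers(text: str) -> list:
    """Return a list of problems; an empty list means the markers look valid."""
    problems = []
    # With one capture group, split() alternates: text, value, text, value, ..., text.
    pieces = PAUSE_RE.split(text)
    for i in range(1, len(pieces), 2):
        value = pieces[i]
        if not NUMBER_RE.fullmatch(value) or not (0.01 <= float(value) <= 99.99):
            problems.append(f"invalid pause duration: {value!r}")
        # A marker needs speakable text on both sides; this also rejects
        # consecutive markers and markers at the start or end of the text.
        if not pieces[i - 1].strip() or not pieces[i + 1].strip():
            problems.append(f"marker <#{value}#> is not between speakable text")
    return problems
```

For example, `check_pause_markers("Hello<#1.5#>world")` returns an empty list, while two adjacent markers are each reported.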
voice_setting
object
required
audio_setting
object
pronunciation_dict
object
language_boost
enum<string>

Controls whether recognition of specific minority languages and dialects is enhanced. Default is null. If the language is unknown, set this to "auto" and the model will detect it automatically.

Available options:
Chinese,
Chinese,Yue,
English,
Arabic,
Russian,
Spanish,
French,
Portuguese,
German,
Turkish,
Dutch,
Ukrainian,
Vietnamese,
Indonesian,
Japanese,
Italian,
Korean,
Thai,
Polish,
Romanian,
Greek,
Czech,
Finnish,
Hindi,
Bulgarian,
Danish,
Hebrew,
Malay,
Persian,
Slovak,
Swedish,
Croatian,
Filipino,
Hungarian,
Norwegian,
Slovenian,
Catalan,
Nynorsk,
Tamil,
Afrikaans,
auto
voice_modify
object

Voice effect settings. Supported formats: mp3, flac.

Response

200 - application/json
task_id
string

Task ID

file_id
integer

The corresponding audio file ID is returned once the task is successfully created.

When the task is complete, you can use the file_id to call the File (Retrieve) API to download the file.

If the request fails, this field will not be returned.

Note: The download URL is valid for 9 hours (32,400 seconds) from the time it is generated. After expiration, the file will no longer be available and the generated data will be lost, so please ensure you download it within the validity period.
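
As a small worked example of the 9-hour validity window, a client could track the deadline itself. This sketch assumes the caller records the time at which the download URL was generated; the helper names `download_deadline` and `is_expired` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

URL_TTL = timedelta(seconds=32_400)  # 9 hours, as stated in the note


def download_deadline(generated_at: datetime) -> datetime:
    """Return the moment at which the download URL stops being valid."""
    return generated_at + URL_TTL


def is_expired(generated_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True once the validity window has fully elapsed."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= download_deadline(generated_at)
```

Pass timezone-aware datetimes for both arguments so the comparison is well defined.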

task_token
string

Token for completing the task

usage_characters
integer

Number of billed characters

base_resp
object

Status code and details.
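
A client will typically check base_resp before using the other response fields. The sketch below shows one way to do that. It assumes that a status_code of 0 means success, as in the sample response where status_msg is "success"; other codes are not described in this section. `parse_create_response` is a hypothetical helper.

```python
import json


def parse_create_response(body: str) -> dict:
    """Validate a task-creation response and return the fields needed later."""
    data = json.loads(body)
    base = data.get("base_resp") or {}
    # Assumption: 0 is the only success code.
    if base.get("status_code") != 0:
        raise RuntimeError(
            f"request failed: {base.get('status_code')} {base.get('status_msg')}"
        )
    # file_id is not returned when the request fails, so guard against its absence.
    if "file_id" not in data:
        raise RuntimeError("response has no file_id")
    return {
        "task_id": data.get("task_id"),
        "file_id": data["file_id"],
        "usage_characters": data.get("usage_characters"),
    }
```

The returned file_id is the value to pass to the File (Retrieve) API once the task is complete.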