whisper
Automatic Speech Recognition • OpenAI
Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.
Usage
Workers - TypeScript
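A minimal sketch of calling the model from a Worker. It assumes an AI binding that exposes a `run(model, inputs)` method and the model ID `@cf/openai/whisper`; neither name appears on this page, so check both against your own configuration. The `AiBinding` interface and the `transcribe` helper are hypothetical and exist only to keep the sketch self-contained.

```typescript
// Hypothetical minimal shape of the AI binding: only the call used below.
interface AiBinding {
  run(model: string, inputs: { audio: number[] }): Promise<{ text: string }>;
}

// Assumed model identifier; it is not stated in this document.
const MODEL = "@cf/openai/whisper";

// Sends raw audio bytes to the model and returns the transcription text.
async function transcribe(ai: AiBinding, audio: Uint8Array): Promise<string> {
  // The object form of the input expects plain integers in the range 0-255.
  const bytes = Array.from(audio);
  const result = await ai.run(MODEL, { audio: bytes });
  return result.text;
}
```

Inside a Worker's fetch handler this could be called as `transcribe(env.AI, new Uint8Array(await request.arrayBuffer()))`, assuming the binding is named `AI`.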
curl
Parameters
Input
- 0 (string)
- 1 (object)
  - audio (array): An array of integers that represent the audio data, constrained to 8-bit unsigned integer values.
    - items (number): A value between 0 and 255.
  - source_lang (string): The language of the recorded audio.
  - target_lang (string): The language to translate the transcription into. Currently only English is supported.
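The object form of the input can be sketched as a TypeScript type with a small builder and validator. The names `WhisperInput`, `isValidAudio`, and `toWhisperInput` are hypothetical, and treating `source_lang` and `target_lang` as optional is an assumption, since this page does not say which fields are required.

```typescript
// Object form of the input, mirroring the fields listed above.
interface WhisperInput {
  audio: number[];       // integers in the range 0-255
  source_lang?: string;  // language of the recorded audio (assumed optional)
  target_lang?: string;  // language to translate into (assumed optional)
}

// Returns true when every element is an integer between 0 and 255.
function isValidAudio(audio: number[]): boolean {
  return audio.every((v) => Number.isInteger(v) && v >= 0 && v <= 255);
}

// Builds the input object from raw bytes; a Uint8Array already
// guarantees the 0-255 range, so no further clamping is needed.
function toWhisperInput(bytes: Uint8Array, sourceLang?: string): WhisperInput {
  const input: WhisperInput = { audio: Array.from(bytes) };
  if (sourceLang !== undefined) {
    input.source_lang = sourceLang;
  }
  return input;
}
```

Converting through `Uint8Array` is one way to satisfy the 8-bit constraint; `isValidAudio` is useful when the array comes from an untrusted source such as parsed JSON.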
Output
- text (string): The transcription.
- word_count (number)
- words (array)
  - items (object)
    - word (string)
    - start (number): The second this word begins in the recording.
    - end (number): The ending second when the word completes.
- vtt (string)
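The output fields above can be modelled as a type, with the per-word timings used to pick out what was said in a given time window. This is a sketch: `WhisperWord`, `WhisperOutput`, and `wordsBetween` are hypothetical names, and marking every field except `text` as optional is an assumption.

```typescript
// One entry of the words array: a word and its timing in seconds.
interface WhisperWord {
  word: string;
  start: number;
  end: number;
}

// Output object, mirroring the fields listed above.
interface WhisperOutput {
  text: string;
  word_count?: number;
  words?: WhisperWord[];
  vtt?: string;
}

// Returns the words whose time span overlaps [fromSec, toSec].
function wordsBetween(output: WhisperOutput, fromSec: number, toSec: number): string[] {
  return (output.words ?? [])
    .filter((w) => w.end >= fromSec && w.start <= toSec)
    .map((w) => w.word);
}
```

For a response whose `words` array holds "hello" (0 to 0.5 s), "world" (0.6 to 1.1 s), and "again" (2.0 to 2.4 s), `wordsBetween(output, 0.5, 1.0)` returns `["hello", "world"]`; these timings are made up for illustration.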
API Schemas
The following schemas are based on JSON Schema