OpenAI API Audio

Updated 2023-03-21 11:57

Learn how to turn audio into text.

Create transcription

POST https://api.openai.com/v1/audio/transcriptions

Transcribes audio into the input language.

Request body

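Field	Type	Required	Description
file	string	Required	The audio file to transcribe, in one of these formats: mp3, mp4, mpeg, mpga, m4a, wav, or webm.
model	string	Required	ID of the model to use. Only whisper-1 is currently available.
prompt	string	Optional	Optional text to guide the model's style or to continue a previous audio segment. The prompt should match the audio language.
response_format	string	Optional, defaults to json	The format of the transcript output, one of: json, text, srt, verbose_json, or vtt.
temperature	number	Optional, defaults to 0	The sampling temperature, between 0 and 1. Higher values such as 0.8 make the output more random, while lower values such as 0.2 make it more focused and deterministic. If set to 0, the model uses log probability to automatically increase the temperature until certain thresholds are hit.
language	string	Optional	The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency.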
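
To illustrate the constraints in the table above, the non-file form fields could be validated and assembled before the request is sent. This is a minimal sketch: the helper name `build_transcription_fields` and its checks are hypothetical and not part of the OpenAI library; only the field names, defaults, and allowed values come from the table.

```python
# Hypothetical helper (not an OpenAI API): validates the documented fields
# and returns the non-file form fields for POST /v1/audio/transcriptions.
ALLOWED_RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}


def build_transcription_fields(model="whisper-1", prompt=None,
                               response_format="json", temperature=0,
                               language=None):
    if response_format not in ALLOWED_RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    if language is not None and len(language) != 2:
        # ISO-639-1 codes are two letters, for example "en" or "de".
        raise ValueError("language should be a two-letter ISO-639-1 code")

    fields = {
        "model": model,
        "response_format": response_format,
        "temperature": str(temperature),
    }
    # Optional fields are only included when the caller supplies them.
    if prompt is not None:
        fields["prompt"] = prompt
    if language is not None:
        fields["language"] = language
    return fields


fields = build_transcription_fields(language="en", temperature=0.2)
print(fields)
```

The returned dictionary covers only the text fields; the audio itself would still be attached as the `file` part of the multipart request.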

Example request

curl python node.js

curl https://api.openai.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F file="@/path/to/file/audio.mp3" \
  -F model="whisper-1"

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")
# Open the audio in binary mode; the with-block closes the file afterwards.
with open("audio.mp3", "rb") as audio_file:
    transcript = openai.Audio.transcribe("whisper-1", audio_file)

const fs = require("fs"); // needed for createReadStream below
const { Configuration, OpenAIApi } = require("openai");
const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
});
const openai = new OpenAIApi(configuration);
// Note: await must run inside an async function (or an ES module).
const resp = await openai.createTranscription(
  fs.createReadStream("audio.mp3"),
  "whisper-1"
);

Parameters

{
  "file": "audio.mp3",
  "model": "whisper-1"
}

Response

{
  "text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger. This is a place where you can get to do that."
}
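
As a small illustration of consuming the response above, the transcript can be read from the "text" key of the JSON body. This is a sketch under assumptions: the helper `extract_text` is hypothetical, and it assumes that the json and verbose_json formats return a JSON object with a "text" key while text, srt, and vtt return plain text.

```python
import json

# Sample body: a shortened version of the response shown above.
raw_body = '{"text": "This is a place where you can get to do that."}'


def extract_text(body, response_format="json"):
    """Return the transcript text from a response body.

    Assumption: "json" and "verbose_json" bodies are JSON objects with a
    "text" key; "text", "srt" and "vtt" bodies are returned unchanged.
    """
    if response_format in ("json", "verbose_json"):
        return json.loads(body)["text"]
    return body


print(extract_text(raw_body))
```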

Create translation

POST https://api.openai.com/v1/audio/translations

Translates audio into English.

Request body

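Field	Type	Required	Description
file	string	Required	The audio file to translate, in one of these formats: mp3, mp4, mpeg, mpga, m4a, wav, or webm.
model	string	Required	ID of the model to use. Only whisper-1 is currently available.
prompt	string	Optional	Optional text to guide the model's style or to continue a previous audio segment. The prompt should be in English.
response_format	string	Optional, defaults to json	The format of the transcript output, one of: json, text, srt, verbose_json, or vtt.
temperature	number	Optional, defaults to 0	The sampling temperature, between 0 and 1. Higher values such as 0.8 make the output more random, while lower values such as 0.2 make it more focused and deterministic. If set to 0, the model uses log probability to automatically increase the temperature until certain thresholds are hit.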
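
Since the file field accepts only the listed formats, a client might check the file name before uploading. The sketch below is illustrative only: `is_supported_audio` is a hypothetical helper, and it inspects the extension rather than the actual audio content.

```python
from pathlib import Path

# Formats listed for the file field in the request body table.
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}


def is_supported_audio(path):
    """Return True if the file extension is one of the listed formats."""
    suffix = Path(path).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_EXTENSIONS


print(is_supported_audio("german.m4a"))   # True
print(is_supported_audio("speech.flac"))  # False
```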

Example request

curl python node.js

curl https://api.openai.com/v1/audio/translations \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F file="@/path/to/file/german.m4a" \
  -F model="whisper-1"

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")
# Open the audio in binary mode; the with-block closes the file afterwards.
with open("german.m4a", "rb") as audio_file:
    transcript = openai.Audio.translate("whisper-1", audio_file)

const fs = require("fs"); // needed for createReadStream below
const { Configuration, OpenAIApi } = require("openai");
const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
});
const openai = new OpenAIApi(configuration);
// Note: await must run inside an async function (or an ES module).
const resp = await openai.createTranslation(
  fs.createReadStream("german.m4a"),
  "whisper-1"
);

Parameters

{
  "file": "german.m4a",
  "model": "whisper-1"
}

Response

{
  "text": "Hello, my name is Wolfgang and I come from Germany. Where are you heading today?"
}
