
OpenAI API Chat Completions

Updated 2023-03-17 11:14

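With the OpenAI Chat API, you can build your own applications with gpt-3.5-turbo and gpt-4 to do things like:

Draft an email or other piece of writing
Write Python code
Answer questions about a set of documents
Create conversational agents
Give your software a natural language interface
Tutor in a range of subjects
Translate languages
Simulate characters for video games, and much more

This guide explains how to make an API call for chat-based language models and shares tips for getting good results. You can also experiment with the new chat format in the OpenAI Playground.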

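Introduction

Chat models take a series of messages as input and return a model-generated message as output.

Although the chat format is designed to make multi-turn conversations easy, it is just as useful for single-turn tasks without any conversation (such as those previously served by instruction-following models like text-davinci-003).

An example API call looks as follows: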

# Note: you need to be using OpenAI Python v0.27.0 for the code below to work
import openai

openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"},
    ],
)

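The main input is the messages parameter. Messages must be an array of message objects, where each object has a role (either "system", "user", or "assistant") and content (the content of the message). Conversations can be as short as 1 message or fill many pages.

Typically, a conversation is formatted with a system message first, followed by alternating user and assistant messages.

The system message helps set the behavior of the assistant. In the example above, the assistant was instructed with "You are a helpful assistant."

gpt-3.5-turbo-0301 does not always pay strong attention to system messages. Future models will be trained to pay stronger attention to system messages.

The user messages help instruct the assistant. They can be generated by the end users of an application, or set by a developer as an instruction.

The assistant messages help store prior responses. They can also be written by a developer to help give examples of desired behavior.

Including the conversation history helps when user instructions refer to prior messages. In the example above, the user's final question, "Where was it played?", only makes sense in the context of the earlier messages about the World Series of 2020. Because the models have no memory of past requests, all relevant information must be supplied via the conversation. If a conversation cannot fit within the model's token limit, it will need to be shortened in some way.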
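
Because the model keeps no memory between requests, an application usually stores the conversation in a list and resends the whole list on every call. The following is a minimal sketch of that bookkeeping; the helper name `add_turn` is hypothetical (it is not part of the OpenAI library) and no API call is made.

```python
# Hypothetical helper: keep the running conversation in a plain list so the
# full history can be resent with every request.

def add_turn(history, role, content):
    """Return a new history list with one more message appended."""
    if role not in ("system", "user", "assistant"):
        raise ValueError(f"unknown role: {role!r}")
    return history + [{"role": role, "content": content}]


history = [{"role": "system", "content": "You are a helpful assistant."}]
history = add_turn(history, "user", "Who won the world series in 2020?")
history = add_turn(history, "assistant", "The Los Angeles Dodgers won the World Series in 2020.")
history = add_turn(history, "user", "Where was it played?")
# `history` now holds the four messages that would be passed as `messages`.
```

Returning a new list rather than mutating in place is just one design choice; it makes it easy to keep earlier snapshots of the conversation around.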

Response format

An example API response looks as follows:

{
 'id': 'chatcmpl-6p9XYPYSTTRi0xEviKjjilqrWU2Ve',
 'object': 'chat.completion',
 'created': 1677649420,
 'model': 'gpt-3.5-turbo',
 'usage': {'prompt_tokens': 56, 'completion_tokens': 31, 'total_tokens': 87},
 'choices': [
   {
    'message': {
      'role': 'assistant',
      'content': 'The 2020 World Series was played in Arlington, Texas at the Globe Life Field, which was the new home stadium for the Texas Rangers.'},
    'finish_reason': 'stop',
    'index': 0
   }
  ]
}

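In Python, the assistant's reply can be extracted with response['choices'][0]['message']['content'].

Every response will include a finish_reason. The possible values for finish_reason are:

stop: API returned complete model output
length: Incomplete model output due to the max_tokens parameter or token limit
content_filter: Omitted content due to a flag from our content filters
null: API response still in progress or incomplete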
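
To make the finish_reason values concrete, here is a small sketch that pulls the reply out of a response and reports whether the model finished on its own. The function name `extract_reply` is an assumption for illustration, and the sketch assumes a complete (non-streamed) response shaped like the example above.

```python
def extract_reply(response):
    """Return (text, is_complete) for the first choice of a chat response."""
    choice = response["choices"][0]
    text = choice["message"]["content"]
    # Only "stop" means the model produced its full output; "length",
    # "content_filter" and None all indicate a partial or withheld reply.
    is_complete = choice["finish_reason"] == "stop"
    return text, is_complete


# A trimmed-down response dict in the same shape as the example above.
sample_response = {
    "choices": [
        {
            "message": {"role": "assistant", "content": "Arlington, Texas."},
            "finish_reason": "length",
            "index": 0,
        }
    ]
}

text, is_complete = extract_reply(sample_response)
if not is_complete:
    print("Reply may be truncated:", text)
```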

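Managing tokens

Language models read text in chunks called tokens. In English, a token can be as short as one character or as long as one word (e.g., a or apple), and in some languages tokens can be even shorter than one character or even longer than one word.

For example, the string "ChatGPT is great!" is encoded into six tokens: ["Chat", "G", "PT", " is", " great", "!"].

The total number of tokens in an API call affects:

How much your API call costs, as you pay per token
How long your API call takes, as writing more tokens takes more time
Whether your API call works at all, as total tokens must be below the model's maximum limit (4096 tokens for gpt-3.5-turbo-0301)

Both input and output tokens count toward these quantities. For example, if your API call used 10 tokens in the message input and you received 20 tokens in the message output, you would be billed for 30 tokens.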
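
The billing arithmetic can be written out as a short sketch. Both function names are hypothetical, and the per-1,000-token price is a placeholder argument rather than an actual OpenAI price.

```python
def billed_tokens(prompt_tokens, completion_tokens):
    """Input and output tokens are both billed."""
    return prompt_tokens + completion_tokens


def estimated_cost(total_tokens, price_per_1k_tokens):
    """Cost estimate; `price_per_1k_tokens` must come from current pricing."""
    return total_tokens / 1000 * price_per_1k_tokens


total = billed_tokens(prompt_tokens=10, completion_tokens=20)
print(total)  # 30
```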

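To see how many tokens are used by an API call, check the usage field in the API response (e.g., response['usage']['total_tokens']).

Chat models like gpt-3.5-turbo and gpt-4 use tokens in the same way as other models, but because of their message-based formatting, it is more difficult to count how many tokens a conversation will use.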

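Counting tokens for chat API calls

Below is an example function for counting tokens for messages passed to gpt-3.5-turbo-0301.

The exact way that messages are converted into tokens may change from model to model. So when future model versions are released, the answers returned by this function may be only approximate.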

import tiktoken


def num_tokens_from_messages(messages, model="gpt-3.5-turbo-0301"):
    """Returns the number of tokens used by a list of messages."""
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        # Unknown model name: fall back to the encoding used by chat models
        encoding = tiktoken.get_encoding("cl100k_base")
    if model == "gpt-3.5-turbo-0301":  # note: future models may deviate from this
        num_tokens = 0
        for message in messages:
            num_tokens += 4  # every message follows <im_start>{role/name}\n{content}<im_end>\n
            for key, value in message.items():
                num_tokens += len(encoding.encode(value))
                if key == "name":  # if there's a name, the role is omitted
                    num_tokens += -1  # role is always required and always 1 token
        num_tokens += 2  # every reply is primed with <im_start>assistant
        return num_tokens
    else:
        raise NotImplementedError(f"""num_tokens_from_messages() is not presently implemented for model {model}.
  See https://github.com/openai/openai-python/blob/main/chatml.md for information on how messages are converted to tokens.""")

Next, create a message and pass it to the function defined above to see the token count; this should match the value returned by the API usage parameter:

messages = [
  {"role": "system", "content": "You are a helpful, pattern-following assistant that translates corporate jargon into plain English."},
  {"role": "system", "name":"example_user", "content": "New synergies will help drive top-line growth."},
  {"role": "system", "name": "example_assistant", "content": "Things working well together will increase revenue."},
  {"role": "system", "name":"example_user", "content": "Let's circle back when we have more bandwidth to touch base on opportunities for increased leverage."},
  {"role": "system", "name": "example_assistant", "content": "Let's talk later when we're less busy about how to do better."},
  {"role": "user", "content": "This late pivot means we don't have time to boil the ocean for the client deliverable."},
]

model = "gpt-3.5-turbo-0301"

print(f"{num_tokens_from_messages(messages, model)} prompt tokens counted.")
# Should show ~126 total_tokens

To confirm that the number generated by our function above is the same as what the API returns, create a new Chat Completion:

# example token count from the OpenAI API
import openai


response = openai.ChatCompletion.create(
    model=model,
    messages=messages,
    temperature=0,
)

print(f'{response["usage"]["prompt_tokens"]} prompt tokens used.')

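To see how many tokens are in a text string without making an API call, use OpenAI's tiktoken Python library.

Each message passed to the API consumes the number of tokens in the content, role, and other fields, plus a few extra for behind-the-scenes formatting. This may change slightly in the future.

If a conversation has too many tokens to fit within a model's maximum limit (e.g., more than 4096 tokens for gpt-3.5-turbo), you will have to truncate, omit, or otherwise shrink your text until it fits. Beware that if a message is removed from the messages input, the model will lose all knowledge of it.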
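
One simple way to shrink a conversation is to drop the oldest non-system messages until it fits. The sketch below is only one possible strategy, not an official recommendation: `trim_to_fit` is a hypothetical helper, and `rough_count` is a crude word-based stand-in for a real token counter such as a tiktoken-based function.

```python
def trim_to_fit(messages, max_tokens, count_tokens):
    """Drop the oldest non-system messages until the conversation fits."""
    trimmed = list(messages)  # work on a copy; the caller's list is untouched
    while count_tokens(trimmed) > max_tokens:
        for i, message in enumerate(trimmed):
            if message["role"] != "system":
                del trimmed[i]  # the model loses all knowledge of this message
                break
        else:
            # Only system messages remain; nothing more can be dropped.
            break
    return trimmed


def rough_count(messages):
    # Stand-in for a real tokenizer: one "token" per whitespace-separated word.
    return sum(len(m["content"].split()) for m in messages)


conversation = [
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "Tell me about the 2020 World Series."},
    {"role": "assistant", "content": "The Dodgers won it."},
    {"role": "user", "content": "Where was it played?"},
]
shortened = trim_to_fit(conversation, max_tokens=10, count_tokens=rough_count)
```

Passing the counter in as an argument keeps the trimming logic independent of any particular tokenizer.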

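Note too that very long conversations are more likely to receive incomplete replies. For example, a gpt-3.5-turbo conversation that is 4090 tokens long will have its reply cut off after just 6 tokens.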
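
The arithmetic behind that example is simply the limit minus the prompt size. A minimal sketch, assuming the 4096-token limit stated above (the function name is hypothetical):

```python
CONTEXT_LIMIT = 4096  # limit stated above for gpt-3.5-turbo


def tokens_left_for_reply(prompt_tokens, context_limit=CONTEXT_LIMIT):
    """Tokens the model can still generate before hitting the limit."""
    return max(context_limit - prompt_tokens, 0)


print(tokens_left_for_reply(4090))  # 6
```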

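Instructing chat models

Best practices for instructing models may change from model version to version. The advice that follows applies to gpt-3.5-turbo-0301 and may not apply to future models.

Many conversations begin with a system message to gently instruct the assistant. For example, here is one of the system messages used for ChatGPT: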

You are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible. Knowledge cutoff: {knowledge_cutoff} Current date: {current_date}

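In general, gpt-3.5-turbo-0301 does not pay strong attention to the system message, and therefore important instructions are often better placed in a user message.

If the model isn't generating the output you want, feel free to iterate and experiment with potential improvements. You can try approaches like:

Make your instruction more explicit
Specify the format you want the answer in
Ask the model to think step by step or debate pros and cons before settling on an answer

For more prompt engineering ideas, read the OpenAI Cookbook guide on techniques to improve reliability.

Beyond the system message, the temperature and max tokens are two of many options developers have to influence the output of the chat models. For temperature, higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. In the case of max tokens, if you want to limit a response to a certain length, max tokens can be set to an arbitrary number. This may cause issues: for example, if you set the max tokens value to 5, the output will be cut off and the result will not make sense to users.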
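
As an illustration of these two options, the sketch below assembles the keyword arguments for a chat request. `build_chat_request` is a hypothetical helper, not a library function, and the 0 to 2 bound on temperature is an assumption based on the API reference rather than something stated in this guide.

```python
def build_chat_request(messages, temperature=0.2, max_tokens=None, model="gpt-3.5-turbo"):
    """Assemble keyword arguments for a chat completion call."""
    # Assumed valid range for temperature; check the API reference.
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    request = {"model": model, "messages": messages, "temperature": temperature}
    if max_tokens is not None:
        if max_tokens < 1:
            raise ValueError("max_tokens must be at least 1")
        # A very small value (such as 5) will cut replies off mid-sentence.
        request["max_tokens"] = max_tokens
    return request


request = build_chat_request(
    [{"role": "user", "content": "Suggest a name for a bakery."}],
    temperature=0.8,  # more random output
    max_tokens=50,    # cap the reply length
)
# With OpenAI Python v0.27.0 this could be sent as:
#     openai.ChatCompletion.create(**request)
```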

Chat vs Completions

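Because gpt-3.5-turbo performs at a similar capability to text-davinci-003 but at 10% of the price per token, we recommend gpt-3.5-turbo for most use cases.

For many developers, the transition is as simple as rewriting and retesting a prompt.

For example, if you translated English to French with the following completions prompt: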

Translate the following English text to French: "{text}"

An equivalent chat conversation could look like:

[
  {"role": "system", "content": "You are a helpful assistant that translates English to French."},
  {"role": "user", "content": 'Translate the following English text to French: "{text}"'}
]

Or even just the user message:

[
  {"role": "user", "content": 'Translate the following English text to French: "{text}"'}
]
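
The conversion from a completions prompt to chat messages can be sketched as a small wrapper. The helper name `prompt_to_messages` and the sample text are assumptions for illustration only.

```python
PROMPT_TEMPLATE = 'Translate the following English text to French: "{text}"'


def prompt_to_messages(prompt, system=None):
    """Wrap a completions-style prompt as a list of chat messages."""
    messages = []
    if system is not None:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return messages


prompt = PROMPT_TEMPLATE.format(text="Good morning")

# Variant with a system message.
with_system = prompt_to_messages(
    prompt, system="You are a helpful assistant that translates English to French."
)
# Variant with only the user message.
user_only = prompt_to_messages(prompt)
```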

FAQ

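Is fine-tuning available for gpt-3.5-turbo?

No. As of March 1, 2023, you can only fine-tune base GPT-3 models.

Do you store the data that is passed into the API?

As of March 1, 2023, we retain your API data for 30 days but no longer use your data sent via the API to improve our models.

Adding a moderation layer

If you want to add a moderation layer to the outputs of the Chat API, you can follow our moderation guide to prevent content that violates OpenAI's usage policies from being shown.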
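
A moderation layer can be sketched as a check that runs on the model's reply before it is displayed. The helper names below are hypothetical, and the response shape (a `results` list whose items carry a `flagged` boolean) is an assumption based on the moderation endpoint; the sketch takes the moderation call as an argument so it runs without network access.

```python
def is_flagged(moderation_response):
    """True if any result in a moderation response is flagged."""
    return any(result["flagged"] for result in moderation_response["results"])


def safe_to_display(text, moderate):
    """`moderate` is any callable that returns a moderation response for `text`."""
    return not is_flagged(moderate(text))


def fake_moderate(text):
    # Stand-in for a real moderation call, used here only for demonstration.
    return {"results": [{"flagged": "forbidden" in text}]}


print(safe_to_display("Hello there", fake_moderate))
```

With OpenAI Python v0.27.0, `moderate` could plausibly be `lambda text: openai.Moderation.create(input=text)`; verify the call and response fields against the moderation guide before relying on them.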
