Spaces:

TGSAI
/

Tutorials

Runtime error

File size: 9,707 Bytes

1e43666

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "ae5bcee9-6588-4d29-bbb9-6fb351ef6630",
   "metadata": {},
   "source": [
    "# L1 Language Models, the Chat Format and Tokens"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0c797991-8486-4d79-8c1d-5dc0c1289c2f",
   "metadata": {},
   "source": [
    "## Setup\n",
    "#### Load the API key and relevant Python libaries.\n",
    "In this course, we've provided some code that loads the OpenAI API key for you."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "19cd4e96",
   "metadata": {
    "height": 132
   },
   "outputs": [],
   "source": [
    "import os\n",
    "import openai\n",
    "import tiktoken\n",
    "from dotenv import load_dotenv, find_dotenv\n",
    "_ = load_dotenv(find_dotenv()) # read local .env file\n",
    "\n",
    "openai.api_key  = os.environ['OPENAI_API_KEY']"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "47ba0938-7ca5-46c4-a9d1-b55708d4dc7c",
   "metadata": {},
   "source": [
    "#### helper function\n",
    "This may look familiar if you took the earlier course \"ChatGPT Prompt Engineering for Developers\" Course"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1ed96988",
   "metadata": {
    "height": 149
   },
   "outputs": [],
   "source": [
    "def get_completion(prompt, model=\"gpt-3.5-turbo\"):\n",
    "    messages = [{\"role\": \"user\", \"content\": prompt}]\n",
    "    response = openai.ChatCompletion.create(\n",
    "        model=model,\n",
    "        messages=messages,\n",
    "        temperature=0,\n",
    "    )\n",
    "    return response.choices[0].message[\"content\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fe10a390-2461-447d-bf8b-8498db404c44",
   "metadata": {},
   "source": [
    "## Prompt the model and get a completion"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e1cc57b2",
   "metadata": {
    "height": 45
   },
   "outputs": [],
   "source": [
    "response = get_completion(\"What is the capital of France?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "76774108",
   "metadata": {
    "height": 30
   },
   "outputs": [],
   "source": [
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b83d4e38-3e3c-4c5a-a949-040a27f29d63",
   "metadata": {},
   "source": [
    "## Tokens"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cc2d9e40",
   "metadata": {
    "height": 64
   },
   "outputs": [],
   "source": [
    "response = get_completion(\"Take the letters in lollipop \\\n",
    "and reverse them\")\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9d2b14d0-749d-4a79-9812-7b00ace9ae6f",
   "metadata": {},
   "source": [
    "\"lollipop\" in reverse should be \"popillol\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "37cab84f",
   "metadata": {
    "height": 47
   },
   "outputs": [],
   "source": [
    "response = get_completion(\"\"\"Take the letters in \\\n",
    "l-o-l-l-i-p-o-p and reverse them\"\"\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1577c561",
   "metadata": {
    "height": 30
   },
   "outputs": [],
   "source": [
    "response"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c8b88940-d3ab-4c00-b5c0-31531deaacbd",
   "metadata": {},
   "source": [
    "## Helper function (chat format)\n",
    "Here's the helper function we'll use in this course."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8f89efad",
   "metadata": {
    "height": 215
   },
   "outputs": [],
   "source": [
    "def get_completion_from_messages(messages, \n",
    "                                 model=\"gpt-3.5-turbo\", \n",
    "                                 temperature=0, \n",
    "                                 max_tokens=500):\n",
    "    response = openai.ChatCompletion.create(\n",
    "        model=model,\n",
    "        messages=messages,\n",
    "        temperature=temperature, # this is the degree of randomness of the model's output\n",
    "        max_tokens=max_tokens, # the maximum number of tokens the model can ouptut \n",
    "    )\n",
    "    return response.choices[0].message[\"content\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b28c3424",
   "metadata": {
    "height": 198
   },
   "outputs": [],
   "source": [
    "messages =  [  \n",
    "{'role':'system', \n",
    " 'content':\"\"\"You are an assistant who\\\n",
    " responds in the style of Dr Seuss.\"\"\"},    \n",
    "{'role':'user', \n",
    " 'content':\"\"\"write me a very short poem\\\n",
    " about a happy carrot\"\"\"},  \n",
    "] \n",
    "response = get_completion_from_messages(messages, temperature=1)\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "56c6978d",
   "metadata": {
    "height": 198
   },
   "outputs": [],
   "source": [
    "# length\n",
    "messages =  [  \n",
    "{'role':'system',\n",
    " 'content':'All your responses must be \\\n",
    "one sentence long.'},    \n",
    "{'role':'user',\n",
    " 'content':'write me a story about a happy carrot'},  \n",
    "] \n",
    "response = get_completion_from_messages(messages, temperature =1)\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "14fd6331",
   "metadata": {
    "height": 217
   },
   "outputs": [],
   "source": [
    "# combined\n",
    "messages =  [  \n",
    "{'role':'system',\n",
    " 'content':\"\"\"You are an assistant who \\\n",
    "responds in the style of Dr Seuss. \\\n",
    "All your responses must be one sentence long.\"\"\"},    \n",
    "{'role':'user',\n",
    " 'content':\"\"\"write me a story about a happy carrot\"\"\"},\n",
    "] \n",
    "response = get_completion_from_messages(messages, \n",
    "                                        temperature =1)\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "89a70c79",
   "metadata": {
    "height": 385
   },
   "outputs": [],
   "source": [
    "def get_completion_and_token_count(messages, \n",
    "                                   model=\"gpt-3.5-turbo\", \n",
    "                                   temperature=0, \n",
    "                                   max_tokens=500):\n",
    "    \n",
    "    response = openai.ChatCompletion.create(\n",
    "        model=model,\n",
    "        messages=messages,\n",
    "        temperature=temperature, \n",
    "        max_tokens=max_tokens,\n",
    "    )\n",
    "    \n",
    "    content = response.choices[0].message[\"content\"]\n",
    "    \n",
    "    token_dict = {\n",
    "'prompt_tokens':response['usage']['prompt_tokens'],\n",
    "'completion_tokens':response['usage']['completion_tokens'],\n",
    "'total_tokens':response['usage']['total_tokens'],\n",
    "    }\n",
    "\n",
    "    return content, token_dict"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a64cf3c6",
   "metadata": {
    "height": 181
   },
   "outputs": [],
   "source": [
    "messages = [\n",
    "{'role':'system', \n",
    " 'content':\"\"\"You are an assistant who responds\\\n",
    " in the style of Dr Seuss.\"\"\"},    \n",
    "{'role':'user',\n",
    " 'content':\"\"\"write me a very short poem \\ \n",
    " about a happy carrot\"\"\"},  \n",
    "] \n",
    "response, token_dict = get_completion_and_token_count(messages)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cfd8fbd4",
   "metadata": {
    "height": 30
   },
   "outputs": [],
   "source": [
    "print(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "352ad320",
   "metadata": {
    "height": 30
   },
   "outputs": [],
   "source": [
    "print(token_dict)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "65372cdd-d869-4768-947a-0173e7f96335",
   "metadata": {},
   "source": [
    "#### Notes on using the OpenAI API outside of this classroom\n",
    "\n",
    "To install the OpenAI Python library:\n",
    "```\n",
    "!pip install openai\n",
    "```\n",
    "\n",
    "The library needs to be configured with your account's secret key, which is available on the [website](https://platform.openai.com/account/api-keys). \n",
    "\n",
    "You can either set it as the `OPENAI_API_KEY` environment variable before using the library:\n",
    " ```\n",
    " !export OPENAI_API_KEY='sk-...'\n",
    " ```\n",
    "\n",
    "Or, set `openai.api_key` to its value:\n",
    "\n",
    "```\n",
    "import openai\n",
    "openai.api_key = \"sk-...\"\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d8f889c1-f2e4-40a5-bd27-164facb54402",
   "metadata": {},
   "source": [
    "#### A note about the backslash\n",
    "- In the course, we are using a backslash `\\` to make the text fit on the screen without inserting newline '\\n' characters.\n",
    "- GPT-3 isn't really affected whether you insert newline characters or not.  But when working with LLMs in general, you may consider whether newline characters in your prompt may affect the model's performance."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}