{ "cells": [ { "cell_type": "markdown", "id": "ae5bcee9-6588-4d29-bbb9-6fb351ef6630", "metadata": {}, "source": [ "# L1 Language Models, the Chat Format and Tokens" ] }, { "cell_type": "markdown", "id": "0c797991-8486-4d79-8c1d-5dc0c1289c2f", "metadata": {}, "source": [ "## Setup\n", "#### Load the API key and relevant Python libraries.\n", "In this course, we've provided some code that loads the OpenAI API key for you." ] }, { "cell_type": "code", "execution_count": null, "id": "19cd4e96", "metadata": { "height": 132 }, "outputs": [], "source": [ "import os\n", "import openai\n", "import tiktoken\n", "from dotenv import load_dotenv, find_dotenv\n", "_ = load_dotenv(find_dotenv()) # read local .env file\n", "\n", "openai.api_key = os.environ['OPENAI_API_KEY']" ] }, { "cell_type": "markdown", "id": "47ba0938-7ca5-46c4-a9d1-b55708d4dc7c", "metadata": {}, "source": [ "#### Helper function\n", "This may look familiar if you took the earlier course, \"ChatGPT Prompt Engineering for Developers\"." ] }, { "cell_type": "code", "execution_count": null, "id": "1ed96988", "metadata": { "height": 149 }, "outputs": [], "source": [ "def get_completion(prompt, model=\"gpt-3.5-turbo\"):\n", " messages = [{\"role\": \"user\", \"content\": prompt}]\n", " response = openai.ChatCompletion.create(\n", " model=model,\n", " messages=messages,\n", " temperature=0,\n", " )\n", " return response.choices[0].message[\"content\"]" ] }, { "cell_type": "markdown", "id": "fe10a390-2461-447d-bf8b-8498db404c44", "metadata": {}, "source": [ "## Prompt the model and get a completion" ] }, { "cell_type": "code", "execution_count": null, "id": "e1cc57b2", "metadata": { "height": 45 }, "outputs": [], "source": [ "response = get_completion(\"What is the capital of France?\")" ] }, { "cell_type": "code", "execution_count": null, "id": "76774108", "metadata": { "height": 30 }, "outputs": [], "source": [ "print(response)" ] }, { "cell_type": "markdown", "id": "b83d4e38-3e3c-4c5a-a949-040a27f29d63", 
"metadata": {}, "source": [ "## Tokens" ] }, { "cell_type": "code", "execution_count": null, "id": "cc2d9e40", "metadata": { "height": 64 }, "outputs": [], "source": [ "response = get_completion(\"Take the letters in lollipop \\\n", "and reverse them\")\n", "print(response)" ] }, { "cell_type": "markdown", "id": "9d2b14d0-749d-4a79-9812-7b00ace9ae6f", "metadata": {}, "source": [ "\"lollipop\" in reverse should be \"popillol\"" ] }, { "cell_type": "code", "execution_count": null, "id": "37cab84f", "metadata": { "height": 47 }, "outputs": [], "source": [ "response = get_completion(\"\"\"Take the letters in \\\n", "l-o-l-l-i-p-o-p and reverse them\"\"\")" ] }, { "cell_type": "code", "execution_count": null, "id": "1577c561", "metadata": { "height": 30 }, "outputs": [], "source": [ "response" ] }, { "cell_type": "markdown", "id": "c8b88940-d3ab-4c00-b5c0-31531deaacbd", "metadata": {}, "source": [ "## Helper function (chat format)\n", "Here's the helper function we'll use in this course." ] }, { "cell_type": "code", "execution_count": null, "id": "8f89efad", "metadata": { "height": 215 }, "outputs": [], "source": [ "def get_completion_from_messages(messages, \n", " model=\"gpt-3.5-turbo\", \n", " temperature=0, \n", " max_tokens=500):\n", " response = openai.ChatCompletion.create(\n", " model=model,\n", " messages=messages,\n", " temperature=temperature, # this is the degree of randomness of the model's output\n", " max_tokens=max_tokens, # the maximum number of tokens the model can output \n", " )\n", " return response.choices[0].message[\"content\"]" ] }, { "cell_type": "code", "execution_count": null, "id": "b28c3424", "metadata": { "height": 198 }, "outputs": [], "source": [ "messages = [ \n", "{'role':'system', \n", " 'content':\"\"\"You are an assistant who\\\n", " responds in the style of Dr Seuss.\"\"\"}, \n", "{'role':'user', \n", " 'content':\"\"\"write me a very short poem\\\n", " about a happy carrot\"\"\"}, \n", "] \n", "response = 
get_completion_from_messages(messages, temperature=1)\n", "print(response)" ] }, { "cell_type": "code", "execution_count": null, "id": "56c6978d", "metadata": { "height": 198 }, "outputs": [], "source": [ "# length\n", "messages = [ \n", "{'role':'system',\n", " 'content':'All your responses must be \\\n", "one sentence long.'}, \n", "{'role':'user',\n", " 'content':'write me a story about a happy carrot'}, \n", "] \n", "response = get_completion_from_messages(messages, temperature =1)\n", "print(response)" ] }, { "cell_type": "code", "execution_count": null, "id": "14fd6331", "metadata": { "height": 217 }, "outputs": [], "source": [ "# combined\n", "messages = [ \n", "{'role':'system',\n", " 'content':\"\"\"You are an assistant who \\\n", "responds in the style of Dr Seuss. \\\n", "All your responses must be one sentence long.\"\"\"}, \n", "{'role':'user',\n", " 'content':\"\"\"write me a story about a happy carrot\"\"\"},\n", "] \n", "response = get_completion_from_messages(messages, \n", " temperature =1)\n", "print(response)" ] }, { "cell_type": "code", "execution_count": null, "id": "89a70c79", "metadata": { "height": 385 }, "outputs": [], "source": [ "def get_completion_and_token_count(messages, \n", " model=\"gpt-3.5-turbo\", \n", " temperature=0, \n", " max_tokens=500):\n", " \n", " response = openai.ChatCompletion.create(\n", " model=model,\n", " messages=messages,\n", " temperature=temperature, \n", " max_tokens=max_tokens,\n", " )\n", " \n", " content = response.choices[0].message[\"content\"]\n", " \n", " token_dict = {\n", "'prompt_tokens':response['usage']['prompt_tokens'],\n", "'completion_tokens':response['usage']['completion_tokens'],\n", "'total_tokens':response['usage']['total_tokens'],\n", " }\n", "\n", " return content, token_dict" ] }, { "cell_type": "code", "execution_count": null, "id": "a64cf3c6", "metadata": { "height": 181 }, "outputs": [], "source": [ "messages = [\n", "{'role':'system', \n", " 'content':\"\"\"You are an assistant who 
responds\\\n", " in the style of Dr Seuss.\"\"\"}, \n", "{'role':'user',\n", " 'content':\"\"\"write me a very short poem\\\n", " about a happy carrot\"\"\"}, \n", "] \n", "response, token_dict = get_completion_and_token_count(messages)" ] }, { "cell_type": "code", "execution_count": null, "id": "cfd8fbd4", "metadata": { "height": 30 }, "outputs": [], "source": [ "print(response)" ] }, { "cell_type": "code", "execution_count": null, "id": "352ad320", "metadata": { "height": 30 }, "outputs": [], "source": [ "print(token_dict)" ] }, { "cell_type": "markdown", "id": "65372cdd-d869-4768-947a-0173e7f96335", "metadata": {}, "source": [ "#### Notes on using the OpenAI API outside of this classroom\n", "\n", "To install the OpenAI Python library:\n", "```\n", "!pip install openai\n", "```\n", "\n", "The library needs to be configured with your account's secret key, which is available on the [website](https://platform.openai.com/account/api-keys). \n", "\n", "You can either set it as the `OPENAI_API_KEY` environment variable in your shell before starting Python or Jupyter:\n", " ```\n", " export OPENAI_API_KEY='sk-...'\n", " ```\n", "\n", "Or, set `openai.api_key` to its value:\n", "\n", "```\n", "import openai\n", "openai.api_key = \"sk-...\"\n", "```" ] }, { "cell_type": "markdown", "id": "d8f889c1-f2e4-40a5-bd27-164facb54402", "metadata": {}, "source": [ "#### A note about the backslash\n", "- In the course, we are using a backslash `\\` to make the text fit on the screen without inserting newline '\\n' characters.\n", "- GPT-3 isn't really affected by whether you insert newline characters or not. But when working with LLMs in general, consider whether newline characters in your prompt may affect the model's performance." 
] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" } }, "nbformat": 4, "nbformat_minor": 5 }