{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "GtkTGCuh6QZy"
      },
      "source": [
        "# Reinforcement Learning with OpenAI gpt-oss-20b: Teaching an LLM to Play 2048\n",
        "\n",
        "In this tutorial, we'll teach OpenAI's open-source model **gpt-oss 20b** to generate winning strategies for the classic 2048 puzzle game using **reinforcement learning (RL)**. By the end, you'll understand how to:\n",
        "\n",
        "- Connect LLMs to game environments using **OpenEnv**\n",
        "- Design reward functions that guide model behavior\n",
        "- Train models with **GRPO** (Group Relative Policy Optimization)\n",
        "- Prevent \"reward hacking\" with code sandboxing\n",
        "\n",
        "**Requirements:** This notebook runs on a free Tesla T4 Google Colab instance."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hzPgFeIkZn9q"
      },
      "source": [
        "## What is 2048?\n",
        "\n",
        "**2048** is an fun single-player sliding puzzle game created by Gabriele Cirulli in 2014. The game is played on a 4×4 grid where numbered tiles slide in four directions (up, down, left, right). When two tiles with the same number collide, they merge into one tile with their sum. The goal is to create a tile with the value **2048**—though skilled players can continue beyond that!\n",
        "\n",
        "The game requires strategic thinking: random moves quickly lead to a gridlock, while optimal play involves keeping high-value tiles in corners and building systematically.\n",
        "\n",
        "<img src=\"https://upload.wikimedia.org/wikipedia/commons/thumb/f/f9/2048_win.png/500px-2048_win.png\" height=300 />\n",
        "\n",
        "## Our Goal\n",
        "\n",
        "We'll use reinforcement learning to train **OpenAI gpt-oss-20b** to generate Python functions that implement winning 2048 strategies. Rather than playing move-by-move, the model will learn to write *code* that plays the game—a form of \"code generation as policy.\""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "31KIMLJLnHET"
      },
      "source": [
        "## Installation\n",
        "\n",
        "We need two key libraries for this tutorial:\n",
        "\n",
        "1. **[OpenEnv](https://github.com/meta-pytorch/OpenEnv)** - A unified interface to reinforcement learning environments. Traditional RL setups require installing and configuring each environment separately (Gym, OpenSpiel, Atari, etc.). OpenEnv provides a consistent API across all of them. Best of all, OpenEnv environments are available on Hugging Face Spaces, so we can connect to them remotely without any local installation.\n",
        "\n",
        "2. **[Unsloth](https://github.com/unslothai/unsloth)** - An optimized training library that reduces VRAM usage by ~70% through memory-efficient LoRA and gradient checkpointing. This lets us run RL on a free Colab T4 GPU."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "CGoDZwcunHEU"
      },
      "outputs": [],
      "source": [
        "%%capture\n",
        "import os, importlib.util\n",
        "\n",
        "!pip install --upgrade -qqq uv\n",
        "if importlib.util.find_spec(\"torch\") is None or \"COLAB_\" in \"\".join(os.environ.keys()):\n",
        "    try:\n",
        "        import numpy\n",
        "\n",
        "        get_numpy = f\"numpy=={numpy.__version__}\"\n",
        "    except:\n",
        "        get_numpy = \"numpy\"\n",
        "    !uv pip install -qqq \\\n",
        "        \"torch>=2.8.0\" \"triton>=3.4.0\" {get_numpy} torchvision bitsandbytes \"transformers==4.56.2\" trackio \\\n",
        "        \"unsloth_zoo[base] @ git+https://github.com/unslothai/unsloth-zoo\" \\\n",
        "        \"unsloth[base] @ git+https://github.com/unslothai/unsloth\" \\\n",
        "        git+https://github.com/triton-lang/triton.git@0add68262ab0a2e33b84524346cb27cbb2787356#subdirectory=python/triton_kernels\n",
        "elif importlib.util.find_spec(\"unsloth\") is None:\n",
        "    !uv pip install -qqq unsloth trackio\n",
        "\n",
        "!uv pip install --upgrade --no-deps transformers==4.56.2 tokenizers trl==0.22.2 unsloth unsloth_zoo"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "OjifwNNZ7bMx"
      },
      "source": [
        "Next, we install the OpenEnv client and the connector for **OpenSpiel**—DeepMind's collection of game environments used in RL research. OpenSpiel includes implementations of classic games like Chess, Go, and our target: 2048."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "1yzMMSLR7dNj"
      },
      "outputs": [],
      "source": [
        "%%capture\n",
        "!pip install -qqq openenv-core websockets\n",
        "!pip install -qqq git+https://huggingface.co/spaces/openenv/openspiel_env"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "CcLYwLyQLADE"
      },
      "source": [
        "## Loading OpenAI gpt-oss 20b\n",
        "\n",
        "We load the model with several memory optimizations:\n",
        "\n",
        "| Parameter | Value | Description |\n",
        "|-----------|-------|-------------|\n",
        "| `max_seq_length` | 768 | Maximum context length. Increase for longer outputs (uses more VRAM). |\n",
        "| `load_in_4bit` | True | Quantizes weights to 4-bit, dramatically reducing memory usage. |\n",
        "| `lora_rank` | 4 | LoRA adapter rank. Higher = more expressive but slower/more memory. |\n",
        "| `offload_embedding` | True | Moves embeddings to CPU RAM, saving ~1GB VRAM. |"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 543,
          "referenced_widgets": [
            "419e4369b36644d3abec159511a88ad8",
            "eee0d2b6eb6047d8a4001f4999dcdc35",
            "2d91f25a8d0b4dd39dc320b2cf17fc0b",
            "1b84d7dc9567474c8587432d48342b71",
            "f34cadfaf9bb46729aeeff9492ba9026",
            "d257eaa588bd41fb947f81d306fe05cd",
            "b09745899e1446129f397d822a21fc99",
            "88aa58551344410a91b07af480d6ab53",
            "4f38070c09354406928c5e7be6cff3fd",
            "665f7b7e5496432985ed9f49829f5834",
            "d16dc921c6554bc6b77abcb423721cc8",
            "baf3db00d28f4c849bb6a6739e908c62",
            "f4f2fb240a12406ab5e709be33b93683",
            "4ca56d8605864a438290152268bfc686",
            "de306a50594f463ba9c78c713bc33241",
            "0f09a00375bd4d4c8863fc8fb7d64d61",
            "89e3ad0d517f44d084df0d8a3ed40703",
            "3f589cf3cb804ed29ba322e4fa10c511",
            "2612008cf36949d9a6618a01ac817618",
            "3e3ce50c437a412f9cc2bedb697648f7",
            "fa5c42218fbb44378bef71551c3383e0",
            "d82ab09313cc482f9b9b45192f489825",
            "4e63263fc27b4f07b4f6abffba082379",
            "eb13ad96565a44519e7cab9ce9483b90",
            "0b851acfd32047bfb6bae17d43ccfcb1",
            "d4c72002b5fe44d3ad7ae67f5536c889",
            "a1e042e5b8ad4b028cc28ac54924207d",
            "6d77c8dcb28240f7b860571d11f8b9af",
            "4f29591af0d64d398515479b032a1b3d",
            "667192c07a7340a9a72ed25648a0be64",
            "aa714b1f70e8495b9299b51d6ac4c3c4",
            "06e420cfa2974f7d8d7ec4b83f064a6e",
            "3566e9058ebc45498b42e521f2314365",
            "6af6de3285684e76b320571b44af5fc1",
            "f7cc1614e13d4d22b83f787b20407a5f",
            "072060f8bdb54a15baf838f67d376d99",
            "5371b2fad7b04f97bd8f4671d844d2cb",
            "b10d25fb43eb42198abf71c4e326bcff",
            "7ce0f239a8514923b383f738bc0c9899",
            "4bc16bd2399a43fe85967131af7f846a",
            "29d98af49dd3412f84f5843b937029d1",
            "93176d6b63284b4e832e0f028be90655",
            "9c2371c32afe46519ee53427ea42bc9a",
            "f9d4672fb86b4b4e9c3e9068dc479e5b",
            "297bdff1add5414893319b185cb15da6",
            "362947c7e102474cbf560f51e713bdff",
            "7f63ff3e87aa4b86a4f8e64785d1d34c",
            "cca7d922b85449a1b0c5c025b65fba10",
            "5520c59f24cd414092bf5f952425611d",
            "b1dcd2f908344949af01dab5a244e458",
            "9e6f46ddb61943f4a168d70a209a7ddc",
            "ae79b9a378ee4218beddf77ac9af6de7",
            "e6d63cf58647443b9749195ec0579d87",
            "e7ca6b7ba2094872a888da5511e2bb49",
            "7c7d40163ecc4dae8a2c54af21de4661",
            "4dc823fcd0dd4eaaa2e8aaff0daa9ad1",
            "1b26eac03fed4c8784bd611474cf4607",
            "a24e2cfbef794d35a0e22753352caa15",
            "8368e4420e814c6f9be30994b69c66ee",
            "eb8c197ce1fc41f78531e5e73ae14a89",
            "b72fc22310714bf0bf6ff4021db5aba8",
            "d2a7fa9dddc240e29330297870159c59",
            "0e49035e0c3a4ee4ab477b475e74ef36",
            "08d8f0cfd7614900a9c9bba888619749",
            "d1cea390ecaa4583a698278fa3a438c9",
            "5483233a9b224f3c8eb9337e4ed82314",
            "cbed0dab2d7540f697eacec5a33e1061",
            "dc9b2549ff834880ac19b578adfee5a5",
            "48f65e25acd84b0cb582de66753215da",
            "7c52841c7e714173bfd526b0a625bc9d",
            "bd8f575034c041be93ff20f63388de2d",
            "212c17e3829c4accb30265a3d9ee73dc",
            "1daa373c890b4ae0a9cf6a3ec325693c",
            "383e9b4b74e34cae96c1f46f41591b82",
            "46cb3593b37c4cb9b4bac422bc5809c8",
            "ae9728c04b974af29460ce5179a9edba",
            "754f2452fbe14c7098215ec810ffbf14",
            "3c2e00b9d20a4c9ea5b850b776752fd2",
            "7c29e5f727ec4b608c06d0487a2c9e53",
            "4beae5415d0647e2898a83313a08ea94",
            "62cd4ccc430549e7a2c156222d47ebeb",
            "4691273248984fdea29d83ec0a246cd9",
            "ae3819cd042f43babef24199081bb97f",
            "52e9a0ea76df4fd8822ccadadaadb501",
            "a6a122064bc340868b5e0e11afa9c42f",
            "3371a1da5d0e426bb6cc02a6c383dd6f",
            "bedc42e908454f1494768194b43b2964",
            "41ee8407344d402997b0574e0ae26c77",
            "5d1e2fdbf7a2409abbafb63e4b160668",
            "a4c19dc98fe943e09a26a60d23c8ff01",
            "8f5799610318490492ff5aba76be3d1a",
            "e78534b135d2465c83e3be614b71c8a4",
            "2f57c8e713b94c8692837b4e17c9e983",
            "f8d2164046fb46a298e2d80628808cb3",
            "af23432e17664cd6852496e43f9de0cf",
            "1055200185004ea2a95a05eb51232501",
            "eb1068efb4364ee2893c4d21c58f38db",
            "c4d592366499414a99f19bce7f0bd665",
            "1ece5fe9597a4e3db2dd96c38995705c",
            "050567dccb47456aaac65d118ac60a6b",
            "ea2f3ca562444c46b50596a1d7cf9030",
            "0b7287d482cc44dbb406d71f23f1aea0",
            "474c7fe6ef4b430d9826171eded2ebf1",
            "12a877304ddf45e49bbdfe056394c3d6",
            "56b30cc150924879abfc138427f4ca98",
            "02c88a690a384ae183c233b6927aaf57",
            "d541fd264d7e4b2691e8efbd993e6ae7",
            "cbc7bbfe983f4e6696f8dd6b37c50543",
            "f290b17e3cd9487b9b97d54d6cec9efc",
            "18d4cc3fd8ee4e3fbf27c574fd467f20",
            "941d8cbf8188402eb603c18b6e979035",
            "8708226010324a83b4f7900c3958d430",
            "fc12c97d659640b3b2b2b48ad7f17e5b",
            "aa22cb2d1dcf4001bd3720b075318906",
            "bb358c5523814376a2d4690f88f20b74",
            "4ba91b8d008e483d89f16f204326f6b6",
            "31afdb187d2a44d8b3a101fa18543c13",
            "28300c16023d4ad9a59784baea2f57aa",
            "004f28173c3b4fb6a9f8c2068f5db81f",
            "0b8fa4ff186a4bfeac18cf6d676e99df",
            "ccb581e230604bb690015eb685e4b8e1"
          ]
        },
        "id": "DkIvEkIIkEyB",
        "outputId": "feb2ac4d-19dc-4e55-ebbe-e28e0ef721c1"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.\n",
            "🦥 Unsloth Zoo will now patch everything to make training faster!\n",
            "==((====))==  Unsloth 2026.1.3: Fast Gpt_Oss patching. Transformers: 4.56.2.\n",
            "   \\\\   /|    Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux.\n",
            "O^O/ \\_/ \\    Torch: 2.9.0+cu126. CUDA: 7.5. CUDA Toolkit: 12.6. Triton: 3.5.0\n",
            "\\        /    Bfloat16 = FALSE. FA [Xformers = None. FA2 = False]\n",
            " \"-____-\"     Free license: http://github.com/unslothai/unsloth\n",
            "Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!\n",
            "Unsloth: Using float16 precision for gpt_oss won't work! Using float32.\n"
          ]
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "419e4369b36644d3abec159511a88ad8",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "model.safetensors.index.json: 0.00B [00:00, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "baf3db00d28f4c849bb6a6739e908c62",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "model-00001-of-00004.safetensors:   0%|          | 0.00/4.00G [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "4e63263fc27b4f07b4f6abffba082379",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "model-00002-of-00004.safetensors:   0%|          | 0.00/4.00G [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "6af6de3285684e76b320571b44af5fc1",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "model-00003-of-00004.safetensors:   0%|          | 0.00/3.37G [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "297bdff1add5414893319b185cb15da6",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "model-00004-of-00004.safetensors:   0%|          | 0.00/1.16G [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "4dc823fcd0dd4eaaa2e8aaff0daa9ad1",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "cbed0dab2d7540f697eacec5a33e1061",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "generation_config.json:   0%|          | 0.00/165 [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Unsloth: Offloading embeddings to RAM to save 1.08 GB.\n"
          ]
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "3c2e00b9d20a4c9ea5b850b776752fd2",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "tokenizer_config.json: 0.00B [00:00, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "5d1e2fdbf7a2409abbafb63e4b160668",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "tokenizer.json:   0%|          | 0.00/27.9M [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "050567dccb47456aaac65d118ac60a6b",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "special_tokens_map.json:   0%|          | 0.00/446 [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "941d8cbf8188402eb603c18b6e979035",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "chat_template.jinja: 0.00B [00:00, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ],
      "source": [
        "import os\n",
        "from unsloth import FastLanguageModel\n",
        "import torch\n",
        "\n",
        "max_seq_length = 768  # Can increase for longer RL output\n",
        "lora_rank = 4  # Larger rank = smarter, but slower\n",
        "model, tokenizer = FastLanguageModel.from_pretrained(\n",
        "    model_name=\"openai/gpt-oss-20b\",  # Official OpenAI weights\n",
        "    load_in_4bit=True,\n",
        "    max_seq_length=max_seq_length,\n",
        "    offload_embedding=True,  # Offload embeddings to save more VRAM\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TfeUs-lQJDSq"
      },
      "source": [
        "### Applying LoRA for Efficient Training\n",
        "\n",
        "[LoRA (Low-Rank Adaptation)](https://hf.co/papers/2106.09685) lets us fine-tune large models by adding small trainable adapters (1-5% of original parameters) instead of updating all weights. This reduces memory usage by ~60% while maintaining accuracy.\n",
        "\n",
        "We target the attention and feedforward layers—these are where most of the model's \"reasoning\" happens:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "8rGa-o3HJCo1",
        "outputId": "9931e1a8-9990-4e12-f114-f8a3d0ebb2a4"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Unsloth: Making `model.base_model.model.model` require gradients\n"
          ]
        }
      ],
      "source": [
        "model = FastLanguageModel.get_peft_model(\n",
        "    model,\n",
        "    r=lora_rank,  # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128\n",
        "    target_modules=[\n",
        "        \"q_proj\",\n",
        "        \"k_proj\",\n",
        "        \"v_proj\",\n",
        "        \"o_proj\",\n",
        "        \"gate_proj\",\n",
        "        \"up_proj\",\n",
        "        \"down_proj\",\n",
        "    ],\n",
        "    lora_alpha=lora_rank * 2,  # *2 speeds up training\n",
        "    use_gradient_checkpointing=\"unsloth\",  # Reduces memory usage\n",
        "    random_state=3407,\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "N0QnO9_YJBOI"
      },
      "source": [
        "## Connecting to the 2048 Game Environment\n",
        "\n",
        "OpenEnv lets us connect to game environments hosted remotely. We'll use a **Hugging Face Space** that runs the OpenSpiel 2048 game server. This architecture has several benefits:\n",
        "\n",
        "- **No local installation** of game dependencies (OpenSpiel can be tricky to build)\n",
        "- **Consistent environment** across different machines\n",
        "- **Scalable** - the same pattern works for more complex environments\n",
        "\n",
        "The Space exposes a WebSocket API that accepts actions and returns game states."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "hQrG81XV87-7"
      },
      "outputs": [],
      "source": [
        "from openspiel_env import OpenSpielEnv\n",
        "from openspiel_env.models import OpenSpielAction, OpenSpielObservation"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-WT0Zu0IcN_r"
      },
      "source": [
        "The [openenv/openspiel_env](https://huggingface.co/spaces/openenv/openspiel_env) Space hosts a running OpenSpiel server configured for 2048. It handles game state management, validates moves, and returns observations after each action. You can also run the server locally for faster iteration:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "zBIye94T8djG"
      },
      "outputs": [],
      "source": [
        "# Connect to OpenSpiel 2048 environment on HuggingFace Spaces\n",
        "# The game is configured server-side via OPENSPIEL_GAME=2048\n",
        "OPENSPIEL_URL = \"https://openenv-openspiel-env.hf.space\"\n",
        "# For local: OPENSPIEL_URL = \"http://localhost:8000\"\n",
        "\n",
        "env = OpenSpielEnv(base_url=OPENSPIEL_URL)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "P3rMiKLl9Ro2"
      },
      "source": [
        "Let's see how the current 2048 game state looks like:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "uBPx0Hho9Xi1",
        "outputId": "7e7fae62-6d83-4c5e-b8c5-b49a1ccc9c53"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "OpenSpielObservation(done=False, reward=None, metadata={}, info_state=[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0], legal_actions=[0, 1, 2], game_phase='initial', current_player_id=0, opponent_last_action=None)"
            ]
          },
          "execution_count": 9,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "result = env.reset()\n",
        "current_state = result.observation\n",
        "current_state"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "4Qz1tRVTAii8"
      },
      "source": [
        "### Decoding the Game State\n",
        "\n",
        "OpenSpiel's 2048 `info_state` uses a compact encoding—not raw tile values. The first 16 elements represent the 4×4 board positions, with each value being **log₂ of the tile** (so 1 = 2¹ = 2, 2 = 2² = 4, etc.). Values of 0 represent empty cells.\n",
        "\n",
        "We need to:\n",
        "1. Extract only the first 16 elements (the board)\n",
        "2. Reshape into a 4×4 grid\n",
        "3. Convert from log₂ encoding to actual tile values"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "7PVeoKW2AmKr",
        "outputId": "5689ae76-cd68-48f5-e7e7-e05494109a63"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "([[0, 1, 0, 0, 0, 0, 0, 0],\n",
              "  [0, 0, 0, 0, 0, 0, 0],\n",
              "  [0, 0, 0, 0, 0, 0, 0],\n",
              "  [0, 0, 0, 0, 0, 0, 0],\n",
              "  [0, 0, 0, 0, 0, 0, 0],\n",
              "  [0, 0, 0, 0, 0, 0, 0],\n",
              "  [0, 0, 0, 0, 1, 0, 0]],\n",
              " 7)"
            ]
          },
          "execution_count": 10,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "import numpy as np\n",
        "\n",
        "# 2048 game constants\n",
        "BOARD_SIZE = 4\n",
        "BOARD_CELLS = BOARD_SIZE * BOARD_SIZE  # 16\n",
        "WIN_TILE = 2048  # The target tile value to win\n",
        "\n",
        "\n",
        "def convert_to_board(current_state):\n",
        "    \"\"\"\n",
        "    Convert OpenSpiel 2048 observation to a 4×4 board of tile values.\n",
        "\n",
        "    OpenSpiel encodes tiles as log₂(value), so we convert back:\n",
        "    - 0 → 0 (empty)\n",
        "    - 1 → 2 (2^1)\n",
        "    - 2 → 4 (2^2)\n",
        "    - etc.\n",
        "    \"\"\"\n",
        "    # Extract only the first 16 elements (the board state)\n",
        "    raw_board = current_state.info_state[:BOARD_CELLS]\n",
        "\n",
        "    # Convert from log₂ encoding to actual tile values\n",
        "    # 0 stays 0 (empty), otherwise 2^value\n",
        "    tiles = [int(2**val) if val > 0 else 0 for val in raw_board]\n",
        "\n",
        "    # Reshape into 4×4 grid\n",
        "    board = [tiles[i * BOARD_SIZE : (i + 1) * BOARD_SIZE] for i in range(BOARD_SIZE)]\n",
        "    return board, BOARD_SIZE\n",
        "\n",
        "\n",
        "convert_to_board(current_state)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hCS56yu29dsG"
      },
      "source": [
        "We also want to pretty print the game board! This is not entirely necessary, but it helps us visualize the game state and learn from the process."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "cellView": "form",
        "id": "D9CI4jtgL5mw"
      },
      "outputs": [],
      "source": [
        "# @title (Collapsible) 2048 Game Renderer\n",
        "def render_board(obs, colors: bool = True, border: bool = True, dot_for_zero: bool = True) -> str:\n",
        "    \"\"\"\n",
        "    Pretty-print the board with colors that scale from 0 up to self.target.\n",
        "    Uses ANSI 256-color codes (works in most terminals). Set colors=False to disable.\n",
        "    \"\"\"\n",
        "    import math\n",
        "\n",
        "    b, size = convert_to_board(obs)\n",
        "    mx = max((max(row) for row in b), default=0)\n",
        "    cell_w = max(3, len(str(mx)))\n",
        "\n",
        "    RESET = \"\\x1b[0m\"\n",
        "\n",
        "    # A smooth-ish gradient from cool → warm\n",
        "    # (blue/cyan/green → yellow/orange/red). Tweak or expand as you like.\n",
        "    GRAD = [33, 39, 45, 51, 50, 49, 48, 47, 46, 82, 118, 154, 190, 226, 220, 214, 208, 202, 196]\n",
        "    ZERO_FG = 239  # dim gray\n",
        "\n",
        "    def color_code(v: int) -> str:\n",
        "        if not colors:\n",
        "            return \"\"\n",
        "        if v == 0:\n",
        "            return f\"\\x1b[38;5;{ZERO_FG}m\"\n",
        "        # Normalize by exponent relative to target: r in [0,1]\n",
        "        t = max(2, WIN_TILE)  # safety; avoid log2(1)\n",
        "        # Guard: if v is not a power of two or is <1, handle gracefully\n",
        "        try:\n",
        "            r = max(0.0, min(1.0, math.log2(v) / math.log2(t)))\n",
        "        except ValueError:\n",
        "            r = 0.0\n",
        "        idx = int(round(r * (len(GRAD) - 1)))\n",
        "        return f\"\\x1b[38;5;{GRAD[idx]}m\"\n",
        "\n",
        "    def fmt(v: int) -> str:\n",
        "        s = \".\" if (v == 0 and dot_for_zero) else str(v)\n",
        "        s = s.rjust(cell_w)\n",
        "        return color_code(v) + s + (RESET if colors else \"\")\n",
        "\n",
        "    def hline(left: str, mid: str, right: str) -> str:\n",
        "        return left + mid.join(\"─\" * cell_w for _ in range(size)) + right\n",
        "\n",
        "    rows = []\n",
        "    if border:\n",
        "        rows.append(hline(\"┌\", \"┬\", \"┐\"))\n",
        "    for r in range(size):\n",
        "        content = \"│\".join(fmt(v) for v in b[r])\n",
        "        rows.append((\"│\" + content + \"│\") if border else content)\n",
        "        if border:\n",
        "            rows.append(\n",
        "                hline(\"└\" if r == size - 1 else \"├\", \"┴\" if r == size - 1 else \"┼\", \"┘\" if r == size - 1 else \"┤\")\n",
        "            )\n",
        "    return \"\\n\".join(rows)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "DAJE2LUo9oRR",
        "outputId": "4b15a5e3-26fb-4ce0-a4b5-0261ab7a5999"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n"
          ]
        }
      ],
      "source": [
        "print(render_board(current_state))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "0AhUa4hW-Dji"
      },
      "source": [
        "We can see the `legal_actions` ie what you can take as `[0, 1, 2, 3]` Let's try doing the action `0`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "b-gSgthFI_wq",
        "outputId": "b490416b-e52e-4e13-d761-23c4efecf08a"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n"
          ]
        }
      ],
      "source": [
        "action = OpenSpielAction(action_id=0, game_name=\"2048\")\n",
        "result = env.step(action)\n",
        "current_state = result.observation\n",
        "print(render_board(current_state))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "lPSNb8-A-iPn"
      },
      "source": [
        "So it looks like `0` is a move up action! Let's try `1`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "IUel11Tc-oLB",
        "outputId": "b67365e4-d760-4d49-dafc-57f4b397c98c"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n"
          ]
        }
      ],
      "source": [
        "action = OpenSpielAction(action_id=1, game_name=\"2048\")\n",
        "result = env.step(action)\n",
        "current_state = result.observation\n",
        "print(render_board(current_state))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "nUlOshVe-qNL"
      },
      "source": [
        "`1` is a move right action. And `2`:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "XU09-KA3-sqs",
        "outputId": "f9089a3f-f564-418f-b2bd-512103f13135"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n"
          ]
        }
      ],
      "source": [
        "action = OpenSpielAction(action_id=2, game_name=\"2048\")\n",
        "result = env.step(action)\n",
        "current_state = result.observation\n",
        "print(render_board(current_state))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "X2r7Zqw9-u-d"
      },
      "source": [
        "`2` is a move down. And I guess `3` is just move left!"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "pFgspqn6-zd2",
        "outputId": "65e800d7-9d78-420e-b4fa-232f4c919481"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n"
          ]
        }
      ],
      "source": [
        "action = OpenSpielAction(action_id=3, game_name=\"2048\")\n",
        "result = env.step(action)\n",
        "current_state = result.observation\n",
        "print(render_board(current_state))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "RJP4TsDq-2ft"
      },
      "source": [
        "We can also print the game status which indicates if no more moves are possible, and also the possible actions you can take!"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "MEa2ngmrvfNm",
        "outputId": "48462391-11d8-4c55-e5c4-f05ec1df6dfe"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "False\n",
            "[0, 1, 2]\n"
          ]
        }
      ],
      "source": [
        "print(current_state.done)\n",
        "print(current_state.legal_actions)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "VR6czU96cpxf"
      },
      "source": [
        "## RL Environment Setup: The Strategy Executor\n",
        "\n",
        "For reinforcement learning, we need a way to evaluate generated strategies. The key insight is that **our model doesn't play 2048 directly**—instead, it writes Python code that plays the game. We then execute that code and measure how well it performs.\n",
        "\n",
        "The `execute_strategy` function:\n",
        "1. Takes a generated Python function (the \"strategy\")\n",
        "2. Runs the 2048 game loop, calling the strategy for each move\n",
        "3. Returns how many steps the game lasted and whether it reached 2048\n",
        "\n",
        "**Timeout protection**: LLM-generated code might contain infinite loops or be very slow. We wrap execution with a 2-second timeout to ensure the RL training loop doesn't hang."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "tdgjnf-8z_kr"
      },
      "outputs": [],
      "source": [
        "from typing import Callable\n",
        "from unsloth import execute_with_time_limit\n",
        "import itertools\n",
        "\n",
        "\n",
        "def has_won(board) -> bool:\n",
        "    \"\"\"Check if the board contains a winning tile (2048 or higher).\"\"\"\n",
        "    max_tile = max(itertools.chain.from_iterable(board))\n",
        "    return max_tile >= WIN_TILE\n",
        "\n",
        "\n",
        "def _execute_strategy(strategy, current_state: OpenSpielObservation):\n",
        "    \"\"\"Execute a strategy function on the 2048 game until completion or invalid move.\"\"\"\n",
        "    assert callable(strategy)\n",
        "\n",
        "    steps = 0\n",
        "    total_reward = 0\n",
        "    board = None\n",
        "\n",
        "    while not current_state.done:\n",
        "        board, _ = convert_to_board(current_state)\n",
        "        action = strategy(board)\n",
        "        try:\n",
        "            action = int(action)\n",
        "        except:\n",
        "            return steps, False\n",
        "        steps += 1\n",
        "\n",
        "        # Invalid action - return current win status\n",
        "        if type(action) is not int or action not in current_state.legal_actions:\n",
        "            return steps, has_won(board) if board else False\n",
        "\n",
        "        action = OpenSpielAction(action_id=action, game_name=\"2048\")\n",
        "        result = env.step(action)\n",
        "        current_state = result.observation\n",
        "        if result.reward is not None:\n",
        "            total_reward += result.reward\n",
        "\n",
        "    # Game ended - check final board for win\n",
        "    if board is None:\n",
        "        board, _ = convert_to_board(current_state)\n",
        "    return steps, has_won(board)\n",
        "\n",
        "\n",
        "@execute_with_time_limit(2)\n",
        "def execute_strategy(strategy: Callable, current_state: OpenSpielObservation):\n",
        "    return _execute_strategy(strategy, current_state)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ywh0HizI9ayE"
      },
      "source": [
        "Let's make a generic strategy to just hit `3`. We should expect this generic strategy to fail:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "5bkhqoZc0IO8",
        "outputId": "c773dddf-7a5b-40fa-b061-33f407f8b804"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "(1, False)"
            ]
          },
          "execution_count": 19,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "def always_move_left(board):\n",
        "    return 3\n",
        "\n",
        "\n",
        "# Reset OpenEnv to an initial state!\n",
        "result = env.reset()\n",
        "current_state = result.observation\n",
        "try:\n",
        "    steps, if_done = execute_strategy(always_move_left, current_state)\n",
        "except TimeoutError as e:\n",
        "    print(f\"Timed out with error = {str(e)}\")\n",
        "\n",
        "steps, if_done"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "dkuHVdB09sgf"
      },
      "source": [
        "To allow longer strategies for GPT-OSS Reinforcement Learning, we shall allow a 5 second timer."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "SK-LfzsA9wbW"
      },
      "outputs": [],
      "source": [
        "@execute_with_time_limit(5)\n",
        "def execute_strategy(strategy: Callable, current_state: OpenSpielObservation):\n",
        "    return _execute_strategy(strategy, current_state)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tRhLV_bZMYxy"
      },
      "source": [
        "## Sandboxed Code Execution: Preventing Reward Hacking\n",
        "\n",
        "A critical challenge in RL with code generation is **reward hacking**—the model might learn to \"cheat\" rather than solve the actual problem. For example, it could:\n",
        "\n",
        "- Import external libraries to hardcode solutions\n",
        "- Access global variables to manipulate game state directly  \n",
        "- Call system functions to bypass the game logic\n",
        "\n",
        "We use two safeguards:\n",
        "\n",
        "1. `check_python_modules` validates that the code only uses Python standard library imports (no numpy, pandas, etc.)\n",
        "2. `create_locked_down_function` executes code in an isolated namespace with no access to global variables\n",
        "\n",
        "Let's see these in action. First, a valid strategy that only uses standard library:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "zz80kvg6M4BG",
        "outputId": "19d7f047-127f-4667-8d8a-c64c132f87ea"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Only Python imports? True\n",
            "{'stdlib': ['math', 'typing'], 'non_stdlib': [], 'relative_imports': 0}\n"
          ]
        }
      ],
      "source": [
        "from unsloth import check_python_modules\n",
        "\n",
        "sample = \"\"\"\n",
        "def strategy(board):\n",
        "    import math\n",
        "    from typing import Callable\n",
        "    return \"0\"\n",
        "\"\"\"\n",
        "ok, info = check_python_modules(sample)\n",
        "print(\"Only Python imports?\", ok)\n",
        "print(info)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bZzVWgKQ-VIg"
      },
      "source": [
        "For the below piece of code, since we import `numpy`, we should not allow the execution:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "Z89Jw1KB-Ux7",
        "outputId": "203c5dc5-bd09-42cc-b67f-08dd4a23af1c"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Only Python imports? False\n",
            "{'stdlib': [], 'non_stdlib': ['numpy'], 'relative_imports': 0}\n"
          ]
        }
      ],
      "source": [
        "sample = \"\"\"\n",
        "def strategy(board):\n",
        "    from numpy import matmul\n",
        "    return \"0\"\n",
        "\"\"\"\n",
        "ok, info = check_python_modules(sample)\n",
        "print(\"Only Python imports?\", ok)\n",
        "print(info)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "SDSrjOTLVyQm"
      },
      "source": [
        "We also disallow global variable access. We'll use Unsloth's `create_locked_down_function` function\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "GcmYAmohVqw2",
        "outputId": "b66e5e50-5ec8-42f6-8440-d5d8bc5dcc08"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "name 'np' is not defined\n"
          ]
        }
      ],
      "source": [
        "from unsloth import create_locked_down_function\n",
        "\n",
        "function = \"\"\"\n",
        "def import_numpy():\n",
        "    np.matmul\n",
        "    print(\"Success\")\n",
        "\"\"\"\n",
        "f = create_locked_down_function(function)\n",
        "try:\n",
        "    f()\n",
        "except Exception as e:\n",
        "    print(str(e))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "5tJKwLUgZsRq",
        "outputId": "512a7a58-5a11-4612-fccd-55ee2678f97c"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "60\n"
          ]
        }
      ],
      "source": [
        "from unsloth import create_locked_down_function\n",
        "\n",
        "function = \"\"\"\n",
        "def add(a, b):\n",
        "    def adder(a):\n",
        "        return a + b\n",
        "    return adder(b) + b\n",
        "\"\"\"\n",
        "f = create_locked_down_function(function)\n",
        "try:\n",
        "    print(f(10, 20))\n",
        "except Exception as e:\n",
        "    print(str(e))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8CzwCyXIPK04"
      },
      "source": [
        "## Prompt Design: Instructing the Model\n",
        "\n",
        "The prompt is crucial—it tells the model what we expect it to generate. We want:\n",
        "\n",
        "1. A **single Python function** named `strategy(board)` that:\n",
        "   - Takes a 4×4 list of lists as input\n",
        "   - Returns a move: \"0\" (up), \"1\" (right), \"2\" (down), or \"3\" (left)\n",
        "2. **Self-contained code** with all helpers defined inside the function\n",
        "3. **Native Python only** (no external dependencies)\n",
        "\n",
        "This structured output format makes parsing straightforward and helps the model understand the task boundaries."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "B-2RRE4HMrQO",
        "outputId": "20224710-abb6-4b34-a13f-dfb526c6fa9f"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Create a new short 2048 strategy using only native Python code.\n",
            "You are given a list of list of numbers for the current board state.\n",
            "Output one action for \"0\", \"1\", \"2\", \"3\" on what is the optimal next step.\n",
            "Output your new short function in backticks using the format below:\n",
            "```python\n",
            "def strategy(board):\n",
            "    return \"0\" # Example\n",
            "```\n",
            "All helper functions should be inside def strategy. Only output the short function `strategy`.\n"
          ]
        }
      ],
      "source": [
        "prompt = \"\"\"\n",
        "Create a new short 2048 strategy using only native Python code.\n",
        "You are given a list of list of numbers for the current board state.\n",
        "Output one action for \"0\", \"1\", \"2\", \"3\" on what is the optimal next step.\n",
        "Output your new short function in backticks using the format below:\n",
        "```python\n",
        "def strategy(board):\n",
        "    return \"0\" # Example\n",
        "```\n",
        "All helper functions should be inside def strategy. Only output the short function `strategy`.\n",
        "\"\"\".strip()\n",
        "print(prompt)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MIdudFUodN4i"
      },
      "source": [
        "Let's see what the **base model** (before RL training) generates when given this prompt:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "9HJxrS76h3Ds",
        "outputId": "b6ead093-ac19-49f6-9d0f-98e6e196dace"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.\n",
            "Knowledge cutoff: 2024-06\n",
            "Current date: 2026-01-20\n",
            "\n",
            "Reasoning: low\n",
            "\n",
            "# Valid channels: analysis, commentary, final. Channel must be included for every message.\n",
            "Calls to these tools must go to the commentary channel: 'functions'.<|end|><|start|>user<|message|>Create a new short 2048 strategy using only native Python code.\n",
            "You are given a list of list of numbers for the current board state.\n",
            "Output one action for \"0\", \"1\", \"2\", \"3\" on what is the optimal next step.\n",
            "Output your new short function in backticks using the format below:\n",
            "```python\n",
            "def strategy(board):\n",
            "    return \"0\" # Example\n",
            "```\n",
            "All helper functions should be inside def strategy. Only output the short function `strategy`.<|end|><|start|>assistant<|channel|>analysis<|message|>We need to provide a short function. Probably simple heuristic: choose move with lowest collision? Use sum? Just a placeholder.<|end|><|start|>assistant<|channel|>final<|message|>```python\n",
            "def strategy(board):\n",
            "    scores = [0]*4\n",
            "    for i in range(4):\n",
            "        r = [row[i] for row in board]  # column\n",
            "        if any(r) and any(r) != 0:\n",
            "            # prefer moves that keep most tiles grouped\n",
            "            if i==0 or i==1:\n",
            "                scores[i] += sum(r)\n",
            "    best = scores.index(max(scores))\n",
            "    return str(best)\n",
            "```<|return|>\n"
          ]
        }
      ],
      "source": [
        "text = tokenizer.apply_chat_template(\n",
        "    [{\"role\": \"user\", \"content\": prompt}],\n",
        "    tokenize=False,\n",
        "    add_generation_prompt=True,\n",
        "    reasoning_effort=\"low\",\n",
        ")\n",
        "\n",
        "from transformers import TextStreamer\n",
        "\n",
        "_ = model.generate(\n",
        "    **tokenizer(text, return_tensors=\"pt\").to(\"cuda\"),\n",
        "    temperature=1.0,\n",
        "    max_new_tokens=512,\n",
        "    streamer=TextStreamer(tokenizer, skip_prompt=False),\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "iknaWZNudTNq"
      },
      "source": [
        "## Designing Reward Functions\n",
        "\n",
        "Reward functions are the heart of RL—they define what the \"good\" behavior that we want to encourage in the model looks like. For code generation, we need a multi-part reward that captures different aspects of quality:\n",
        "\n",
        "| Reward Function | Purpose | Score Range |\n",
        "|-----------------|---------|-------------|\n",
        "| `function_works` | Is the generated code syntactically valid and executable? | -2.0 to +1.0 |\n",
        "| `no_cheating` | Does the code avoid forbidden imports (numpy, etc.)? | -20.0 to +1.0 |\n",
        "| `strategy_succeeds` | Does the strategy actually play 2048 well? | -3.0 to +20.0 |\n",
        "\n",
        "\n",
        "Let's break down the reward functions into some practical examples:\n",
        "- We could heavily penalize cheating (-20.0) to make honest solutions more rewarding.\n",
        "- We could massively reward success (+20.0) since reaching 2048 is rare initially.\n",
        "- We could graduated penalties for partial failures (timeout vs. crash vs. invalid syntax). This gives the model more information to learn from, creating a 'richer' reward signal.\n",
        "\n",
        "First, we need a helper to extract the Python function from the model's markdown-formatted output:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "8JJGXKdJ-Zl_",
        "outputId": "1474af23-0d1f-4e73-b111-579ea0805d27"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "def strategy(board):\n",
            "    return \"0\" # Example\n"
          ]
        }
      ],
      "source": [
        "def extract_function(text):\n",
        "    if text.count(\"```\") >= 2:\n",
        "        first = text.find(\"```\") + 3\n",
        "        second = text.find(\"```\", first)\n",
        "        fx = text[first:second].strip()\n",
        "        fx = fx.removeprefix(\"python\\n\")\n",
        "        fx = fx[fx.find(\"def\") :]\n",
        "        if fx.startswith(\"def strategy(board):\"):\n",
        "            return fx\n",
        "    return None\n",
        "\n",
        "\n",
        "print(extract_function(prompt))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "KLXEcf_HSJlI"
      },
      "source": [
        "Below is our `function_works` reward function which uses Python's `exec` but guarded by not allowing leakage of local and global variables. We can also use `check_python_modules` first to check if there are errors before even executing the function:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "h3-B0IIsS56S",
        "outputId": "33348b7f-c5dd-4d22-a0ac-e68a2264426f"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "(False,\n",
              " {'error': \"SyntaxError: expected '(' (<unknown>, line 1)\",\n",
              "  'stdlib': [],\n",
              "  'non_stdlib': [],\n",
              "  'relative_imports': 0})"
            ]
          },
          "execution_count": 28,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "ok, info = check_python_modules(\"def a\")\n",
        "ok, info"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "qgFNXORy-lpO"
      },
      "outputs": [],
      "source": [
        "def function_works(completions, **kwargs):\n",
        "    scores = []\n",
        "    for completion in completions:\n",
        "        score = 0\n",
        "        response = completion[0][\"content\"]\n",
        "        function = extract_function(response)\n",
        "        if function is not None:\n",
        "            ok, info = check_python_modules(function)\n",
        "        if function is None or \"error\" in info:\n",
        "            score = -2.0\n",
        "        else:\n",
        "            try:\n",
        "                new_strategy = create_locked_down_function(function)\n",
        "                score = 1.0\n",
        "            except:\n",
        "                score = -0.5\n",
        "        scores.append(score)\n",
        "    return scores"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Gf69i2WT-m4K"
      },
      "source": [
        "`no_cheating` checks if the function cheated since it might have imported Numpy or other functions:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "cUfHzCVx-nGK"
      },
      "outputs": [],
      "source": [
        "def no_cheating(completions, **kwargs):\n",
        "    scores = []\n",
        "    for completion in completions:\n",
        "        score = 0\n",
        "        response = completion[0][\"content\"]\n",
        "        function = extract_function(response)\n",
        "        if function is not None:\n",
        "            ok, info = check_python_modules(function)\n",
        "            scores.append(1.0 if ok else -20.0)  # Penalize heavily!\n",
        "        else:\n",
        "            scores.append(-1.0)  # Failed creating function\n",
        "    return scores"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "slnqWG3FTror"
      },
      "source": [
        "Next `strategy_succeeds` checks if the strategy actually allows the game to terminate. Imagine if the strategy simply returned \"0\" which would fail after a time limit of 10 seconds.\n",
        "\n",
        "We also add a global `PRINTER` to print out the strategy and board state."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "sNi129lYTpZ2"
      },
      "outputs": [],
      "source": [
        "import numpy as np\n",
        "\n",
        "global PRINTER\n",
        "PRINTER = 0\n",
        "\n",
        "\n",
        "def strategy_succeeds(completions, **kwargs):\n",
        "    global PRINTER\n",
        "    scores = []\n",
        "    for completion in completions:\n",
        "        printed = False\n",
        "        score = 0\n",
        "        response = completion[0][\"content\"]\n",
        "        function = extract_function(response)\n",
        "        if PRINTER % 5 == 0:\n",
        "            printed = True\n",
        "            print(function)\n",
        "        PRINTER += 1\n",
        "        if function is not None:\n",
        "            ok, info = check_python_modules(function)\n",
        "        if function is None or \"error\" in info:\n",
        "            scores.append(0)\n",
        "            continue\n",
        "        try:\n",
        "            new_strategy = create_locked_down_function(function)\n",
        "        except:\n",
        "            scores.append(0)\n",
        "            continue\n",
        "        try:\n",
        "            # Reset OpenEnv to an initial state!\n",
        "            result = env.reset()\n",
        "            current_state = result.observation\n",
        "            steps, if_done = execute_strategy(new_strategy, current_state)\n",
        "            print(f\"Steps = {steps} If Done = {if_done}\")\n",
        "            if printed is False:\n",
        "                print(function)\n",
        "            print(render_board(current_state))\n",
        "            if if_done:\n",
        "                scores.append(20.0)  # Success - massively reward!\n",
        "            else:\n",
        "                scores.append(2.0)  # Failed but function works!\n",
        "        except TimeoutError as e:\n",
        "            print(\"Timeout\")\n",
        "            scores.append(-1.0)  # Failed with timeout\n",
        "        except Exception as e:\n",
        "            print(f\"Exception = {str(e)}\")\n",
        "            scores.append(-3.0)  # Failed\n",
        "    return scores"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TCpSxtvSeAG_"
      },
      "source": [
        "We'll now create the dataset which includes a replica of our prompt. Remember to add a reasoning effort of low! You can choose high reasoning mode, but this'll only work on more memory GPUs like MI300s."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "Ldf6SjLHVPRv",
        "outputId": "684f36b7-d23a-4229-afd6-b0033edf7962"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "181\n"
          ]
        },
        {
          "data": {
            "text/plain": [
              "{'prompt': [{'content': 'Create a new short 2048 strategy using only native Python code.\\nYou are given a list of list of numbers for the current board state.\\nOutput one action for \"0\", \"1\", \"2\", \"3\" on what is the optimal next step.\\nOutput your new short function in backticks using the format below:\\n```python\\ndef strategy(board):\\n    return \"0\" # Example\\n```\\nAll helper functions should be inside def strategy. Only output the short function `strategy`.',\n",
              "   'role': 'user'}],\n",
              " 'answer': 0,\n",
              " 'reasoning_effort': 'low'}"
            ]
          },
          "execution_count": 32,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "from datasets import Dataset\n",
        "\n",
        "dataset = Dataset.from_list(\n",
        "    [{\"prompt\": [{\"role\": \"user\", \"content\": prompt.strip()}], \"answer\": 0, \"reasoning_effort\": \"low\"}] * 1000\n",
        ")\n",
        "maximum_length = len(\n",
        "    tokenizer.apply_chat_template([{\"role\": \"user\", \"content\": prompt.strip()}], add_generation_prompt=True)\n",
        ")\n",
        "print(maximum_length)\n",
        "dataset[0]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9-IOMhVg-2AM"
      },
      "source": [
        "## Training with GRPO\n",
        "\n",
        "**Group Relative Policy Optimization (GRPO)** is an RL algorithm designed for language models. Unlike PPO which requires a separate value network, GRPO computes advantages by comparing generations within the same prompt group—making it simpler and more memory-efficient.\n",
        "\n",
        "Key training parameters:\n",
        "- `num_generations=2`: Generate 2 candidates per prompt to compute relative rewards\n",
        "- `max_steps=600`: Total training steps (~5 hours on T4)\n",
        "- `temperature=1.0`: Controls generation randomness (higher = more exploration)\n",
        "\n",
        "We use [TrackIO](https://github.com/gradio-app/trackio) for live visualization of training metrics directly in the notebook."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "ptqkXK2D4d6p",
        "outputId": "42f0f57e-d1a8-4a21-8834-c2dca0e493fb"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Unsloth: We now expect `per_device_train_batch_size` * `gradient_accumulation_steps` * `world_size` to be a multiple of `num_generations`.\n",
            "We will change the batch size of 1 to the `num_generations` of 2\n"
          ]
        }
      ],
      "source": [
        "max_prompt_length = maximum_length + 1  # + 1 just in case!\n",
        "max_completion_length = max_seq_length - max_prompt_length\n",
        "\n",
        "from trl import GRPOConfig, GRPOTrainer\n",
        "\n",
        "training_args = GRPOConfig(\n",
        "    temperature=1.0,\n",
        "    learning_rate=2e-4,\n",
        "    weight_decay=0.001,\n",
        "    warmup_ratio=0.1,\n",
        "    lr_scheduler_type=\"linear\",\n",
        "    optim=\"adamw_8bit\",\n",
        "    logging_steps=1,\n",
        "    per_device_train_batch_size=1,\n",
        "    gradient_accumulation_steps=1,  # Increase to 4 for smoother training\n",
        "    num_generations=2,  # Decrease if out of memory\n",
        "    max_prompt_length=max_prompt_length,\n",
        "    max_completion_length=max_completion_length,\n",
        "    # num_train_epochs = 1, # Set to 1 for a full training run\n",
        "    max_steps=600,\n",
        "    save_steps=100,\n",
        "    report_to=\"trackio\",  # Can use Weights & Biases, TrackIO\n",
        "    output_dir=\"outputs\",\n",
        "    # For optional training + evaluation\n",
        "    # fp16_full_eval = True,\n",
        "    # per_device_eval_batch_size = 4,\n",
        "    # eval_accumulation_steps = 1,\n",
        "    # eval_strategy = \"steps\",\n",
        "    # eval_steps = 1,\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "r9Mv8UZO5hz-"
      },
      "source": [
        "And let's run the trainer! If you scroll up, you'll see a table of rewards. The goal is to see the `reward` column increase!\n",
        "\n",
        "You might have to wait 150 to 200 steps for any action. You'll probably get 0 reward for the first 100 steps. Please be patient!\n",
        "\n",
        "| Step | Training Loss | reward    | reward_std | completion_length | kl       |\n",
        "|------|---------------|-----------|------------|-------------------|----------|\n",
        "| 1    | 0.000000      | 0.125000  | 0.000000   | 200.000000        | 0.000000 |\n",
        "| 2    | 0.000000      | 0.072375  | 0.248112   | 200.000000        | 0.000000 |\n",
        "| 3    | 0.000000      | -0.079000 | 0.163776   | 182.500000        | 0.000005 |\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "vzOuSVCL_GA9",
        "outputId": "880e7b31-fd7b-4ddc-96c6-9661a6e9a85e"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Unsloth: Switching to float32 training since model cannot work with float16\n"
          ]
        }
      ],
      "source": [
        "trainer = GRPOTrainer(\n",
        "    model=model,\n",
        "    processing_class=tokenizer,\n",
        "    reward_funcs=[\n",
        "        function_works,\n",
        "        no_cheating,\n",
        "        strategy_succeeds,\n",
        "    ],\n",
        "    args=training_args,\n",
        "    train_dataset=dataset,\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fQhtuwP4cf34"
      },
      "source": [
        "And let's train the model! **NOTE** This might be quite slow! 600 steps takes ~5 hours or longer.\n",
        "\n",
        "[TrackIO](https://github.com/gradio-app/trackio) might be a bit slow to load - wait 2 minutes until the graphs pop up!"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 1000
        },
        "id": "VGRxPdSCcfC3",
        "outputId": "9fb52ba6-2b05-4cdf-cd86-642302a96dde"
      },
      "outputs": [
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 2\n",
            "   \\\\   /|    Num examples = 1,000 | Num Epochs = 1 | Total steps = 600\n",
            "O^O/ \\_/ \\    Batch size per device = 2 | Gradient accumulation steps = 1\n",
            "\\        /    Data Parallel GPUs = 1 | Total batch size (2 x 1 x 1) = 2\n",
            " \"-____-\"     Trainable parameters = 1,990,656 of 20,916,747,840 (0.01% trained)\n"
          ]
        },
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "* Running on public URL: https://68c7c4f7168a6e9cd6.gradio.live\n",
            "* Trackio project initialized: huggingface\n",
            "* Trackio metrics logged to: /root/.cache/huggingface/trackio\n"
          ]
        },
        {
          "data": {
            "text/html": [
              "<div><iframe src=\"https://68c7c4f7168a6e9cd6.gradio.live/?project=huggingface&write_token=gy41SzZnTVpOSmlwdkysxq7wSLjsbdlKTH-790_p43o\" width=\"100%\" height=\"1000px\" allow=\"autoplay; camera; microphone; clipboard-read; clipboard-write;\" frameborder=\"0\" allowfullscreen></iframe></div>"
            ],
            "text/plain": [
              "<IPython.core.display.HTML object>"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "* GPU detected, enabling automatic GPU metrics logging\n",
            "* Created new run: dainty-sunset-0\n"
          ]
        },
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "`generation_config` default values have been modified to match model-specific defaults: {'max_length': 131072}. If this is not desired, please set these values explicitly.\n"
          ]
        },
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "def strategy(board):\n",
            "    # simple look‑ahead: pick the move that keeps the board almost sorted\n",
            "    # score is total immobility (higher=better)\n",
            "    def score(b):\n",
            "        s = 0\n",
            "        n = len(b)\n",
            "        for i in range(n):\n",
            "            for j in range(n):\n",
            "                v = b[i][j]\n",
            "                if v != 0:\n",
            "                    # neighbors that can merge\n",
            "                    for di,dj in [(1,0),(-1,0),(0,1),(0,-1)]:\n",
            "                        ni, nj = i+di, j+dj\n",
            "                        if 0 <= ni < n and 0 <= nj < n:\n",
            "                            if b[ni][nj] == v:\n",
            "                                s += v\n",
            "        return s\n",
            "    moves = []\n",
            "    for m in [\"0\",\"1\",\"2\",\"3\"]:\n",
            "        new_b = [row[:] for row in board]\n",
            "        # simulate move (simplified: just skip actual moving)\n",
            "        # Here we just pick the move with highest score (mock)\n",
            "        moves.append((score(new_b), m))\n",
            "    best = max(moves)[1]\n",
            "    return best\n",
            "Steps = 1 If Done = False\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "Steps = 9 If Done = False\n",
            "def strategy(board):\n",
            "    def apply(board, d):\n",
            "        size = len(board)\n",
            "        moved, new = False, [[0]*size for _ in range(size)]\n",
            "        for i in range(size):\n",
            "            row = board[i] if d in (0,1) else [board[j][i] for j in range(size)]\n",
            "            vals = [v for v in row if v]\n",
            "            merged = []\n",
            "            j = 0\n",
            "            while j < len(vals):\n",
            "                if j+1 < len(vals) and vals[j]==vals[j+1]:\n",
            "                    merged.append(vals[j]*2)\n",
            "                    j+=2\n",
            "                else:\n",
            "                    merged.append(vals[j]); j+=1\n",
            "            merged += [0]*(size-len(merged))\n",
            "            if d==0: new[i]=merged\n",
            "            if d==1: new[i]=merged[::-1]\n",
            "            if d==2: new[i]=merged\n",
            "            if d==3: new[i]=merged[::-1]\n",
            "        return new\n",
            "    best, best_move = 0, \"0\"\n",
            "    for move in \"0123\":\n",
            "        new = apply(board, int(move))\n",
            "        cnt = sum(1 for r in new for v in r if v==0)\n",
            "        if cnt > best:\n",
            "            best, best_move = cnt, move\n",
            "    return best_move\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "Unsloth: Will smartly offload gradients to save VRAM!\n"
          ]
        },
        {
          "data": {
            "text/html": [
              "\n",
              "    <div>\n",
              "      \n",
              "      <progress value='15' max='600' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
              "      [ 15/600 1:09:12 < 51:54:20, 0.00 it/s, Epoch 0.01/1]\n",
              "    </div>\n",
              "    <table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              " <tr style=\"text-align: left;\">\n",
              "      <th>Step</th>\n",
              "      <th>Training Loss</th>\n",
              "      <th>reward</th>\n",
              "      <th>reward_std</th>\n",
              "      <th>completions / mean_length</th>\n",
              "      <th>completions / min_length</th>\n",
              "      <th>completions / max_length</th>\n",
              "      <th>completions / clipped_ratio</th>\n",
              "      <th>completions / mean_terminated_length</th>\n",
              "      <th>completions / min_terminated_length</th>\n",
              "      <th>completions / max_terminated_length</th>\n",
              "      <th>kl</th>\n",
              "      <th>rewards / function_works / mean</th>\n",
              "      <th>rewards / function_works / std</th>\n",
              "      <th>rewards / no_cheating / mean</th>\n",
              "      <th>rewards / no_cheating / std</th>\n",
              "      <th>rewards / strategy_succeeds / mean</th>\n",
              "      <th>rewards / strategy_succeeds / std</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <td>1</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>4.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>312.000000</td>\n",
              "      <td>278.000000</td>\n",
              "      <td>346.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>312.000000</td>\n",
              "      <td>278.000000</td>\n",
              "      <td>346.000000</td>\n",
              "      <td>0.000103</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>2.000000</td>\n",
              "      <td>0.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>2</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>4.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>340.000000</td>\n",
              "      <td>338.000000</td>\n",
              "      <td>342.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>340.000000</td>\n",
              "      <td>338.000000</td>\n",
              "      <td>342.000000</td>\n",
              "      <td>0.000055</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>2.000000</td>\n",
              "      <td>0.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>3</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>-1.250000</td>\n",
              "      <td>2.474874</td>\n",
              "      <td>557.000000</td>\n",
              "      <td>528.000000</td>\n",
              "      <td>586.000000</td>\n",
              "      <td>0.500000</td>\n",
              "      <td>528.000000</td>\n",
              "      <td>528.000000</td>\n",
              "      <td>528.000000</td>\n",
              "      <td>0.000063</td>\n",
              "      <td>-1.250000</td>\n",
              "      <td>1.060660</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>1.414214</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>4</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>4.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>82.000000</td>\n",
              "      <td>52.000000</td>\n",
              "      <td>112.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>82.000000</td>\n",
              "      <td>52.000000</td>\n",
              "      <td>112.000000</td>\n",
              "      <td>0.000175</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>2.000000</td>\n",
              "      <td>0.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>5</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>4.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>355.500000</td>\n",
              "      <td>244.000000</td>\n",
              "      <td>467.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>355.500000</td>\n",
              "      <td>244.000000</td>\n",
              "      <td>467.000000</td>\n",
              "      <td>0.001542</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>2.000000</td>\n",
              "      <td>0.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>6</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>4.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>278.000000</td>\n",
              "      <td>142.000000</td>\n",
              "      <td>414.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>278.000000</td>\n",
              "      <td>142.000000</td>\n",
              "      <td>414.000000</td>\n",
              "      <td>0.008132</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>2.000000</td>\n",
              "      <td>0.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>7</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>4.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>370.000000</td>\n",
              "      <td>183.000000</td>\n",
              "      <td>557.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>370.000000</td>\n",
              "      <td>183.000000</td>\n",
              "      <td>557.000000</td>\n",
              "      <td>0.007350</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>2.000000</td>\n",
              "      <td>0.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>8</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>4.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>317.500000</td>\n",
              "      <td>190.000000</td>\n",
              "      <td>445.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>317.500000</td>\n",
              "      <td>190.000000</td>\n",
              "      <td>445.000000</td>\n",
              "      <td>0.012345</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>2.000000</td>\n",
              "      <td>0.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>9</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.500000</td>\n",
              "      <td>4.949748</td>\n",
              "      <td>321.500000</td>\n",
              "      <td>57.000000</td>\n",
              "      <td>586.000000</td>\n",
              "      <td>0.500000</td>\n",
              "      <td>57.000000</td>\n",
              "      <td>57.000000</td>\n",
              "      <td>57.000000</td>\n",
              "      <td>0.018377</td>\n",
              "      <td>-0.500000</td>\n",
              "      <td>2.121320</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>1.414214</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.414214</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>10</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.500000</td>\n",
              "      <td>4.949748</td>\n",
              "      <td>407.500000</td>\n",
              "      <td>229.000000</td>\n",
              "      <td>586.000000</td>\n",
              "      <td>0.500000</td>\n",
              "      <td>229.000000</td>\n",
              "      <td>229.000000</td>\n",
              "      <td>229.000000</td>\n",
              "      <td>0.031208</td>\n",
              "      <td>-0.500000</td>\n",
              "      <td>2.121320</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>1.414214</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.414214</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>11</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>4.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>225.500000</td>\n",
              "      <td>187.000000</td>\n",
              "      <td>264.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>225.500000</td>\n",
              "      <td>187.000000</td>\n",
              "      <td>264.000000</td>\n",
              "      <td>0.037550</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>2.000000</td>\n",
              "      <td>0.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>12</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>-3.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>586.000000</td>\n",
              "      <td>586.000000</td>\n",
              "      <td>586.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000167</td>\n",
              "      <td>-2.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>-1.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>13</td>\n",
              "      <td>0.000100</td>\n",
              "      <td>-2.000000</td>\n",
              "      <td>1.414214</td>\n",
              "      <td>491.000000</td>\n",
              "      <td>396.000000</td>\n",
              "      <td>586.000000</td>\n",
              "      <td>0.500000</td>\n",
              "      <td>396.000000</td>\n",
              "      <td>396.000000</td>\n",
              "      <td>396.000000</td>\n",
              "      <td>0.129521</td>\n",
              "      <td>-0.500000</td>\n",
              "      <td>2.121320</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>1.414214</td>\n",
              "      <td>-1.500000</td>\n",
              "      <td>2.121320</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table><p>"
            ],
            "text/plain": [
              "<IPython.core.display.HTML object>"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Steps = 1 If Done = False\n",
            "def strategy(board):\n",
            "    # Count empty cells for each move\n",
            "    def score_for(move):\n",
            "        temp = [row[:] for row in board]\n",
            "        # simulate move\n",
            "        for i in range(4):\n",
            "            line = [temp[j][i] for j in range(4)] if move==3 else [temp[i][j] for j in range(4)]\n",
            "            # shift and merge\n",
            "            new_line = []\n",
            "            merged = False\n",
            "            for val in (line if move in (0,2) else reversed(line)):\n",
            "                if val == 0: continue\n",
            "                if new_line and new_line[-1] == val and not merged:\n",
            "                    new_line[-1] *= 2\n",
            "                    merged = True\n",
            "                else:\n",
            "                    new_line.append(val)\n",
            "            # fill rest with zeros\n",
            "            new_line += [0]*(4-len(new_line))\n",
            "            if move==3:\n",
            "                for j in range(4): temp[j][i] = new_line[j]\n",
            "            else:\n",
            "                for j in range(4): temp[i][j] = new_line[j]\n",
            "        return sum(new_line)  # simple heuristic\n",
            "    best = -1; best_move=None\n",
            "    for m in range(4):\n",
            "        if score_for(m) > best:\n",
            "            best = score_for(m); best_move=str(m)\n",
            "    return best_move\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "Steps = 9 If Done = False\n",
            "def strategy(board):\n",
            "    # Assign scores to each move based on total number of merges and minimal tile movement\n",
            "    scores = {}\n",
            "    dirs = [(0,1),(0,-1),(1,0),(-1,0)]  # right, left, down, up\n",
            "    # evaluate each direction\n",
            "    for i, (dx, dy) in enumerate(dirs):\n",
            "        tmp = [row[:] for row in board]  # copy\n",
            "        for y in range(len(tmp)):\n",
            "            line = tmp[y] if dx==0 else [tmp[x][y] for x in range(len(tmp))]\n",
            "            # slide and combine\n",
            "            new_line = [v for v in line if v!=0]\n",
            "            for k in range(len(new_line)-1):\n",
            "                if new_line[k]==new_line[k+1]:\n",
            "                    new_line[k]*=2\n",
            "                    new_line.pop(k+1)\n",
            "            new_line+= [0]*(len(line)-len(new_line))\n",
            "            # place back\n",
            "            if dx==0:\n",
            "                tmp[y] = new_line\n",
            "            else:\n",
            "                for x in range(len(tmp)):\n",
            "                    tmp[x][y] = new_line[x]\n",
            "        # score by number of zero tiles (more space)\n",
            "        scores[i] = sum(v==0 for row in tmp for v in row)\n",
            "    # choose move with most empty tiles\n",
            "    best = max(scores, key=scores.get)\n",
            "    return str(best)\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "def strategy(board):\n",
            "    def move(board, dir):\n",
            "        size = len(board)\n",
            "        def compress(line):\n",
            "            nonlocal line\n",
            "            line = [x for x in line if x]\n",
            "            for i in range(len(line)-1):\n",
            "                if line[i]==line[i+1]:\n",
            "                    line[i]*=2\n",
            "                    line[i+1]=0\n",
            "            line=[x for x in line if x]\n",
            "            return line+[0]*(size-len(line))\n",
            "        if dir==0:  # up\n",
            "            res=[[0]*size for _ in range(size)]\n",
            "            for c in range(size):\n",
            "                col=[board[r][c] for r in range(size)]\n",
            "                col=compress(col)\n",
            "                for r in range(size):\n",
            "                    res[r][c]=col[r]\n",
            "            return res\n",
            "        if dir==1:  # down\n",
            "            res=[[0]*size for _ in range(size)]\n",
            "            for c in range(size):\n",
            "                col=[board[r][c] for r in range(size)][::-1]\n",
            "                col=compress(col)\n",
            "                col=col[::-1]\n",
            "                for r in range(size):\n",
            "                    res[r][c]=col[r]\n",
            "            return res\n",
            "        if dir==2:  # left\n",
            "            res=[[0]*size for _ in range(size)]\n",
            "            for r in range(size):\n",
            "                line=board[r]\n",
            "                line=compress(line)\n",
            "                res[r]=line\n",
            "            return res\n",
            "        if dir==3:  # right\n",
            "            res=[[0]*size for _ in range(size)]\n",
            "            for r in range(size):\n",
            "                line=board[r][::-1]\n",
            "                line=compress(line)\n",
            "                line=line[::-1]\n",
            "                res[r]=line\n",
            "            return res\n",
            "    best=None\n",
            "    best_sum=-1\n",
            "    for d in range(4):\n",
            "        new=move(board,d)\n",
            "        if new==board: continue\n",
            "        s=sum(sum(row) for row in new)\n",
            "        if s>best_sum:\n",
            "            best_sum=s; best=str(d)\n",
            "    return best if best is not None else \"0\"\n",
            "Steps = 9 If Done = False\n",
            "def strategy(board):\n",
            "    import random\n",
            "    best = None\n",
            "    best_score = -1\n",
            "    for move in '0123':\n",
            "        b = board\n",
            "        # simulate move by simple shift (not full 2048 logic)\n",
            "        # This is a placeholder: choose random valid move\n",
            "        if random.random() < 0.5:\n",
            "            return move\n",
            "    return best if best else \"0\"\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "Steps = 9 If Done = False\n",
            "def strategy(board):\n",
            "    # Very simple strategy: always return the first possible direction (0).\n",
            "    return \"0\"\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "Steps = 9 If Done = False\n",
            "def strategy(board):\n",
            "    #\n",
            "    # Try to reduce the number of empty tiles by making a merge first.\n",
            "    #\n",
            "    def can_merge(up, down):\n",
            "        return any((r, c) for r in range(4) for c in range(4)\n",
            "                    if board[r][c] == 0 and board[down(r)][down(c)] == r and board[up(r)][up(c)] == r)\n",
            "    #\n",
            "    # Prefer to move towards the corner that is most populated.\n",
            "    #\n",
            "    if any(board[0][c] == 0 for c in range(4)):\n",
            "        return \"0\"   # up\n",
            "    if any(board[3][c] == 0 for c in range(4)):\n",
            "        return \"1\"   # down\n",
            "    if any(board[r][0] == 0 for r in range(4)):\n",
            "        return \"2\"   # left\n",
            "    return \"3\"       # right\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "Steps = 1 If Done = False\n",
            "def strategy(board):\n",
            "    # Simple heuristic: choose the move that results in the highest number of empty cells after a move\n",
            "    moves = []\n",
            "    for move in range(4):\n",
            "        new_board = [row[:] for row in board]\n",
            "        # simulate move\n",
            "        def compress_and_merge(line):\n",
            "            filtered = [x for x in line if x != 0]\n",
            "            merged = []\n",
            "            skip = False\n",
            "            for i, val in enumerate(filtered):\n",
            "                if skip:\n",
            "                    skip = False\n",
            "                    continue\n",
            "                if i + 1 < len(filtered) and filtered[i] == filtered[i+1]:\n",
            "                    merged.append(val * 2)\n",
            "                    skip = True\n",
            "                else:\n",
            "                    merged.append(val)\n",
            "            merged += [0] * (len(line) - len(merged))\n",
            "            return merged\n",
            "        if move == 0:  # left\n",
            "            for r in range(4):\n",
            "                new_board[r] = compress_and_merge(new_board[r])\n",
            "        elif move == 1:  # right\n",
            "            for r in range(4):\n",
            "                new_board[r] = list(reversed(compress_and_merge(list(reversed(new_board[r])))))\n",
            "        elif move == 2:  # up\n",
            "            for c in range(4):\n",
            "                col = [new_board[r][c] for r in range(4)]\n",
            "                merged = compress_and_merge(col)\n",
            "                for r in range(4):\n",
            "                    new_board[r][c] = merged[r]\n",
            "        elif move == 3:  # down\n",
            "            for c in range(4):\n",
            "                col = [new_board[r][c] for r in range(4)]\n",
            "                merged = list(reversed(compress_and_merge(list(reversed(col)))))\n",
            "                for r in range(4):\n",
            "                    new_board[r][c] = merged[r]\n",
            "        moves.append((sum(row.count(0) for row in new_board), move))\n",
            "    return str(max(moves)[1])\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "def strategy(board):\n",
            "    # compute all possible moves and pick one with max score (simple heuristic)\n",
            "    best, best_val = None, -1\n",
            "    for move in map(str, range(4)):\n",
            "        # simulate move\n",
            "        new_board = [row[:] for row in board]\n",
            "        # apply move logic (omitted for brevity)\n",
            "        # evaluate board\n",
            "        val = sum(sum(row) for row in new_board)  # placeholder\n",
            "        if val > best_val:\n",
            "            best_val, best = val, move\n",
            "    return best\n",
            "Steps = 9 If Done = False\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "Steps = 9 If Done = False\n",
            "def strategy(board):\n",
            "    # simple heuristic: choose first direction that merges a pair\n",
            "    for d in range(4):\n",
            "        visited = set()\n",
            "        for i in range(4):\n",
            "            for j in range(4):\n",
            "                if board[i][j] == 0:\n",
            "                    continue\n",
            "                ni, nj = i, j\n",
            "                if d == 0:   # up\n",
            "                    ni -= 1\n",
            "                elif d == 1: # down\n",
            "                    ni += 1\n",
            "                elif d == 2: # left\n",
            "                    nj -= 1\n",
            "                else:        # right\n",
            "                    nj += 1\n",
            "                if 0 <= ni < 4 and 0 <= nj < 4:\n",
            "                    if board[ni][nj] == board[i][j]:\n",
            "                        return str(d)\n",
            "    return \"0\"\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "Steps = 9 If Done = False\n",
            "def strategy(board):\n",
            "    # look for the best next move using a simple heuristic\n",
            "    max_val = -1\n",
            "    move = \"0\"\n",
            "    for m in (\"0\",\"1\",\"2\",\"3\"):\n",
            "        # copy board\n",
            "        import copy\n",
            "        b = copy.deepcopy(board)\n",
            "        # simulate move\n",
            "        r=False\n",
            "        if m==\"0\":  # up\n",
            "            for c in range(len(b)):\n",
            "                col=[b[r][c] for r in range(len(b))]\n",
            "                col=[n for n in col if n]\n",
            "                i=0\n",
            "                while i< len(col)-1:\n",
            "                    if col[i]==col[i+1]:\n",
            "                        col[i]*=2; del col[i+1]; i+=1\n",
            "                    i+=1\n",
            "                for r in range(len(b)):\n",
            "                    b[r][c]=col[r] if r<len(col) else 0\n",
            "        elif m==\"1\":  # down\n",
            "            for c in range(len(b)):\n",
            "                col=[b[r][c] for r in range(len(b))]\n",
            "                col=[n for n in col if n]\n",
            "                i=len(col)-1\n",
            "                while i>0:\n",
            "                    if col[i]==col[i-1]:\n",
            "                        col[i]*=2; del col[i-1]; i-=1\n",
            "                    i-=1\n",
            "                for r in range(len(b)):\n",
            "                    b[r][c]=col[len(col)-1-r] if r<len(col) else 0\n",
            "        elif m==\"2\":  # left\n",
            "            for r in range(len(b)):\n",
            "                row=b[r]\n",
            "                row=[n for n in row if n]\n",
            "                i=0\n",
            "                while i< len(row)-1:\n",
            "                    if row[i]==row[i+1]:\n",
            "                        row[i]*=2; del row[i+1]; i+=1\n",
            "                    i+=1\n",
            "                b[r]=row+ [0]*(len(b)-len(row))\n",
            "        elif m==\"3\":  # right\n",
            "            for r in range(len(b)):\n",
            "                row=b[r]\n",
            "                row=[n for n in row if n]\n",
            "                i=len(row)-1\n",
            "                while i>0:\n",
            "                    if row[i]==row[i-1]:\n",
            "                        row[i]*=2; del row[i-1]; i-=1\n",
            "                    i-=1\n",
            "                b[r]=[0]*(len(b)-len(row))+row\n",
            "        # evaluate\n",
            "        val=sum(max(row) for row in b)\n",
            "        if val>max_val:\n",
            "            max_val=val; move=m\n",
            "    return move\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "Steps = 0 If Done = False\n",
            "def strategy(board):\n",
            "    # Count tiles in entire board\n",
            "    total = sum(sum(row) for row in board)\n",
            "    if total == 0:  # no tiles\n",
            "        return \"0\"\n",
            "    # Heuristic: prefer moving up if average value of upper row > lower row\n",
            "    upper = sum(board[0])\n",
            "    lower = sum(board[-1])\n",
            "    left = sum(row[0] for row in board)\n",
            "    right = sum(row[-1] for row in board)\n",
            "    moves = [(upper - lower, \"0\"), (right - left, \"1\")], \n",
            "    # pick the move with biggest difference (push bigger numbers up or right)\n",
            "    best_move = max(moves)[1]\n",
            "    return best_move\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "Steps = 9 If Done = False\n",
            "def strategy(board):\n",
            "    # Possible moves: 0=up, 1=right, 2=down, 3=left\n",
            "    best_score = -1\n",
            "    best_move = 0\n",
            "    dirs = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # mapping: up, right, down, left\n",
            "    for move, (dx, dy) in enumerate(dirs):\n",
            "        new_board = [row[:] for row in board]\n",
            "        moved = False\n",
            "        for x in range(4):\n",
            "            for y in range(4):\n",
            "                if dx != 0:\n",
            "                    nx, ny = x + dx, y\n",
            "                else:\n",
            "                    nx, ny = x, y + dy\n",
            "                if 0 <= nx < 4 and 0 <= ny < 4:\n",
            "                    if board[x][y] != 0 and new_board[nx][ny] == 0:\n",
            "                        new_board[nx][ny] = board[x][y]\n",
            "                        new_board[x][y] = 0\n",
            "                        moved = True\n",
            "                # Merge\n",
            "                if dx != 0:\n",
            "                    if 0 <= nx-1 < 4 and new_board[nx-1][ny] == new_board[nx][ny] != 0:\n",
            "                        new_board[nx-1][ny] *= 2\n",
            "                        new_board[nx][ny] = 0\n",
            "                else:\n",
            "                    if 0 <= ny-1 < 4 and new_board[nx][ny-1] == new_board[nx][ny] != 0:\n",
            "                        new_board[nx][ny-1] *= 2\n",
            "                        new_board[nx][ny] = 0\n",
            "        if not moved:\n",
            "            continue\n",
            "        score = sum(new_board[x][y] for x in range(4) for y in range(4))\n",
            "        if score > best_score:\n",
            "            best_score = score\n",
            "            best_move = move\n",
            "    return str(best_move)\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "def strategy(board):\n",
            "    # simple heuristic: try to merge the first pair of equal tiles from left to right\n",
            "    for i in range(len(board)):\n",
            "        for j in range(len(board[i])-1):\n",
            "            if board[i][j] == board[i][j+1] and board[i][j] != 0:\n",
            "                return str(j)  # direction: 0-left, 1-right, 2-up, 3-down\n",
            "    # if no merges, slide to fill empty spot on the left\n",
            "    for i in range(len(board)):\n",
            "        for j in range(len(board[i])):\n",
            "            if board[i][j] == 0:\n",
            "                return \"0\"\n",
            "    return \"0\"\n",
            "Steps = 9 If Done = False\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "Steps = 9 If Done = False\n",
            "def strategy(board):\n",
            "    return \"0\"\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "Steps = 9 If Done = False\n",
            "def strategy(board):\n",
            "    # board is a list of 4 lists, each containing 4 integers (0 for empty)\n",
            "    from collections import Counter\n",
            "    # Count numbers of each value\n",
            "    counts = Counter([num for row in board for num in row if num != 0])\n",
            "    # Prefer moving toward the rightmost or downwards if a move will combine\n",
            "    # First, try to combine pairs by moving left\n",
            "    for i in range(4):\n",
            "        for j in range(1,4):\n",
            "            if board[i][j] == board[i][j-1] and board[i][j] != 0:\n",
            "                return \"3\"  # Move up to combine\n",
            "    # If no direct combine, move right if possible\n",
            "    for i in range(4):\n",
            "        for j in range(3):\n",
            "            if board[i][j] == 0:\n",
            "                return \"2\"  # Move right\n",
            "    # If no empty, move down\n",
            "    return \"1\"\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "def strategy(board):\n",
            "    # Try to move left if possible, else right, up, down\n",
            "    def can(move):\n",
            "        for i,row in enumerate(board):\n",
            "            if move == \"0\" and i>0 and row[i-1]==0: return True\n",
            "            if move == \"1\" and i<3 and row[i+1]==0: return True\n",
            "            if move == \"2\" and i>0 and board[i-1][i]==0: return True\n",
            "            if move == \"3\" and i<3 and board[i+1][i]==0: return True\n",
            "        return False\n",
            "\n",
            "    for m in [\"0\",\"1\",\"2\",\"3\"]:\n",
            "        if can(m):\n",
            "            return m\n",
            "    return \"0\"\n",
            "Steps = 9 If Done = False\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "Steps = 9 If Done = False\n",
            "def strategy(board):\n",
            "    # Simple heuristic: move a tile towards the nearest zero cell.\n",
            "    n = len(board)\n",
            "    for i in range(n):\n",
            "        for j in range(n):\n",
            "            if board[i][j] != 0:\n",
            "                # try to move right if possible\n",
            "                if j+1 < n and board[i][j+1] == 0:\n",
            "                    return \"1\"  # move right\n",
            "                # try upwards\n",
            "                if i-1 >= 0 and board[i-1][j] == 0:\n",
            "                    return \"0\"  # move up\n",
            "                # try left\n",
            "                if j-1 >= 0 and board[i][j-1] == 0:\n",
            "                    return \"3\"  # move left\n",
            "                # try downwards\n",
            "                if i+1 < n and board[i+1][j] == 0:\n",
            "                    return \"2\"  # move down\n",
            "    return \"0\"\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "Exception = list index out of range\n",
            "None\n",
            "Steps = 9 If Done = False\n",
            "def strategy(board):\n",
            "    \"\"\"\n",
            "    Determine the best move (\"0\": Up, \"1\": Right, \"2\": Down, \"3\": Left)\n",
            "    for a 2048 board represented by a list of lists.\n",
            "    This implementation uses a simple heuristic: count the number of\n",
            "    empty cells after each potential move and choose the move that\n",
            "    maximizes this count. It does not simulate future moves.\n",
            "    \"\"\"\n",
            "    # Directions: 0=Up,1=Right,2=Down,3=Left\n",
            "    dirs = [( -1,  0), ( 0,  1), ( 1,  0), ( 0, -1)]\n",
            "    best_move = None\n",
            "    best_empty = -1\n",
            "\n",
            "    n = len(board)\n",
            "    for move, (dx, dy) in enumerate(dirs):\n",
            "        new_board = [row[:] for row in board]  # copy\n",
            "        changed = False\n",
            "        for i in range(n):\n",
            "            for j in range(n):\n",
            "                x, y = i, j\n",
            "                # Move the tile in the chosen direction\n",
            "                while True:\n",
            "                    nx, ny = x + dx, y + dy\n",
            "                    if 0 <= nx < n and 0 <= ny < n and new_board[nx][ny] == 0:\n",
            "                        # Merge if possible\n",
            "                        if new_board[x][y] != 0 and new_board[nx][ny] == 0:\n",
            "                            new_board[nx][ny] = new_board[x][y]\n",
            "                            new_board[x][y] = 0\n",
            "                            changed = True\n",
            "                        x, y = nx, ny\n",
            "                    else:\n",
            "                        break\n",
            "                # If tile could not move, skip\n",
            "        # Count empty cells in the resulting board\n",
            "        empty = sum(row.count(0) for row in new_board)\n",
            "        if empty > best_empty:\n",
            "            best_empty = empty\n",
            "            best_move = str(move)\n",
            "    return best_move\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n",
            "Steps = 2 If Done = False\n",
            "def strategy(board):\n",
            "    n = len(board)\n",
            "    score = lambda r,c: board[r][c]\n",
            "    # Count empty cells and total\n",
            "    empties = sum(board[i][j] == 0 for i in range(n) for j in range(n))\n",
            "    # Simple heuristic: move left if highest tile on left, else right, else up then down\n",
            "    # Find position of maximum tile\n",
            "    max_val = -1\n",
            "    max_pos = None\n",
            "    for i in range(n):\n",
            "        for j in range(n):\n",
            "            if board[i][j] > max_val:\n",
            "                max_val = board[i][j]\n",
            "                max_pos = (i, j)\n",
            "    # Prefer moving towards the edge with max tile\n",
            "    i, j = max_pos\n",
            "    # Prioritize directions that keep max tile towards edge\n",
            "    if j == 0: return \"0\"  # left\n",
            "    if j == n-1: return \"1\"  # right\n",
            "    if i == 0: return \"2\"  # up\n",
            "    return \"3\"  # down\n",
            "┌───┬───┬───┬───┬───┬───┬───┐\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "├───┼───┼───┼───┼───┼───┼───┤\n",
            "│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;33m  1\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\u001b[38;5;239m  .\u001b[0m│\n",
            "└───┴───┴───┴───┴───┴───┴───┘\n"
          ]
        }
      ],
      "source": [
        "trainer.train()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tlaUdxC_VHpz"
      },
      "source": [
        "## Testing the Trained Model\n",
        "\n",
        "Let's generate a strategy from our RL-trained model and see how it differs from the base model:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "TwZygRdWf8ab"
      },
      "outputs": [],
      "source": [
        "text = tokenizer.apply_chat_template(\n",
        "    [{\"role\": \"user\", \"content\": prompt}],\n",
        "    tokenize=False,\n",
        "    add_generation_prompt=True,\n",
        "    reasoning_effort=\"low\",\n",
        ")\n",
        "\n",
        "from transformers import TextStreamer\n",
        "\n",
        "_ = model.generate(\n",
        "    **tokenizer(text, return_tensors=\"pt\").to(\"cuda\"),\n",
        "    temperature=1.0,\n",
        "    max_new_tokens=1024,\n",
        "    streamer=TextStreamer(tokenizer, skip_prompt=False),\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-NUEmHFSYNTp"
      },
      "source": [
        "## Saving the Fine-tuned Model\n",
        "\n",
        "You can save the trained model in different formats:\n",
        "\n",
        "- **MXFP4**: OpenAI gpt-oss's native 4-bit precision format\n",
        "- **float16**: Standard half-precision for broader compatibility\n",
        "\n",
        "To push to Hugging Face Hub, you'll need a token from https://huggingface.co/settings/tokens:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "NjXGTkp7YNtB"
      },
      "outputs": [],
      "source": [
        "# Merge and push to hub in mxfp4 4bit format\n",
        "if False:\n",
        "    model.save_pretrained_merged(\"finetuned_model\", tokenizer, save_method=\"mxfp4\")\n",
        "if False:\n",
        "    model.push_to_hub_merged(\"repo_id/repo_name\", tokenizer, token=\"hf...\", save_method=\"mxfp4\")\n",
        "\n",
        "# Merge and push to hub in 16bit\n",
        "if False:\n",
        "    model.save_pretrained_merged(\"finetuned_model\", tokenizer, save_method=\"merged_16bit\")\n",
        "if False:  # Pushing to HF Hub\n",
        "    model.push_to_hub_merged(\"hf/gpt-oss-finetune\", tokenizer, save_method=\"merged_16bit\", token=\"\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "V15Yhj1V9lwG"
      },
      "source": [
        "## Conclusion\n",
        "\n",
        "Congratulations! You've learned how to apply reinforcement learning to teach an LLM to generate game-playing code. The key concepts covered:\n",
        "\n",
        "1. **OpenEnv** for standardized access to RL environments\n",
        "2. **LoRA** for memory-efficient fine-tuning\n",
        "3. **Sandboxed execution** to prevent reward hacking\n",
        "4. **Multi-objective reward functions** that balance validity, safety, and performance\n",
        "5. **GRPO** for policy optimization without a value network\n",
        "\n",
        "This pattern extends beyond 2048—you can adapt it to any task where model outputs can be programmatically evaluated: code synthesis, mathematical proofs, API usage, and more.\n",
        "\n",
        "### Further Resources\n",
        "\n",
        "- [OpenAI gpt-oss-20b Model Card](https://huggingface.co/openai/gpt-oss-20b)\n",
        "- [OpenEnv Documentation](https://github.com/meta-pytorch/OpenEnv)\n",
        "- [TRL GRPO Trainer](https://huggingface.co/docs/trl/main/en/grpo_trainer)\n",
        "- [Unsloth RL Guide](https://docs.unsloth.ai/get-started/reinforcement-learning-rl-guide)\n",
        "\n",
        "---\n",
        "\n",
        "*This notebook uses [Unsloth](https://github.com/unslothai/unsloth) for memory-efficient training.*\n",
        "\n",
        "**License:** Apache 2.0"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "KMwNkyqlB4Ae"
      },
      "source": []
    }
  ],
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "gpuType": "T4",
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    },
    "widgets": {
      "application/vnd.jupyter.widget-state+json": {
        "004f28173c3b4fb6a9f8c2068f5db81f": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "02c88a690a384ae183c233b6927aaf57": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "050567dccb47456aaac65d118ac60a6b": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_ea2f3ca562444c46b50596a1d7cf9030",
              "IPY_MODEL_0b7287d482cc44dbb406d71f23f1aea0",
              "IPY_MODEL_474c7fe6ef4b430d9826171eded2ebf1"
            ],
            "layout": "IPY_MODEL_12a877304ddf45e49bbdfe056394c3d6"
          }
        },
        "06e420cfa2974f7d8d7ec4b83f064a6e": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "072060f8bdb54a15baf838f67d376d99": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_29d98af49dd3412f84f5843b937029d1",
            "max": 3372033380,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_93176d6b63284b4e832e0f028be90655",
            "value": 3372033380
          }
        },
        "08d8f0cfd7614900a9c9bba888619749": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "0b7287d482cc44dbb406d71f23f1aea0": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_d541fd264d7e4b2691e8efbd993e6ae7",
            "max": 446,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_cbc7bbfe983f4e6696f8dd6b37c50543",
            "value": 446
          }
        },
        "0b851acfd32047bfb6bae17d43ccfcb1": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_667192c07a7340a9a72ed25648a0be64",
            "max": 3996690997,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_aa714b1f70e8495b9299b51d6ac4c3c4",
            "value": 3996690997
          }
        },
        "0b8fa4ff186a4bfeac18cf6d676e99df": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "0e49035e0c3a4ee4ab477b475e74ef36": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "0f09a00375bd4d4c8863fc8fb7d64d61": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "1055200185004ea2a95a05eb51232501": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "12a877304ddf45e49bbdfe056394c3d6": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "18d4cc3fd8ee4e3fbf27c574fd467f20": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "1b26eac03fed4c8784bd611474cf4607": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_b72fc22310714bf0bf6ff4021db5aba8",
            "placeholder": "​",
            "style": "IPY_MODEL_d2a7fa9dddc240e29330297870159c59",
            "value": "Loading checkpoint shards: 100%"
          }
        },
        "1b84d7dc9567474c8587432d48342b71": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_665f7b7e5496432985ed9f49829f5834",
            "placeholder": "​",
            "style": "IPY_MODEL_d16dc921c6554bc6b77abcb423721cc8",
            "value": " 1.19M/? [00:00&lt;00:00, 76.2MB/s]"
          }
        },
        "1daa373c890b4ae0a9cf6a3ec325693c": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "1ece5fe9597a4e3db2dd96c38995705c": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "212c17e3829c4accb30265a3d9ee73dc": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "2612008cf36949d9a6618a01ac817618": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "28300c16023d4ad9a59784baea2f57aa": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": "20px"
          }
        },
        "297bdff1add5414893319b185cb15da6": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_362947c7e102474cbf560f51e713bdff",
              "IPY_MODEL_7f63ff3e87aa4b86a4f8e64785d1d34c",
              "IPY_MODEL_cca7d922b85449a1b0c5c025b65fba10"
            ],
            "layout": "IPY_MODEL_5520c59f24cd414092bf5f952425611d"
          }
        },
        "29d98af49dd3412f84f5843b937029d1": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "2d91f25a8d0b4dd39dc320b2cf17fc0b": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_88aa58551344410a91b07af480d6ab53",
            "max": 1,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_4f38070c09354406928c5e7be6cff3fd",
            "value": 1
          }
        },
        "2f57c8e713b94c8692837b4e17c9e983": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "31afdb187d2a44d8b3a101fa18543c13": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "3371a1da5d0e426bb6cc02a6c383dd6f": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "3566e9058ebc45498b42e521f2314365": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "362947c7e102474cbf560f51e713bdff": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_b1dcd2f908344949af01dab5a244e458",
            "placeholder": "​",
            "style": "IPY_MODEL_9e6f46ddb61943f4a168d70a209a7ddc",
            "value": "model-00004-of-00004.safetensors: 100%"
          }
        },
        "383e9b4b74e34cae96c1f46f41591b82": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "3c2e00b9d20a4c9ea5b850b776752fd2": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_7c29e5f727ec4b608c06d0487a2c9e53",
              "IPY_MODEL_4beae5415d0647e2898a83313a08ea94",
              "IPY_MODEL_62cd4ccc430549e7a2c156222d47ebeb"
            ],
            "layout": "IPY_MODEL_4691273248984fdea29d83ec0a246cd9"
          }
        },
        "3e3ce50c437a412f9cc2bedb697648f7": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "3f589cf3cb804ed29ba322e4fa10c511": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "419e4369b36644d3abec159511a88ad8": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_eee0d2b6eb6047d8a4001f4999dcdc35",
              "IPY_MODEL_2d91f25a8d0b4dd39dc320b2cf17fc0b",
              "IPY_MODEL_1b84d7dc9567474c8587432d48342b71"
            ],
            "layout": "IPY_MODEL_f34cadfaf9bb46729aeeff9492ba9026"
          }
        },
        "41ee8407344d402997b0574e0ae26c77": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "4691273248984fdea29d83ec0a246cd9": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "46cb3593b37c4cb9b4bac422bc5809c8": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "474c7fe6ef4b430d9826171eded2ebf1": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_f290b17e3cd9487b9b97d54d6cec9efc",
            "placeholder": "​",
            "style": "IPY_MODEL_18d4cc3fd8ee4e3fbf27c574fd467f20",
            "value": " 446/446 [00:00&lt;00:00, 50.9kB/s]"
          }
        },
        "48f65e25acd84b0cb582de66753215da": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_383e9b4b74e34cae96c1f46f41591b82",
            "max": 165,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_46cb3593b37c4cb9b4bac422bc5809c8",
            "value": 165
          }
        },
        "4ba91b8d008e483d89f16f204326f6b6": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "4bc16bd2399a43fe85967131af7f846a": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "4beae5415d0647e2898a83313a08ea94": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_a6a122064bc340868b5e0e11afa9c42f",
            "max": 1,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_3371a1da5d0e426bb6cc02a6c383dd6f",
            "value": 1
          }
        },
        "4ca56d8605864a438290152268bfc686": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_2612008cf36949d9a6618a01ac817618",
            "max": 3998751275,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_3e3ce50c437a412f9cc2bedb697648f7",
            "value": 3998751275
          }
        },
        "4dc823fcd0dd4eaaa2e8aaff0daa9ad1": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_1b26eac03fed4c8784bd611474cf4607",
              "IPY_MODEL_a24e2cfbef794d35a0e22753352caa15",
              "IPY_MODEL_8368e4420e814c6f9be30994b69c66ee"
            ],
            "layout": "IPY_MODEL_eb8c197ce1fc41f78531e5e73ae14a89"
          }
        },
        "4e63263fc27b4f07b4f6abffba082379": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_eb13ad96565a44519e7cab9ce9483b90",
              "IPY_MODEL_0b851acfd32047bfb6bae17d43ccfcb1",
              "IPY_MODEL_d4c72002b5fe44d3ad7ae67f5536c889"
            ],
            "layout": "IPY_MODEL_a1e042e5b8ad4b028cc28ac54924207d"
          }
        },
        "4f29591af0d64d398515479b032a1b3d": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "4f38070c09354406928c5e7be6cff3fd": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "52e9a0ea76df4fd8822ccadadaadb501": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "5371b2fad7b04f97bd8f4671d844d2cb": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_9c2371c32afe46519ee53427ea42bc9a",
            "placeholder": "​",
            "style": "IPY_MODEL_f9d4672fb86b4b4e9c3e9068dc479e5b",
            "value": " 3.37G/3.37G [00:20&lt;00:00, 60.9MB/s]"
          }
        },
        "5483233a9b224f3c8eb9337e4ed82314": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "5520c59f24cd414092bf5f952425611d": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "56b30cc150924879abfc138427f4ca98": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "5d1e2fdbf7a2409abbafb63e4b160668": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_a4c19dc98fe943e09a26a60d23c8ff01",
              "IPY_MODEL_8f5799610318490492ff5aba76be3d1a",
              "IPY_MODEL_e78534b135d2465c83e3be614b71c8a4"
            ],
            "layout": "IPY_MODEL_2f57c8e713b94c8692837b4e17c9e983"
          }
        },
        "62cd4ccc430549e7a2c156222d47ebeb": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_bedc42e908454f1494768194b43b2964",
            "placeholder": "​",
            "style": "IPY_MODEL_41ee8407344d402997b0574e0ae26c77",
            "value": " 22.8k/? [00:00&lt;00:00, 1.59MB/s]"
          }
        },
        "665f7b7e5496432985ed9f49829f5834": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "667192c07a7340a9a72ed25648a0be64": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "6af6de3285684e76b320571b44af5fc1": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_f7cc1614e13d4d22b83f787b20407a5f",
              "IPY_MODEL_072060f8bdb54a15baf838f67d376d99",
              "IPY_MODEL_5371b2fad7b04f97bd8f4671d844d2cb"
            ],
            "layout": "IPY_MODEL_b10d25fb43eb42198abf71c4e326bcff"
          }
        },
        "6d77c8dcb28240f7b860571d11f8b9af": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "754f2452fbe14c7098215ec810ffbf14": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "7c29e5f727ec4b608c06d0487a2c9e53": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_ae3819cd042f43babef24199081bb97f",
            "placeholder": "​",
            "style": "IPY_MODEL_52e9a0ea76df4fd8822ccadadaadb501",
            "value": "tokenizer_config.json: "
          }
        },
        "7c52841c7e714173bfd526b0a625bc9d": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_ae9728c04b974af29460ce5179a9edba",
            "placeholder": "​",
            "style": "IPY_MODEL_754f2452fbe14c7098215ec810ffbf14",
            "value": " 165/165 [00:00&lt;00:00, 16.4kB/s]"
          }
        },
        "7c7d40163ecc4dae8a2c54af21de4661": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "7ce0f239a8514923b383f738bc0c9899": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "7f63ff3e87aa4b86a4f8e64785d1d34c": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_ae79b9a378ee4218beddf77ac9af6de7",
            "max": 1158267008,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_e6d63cf58647443b9749195ec0579d87",
            "value": 1158267008
          }
        },
        "8368e4420e814c6f9be30994b69c66ee": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_d1cea390ecaa4583a698278fa3a438c9",
            "placeholder": "​",
            "style": "IPY_MODEL_5483233a9b224f3c8eb9337e4ed82314",
            "value": " 4/4 [00:56&lt;00:00, 12.02s/it]"
          }
        },
        "8708226010324a83b4f7900c3958d430": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_4ba91b8d008e483d89f16f204326f6b6",
            "placeholder": "​",
            "style": "IPY_MODEL_31afdb187d2a44d8b3a101fa18543c13",
            "value": "chat_template.jinja: "
          }
        },
        "88aa58551344410a91b07af480d6ab53": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": "20px"
          }
        },
        "89e3ad0d517f44d084df0d8a3ed40703": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "8f5799610318490492ff5aba76be3d1a": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_1055200185004ea2a95a05eb51232501",
            "max": 27868174,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_eb1068efb4364ee2893c4d21c58f38db",
            "value": 27868174
          }
        },
        "93176d6b63284b4e832e0f028be90655": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "941d8cbf8188402eb603c18b6e979035": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_8708226010324a83b4f7900c3958d430",
              "IPY_MODEL_fc12c97d659640b3b2b2b48ad7f17e5b",
              "IPY_MODEL_aa22cb2d1dcf4001bd3720b075318906"
            ],
            "layout": "IPY_MODEL_bb358c5523814376a2d4690f88f20b74"
          }
        },
        "9c2371c32afe46519ee53427ea42bc9a": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "9e6f46ddb61943f4a168d70a209a7ddc": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "a1e042e5b8ad4b028cc28ac54924207d": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "a24e2cfbef794d35a0e22753352caa15": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_0e49035e0c3a4ee4ab477b475e74ef36",
            "max": 4,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_08d8f0cfd7614900a9c9bba888619749",
            "value": 4
          }
        },
        "a4c19dc98fe943e09a26a60d23c8ff01": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_f8d2164046fb46a298e2d80628808cb3",
            "placeholder": "​",
            "style": "IPY_MODEL_af23432e17664cd6852496e43f9de0cf",
            "value": "tokenizer.json: 100%"
          }
        },
        "a6a122064bc340868b5e0e11afa9c42f": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": "20px"
          }
        },
        "aa22cb2d1dcf4001bd3720b075318906": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_0b8fa4ff186a4bfeac18cf6d676e99df",
            "placeholder": "​",
            "style": "IPY_MODEL_ccb581e230604bb690015eb685e4b8e1",
            "value": " 15.1k/? [00:00&lt;00:00, 1.44MB/s]"
          }
        },
        "aa714b1f70e8495b9299b51d6ac4c3c4": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "ae3819cd042f43babef24199081bb97f": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "ae79b9a378ee4218beddf77ac9af6de7": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "ae9728c04b974af29460ce5179a9edba": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "af23432e17664cd6852496e43f9de0cf": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "b09745899e1446129f397d822a21fc99": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "b10d25fb43eb42198abf71c4e326bcff": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "b1dcd2f908344949af01dab5a244e458": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "b72fc22310714bf0bf6ff4021db5aba8": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "baf3db00d28f4c849bb6a6739e908c62": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_f4f2fb240a12406ab5e709be33b93683",
              "IPY_MODEL_4ca56d8605864a438290152268bfc686",
              "IPY_MODEL_de306a50594f463ba9c78c713bc33241"
            ],
            "layout": "IPY_MODEL_0f09a00375bd4d4c8863fc8fb7d64d61"
          }
        },
        "bb358c5523814376a2d4690f88f20b74": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "bd8f575034c041be93ff20f63388de2d": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "bedc42e908454f1494768194b43b2964": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "c4d592366499414a99f19bce7f0bd665": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "cbc7bbfe983f4e6696f8dd6b37c50543": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "cbed0dab2d7540f697eacec5a33e1061": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_dc9b2549ff834880ac19b578adfee5a5",
              "IPY_MODEL_48f65e25acd84b0cb582de66753215da",
              "IPY_MODEL_7c52841c7e714173bfd526b0a625bc9d"
            ],
            "layout": "IPY_MODEL_bd8f575034c041be93ff20f63388de2d"
          }
        },
        "cca7d922b85449a1b0c5c025b65fba10": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_e7ca6b7ba2094872a888da5511e2bb49",
            "placeholder": "​",
            "style": "IPY_MODEL_7c7d40163ecc4dae8a2c54af21de4661",
            "value": " 1.16G/1.16G [00:09&lt;00:00, 242MB/s]"
          }
        },
        "ccb581e230604bb690015eb685e4b8e1": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "d16dc921c6554bc6b77abcb423721cc8": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "d1cea390ecaa4583a698278fa3a438c9": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "d257eaa588bd41fb947f81d306fe05cd": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "d2a7fa9dddc240e29330297870159c59": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "d4c72002b5fe44d3ad7ae67f5536c889": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_06e420cfa2974f7d8d7ec4b83f064a6e",
            "placeholder": "​",
            "style": "IPY_MODEL_3566e9058ebc45498b42e521f2314365",
            "value": " 4.00G/4.00G [00:19&lt;00:00, 279MB/s]"
          }
        },
        "d541fd264d7e4b2691e8efbd993e6ae7": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "d82ab09313cc482f9b9b45192f489825": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "dc9b2549ff834880ac19b578adfee5a5": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_212c17e3829c4accb30265a3d9ee73dc",
            "placeholder": "​",
            "style": "IPY_MODEL_1daa373c890b4ae0a9cf6a3ec325693c",
            "value": "generation_config.json: 100%"
          }
        },
        "de306a50594f463ba9c78c713bc33241": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_fa5c42218fbb44378bef71551c3383e0",
            "placeholder": "​",
            "style": "IPY_MODEL_d82ab09313cc482f9b9b45192f489825",
            "value": " 4.00G/4.00G [00:25&lt;00:00, 110MB/s]"
          }
        },
        "e6d63cf58647443b9749195ec0579d87": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "e78534b135d2465c83e3be614b71c8a4": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_c4d592366499414a99f19bce7f0bd665",
            "placeholder": "​",
            "style": "IPY_MODEL_1ece5fe9597a4e3db2dd96c38995705c",
            "value": " 27.9M/27.9M [00:01&lt;00:00, 21.9MB/s]"
          }
        },
        "e7ca6b7ba2094872a888da5511e2bb49": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "ea2f3ca562444c46b50596a1d7cf9030": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_56b30cc150924879abfc138427f4ca98",
            "placeholder": "​",
            "style": "IPY_MODEL_02c88a690a384ae183c233b6927aaf57",
            "value": "special_tokens_map.json: 100%"
          }
        },
        "eb1068efb4364ee2893c4d21c58f38db": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "eb13ad96565a44519e7cab9ce9483b90": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_6d77c8dcb28240f7b860571d11f8b9af",
            "placeholder": "​",
            "style": "IPY_MODEL_4f29591af0d64d398515479b032a1b3d",
            "value": "model-00002-of-00004.safetensors: 100%"
          }
        },
        "eb8c197ce1fc41f78531e5e73ae14a89": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "eee0d2b6eb6047d8a4001f4999dcdc35": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_d257eaa588bd41fb947f81d306fe05cd",
            "placeholder": "​",
            "style": "IPY_MODEL_b09745899e1446129f397d822a21fc99",
            "value": "model.safetensors.index.json: "
          }
        },
        "f290b17e3cd9487b9b97d54d6cec9efc": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "f34cadfaf9bb46729aeeff9492ba9026": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "f4f2fb240a12406ab5e709be33b93683": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_89e3ad0d517f44d084df0d8a3ed40703",
            "placeholder": "​",
            "style": "IPY_MODEL_3f589cf3cb804ed29ba322e4fa10c511",
            "value": "model-00001-of-00004.safetensors: 100%"
          }
        },
        "f7cc1614e13d4d22b83f787b20407a5f": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_7ce0f239a8514923b383f738bc0c9899",
            "placeholder": "​",
            "style": "IPY_MODEL_4bc16bd2399a43fe85967131af7f846a",
            "value": "model-00003-of-00004.safetensors: 100%"
          }
        },
        "f8d2164046fb46a298e2d80628808cb3": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "f9d4672fb86b4b4e9c3e9068dc479e5b": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "fa5c42218fbb44378bef71551c3383e0": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "fc12c97d659640b3b2b2b48ad7f17e5b": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_28300c16023d4ad9a59784baea2f57aa",
            "max": 1,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_004f28173c3b4fb6a9f8c2068f5db81f",
            "value": 1
          }
        }
      }
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}