{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "GtkTGCuh6QZy" }, "source": [ "# Reinforcement Learning with OpenAI gpt-oss-20b: Teaching an LLM to Play 2048\n", "\n", "In this tutorial, we'll teach OpenAI's open-source model **gpt-oss 20b** to generate winning strategies for the classic 2048 puzzle game using **reinforcement learning (RL)**. By the end, you'll understand how to:\n", "\n", "- Connect LLMs to game environments using **OpenEnv**\n", "- Design reward functions that guide model behavior\n", "- Train models with **GRPO** (Group Relative Policy Optimization)\n", "- Prevent \"reward hacking\" with code sandboxing\n", "\n", "**Requirements:** This notebook runs on a free Tesla T4 Google Colab instance." ] }, { "cell_type": "markdown", "metadata": { "id": "hzPgFeIkZn9q" }, "source": [ "## What is 2048?\n", "\n", "**2048** is an fun single-player sliding puzzle game created by Gabriele Cirulli in 2014. The game is played on a 4Γ—4 grid where numbered tiles slide in four directions (up, down, left, right). When two tiles with the same number collide, they merge into one tile with their sum. The goal is to create a tile with the value **2048**β€”though skilled players can continue beyond that!\n", "\n", "The game requires strategic thinking: random moves quickly lead to a gridlock, while optimal play involves keeping high-value tiles in corners and building systematically.\n", "\n", "\n", "\n", "## Our Goal\n", "\n", "We'll use reinforcement learning to train **OpenAI gpt-oss-20b** to generate Python functions that implement winning 2048 strategies. Rather than playing move-by-move, the model will learn to write *code* that plays the gameβ€”a form of \"code generation as policy.\"" ] }, { "cell_type": "markdown", "metadata": { "id": "31KIMLJLnHET" }, "source": [ "## Installation\n", "\n", "We need two key libraries for this tutorial:\n", "\n", "1. 
**[OpenEnv](https://github.com/meta-pytorch/OpenEnv)** - A unified interface to reinforcement learning environments. Traditional RL setups require installing and configuring each environment separately (Gym, OpenSpiel, Atari, etc.). OpenEnv provides a consistent API across all of them. Best of all, OpenEnv environments are available on Hugging Face Spaces, so we can connect to them remotely without installing the environments themselves locally.\n", "\n", "2. **[Unsloth](https://github.com/unslothai/unsloth)** - An optimized training library that reduces VRAM usage by ~70% through memory-efficient LoRA and gradient checkpointing. This lets us run RL on a free Colab T4 GPU." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "CGoDZwcunHEU" }, "outputs": [], "source": [ "%%capture\n", "import os, importlib.util\n", "\n", "!pip install --upgrade -qqq uv\n", "if importlib.util.find_spec(\"torch\") is None or \"COLAB_\" in \"\".join(os.environ.keys()):\n", "    # Pin numpy to the already-installed version, if there is one\n", "    try:\n", "        import numpy\n", "\n", "        get_numpy = f\"numpy=={numpy.__version__}\"\n", "    except Exception:\n", "        get_numpy = \"numpy\"\n", "    !uv pip install -qqq \\\n", "        \"torch>=2.8.0\" \"triton>=3.4.0\" {get_numpy} torchvision bitsandbytes \"transformers==4.56.2\" trackio \\\n", "        \"unsloth_zoo[base] @ git+https://github.com/unslothai/unsloth-zoo\" \\\n", "        \"unsloth[base] @ git+https://github.com/unslothai/unsloth\" \\\n", "        git+https://github.com/triton-lang/triton.git@0add68262ab0a2e33b84524346cb27cbb2787356#subdirectory=python/triton_kernels\n", "elif importlib.util.find_spec(\"unsloth\") is None:\n", "    !uv pip install -qqq unsloth trackio\n", "\n", "!uv pip install --upgrade --no-deps transformers==4.56.2 tokenizers trl==0.22.2 unsloth unsloth_zoo" ] }, { "cell_type": "markdown", "metadata": { "id": "OjifwNNZ7bMx" }, "source": [ "Next, we install the OpenEnv client and the connector for **OpenSpiel**, DeepMind's collection of game environments used in RL research. 
OpenSpiel includes implementations of classic games like Chess, Go, and our target: 2048." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "1yzMMSLR7dNj" }, "outputs": [], "source": [ "%%capture\n", "!pip install -qqq openenv-core websockets\n", "!pip install -qqq git+https://huggingface.co/spaces/openenv/openspiel_env" ] }, { "cell_type": "markdown", "metadata": { "id": "CcLYwLyQLADE" }, "source": [ "## Loading OpenAI gpt-oss 20b\n", "\n", "We load the model with several memory optimizations:\n", "\n", "| Parameter | Value | Description |\n", "|-----------|-------|-------------|\n", "| `max_seq_length` | 768 | Maximum context length. Increase for longer outputs (uses more VRAM). |\n", "| `load_in_4bit` | True | Quantizes weights to 4-bit, dramatically reducing memory usage. |\n", "| `lora_rank` | 4 | LoRA adapter rank. Higher = more expressive but slower/more memory. |\n", "| `offload_embedding` | True | Moves embeddings to CPU RAM, saving ~1GB VRAM. |" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 543, "referenced_widgets": [ "419e4369b36644d3abec159511a88ad8", "eee0d2b6eb6047d8a4001f4999dcdc35", "2d91f25a8d0b4dd39dc320b2cf17fc0b", "1b84d7dc9567474c8587432d48342b71", "f34cadfaf9bb46729aeeff9492ba9026", "d257eaa588bd41fb947f81d306fe05cd", "b09745899e1446129f397d822a21fc99", "88aa58551344410a91b07af480d6ab53", "4f38070c09354406928c5e7be6cff3fd", "665f7b7e5496432985ed9f49829f5834", "d16dc921c6554bc6b77abcb423721cc8", "baf3db00d28f4c849bb6a6739e908c62", "f4f2fb240a12406ab5e709be33b93683", "4ca56d8605864a438290152268bfc686", "de306a50594f463ba9c78c713bc33241", "0f09a00375bd4d4c8863fc8fb7d64d61", "89e3ad0d517f44d084df0d8a3ed40703", "3f589cf3cb804ed29ba322e4fa10c511", "2612008cf36949d9a6618a01ac817618", "3e3ce50c437a412f9cc2bedb697648f7", "fa5c42218fbb44378bef71551c3383e0", "d82ab09313cc482f9b9b45192f489825", "4e63263fc27b4f07b4f6abffba082379", 
"eb13ad96565a44519e7cab9ce9483b90", "0b851acfd32047bfb6bae17d43ccfcb1", "d4c72002b5fe44d3ad7ae67f5536c889", "a1e042e5b8ad4b028cc28ac54924207d", "6d77c8dcb28240f7b860571d11f8b9af", "4f29591af0d64d398515479b032a1b3d", "667192c07a7340a9a72ed25648a0be64", "aa714b1f70e8495b9299b51d6ac4c3c4", "06e420cfa2974f7d8d7ec4b83f064a6e", "3566e9058ebc45498b42e521f2314365", "6af6de3285684e76b320571b44af5fc1", "f7cc1614e13d4d22b83f787b20407a5f", "072060f8bdb54a15baf838f67d376d99", "5371b2fad7b04f97bd8f4671d844d2cb", "b10d25fb43eb42198abf71c4e326bcff", "7ce0f239a8514923b383f738bc0c9899", "4bc16bd2399a43fe85967131af7f846a", "29d98af49dd3412f84f5843b937029d1", "93176d6b63284b4e832e0f028be90655", "9c2371c32afe46519ee53427ea42bc9a", "f9d4672fb86b4b4e9c3e9068dc479e5b", "297bdff1add5414893319b185cb15da6", "362947c7e102474cbf560f51e713bdff", "7f63ff3e87aa4b86a4f8e64785d1d34c", "cca7d922b85449a1b0c5c025b65fba10", "5520c59f24cd414092bf5f952425611d", "b1dcd2f908344949af01dab5a244e458", "9e6f46ddb61943f4a168d70a209a7ddc", "ae79b9a378ee4218beddf77ac9af6de7", "e6d63cf58647443b9749195ec0579d87", "e7ca6b7ba2094872a888da5511e2bb49", "7c7d40163ecc4dae8a2c54af21de4661", "4dc823fcd0dd4eaaa2e8aaff0daa9ad1", "1b26eac03fed4c8784bd611474cf4607", "a24e2cfbef794d35a0e22753352caa15", "8368e4420e814c6f9be30994b69c66ee", "eb8c197ce1fc41f78531e5e73ae14a89", "b72fc22310714bf0bf6ff4021db5aba8", "d2a7fa9dddc240e29330297870159c59", "0e49035e0c3a4ee4ab477b475e74ef36", "08d8f0cfd7614900a9c9bba888619749", "d1cea390ecaa4583a698278fa3a438c9", "5483233a9b224f3c8eb9337e4ed82314", "cbed0dab2d7540f697eacec5a33e1061", "dc9b2549ff834880ac19b578adfee5a5", "48f65e25acd84b0cb582de66753215da", "7c52841c7e714173bfd526b0a625bc9d", "bd8f575034c041be93ff20f63388de2d", "212c17e3829c4accb30265a3d9ee73dc", "1daa373c890b4ae0a9cf6a3ec325693c", "383e9b4b74e34cae96c1f46f41591b82", "46cb3593b37c4cb9b4bac422bc5809c8", "ae9728c04b974af29460ce5179a9edba", "754f2452fbe14c7098215ec810ffbf14", "3c2e00b9d20a4c9ea5b850b776752fd2", 
"7c29e5f727ec4b608c06d0487a2c9e53", "4beae5415d0647e2898a83313a08ea94", "62cd4ccc430549e7a2c156222d47ebeb", "4691273248984fdea29d83ec0a246cd9", "ae3819cd042f43babef24199081bb97f", "52e9a0ea76df4fd8822ccadadaadb501", "a6a122064bc340868b5e0e11afa9c42f", "3371a1da5d0e426bb6cc02a6c383dd6f", "bedc42e908454f1494768194b43b2964", "41ee8407344d402997b0574e0ae26c77", "5d1e2fdbf7a2409abbafb63e4b160668", "a4c19dc98fe943e09a26a60d23c8ff01", "8f5799610318490492ff5aba76be3d1a", "e78534b135d2465c83e3be614b71c8a4", "2f57c8e713b94c8692837b4e17c9e983", "f8d2164046fb46a298e2d80628808cb3", "af23432e17664cd6852496e43f9de0cf", "1055200185004ea2a95a05eb51232501", "eb1068efb4364ee2893c4d21c58f38db", "c4d592366499414a99f19bce7f0bd665", "1ece5fe9597a4e3db2dd96c38995705c", "050567dccb47456aaac65d118ac60a6b", "ea2f3ca562444c46b50596a1d7cf9030", "0b7287d482cc44dbb406d71f23f1aea0", "474c7fe6ef4b430d9826171eded2ebf1", "12a877304ddf45e49bbdfe056394c3d6", "56b30cc150924879abfc138427f4ca98", "02c88a690a384ae183c233b6927aaf57", "d541fd264d7e4b2691e8efbd993e6ae7", "cbc7bbfe983f4e6696f8dd6b37c50543", "f290b17e3cd9487b9b97d54d6cec9efc", "18d4cc3fd8ee4e3fbf27c574fd467f20", "941d8cbf8188402eb603c18b6e979035", "8708226010324a83b4f7900c3958d430", "fc12c97d659640b3b2b2b48ad7f17e5b", "aa22cb2d1dcf4001bd3720b075318906", "bb358c5523814376a2d4690f88f20b74", "4ba91b8d008e483d89f16f204326f6b6", "31afdb187d2a44d8b3a101fa18543c13", "28300c16023d4ad9a59784baea2f57aa", "004f28173c3b4fb6a9f8c2068f5db81f", "0b8fa4ff186a4bfeac18cf6d676e99df", "ccb581e230604bb690015eb685e4b8e1" ] }, "id": "DkIvEkIIkEyB", "outputId": "feb2ac4d-19dc-4e55-ebbe-e28e0ef721c1" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "πŸ¦₯ Unsloth: Will patch your computer to enable 2x faster free finetuning.\n", "πŸ¦₯ Unsloth Zoo will now patch everything to make training faster!\n", "==((====))== Unsloth 2026.1.3: Fast Gpt_Oss patching. Transformers: 4.56.2.\n", " \\\\ /| Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. 
Platform: Linux.\n", "O^O/ \\_/ \\ Torch: 2.9.0+cu126. CUDA: 7.5. CUDA Toolkit: 12.6. Triton: 3.5.0\n", "\\ / Bfloat16 = FALSE. FA [Xformers = None. FA2 = False]\n", " \"-____-\" Free license: http://github.com/unslothai/unsloth\n", "Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!\n", "Unsloth: Using float16 precision for gpt_oss won't work! Using float32.\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "419e4369b36644d3abec159511a88ad8", "version_major": 2, "version_minor": 0 }, "text/plain": [ "model.safetensors.index.json: 0.00B [00:00, ?B/s]" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "baf3db00d28f4c849bb6a6739e908c62", "version_major": 2, "version_minor": 0 }, "text/plain": [ "model-00001-of-00004.safetensors: 0%| | 0.00/4.00G [00:00 0 ! Suggested 8, 16, 32, 64, 128\n", " target_modules=[\n", " \"q_proj\",\n", " \"k_proj\",\n", " \"v_proj\",\n", " \"o_proj\",\n", " \"gate_proj\",\n", " \"up_proj\",\n", " \"down_proj\",\n", " ],\n", " lora_alpha=lora_rank * 2, # *2 speeds up training\n", " use_gradient_checkpointing=\"unsloth\", # Reduces memory usage\n", " random_state=3407,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "N0QnO9_YJBOI" }, "source": [ "## Connecting to the 2048 Game Environment\n", "\n", "OpenEnv lets us connect to game environments hosted remotely. We'll use a **Hugging Face Space** that runs the OpenSpiel 2048 game server. This architecture has several benefits:\n", "\n", "- **No local installation** of game dependencies (OpenSpiel can be tricky to build)\n", "- **Consistent environment** across different machines\n", "- **Scalable** - the same pattern works for more complex environments\n", "\n", "The Space exposes a WebSocket API that accepts actions and returns game states." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "hQrG81XV87-7" }, "outputs": [], "source": [ "from openspiel_env import OpenSpielEnv\n", "from openspiel_env.models import OpenSpielAction, OpenSpielObservation" ] }, { "cell_type": "markdown", "metadata": { "id": "-WT0Zu0IcN_r" }, "source": [ "The [openenv/openspiel_env](https://huggingface.co/spaces/openenv/openspiel_env) Space hosts a running OpenSpiel server configured for 2048. It handles game state management, validates moves, and returns observations after each action. You can also run the server locally for faster iteration:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "zBIye94T8djG" }, "outputs": [], "source": [ "# Connect to OpenSpiel 2048 environment on HuggingFace Spaces\n", "# The game is configured server-side via OPENSPIEL_GAME=2048\n", "OPENSPIEL_URL = \"https://openenv-openspiel-env.hf.space\"\n", "# For local: OPENSPIEL_URL = \"http://localhost:8000\"\n", "\n", "env = OpenSpielEnv(base_url=OPENSPIEL_URL)" ] }, { "cell_type": "markdown", "metadata": { "id": "P3rMiKLl9Ro2" }, "source": [ "Let's see how the current 2048 game state looks like:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "uBPx0Hho9Xi1", "outputId": "7e7fae62-6d83-4c5e-b8c5-b49a1ccc9c53" }, "outputs": [ { "data": { "text/plain": [ "OpenSpielObservation(done=False, reward=None, metadata={}, info_state=[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0], legal_actions=[0, 1, 2], game_phase='initial', current_player_id=0, opponent_last_action=None)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result = env.reset()\n", "current_state = result.observation\n", 
"current_state" ] }, { "cell_type": "markdown", "metadata": { "id": "4Qz1tRVTAii8" }, "source": [ "### Decoding the Game State\n", "\n", "OpenSpiel's 2048 `info_state` uses a compact encodingβ€”not raw tile values. The first 16 elements represent the 4Γ—4 board positions, with each value being **logβ‚‚ of the tile** (so 1 = 2ΒΉ = 2, 2 = 2Β² = 4, etc.). Values of 0 represent empty cells.\n", "\n", "We need to:\n", "1. Extract only the first 16 elements (the board)\n", "2. Reshape into a 4Γ—4 grid\n", "3. Convert from logβ‚‚ encoding to actual tile values" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "7PVeoKW2AmKr", "outputId": "5689ae76-cd68-48f5-e7e7-e05494109a63" }, "outputs": [ { "data": { "text/plain": [ "([[0, 1, 0, 0, 0, 0, 0, 0],\n", " [0, 0, 0, 0, 0, 0, 0],\n", " [0, 0, 0, 0, 0, 0, 0],\n", " [0, 0, 0, 0, 0, 0, 0],\n", " [0, 0, 0, 0, 0, 0, 0],\n", " [0, 0, 0, 0, 0, 0, 0],\n", " [0, 0, 0, 0, 1, 0, 0]],\n", " 7)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "\n", "# 2048 game constants\n", "BOARD_SIZE = 4\n", "BOARD_CELLS = BOARD_SIZE * BOARD_SIZE # 16\n", "WIN_TILE = 2048 # The target tile value to win\n", "\n", "\n", "def convert_to_board(current_state):\n", " \"\"\"\n", " Convert OpenSpiel 2048 observation to a 4Γ—4 board of tile values.\n", "\n", " OpenSpiel encodes tiles as logβ‚‚(value), so we convert back:\n", " - 0 β†’ 0 (empty)\n", " - 1 β†’ 2 (2^1)\n", " - 2 β†’ 4 (2^2)\n", " - etc.\n", " \"\"\"\n", " # Extract only the first 16 elements (the board state)\n", " raw_board = current_state.info_state[:BOARD_CELLS]\n", "\n", " # Convert from logβ‚‚ encoding to actual tile values\n", " # 0 stays 0 (empty), otherwise 2^value\n", " tiles = [int(2**val) if val > 0 else 0 for val in raw_board]\n", "\n", " # Reshape into 4Γ—4 grid\n", " board = [tiles[i * BOARD_SIZE : (i + 1) * BOARD_SIZE] for i in 
range(BOARD_SIZE)]\n", " return board, BOARD_SIZE\n", "\n", "\n", "convert_to_board(current_state)" ] }, { "cell_type": "markdown", "metadata": { "id": "hCS56yu29dsG" }, "source": [ "We also want to pretty print the game board! This is not entirely necessary, but it helps us visualize the game state and learn from the process." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "D9CI4jtgL5mw" }, "outputs": [], "source": [ "# @title (Collapsible) 2048 Game Renderer\n", "def render_board(obs, colors: bool = True, border: bool = True, dot_for_zero: bool = True) -> str:\n", " \"\"\"\n", " Pretty-print the board with colors that scale from 0 up to self.target.\n", " Uses ANSI 256-color codes (works in most terminals). Set colors=False to disable.\n", " \"\"\"\n", " import math\n", "\n", " b, size = convert_to_board(obs)\n", " mx = max((max(row) for row in b), default=0)\n", " cell_w = max(3, len(str(mx)))\n", "\n", " RESET = \"\\x1b[0m\"\n", "\n", " # A smooth-ish gradient from cool β†’ warm\n", " # (blue/cyan/green β†’ yellow/orange/red). 
Tweak or expand as you like.\n", " GRAD = [33, 39, 45, 51, 50, 49, 48, 47, 46, 82, 118, 154, 190, 226, 220, 214, 208, 202, 196]\n", " ZERO_FG = 239 # dim gray\n", "\n", " def color_code(v: int) -> str:\n", " if not colors:\n", " return \"\"\n", " if v == 0:\n", " return f\"\\x1b[38;5;{ZERO_FG}m\"\n", " # Normalize by exponent relative to target: r in [0,1]\n", " t = max(2, WIN_TILE) # safety; avoid log2(1)\n", " # Guard: if v is not a power of two or is <1, handle gracefully\n", " try:\n", " r = max(0.0, min(1.0, math.log2(v) / math.log2(t)))\n", " except ValueError:\n", " r = 0.0\n", " idx = int(round(r * (len(GRAD) - 1)))\n", " return f\"\\x1b[38;5;{GRAD[idx]}m\"\n", "\n", " def fmt(v: int) -> str:\n", " s = \".\" if (v == 0 and dot_for_zero) else str(v)\n", " s = s.rjust(cell_w)\n", " return color_code(v) + s + (RESET if colors else \"\")\n", "\n", " def hline(left: str, mid: str, right: str) -> str:\n", " return left + mid.join(\"─\" * cell_w for _ in range(size)) + right\n", "\n", " rows = []\n", " if border:\n", " rows.append(hline(\"β”Œ\", \"┬\", \"┐\"))\n", " for r in range(size):\n", " content = \"β”‚\".join(fmt(v) for v in b[r])\n", " rows.append((\"β”‚\" + content + \"β”‚\") if border else content)\n", " if border:\n", " rows.append(\n", " hline(\"β””\" if r == size - 1 else \"β”œ\", \"β”΄\" if r == size - 1 else \"β”Ό\", \"β”˜\" if r == size - 1 else \"─\")\n", " )\n", " return \"\\n\".join(rows)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "DAJE2LUo9oRR", "outputId": "4b15a5e3-26fb-4ce0-a4b5-0261ab7a5999" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "β”Œβ”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m 
.\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", 
"β””β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”˜\n" ] } ], "source": [ "print(render_board(current_state))" ] }, { "cell_type": "markdown", "metadata": { "id": "0AhUa4hW-Dji" }, "source": [ "We can see the `legal_actions` ie what you can take as `[0, 1, 2, 3]` Let's try doing the action `0`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "b-gSgthFI_wq", "outputId": "b490416b-e52e-4e13-d761-23c4efecf08a" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "β”Œβ”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m 
.\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β””β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”˜\n" ] } ], "source": [ "action = OpenSpielAction(action_id=0, game_name=\"2048\")\n", "result = env.step(action)\n", "current_state = result.observation\n", "print(render_board(current_state))" ] }, { "cell_type": "markdown", "metadata": { "id": "lPSNb8-A-iPn" }, "source": [ "So it looks like `0` is a move up action! Let's try `1`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "IUel11Tc-oLB", "outputId": "b67365e4-d760-4d49-dafc-57f4b397c98c" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "β”Œβ”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m 
.\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β””β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”˜\n" ] } ], "source": [ "action = OpenSpielAction(action_id=1, game_name=\"2048\")\n", "result = env.step(action)\n", "current_state = result.observation\n", "print(render_board(current_state))" ] }, { "cell_type": "markdown", "metadata": { "id": "nUlOshVe-qNL" }, "source": [ "`1` is a move right action. And `2`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "XU09-KA3-sqs", "outputId": "f9089a3f-f564-418f-b2bd-512103f13135" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "β”Œβ”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m 
.\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β””β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”˜\n" ] } ], "source": [ "action = OpenSpielAction(action_id=2, game_name=\"2048\")\n", "result = env.step(action)\n", "current_state = result.observation\n", "print(render_board(current_state))" ] }, { "cell_type": "markdown", "metadata": { "id": "X2r7Zqw9-u-d" }, "source": [ "`2` is a move down. And I guess `3` is just move left!" 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "pFgspqn6-zd2", "outputId": "65e800d7-9d78-420e-b4fa-232f4c919481" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "β”Œβ”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m 
.\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\n", "β””β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”˜\n" ] } ], "source": [ "action = OpenSpielAction(action_id=3, game_name=\"2048\")\n", "result = env.step(action)\n", "current_state = result.observation\n", "print(render_board(current_state))" ] }, { "cell_type": "markdown", "metadata": { "id": "RJP4TsDq-2ft" }, "source": [ "We can also print the game status which indicates if no more moves are possible, and also the possible actions you can take!" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "MEa2ngmrvfNm", "outputId": "48462391-11d8-4c55-e5c4-f05ec1df6dfe" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "False\n", "[0, 1, 2]\n" ] } ], "source": [ "print(current_state.done)\n", "print(current_state.legal_actions)" ] }, { "cell_type": "markdown", "metadata": { "id": "VR6czU96cpxf" }, "source": [ "## RL Environment Setup: The Strategy Executor\n", "\n", "For reinforcement learning, we need a way to evaluate generated strategies. The key insight is that **our model doesn't play 2048 directly**β€”instead, it writes Python code that plays the game. We then execute that code and measure how well it performs.\n", "\n", "The `execute_strategy` function:\n", "1. Takes a generated Python function (the \"strategy\")\n", "2. Runs the 2048 game loop, calling the strategy for each move\n", "3. 
Returns how many steps the game lasted and whether it reached 2048\n", "\n", "**Timeout protection**: LLM-generated code might contain infinite loops or be very slow. We wrap execution with a 2-second timeout to ensure the RL training loop doesn't hang." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "tdgjnf-8z_kr" }, "outputs": [], "source": [ "from typing import Callable\n", "from unsloth import execute_with_time_limit\n", "import itertools\n", "\n", "\n", "def has_won(board) -> bool:\n", "    \"\"\"Check if the board contains a winning tile (2048 or higher).\"\"\"\n", "    max_tile = max(itertools.chain.from_iterable(board))\n", "    return max_tile >= WIN_TILE\n", "\n", "\n", "def _execute_strategy(strategy, current_state: OpenSpielObservation):\n", "    \"\"\"Execute a strategy function on the 2048 game until completion or invalid move.\"\"\"\n", "    assert callable(strategy)\n", "\n", "    steps = 0\n", "    total_reward = 0\n", "\n", "    while not current_state.done:\n", "        board, _ = convert_to_board(current_state)\n", "        action = strategy(board)\n", "        try:\n", "            action = int(action)\n", "        except (TypeError, ValueError, OverflowError):\n", "            # The strategy returned something that is not a move\n", "            return steps, False\n", "        steps += 1\n", "\n", "        # Invalid action - return current win status\n", "        if action not in current_state.legal_actions:\n", "            return steps, has_won(board) if board else False\n", "\n", "        action = OpenSpielAction(action_id=action, game_name=\"2048\")\n", "        result = env.step(action)\n", "        current_state = result.observation\n", "        if result.reward is not None:\n", "            total_reward += result.reward\n", "\n", "    # Game ended - check the final board (after the last move) for a win\n", "    board, _ = convert_to_board(current_state)\n", "    return steps, has_won(board)\n", "\n", "\n", "@execute_with_time_limit(2)\n", "def execute_strategy(strategy: Callable, current_state: OpenSpielObservation):\n", "    return _execute_strategy(strategy, current_state)" ] }, { "cell_type": "markdown", "metadata": { "id": 
"ywh0HizI9ayE" }, "source": [ "Let's define a trivial strategy that always returns `3` (move left). We expect it to fail as soon as `3` is not a legal action:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "5bkhqoZc0IO8", "outputId": "c773dddf-7a5b-40fa-b061-33f407f8b804" }, "outputs": [ { "data": { "text/plain": [ "(1, False)" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def always_move_left(board):\n", "    return 3\n", "\n", "\n", "# Reset OpenEnv to an initial state!\n", "result = env.reset()\n", "current_state = result.observation\n", "steps, if_done = 0, False  # Defaults in case the strategy times out\n", "try:\n", "    steps, if_done = execute_strategy(always_move_left, current_state)\n", "except TimeoutError as e:\n", "    print(f\"Timed out with error = {str(e)}\")\n", "\n", "steps, if_done" ] }, { "cell_type": "markdown", "metadata": { "id": "dkuHVdB09sgf" }, "source": [ "To give strategies generated during gpt-oss RL training more room to run, we raise the time limit to 5 seconds:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "SK-LfzsA9wbW" }, "outputs": [], "source": [ "@execute_with_time_limit(5)\n", "def execute_strategy(strategy: Callable, current_state: OpenSpielObservation):\n", "    return _execute_strategy(strategy, current_state)" ] }, { "cell_type": "markdown", "metadata": { "id": "tRhLV_bZMYxy" }, "source": [ "## Sandboxed Code Execution: Preventing Reward Hacking\n", "\n", "A critical challenge in RL with code generation is **reward hacking**: the model might learn to \"cheat\" rather than solve the actual problem. For example, it could:\n", "\n", "- Import external libraries to hardcode solutions\n", "- Access global variables to manipulate game state directly\n", "- Call system functions to bypass the game logic\n", "\n", "We use two safeguards:\n", "\n", "1. `check_python_modules` validates that the code only uses Python standard library imports (no numpy, pandas, etc.)\n", "2. 
`create_locked_down_function` executes code in an isolated namespace with no access to global variables\n", "\n", "Let's see these in action. First, a valid strategy that only uses the standard library:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "zz80kvg6M4BG", "outputId": "19d7f047-127f-4667-8d8a-c64c132f87ea" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Only Python imports? True\n", "{'stdlib': ['math', 'typing'], 'non_stdlib': [], 'relative_imports': 0}\n" ] } ], "source": [ "from unsloth import check_python_modules\n", "\n", "sample = \"\"\"\n", "def strategy(board):\n", "    import math\n", "    from typing import Callable\n", "    return \"0\"\n", "\"\"\"\n", "ok, info = check_python_modules(sample)\n", "print(\"Only Python imports?\", ok)\n", "print(info)" ] }, { "cell_type": "markdown", "metadata": { "id": "bZzVWgKQ-VIg" }, "source": [ "The code below imports `numpy`, so the check should reject it:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "Z89Jw1KB-Ux7", "outputId": "203c5dc5-bd09-42cc-b67f-08dd4a23af1c" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Only Python imports? False\n", "{'stdlib': [], 'non_stdlib': ['numpy'], 'relative_imports': 0}\n" ] } ], "source": [ "sample = \"\"\"\n", "def strategy(board):\n", "    from numpy import matmul\n", "    return \"0\"\n", "\"\"\"\n", "ok, info = check_python_modules(sample)\n", "print(\"Only Python imports?\", ok)\n", "print(info)" ] }, { "cell_type": "markdown", "metadata": { "id": "SDSrjOTLVyQm" }, "source": [ "We also disallow global variable access. 
We'll use Unsloth's `create_locked_down_function` for this:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "GcmYAmohVqw2", "outputId": "b66e5e50-5ec8-42f6-8440-d5d8bc5dcc08" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "name 'np' is not defined\n" ] } ], "source": [ "from unsloth import create_locked_down_function\n", "\n", "function = \"\"\"\n", "def import_numpy():\n", "    np.matmul\n", "    print(\"Success\")\n", "\"\"\"\n", "f = create_locked_down_function(function)\n", "try:\n", "    f()\n", "except Exception as e:\n", "    print(str(e))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "5tJKwLUgZsRq", "outputId": "512a7a58-5a11-4612-fccd-55ee2678f97c" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "60\n" ] } ], "source": [ "from unsloth import create_locked_down_function\n", "\n", "function = \"\"\"\n", "def add(a, b):\n", "    def adder(a):\n", "        return a + b\n", "    return adder(b) + b\n", "\"\"\"\n", "f = create_locked_down_function(function)\n", "try:\n", "    print(f(10, 20))\n", "except Exception as e:\n", "    print(str(e))" ] }, { "cell_type": "markdown", "metadata": { "id": "8CzwCyXIPK04" }, "source": [ "## Prompt Design: Instructing the Model\n", "\n", "The prompt is crucial: it tells the model what we expect it to generate. We want:\n", "\n", "1. A **single Python function** named `strategy(board)` that:\n", "   - Takes a 4×4 list of lists as input\n", "   - Returns a move: \"0\" (up), \"1\" (right), \"2\" (down), or \"3\" (left)\n", "2. **Self-contained code** with all helpers defined inside the function\n", "3. **Native Python only** (no external dependencies)\n", "\n", "This structured output format makes parsing straightforward and helps the model understand the task boundaries." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "B-2RRE4HMrQO", "outputId": "20224710-abb6-4b34-a13f-dfb526c6fa9f" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Create a new short 2048 strategy using only native Python code.\n", "You are given a list of list of numbers for the current board state.\n", "Output one action for \"0\", \"1\", \"2\", \"3\" on what is the optimal next step.\n", "Output your new short function in backticks using the format below:\n", "```python\n", "def strategy(board):\n", " return \"0\" # Example\n", "```\n", "All helper functions should be inside def strategy. Only output the short function `strategy`.\n" ] } ], "source": [ "prompt = \"\"\"\n", "Create a new short 2048 strategy using only native Python code.\n", "You are given a list of list of numbers for the current board state.\n", "Output one action for \"0\", \"1\", \"2\", \"3\" on what is the optimal next step.\n", "Output your new short function in backticks using the format below:\n", "```python\n", "def strategy(board):\n", " return \"0\" # Example\n", "```\n", "All helper functions should be inside def strategy. Only output the short function `strategy`.\n", "\"\"\".strip()\n", "print(prompt)" ] }, { "cell_type": "markdown", "metadata": { "id": "MIdudFUodN4i" }, "source": [ "Let's see what the **base model** (before RL training) generates when given this prompt:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "9HJxrS76h3Ds", "outputId": "b6ead093-ac19-49f6-9d0f-98e6e196dace" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.\n", "Knowledge cutoff: 2024-06\n", "Current date: 2026-01-20\n", "\n", "Reasoning: low\n", "\n", "# Valid channels: analysis, commentary, final. 
Channel must be included for every message.\n", "Calls to these tools must go to the commentary channel: 'functions'.<|end|><|start|>user<|message|>Create a new short 2048 strategy using only native Python code.\n", "You are given a list of list of numbers for the current board state.\n", "Output one action for \"0\", \"1\", \"2\", \"3\" on what is the optimal next step.\n", "Output your new short function in backticks using the format below:\n", "```python\n", "def strategy(board):\n", " return \"0\" # Example\n", "```\n", "All helper functions should be inside def strategy. Only output the short function `strategy`.<|end|><|start|>assistant<|channel|>analysis<|message|>We need to provide a short function. Probably simple heuristic: choose move with lowest collision? Use sum? Just a placeholder.<|end|><|start|>assistant<|channel|>final<|message|>```python\n", "def strategy(board):\n", "    scores = [0]*4\n", "    for i in range(4):\n", "        r = [row[i] for row in board] # column\n", "        if any(r) and any(r) != 0:\n", "            # prefer moves that keep most tiles grouped\n", "            if i==0 or i==1:\n", "                scores[i] += sum(r)\n", "    best = scores.index(max(scores))\n", "    return str(best)\n", "```<|return|>\n" ] } ], "source": [ "text = tokenizer.apply_chat_template(\n", "    [{\"role\": \"user\", \"content\": prompt}],\n", "    tokenize=False,\n", "    add_generation_prompt=True,\n", "    reasoning_effort=\"low\",\n", ")\n", "\n", "from transformers import TextStreamer\n", "\n", "_ = model.generate(\n", "    **tokenizer(text, return_tensors=\"pt\").to(\"cuda\"),\n", "    temperature=1.0,\n", "    max_new_tokens=512,\n", "    streamer=TextStreamer(tokenizer, skip_prompt=False),\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "iknaWZNudTNq" }, "source": [ "## Designing Reward Functions\n", "\n", "Reward functions are the heart of RL: they define the \"good\" behavior that we want to encourage in the model. 
For code generation, we need a multi-part reward that captures different aspects of quality:\n", "\n", "| Reward Function | Purpose | Score Range |\n", "|-----------------|---------|-------------|\n", "| `function_works` | Is the generated code syntactically valid and executable? | -2.0 to +1.0 |\n", "| `no_cheating` | Does the code avoid forbidden imports (numpy, etc.)? | -20.0 to +1.0 |\n", "| `strategy_succeeds` | Does the strategy actually play 2048 well? | -3.0 to +20.0 |\n", "\n", "\n", "Let's look at some practical design choices behind these rewards:\n", "- We could heavily penalize cheating (-20.0) to make honest solutions more rewarding.\n", "- We could massively reward success (+20.0) since reaching 2048 is rare initially.\n", "- We could use graduated penalties for partial failures (timeout vs. crash vs. invalid syntax). This gives the model more information to learn from, creating a 'richer' reward signal.\n", "\n", "First, we need a helper to extract the Python function from the model's markdown-formatted output:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "8JJGXKdJ-Zl_", "outputId": "1474af23-0d1f-4e73-b111-579ea0805d27" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "def strategy(board):\n", " return \"0\" # Example\n" ] } ], "source": [ "def extract_function(text):\n", "    if text.count(\"```\") >= 2:\n", "        first = text.find(\"```\") + 3\n", "        second = text.find(\"```\", first)\n", "        fx = text[first:second].strip()\n", "        fx = fx.removeprefix(\"python\\n\")\n", "        fx = fx[fx.find(\"def\") :]\n", "        if fx.startswith(\"def strategy(board):\"):\n", "            return fx\n", "    return None\n", "\n", "\n", "print(extract_function(prompt))" ] }, { "cell_type": "markdown", "metadata": { "id": "KLXEcf_HSJlI" }, "source": [ "Below is our `function_works` reward function, which uses Python's `exec` but guards against leaking local and global variables. 
We can also call `check_python_modules` first to catch syntax errors before executing anything:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "h3-B0IIsS56S", "outputId": "33348b7f-c5dd-4d22-a0ac-e68a2264426f" }, "outputs": [ { "data": { "text/plain": [ "(False,\n", " {'error': \"SyntaxError: expected '(' (, line 1)\",\n", " 'stdlib': [],\n", " 'non_stdlib': [],\n", " 'relative_imports': 0})" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ok, info = check_python_modules(\"def a\")\n", "ok, info" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "qgFNXORy-lpO" }, "outputs": [], "source": [ "def function_works(completions, **kwargs):\n", "    scores = []\n", "    for completion in completions:\n", "        score = 0\n", "        response = completion[0][\"content\"]\n", "        function = extract_function(response)\n", "        if function is not None:\n", "            ok, info = check_python_modules(function)\n", "        if function is None or \"error\" in info:\n", "            score = -2.0  # No function found, or it has a syntax error\n", "        else:\n", "            try:\n", "                new_strategy = create_locked_down_function(function)\n", "                score = 1.0\n", "            except Exception:\n", "                score = -0.5  # Parsed, but could not be turned into a function\n", "        scores.append(score)\n", "    return scores" ] }, { "cell_type": "markdown", "metadata": { "id": "Gf69i2WT-m4K" }, "source": [ "`no_cheating` checks whether the function cheated by importing NumPy or other non-standard-library modules:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "cUfHzCVx-nGK" }, "outputs": [], "source": [ "def no_cheating(completions, **kwargs):\n", "    scores = []\n", "    for completion in completions:\n", "        response = completion[0][\"content\"]\n", "        function = extract_function(response)\n", "        if function is not None:\n", "            ok, info = check_python_modules(function)\n", "            scores.append(1.0 if ok else -20.0)  # Penalize heavily!\n", "        else:\n", "            scores.append(-1.0)  # Failed creating 
function\n", "    return scores" ] }, { "cell_type": "markdown", "metadata": { "id": "slnqWG3FTror" }, "source": [ "Next, `strategy_succeeds` runs the strategy against the game and checks whether it reaches 2048. A strategy that loops forever or runs too slowly is cut off by the 5-second time limit and penalized.\n", "\n", "We also add a global `PRINTER` counter to periodically print the generated strategy and the board state." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "sNi129lYTpZ2" }, "outputs": [], "source": [ "import numpy as np\n", "\n", "global PRINTER\n", "PRINTER = 0\n", "\n", "\n", "def strategy_succeeds(completions, **kwargs):\n", "    global PRINTER\n", "    scores = []\n", "    for completion in completions:\n", "        printed = False\n", "        score = 0\n", "        response = completion[0][\"content\"]\n", "        function = extract_function(response)\n", "        if PRINTER % 5 == 0:\n", "            printed = True\n", "            print(function)\n", "        PRINTER += 1\n", "        if function is not None:\n", "            ok, info = check_python_modules(function)\n", "        if function is None or \"error\" in info:\n", "            scores.append(0)\n", "            continue\n", "        try:\n", "            new_strategy = create_locked_down_function(function)\n", "        except Exception:\n", "            scores.append(0)\n", "            continue\n", "        try:\n", "            # Reset OpenEnv to an initial state!\n", "            result = env.reset()\n", "            current_state = result.observation\n", "            steps, if_done = execute_strategy(new_strategy, current_state)\n", "            print(f\"Steps = {steps} If Done = {if_done}\")\n", "            if printed is False:\n", "                print(function)\n", "            print(render_board(current_state))\n", "            if if_done:\n", "                scores.append(20.0)  # Success - massively reward!\n", "            else:\n", "                scores.append(2.0)  # Failed but function works!\n", "        except TimeoutError as e:\n", "            print(\"Timeout\")\n", "            scores.append(-1.0)  # Failed with timeout\n", "        except Exception as e:\n", "            print(f\"Exception = {str(e)}\")\n", "            scores.append(-3.0)  # Failed\n", "    return scores" ] }, { "cell_type": "markdown", "metadata": { "id": "TCpSxtvSeAG_" }, "source": [ "We'll 
now create the dataset, which simply repeats our prompt 1,000 times. Remember to set the reasoning effort to low! You can choose high reasoning effort, but that only works on GPUs with more memory, such as MI300s." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "Ldf6SjLHVPRv", "outputId": "684f36b7-d23a-4229-afd6-b0033edf7962" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "181\n" ] }, { "data": { "text/plain": [ "{'prompt': [{'content': 'Create a new short 2048 strategy using only native Python code.\\nYou are given a list of list of numbers for the current board state.\\nOutput one action for \"0\", \"1\", \"2\", \"3\" on what is the optimal next step.\\nOutput your new short function in backticks using the format below:\\n```python\\ndef strategy(board):\\n return \"0\" # Example\\n```\\nAll helper functions should be inside def strategy. Only output the short function `strategy`.',\n", " 'role': 'user'}],\n", " 'answer': 0,\n", " 'reasoning_effort': 'low'}" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from datasets import Dataset\n", "\n", "dataset = Dataset.from_list(\n", "    [{\"prompt\": [{\"role\": \"user\", \"content\": prompt.strip()}], \"answer\": 0, \"reasoning_effort\": \"low\"}] * 1000\n", ")\n", "maximum_length = len(\n", "    tokenizer.apply_chat_template([{\"role\": \"user\", \"content\": prompt.strip()}], add_generation_prompt=True)\n", ")\n", "print(maximum_length)\n", "dataset[0]" ] }, { "cell_type": "markdown", "metadata": { "id": "9-IOMhVg-2AM" }, "source": [ "## Training with GRPO\n", "\n", "**Group Relative Policy Optimization (GRPO)** is an RL algorithm designed for language models. 
Unlike PPO, which requires a separate value network, GRPO computes advantages by comparing generations within the same prompt group, making it simpler and more memory-efficient.\n", "\n", "Key training parameters:\n", "- `num_generations=2`: Generate 2 candidates per prompt to compute relative rewards\n", "- `max_steps=600`: Total training steps (~5 hours on T4)\n", "- `temperature=1.0`: Controls generation randomness (higher = more exploration)\n", "\n", "We use [TrackIO](https://github.com/gradio-app/trackio) for live visualization of training metrics directly in the notebook." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "ptqkXK2D4d6p", "outputId": "42f0f57e-d1a8-4a21-8834-c2dca0e493fb" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Unsloth: We now expect `per_device_train_batch_size` * `gradient_accumulation_steps` * `world_size` to be a multiple of `num_generations`.\n", "We will change the batch size of 1 to the `num_generations` of 2\n" ] } ], "source": [ "max_prompt_length = maximum_length + 1  # + 1 just in case!\n", "max_completion_length = max_seq_length - max_prompt_length\n", "\n", "from trl import GRPOConfig, GRPOTrainer\n", "\n", "training_args = GRPOConfig(\n", "    temperature=1.0,\n", "    learning_rate=2e-4,\n", "    weight_decay=0.001,\n", "    warmup_ratio=0.1,\n", "    lr_scheduler_type=\"linear\",\n", "    optim=\"adamw_8bit\",\n", "    logging_steps=1,\n", "    per_device_train_batch_size=1,\n", "    gradient_accumulation_steps=1,  # Increase to 4 for smoother training\n", "    num_generations=2,  # Decrease if out of memory\n", "    max_prompt_length=max_prompt_length,\n", "    max_completion_length=max_completion_length,\n", "    # num_train_epochs = 1, # Set to 1 for a full training run\n", "    max_steps=600,\n", "    save_steps=100,\n", "    report_to=\"trackio\",  # Can use Weights & Biases, TrackIO\n", "    output_dir=\"outputs\",\n", "    # For optional training + evaluation\n", "    # 
fp16_full_eval = True,\n", "    # per_device_eval_batch_size = 4,\n", "    # eval_accumulation_steps = 1,\n", "    # eval_strategy = \"steps\",\n", "    # eval_steps = 1,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "r9Mv8UZO5hz-" }, "source": [ "Now let's create the trainer. Once training starts, you'll see a table of rewards like the one below. The goal is to see the `reward` column increase!\n", "\n", "You might have to wait 150 to 200 steps before the reward starts to improve. You'll probably get 0 reward for the first 100 steps. Please be patient!\n", "\n", "| Step | Training Loss | reward | reward_std | completion_length | kl |\n", "|------|---------------|-----------|------------|-------------------|----------|\n", "| 1 | 0.000000 | 0.125000 | 0.000000 | 200.000000 | 0.000000 |\n", "| 2 | 0.000000 | 0.072375 | 0.248112 | 200.000000 | 0.000000 |\n", "| 3 | 0.000000 | -0.079000 | 0.163776 | 182.500000 | 0.000005 |\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "vzOuSVCL_GA9", "outputId": "880e7b31-fd7b-4ddc-96c6-9661a6e9a85e" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Unsloth: Switching to float32 training since model cannot work with float16\n" ] } ], "source": [ "trainer = GRPOTrainer(\n", "    model=model,\n", "    processing_class=tokenizer,\n", "    reward_funcs=[\n", "        function_works,\n", "        no_cheating,\n", "        strategy_succeeds,\n", "    ],\n", "    args=training_args,\n", "    train_dataset=dataset,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "fQhtuwP4cf34" }, "source": [ "Now let's train the model! **NOTE:** This might be quite slow: 600 steps take ~5 hours or longer.\n", "\n", "[TrackIO](https://github.com/gradio-app/trackio) might be a bit slow to load - wait 2 minutes until the graphs pop up!" 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000 }, "id": "VGRxPdSCcfC3", "outputId": "9fb52ba6-2b05-4cdf-cd86-642302a96dde" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "==((====))== Unsloth - 2x faster free finetuning | Num GPUs used = 2\n", " \\\\ /| Num examples = 1,000 | Num Epochs = 1 | Total steps = 600\n", "O^O/ \_/ \ Batch size per device = 2 | Gradient accumulation steps = 1\n", "\\ / Data Parallel GPUs = 1 | Total batch size (2 x 1 x 1) = 2\n", " \"-____-\" Trainable parameters = 1,990,656 of 20,916,747,840 (0.01% trained)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "* Running on public URL: https://68c7c4f7168a6e9cd6.gradio.live\n", "* Trackio project initialized: huggingface\n", "* Trackio metrics logged to: /root/.cache/huggingface/trackio\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "* GPU detected, enabling automatic GPU metrics logging\n", "* Created new run: dainty-sunset-0\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "`generation_config` default values have been modified to match model-specific defaults: {'max_length': 131072}. If this is not desired, please set these values explicitly.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "def strategy(board):\n", "    # simple look‑ahead: pick the move that keeps the board almost sorted\n", "    # score is total immobility (higher=better)\n", "    def score(b):\n", "        s = 0\n", "        n = len(b)\n", "        for i in range(n):\n", "            for j in range(n):\n", "                v = b[i][j]\n", "                if v != 0:\n", "                    # neighbors that can merge\n", "                    for di,dj in [(1,0),(-1,0),(0,1),(0,-1)]:\n", "                        ni, nj = i+di, j+dj\n", "                        if 0 <= ni < n and 0 <= nj < n:\n", "                            if b[ni][nj] == v:\n", "                                s += v\n", "        return s\n", "    moves = []\n", "    for m in [\"0\",\"1\",\"2\",\"3\"]:\n", "        new_b = [row[:] for row in board]\n", "        # simulate move (simplified: just skip actual moving)\n", "        # Here we just pick the move with highest score (mock)\n", "        moves.append((score(new_b), m))\n", "    best = max(moves)[1]\n", "    return best\n", "Steps = 1 If Done = False\n", "┌───┬───┬───┬───┬───┬───┬───┐\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", 
"├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "└───┴───┴───┴───┴───┴───┴───┘\n", "Steps = 9 If Done = False\n", "def strategy(board):\n", "    def apply(board, d):\n", "        size = len(board)\n", "        moved, new = False, [[0]*size for _ in range(size)]\n", "        for i in range(size):\n", "            row = board[i] if d in (0,1) else [board[j][i] for j in range(size)]\n", "            vals = [v for v in row if v]\n", "            merged = []\n", "            j = 0\n", "            while j < 
len(vals):\n", "                if j+1 < len(vals) and vals[j]==vals[j+1]:\n", "                    merged.append(vals[j]*2)\n", "                    j+=2\n", "                else:\n", "                    merged.append(vals[j]); j+=1\n", "            merged += [0]*(size-len(merged))\n", "            if d==0: new[i]=merged\n", "            if d==1: new[i]=merged[::-1]\n", "            if d==2: new[i]=merged\n", "            if d==3: new[i]=merged[::-1]\n", "        return new\n", "    best, best_move = 0, \"0\"\n", "    for move in \"0123\":\n", "        new = apply(board, int(move))\n", "        cnt = sum(1 for r in new for v in r if v==0)\n", "        if cnt > best:\n", "            best, best_move = cnt, move\n", "    return best_move\n", "┌───┬───┬───┬───┬───┬───┬───┐\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m 
.\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "└───┴───┴───┴───┴───┴───┴───┘\n", "Unsloth: Will smartly offload gradients to save VRAM!\n" ] }, { "data": { "text/html": [ "\n", "
\n", " [ 15/600 1:09:12 < 51:54:20, 0.00 it/s, Epoch 0.01/1]\n", "
| Step | Training Loss | reward | reward_std | completions / mean_length | completions / min_length | completions / max_length | completions / clipped_ratio | completions / mean_terminated_length | completions / min_terminated_length | completions / max_terminated_length | kl | rewards / function_works / mean | rewards / function_works / std | rewards / no_cheating / mean | rewards / no_cheating / std | rewards / strategy_succeeds / mean | rewards / strategy_succeeds / std |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.000000 | 4.000000 | 0.000000 | 312.000000 | 278.000000 | 346.000000 | 0.000000 | 312.000000 | 278.000000 | 346.000000 | 0.000103 | 1.000000 | 0.000000 | 1.000000 | 0.000000 | 2.000000 | 0.000000 |
| 2 | 0.000000 | 4.000000 | 0.000000 | 340.000000 | 338.000000 | 342.000000 | 0.000000 | 340.000000 | 338.000000 | 342.000000 | 0.000055 | 1.000000 | 0.000000 | 1.000000 | 0.000000 | 2.000000 | 0.000000 |
| 3 | 0.000000 | -1.250000 | 2.474874 | 557.000000 | 528.000000 | 586.000000 | 0.500000 | 528.000000 | 528.000000 | 528.000000 | 0.000063 | -1.250000 | 1.060660 | 0.000000 | 1.414214 | 0.000000 | 0.000000 |
| 4 | 0.000000 | 4.000000 | 0.000000 | 82.000000 | 52.000000 | 112.000000 | 0.000000 | 82.000000 | 52.000000 | 112.000000 | 0.000175 | 1.000000 | 0.000000 | 1.000000 | 0.000000 | 2.000000 | 0.000000 |
| 5 | 0.000000 | 4.000000 | 0.000000 | 355.500000 | 244.000000 | 467.000000 | 0.000000 | 355.500000 | 244.000000 | 467.000000 | 0.001542 | 1.000000 | 0.000000 | 1.000000 | 0.000000 | 2.000000 | 0.000000 |
| 6 | 0.000000 | 4.000000 | 0.000000 | 278.000000 | 142.000000 | 414.000000 | 0.000000 | 278.000000 | 142.000000 | 414.000000 | 0.008132 | 1.000000 | 0.000000 | 1.000000 | 0.000000 | 2.000000 | 0.000000 |
| 7 | 0.000000 | 4.000000 | 0.000000 | 370.000000 | 183.000000 | 557.000000 | 0.000000 | 370.000000 | 183.000000 | 557.000000 | 0.007350 | 1.000000 | 0.000000 | 1.000000 | 0.000000 | 2.000000 | 0.000000 |
| 8 | 0.000000 | 4.000000 | 0.000000 | 317.500000 | 190.000000 | 445.000000 | 0.000000 | 317.500000 | 190.000000 | 445.000000 | 0.012345 | 1.000000 | 0.000000 | 1.000000 | 0.000000 | 2.000000 | 0.000000 |
| 9 | 0.000000 | 0.500000 | 4.949748 | 321.500000 | 57.000000 | 586.000000 | 0.500000 | 57.000000 | 57.000000 | 57.000000 | 0.018377 | -0.500000 | 2.121320 | 0.000000 | 1.414214 | 1.000000 | 1.414214 |
| 10 | 0.000000 | 0.500000 | 4.949748 | 407.500000 | 229.000000 | 586.000000 | 0.500000 | 229.000000 | 229.000000 | 229.000000 | 0.031208 | -0.500000 | 2.121320 | 0.000000 | 1.414214 | 1.000000 | 1.414214 |
| 11 | 0.000000 | 4.000000 | 0.000000 | 225.500000 | 187.000000 | 264.000000 | 0.000000 | 225.500000 | 187.000000 | 264.000000 | 0.037550 | 1.000000 | 0.000000 | 1.000000 | 0.000000 | 2.000000 | 0.000000 |
| 12 | 0.000000 | -3.000000 | 0.000000 | 586.000000 | 586.000000 | 586.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000167 | -2.000000 | 0.000000 | -1.000000 | 0.000000 | 0.000000 | 0.000000 |
| 13 | 0.000100 | -2.000000 | 1.414214 | 491.000000 | 396.000000 | 586.000000 | 0.500000 | 396.000000 | 396.000000 | 396.000000 | 0.129521 | -0.500000 | 2.121320 | 0.000000 | 1.414214 | -1.500000 | 2.121320 |

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Steps = 1 If Done = False\n", "def strategy(board):\n", " # Count empty cells for each move\n", " def score_for(move):\n", " temp = [row[:] for row in board]\n", " # simulate move\n", " for i in range(4):\n", " line = [temp[j][i] for j in range(4)] if move==3 else [temp[i][j] for j in range(4)]\n", " # shift and merge\n", " new_line = []\n", " merged = False\n", " for val in (line if move in (0,2) else reversed(line)):\n", " if val == 0: continue\n", " if new_line and new_line[-1] == val and not merged:\n", " new_line[-1] *= 2\n", " merged = True\n", " else:\n", " new_line.append(val)\n", " # fill rest with zeros\n", " new_line += [0]*(4-len(new_line))\n", " if move==3:\n", " for j in range(4): temp[j][i] = new_line[j]\n", " else:\n", " for j in range(4): temp[i][j] = new_line[j]\n", " return sum(new_line) # simple heuristic\n", " best = -1; best_move=None\n", " for m in range(4):\n", " if score_for(m) > best:\n", " best = score_for(m); best_move=str(m)\n", " return best_move\n", "┌───┬───┬───┬───┬───┬───┬───┐\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m 
.\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "└───┴───┴───┴───┴───┴───┴───┘\n", "Steps = 9 If Done = False\n", "def strategy(board):\n", " # Assign scores to each move based on total number of merges and minimal tile movement\n", " scores = {}\n", " dirs = [(0,1),(0,-1),(1,0),(-1,0)] # right, left, down, up\n", " # evaluate each direction\n", " for i, (dx, dy) in enumerate(dirs):\n", " tmp = [row[:] for row in board] # copy\n", " for y in range(len(tmp)):\n", " line = tmp[y] if dx==0 else [tmp[x][y] for x in range(len(tmp))]\n", " # slide and combine\n", " new_line = [v for v in line if v!=0]\n", " for k in 
range(len(new_line)-1):\n", " if new_line[k]==new_line[k+1]:\n", " new_line[k]*=2\n", " new_line.pop(k+1)\n", " new_line+= [0]*(len(line)-len(new_line))\n", " # place back\n", " if dx==0:\n", " tmp[y] = new_line\n", " else:\n", " for x in range(len(tmp)):\n", " tmp[x][y] = new_line[x]\n", " # score by number of zero tiles (more space)\n", " scores[i] = sum(v==0 for row in tmp for v in row)\n", " # choose move with most empty tiles\n", " best = max(scores, key=scores.get)\n", " return str(best)\n", "┌───┬───┬───┬───┬───┬───┬───┐\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m 
.\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "└───┴───┴───┴───┴───┴───┴───┘\n", "def strategy(board):\n", " def move(board, dir):\n", " size = len(board)\n", " def compress(line):\n", " nonlocal line\n", " line = [x for x in line if x]\n", " for i in range(len(line)-1):\n", " if line[i]==line[i+1]:\n", " line[i]*=2\n", " line[i+1]=0\n", " line=[x for x in line if x]\n", " return line+[0]*(size-len(line))\n", " if dir==0: # up\n", " res=[[0]*size for _ in range(size)]\n", " for c in range(size):\n", " col=[board[r][c] for r in range(size)]\n", " col=compress(col)\n", " for r in range(size):\n", " res[r][c]=col[r]\n", " return res\n", " if dir==1: # down\n", " res=[[0]*size for _ in range(size)]\n", " for c in range(size):\n", " col=[board[r][c] for r in range(size)][::-1]\n", " col=compress(col)\n", " col=col[::-1]\n", " for r in range(size):\n", " res[r][c]=col[r]\n", " return res\n", " if dir==2: # left\n", " res=[[0]*size for _ in range(size)]\n", " for r in range(size):\n", " line=board[r]\n", " line=compress(line)\n", " res[r]=line\n", " return res\n", " if dir==3: # right\n", " res=[[0]*size for _ in range(size)]\n", " for r in range(size):\n", " line=board[r][::-1]\n", " line=compress(line)\n", " line=line[::-1]\n", " 
res[r]=line\n", " return res\n", " best=None\n", " best_sum=-1\n", " for d in range(4):\n", " new=move(board,d)\n", " if new==board: continue\n", " s=sum(sum(row) for row in new)\n", " if s>best_sum:\n", " best_sum=s; best=str(d)\n", " return best if best is not None else \"0\"\n", "Steps = 9 If Done = False\n", "def strategy(board):\n", " import random\n", " best = None\n", " best_score = -1\n", " for move in '0123':\n", " b = board\n", " # simulate move by simple shift (not full 2048 logic)\n", " # This is a placeholder: choose random valid move\n", " if random.random() < 0.5:\n", " return move\n", " return best if best else \"0\"\n", "┌───┬───┬───┬───┬───┬───┬───┐\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", 
"├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "└───┴───┴───┴───┴───┴───┴───┘\n", "Steps = 9 If Done = False\n", "def strategy(board):\n", " # Very simple strategy: always return the first possible direction (0).\n", " return \"0\"\n", "┌───┬───┬───┬───┬───┬───┬───┐\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m 
.\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "└───┴───┴───┴───┴───┴───┴───┘\n", "Steps = 9 If Done = False\n", "def strategy(board):\n", " #\n", " # Try to reduce the number of empty tiles by making a merge first.\n", " #\n", " def can_merge(up, down):\n", " return any((r, c) for r in range(4) for c in range(4)\n", " if board[r][c] == 0 and board[down(r)][down(c)] == r and board[up(r)][up(c)] == r)\n", " #\n", " # Prefer to move towards the corner that is most populated.\n", " #\n", " if any(board[0][c] == 0 for c in range(4)):\n", " return \"0\" # up\n", " if 
any(board[3][c] == 0 for c in range(4)):\n", " return \"1\" # down\n", " if any(board[r][0] == 0 for r in range(4)):\n", " return \"2\" # left\n", " return \"3\" # right\n", "┌───┬───┬───┬───┬───┬───┬───┐\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m 
.\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "└───┴───┴───┴───┴───┴───┴───┘\n", "Steps = 1 If Done = False\n", "def strategy(board):\n", " # Simple heuristic: choose the move that results in the highest number of empty cells after a move\n", " moves = []\n", " for move in range(4):\n", " new_board = [row[:] for row in board]\n", " # simulate move\n", " def compress_and_merge(line):\n", " filtered = [x for x in line if x != 0]\n", " merged = []\n", " skip = False\n", " for i, val in enumerate(filtered):\n", " if skip:\n", " skip = False\n", " continue\n", " if i + 1 < len(filtered) and filtered[i] == filtered[i+1]:\n", " merged.append(val * 2)\n", " skip = True\n", " else:\n", " merged.append(val)\n", " merged += [0] * (len(line) - len(merged))\n", " return merged\n", " if move == 0: # left\n", " for r in range(4):\n", " new_board[r] = compress_and_merge(new_board[r])\n", " elif move == 1: # right\n", " for r in range(4):\n", " new_board[r] = list(reversed(compress_and_merge(list(reversed(new_board[r])))))\n", " elif move == 2: # up\n", " for c in range(4):\n", " col = [new_board[r][c] for r in range(4)]\n", " merged = compress_and_merge(col)\n", " for r in range(4):\n", " new_board[r][c] = merged[r]\n", " elif move == 3: # down\n", " for c in range(4):\n", " col = [new_board[r][c] for r in range(4)]\n", " merged = list(reversed(compress_and_merge(list(reversed(col)))))\n", " for r in range(4):\n", " new_board[r][c] = merged[r]\n", " moves.append((sum(row.count(0) for row in new_board), move))\n", " return str(max(moves)[1])\n", 
"┌───┬───┬───┬───┬───┬───┬───┐\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", 
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "└───┴───┴───┴───┴───┴───┴───┘\n", "def strategy(board):\n", " # compute all possible moves and pick one with max score (simple heuristic)\n", " best, best_val = None, -1\n", " for move in map(str, range(4)):\n", " # simulate move\n", " new_board = [row[:] for row in board]\n", " # apply move logic (omitted for brevity)\n", " # evaluate board\n", " val = sum(sum(row) for row in new_board) # placeholder\n", " if val > best_val:\n", " best_val, best = val, move\n", " return best\n", "Steps = 9 If Done = False\n", "┌───┬───┬───┬───┬───┬───┬───┐\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m 
.\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "└───┴───┴───┴───┴───┴───┴───┘\n", "Steps = 9 If Done = False\n", "def strategy(board):\n", " # simple heuristic: choose first direction that merges a pair\n", " for d in range(4):\n", " visited = set()\n", " for i in range(4):\n", " for j in range(4):\n", " if board[i][j] == 0:\n", " continue\n", " ni, nj = i, j\n", " if d == 0: # up\n", " ni -= 1\n", " elif d == 1: # down\n", " ni += 1\n", " elif d == 2: # left\n", " nj -= 1\n", " else: # right\n", " nj += 1\n", " if 0 <= ni < 4 and 0 <= nj < 4:\n", " if board[ni][nj] == board[i][j]:\n", " return str(d)\n", " return \"0\"\n", "┌───┬───┬───┬───┬───┬───┬───┐\n", "│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", 
"├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "└───┴───┴───┴───┴───┴───┴───┘\n", "Steps = 9 If Done = False\n", "def 
strategy(board):\n", " # look for the best next move using a simple heuristic\n", " max_val = -1\n", " move = \"0\"\n", " for m in (\"0\",\"1\",\"2\",\"3\"):\n", " # copy board\n", " import copy\n", " b = copy.deepcopy(board)\n", " # simulate move\n", " r=False\n", " if m==\"0\": # up\n", " for c in range(len(b)):\n", " col=[b[r][c] for r in range(len(b))]\n", " col=[n for n in col if n]\n", " i=0\n", " while i< len(col)-1:\n", " if col[i]==col[i+1]:\n", " col[i]*=2; del col[i+1]; i+=1\n", " i+=1\n", " for r in range(len(b)):\n", " b[r][c]=col[r] if r0:\n", " if col[i]==col[i-1]:\n", " col[i]*=2; del col[i-1]; i-=1\n", " i-=1\n", " for r in range(len(b)):\n", " b[r][c]=col[len(col)-1-r] if r0:\n", " if row[i]==row[i-1]:\n", " row[i]*=2; del row[i-1]; i-=1\n", " i-=1\n", " b[r]=[0]*(len(b)-len(row))+row\n", " # evaluate\n", " val=sum(max(row) for row in b)\n", " if val>max_val:\n", " max_val=val; move=m\n", " return move\n", "┌───┬───┬───┬───┬───┬───┬───┐\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", 
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "├───┼───┼───┼───┼───┼───┼───┤\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;33m 1\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n", "└───┴───┴───┴───┴───┴───┴───┘\n", "Steps = 0 If Done = False\n", "def strategy(board):\n", " # Count tiles in entire board\n", " total = sum(sum(row) for row in board)\n", " if total == 0: # no tiles\n", " return \"0\"\n", " # Heuristic: prefer moving up if average value of upper row > lower row\n", " upper = sum(board[0])\n", " lower = sum(board[-1])\n", " left = sum(row[0] for row in board)\n", " right = sum(row[-1] for row in board)\n", " moves = [(upper - lower, \"0\"), (right - left, \"1\")], \n", " # pick the move with biggest difference (push bigger numbers up or right)\n", " best_move = max(moves)[1]\n", " return best_move\n", "┌───┬───┬───┬───┬───┬───┬───┐\n", "│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m 
.\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 
1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β””β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”˜\n", "Steps = 9 If Done = False\n", "def strategy(board):\n", " # Possible moves: 0=up, 1=right, 2=down, 3=left\n", " best_score = -1\n", " best_move = 0\n", " dirs = [(0, -1), (1, 0), (0, 1), (-1, 0)] # mapping: up, right, down, left\n", " for move, (dx, dy) in enumerate(dirs):\n", " new_board = [row[:] for row in board]\n", " moved = False\n", " for x in range(4):\n", " for y in range(4):\n", " if dx != 0:\n", " nx, ny = x + dx, y\n", " else:\n", " nx, ny = x, y + dy\n", " if 0 <= nx < 4 and 0 <= ny < 4:\n", " if board[x][y] != 0 and new_board[nx][ny] == 0:\n", " new_board[nx][ny] = board[x][y]\n", " new_board[x][y] = 0\n", " moved = True\n", " # Merge\n", " if dx != 0:\n", " if 0 <= nx-1 < 4 and new_board[nx-1][ny] == new_board[nx][ny] != 0:\n", " new_board[nx-1][ny] *= 2\n", " new_board[nx][ny] = 0\n", " else:\n", " if 0 <= ny-1 < 4 and new_board[nx][ny-1] == new_board[nx][ny] != 0:\n", " new_board[nx][ny-1] *= 2\n", " new_board[nx][ny] = 0\n", " if not moved:\n", " continue\n", " score = sum(new_board[x][y] for x in range(4) for y in range(4))\n", " if score > best_score:\n", " best_score = score\n", " best_move = move\n", " return str(best_move)\n", "β”Œβ”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", 
"β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β””β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”˜\n", "def strategy(board):\n", " # simple heuristic: try to merge the first pair of equal tiles from left to right\n", " for i in range(len(board)):\n", " for j in range(len(board[i])-1):\n", " if board[i][j] == board[i][j+1] and board[i][j] != 0:\n", " return str(j) # direction: 0-left, 1-right, 2-up, 3-down\n", " # if no merges, slide to fill empty 
spot on the left\n", " for i in range(len(board)):\n", " for j in range(len(board[i])):\n", " if board[i][j] == 0:\n", " return \"0\"\n", " return \"0\"\n", "Steps = 9 If Done = False\n", "β”Œβ”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m 
.\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β””β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”˜\n", "Steps = 9 If Done = False\n", "def strategy(board):\n", " return \"0\"\n", "β”Œβ”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m 
.\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β””β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”˜\n", "Steps = 9 If Done = False\n", "def strategy(board):\n", " # board is a list of 4 lists, each containing 4 integers (0 for empty)\n", " from collections import Counter\n", " # Count numbers of each value\n", " counts = Counter([num for row in board for num in row if num != 0])\n", " # Prefer moving toward the rightmost or downwards if a move will combine\n", " # First, try to combine pairs by moving left\n", " for i in range(4):\n", " for j in range(1,4):\n", " if board[i][j] == board[i][j-1] and board[i][j] != 0:\n", " return \"3\" # Move up to combine\n", " # If no direct combine, move right if possible\n", " for i in range(4):\n", " for j in range(3):\n", " if board[i][j] == 0:\n", " return \"2\" # Move right\n", " # If no empty, move down\n", " return \"1\"\n", "β”Œβ”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m 
.\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", 
"β””β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”˜\n", "def strategy(board):\n", " # Try to move left if possible, else right, up, down\n", " def can(move):\n", " for i,row in enumerate(board):\n", " if move == \"0\" and i>0 and row[i-1]==0: return True\n", " if move == \"1\" and i<3 and row[i+1]==0: return True\n", " if move == \"2\" and i>0 and board[i-1][i]==0: return True\n", " if move == \"3\" and i<3 and board[i+1][i]==0: return True\n", " return False\n", "\n", " for m in [\"0\",\"1\",\"2\",\"3\"]:\n", " if can(m):\n", " return m\n", " return \"0\"\n", "Steps = 9 If Done = False\n", "β”Œβ”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", 
"β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β””β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”˜\n", "Steps = 9 If Done = False\n", "def strategy(board):\n", " # Simple heuristic: move a tile towards the nearest zero cell.\n", " n = len(board)\n", " for i in range(n):\n", " for j in range(n):\n", " if board[i][j] != 0:\n", " # try to move right if possible\n", " if j+1 < n and board[i][j+1] == 0:\n", " return \"1\" # move right\n", " # try upwards\n", " if i-1 >= 0 and board[i-1][j] == 0:\n", " return \"0\" # move up\n", " # try left\n", " if j-1 >= 0 and board[i][j-1] == 0:\n", " return \"3\" # move left\n", " # try downwards\n", " if i+1 < n and board[i+1][j] == 0:\n", " return \"2\" # move down\n", " return \"0\"\n", "β”Œβ”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", 
"β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β””β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”˜\n", "Exception = list index out of 
range\n", "None\n", "Steps = 9 If Done = False\n", "def strategy(board):\n", " \"\"\"\n", " Determine the best move (\"0\": Up, \"1\": Right, \"2\": Down, \"3\": Left)\n", " for a 2048 board represented by a list of lists.\n", " This implementation uses a simple heuristic: count the number of\n", " empty cells after each potential move and choose the move that\n", " maximizes this count. It does not simulate future moves.\n", " \"\"\"\n", " # Directions: 0=Up,1=Right,2=Down,3=Left\n", " dirs = [( -1, 0), ( 0, 1), ( 1, 0), ( 0, -1)]\n", " best_move = None\n", " best_empty = -1\n", "\n", " n = len(board)\n", " for move, (dx, dy) in enumerate(dirs):\n", " new_board = [row[:] for row in board] # copy\n", " changed = False\n", " for i in range(n):\n", " for j in range(n):\n", " x, y = i, j\n", " # Move the tile in the chosen direction\n", " while True:\n", " nx, ny = x + dx, y + dy\n", " if 0 <= nx < n and 0 <= ny < n and new_board[nx][ny] == 0:\n", " # Merge if possible\n", " if new_board[x][y] != 0 and new_board[nx][ny] == 0:\n", " new_board[nx][ny] = new_board[x][y]\n", " new_board[x][y] = 0\n", " changed = True\n", " x, y = nx, ny\n", " else:\n", " break\n", " # If tile could not move, skip\n", " # Count empty cells in the resulting board\n", " empty = sum(row.count(0) for row in new_board)\n", " if empty > best_empty:\n", " best_empty = empty\n", " best_move = str(move)\n", " return best_move\n", "β”Œβ”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m 
.\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β””β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”˜\n", "Steps = 2 If Done = False\n", "def strategy(board):\n", " n = len(board)\n", " score = lambda r,c: board[r][c]\n", " # Count empty cells and total\n", " empties = sum(board[i][j] == 0 for i in range(n) for j in range(n))\n", " # Simple heuristic: move left if highest 
tile on left, else right, else up then down\n", " # Find position of maximum tile\n", " max_val = -1\n", " max_pos = None\n", " for i in range(n):\n", " for j in range(n):\n", " if board[i][j] > max_val:\n", " max_val = board[i][j]\n", " max_pos = (i, j)\n", " # Prefer moving towards the edge with max tile\n", " i, j = max_pos\n", " # Prioritize directions that keep max tile towards edge\n", " if j == 0: return \"0\" # left\n", " if j == n-1: return \"1\" # right\n", " if i == 0: return \"2\" # up\n", " return \"3\" # down\n", "β”Œβ”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m 
.\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β”œβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”Όβ”€β”€β”€β”€\n", "β”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;33m 1\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\u001b[38;5;239m .\u001b[0mβ”‚\n", "β””β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”˜\n" ] } ], "source": [ "trainer.train()" ] }, { "cell_type": "markdown", "metadata": { "id": "tlaUdxC_VHpz" }, "source": [ "## Testing the Trained Model\n", "\n", "Let's generate a strategy from our RL-trained model and see how it differs from the base model:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "TwZygRdWf8ab" }, "outputs": [], "source": [ "text = tokenizer.apply_chat_template(\n", " [{\"role\": \"user\", \"content\": prompt}],\n", " tokenize=False,\n", " add_generation_prompt=True,\n", " reasoning_effort=\"low\",\n", ")\n", "\n", "from transformers import TextStreamer\n", "\n", "_ = model.generate(\n", " **tokenizer(text, return_tensors=\"pt\").to(\"cuda\"),\n", " temperature=1.0,\n", " max_new_tokens=1024,\n", " streamer=TextStreamer(tokenizer, skip_prompt=False),\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "-NUEmHFSYNTp" }, "source": [ "## Saving the Fine-tuned Model\n", "\n", "You can save the trained model in different formats:\n", "\n", "- **MXFP4**: OpenAI gpt-oss's native 4-bit precision format\n", "- **float16**: Standard half-precision for broader compatibility\n", "\n", 
"To push to Hugging Face Hub, you'll need a token from https://huggingface.co/settings/tokens:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "NjXGTkp7YNtB" }, "outputs": [], "source": [ "# Merge and push to hub in mxfp4 4bit format\n", "if False:\n", " model.save_pretrained_merged(\"finetuned_model\", tokenizer, save_method=\"mxfp4\")\n", "if False:\n", " model.push_to_hub_merged(\"repo_id/repo_name\", tokenizer, token=\"hf...\", save_method=\"mxfp4\")\n", "\n", "# Merge and push to hub in 16bit\n", "if False:\n", " model.save_pretrained_merged(\"finetuned_model\", tokenizer, save_method=\"merged_16bit\")\n", "if False: # Pushing to HF Hub\n", " model.push_to_hub_merged(\"hf/gpt-oss-finetune\", tokenizer, save_method=\"merged_16bit\", token=\"\")" ] }, { "cell_type": "markdown", "metadata": { "id": "V15Yhj1V9lwG" }, "source": [ "## Conclusion\n", "\n", "Congratulations! You've learned how to apply reinforcement learning to teach an LLM to generate game-playing code. The key concepts covered:\n", "\n", "1. **OpenEnv** for standardized access to RL environments\n", "2. **LoRA** for memory-efficient fine-tuning\n", "3. **Sandboxed execution** to prevent reward hacking\n", "4. **Multi-objective reward functions** that balance validity, safety, and performance\n", "5. 
**GRPO** for policy optimization without a value network\n", "\n", "This pattern extends beyond 2048. You can adapt it to any task where model outputs can be programmatically evaluated: code synthesis, mathematical proofs, API usage, and more.\n", "\n", "### Further Resources\n", "\n", "- [OpenAI gpt-oss-20b Model Card](https://huggingface.co/openai/gpt-oss-20b)\n", "- [OpenEnv Documentation](https://github.com/meta-pytorch/OpenEnv)\n", "- [TRL GRPO Trainer](https://huggingface.co/docs/trl/main/en/grpo_trainer)\n", "- [Unsloth RL Guide](https://docs.unsloth.ai/get-started/reinforcement-learning-rl-guide)\n", "\n", "---\n", "\n", "*This notebook uses [Unsloth](https://github.com/unslothai/unsloth) for memory-efficient training.*\n", "\n", "**License:** Apache 2.0" ] }, { "cell_type": "markdown", "metadata": { "id": "KMwNkyqlB4Ae" }, "source": [] } ], "metadata": { "accelerator": "GPU", "colab": { "gpuType": "T4", "provenance": [] }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "name": "python" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "004f28173c3b4fb6a9f8c2068f5db81f": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "02c88a690a384ae183c233b6927aaf57": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, 
"050567dccb47456aaac65d118ac60a6b": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_ea2f3ca562444c46b50596a1d7cf9030", "IPY_MODEL_0b7287d482cc44dbb406d71f23f1aea0", "IPY_MODEL_474c7fe6ef4b430d9826171eded2ebf1" ], "layout": "IPY_MODEL_12a877304ddf45e49bbdfe056394c3d6" } }, "06e420cfa2974f7d8d7ec4b83f064a6e": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "072060f8bdb54a15baf838f67d376d99": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": 
"@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_29d98af49dd3412f84f5843b937029d1", "max": 3372033380, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_93176d6b63284b4e832e0f028be90655", "value": 3372033380 } }, "08d8f0cfd7614900a9c9bba888619749": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "0b7287d482cc44dbb406d71f23f1aea0": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_d541fd264d7e4b2691e8efbd993e6ae7", "max": 446, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_cbc7bbfe983f4e6696f8dd6b37c50543", "value": 446 } }, "0b851acfd32047bfb6bae17d43ccfcb1": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", 
"_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_667192c07a7340a9a72ed25648a0be64", "max": 3996690997, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_aa714b1f70e8495b9299b51d6ac4c3c4", "value": 3996690997 } }, "0b8fa4ff186a4bfeac18cf6d676e99df": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "0e49035e0c3a4ee4ab477b475e74ef36": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, 
"grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "0f09a00375bd4d4c8863fc8fb7d64d61": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "1055200185004ea2a95a05eb51232501": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": 
"LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "12a877304ddf45e49bbdfe056394c3d6": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, 
"visibility": null, "width": null } }, "18d4cc3fd8ee4e3fbf27c574fd467f20": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "1b26eac03fed4c8784bd611474cf4607": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_b72fc22310714bf0bf6ff4021db5aba8", "placeholder": "​", "style": "IPY_MODEL_d2a7fa9dddc240e29330297870159c59", "value": "Loading checkpoint shards: 100%" } }, "1b84d7dc9567474c8587432d48342b71": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_665f7b7e5496432985ed9f49829f5834", "placeholder": "​", "style": "IPY_MODEL_d16dc921c6554bc6b77abcb423721cc8", "value": " 1.19M/? 
[00:00<00:00, 76.2MB/s]" } }, "1daa373c890b4ae0a9cf6a3ec325693c": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "1ece5fe9597a4e3db2dd96c38995705c": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "212c17e3829c4accb30265a3d9ee73dc": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": 
null, "visibility": null, "width": null } }, "2612008cf36949d9a6618a01ac817618": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "28300c16023d4ad9a59784baea2f57aa": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, 
"justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": "20px" } }, "297bdff1add5414893319b185cb15da6": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_362947c7e102474cbf560f51e713bdff", "IPY_MODEL_7f63ff3e87aa4b86a4f8e64785d1d34c", "IPY_MODEL_cca7d922b85449a1b0c5c025b65fba10" ], "layout": "IPY_MODEL_5520c59f24cd414092bf5f952425611d" } }, "29d98af49dd3412f84f5843b937029d1": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, 
"overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "2d91f25a8d0b4dd39dc320b2cf17fc0b": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_88aa58551344410a91b07af480d6ab53", "max": 1, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_4f38070c09354406928c5e7be6cff3fd", "value": 1 } }, "2f57c8e713b94c8692837b4e17c9e983": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "31afdb187d2a44d8b3a101fa18543c13": { "model_module": 
"@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "3371a1da5d0e426bb6cc02a6c383dd6f": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "3566e9058ebc45498b42e521f2314365": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "362947c7e102474cbf560f51e713bdff": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_b1dcd2f908344949af01dab5a244e458", "placeholder": "​", "style": "IPY_MODEL_9e6f46ddb61943f4a168d70a209a7ddc", "value": "model-00004-of-00004.safetensors: 100%" } }, "383e9b4b74e34cae96c1f46f41591b82": { "model_module": "@jupyter-widgets/base", 
"model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "3c2e00b9d20a4c9ea5b850b776752fd2": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_7c29e5f727ec4b608c06d0487a2c9e53", "IPY_MODEL_4beae5415d0647e2898a83313a08ea94", "IPY_MODEL_62cd4ccc430549e7a2c156222d47ebeb" ], "layout": "IPY_MODEL_4691273248984fdea29d83ec0a246cd9" } }, "3e3ce50c437a412f9cc2bedb697648f7": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, 
"_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "3f589cf3cb804ed29ba322e4fa10c511": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "419e4369b36644d3abec159511a88ad8": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_eee0d2b6eb6047d8a4001f4999dcdc35", "IPY_MODEL_2d91f25a8d0b4dd39dc320b2cf17fc0b", "IPY_MODEL_1b84d7dc9567474c8587432d48342b71" ], "layout": "IPY_MODEL_f34cadfaf9bb46729aeeff9492ba9026" } }, "41ee8407344d402997b0574e0ae26c77": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "4691273248984fdea29d83ec0a246cd9": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", 
"_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "46cb3593b37c4cb9b4bac422bc5809c8": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "474c7fe6ef4b430d9826171eded2ebf1": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_f290b17e3cd9487b9b97d54d6cec9efc", "placeholder": "​", "style": "IPY_MODEL_18d4cc3fd8ee4e3fbf27c574fd467f20", "value": " 446/446 [00:00<00:00, 50.9kB/s]" } }, "48f65e25acd84b0cb582de66753215da": { "model_module": "@jupyter-widgets/controls", 
"model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_383e9b4b74e34cae96c1f46f41591b82", "max": 165, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_46cb3593b37c4cb9b4bac422bc5809c8", "value": 165 } }, "4ba91b8d008e483d89f16f204326f6b6": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "4bc16bd2399a43fe85967131af7f846a": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": 
"DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "4beae5415d0647e2898a83313a08ea94": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_a6a122064bc340868b5e0e11afa9c42f", "max": 1, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_3371a1da5d0e426bb6cc02a6c383dd6f", "value": 1 } }, "4ca56d8605864a438290152268bfc686": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_2612008cf36949d9a6618a01ac817618", "max": 3998751275, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_3e3ce50c437a412f9cc2bedb697648f7", "value": 3998751275 } }, "4dc823fcd0dd4eaaa2e8aaff0daa9ad1": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ 
"IPY_MODEL_1b26eac03fed4c8784bd611474cf4607", "IPY_MODEL_a24e2cfbef794d35a0e22753352caa15", "IPY_MODEL_8368e4420e814c6f9be30994b69c66ee" ], "layout": "IPY_MODEL_eb8c197ce1fc41f78531e5e73ae14a89" } }, "4e63263fc27b4f07b4f6abffba082379": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_eb13ad96565a44519e7cab9ce9483b90", "IPY_MODEL_0b851acfd32047bfb6bae17d43ccfcb1", "IPY_MODEL_d4c72002b5fe44d3ad7ae67f5536c889" ], "layout": "IPY_MODEL_a1e042e5b8ad4b028cc28ac54924207d" } }, "4f29591af0d64d398515479b032a1b3d": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "4f38070c09354406928c5e7be6cff3fd": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "52e9a0ea76df4fd8822ccadadaadb501": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": 
"DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "5371b2fad7b04f97bd8f4671d844d2cb": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_9c2371c32afe46519ee53427ea42bc9a", "placeholder": "​", "style": "IPY_MODEL_f9d4672fb86b4b4e9c3e9068dc479e5b", "value": " 3.37G/3.37G [00:20<00:00, 60.9MB/s]" } }, "5483233a9b224f3c8eb9337e4ed82314": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "5520c59f24cd414092bf5f952425611d": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, 
"grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "56b30cc150924879abfc138427f4ca98": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "5d1e2fdbf7a2409abbafb63e4b160668": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ 
"IPY_MODEL_a4c19dc98fe943e09a26a60d23c8ff01", "IPY_MODEL_8f5799610318490492ff5aba76be3d1a", "IPY_MODEL_e78534b135d2465c83e3be614b71c8a4" ], "layout": "IPY_MODEL_2f57c8e713b94c8692837b4e17c9e983" } }, "62cd4ccc430549e7a2c156222d47ebeb": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_bedc42e908454f1494768194b43b2964", "placeholder": "​", "style": "IPY_MODEL_41ee8407344d402997b0574e0ae26c77", "value": " 22.8k/? [00:00<00:00, 1.59MB/s]" } }, "665f7b7e5496432985ed9f49829f5834": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, 
"667192c07a7340a9a72ed25648a0be64": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "6af6de3285684e76b320571b44af5fc1": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_f7cc1614e13d4d22b83f787b20407a5f", "IPY_MODEL_072060f8bdb54a15baf838f67d376d99", "IPY_MODEL_5371b2fad7b04f97bd8f4671d844d2cb" ], "layout": "IPY_MODEL_b10d25fb43eb42198abf71c4e326bcff" } }, "6d77c8dcb28240f7b860571d11f8b9af": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": 
"1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "754f2452fbe14c7098215ec810ffbf14": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "7c29e5f727ec4b608c06d0487a2c9e53": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_ae3819cd042f43babef24199081bb97f", "placeholder": "​", "style": "IPY_MODEL_52e9a0ea76df4fd8822ccadadaadb501", "value": "tokenizer_config.json: " } }, 
"7c52841c7e714173bfd526b0a625bc9d": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_ae9728c04b974af29460ce5179a9edba", "placeholder": "​", "style": "IPY_MODEL_754f2452fbe14c7098215ec810ffbf14", "value": " 165/165 [00:00<00:00, 16.4kB/s]" } }, "7c7d40163ecc4dae8a2c54af21de4661": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "7ce0f239a8514923b383f738bc0c9899": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, 
"min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "7f63ff3e87aa4b86a4f8e64785d1d34c": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_ae79b9a378ee4218beddf77ac9af6de7", "max": 1158267008, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_e6d63cf58647443b9749195ec0579d87", "value": 1158267008 } }, "8368e4420e814c6f9be30994b69c66ee": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_d1cea390ecaa4583a698278fa3a438c9", "placeholder": "​", "style": "IPY_MODEL_5483233a9b224f3c8eb9337e4ed82314", "value": " 4/4 [00:56<00:00, 12.02s/it]" } }, "8708226010324a83b4f7900c3958d430": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": 
null, "layout": "IPY_MODEL_4ba91b8d008e483d89f16f204326f6b6", "placeholder": "​", "style": "IPY_MODEL_31afdb187d2a44d8b3a101fa18543c13", "value": "chat_template.jinja: " } }, "88aa58551344410a91b07af480d6ab53": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": "20px" } }, "89e3ad0d517f44d084df0d8a3ed40703": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, 
"grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "8f5799610318490492ff5aba76be3d1a": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_1055200185004ea2a95a05eb51232501", "max": 27868174, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_eb1068efb4364ee2893c4d21c58f38db", "value": 27868174 } }, "93176d6b63284b4e832e0f028be90655": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "941d8cbf8188402eb603c18b6e979035": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": 
"HBoxView", "box_style": "", "children": [ "IPY_MODEL_8708226010324a83b4f7900c3958d430", "IPY_MODEL_fc12c97d659640b3b2b2b48ad7f17e5b", "IPY_MODEL_aa22cb2d1dcf4001bd3720b075318906" ], "layout": "IPY_MODEL_bb358c5523814376a2d4690f88f20b74" } }, "9c2371c32afe46519ee53427ea42bc9a": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "9e6f46ddb61943f4a168d70a209a7ddc": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "a1e042e5b8ad4b028cc28ac54924207d": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": 
"@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "a24e2cfbef794d35a0e22753352caa15": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_0e49035e0c3a4ee4ab477b475e74ef36", "max": 4, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_08d8f0cfd7614900a9c9bba888619749", "value": 4 } }, "a4c19dc98fe943e09a26a60d23c8ff01": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", 
"_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_f8d2164046fb46a298e2d80628808cb3", "placeholder": "​", "style": "IPY_MODEL_af23432e17664cd6852496e43f9de0cf", "value": "tokenizer.json: 100%" } }, "a6a122064bc340868b5e0e11afa9c42f": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": "20px" } }, "aa22cb2d1dcf4001bd3720b075318906": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_0b8fa4ff186a4bfeac18cf6d676e99df", "placeholder": "​", "style": 
"IPY_MODEL_ccb581e230604bb690015eb685e4b8e1", "value": " 15.1k/? [00:00<00:00, 1.44MB/s]" } }, "aa714b1f70e8495b9299b51d6ac4c3c4": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "ae3819cd042f43babef24199081bb97f": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "ae79b9a378ee4218beddf77ac9af6de7": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": 
"@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "ae9728c04b974af29460ce5179a9edba": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, 
"af23432e17664cd6852496e43f9de0cf": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "b09745899e1446129f397d822a21fc99": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "b10d25fb43eb42198abf71c4e326bcff": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, 
"width": null } }, "b1dcd2f908344949af01dab5a244e458": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "b72fc22310714bf0bf6ff4021db5aba8": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": 
null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "baf3db00d28f4c849bb6a6739e908c62": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_f4f2fb240a12406ab5e709be33b93683", "IPY_MODEL_4ca56d8605864a438290152268bfc686", "IPY_MODEL_de306a50594f463ba9c78c713bc33241" ], "layout": "IPY_MODEL_0f09a00375bd4d4c8863fc8fb7d64d61" } }, "bb358c5523814376a2d4690f88f20b74": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, 
"padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "bd8f575034c041be93ff20f63388de2d": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "bedc42e908454f1494768194b43b2964": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": 
null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "c4d592366499414a99f19bce7f0bd665": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "cbc7bbfe983f4e6696f8dd6b37c50543": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "cbed0dab2d7540f697eacec5a33e1061": { 
"model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_dc9b2549ff834880ac19b578adfee5a5", "IPY_MODEL_48f65e25acd84b0cb582de66753215da", "IPY_MODEL_7c52841c7e714173bfd526b0a625bc9d" ], "layout": "IPY_MODEL_bd8f575034c041be93ff20f63388de2d" } }, "cca7d922b85449a1b0c5c025b65fba10": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_e7ca6b7ba2094872a888da5511e2bb49", "placeholder": "​", "style": "IPY_MODEL_7c7d40163ecc4dae8a2c54af21de4661", "value": " 1.16G/1.16G [00:09<00:00, 242MB/s]" } }, "ccb581e230604bb690015eb685e4b8e1": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "d16dc921c6554bc6b77abcb423721cc8": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": 
"@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "d1cea390ecaa4583a698278fa3a438c9": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "d257eaa588bd41fb947f81d306fe05cd": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, 
"grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "d2a7fa9dddc240e29330297870159c59": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "d4c72002b5fe44d3ad7ae67f5536c889": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_06e420cfa2974f7d8d7ec4b83f064a6e", "placeholder": "​", "style": "IPY_MODEL_3566e9058ebc45498b42e521f2314365", "value": " 4.00G/4.00G [00:19<00:00, 279MB/s]" } }, "d541fd264d7e4b2691e8efbd993e6ae7": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": 
null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "d82ab09313cc482f9b9b45192f489825": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "dc9b2549ff834880ac19b578adfee5a5": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_212c17e3829c4accb30265a3d9ee73dc", "placeholder": "​", "style": "IPY_MODEL_1daa373c890b4ae0a9cf6a3ec325693c", "value": "generation_config.json: 100%" } }, "de306a50594f463ba9c78c713bc33241": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, 
"_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_fa5c42218fbb44378bef71551c3383e0", "placeholder": "​", "style": "IPY_MODEL_d82ab09313cc482f9b9b45192f489825", "value": " 4.00G/4.00G [00:25<00:00, 110MB/s]" } }, "e6d63cf58647443b9749195ec0579d87": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "e78534b135d2465c83e3be614b71c8a4": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_c4d592366499414a99f19bce7f0bd665", "placeholder": "​", "style": "IPY_MODEL_1ece5fe9597a4e3db2dd96c38995705c", "value": " 27.9M/27.9M [00:01<00:00, 21.9MB/s]" } }, "e7ca6b7ba2094872a888da5511e2bb49": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, 
"grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "ea2f3ca562444c46b50596a1d7cf9030": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_56b30cc150924879abfc138427f4ca98", "placeholder": "​", "style": "IPY_MODEL_02c88a690a384ae183c233b6927aaf57", "value": "special_tokens_map.json: 100%" } }, "eb1068efb4364ee2893c4d21c58f38db": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "eb13ad96565a44519e7cab9ce9483b90": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", 
"_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_6d77c8dcb28240f7b860571d11f8b9af", "placeholder": "​", "style": "IPY_MODEL_4f29591af0d64d398515479b032a1b3d", "value": "model-00002-of-00004.safetensors: 100%" } }, "eb8c197ce1fc41f78531e5e73ae14a89": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "eee0d2b6eb6047d8a4001f4999dcdc35": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_d257eaa588bd41fb947f81d306fe05cd", "placeholder": "​", "style": 
"IPY_MODEL_b09745899e1446129f397d822a21fc99", "value": "model.safetensors.index.json: " } }, "f290b17e3cd9487b9b97d54d6cec9efc": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "f34cadfaf9bb46729aeeff9492ba9026": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, 
"height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "f4f2fb240a12406ab5e709be33b93683": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_89e3ad0d517f44d084df0d8a3ed40703", "placeholder": "​", "style": "IPY_MODEL_3f589cf3cb804ed29ba322e4fa10c511", "value": "model-00001-of-00004.safetensors: 100%" } }, "f7cc1614e13d4d22b83f787b20407a5f": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_7ce0f239a8514923b383f738bc0c9899", "placeholder": "​", "style": "IPY_MODEL_4bc16bd2399a43fe85967131af7f846a", "value": "model-00003-of-00004.safetensors: 100%" } }, "f8d2164046fb46a298e2d80628808cb3": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": 
"1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "f9d4672fb86b4b4e9c3e9068dc479e5b": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "fa5c42218fbb44378bef71551c3383e0": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, 
"height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "fc12c97d659640b3b2b2b48ad7f17e5b": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_28300c16023d4ad9a59784baea2f57aa", "max": 1, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_004f28173c3b4fb6a9f8c2068f5db81f", "value": 1 } } } } }, "nbformat": 4, "nbformat_minor": 0 }