openenv_hack / freeciv_env.egg-info /PKG-INFO

	Metadata-Version: 2.4
	Name: freeciv-env
	Version: 0.1.0
	Summary: OpenEnv environment for Freeciv via freeciv-bot
	Requires-Python: >=3.11
	Description-Content-Type: text/markdown
	Requires-Dist: openenv-core[core]==0.2.1
	Requires-Dist: freecivbot @ git+https://github.com/chris1869/freeciv-bot.git
	Requires-Dist: uvicorn>=0.35.0
	Provides-Extra: dev
	Requires-Dist: pytest>=8.4.1; extra == "dev"
	Requires-Dist: requests>=2.32.5; extra == "dev"
	Provides-Extra: train
	Requires-Dist: accelerate>=1.10.0; extra == "train"
	Requires-Dist: bitsandbytes>=0.47.0; extra == "train"
	Requires-Dist: datasets>=4.0.0; extra == "train"
	Requires-Dist: trl>=0.24.0; extra == "train"
	Requires-Dist: unsloth>=2026.3.4; extra == "train"

	---
	title: Freeciv Environment Server
	emoji: 🎮
	colorFrom: blue
	colorTo: indigo
	sdk: docker
	pinned: false
	app_port: 8000
	base_path: /web
	tags:
	- openenv
	---

	# freeciv-env

	OpenEnv environment for Freeciv, built on top of `freeciv-bot`.

	## Current scope

	This environment exposes a small, trainable action surface:

	- `end_turn`
	- `move_unit(unit_id, direction)`
	- `build_city(unit_id)`
	- `set_city_production(city_id, target)`
	- `set_research(tech_name)`
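
As a rough illustration of how this action surface could be represented and checked before it is sent to the server, here is a minimal sketch. The `Action` class, its field types (for example, `direction` as an integer), and the `validate` helper are hypothetical and are not taken from the `freeciv_env` package; only the five action names and their arguments come from the list above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from each action name listed above to the
# arguments it needs. The real package may model this differently.
REQUIRED_FIELDS = {
    "end_turn": (),
    "move_unit": ("unit_id", "direction"),
    "build_city": ("unit_id",),
    "set_city_production": ("city_id", "target"),
    "set_research": ("tech_name",),
}


@dataclass(frozen=True)
class Action:
    """Illustrative container for one action (assumed field types)."""

    action_type: str
    unit_id: Optional[int] = None
    city_id: Optional[int] = None
    direction: Optional[int] = None  # assumption: direction is an integer code
    target: Optional[str] = None
    tech_name: Optional[str] = None


def validate(action: Action) -> None:
    """Raise ValueError if the action name is unknown or an argument is missing."""
    if action.action_type not in REQUIRED_FIELDS:
        raise ValueError(f"unknown action type: {action.action_type!r}")
    missing = [
        name
        for name in REQUIRED_FIELDS[action.action_type]
        if getattr(action, name) is None
    ]
    if missing:
        raise ValueError(f"{action.action_type} is missing: {', '.join(missing)}")


validate(Action("move_unit", unit_id=101, direction=3))  # passes silently
```

Keeping the required arguments in one table makes it cheap to reject malformed model output before it reaches the game.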

	Observations are text-first and include compact structured summaries of:

	- current turn
	- score
	- known and visible map tiles
	- units
	- cities
	- legal actions
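
To make "text-first with compact structured summaries" concrete, the sketch below renders such an observation as a short prompt string. `ObservationSummary`, its field names, and `render_text` are assumptions made for illustration; the actual observation schema in this package may differ.

```python
from dataclasses import dataclass, field


@dataclass
class ObservationSummary:
    """Hypothetical summary holding the items listed above."""

    turn: int
    score: float
    known_tiles: int
    visible_tiles: int
    units: list[dict] = field(default_factory=list)
    cities: list[dict] = field(default_factory=list)
    legal_actions: list[str] = field(default_factory=list)


def render_text(obs: ObservationSummary) -> str:
    """Render the summary as text, numbering each legal action."""
    lines = [
        f"Turn {obs.turn} | score {obs.score}",
        f"Tiles: {obs.known_tiles} known, {obs.visible_tiles} visible",
        f"Units: {len(obs.units)} | Cities: {len(obs.cities)}",
        "Legal actions:",
    ]
    lines += [f"  {i}: {name}" for i, name in enumerate(obs.legal_actions)]
    return "\n".join(lines)


example = ObservationSummary(
    turn=1,
    score=0,
    known_tiles=25,
    visible_tiles=9,
    units=[{"id": 101, "type": "Settlers"}],
    legal_actions=["end_turn", "build_city(101)"],
)
print(render_text(example))
```

Numbering the legal actions in the text lets a policy answer with a single index instead of a full action string.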

	## Local development

	Install dependencies:

	```bash
	uv sync --extra dev
	```

	Run tests:

	```bash
	uv run pytest
	```

	Run the server:

	```bash
	uv run uvicorn freeciv_env.server.app:app --host 0.0.0.0 --port 8000
	```

	Run the fast GRPO loop:

	```bash
	uv sync --extra dev --extra train
	uv run python scripts/train_grpo_fast.py --env-url http://127.0.0.1:8000 --max-steps 50
	```

	## Hackathon / Unsloth notes

	For the hackathon Colab submission path on H100s, Unsloth recommended the BF16 OpenEnv gpt-oss 20B notebook:

	- <https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/OpenEnv_gpt_oss_(20B)_Reinforcement_Learning_2048_Game_BF16.ipynb>

	If you adapt that notebook for this environment, reduce `max_steps` to `300` for a faster run.

	Useful notebook indexes:

	- RL notebooks: <https://unsloth.ai/docs/get-started/unsloth-notebooks#grpo-reasoning-rl>
	- all notebooks: <https://unsloth.ai/docs/get-started/unsloth-notebooks>
	- notebook repo: <https://github.com/unslothai/notebooks/tree/main/nb>

	If GRPO is too slow, start from a smaller notebook with `fast_inference = True` and add the Freeciv/OpenEnv calls:

	- <https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen3_(4B)-GRPO.ipynb>
	- <https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Advanced_Llama3_2_(3B)_GRPO_LoRA.ipynb>

	If vLLM GRPO fails, Unsloth suggested a clean virtualenv install:

	```bash
	python -m venv unsloth_env
	source unsloth_env/bin/activate
	pip install --upgrade pip && pip install uv
	uv pip install unsloth vllm --torch-backend=auto
	```

	If Unsloth is already installed, update it for the latest GRPO fixes:

	```bash
	pip install --upgrade --no-cache-dir --no-deps unsloth unsloth_zoo
	```

	## Live runtime requirements

	The default server app uses `freeciv-bot` to connect to a local Freeciv Web runtime.

	Environment variables:

	- `FREECIV_SERVER_URL` (default: `http://127.0.0.1`)
	- `FREECIV_USERNAME` (default: `openenvbot`)
	- `FREECIV_CLIENT_PORT` (default: `6000`)
	- `FREECIV_TURN_TIMEOUT_S` (default: `60`)
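
A minimal sketch of reading these variables with the documented defaults follows. `RuntimeConfig` and `load_config` are hypothetical names, and the numeric types (integer port, float timeout) are assumptions; only the variable names and default values come from the list above.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RuntimeConfig:
    """Illustrative settings object (not the package's own class)."""

    server_url: str
    username: str
    client_port: int
    turn_timeout_s: float


def load_config(env: Optional[Mapping[str, str]] = None) -> RuntimeConfig:
    """Read the FREECIV_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return RuntimeConfig(
        server_url=env.get("FREECIV_SERVER_URL", "http://127.0.0.1"),
        username=env.get("FREECIV_USERNAME", "openenvbot"),
        client_port=int(env.get("FREECIV_CLIENT_PORT", "6000")),
        turn_timeout_s=float(env.get("FREECIV_TURN_TIMEOUT_S", "60")),
    )


config = load_config()
print(config)
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests without touching the process environment.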

	The included automated tests use a fake session backend, so they do not require a live Freeciv server.

	The GRPO training script uses:

	- `Qwen/Qwen3.5-0.8B`
	- Unsloth bf16 LoRA loading
	- TRL `GRPOTrainer`
	- integer-only action selection to minimize generated tokens
	- offline GRPO over env-sampled states for maximum throughput
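
The sketch below illustrates two of the ideas above: integer-only action selection, and the group-relative reward normalisation that GRPO is built around. Both helpers are hypothetical and are not taken from `scripts/train_grpo_fast.py`. The population standard deviation is used for simplicity, and TRL's `GRPOTrainer` may compute the statistic differently.

```python
import re
import statistics
from typing import Optional


def parse_action_index(completion: str, num_legal: int) -> Optional[int]:
    """Return the first integer in the completion if it indexes a legal action."""
    match = re.search(r"\d+", completion)
    if match is None:
        return None
    index = int(match.group())
    return index if index < num_legal else None


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalise rewards within one group of completions for the same state."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # assumption: population std, for simplicity
    return [(reward - mean) / (std + eps) for reward in rewards]


# Four completions sampled for one state that has three legal actions.
completions = ["0", "2", "7", "pass"]
indices = [parse_action_index(text, num_legal=3) for text in completions]
# Toy reward for illustration: 1.0 for a valid index, 0.0 otherwise.
rewards = [1.0 if index is not None else 0.0 for index in indices]
advantages = group_relative_advantages(rewards)
```

Because the model only has to emit a short integer, each completion costs a handful of tokens, which is the throughput benefit the list above refers to.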