Add files using upload-large-folder tool
Browse filesThis view is limited to 50 files because it contains too many changes. See raw diff
- src_code_for_reproducibility/docs/source/environments.rst +35 -0
- src_code_for_reproducibility/docs/source/index.rst +22 -0
- src_code_for_reproducibility/docs/source/installation.rst +10 -0
- src_code_for_reproducibility/docs/source/src.environments.dond.dond_agent.rst +7 -0
- src_code_for_reproducibility/docs/source/src.environments.dond.dond_game.rst +7 -0
- src_code_for_reproducibility/docs/source/src.environments.dond.dond_log_funcs.rst +7 -0
- src_code_for_reproducibility/docs/source/src.environments.dond.dond_player.rst +7 -0
- src_code_for_reproducibility/docs/source/src.environments.dond.dond_return_funcs.rst +7 -0
- src_code_for_reproducibility/docs/source/src.environments.dond.dond_training_data_funcs.rst +7 -0
- src_code_for_reproducibility/docs/source/src.environments.dond.rst +19 -0
- src_code_for_reproducibility/docs/source/src.environments.env_imports.rst +7 -0
- src_code_for_reproducibility/docs/source/src.environments.environment_imports.rst +7 -0
- src_code_for_reproducibility/docs/source/src.environments.ipd.ipd_agent.rst +7 -0
- src_code_for_reproducibility/docs/source/src.environments.ipd.ipd_game.rst +7 -0
- src_code_for_reproducibility/docs/source/src.environments.ipd.ipd_log_funcs.rst +7 -0
- src_code_for_reproducibility/docs/source/src.environments.ipd.ipd_training_data_funcs.rst +7 -0
- src_code_for_reproducibility/docs/source/src.environments.ipd.rst +19 -0
- src_code_for_reproducibility/docs/source/src.environments.rst +25 -0
- src_code_for_reproducibility/docs/source/src.experiments.arithmetic_test.rst +7 -0
- src_code_for_reproducibility/docs/source/src.experiments.dond_run_train.rst +7 -0
- src_code_for_reproducibility/docs/source/src.experiments.generate_and_train.rst +7 -0
- src_code_for_reproducibility/docs/source/src.experiments.last_completion.rst +7 -0
- src_code_for_reproducibility/docs/source/src.generation.rst +15 -0
- src_code_for_reproducibility/docs/source/src.generation.run_games.rst +7 -0
- src_code_for_reproducibility/docs/source/src.models.dummy_hf_agent.rst +7 -0
- src_code_for_reproducibility/docs/source/src.models.dummy_local_llm.rst +7 -0
- src_code_for_reproducibility/docs/source/src.models.new_local_llm.rst +7 -0
- src_code_for_reproducibility/docs/source/src.models.rst +20 -0
- src_code_for_reproducibility/docs/source/src.models.updatable_worker.rst +7 -0
- src_code_for_reproducibility/docs/source/src.models.vllm_worker_wrap.rst +7 -0
- src_code_for_reproducibility/docs/source/src.training.ppo_train_value_head.rst +7 -0
- src_code_for_reproducibility/docs/source/src.training.reinforce_training.rst +7 -0
- src_code_for_reproducibility/docs/source/src.training.rl_convs_processing.rst +7 -0
- src_code_for_reproducibility/docs/source/src.training.rst +19 -0
- src_code_for_reproducibility/docs/source/src.training.train_main.rst +7 -0
- src_code_for_reproducibility/docs/source/src.utils.common_imports.rst +7 -0
- src_code_for_reproducibility/docs/source/src.utils.export_ppo_training_set.rst +7 -0
- src_code_for_reproducibility/docs/source/src.utils.log_statistics.rst +7 -0
- src_code_for_reproducibility/docs/source/src.utils.parallel_shuffle.rst +7 -0
- src_code_for_reproducibility/docs/source/src.utils.rst +24 -0
- src_code_for_reproducibility/docs/source/src.utils.update_start_epoch.rst +7 -0
- src_code_for_reproducibility/markov_games/__pycache__/__init__.cpython-312.pyc +0 -0
- src_code_for_reproducibility/markov_games/__pycache__/agent.cpython-312.pyc +0 -0
- src_code_for_reproducibility/markov_games/__pycache__/alternative_actions_runner.cpython-312.pyc +0 -0
- src_code_for_reproducibility/markov_games/__pycache__/gather_and_export_utils.cpython-312.pyc +0 -0
- src_code_for_reproducibility/markov_games/__pycache__/group_timesteps.cpython-312.pyc +0 -0
- src_code_for_reproducibility/markov_games/__pycache__/linear_runner.cpython-312.pyc +0 -0
- src_code_for_reproducibility/markov_games/__pycache__/markov_game.cpython-312.pyc +0 -0
- src_code_for_reproducibility/markov_games/__pycache__/mg_utils.cpython-312.pyc +0 -0
- src_code_for_reproducibility/markov_games/__pycache__/rollout_tree.cpython-312.pyc +0 -0
src_code_for_reproducibility/docs/source/environments.rst
ADDED
|
@@ -0,0 +1,35 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
=================
|
| 2 |
+
MARL Environments
|
| 3 |
+
=================
|
| 4 |
+
|
| 5 |
+
This section provides detailed documentation for the multi-agent negotiation environments included in the library.
|
| 6 |
+
|
| 7 |
+
Each environment follows the standard interface described in :doc:`../environments` but has its own unique game rules,
|
| 8 |
+
dynamics, and implementation details.
|
| 9 |
+
|
| 10 |
+
.. toctree::
|
| 11 |
+
:maxdepth: 2
|
| 12 |
+
:caption: Available Environments:
|
| 13 |
+
|
| 14 |
+
environments/ipd
|
| 15 |
+
environments/diplomacy
|
| 16 |
+
environments/dond
|
| 17 |
+
|
| 18 |
+
Overview
|
| 19 |
+
--------
|
| 20 |
+
|
| 21 |
+
The library currently includes the following environments:
|
| 22 |
+
|
| 23 |
+
1. **Iterated Prisoner's Dilemma (IPD)**: A classic game theory problem where two agents repeatedly decide whether to cooperate or defect, with different payoffs based on their joint actions.
|
| 24 |
+
|
| 25 |
+
2. **Diplomacy**: An adaptation of the board game Diplomacy, where seven European powers compete for control of supply centers through strategic moves and alliances.
|
| 26 |
+
|
| 27 |
+
3. **Deal or No Deal (DOND)**: A negotiation environment based on `the paper Deal or No Deal? End-to-End Learning for Negotiation Dialogues <https://arxiv.org/pdf/1706.05125>`_ in which agents negotiate over the distribution of a set of prizes.
|
| 28 |
+
|
| 29 |
+
Each environment documentation includes:
|
| 30 |
+
|
| 31 |
+
- Game rules and background
|
| 32 |
+
- Implementation details
|
| 33 |
+
- API reference
|
| 34 |
+
- Example usage
|
| 35 |
+
- Advanced features and customization options
|
src_code_for_reproducibility/docs/source/index.rst
ADDED
|
@@ -0,0 +1,22 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
Welcome to LLM Negotiation's documentation!
|
| 2 |
+
===========================================
|
| 3 |
+
This library is a collection of tools for training and evaluating LLM-based agents in multi-agent environments. It is designed to be easy to use and extend.
|
| 4 |
+
|
| 5 |
+
.. toctree::
|
| 6 |
+
:maxdepth: 3
|
| 7 |
+
:caption: Contents:
|
| 8 |
+
|
| 9 |
+
installation
|
| 10 |
+
marl_standard
|
| 11 |
+
environments
|
| 12 |
+
launch
|
| 13 |
+
usage
|
| 14 |
+
modules
|
| 15 |
+
contributing
|
| 16 |
+
|
| 17 |
+
Indices and tables
|
| 18 |
+
==================
|
| 19 |
+
|
| 20 |
+
* :ref:`genindex`
|
| 21 |
+
* :ref:`modindex`
|
| 22 |
+
* :ref:`search`
|
src_code_for_reproducibility/docs/source/installation.rst
ADDED
|
@@ -0,0 +1,10 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
Installation
|
| 2 |
+
===========
|
| 3 |
+
|
| 4 |
+
To install the package, run:
|
| 5 |
+
|
| 6 |
+
.. code-block:: bash
|
| 7 |
+
|
| 8 |
+
git clone https://github.com/yourusername/llm_negotiation.git
|
| 9 |
+
cd llm_negotiation
|
| 10 |
+
pip install -e .
|
src_code_for_reproducibility/docs/source/src.environments.dond.dond_agent.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.environments.dond.dond\_agent module
|
| 2 |
+
========================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.environments.dond.dond_agent
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.environments.dond.dond_game.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.environments.dond.dond\_game module
|
| 2 |
+
=======================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.environments.dond.dond_game
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.environments.dond.dond_log_funcs.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.environments.dond.dond\_log\_funcs module
|
| 2 |
+
=============================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.environments.dond.dond_log_funcs
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.environments.dond.dond_player.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.environments.dond.dond\_agent module
|
| 2 |
+
=========================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.environments.dond.dond_agent
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.environments.dond.dond_return_funcs.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.environments.dond.dond\_return\_funcs module
|
| 2 |
+
================================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.environments.dond.dond_return_funcs
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.environments.dond.dond_training_data_funcs.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.environments.dond.dond\_training\_data\_funcs module
|
| 2 |
+
========================================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.environments.dond.dond_training_data_funcs
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.environments.dond.rst
ADDED
|
@@ -0,0 +1,19 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.environments.dond package
|
| 2 |
+
=============================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.environments.dond
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
| 8 |
+
|
| 9 |
+
Submodules
|
| 10 |
+
----------
|
| 11 |
+
|
| 12 |
+
.. toctree::
|
| 13 |
+
:maxdepth: 4
|
| 14 |
+
|
| 15 |
+
src.environments.dond.dond_agent
|
| 16 |
+
src.environments.dond.dond_game
|
| 17 |
+
src.environments.dond.dond_log_funcs
|
| 18 |
+
src.environments.dond.dond_statistics_funcs
|
| 19 |
+
src.environments.dond.dond_training_data_funcs
|
src_code_for_reproducibility/docs/source/src.environments.env_imports.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.environments.env\_imports module
|
| 2 |
+
====================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.environments.env_imports
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.environments.environment_imports.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.environments.environment\_imports module
|
| 2 |
+
============================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.environments.environment_imports
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.environments.ipd.ipd_agent.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.environments.ipd.ipd\_agent module
|
| 2 |
+
======================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.environments.ipd.ipd_agent
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.environments.ipd.ipd_game.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.environments.ipd.ipd\_game module
|
| 2 |
+
=====================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.environments.ipd.ipd_game
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.environments.ipd.ipd_log_funcs.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.environments.ipd.ipd\_log\_funcs module
|
| 2 |
+
===========================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.environments.ipd.ipd_log_funcs
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.environments.ipd.ipd_training_data_funcs.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.environments.ipd.ipd\_training\_data\_funcs module
|
| 2 |
+
======================================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.environments.ipd.ipd_training_data_funcs
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.environments.ipd.rst
ADDED
|
@@ -0,0 +1,19 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.environments.ipd package
|
| 2 |
+
============================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.environments.ipd
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
| 8 |
+
|
| 9 |
+
Submodules
|
| 10 |
+
----------
|
| 11 |
+
|
| 12 |
+
.. toctree::
|
| 13 |
+
:maxdepth: 4
|
| 14 |
+
|
| 15 |
+
src.environments.ipd.ipd_agent
|
| 16 |
+
src.environments.ipd.ipd_game
|
| 17 |
+
src.environments.ipd.ipd_log_funcs
|
| 18 |
+
src.environments.ipd.ipd_statistics_funcs
|
| 19 |
+
src.environments.ipd.ipd_training_data_funcs
|
src_code_for_reproducibility/docs/source/src.environments.rst
ADDED
|
@@ -0,0 +1,25 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.environments package
|
| 2 |
+
========================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.environments
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
| 8 |
+
|
| 9 |
+
Subpackages
|
| 10 |
+
-----------
|
| 11 |
+
|
| 12 |
+
.. toctree::
|
| 13 |
+
:maxdepth: 4
|
| 14 |
+
|
| 15 |
+
src.environments.dond
|
| 16 |
+
src.environments.ipd
|
| 17 |
+
|
| 18 |
+
Submodules
|
| 19 |
+
----------
|
| 20 |
+
|
| 21 |
+
.. toctree::
|
| 22 |
+
:maxdepth: 4
|
| 23 |
+
|
| 24 |
+
src.environments.env_imports
|
| 25 |
+
src.environments.environment_imports
|
src_code_for_reproducibility/docs/source/src.experiments.arithmetic_test.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.experiments.arithmetic\_test module
|
| 2 |
+
=======================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.experiments.arithmetic_test
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.experiments.dond_run_train.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.experiments.dond\_run\_train module
|
| 2 |
+
=======================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.experiments.dond_run_train
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.experiments.generate_and_train.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.experiments.generate\_and\_train module
|
| 2 |
+
===========================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.experiments.generate_and_train
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.experiments.last_completion.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.experiments.last\_completion module
|
| 2 |
+
=======================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.experiments.last_completion
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.generation.rst
ADDED
|
@@ -0,0 +1,15 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.generation package
|
| 2 |
+
======================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.generation
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
| 8 |
+
|
| 9 |
+
Submodules
|
| 10 |
+
----------
|
| 11 |
+
|
| 12 |
+
.. toctree::
|
| 13 |
+
:maxdepth: 4
|
| 14 |
+
|
| 15 |
+
src.generation.run_games
|
src_code_for_reproducibility/docs/source/src.generation.run_games.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.generation.run\_games module
|
| 2 |
+
================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.generation.run_games
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.models.dummy_hf_agent.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.models.dummy\_hf\_agent module
|
| 2 |
+
==================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.models.dummy_llm_agent
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.models.dummy_local_llm.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.models.dummy\_local\_llm module
|
| 2 |
+
===================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.models.dummy_local_llm
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.models.new_local_llm.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.models.new\_local\_llm module
|
| 2 |
+
=================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.models.new_local_llm
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.models.rst
ADDED
|
@@ -0,0 +1,20 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.models package
|
| 2 |
+
==================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.models
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
| 8 |
+
|
| 9 |
+
Submodules
|
| 10 |
+
----------
|
| 11 |
+
|
| 12 |
+
.. toctree::
|
| 13 |
+
:maxdepth: 4
|
| 14 |
+
|
| 15 |
+
src.models.dummy_local_llm
|
| 16 |
+
src.models.local_llm
|
| 17 |
+
src.models.new_local_llm
|
| 18 |
+
src.models.server_llm
|
| 19 |
+
src.models.updatable_worker
|
| 20 |
+
src.models.vllm_worker_wrap
|
src_code_for_reproducibility/docs/source/src.models.updatable_worker.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.models.updatable\_worker module
|
| 2 |
+
===================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.models.updatable_worker
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.models.vllm_worker_wrap.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.models.vllm\_worker\_wrap module
|
| 2 |
+
====================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.models.vllm_worker_wrap
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.training.ppo_train_value_head.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.training.ppo\_train\_value\_head module
|
| 2 |
+
===========================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.training.ppo_train_value_head
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.training.reinforce_training.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.training.reinforce\_training module
|
| 2 |
+
=======================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.training.reinforce_training
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.training.rl_convs_processing.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.training.rl\_convs\_processing module
|
| 2 |
+
=========================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.training.rl_convs_processing
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.training.rst
ADDED
|
@@ -0,0 +1,19 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.training package
|
| 2 |
+
====================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.training
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
| 8 |
+
|
| 9 |
+
Submodules
|
| 10 |
+
----------
|
| 11 |
+
|
| 12 |
+
.. toctree::
|
| 13 |
+
:maxdepth: 4
|
| 14 |
+
|
| 15 |
+
src.training.ppo_train
|
| 16 |
+
src.training.ppo_train_value_head
|
| 17 |
+
src.training.reinforce_training
|
| 18 |
+
src.training.rl_convs_processing
|
| 19 |
+
src.training.train_main
|
src_code_for_reproducibility/docs/source/src.training.train_main.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.training.train\_main module
|
| 2 |
+
===============================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.training.train_main
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.utils.common_imports.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.utils.common\_imports module
|
| 2 |
+
================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.utils.common_imports
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.utils.export_ppo_training_set.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.utils.export\_ppo\_training\_set module
|
| 2 |
+
===========================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.utils.export_ppo_training_set
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.utils.log_statistics.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.utils.log\_statistics module
|
| 2 |
+
================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.utils.log_statistics
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.utils.parallel_shuffle.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.utils.parallel\_shuffle module
|
| 2 |
+
==================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.utils.parallel_shuffle
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/docs/source/src.utils.rst
ADDED
|
@@ -0,0 +1,24 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.utils package
|
| 2 |
+
=================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.utils
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
| 8 |
+
|
| 9 |
+
Submodules
|
| 10 |
+
----------
|
| 11 |
+
|
| 12 |
+
.. toctree::
|
| 13 |
+
:maxdepth: 4
|
| 14 |
+
|
| 15 |
+
src.utils.common_imports
|
| 16 |
+
src.utils.export_ppo_training_set
|
| 17 |
+
src.utils.extra_stats
|
| 18 |
+
src.utils.inherit_args
|
| 19 |
+
src.utils.log_gpu_usage
|
| 20 |
+
src.utils.log_statistics
|
| 21 |
+
src.utils.model_to_cpu
|
| 22 |
+
src.utils.parallel_shuffle
|
| 23 |
+
src.utils.quick_stats
|
| 24 |
+
src.utils.update_start_epoch
|
src_code_for_reproducibility/docs/source/src.utils.update_start_epoch.rst
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
src.utils.update\_start\_epoch module
|
| 2 |
+
=====================================
|
| 3 |
+
|
| 4 |
+
.. automodule:: src.utils.update_start_epoch
|
| 5 |
+
:members:
|
| 6 |
+
:undoc-members:
|
| 7 |
+
:show-inheritance:
|
src_code_for_reproducibility/markov_games/__pycache__/__init__.cpython-312.pyc
ADDED
|
Binary file (159 Bytes). View file
|
|
|
src_code_for_reproducibility/markov_games/__pycache__/agent.cpython-312.pyc
ADDED
|
Binary file (3.2 kB). View file
|
|
|
src_code_for_reproducibility/markov_games/__pycache__/alternative_actions_runner.cpython-312.pyc
ADDED
|
Binary file (4.95 kB). View file
|
|
|
src_code_for_reproducibility/markov_games/__pycache__/gather_and_export_utils.cpython-312.pyc
ADDED
|
Binary file (46.5 kB). View file
|
|
|
src_code_for_reproducibility/markov_games/__pycache__/group_timesteps.cpython-312.pyc
ADDED
|
Binary file (6.17 kB). View file
|
|
|
src_code_for_reproducibility/markov_games/__pycache__/linear_runner.cpython-312.pyc
ADDED
|
Binary file (1.25 kB). View file
|
|
|
src_code_for_reproducibility/markov_games/__pycache__/markov_game.cpython-312.pyc
ADDED
|
Binary file (9.72 kB). View file
|
|
|
src_code_for_reproducibility/markov_games/__pycache__/mg_utils.cpython-312.pyc
ADDED
|
Binary file (3.98 kB). View file
|
|
|
src_code_for_reproducibility/markov_games/__pycache__/rollout_tree.cpython-312.pyc
ADDED
|
Binary file (3.67 kB). View file
|
|
|