anna4142 committed on
Commit 00ed97f · verified · 1 parent: 20de744

hierarchical-decision-transformer

Files changed (1)
README.md CHANGED: +26 -52
@@ -1,72 +1,46 @@
 ---
 tags:
- - decision_transformer
- - reinforcement_learning
- - gym_environment
- - fine-tuned
 model-index:
- - name: Hierarchical Decision Transformer
 results: []
 ---
 
- # Hierarchical Decision Transformer
-
- This model is a fine-tuned version of the Decision Transformer, trained on expert trajectories sampled from the Gym HalfCheetah environment.
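For orientation, a checkpoint in this format is normally loaded through the stock Decision Transformer class in Hugging Face Transformers. A minimal sketch, assuming the repo id matches this repository's name; the custom hierarchical head described below would not be restored by the stock class:

```python
import torch
from transformers import DecisionTransformerModel

# Repo id is an assumption based on this repository's name.
model = DecisionTransformerModel.from_pretrained("anna4142/hierarchical-decision-transformer")
model.eval()

# Dummy rollout tensors with HalfCheetah sizes (17-dim state, 6-dim action).
states = torch.randn(1, 20, 17)
actions = torch.zeros(1, 20, 6)
returns_to_go = torch.ones(1, 20, 1)
timesteps = torch.arange(20).unsqueeze(0)
attention_mask = torch.ones(1, 20, dtype=torch.long)

with torch.no_grad():
    out = model(states=states, actions=actions, rewards=None,
                returns_to_go=returns_to_go, timesteps=timesteps,
                attention_mask=attention_mask, return_dict=True)
# out.action_preds holds the predicted action at each timestep.
```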
 
- ## Model Description
-
- The **Hierarchical Decision Transformer** extends the standard Decision Transformer by incorporating hierarchical reasoning. It introduces clustering and subgoal reasoning capabilities, enabling enhanced performance on tasks requiring multi-level decision-making.
-
- - **Architecture**:
-   - A hierarchical head added to process state embeddings for clustering.
-   - Cluster centroids initialized as learnable parameters.
- - **Loss functions**:
-   - Action prediction loss (MSE between predicted and target actions).
-   - Entropy loss for cluster assignment diversity.
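The card ships no code for these components, but the bullets map onto a small amount of PyTorch. The following is a minimal sketch, not the repository's implementation; `HierarchicalHead`, `hdt_loss`, the dot-product similarity, and the entropy sign and weight are all assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalHead(nn.Module):
    """Soft-clusters state embeddings against learnable centroids (illustrative)."""

    def __init__(self, hidden_dim: int, num_clusters: int):
        super().__init__()
        # Cluster centroids initialized as learnable parameters.
        self.centroids = nn.Parameter(torch.randn(num_clusters, hidden_dim))

    def forward(self, state_emb: torch.Tensor) -> torch.Tensor:
        # Dot-product similarity of each embedding to each centroid,
        # softmaxed into soft cluster assignments: (batch, num_clusters).
        return F.softmax(state_emb @ self.centroids.t(), dim=-1)


def hdt_loss(pred_actions, target_actions, cluster_probs, entropy_weight=0.1):
    # Action prediction loss: MSE between predicted and target actions.
    action_loss = F.mse_loss(pred_actions, target_actions)
    # Mean assignment entropy; subtracting it rewards diverse (high-entropy)
    # cluster usage. Sign convention and weight are assumptions.
    entropy = -(cluster_probs * cluster_probs.clamp_min(1e-8).log()).sum(-1).mean()
    return action_loss - entropy_weight * entropy
```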
 
- ## Intended Uses & Limitations
-
- ### Intended Uses
- - Offline reinforcement learning tasks using trajectory data.
- - Tasks requiring subgoal reasoning or clustering-based decision-making.
- - Benchmarking on Gym environments like HalfCheetah and Hopper.
-
- ### Limitations
- - Performance depends heavily on clustering configurations and hierarchical design.
- - Additional computational cost due to hierarchical components.
-
- ## Training and Evaluation Data
-
- The model was trained on expert trajectories from the **Gym HalfCheetah environment**. These trajectories were sampled from a pre-trained policy to provide high-quality data for offline reinforcement learning.
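The card does not include the collection script. A sketch of how such trajectories could be gathered with the Gymnasium-style API; `expert_policy` is a hypothetical stand-in for the pre-trained policy, and the episode count is illustrative:

```python
import gymnasium as gym

env = gym.make("HalfCheetah-v4")
expert_policy = lambda obs: env.action_space.sample()  # stand-in for the real expert

trajectories = []
for _ in range(10):  # number of episodes is illustrative
    obs, info = env.reset()
    episode = {"observations": [], "actions": [], "rewards": []}
    done = False
    while not done:
        action = expert_policy(obs)
        next_obs, reward, terminated, truncated, info = env.step(action)
        episode["observations"].append(obs)
        episode["actions"].append(action)
        episode["rewards"].append(reward)
        obs = next_obs
        done = terminated or truncated
    trajectories.append(episode)
```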
 
- ## Training Procedure
-
- ### Training Hyperparameters
- - **Learning Rate**: 0.0001
- - **Train Batch Size**: 64
- - **Eval Batch Size**: 8
- - **Seed**: 42
- - **Optimizer**: `adamw_torch` with:
-   - `betas`: (0.9, 0.999)
-   - `epsilon`: 1e-08
- - **LR Scheduler Type**: `linear`
- - **Warmup Ratio**: 0.1
- - **Number of Epochs**: 200
-
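These settings map one-to-one onto a Transformers `TrainingArguments` configuration. A sketch; only `output_dir` is a placeholder, every other value is taken from the list above:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="output",              # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=200,
)
```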
- ### Framework Versions
- - **Transformers**: 4.46.2
- - **PyTorch**: 2.5.1+cu121
- - **Datasets**: 3.1.0
- - **Tokenizers**: 0.20.3
-
- ## References
-
- - [Decision Transformer Paper](https://arxiv.org/abs/2106.01345)
- - [Hugging Face Transformers Documentation](https://huggingface.co/docs/transformers/)
- - [Gym Environments](https://www.gymlibrary.dev/)
@@ -75,4 +49,4 @@ The model was trained on expert trajectories from the **Gym HalfCheetah environm
 - Transformers 4.46.2
 - Pytorch 2.5.1+cu121
 - Datasets 3.1.0
- - Tokenizers 0.20.3
 
 ---
+ library_name: transformers
 tags:
+ - generated_from_trainer
 model-index:
+ - name: output
 results: []
 ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # output
+
+ This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0001
+ - train_batch_size: 64
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 200
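The auto-generated optimizer line above is hard to read; in plain PyTorch terms it corresponds roughly to the following. A sketch: `model` and the step counts are placeholders, and `total_steps` would be `num_epochs * steps_per_epoch` in practice:

```python
import torch
from transformers import get_linear_schedule_with_warmup

model = torch.nn.Linear(17, 6)  # stand-in for the real network
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4,
                              betas=(0.9, 0.999), eps=1e-8)
total_steps = 10_000            # placeholder; num_epochs * steps_per_epoch
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.1 * total_steps),  # lr_scheduler_warmup_ratio: 0.1
    num_training_steps=total_steps,
)
# Call scheduler.step() after each optimizer.step() during training.
```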
 
+ ### Training results
+
 - Transformers 4.46.2
 - Pytorch 2.5.1+cu121
 - Datasets 3.1.0
+ - Tokenizers 0.20.3