LLM agent β€” tool use (ReAct) 🚧 not trained yet

An LLM that reasons and calls tools in a loop (ReAct) to solve tasks.

Status β€” documented recipe (placeholder). A production-grade pipeline from Ropedia Academy for an advanced, GPU-heavy task. Everything below β€” base model, objective, dataset, config, the exact evaluation β€” is specified; the weights / metrics / figures land here automatically when you run the notebook on a GPU (one click below). Try the trained models live in the Ropedia demos Space.

At a glance

Base model Qwen/Qwen2.5-1.5B-Instruct (tool-calling)
Task tool-using LLM agent
Training objective ReAct loop β€” the LLM reasons, calls tools, observes, and iterates.
Track AG Β· Agents & RL
Built on huggingface/transformers (function calling)
Notebook Open In Colab
Compute / storage / time GPU required β€” see the Compute Β· storage Β· time table in the notebook

Dataset

  • Source: Task suite (the self-contained AG_agent_harness).

Training config

GPU-scale β€” the notebook ships a demo profile (free Colab T4) and a full profile, with an exact Compute Β· storage Β· time table. Hyperparameters (optimizer, steps, batch, LoRA rank, …) are in the training cell.

Evaluation results

⏳ Pending β€” run the notebook on a GPU to fill this in. This lab reports task success rate on a held-out split (see its Evaluate cell).

Inference example

No weights are published yet. After a GPU run, load the checkpoint/adapter the notebook saves (it also has a ready inference cell). Base model: Qwen/Qwen2.5-1.5B-Instruct (tool-calling).

How to fill this repo

  1. Open the notebook in Colab β†’ Runtime β†’ GPU β†’ Run all (runs the real pipeline).
  2. Run its Publish to the Hugging Face Hub step (or HfApi().upload_folder(...)) β€” the checkpoint + metrics.json + figures replace this placeholder.
  • Train / run on a GPU Β· [ ] upload weights Β· [ ] add metrics.json Β· [ ] add figures Β· [ ] swap in the real results card

Limitations

Not yet trained β€” no numbers to report. The pipeline is GPU-heavy (see the compute table); on free Colab use the demo-scale settings. This is an educational, reproducible recipe, not a tuned production release.

License

Code: MIT (this repository). The base model (huggingface/transformers (function calling)) and dataset are each under their own licenses β€” check the upstream source before redistribution.

Citation

@misc{ropedia_academy,
  title  = {Ropedia Academy: an interactive course on embodied & spatial AI},
  author = {Ropedia Academy},
  year   = {2026},
  howpublished = {\url{https://chaoyue0307.github.io/ropedia-academy/}}
}

Method / original work: Yao et al., ReAct, ICLR 2023; Schick et al., Toolformer, 2023.

Related assets


Documented placeholder in the Ropedia Academy collection β€” train it on a GPU to publish the real model. Contributions welcome on GitHub.

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for cy0307/ag-llm-agent-tooluse

Finetuned
(1674)
this model

Collection including cy0307/ag-llm-agent-tooluse