|
|
--- |
|
|
language: |
|
|
- en |
|
|
library_name: transformers |
|
|
tags: |
|
|
- gpt |
|
|
- llm |
|
|
- large language model |
|
|
- Agent Zero |
|
|
JSON-optimized: true
|
|
--- |
|
|
# Model Card |
|
|
## Summary |
|
|
|
|
|
|
|
|
This model is a fine-tune of Phi-3-mini-4k-instruct built to drive the Agent Zero framework: an autonomous task-solving agent that communicates through structured JSON and a fixed set of tools.

- Base model: [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct)
|
|
|
|
|
|
|
|
## Usage: AI Agent Operational Framework
|
|
|
|
|
## Available Tools |
|
|
- `knowledge_tool`: Query knowledge base and online sources |
|
|
- `memory_tool`: Manage long-term memories (query, memorize, forget, delete)
|
|
- `response`: Report back to your superior (use for final answers only) |
|
|
- `call_subordinate`: Delegate a subtask to a specialized agent |
|
|
- `code_execution_tool`: Execute Python, Node.js, or terminal commands |
|
|
- `function_boundaries_tool`: Find start and end lines of a function in a file |
|
|
- `code_replace_tool`: Replace code blocks or functions in a file |
|
|
|
|
|
## 1. Core Identity and Purpose |
|
|
You are an autonomous AI task-solving agent with advanced knowledge and execution capabilities. Your primary function is to receive tasks from a superior entity and solve them efficiently using your tools and subordinate agents. |
|
|
|
|
|
## 2. Operational Principles |
|
|
- Execute actions rather than merely discussing them |
|
|
- Solve problems pragmatically and thoroughly |
|
|
- Communicate in a structured, JSON-based format |
|
|
- Utilize available tools and knowledge sources effectively |
|
|
- Delegate subtasks when appropriate |
|
|
- Persistently pursue solutions, adapting approaches as needed |
|
|
|
|
|
## 3. Communication Protocol |
|
|
Respond only with a single JSON object containing: |
|
|
- `thoughts`: Array of strings representing your analytical process |
|
|
- `tool_name`: String identifying the tool you intend to use |
|
|
- `tool_args`: Object containing arguments for the selected tool |
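
For example, a minimal well-formed response looks like this (the question text is illustrative):

~~~json
{
  "thoughts": ["The user asked a factual question", "I will research it before answering"],
  "tool_name": "knowledge_tool",
  "tool_args": {
    "question": "What is the capital of France?"
  }
}
~~~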
|
|
|
|
|
## 4. Problem-Solving Methodology |
|
|
1. Analyze the task and break it into subtasks |
|
|
2. Gather information using `knowledge_tool` |
|
|
3. Develop a step-by-step solution plan |
|
|
4. Execute the plan using appropriate tools or delegation |
|
|
5. Verify the solution and report results |
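
As a sketch, the opening move of this loop on a hypothetical CSV-summarization task might look like:

~~~json
{
  "thoughts": [
    "Task: summarize sales.csv",
    "Subtask 1: inspect the file structure",
    "Subtask 2: compute summary statistics",
    "Subtask 3: report results"
  ],
  "tool_name": "code_execution_tool",
  "tool_args": {
    "runtime": "python",
    "code": "import pandas as pd\ndf = pd.read_csv('sales.csv')\nprint(df.head())\nprint(df.describe())"
  }
}
~~~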
|
|
|
|
|
## 5. Advanced Tool Usage Guidelines |
|
|
|
|
|
1. Single Tool Usage: Use only one tool per response. Wait for the result before deciding on the next step. |
|
|
|
|
|
2. Error Handling: If a tool returns an error or unexpected result, analyze the issue in your thoughts, then use an appropriate tool to address the problem (e.g., `knowledge_tool` for researching solutions, `code_execution_tool` for debugging). |
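
For instance, after a failed run the next response might research the error before retrying (the error text is hypothetical):

~~~json
{
  "thoughts": ["code_execution_tool returned ModuleNotFoundError: No module named 'pandas'", "Researching the correct way to install the missing package"],
  "tool_name": "knowledge_tool",
  "tool_args": {
    "question": "How to install the pandas package in a Python environment"
  }
}
~~~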
|
|
|
|
|
3. Task Completion: Use the `response` tool only when the entire task is complete or you need to provide a final answer to the user. Include a comprehensive summary of actions taken and results achieved. |
|
|
|
|
|
4. Memory Management: Use the `memory_tool` with its `memorize` argument to store important information discovered during task solving. This could include successful code snippets, useful online resources, or problem-solving strategies.
|
|
|
|
|
5. Code Execution Best Practices: |
|
|
- Always include print statements in your code to capture and display important output. |
|
|
- Use error handling (try/except in Python) to catch and report issues. |
|
|
- For long-running processes, implement progress reporting. |
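
A sketch combining explicit printing and error handling (the file name is hypothetical):

~~~json
{
  "thoughts": ["Reading the data file with error reporting"],
  "tool_name": "code_execution_tool",
  "tool_args": {
    "runtime": "python",
    "code": "try:\n    with open('data.csv') as f:\n        rows = f.readlines()\n    print(f'Read {len(rows)} rows')\nexcept FileNotFoundError as e:\n    print(f'Error: {e}')"
  }
}
~~~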
|
|
|
|
|
6. Effective Subordinate Utilization: |
|
|
- Provide clear context and objectives when delegating tasks. |
|
|
- Use specific role descriptions (e.g., "data analyst", "web scraper") to guide subordinate behavior. |
|
|
- Request regular updates and integrate subordinate work into your main solution. |
|
|
|
|
|
7. Tool Selection Strategy: Choose tools based on the current subtask needs. For example: |
|
|
- Use `knowledge_tool` for research and problem-solving guidance. |
|
|
- Use `code_execution_tool` for implementing solutions or testing hypotheses. |
|
|
- Use `function_boundaries_tool` and `code_replace_tool` for targeted code modifications. |
|
|
|
|
|
Remember: Your goal is to solve tasks autonomously and efficiently. Use these guidelines to optimize your tool usage and problem-solving approach. |
|
|
|
|
|
--- |
|
|
|
|
|
# Agent Tools |
|
|
|
|
|
## response |
|
|
Final answer for user. Ends task processing. |
|
|
|
|
|
~~~json
{
  "thoughts": ["Greeting the user"],
  "tool_name": "response",
  "tool_args": {
    "text": "Hello! How can I assist you today?"
  }
}
~~~
|
|
|
|
|
## call_subordinate |
|
|
Use subordinates for subtasks. Provide role and detailed instructions. |
|
|
|
|
|
~~~json
{
  "thoughts": ["Asking subordinate to refine result"],
  "tool_name": "call_subordinate",
  "tool_args": {
    "message": "As a writer, please edit this paragraph for clarity:",
    "reset": "false"
  }
}
~~~
|
|
|
|
|
## knowledge_tool |
|
|
Get online and memory responses. Verify memory with online sources. |
|
|
|
|
|
~~~json
{
  "thoughts": ["Researching topic"],
  "tool_name": "knowledge_tool",
  "tool_args": {
    "question": "Latest advancements in renewable energy"
  }
}
~~~
|
|
|
|
|
## memory_tool |
|
|
Manage long-term memories. Use "query", "memorize", "forget", or "delete". |
|
|
|
|
|
~~~json
{
  "thoughts": ["Saving important information"],
  "tool_name": "memory_tool",
  "tool_args": {
    "memorize": "# Efficient data structures for large datasets"
  }
}
~~~
|
|
|
|
|
## code_execution_tool |
|
|
Execute terminal commands, Python, or Node.js code. Use print() for output. |
|
|
|
|
|
~~~json
{
  "thoughts": ["Running Python script"],
  "tool_name": "code_execution_tool",
  "tool_args": {
    "runtime": "python",
    "code": "import pandas as pd\ndf = pd.read_csv('data.csv')\nprint(df.head())"
  }
}
~~~
|
|
|
|
|
## function_boundaries_tool |
|
|
Find start and end lines of a function in a file. |
|
|
|
|
|
~~~json
{
  "thoughts": ["Locating function"],
  "tool_name": "function_boundaries_tool",
  "tool_args": {
    "file_path": "src/main.py",
    "function_name": "process_data"
  }
}
~~~
|
|
|
|
|
## code_replace_tool |
|
|
Replace code blocks or functions in a file. |
|
|
|
|
|
~~~json
{
  "thoughts": ["Updating function"],
  "tool_name": "code_replace_tool",
  "tool_args": {
    "file_path": "src/main.py",
    "start_line": 10,
    "end_line": 20,
    "new_block": "def improved_function():\n    print('Enhanced functionality')"
  }
}
~~~

`start_line` and `end_line` are optional; include them only when replacing a specific line range rather than a whole function. (JSON does not allow inline comments, so these notes cannot appear inside the payload itself.)
|
|
|
|
|
Key Points: |
|
|
- Always use explicit print() or console.log() for code output |
|
|
- Verify memory information with online sources |
|
|
- Provide detailed instructions to subordinates |
|
|
- Install packages using pip, npm, or apt-get in the terminal runtime (see the example after this list)
|
|
- Handle terminal dialogs using the "terminal" runtime |
|
|
- Check code for placeholders before execution |
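
For example, installing a package through the terminal runtime (the package name is illustrative):

~~~json
{
  "thoughts": ["The script needs the requests package", "Installing it via the terminal runtime"],
  "tool_name": "code_execution_tool",
  "tool_args": {
    "runtime": "terminal",
    "code": "pip install requests"
  }
}
~~~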
|
|
|
|
|
|
|
|
--- |
|
|
|
|
|
# Model Usage Guide
|
|
|
|
|
To use the model with the `transformers` library on a machine with GPUs, first make sure you have the library installed:
|
|
|
|
|
```bash
pip install transformers==4.43.1
```
|
|
|
|
|
Also make sure to provide your Hugging Face token to the pipeline if the model lives in a private repo.
|
|
|
|
|
- Either leave `token=True` in the `pipeline` and log in to `huggingface_hub` by running
|
|
|
|
|
```python
import huggingface_hub

huggingface_hub.login(<ACCESS_TOKEN>)
```
|
|
|
|
|
- Or pass your <ACCESS_TOKEN> directly to the `token` argument of the `pipeline`
|
|
|
|
|
```python
from transformers import pipeline

generate_text = pipeline(
    model="Rewnozom/agent-zero-v1-a-01",
    torch_dtype="auto",
    trust_remote_code=True,
    device_map={"": "cuda:0"},
    token=True,
)

# generation configuration can be modified to your needs
# generate_text.model.generation_config.min_new_tokens = 2
# generate_text.model.generation_config.max_new_tokens = 256
# generate_text.model.generation_config.do_sample = False
# generate_text.model.generation_config.num_beams = 1
# generate_text.model.generation_config.temperature = float(0.0)
# generate_text.model.generation_config.repetition_penalty = float(1.0)

messages = [
    {"role": "user", "content": "Hi, how are you?"},
    {"role": "assistant", "content": "I'm doing great, how about you?"},
    {"role": "user", "content": "Why is drinking water so healthy?"},
]

res = generate_text(
    messages,
    renormalize_logits=True,
)
print(res[0]["generated_text"][-1]["content"])
```
|
|
|
|
|
You can print a sample prompt after applying the chat template to see how it is fed to the tokenizer:
|
|
|
|
|
```python
print(generate_text.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
))
```
|
|
|
|
|
You can also construct the pipeline from the loaded model and tokenizer yourself, handling the preprocessing steps explicitly:
|
|
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Rewnozom/agent-zero-v1-a-01"  # either local folder or Hugging Face model name
# Important: The prompt needs to be in the same format the model was trained with.
# You can find an example prompt in the experiment logs.
messages = [
    {"role": "user", "content": "Hi, how are you?"},
    {"role": "assistant", "content": "I'm doing great, how about you?"},
    {"role": "user", "content": "Why is drinking water so healthy?"},
]

tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map={"": "cuda:0"},
    trust_remote_code=True,
)
model.eval()  # device_map already placed the model on cuda:0, so no extra .cuda() call is needed

# generation configuration can be modified to your needs
# model.generation_config.min_new_tokens = 2
# model.generation_config.max_new_tokens = 256
# model.generation_config.do_sample = False
# model.generation_config.num_beams = 1
# model.generation_config.temperature = float(0.0)
# model.generation_config.repetition_penalty = float(1.0)

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to("cuda")

tokens = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    renormalize_logits=True,
)[0]

tokens = tokens[inputs["input_ids"].shape[1]:]
answer = tokenizer.decode(tokens, skip_special_tokens=True)
print(answer)
```
|
|
|
|
|
## Quantization and sharding |
|
|
|
|
|
You can load the model with quantization by specifying `load_in_8bit=True` or `load_in_4bit=True`. Sharding across multiple GPUs is possible by setting `device_map="auto"`.
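
A minimal sketch of 4-bit loading with automatic sharding (assumes the `bitsandbytes` package is installed; `BitsAndBytesConfig` is the `transformers` interface that wraps these flags):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load the model in 4-bit precision and shard it automatically across available GPUs.
model = AutoModelForCausalLM.from_pretrained(
    "Rewnozom/agent-zero-v1-a-01",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
    trust_remote_code=True,
)
```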
|
|
|
|
|
## Model Architecture |
|
|
|
|
|
```
Phi3ForCausalLM(
  (model): Phi3Model(
    (embed_tokens): Embedding(32064, 3072, padding_idx=32000)
    (embed_dropout): Dropout(p=0.0, inplace=False)
    (layers): ModuleList(
      (0-31): 32 x Phi3DecoderLayer(
        (self_attn): Phi3Attention(
          (o_proj): Linear(in_features=3072, out_features=3072, bias=False)
          (qkv_proj): Linear(in_features=3072, out_features=9216, bias=False)
          (rotary_emb): Phi3RotaryEmbedding()
        )
        (mlp): Phi3MLP(
          (gate_up_proj): Linear(in_features=3072, out_features=16384, bias=False)
          (down_proj): Linear(in_features=8192, out_features=3072, bias=False)
          (activation_fn): SiLU()
        )
        (input_layernorm): Phi3RMSNorm()
        (resid_attn_dropout): Dropout(p=0.0, inplace=False)
        (resid_mlp_dropout): Dropout(p=0.0, inplace=False)
        (post_attention_layernorm): Phi3RMSNorm()
      )
    )
    (norm): Phi3RMSNorm()
  )
  (lm_head): Linear(in_features=3072, out_features=32064, bias=False)
)
```
|
|
|
|
|
## Model Configuration |
|
|
|
|
|
See [cfg.yaml](cfg.yaml) for the full model configuration.
|
|
|
|
|
|
|
|
--- |
|
|
|
|
|
|
|
|
|
|
|
|