Commit 4b04410 by InventorsHub · verified · 1 Parent(s): 3500e27

Update README.md

Files changed (1)
  1. README.md +8 -111
README.md CHANGED
@@ -1,111 +1,8 @@
- # SwarmChat: Unified Audio, Text, and Simulation Environment for Human-Swarm Interaction
-
- SwarmChat is an innovative project that enables intuitive communication with swarm robotics through natural language. This system integrates advanced audio transcription, text processing, and safety mechanisms with a live simulation environment that visualizes a swarm of agents executing behavior trees.
-
- ## Features
-
- - **Audio Input Processing**:
-
- - Record commands via a microphone.
- - Translate speech into English using the `facebook/seamless-m4t-v2-large` model.
- - Perform a safety check on the translated text before execution.
-
- - **Text Input Processing**:
-
- - Enter text commands for swarm control.
- - Translate text using EuroLLM (EuroLLM-9B-Instruct-Q4_K_M.gguf).
- - Detect unsafe or inappropriate content with an integrated safety module.
-
- - **Safety Module**:
-
- - Utilizes a fine-tuned LLaMA-based model (llama-guard-3-8b-q4_k_m.gguf) for safety classification.
- - Identifies unsafe content across predefined categories (e.g., violent crimes, privacy violations, hate speech).
- - Ensures commands comply with safety standards.
-
- - **Swarm Simulation**:
-
- - Visualize a swarm of agents in a live simulation powered by Violet simulator and Pygame.
- - Agents are controlled by behavior trees defined in an XML file (`tree.xml`), using the `py_trees` framework.
- - Real-time simulation updates streamed via a Gradio web interface.
-
- - **Behavior Tree Generator**:
-
- - DeepSeek leverages a Llama-based model to dynamically generate behavior trees in XML format.
- - Automatically extracts available behaviors from the SwarmAgent class and constructs a detailed prompt using a predefined XML template.
- - Generates and saves new behavior tree configurations (updating tree.xml) based on user-specified tasks.
-
- - **Integrated Interface**:
- - A unified Gradio web interface for both audio and text inputs.
- - Live streaming of the simulation environment.
- - Seamless switching between different input modalities.
-
- ## Technology Stack
-
- - **Backend**:
-
- - Python
- - [Transformers](https://huggingface.co/transformers/) (Hugging Face)
- - PyTorch
- - Pygame
- - Threading and Queue modules for simulation management
-
- - **Frontend**:
-
- - [Gradio](https://gradio.app/) for an interactive web-based interface.
-
- - **AI Models**:
-
- - **Speech Processing**: `facebook/seamless-m4t-v2-large` for audio transcription and translation.
- - **Text Processing**: EuroLLM (EuroLLM-9B-Instruct-Q4_K_M.gguf) for text translation.
- - **Safety Classification**: LLaMA Guard (llama-guard-3-8b-q4_k_m.gguf) for content safety assessment.
- - **Behavior Tree Generation**: DeepSeek (using a Llama-based model DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf) for creating and updating behavior tree configurations.
-
- - **Behavior Trees**:
- - Agents utilize behavior trees—parsed from XML and built with `py_trees`—to dictate their actions within the simulation.
-
- ## Installation
-
- 1. **Clone the repository**:
-
- ```bash
- git clone https://github.com/Inventors-Hub/SwarmChat.git
- cd SwarmChat
- ```
-
- 2. **Install dependencies**:
- ```bash
- pip install -r requirements.txt
- ```
- 3. **Setup AI Models**:
-
- - Place the EuroLLM model file (`EuroLLM-9B-Instruct-Q4_K_M.gguf`) at the specified path in `text_processing.py`.
- - Place the LLaMA Guard model file (`llama-guard-3-8b-q4_k_m.gguf`) at the specified path in `safety_module.py`.
- - Place the DeepSeek model file (`DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf`) at the specified path in `bt_generator.py`.
-
- 4. **Run the Application**:
- ```bash
- python app.py
- ```
- 5. **Access the Interface**:
-
- Open your browser and navigate to http://127.0.0.1:7860 to start using SwarmChat.
-
- ## Overview of Modules
-
- - **app.py**
- The main application integrates audio/text processing, behavior tree generation, and the live simulation. It sets up the Gradio interface, handles simulation streaming, and routes user inputs to the appropriate processing modules.
-
- - **speech_processing.py**
- Implements audio transcription and translation using the `facebook/seamless-m4t-v2-large` model.
-
- - **text_processing.py**
- Translates text commands using EuroLLM (EuroLLM-9B-Instruct-Q4_K_M.gguf).
-
- - **safety_module.py**
- Utilizes LLaMA Guard to assess the safety of incoming commands, ensuring compliance with safety policies.
-
- - **bt_generator.py**
- Dynamically generates behavior trees in XML format by extracting behaviors from the SwarmAgent class, constructing a prompt, and querying a Llama-based model. The generated XML is saved to `tree.xml` for simulation use.
-
- - **simulator_env.py**
- Powers the simulation environment, manages agent behaviors using XML-defined behavior trees, and handles real-time simulation updates.
 
+ title: SwarmChat v2
+ emoji: 🤖
+ colorFrom: blue
+ colorTo: indigo
+ sdk: docker
+ app_file: Dockerfile
+ app_port: 7860
+ pinned: false
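
The eight added lines are flat `key: value` metadata. As a minimal sketch for inspecting them, the snippet below parses them into a dictionary using only the standard library. The names `ADDED_LINES` and `parse_flat_metadata` are hypothetical and not part of the repository, and this is not a full YAML parser: every value stays a string, whereas a real YAML loader would read `7860` as an integer and `false` as a boolean.

```python
# The eight lines added by this commit, copied verbatim from the diff.
ADDED_LINES = """\
title: SwarmChat v2
emoji: 🤖
colorFrom: blue
colorTo: indigo
sdk: docker
app_file: Dockerfile
app_port: 7860
pinned: false
"""


def parse_flat_metadata(text: str) -> dict[str, str]:
    """Parse flat `key: value` lines into a dict of strings.

    Hypothetical helper for illustration; it handles only the flat
    layout shown in this diff, not nested or quoted YAML.
    """
    config: dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        config[key.strip()] = value.strip()
    return config


config = parse_flat_metadata(ADDED_LINES)
print(config["sdk"], config["app_port"])
```

Keeping values as strings avoids guessing at types; a caller that needs the port as a number can convert it explicitly with `int(config["app_port"])`.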