Avishka committed on
Commit
7c1ec5d
·
verified ·
1 Parent(s): c548af4

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +218 -0
README.md ADDED
@@ -0,0 +1,218 @@
+ ---
+ license: mit
+ language:
+ - si
+ - ta
+ - en
+ tags:
+ - instruction-finetuning
+ - reasoning
+ - tool-use
+ - trilingual
+ - chat2find
+ - unsloth
+ - qwen
+ pipeline_tag: text-generation
+ library_name: transformers
+ pretty_name: Chat2Find Instruct v1 (7B Reasoning & Tool Model)
+ ---
+
+ # Chat2Find-Instruct-v1 (7B Trilingual Reasoning & Agentic Model)
+
+ Chat2Find-Instruct-v1 is a trilingual model optimized for chain-of-thought (CoT) reasoning, agentic tool calling, and complex instruction following in Sinhala, Tamil, and English.
+
+ Based on the Qwen2.5-7B architecture and fine-tuned with Unsloth on custom datasets, Chat2Find-Instruct-v1 is built to reason before it answers, which helps it work through mathematical, logical, and agent-driven tasks.
+
+ ---
+
+ ## Key Capabilities & Highlights
+
+ * **Deep Chain-of-Thought (CoT) Reasoning**: Writes its reasoning inside `<reasoning>` tags before giving the final output inside `<answer>` tags, in the style of recent reasoning models.
+ * **Robust Agentic Tool Calling**: Parses complex system prompts, calls one or more external APIs/functions in a structured format, and uses the tool responses to complete the user's request.
+ * **Trilingual Support**: Understands Sinhala and Tamil in their native scripts and switches between them and English.
+ * **Lightweight & High Efficiency**: Fine-tuned with Unsloth's optimized kernels, which keep memory use low while preserving numerical precision.
+
+ ---
+
+ ## Technical Architecture & Training Details
+
+ * **Base Architecture**: Qwen3.5
+ * **Maximum Sequence Length**: 262,144 tokens (Native Context Window)
+ * **Vocabulary**: Multilingual vocabulary covering South Asian Unicode blocks.
+
+ ### Training Dataset
+ The model was fine-tuned on the Chat2Find Unified Reasoning & Tool Dataset, which comprises 279,260 curated records. By language:
+ * **Tamil**: 45%
+ * **Sinhala**: 36%
+ * **English**: 18%
+
+ By task type:
+ * **General SFT Tasks**: 78.4%
+ * **Multi-Turn Agentic Chat**: 21.6%
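
To make the split concrete, the percentages can be converted into approximate record counts. This is only an arithmetic sketch based on the figures stated above. The language shares as published add up to 99%, so the per-language counts are approximations rather than exact dataset statistics, and the names `TOTAL_RECORDS` and `approx_counts` are illustrative.

```python
# Approximate record counts derived from the published percentages.
# The figures come from the model card; the helper name is illustrative.
TOTAL_RECORDS = 279_260

language_share = {"Tamil": 45, "Sinhala": 36, "English": 18}
task_share = {"General SFT Tasks": 78.4, "Multi-Turn Agentic Chat": 21.6}


def approx_counts(shares, total=TOTAL_RECORDS):
    """Convert percentage shares into rounded record counts."""
    return {name: round(total * pct / 100) for name, pct in shares.items()}


language_counts = approx_counts(language_share)
task_counts = approx_counts(task_share)

for name, count in {**language_counts, **task_counts}.items():
    print(f"{name}: ~{count:,} records")

# The language shares as stated cover 99% of the records, not 100%.
print("Language shares sum to", sum(language_share.values()), "percent")
```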
+
+ ---
+
+ ## Prompt Format
+
+ For best results with Chat2Find-Instruct-v1, use the ChatML template. The example below shows the format for system instructions, reasoning, and answers:
+
+ ```html
+ <|im_start|>system
+ You are a helpful assistant. Today's date is 2026-05-30. The user is located in Sri Lanka. Provide your reasoning inside <reasoning> tags and the final answer inside <answer> tags. You have access to tools. For any question that would benefit from up-to-date, real-time, or detailed external information, or for information that can change over time, prefer using the search_internet tool. When the user asks about specific individuals, companies, products, organizations, brands, or named entities, always use the search_internet tool to get accurate and current information. If no other specific tool is relevant to the user's query, always fall back to the search_internet tool.<|im_end|>
+ <|im_start|>user
+ Compare the capital cities of Sri Lanka and India.<|im_end|>
+ <|im_start|>assistant
+ <reasoning>
+ The user is asking for a comparison between the capital cities of Sri Lanka and India.
+ Sri Lanka has a dual-capital system: Sri Jayawardenepura Kotte (administrative) and Colombo (commercial). India has a single capital: New Delhi.
+ I should compare their governance roles, historical context, and economic functions in a structured format.
+ </reasoning>
+ <answer>
+ Here is a comprehensive comparison between the capital cities of Sri Lanka and India, detailing their unique administrative, political, and commercial roles.
+
+ ### 1. Administrative vs. Commercial Capitals
+
+ #### Sri Lanka: Sri Jayawardenepura Kotte & Colombo
+ Sri Lanka utilizes a dual-capital system to separate commercial congestion from legislative and administrative governance:
+ * **Sri Jayawardenepura Kotte (Administrative & Legislative Capital):** Located as a suburb just east of the commercial center, Kotte has been the official capital since 1982. It houses the Parliament of Sri Lanka, major ministries, and administrative bodies.
+ * **Colombo (Commercial & Judicial Capital):** Colombo remains the largest city, the economic backbone, and the judicial center housing the Supreme Court. It remains the face of Sri Lankan trade, finance, and diplomacy, hosting most foreign embassies.
+
+ #### India: New Delhi
+ India uses a unified system centered in a single metropolitan area:
+ * **New Delhi (Official Federal Capital):** New Delhi is a planned city situated within the larger National Capital Territory (NCT) of Delhi. It serves as the seat of all three branches of the Government of India, containing the Parliament House (Sansad Bhavan), Rashtrapati Bhavan (President's Palace), and the Supreme Court of India.
+
+ ---
+
+ ### 2. Historical & Planning Context
+
+ * **Sri Jayawardenepura Kotte:** Once the historic capital of the ancient Kingdom of Kotte in the 14th–16th centuries, the city was selected in the late 20th century to relieve the high density of urban Colombo. It was designed to build a modern, spacious administrative center centered around the parliamentary complex on Diyawanna Oya lake.
+ * **New Delhi:** Inaugurated in 1931, New Delhi was designed by British architects Edwin Lutyens and Herbert Baker as a grand, planned imperial capital. It is characterized by wide avenues, symmetrical gardens, and iconic stone edifices like India Gate and the Secretariat buildings.
+
+ ---
+
+ ### 3. Structural Comparison
+
+ | Feature | Sri Lanka (Sri Jayawardenepura Kotte / Colombo) | India (New Delhi / NCT of Delhi) |
+ | :--- | :--- | :--- |
+ | **Status** | Dual-Capital System (Kotte: Administrative / Colombo: Commercial) | Unified National Capital |
+ | **Establishment** | Designated in 1982 | Inaugurated in 1931 |
+ | **Governance seat** | Parliament in Kotte; Supreme Court in Colombo | Parliament, President, and Supreme Court in New Delhi |
+ | **Core Economy** | Colombo leads in maritime trade, tourism, and financial services | Driven by government administration, IT, service sector, and manufacturing |
+
+ In summary, while India operates its vast governmental machinery directly out of the planned district of New Delhi, Sri Lanka distributes its national functions between the legislative suburb of Sri Jayawardenepura Kotte and the historical commercial center of Colombo.
+ </answer><|im_end|>
+ ```
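
Because the reasoning and the final answer arrive in one string, client code usually needs to separate them before showing anything to the user. The following is a minimal sketch of one way to do that with a regular expression. It assumes the `<reasoning>` and `<answer>` tags survive decoding as plain text, and the helper name `split_response` is hypothetical, not part of the model's tooling.

```python
import re


def split_response(text: str) -> dict:
    """Split a decoded response into its reasoning and answer parts.

    A part that is absent (for example, no <answer> in a tool-call turn)
    is returned as None.
    """
    def grab(tag: str):
        # DOTALL lets the match span multiple lines inside the tag pair.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
        return match.group(1).strip() if match else None

    return {"reasoning": grab("reasoning"), "answer": grab("answer")}


sample = (
    "<reasoning>\nKotte is the administrative capital.\n</reasoning>\n"
    "<answer>\nSri Jayawardenepura Kotte\n</answer>"
)
parts = split_response(sample)
print(parts["answer"])
```

Returning `None` for a missing part keeps the caller in control of what to do with incomplete or truncated generations.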
+ ---
+
+ ## Tool Calling Workflow
+
+ When the model determines that a query requires external or real-time information, it will output a tool call instead of a direct answer. To complete the interaction, you need to:
+
+ 1. **Detect the tool call** in the model's output (wrapped in `<tool_call>` tags or `<function=name>` tags).
+ 2. **Execute the tool** on your end (e.g., call your search API, fetch stock data, etc.).
+ 3. **Append the tool response** to the conversation history as a message with `role: "tool"`.
+ 4. **Re-run generation** so the model produces a final answer incorporating the tool result.
+
+ Below is a complete example showing the full multi-turn flow:
+
+ ```html
+ <|im_start|>system
+ You are a helpful assistant. Today's date is 2026-05-30. The user is located in Sri Lanka. Provide your reasoning inside <reasoning> tags and the final answer inside <answer> tags. You have access to tools. For any question that would benefit from up-to-date, real-time, or detailed external information, or for information that can change over time, prefer using the search_internet tool. When the user asks about specific individuals, companies, products, organizations, brands, or named entities, always use the search_internet tool to get accurate and current information. If no other specific tool is relevant to the user's query, always fall back to the search_internet tool.<|im_end|>
+ <|im_start|>user
+ What is the current price of gold in Sri Lanka?<|im_end|>
+ <|im_start|>assistant
+ <reasoning>
+ The user is asking about the current gold price in Sri Lanka. This is real-time information that changes daily, so I need to use the search_internet tool to get accurate data.
+ </reasoning>
+ <tool_call>
+ {"name": "search_internet", "arguments": {"query": "current gold price in Sri Lanka today"}}
+ </tool_call><|im_end|>
+ <|im_start|>tool
+ {"results": "As of May 30, 2026, the price of 24K gold in Sri Lanka is approximately LKR 32,500 per gram. The price of 22K gold is approximately LKR 29,800 per gram. Prices are based on the Colombo bullion market rates."}<|im_end|>
+ <|im_start|>assistant
+ <reasoning>
+ The tool has returned the current gold prices in Sri Lanka. I can now present this information clearly to the user.
+ </reasoning>
+ <answer>
+ As of today (May 30, 2026), the gold prices in Sri Lanka are as follows:
+
+ - **24K Gold**: approximately LKR 32,500 per gram
+ - **22K Gold**: approximately LKR 29,800 per gram
+
+ These prices are based on the Colombo bullion market rates and may vary slightly between jewellers and dealers.
+ </answer><|im_end|>
+ ```
+
+ In code, the tool response message should be appended to your messages list like this:
+
+ ```python
+ # After detecting a tool call in the assistant's response, execute the tool and append:
+ messages.append({
+     "role": "tool",
+     "name": "search_internet",  # must match the function name from the tool call
+     "content": '{"results": "your tool output here"}'
+ })
+
+ # Then re-apply the chat template and call model.generate() again
+ ```
+
+ The model will then use the tool response to formulate a complete, grounded answer for the user.
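
The detect, execute, and append steps can be sketched as a small dispatcher. This is an illustration under stated assumptions, not the project's reference implementation: it handles only the JSON-in-`<tool_call>` form shown in the example above (not the `<function=name>` form), `search_internet` here is a stub standing in for a real search backend, and `TOOLS` and `handle_tool_calls` are hypothetical names.

```python
import json
import re

# Matches the JSON payload between <tool_call> tags, across line breaks.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def search_internet(query: str) -> dict:
    """Stub tool: replace with a call to your own search API."""
    return {"results": f"stub results for: {query}"}


# Registry mapping tool names to callables (assumed layout).
TOOLS = {"search_internet": search_internet}


def handle_tool_calls(assistant_text: str, messages: list) -> bool:
    """Record the assistant turn, run any requested tools, append results.

    Returns True if at least one tool call was found, meaning generation
    should be re-run with the updated message list.
    """
    messages.append({"role": "assistant", "content": assistant_text})
    found = False
    for match in TOOL_CALL_RE.finditer(assistant_text):
        found = True
        call = json.loads(match.group(1).strip())
        name = call["name"]
        tool = TOOLS.get(name)
        if tool is None:
            result = {"error": f"unknown tool: {name}"}
        else:
            result = tool(**call.get("arguments", {}))
        messages.append({
            "role": "tool",
            "name": name,  # must match the function name from the tool call
            # ensure_ascii=False keeps Sinhala and Tamil text readable
            "content": json.dumps(result, ensure_ascii=False),
        })
    return found


history = [{"role": "user", "content": "What is the current price of gold in Sri Lanka?"}]
output = (
    "<reasoning>This needs live data.</reasoning>\n"
    "<tool_call>\n"
    '{"name": "search_internet", "arguments": {"query": "gold price Sri Lanka"}}\n'
    "</tool_call>"
)
needs_rerun = handle_tool_calls(output, history)
print(needs_rerun, history[-1]["role"])
```

When `handle_tool_calls` returns `True`, re-apply the chat template to the updated list and generate again. Whether the chat template accepts the raw `<tool_call>` text as assistant content is an assumption worth checking against the tokenizer's template.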
+
+ ---
+
+ ## Quickstart Usage (Hugging Face Transformers)
+
+ You can run inference using standard Hugging Face tools. Ensure you have `transformers`, `torch`, and `accelerate` (required for `device_map="auto"`) installed:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "Chat2Find/chat2find-instruct-v1"
+
+ # Load model and tokenizer
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype=torch.float16,
+     device_map="auto"
+ )
+
+ # Set up your message list
+ messages = [
+     {
+         "role": "system",
+         "content": "You are a helpful assistant. Today's date is 2026-05-30. The user is located in Sri Lanka. Provide your reasoning inside <reasoning> tags and the final answer inside <answer> tags. You have access to tools. For any question that would benefit from up-to-date, real-time, or detailed external information, or for information that can change over time, prefer using the search_internet tool. When the user asks about specific individuals, companies, products, organizations, brands, or named entities, always use the search_internet tool to get accurate and current information. If no other specific tool is relevant to the user's query, always fall back to the search_internet tool."
+     },
+     {
+         "role": "user",
+         "content": "What is the difference between a solar eclipse and a lunar eclipse?"
+     }
+ ]
+
+ # Apply the chat template, then move the inputs to the device the model was placed on
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+ # Generate response
+ with torch.no_grad():
+     outputs = model.generate(
+         **inputs,
+         do_sample=True,
+         max_new_tokens=1024,
+         temperature=0.7,
+         top_p=0.9,
+         repetition_penalty=1.1,
+         eos_token_id=tokenizer.eos_token_id
+     )
+
+ # Decode only the newly generated tokens, skipping the prompt
+ response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
+ print(response)
+ ```
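
The system prompt above hard-codes a date and a location. A small helper can fill these in at request time. This is a sketch that assumes the model tolerates other dates and locations in the same sentence pattern, which the model card does not state explicitly; `SYSTEM_PROMPT_TEMPLATE` and `build_system_prompt` are illustrative names, and the wording is copied from the prompt shown in this README.

```python
from datetime import date

# Wording copied from the README's system prompt; only the date and the
# location are parameterized (an assumption, not a documented feature).
SYSTEM_PROMPT_TEMPLATE = (
    "You are a helpful assistant. Today's date is {today}. "
    "The user is located in {location}. "
    "Provide your reasoning inside <reasoning> tags and the final answer "
    "inside <answer> tags. You have access to tools. "
    "For any question that would benefit from up-to-date, real-time, or "
    "detailed external information, or for information that can change over "
    "time, prefer using the search_internet tool. "
    "When the user asks about specific individuals, companies, products, "
    "organizations, brands, or named entities, always use the "
    "search_internet tool to get accurate and current information. "
    "If no other specific tool is relevant to the user's query, always fall "
    "back to the search_internet tool."
)


def build_system_prompt(today: date, location: str) -> str:
    """Fill the date (ISO format, as in the README) and the user location."""
    return SYSTEM_PROMPT_TEMPLATE.format(today=today.isoformat(), location=location)


chat = [
    {"role": "system", "content": build_system_prompt(date.today(), "Sri Lanka")},
    {"role": "user", "content": "What is the difference between a solar eclipse and a lunar eclipse?"},
]
print(chat[0]["content"][:80])
```

The resulting `chat` list can be passed to `tokenizer.apply_chat_template` in place of the hand-written `messages` list.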
+
+ ---
+
+ ## License & Commercial Support
+
+ This model is licensed under the MIT License. The underlying training dataset is managed commercially by the Chat2Find Team.
+
+ For enterprise integration, tailored multilingual modeling, or custom data licensing queries, visit us at chat2find.com or reach out to our research leads.