
---
dataset_info:
  features:
  - name: system
    dtype: string
  - name: conversations
    sequence:
    - name: role
      dtype: string
    - name: content
      dtype: string
  splits:
  - name: train
    num_bytes: 0
    num_examples: 50 # update if you add more rows later
  download_size: 0
  dataset_size: 0
language:
- en
tags:
- bittensor
- flock
- consulting
- m&a
license: mit
---

	# Flock Dataset for Subnet 96 (M&A Consulting)

	This dataset is designed for use on Bittensor Subnet 96 (Flock) to train and evaluate models that generate high-quality consulting-style responses in the Mergers & Acquisitions (M&A) domain.
	It follows the JSONL structure required by Subnet 96 validators.

	---

	## 📂 Dataset Structure

	Each entry in the dataset is a JSON object stored in a `.jsonl` file.
	Format:

```json
{
  "system": "You are an expert M&A strategy consultant. Provide concise, bullet-point style answers.",
  "conversations": [
    {"role": "user", "content": "User input text"},
    {"role": "assistant", "content": "Assistant response text"}
  ]
}
```

- `system` → Defines the expert persona (M&A consultant).
- `user` → The client’s question or request.
- `assistant` → The consultant’s structured response.
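The field layout above can be checked mechanically before uploading. The following is a minimal sketch using only the standard library; the function name `validate_entry` and the `ALLOWED_ROLES` set are illustrative assumptions, and the actual checks run by Subnet 96 validators are not documented here and may differ.

```python
import json

# Assumption: only the two roles shown in the format above are expected.
ALLOWED_ROLES = {"user", "assistant"}


def validate_entry(line: str) -> list[str]:
    """Return a list of problems found in one JSONL line (empty if none)."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(entry, dict):
        return ["entry is not a JSON object"]

    problems = []
    if not isinstance(entry.get("system"), str):
        problems.append("'system' must be a string")

    conversations = entry.get("conversations")
    if not isinstance(conversations, list) or not conversations:
        problems.append("'conversations' must be a non-empty list")
        return problems

    for i, turn in enumerate(conversations):
        if not isinstance(turn, dict):
            problems.append(f"turn {i} is not an object")
            continue
        role = turn.get("role")
        if not isinstance(role, str) or role not in ALLOWED_ROLES:
            problems.append(f"turn {i} has unexpected role {role!r}")
        if not isinstance(turn.get("content"), str):
            problems.append(f"turn {i} 'content' must be a string")
    return problems


good = '{"system": "s", "conversations": [{"role": "user", "content": "q"}, {"role": "assistant", "content": "a"}]}'
bad = '{"system": "s", "conversations": [{"role": "bot", "content": 1}]}'
print(validate_entry(good))
print(validate_entry(bad))
```

Returning a list of messages rather than raising on the first problem makes it easier to report every issue in a file in one pass.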

## 🏗️ Example Entry

```json
{
  "system": "You are an expert M&A strategy consultant. Provide concise, bullet-point style answers.",
  "conversations": [
    {
      "role": "user",
      "content": "What are key sales synergy levers for an integration/merger in the consumer goods industry?"
    },
    {
      "role": "assistant",
      "content": "- Improving margins by taking the most generous contractual terms\n- Consolidate sale volumes to bring more scale to the (yearly) negotiations\n- Apply cross selling by offering additional products to existing/new clients"
    }
  ]
}
```
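An entry like the one above has to occupy exactly one line in the `.jsonl` file, even though the assistant answer contains several bullets. The sketch below shows one way to build and serialize such a record; `make_entry` and `to_jsonl_line` are hypothetical helper names, and the question and bullet texts are placeholders rather than dataset content.

```python
import json


def make_entry(system: str, question: str, bullets: list[str]) -> dict:
    """Build one record in the system + conversations layout."""
    return {
        "system": system,
        "conversations": [
            {"role": "user", "content": question},
            # Bullets are joined into a single string separated by newlines.
            {"role": "assistant", "content": "\n".join(f"- {b}" for b in bullets)},
        ],
    }


def to_jsonl_line(entry: dict) -> str:
    """Serialize a record as one line; json.dumps escapes embedded newlines."""
    return json.dumps(entry, ensure_ascii=False)


entry = make_entry(
    "You are an expert M&A strategy consultant. Provide concise, bullet-point style answers.",
    "Placeholder question",
    ["First placeholder point", "Second placeholder point"],
)
line = to_jsonl_line(entry)
print(line)
```

Appending `line + "\n"` to the data file adds the record without disturbing existing rows.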
## ✅ Requirements Compliance

- Follows Subnet 96 JSONL format (`system` + `conversations` array).
- Answers are structured in bullet points for validator readability.
- Focused on M&A consulting (pre-deal and post-deal).
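The bullet-point requirement can be spot-checked with a small helper. This is a sketch under the assumption that a bullet is a line starting with `- `, as in the examples in this card; `is_bulleted` is a hypothetical name, and validators may apply different or additional criteria.

```python
def is_bulleted(content: str) -> bool:
    """True if the answer has at least one line and every non-empty line starts with '- '."""
    lines = [ln for ln in content.split("\n") if ln.strip()]
    return bool(lines) and all(ln.lstrip().startswith("- ") for ln in lines)


print(is_bulleted("- Missing cross-functional alignment\n- Late sign-off from stakeholders"))
print(is_bulleted("A plain paragraph answer."))
```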

## 📊 Current Dataset Size

- Entries: ~50 Q&A pairs (v1.0)
- Format: JSONL (`dataset_sn96.jsonl`)
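To confirm that the entry count matches the `num_examples` value in the card metadata after adding rows, the file can be counted with the standard library. A minimal sketch follows; `count_entries` is a hypothetical helper, and the demo runs on a throwaway file so that it does not depend on the real data file being present.

```python
import json
import os
import tempfile


def count_entries(path: str) -> int:
    """Count the JSON objects in a JSONL file (raises on a malformed line)."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # tolerate blank lines, e.g. at the end of the file
            if isinstance(json.loads(line), dict):
                count += 1
    return count


# Demo on a temporary file with two records and a trailing blank line.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.jsonl")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write('{"system": "s", "conversations": []}\n')
        f.write('{"system": "s", "conversations": []}\n\n')
    demo_count = count_entries(demo_path)

print(demo_count)
```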

## 🚀 Usage

### Loading with the `datasets` library

```python
from datasets import load_dataset

# Download the dataset from the Hugging Face Hub
dataset = load_dataset("neihtmahp/flock_dataset")

# Print the first record of the train split
print(dataset["train"][0])
```
### Example Output

```python
{
    'system': 'You are an expert M&A strategy consultant. Provide concise, bullet-point style answers.',
    'conversations': [
        {'role': 'user', 'content': 'What are integration risks that are often underestimated?'},
        {'role': 'assistant', 'content': '- Missing cross-functional alignment\n- Not sufficient time to apply user acceptance testing\n- Late sign-off from stakeholders'}
    ]
}
```
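Depending on the installed `datasets` version and on how the `conversations` feature is declared, the column may come back as a list of dicts (as in the output above) or as a dict of lists; the `datasets` documentation notes that a `Sequence` over a dict of features is converted to a dictionary of lists. The sketch below accepts either shape and returns a flat chat-style message list; `to_messages` is a hypothetical helper, not part of the library.

```python
def to_messages(row: dict) -> list[dict]:
    """Return [system, user, assistant, ...] messages from one dataset row."""
    conv = row["conversations"]
    if isinstance(conv, dict):
        # Dict-of-lists form: {'role': [...], 'content': [...]}
        turns = [{"role": r, "content": c} for r, c in zip(conv["role"], conv["content"])]
    else:
        # List-of-dicts form: [{'role': ..., 'content': ...}, ...]
        turns = [{"role": t["role"], "content": t["content"]} for t in conv]
    return [{"role": "system", "content": row["system"]}] + turns


row_as_list = {
    "system": "sys",
    "conversations": [
        {"role": "user", "content": "q"},
        {"role": "assistant", "content": "a"},
    ],
}
row_as_dict = {
    "system": "sys",
    "conversations": {"role": ["user", "assistant"], "content": ["q", "a"]},
}
print(to_messages(row_as_list) == to_messages(row_as_dict))
```

Prepending the system prompt as its own message matches the input shape that many chat templates expect, which is a design choice here rather than a requirement stated by the dataset.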
## 📌 Version History

- v1.0 → Initial release with 50 curated Q&A entries.

Future versions will expand coverage of:

- Commercial due diligence
- IT due diligence
- Post-merger integration

## ✨ Acknowledgements

	This dataset was created for experimentation with Flock Subnet 96 mining and validation.
	Contributions welcome!
