---
dataset_info:
  features:
  - name: system
    dtype: string
  - name: conversations
    sequence:
      - name: role
        dtype: string
      - name: content
        dtype: string
  splits:
  - name: train
    num_bytes: 0
    num_examples: 50   # update if you add more rows later
  download_size: 0
  dataset_size: 0
language:
- en
tags:
- bittensor
- flock
- consulting
- m&a
license: mit
---

# Flock Dataset for Subnet 96 (M&A Consulting)

This dataset is designed for use on **Bittensor Subnet 96 (Flock)** to train and evaluate models that generate high-quality consulting-style responses in the **Mergers & Acquisitions (M&A)** domain.  
It follows the JSONL structure required by Subnet 96 validators.

---

## 📂 Dataset Structure

Each entry is a single JSON object on its own line of a `.jsonl` file, in the following format:

```json
{
  "system": "You are an expert M&A strategy consultant. Provide concise, bullet-point style answers.",
  "conversations": [
    {"role": "user", "content": "User input text"},
    {"role": "assistant", "content": "Assistant response text"}
  ]
}
```

- `system` → Defines the expert persona (M&A consultant).
- `user` → The client’s question or request.
- `assistant` → The consultant’s structured response.

## 🏗️ Example Entry

```json
{
  "system": "You are an expert M&A strategy consultant. Provide concise, bullet-point style answers.",
  "conversations": [
    {
      "role": "user",
      "content": "What are key sales synergy levers for an integration/merger in the consumer goods industry?"
    },
    {
      "role": "assistant",
      "content": "- Improving margins by taking the most generous contractual terms\n- Consolidate sale volumes to bring more scale to the (yearly) negotiations\n- Apply cross selling by offering additional products to existing/new clients"
    }
  ]
}
```

## ✅ Requirements Compliance

- Follows the Subnet 96 JSONL format (`system` + `conversations` array).
- Answers are structured as bullet points for validator readability.
- Focused on M&A consulting (pre-deal and post-deal).
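
The format requirements above can be checked locally before submitting. The following is a minimal sketch, not the Subnet 96 validator's actual logic: `validate_record` and `ALLOWED_ROLES` are hypothetical names, and the rule that every non-empty assistant line starts with `- ` is an assumption about what "bullet points" means here.

```python
from typing import Any

# Assumption: only these two roles appear inside `conversations`.
ALLOWED_ROLES = {"user", "assistant"}


def validate_record(record: dict[str, Any]) -> list[str]:
    """Return a list of problems found in one dataset record (empty if none)."""
    problems: list[str] = []

    system = record.get("system")
    if not isinstance(system, str) or not system:
        problems.append("'system' must be a non-empty string")

    conversations = record.get("conversations")
    if not isinstance(conversations, list) or not conversations:
        problems.append("'conversations' must be a non-empty list")
        return problems

    for i, turn in enumerate(conversations):
        if not isinstance(turn, dict):
            problems.append(f"turn {i} is not an object")
            continue
        role = turn.get("role")
        content = turn.get("content")
        if role not in ALLOWED_ROLES:
            problems.append(f"turn {i} has unexpected role {role!r}")
        if not isinstance(content, str) or not content.strip():
            problems.append(f"turn {i} has empty or non-string content")
        elif role == "assistant":
            # Assumed rule: every non-empty line of an answer is a "- " bullet.
            lines = [ln for ln in content.splitlines() if ln.strip()]
            if not all(ln.lstrip().startswith("- ") for ln in lines):
                problems.append(f"turn {i} is not fully bullet-pointed")

    return problems
```

An empty return value means the record passed all of these assumed checks; anything else lists what to fix.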

## 📊 Current Dataset Size

- Entries: 50 Q&A pairs (v1.0)
- Format: JSONL (`dataset_sn96.jsonl`)
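
To confirm the entry count against a local copy of the file, something like the sketch below may help. `read_jsonl` is a hypothetical helper, not part of any library, and it assumes the file is UTF-8 encoded with one JSON object per line.

```python
import json


def read_jsonl(path: str) -> list[dict]:
    """Parse a JSONL file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate trailing or stray blank lines
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                # Report which line is broken instead of a bare parse error.
                raise ValueError(f"line {line_number} is not valid JSON: {exc}") from exc
    return records
```

Calling `len(read_jsonl("dataset_sn96.jsonl"))` should then give the number of entries in the local file.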

## 🚀 Usage

### Loading with the `datasets` library

```python
from datasets import load_dataset

# Fetch the dataset from the Hugging Face Hub and print the first training example.
dataset = load_dataset("neihtmahp/flock_dataset")
print(dataset["train"][0])
```

### Example Output

```python
{
  'system': 'You are an expert M&A strategy consultant. Provide concise, bullet-point style answers.',
  'conversations': [
    {'role': 'user', 'content': 'What are integration risks that are often underestimated?'},
    {'role': 'assistant', 'content': '- Missing cross-functional alignment\n- Not sufficient time to apply user acceptance testing\n- Late sign-off from stakeholders'}
  ]
}
```
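
For chat-style fine-tuning it is often convenient to fold the `system` prompt into the turn list. The sketch below shows one way to do that; `to_messages` is a hypothetical helper. Depending on how the features are declared, `datasets` may return `conversations` as a dict of lists rather than a list of dicts, so the sketch accepts both shapes (an assumption about what you may see, not a statement about this dataset).

```python
def to_messages(record: dict) -> list[dict]:
    """Return the system prompt followed by the conversation turns."""
    conversations = record["conversations"]
    if isinstance(conversations, dict):
        # Column-oriented shape: {"role": [...], "content": [...]}
        turns = [
            {"role": role, "content": content}
            for role, content in zip(conversations["role"], conversations["content"])
        ]
    else:
        # Row-oriented shape: [{"role": ..., "content": ...}, ...]
        turns = [{"role": t["role"], "content": t["content"]} for t in conversations]
    return [{"role": "system", "content": record["system"]}, *turns]
```

Applied to a loaded example, `to_messages(dataset["train"][0])` would yield a list whose first element carries the M&A consultant persona.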

## 📌 Version History

- v1.0 → Initial release with 50 curated Q&A entries.

Future versions will expand coverage of:

- Commercial due diligence
- IT due diligence
- Post-merger integration

## ✨ Acknowledgements

This dataset was created for experimentation with Flock Subnet 96 mining and validation.
Contributions are welcome!
