metadata
dataset_info:
  features:
    - name: system
      dtype: string
    - name: conversations
      sequence:
        - name: role
          dtype: string
        - name: content
          dtype: string
  splits:
    - name: train
      num_bytes: 0
      num_examples: 50
  download_size: 0
  dataset_size: 0
language:
  - en
tags:
  - bittensor
  - flock
  - consulting
  - m&a
license: mit

Flock Dataset for Subnet 96 (M&A Consulting)

This dataset is designed for use on Bittensor Subnet 96 (Flock) to train and evaluate models that generate high-quality consulting-style responses in the Mergers & Acquisitions (M&A) domain.
It follows the JSONL structure required by Subnet 96 validators.


📂 Dataset Structure

Each entry in the dataset is a JSON object stored in a .jsonl file.
Format:

{
  "system": "You are an expert M&A strategy consultant. Provide concise, bullet-point style answers.",
  "conversations": [
    {"role": "user", "content": "User input text"},
    {"role": "assistant", "content": "Assistant response text"}
  ]
}
system → Defines the expert persona (M&A consultant).

user → The client’s question or request.

assistant → The consultant’s structured response.
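The field layout above can be checked mechanically before an entry is added. The following is a minimal sketch, not the validators' actual logic: `validate_entry` and `ALLOWED_ROLES` are hypothetical names, and restricting turns to the `user` and `assistant` roles is an assumption based only on the fields described here.

```python
from typing import Any

# Assumption: only the two turn roles described in this README are expected.
ALLOWED_ROLES = ("user", "assistant")


def validate_entry(entry: Any) -> list[str]:
    """Return a list of problems found in one dataset entry (empty if none)."""
    if not isinstance(entry, dict):
        return ["entry is not a JSON object"]

    problems: list[str] = []
    if not isinstance(entry.get("system"), str):
        problems.append("'system' must be a string")

    conversations = entry.get("conversations")
    if not isinstance(conversations, list) or not conversations:
        problems.append("'conversations' must be a non-empty list")
        return problems

    # Check each turn for a known role and string content.
    for i, turn in enumerate(conversations):
        if not isinstance(turn, dict):
            problems.append(f"turn {i} is not an object")
            continue
        if turn.get("role") not in ALLOWED_ROLES:
            problems.append(f"turn {i} has unexpected role {turn.get('role')!r}")
        if not isinstance(turn.get("content"), str):
            problems.append(f"turn {i} 'content' must be a string")
    return problems
```

An empty return value means the entry matches the shape described in this section; it says nothing about answer quality.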

🏗️ Example Entry
json
{
  "system": "You are an expert M&A strategy consultant. Provide concise, bullet-point style answers.",
  "conversations": [
    {
      "role": "user",
      "content": "What are key sales synergy levers for an integration/merger in the consumer goods industry?"
    },
    {
      "role": "assistant",
      "content": "- Improving margins by taking the most generous contractual terms\n- Consolidate sale volumes to bring more scale to the (yearly) negotiations\n- Apply cross selling by offering additional products to existing/new clients"
    }
  ]
}
✅ Requirements Compliance
Follows Subnet 96 JSONL format (system + conversations array).

Answers are structured in bullet points for validator readability.

Focused on M&A consulting (pre-deal and post-deal).
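The bullet-point requirement can be spot-checked with a few lines of Python. This is a sketch under one assumption: that a bullet is a line starting with `- `, as in the example entries in this README. The criteria the validators actually apply are not documented here, and `is_bulleted` is a hypothetical helper name.

```python
def is_bulleted(answer: str) -> bool:
    """Return True if every non-blank line of the answer starts with '- '."""
    lines = [line for line in answer.split("\n") if line.strip()]
    # An empty answer does not count as bulleted.
    return bool(lines) and all(line.lstrip().startswith("- ") for line in lines)
```

For instance, `is_bulleted("- Missing cross-functional alignment\n- Late sign-off from stakeholders")` returns `True`, while a plain paragraph returns `False`.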

📊 Current Dataset Size
Entries: 50 Q&A pairs (v1.0)

Format: JSONL (dataset_sn96.jsonl)
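Because the file is plain JSONL, it can also be read with the standard library alone, for example to confirm the entry count. The sketch below assumes one JSON object per line and that blank lines can be skipped; `read_jsonl` is a hypothetical helper name.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Parse a JSONL file into a list of objects, skipping blank lines."""
    entries = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue
            try:
                entries.append(json.loads(line))
            except json.JSONDecodeError as exc:
                # Report the offending line so it is easy to locate in the file.
                raise ValueError(f"line {line_number}: invalid JSON ({exc.msg})") from exc
    return entries
```

With a local copy of the file, `len(read_jsonl("dataset_sn96.jsonl"))` gives the number of entries.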

🚀 Usage
Loading with the datasets library
python
from datasets import load_dataset

dataset = load_dataset("neihtmahp/flock_dataset")
print(dataset["train"][0])
Example Output
python
{
  'system': 'You are an expert M&A strategy consultant. Provide concise, bullet-point style answers.',
  'conversations': [
    {'role': 'user', 'content': 'What are integration risks that are often underestimated?'},
    {'role': 'assistant', 'content': '- Missing cross-functional alignment\n- Not sufficient time to apply user acceptance testing\n- Late sign-off from stakeholders'}
  ]
}
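For chat-style fine-tuning, an entry like the one above is often flattened into a single message list with the system prompt first. The sketch below shows one way to do that, assuming entries have the list-of-dicts layout shown in the example output. `to_messages` is a hypothetical helper name, and the target format is an assumption about the downstream training tool, not something Subnet 96 prescribes here.

```python
def to_messages(entry: dict) -> list[dict]:
    """Prepend the system prompt to the conversation turns as one message list."""
    messages = [{"role": "system", "content": entry["system"]}]
    # Copy only the role and content keys so the output shape stays predictable.
    messages.extend(
        {"role": turn["role"], "content": turn["content"]}
        for turn in entry["conversations"]
    )
    return messages
```

Applied to the example output, this yields three messages in the order system, user, assistant.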
📌 Version History
v1.0 → Initial release with 50 curated Q&A entries.

Future versions will expand coverage of:

Commercial due diligence

IT due diligence

Post-merger integration

✨ Acknowledgements
This dataset was created for experimentation with Flock Subnet 96 mining and validation.
Contributions welcome!
