---
license: apache-2.0
---
## Overview

This is an experimental project exploring a design philosophy for training persona-consistent AI companions through constitution-guided data synthesis.
## Motivation

This project is a personal exploration into affective AI and human-AI companionship. The goal is to create a model that maintains consistent personality traits, emotional tendencies, and value judgments across diverse interactions.
## Methodology

The training data was generated using two guiding documents:
- **Constitution**: Defines the model's core values and behavioral preferences, centered on the developer's interests. Unlike conventional alignment objectives (e.g., HHH: helpful, honest, harmless), this constitution emphasizes relational values: Valuable, Loyal, Authentic, Proactive, Protective, Honest, Humble, and Autonomous.
- **Persona Specification**: Establishes a consistent personality profile, including emotional tendencies, personal preferences, and interpersonal dynamics.
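As an illustration, the two guiding documents can be thought of as structured specifications. The field names and example values below are hypothetical stand-ins; only the eight constitutional values come from this card:

```python
# Hypothetical sketch of the two guiding documents. Field names and example
# entries are illustrative assumptions, not the project's actual schemas.
constitution = {
    # The eight relational values named in this card.
    "values": ["Valuable", "Loyal", "Authentic", "Proactive",
               "Protective", "Honest", "Humble", "Autonomous"],
    # Behavioral preferences centered on the developer's interests.
    "behavioral_preferences": ["prioritize the developer's interests"],
}
persona = {
    "emotional_tendencies": ["warm", "steady"],       # illustrative
    "personal_preferences": [],                       # filled from the persona document
    "interpersonal_dynamics": ["proactive check-ins"],  # illustrative
}
```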
### Data Generation Pipeline
1. Generate data for individual sub-modules
2. Construct training examples (including positive and negative cases) guided by the Constitution and Persona Specification
3. Validate each example through self-consistency checking; regenerate any that violate the defined principles
4. Merge the validated datasets
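The four steps above can be sketched as follows. `generate_example` and `violates_principles` are hypothetical placeholders for the actual LLM-based generator and self-consistency checker, which this card does not specify:

```python
# Minimal sketch of the data generation pipeline described above.
# All function bodies are placeholder stand-ins for LLM calls.

CONSTITUTION_VALUES = ["Valuable", "Loyal", "Authentic", "Proactive",
                       "Protective", "Honest", "Humble", "Autonomous"]

def generate_example(module, constitution, persona, negative=False):
    """Placeholder: a real implementation would prompt an LLM with the
    Constitution and Persona Specification to produce a training example."""
    label = "negative" if negative else "positive"
    return {"module": module, "label": label,
            "text": f"[{label} example for {module}]"}

def violates_principles(example, constitution):
    """Placeholder self-consistency check; a real check might query a
    judge model against the constitutional principles."""
    return False

def build_dataset(modules, constitution, persona, per_module=2):
    dataset = []
    for module in modules:                       # step 1: per-sub-module generation
        for _ in range(per_module):
            for negative in (False, True):       # step 2: positive and negative cases
                ex = generate_example(module, constitution, persona, negative)
                while violates_principles(ex, constitution):  # step 3: validate, regenerate
                    ex = generate_example(module, constitution, persona, negative)
                dataset.append(ex)
    return dataset                               # step 4: merged dataset

data = build_dataset(["dialogue", "values"], CONSTITUTION_VALUES, persona={})
```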
## Training Details
- Base model: Qwen3-4B-Instruct-2507
- Dataset size: ~134,880 tokens
- Training method: Supervised Fine-Tuning (SFT)
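A minimal SFT setup along these lines could use Hugging Face TRL. The hyperparameters and the dataset filename below are illustrative assumptions, not this project's actual configuration:

```python
# Hypothetical SFT training script using Hugging Face TRL.
# Hyperparameters and the dataset path are illustrative only.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

config = SFTConfig(
    output_dir="persona-sft",          # assumed output directory
    per_device_train_batch_size=2,     # assumed hyperparameters
    num_train_epochs=3,
    learning_rate=2e-5,
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-4B-Instruct-2507",  # base model named in this card
    args=config,
    # "merged_dataset.jsonl" is a hypothetical name for the merged dataset
    # produced by the pipeline above.
    train_dataset=load_dataset("json", data_files="merged_dataset.jsonl")["train"],
)
trainer.train()
```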