NebulaeWis's picture
Update README.md
1a6f446 verified
metadata
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
base_model:
  - Qwen/Qwen3-0.6B-Base

这是一个基于 Qwen/Qwen3-0.6B-Base 进行指令微调的语言模型,专注于处理和生成与 动漫 图像标签体系相关的自然语言和标签数据。

模型详情

  • 基础模型: Qwen/Qwen3-0.6B-Base
  • 微调方法: 指令微调 (Instruction SFT)
  • 微调框架: LLaMA-Factory
  • 训练数据集: 由五个指令构成的共 42,268,080 条样本,平均长度 287 Token,构成了一个总计约 121 亿 Token 的大规模数据集。
  • 训练进度: 目前模型已在上述数据集中训练了 4,396,032,000 Token。
  • 硬件配置: 3 x NVIDIA GeForce RTX 4090
  • 赞助商: Myself
  • 上下文长度 (cutoff_len): 设为 768。这个长度覆盖了训练集中 99.5% 的样本。由于输入格式采用 XML 包裹,为避免破坏结构,超过此长度的训练样本被直接丢弃。
  • 评估损失 (Eval Loss):
任务 (Task) 评估损失 (Eval Loss)
eval_nltotag_loss 0.8972
eval_shorttolong_loss 1.2120
eval_tagdetail_loss 0.9317
eval_tagtonl_loss 1.2363
eval_tagtotag_loss 0.7396

与 Neta-Lumina 的协同设计

本模型是一个专为 Neta-Lumina模型 设计的文本处理引擎。

由于此语言模型与 Neta-Lumina 图像模型使用了同源的高质量自然语言-标签数据集进行训练,二者在数据理解上具有天然的一致性。这意味着:

  • 高度适配的理解能力: 本模型生成的标签 (Tags) 和自然语言描述 (Captions) 在风格、结构和细节上,与 Neta-Lumina 的“偏好”高度契合。
  • 释放 T2I 模型潜力: 使用本模型生成的精准提示词,可以更有效地引导 Neta-Lumina 创作出符合预期的、高质量的图像作品。

用于其他模型 (如 noobai-XL)

对于依赖标签的模型,本模型可以高效生成、补全和优化标签集。

  • 使用方式:
    1. 调用 <NLTOTAG>, <TAGTOTAG><TAGDETAIL> 指令。
    2. 编写一个简单的脚本,提取输出结果 XML<tag> 标签下的各类标签文本。
    3. 将提取的标签用 ", " 连接起来,形成适用于目标模型的提示词。

功能与任务

模型支持以下五种指令任务,所有输入和输出均需使用指定的 XML 格式包裹:

  1. 自然语言描述 → 标签 (<NLTOTAG>)

    • 功能: 将一段自然语言的图像描述(Caption)转换为一组标签。
  2. 标签 → 自然语言描述 (<TAGTONL>)

    • 功能: 将一组标签转换为一段详细、连贯的自然语言描述。
  3. 标签补全与优化 (<TAGTOTAG>)

    • 功能: 对一组不完整的标签进行补全和优化。训练时通过对完整的标签集进行高、中、低强度的随机丢弃来模拟不完整的输入。
  4. 标签扩增 (<TAGDETAIL>)

    • 功能: 将一组稀疏的核心标签(如 1girl、角色名等,少于10个)扩充为包含丰富细节的完整标签集(30个以上)。
  5. 短描述 → 长描述 (<SHORTTOLONG>)

    • 功能: 将一段简短的图像描述扩写成更详细、内容更丰富的长描述。

如何使用

请使用特定的指令来引导模型执行相应任务。

1. 自然语言描述 → 标签 (<NLTOTAG>)

  • 指令: <NLTOTAG>
  • 输入:
    <caption>This illustration features a young girl standing with a smile, looking directly at the viewer. She's wearing a white beret, and her long, light-colored hair is styled with pigtails, adorned with ribbons. She sports round glasses and a white collared shirt with long sleeves, layered with a light orange sweater vest. A pleated skirt that matches the ribbons, and thigh-highs complete her outfit. Her hands are clasped over her chest, possibly holding a book or other object. The background is plain white, emphasizing the character.</caption>
    
  • 输出:
    <tags><special>1girl</special><artists></artists><characters></characters><copyrights>original</copyrights><general>skirt, thighhighs, pleated_skirt, smile, white_background, hands_on_own_chest, looking_at_viewer, long_hair, glasses, solo, simple_background, long_sleeves, sketch</general><rating>safe</rating></tags>
    

2. 标签 → 自然语言描述 (<TAGTONL>)

  • 指令: <TAGTONL>
  • 输入:
    <tags><special>2girls</special><artists></artists><characters></characters><copyrights></copyrights><general>bracelet, casual, purple_hair, oekaki, outdoors, white_flower, blouse, jewelry, pendant, drill_hair, dandelion, closed_eyes, smile, short_sleeves, frills</general><rating>safe</rating></tags>
    
  • 输出:
    <caption>The image is a vibrant, anime-style illustration of two young women, likely teens, lying together on a bed of green grass dotted with dandelions. The composition features a close-up, overhead perspective. The first girl has brown drill hair styled in two high twintails, secured with pink bows featuring black polka dots, and she's dressed in a polka-dotted dress with frilly details and short sleeves, with a pink heart pendant necklace. She's smiling widely, eyes closed, showing a blush on her cheeks, and is looking at the other girl. The second girl, also wearing a similar style dress with a pink bow, and has pink hair, looks back at her with a smile, hugging her. Both girls have bracelets. The overall atmosphere is bright and cheerful, suggesting a moment of friendship and intimacy in a sunny outdoor setting.</caption>
    

3. 标签补全与优化 (<TAGTOTAG>)

  • 指令: <TAGTOTAG>
  • 输入 (部分标签):
    <tags><special>2girls</special><artists></artists><characters></characters><copyrights></copyrights><general>looking_at_another, hug, camisole, on_back, blush, oekaki, field, jewelry, dandelion_clock, on_grass, pendant, blouse, short_sleeves, dandelion, frills, bow, casual, smile, sleeveless, outdoors, brown_hair, pink_bow, hair_ribbon, polka_dot, shirt, short_hair, yellow_flower, lying, flower, closed_eyes, bracelet, drill_hair, sparkle, grass, on_side, purple_hair, ribbon, on_ground, white_flower</general><rating>safe</rating></tags>
    
  • 输出 (补全后的标签):
    <tags><special>2girls</special><artists></artists><characters></characters><copyrights></copyrights><general>closed_eyes, hair_ribbon, oekaki, sleeveless, sparkle, hug, pink_bow, white_flower, short_hair, looking_at_another, dandelion_clock, ribbon, pendant, flower, lying, purple_hair, bracelet, smile, bow, brown_hair, frills, blush, jewelry, short_sleeves, on_grass, casual, grass, outdoors, shirt, blouse, field, yellow_flower, camisole, on_back, twintails, polka_dot, on_ground, on_side, dandelion</general><rating>safe</rating></tags>
    

4. 标签扩增 (<TAGDETAIL>)

  • 指令: <TAGDETAIL>
  • 输入 (核心标签):
    <tags><special>1girl</special><artists></artists><characters>hatsune_miku</characters><copyrights>vocaloid</copyrights><general></general><rating>safe</rating></tags>
    
  • 输出 (扩增后标签):
    <tags><special>1girl</special><artists></artists><characters>hatsune_miku</characters><copyrights>vocaloid</copyrights><general>solo, long_hair, twintails, blue_hair, looking_at_viewer, smile, aqua_hair, hair_ornament, aqua_eyes, shirt, sleeveless, collar, necktie, official_alternate_costume, bare_shoulders, pleated_skirt, black_skirt, thighhighs, detached_sleeves, headphones, microphone</general><rating>safe</rating></tags>
    

5. 短描述 → 长描述 (<SHORTTOLONG>)

  • 指令: <SHORTTOLONG>
  • 输入 (短描述):
    <caption>A girl with blue pigtails.</caption>
    
  • 输出 (长描述):
    <caption>This illustration portrays a young woman, identified as Hatsune Miku from the Vocaloid series, characterized by her signature long, aqua-colored pigtails. She is depicted looking directly at the viewer with a friendly smile. Her outfit consists of a sleeveless grey top with a teal collar and tie, complemented by a black pleated skirt and thigh-high boots, which is her iconic attire. The simple background ensures that the focus remains entirely on the character.</caption>
    

已知问题

  • 训练数据中为了保证足够的 knowledge 引入,未有效过滤掉标签过少的样本,可能需要后续通过 DPO 方法提升 <TAGTOTAG><TAGDETAIL> 指令的输出长度和质量。