tiny-scribe / extraction_benchmark_results.json
Add extraction benchmark and results for small models
126dfa5
[
{
"model": "Falcon-H1 100M",
"repo_id": "tiiuae/Falcon-H1-100M-Base-GGUF",
"success": false,
"items_extracted": 0,
"response": "",
"error": "tiiuae/Falcon-H1-100M-Base-GGUF (repository not found)",
"time_seconds": 1.1495630741119385
},
{
"model": "Gemma-3 270M",
"repo_id": "google/gemma-3-270m-it-GGUF",
"success": false,
"items_extracted": 0,
"response": "",
"error": "google/gemma-3-270m-it-GGUF (repository not found)",
"time_seconds": 0.8675286769866943
},
{
"model": "Granite-4.0 350M",
"repo_id": "unsloth/granite-4.0-h-350m-GGUF",
"success": true,
"items_extracted": 8,
"response": "{\n \"action_items\": [\n \"對三星進行供應調整\",\n \"確認供應數量\"\n ],\n \"decisions\": [\n \"優先供應大客戶\",\n \"暫停接受新訂單\"\n ],\n \"key_points\": [\n \"AI和Service上的供應需求\",\n \"PCMOthers模組廠供應數量\"\n ],\n \"open_questions\": [\n \"三星是否有其他供應策略\",\n \"是否有其他供應方的應對策略\"\n ]\n}",
"error": "",
"time_seconds": 14.357566356658936,
"parsed_data": {
"decisions": [
"優先供應大客戶",
"暫停接受新訂單"
],
"action_items": [
"對三星進行供應調整",
"確認供應數量"
],
"key_points": [
"AI和Service上的供應需求",
"PCMOthers模組廠供應數量"
],
"open_questions": [
"三星是否有其他供應策略",
"是否有其他供應方的應對策略"
]
}
},
{
"model": "BitCPM4 0.5B",
"repo_id": "openbmb/BitCPM4-0.5B-GGUF",
"success": true,
"items_extracted": 9,
"response": "{\n \"action_items\": [\"今年的AI和Service供应\", \"今年的Mobile20供应\", \"今年的PCM供应\"],\n \"decisions\": [\"是否继续供应AI和Service\", \"是否停止供应AI和Service\", \"是否继续供应Mobile20和PCM\"],\n \"key_points\": [\"明年是否继续供应AI和Service\", \"明年是否继续供应Mobile20和PCM\"],\n \"open_questions\": [\"是否继续供应AI和Service\"]\n}",
"error": "",
"time_seconds": 4.5774476528167725,
"parsed_data": {
"decisions": [
"是否继续供应AI和Service",
"是否停止供应AI和Service",
"是否继续供应Mobile20和PCM"
],
"action_items": [
"今年的AI和Service供应",
"今年的Mobile20供应",
"今年的PCM供应"
],
"key_points": [
"明年是否继续供应AI和Service",
"明年是否继续供应Mobile20和PCM"
],
"open_questions": [
"是否继续供应AI和Service"
]
}
},
{
"model": "Qwen3 0.6B",
"repo_id": "unsloth/Qwen3-0.6B-GGUF",
"success": true,
"items_extracted": 9,
"response": "<think>\n好的,我需要处理用户提供的会议 transcript 并按照要求提取关键資訊。首先,我要仔细阅读对话内容,理解讨论的主题和涉及的公司或产品。\n\n首先,用户提到三星在去年Q3时计划将产出50%用于AI和服务,而25年在Mobile20的PCM模块厂是PCMOthers。26年的供应比25年在PCMOthers少了15%。现在的情况是,三星可能缺到了8年,加上昨天看到的SanDisk的情况,说明可能需要考虑供应链问题。接下来,SPEAKER_03提到凯力士的情况类似,而SPEAKER_02回应说vivo不会停产,但PCMOthers会停,但vivo不会停。这里可能涉及到供应链调整和产品线调整。\n\n接下来,我需要按照用户的要求提取关键資訊。首先,action_items是具体的行动项目,比如讨论PCMOthers的供应情况和调整策略。decisions包括优先考虑供应大客户和暂停新订单。key_points包括DDR4的缺货和AI需求占比,以及两家公司的策略调整。open_questions可能包括价格预测和测试场良率确认。\n\n需要确保每个类别有1-5个项目,并且是具体的完整...",
"error": "",
"time_seconds": 13.032857418060303,
"parsed_data": {
"decisions": [
"優先供應大客戶浪潮",
"暫停接受新訂單"
],
"action_items": [
"與三星討論PCMOthers的供應調整",
"確認LPDDR4供應數量"
],
"key_points": [
"DDR4缺貨持續到2028年",
"AI需求占全球產能45%",
"美光可能跟進SanDisk付款條件"
],
"open_questions": [
"Q2價格漲幅預估",
"深圳測試場良率確認"
]
}
},
{
"model": "Granite 3.1 1B",
"repo_id": "bartowski/granite-3.1-1b-a400m-instruct-GGUF",
"success": true,
"items_extracted": 12,
"response": "{\n \"action_items\": [\"三星減產PCMOthers塊\", \"vivo不會停產\", \"SanDisk可能停產\"],\n \"decisions\": [\"三星減產PCMOthers塊\", \"vivo不會停產\", \"SanDisk可能停產\"],\n \"key_points\": [\"PCMOthers塊供應量將減少8年\", \"vivo不會停產\", \"SanDisk可能停產\"],\n \"open_questions\": [\"是否會停產PCMOthers塊\", \"是否會停產vivo\", \"SanDisk是否會停產\"]\n}",
"error": "",
"time_seconds": 11.540041446685791,
"parsed_data": {
"decisions": [
"三星減產PCMOthers塊",
"vivo不會停產",
"SanDisk可能停產"
],
"action_items": [
"三星減產PCMOthers塊",
"vivo不會停產",
"SanDisk可能停產"
],
"key_points": [
"PCMOthers塊供應量將減少8年",
"vivo不會停產",
"SanDisk可能停產"
],
"open_questions": [
"是否會停產PCMOthers塊",
"是否會停產vivo",
"SanDisk是否會停產"
]
}
},
{
"model": "Falcon-H1 1.5B",
"repo_id": "unsloth/Falcon-H1-1.5B-Deep-Instruct-GGUF",
"success": true,
"items_extracted": 4,
"response": "```json\n{\n \"action_items\": [],\n \"decisions\": [],\n \"key_points\": [\"三星產出50%將投入AI和服務領域\", \"PCMOthers分區25%將投入Mobile,20%投入PCM模組廠\"],\n \"open_questions\": [\"對於三星減產或停產的影響\", \"DDRName產品與SanDisk的競爭情況\"]\n}\n```",
"error": "",
"time_seconds": 24.45098614692688,
"parsed_data": {
"decisions": [],
"action_items": [],
"key_points": [
"三星產出50%將投入AI和服務領域",
"PCMOthers分區25%將投入Mobile,20%投入PCM模組廠"
],
"open_questions": [
"對於三星減產或停產的影響",
"DDRName產品與SanDisk的競爭情況"
]
}
},
{
"model": "Qwen3 1.7B",
"repo_id": "unsloth/Qwen3-1.7B-GGUF",
"success": true,
"items_extracted": 12,
"response": "<think>\n好的,我需要分析用户提供的会议逐字稿,并按照要求提取关键信息。首先,我要理解对话内容。会议涉及三星的产能调整,特别是PCMOthers板块的减产,以及这对供应和需求的影响。\n\n接下来,识别决策点。会议中提到三星在2026年减少PCMOthers的产能,这可能影响供应。另外,决定暂停接受新订单,以应对供应短缺。还有关于价格调整的讨论,但可能没有明确决策。\n\n行动项目方面,需要跟进三星的产能调整,确认LPDDR4的供应数量,以及与vivo的沟通。此外,可能需要确认价格调整的细节。\n\n关键点包括市场趋势(AI和Mobile需求增长),供应状况(短缺持续到2028年),策略调整(暂停新订单)。\n\n未解决的问题可能涉及价格涨幅预估和深圳测试场的良率确认。\n</think>\n\n{\n \"action_items\": [\"與三星確認2026年PCMOthers產能調整細節\", \"確認LPDDR4供應數量及到貨時間\", \"與vivo協調成品供應穩定策略\"],\n \"decisions\": [\"暫停接受新訂單以應對供應缺口\", \"三星2026年PCMOthers產能減產15%\"],\n ...",
"error": "",
"time_seconds": 19.010337352752686,
"parsed_data": {
"decisions": [
"暫停接受新訂單以應對供應缺口",
"三星2026年PCMOthers產能減產15%"
],
"action_items": [
"與三星確認2026年PCMOthers產能調整細節",
"確認LPDDR4供應數量及到貨時間",
"與vivo協調成品供應穩定策略"
],
"key_points": [
"DDR4缺貨持續至2028年",
"AI需求占全球產能45%",
"美光可能跟進SanDisk付款條件",
"價格調整需與vivo協商"
],
"open_questions": [
"Q2價格漲幅預估",
"深圳測試場良率確認",
"vivo供應鏈風險評估"
]
}
}
]