Upload CB_Image-Caption_Batch.json
Hey ComfyUI world! I've built a ComfyUI workflow that batch-processes images to generate captions with BLIP Analyze Image and Florence2Run, then saves each image with its own .txt caption file using ImageBatchSaver, which is handy for building AI training datasets. Check it out on my site: makuta.io/comfyui-batch-captioning.
Workflow Breakdown:
LoadImagesFromFolderKJ (from comfyui-kjnodes): Loads up to 6 images from a folder (e.g., C:\Users\clement\ComfyUI\output\Character02).
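BLIP Analyze Image (from was-ns): Generates basic captions (min_length 48, max_length 96; these are generation-length settings, not character counts). The BLIP model is loaded on CPU, so no GPU is needed.
Florence2Run (from comfyui-florence2): Adds detailed captions (task: "more_detailed_caption"; Florence2ModelLoader loads Florence-2-base with fp16 precision and sdpa attention). In the shared JSON this branch is bypassed and only the BLIP captions are wired into ImageBatchSaver.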
PreviewImage (comfy-core): Visualizes images for quick checks.
ImageBatchSaver (from the ComfyUI-Batch-Process pack by Zar4X): Saves each image with a companion .txt file (e.g., Alix_0001.png + Alix_0001.txt).
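To make the image + caption pairing concrete, here is a minimal Python sketch of the naming scheme seen in the sample outputs (prefix "Alix", delimiter "_", 4-digit number, png extension). The function name companion_paths and its parameters are made up for illustration; this is not ImageBatchSaver's actual code, only a guess at the pattern it produces.

```python
from pathlib import Path


def companion_paths(output_dir, prefix, count, delimiter="_", padding=4, extension="png"):
    """Return (image_path, caption_path) pairs, numbered from 1.

    Hypothetical helper: it mirrors the Alix_0001.png + Alix_0001.txt
    pattern from the post, not the node's real implementation.
    """
    folder = Path(output_dir)
    pairs = []
    for number in range(1, count + 1):
        # Zero-pad the running number to the requested width
        stem = f"{prefix}{delimiter}{number:0{padding}d}"
        pairs.append((folder / f"{stem}.{extension}", folder / f"{stem}.txt"))
    return pairs


pairs = companion_paths("Character02", "Alix", count=6)
for image_path, caption_path in pairs:
    print(image_path.name, "+", caption_path.name)
```

Most LoRA trainers expect exactly this layout: the caption file shares the image's stem and sits in the same folder.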
Critical Path-Saving Tip:
Connecting LoadImagesFromFolderKJ’s image_path output to ImageBatchSaver’s output_path input auto-derives filenames from input images and saves to ComfyUI’s default output folder (e.g., C:\Users\[user]\ComfyUI\output\Character02). Great for quick tests!
For a custom folder (e.g., a dedicated training dir), skip the connection and manually enter the full filepath in ImageBatchSaver’s output_path widget (e.g., D:\MyDatasets\Character02). This ensures precise control.
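Whichever path option you use, it is worth checking the output folder after a run. The sketch below is a small standalone Python check (not part of the workflow) that lists images lacking a non-empty, same-named .txt file. The function name, the extension set, and the temporary demo folder are my own assumptions for illustration.

```python
import tempfile
from pathlib import Path

# Assumed set of image types; adjust to whatever your dataset contains
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def images_missing_captions(dataset_dir):
    """List image files that have no non-empty .txt caption next to them."""
    missing = []
    for path in sorted(Path(dataset_dir).iterdir()):
        if path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        caption = path.with_suffix(".txt")
        if not caption.is_file() or not caption.read_text(encoding="utf-8").strip():
            missing.append(path.name)
    return missing


# Demo on a throwaway folder so the sketch runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "Alix_0001.png").write_bytes(b"")
    (folder / "Alix_0001.txt").write_text("a drawing of a man in a red shirt", encoding="utf-8")
    (folder / "Alix_0002.png").write_bytes(b"")  # no caption written for this one
    demo_missing = images_missing_captions(folder)

print(demo_missing)
```

For a real dataset, call images_missing_captions with the folder you entered in output_path (for example the D:\MyDatasets\Character02 folder mentioned above).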
Download the JSON and full setup guide at makuta.io/comfyui-batch-captioning. Install the nodes via ComfyUI Manager: search for comfyui-kjnodes, was-ns, comfyui-florence2, batch-process, and comfyui-easy-use (for the easy showAnything previews). The JSON also contains a Fast Groups Bypasser (rgthree) node, which comes from the rgthree-comfy pack.
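If you want to double-check which packs a workflow needs before opening it, the exported JSON records a cnr_id under each node's properties. Here is a rough Python sketch that groups node types by that field. The function name node_packs and the fallback label are mine, and the inline sample is a tiny stand-in with the same shape as the real file (which also has a Fast Groups Bypasser (rgthree) node with no cnr_id recorded).

```python
import json
from collections import defaultdict


def node_packs(workflow):
    """Group node types by the cnr_id stored in each node's properties."""
    packs = defaultdict(set)
    for node in workflow.get("nodes", []):
        # Nodes without a recorded cnr_id fall under a placeholder label
        pack = node.get("properties", {}).get("cnr_id", "(no cnr_id recorded)")
        packs[pack].add(node["type"])
    return {pack: sorted(types) for pack, types in packs.items()}


# Stand-in data; for the real file, load it with json.load() instead
sample = json.loads("""
{"nodes": [
  {"type": "BLIP Analyze Image", "properties": {"cnr_id": "was-ns"}},
  {"type": "Florence2Run", "properties": {"cnr_id": "comfyui-florence2"}},
  {"type": "Fast Groups Bypasser (rgthree)", "properties": {}}
]}
""")

result = node_packs(sample)
for pack, types in sorted(result.items()):
    print(f"{pack}: {', '.join(types)}")
```

Nodes listed under comfy-core ship with ComfyUI itself, so they need no extra install.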
What do you think? Ideas for LoRA or video batching? Join the discussion on my site or share your remixes! makuta.io/comfyui-batch-captioning
#ComfyUI #StableDiffusion #AICaptions


Workflow BLIP

Workflow FLORENCE2RUN

{"id":"82917733-ef11-49f0-81b7-f4a063610f39","revision":0,"last_node_id":66,"last_link_id":219,"nodes":[
{"id":2,"type":"BLIP Model Loader","pos":[-146.51307678222656,-272.45977783203125],"size":[270,106],"flags":{},"order":0,"mode":0,"inputs":[{"localized_name":"blip_model","name":"blip_model","type":"STRING","widget":{"name":"blip_model"},"link":null},{"localized_name":"vqa_model_id","name":"vqa_model_id","type":"STRING","widget":{"name":"vqa_model_id"},"link":null},{"localized_name":"device","name":"device","type":"COMBO","widget":{"name":"device"},"link":null}],"outputs":[{"localized_name":"BLIP_MODEL","name":"BLIP_MODEL","type":"BLIP_MODEL","links":[3]}],"properties":{"cnr_id":"was-ns","ver":"3.0.0","Node name for S&R":"BLIP Model Loader","ue_properties":{"widget_ue_connectable":{"blip_model":true,"vqa_model_id":true,"device":true},"version":"7.1","input_ue_unconnectable":{}}},"widgets_values":["Salesforce/blip-image-captioning-base","Salesforce/blip-vqa-base","cpu"]},
{"id":3,"type":"BLIP Analyze Image","pos":[178.65325927734375,-271.7433166503906],"size":[400,252],"flags":{},"order":5,"mode":0,"inputs":[{"localized_name":"images","name":"images","type":"IMAGE","link":1},{"localized_name":"blip_model","name":"blip_model","type":"BLIP_MODEL","link":3},{"localized_name":"mode","name":"mode","type":"COMBO","widget":{"name":"mode"},"link":null},{"localized_name":"question","name":"question","type":"STRING","widget":{"name":"question"},"link":null},{"localized_name":"min_length","name":"min_length","shape":7,"type":"INT","widget":{"name":"min_length"},"link":null},{"localized_name":"max_length","name":"max_length","shape":7,"type":"INT","widget":{"name":"max_length"},"link":null},{"localized_name":"num_beams","name":"num_beams","shape":7,"type":"INT","widget":{"name":"num_beams"},"link":null},{"localized_name":"no_repeat_ngram_size","name":"no_repeat_ngram_size","shape":7,"type":"INT","widget":{"name":"no_repeat_ngram_size"},"link":null},{"localized_name":"early_stopping","name":"early_stopping","shape":7,"type":"BOOLEAN","widget":{"name":"early_stopping"},"link":null}],"outputs":[{"localized_name":"FULL_CAPTIONS","name":"FULL_CAPTIONS","type":"STRING","links":[]},{"localized_name":"CAPTIONS","name":"CAPTIONS","shape":6,"type":"STRING","links":[213,214]}],"properties":{"cnr_id":"was-ns","ver":"3.0.0","Node name for S&R":"BLIP Analyze Image","ue_properties":{"widget_ue_connectable":{"mode":true,"question":true,"min_length":true,"max_length":true,"num_beams":true,"no_repeat_ngram_size":true,"early_stopping":true},"version":"7.1","input_ue_unconnectable":{}}},"widgets_values":["caption","",48,96,3,3,false]},
{"id":63,"type":"easy showAnything","pos":[626.519775390625,-165.16578674316406],"size":[412.5000305175781,594.498779296875],"flags":{},"order":11,"mode":0,"inputs":[{"localized_name":"anything","name":"anything","shape":7,"type":"*","link":213}],"outputs":[{"localized_name":"output","name":"output","type":"*","links":[]}],"properties":{"cnr_id":"comfyui-easy-use","ver":"1.3.2","Node name for S&R":"easy showAnything","ue_properties":{"widget_ue_connectable":{},"version":"7.1","input_ue_unconnectable":{}}},"widgets_values":["a painting of a man in a yellow vest and blue shirt, with his hand on his head, with a blue sky in the background, and orange and yellow sky above the man is a black and white text that says ` ` ` person '","a comic strip with a drawing of a man in a red shirt and a sign that says, ' i ' m is for you ' s not a man ' s a man, i ' s, ' s ' ' ' m '","a drawing of a man in a red shirt with his head tilted to the side, looking at the camera, with a green background behind him and and a drawing on a man of a drawing to a man with a man ' person ' s","a cartoon of a man in a red shirt and a green background with the words, ` ` ' ', ' ' ' and ' ' person ' person, ', and ', `, ' and `, and ` `, `","a painting of a woman smoking a cigarette in front of a stone wall, with the words, ` ` ` ' ', ` ', ' ' ' ` `, ' and ' ' person, ' `, `, and '","a drawing of a man with a beard and a beard on his head, and a picture of a person ' s face in the background, and the text that says, ` ` ` ' ' ' person ' person, ' ', '"]},
{"id":58,"type":"easy showAnything","pos":[850.0166015625,-589.4692993164062],"size":[234.62838745117188,88],"flags":{},"order":8,"mode":0,"inputs":[{"localized_name":"anything","name":"anything","shape":7,"type":"*","link":180}],"outputs":[{"localized_name":"output","name":"output","type":"*","links":[195]}],"properties":{"cnr_id":"comfyui-easy-use","ver":"1.3.2","Node name for S&R":"easy showAnything","ue_properties":{"widget_ue_connectable":{},"version":"7.1","input_ue_unconnectable":{}}},"widgets_values":["6"]},
{"id":47,"type":"easy showAnything","pos":[845.7152099609375,-1051.462646484375],"size":[351.42889404296875,358],"flags":{},"order":9,"mode":0,"inputs":[{"localized_name":"anything","name":"anything","shape":7,"type":"*","link":160}],"outputs":[{"localized_name":"output","name":"output","type":"*","links":[]}],"properties":{"cnr_id":"comfyui-easy-use","ver":"1.3.2","Node name for S&R":"easy showAnything","ue_properties":{"widget_ue_connectable":{},"version":"7.1","input_ue_unconnectable":{}}},"widgets_values":["C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix(01).jpg","C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix(02).jpg","C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix(03).jpg","C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix(04).jpg","C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix(05).jpg","C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix(06).jpg"]},
{"id":1,"type":"LoadImagesFromFolderKJ","pos":[-147.21127319335938,-630.7935180664062],"size":[353.43194580078125,263.65557861328125],"flags":{},"order":2,"mode":0,"inputs":[{"localized_name":"folder","name":"folder","type":"STRING","widget":{"name":"folder"},"link":null},{"localized_name":"width","name":"width","type":"INT","widget":{"name":"width"},"link":null},{"localized_name":"height","name":"height","type":"INT","widget":{"name":"height"},"link":null},{"localized_name":"keep_aspect_ratio","name":"keep_aspect_ratio","type":"COMBO","widget":{"name":"keep_aspect_ratio"},"link":null},{"localized_name":"image_load_cap","name":"image_load_cap","shape":7,"type":"INT","widget":{"name":"image_load_cap"},"link":null},{"localized_name":"start_index","name":"start_index","shape":7,"type":"INT","widget":{"name":"start_index"},"link":null},{"localized_name":"include_subfolders","name":"include_subfolders","shape":7,"type":"BOOLEAN","widget":{"name":"include_subfolders"},"link":null}],"outputs":[{"localized_name":"image","name":"image","type":"IMAGE","links":[1,4,183,200]},{"localized_name":"mask","name":"mask","type":"MASK","links":null},{"localized_name":"count","name":"count","type":"INT","links":[180]},{"localized_name":"image_path","name":"image_path","type":"STRING","links":[160]}],"properties":{"cnr_id":"comfyui-kjnodes","ver":"468fcc86f0b29e79a8510e8239eb15714d6747a6","Node name for S&R":"LoadImagesFromFolderKJ","ue_properties":{"widget_ue_connectable":{"folder":true,"width":true,"height":true,"keep_aspect_ratio":true,"image_load_cap":true,"start_index":true,"include_subfolders":true},"version":"7.1","input_ue_unconnectable":{}}},"widgets_values":["C:\\Users\\clement\\ComfyUI\\output\\Character02",900,900,"stretch",6,0,false]},
{"id":60,"type":"ImageBatchSaver","pos":[1113.995361328125,-258.309326171875],"size":[306.1015625,246],"flags":{},"order":12,"mode":0,"inputs":[{"localized_name":"images","name":"images","shape":7,"type":"IMAGE","link":200},{"localized_name":"contents","name":"contents","shape":7,"type":"STRING","link":214},{"localized_name":"output_path","name":"output_path","shape":7,"type":"STRING","widget":{"name":"output_path"},"link":null},{"localized_name":"filename_prefix","name":"filename_prefix","shape":7,"type":"STRING","widget":{"name":"filename_prefix"},"link":null},{"localized_name":"filename_delimiter","name":"filename_delimiter","shape":7,"type":"STRING","widget":{"name":"filename_delimiter"},"link":null},{"localized_name":"filename_suffix","name":"filename_suffix","shape":7,"type":"STRING","widget":{"name":"filename_suffix"},"link":null},{"localized_name":"extension","name":"extension","shape":7,"type":"COMBO","widget":{"name":"extension"},"link":null},{"localized_name":"filename_number_padding","name":"filename_number_padding","shape":7,"type":"INT","widget":{"name":"filename_number_padding"},"link":195},{"localized_name":"filename_number","name":"filename_number","shape":7,"type":"COMBO","widget":{"name":"filename_number"},"link":null},{"localized_name":"embeded_workflow","name":"embeded_workflow","shape":7,"type":"BOOLEAN","widget":{"name":"embeded_workflow"},"link":null}],"outputs":[{"localized_name":"images","name":"images","type":"IMAGE","links":[]},{"localized_name":"file_paths","name":"file_paths","type":"STRING","links":[207]}],"properties":{"cnr_id":"batch-process","ver":"1.0.5","Node name for S&R":"ImageBatchSaver","ue_properties":{"widget_ue_connectable":{"output_path":true,"filename_prefix":true,"filename_delimiter":true,"filename_suffix":true,"extension":true,"filename_number_padding":true,"filename_number":true,"embeded_workflow":true},"version":"7.1","input_ue_unconnectable":{}}},"widgets_values":["C:\\Users\\clement\\ComfyUI\\output\\Character02","Alix","_","","png",4,"end",true]},
{"id":61,"type":"easy showAnything","pos":[1512.129150390625,-240.0951385498047],"size":[506.25299072265625,682],"flags":{},"order":13,"mode":0,"inputs":[{"localized_name":"anything","name":"anything","shape":7,"type":"*","link":207}],"outputs":[{"localized_name":"output","name":"output","type":"*","links":[]}],"properties":{"cnr_id":"comfyui-easy-use","ver":"1.3.2","Node name for S&R":"easy showAnything","ue_properties":{"widget_ue_connectable":{},"version":"7.1","input_ue_unconnectable":{}}},"widgets_values":["C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix_0001.png","C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix_0001.txt","C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix_0002.png","C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix_0002.txt","C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix_0003.png","C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix_0003.txt","C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix_0004.png","C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix_0004.txt","C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix_0005.png","C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix_0005.txt","C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix_0006.png","C:\\Users\\clement\\ComfyUI\\output\\Character02\\Alix_0006.txt"]},
{"id":5,"type":"PreviewImage","pos":[223.262451171875,-1112.52294921875],"size":[586.657958984375,434.2114562988281],"flags":{},"order":6,"mode":0,"inputs":[{"localized_name":"images","name":"images","type":"IMAGE","link":4}],"outputs":[],"properties":{"cnr_id":"comfy-core","ver":"0.3.59","Node name for S&R":"PreviewImage","ue_properties":{"widget_ue_connectable":{},"version":"7.1","input_ue_unconnectable":{}}},"widgets_values":[]},
{"id":66,"type":"PreviewImage","pos":[1582.8232421875,-1148.1600341796875],"size":[439.16131591796875,786.3973388671875],"flags":{},"order":10,"mode":0,"inputs":[{"localized_name":"images","name":"images","type":"IMAGE","link":219}],"outputs":[],"properties":{"cnr_id":"comfy-core","ver":"0.3.59","Node name for S&R":"PreviewImage","ue_properties":{"widget_ue_connectable":{},"version":"7.1","input_ue_unconnectable":{}}},"widgets_values":[]},
{"id":64,"type":"LoadImagesFromFolderKJ","pos":[1238.237060546875,-1150.4595947265625],"size":[313.09613037109375,266.51519775390625],"flags":{},"order":4,"mode":0,"inputs":[{"localized_name":"folder","name":"folder","type":"STRING","widget":{"name":"folder"},"link":null},{"localized_name":"width","name":"width","type":"INT","widget":{"name":"width"},"link":null},{"localized_name":"height","name":"height","type":"INT","widget":{"name":"height"},"link":null},{"localized_name":"keep_aspect_ratio","name":"keep_aspect_ratio","type":"COMBO","widget":{"name":"keep_aspect_ratio"},"link":null},{"localized_name":"image_load_cap","name":"image_load_cap","shape":7,"type":"INT","widget":{"name":"image_load_cap"},"link":null},{"localized_name":"start_index","name":"start_index","shape":7,"type":"INT","widget":{"name":"start_index"},"link":null},{"localized_name":"include_subfolders","name":"include_subfolders","shape":7,"type":"BOOLEAN","widget":{"name":"include_subfolders"},"link":null}],"outputs":[{"localized_name":"image","name":"image","type":"IMAGE","links":[219]},{"localized_name":"mask","name":"mask","type":"MASK","links":null},{"localized_name":"count","name":"count","type":"INT","links":[]},{"localized_name":"image_path","name":"image_path","type":"STRING","links":[]}],"properties":{"cnr_id":"comfyui-kjnodes","ver":"468fcc86f0b29e79a8510e8239eb15714d6747a6","Node name for S&R":"LoadImagesFromFolderKJ","ue_properties":{"widget_ue_connectable":{"folder":true,"width":true,"height":true,"keep_aspect_ratio":true,"image_load_cap":true,"start_index":true,"include_subfolders":true},"version":"7.1","input_ue_unconnectable":{}}},"widgets_values":["C:\\Users\\clement\\ComfyUI\\output\\Screens",400,804,"crop",6,0,false]},
{"id":23,"type":"Fast Groups Bypasser (rgthree)","pos":[1283.589599609375,-671.01025390625],"size":[236.74765014648438,130],"flags":{},"order":3,"mode":0,"inputs":[],"outputs":[{"name":"OPT_CONNECTION","type":"*","links":[]}],"properties":{"matchColors":"","matchTitle":"","showNav":true,"showAllGraphs":true,"sort":"position","customSortAlphabet":"","toggleRestriction":"default","ue_properties":{"widget_ue_connectable":{},"version":"7.1","input_ue_unconnectable":{}}}},
{"id":56,"type":"Florence2ModelLoader","pos":[-144.4396514892578,34.29420852661133],"size":[288.740234375,130],"flags":{},"order":1,"mode":4,"inputs":[{"localized_name":"lora","name":"lora","shape":7,"type":"PEFTLORA","link":null},{"localized_name":"model","name":"model","type":"COMBO","widget":{"name":"model"},"link":null},{"localized_name":"precision","name":"precision","type":"COMBO","widget":{"name":"precision"},"link":null},{"localized_name":"attention","name":"attention","type":"COMBO","widget":{"name":"attention"},"link":null},{"localized_name":"convert_to_safetensors","name":"convert_to_safetensors","shape":7,"type":"BOOLEAN","widget":{"name":"convert_to_safetensors"},"link":null}],"outputs":[{"localized_name":"florence2_model","name":"florence2_model","type":"FL2MODEL","links":[176]}],"properties":{"cnr_id":"comfyui-florence2","ver":"1.0.6","Node name for S&R":"Florence2ModelLoader","ue_properties":{"widget_ue_connectable":{"model":true,"precision":true,"attention":true,"convert_to_safetensors":true},"version":"7.1","input_ue_unconnectable":{}}},"widgets_values":["Florence-2-base","fp16","sdpa",false]},
{"id":54,"type":"Florence2Run","pos":[179.13333129882812,32.334320068359375],"size":[403.0101318359375,398.6165466308594],"flags":{},"order":7,"mode":4,"inputs":[{"localized_name":"image","name":"image","type":"IMAGE","link":183},{"localized_name":"florence2_model","name":"florence2_model","type":"FL2MODEL","link":176},{"localized_name":"text_input","name":"text_input","type":"STRING","widget":{"name":"text_input"},"link":null},{"localized_name":"task","name":"task","type":"COMBO","widget":{"name":"task"},"link":null},{"localized_name":"fill_mask","name":"fill_mask","type":"BOOLEAN","widget":{"name":"fill_mask"},"link":null},{"localized_name":"keep_model_loaded","name":"keep_model_loaded","shape":7,"type":"BOOLEAN","widget":{"name":"keep_model_loaded"},"link":null},{"localized_name":"max_new_tokens","name":"max_new_tokens","shape":7,"type":"INT","widget":{"name":"max_new_tokens"},"link":null},{"localized_name":"num_beams","name":"num_beams","shape":7,"type":"INT","widget":{"name":"num_beams"},"link":null},{"localized_name":"do_sample","name":"do_sample","shape":7,"type":"BOOLEAN","widget":{"name":"do_sample"},"link":null},{"localized_name":"output_mask_select","name":"output_mask_select","shape":7,"type":"STRING","widget":{"name":"output_mask_select"},"link":null},{"localized_name":"seed","name":"seed","shape":7,"type":"INT","widget":{"name":"seed"},"link":null}],"outputs":[{"localized_name":"image","name":"image","type":"IMAGE","links":[]},{"localized_name":"mask","name":"mask","type":"MASK","links":null},{"localized_name":"caption","name":"caption","type":"STRING","links":[]},{"localized_name":"data","name":"data","type":"JSON","links":[]}],"properties":{"cnr_id":"comfyui-florence2","ver":"1.0.6","Node name for S&R":"Florence2Run","ue_properties":{"widget_ue_connectable":{"text_input":true,"task":true,"fill_mask":true,"keep_model_loaded":true,"max_new_tokens":true,"num_beams":true,"do_sample":true,"output_mask_select":true,"seed":true},"version":"7.1","input_ue_unconnectable":{}}},"widgets_values":["","more_detailed_caption",true,false,1024,2,true,"",460703812380524,"fixed"]}],
"links":[[1,1,0,3,0,"IMAGE"],[3,2,0,3,1,"BLIP_MODEL"],[4,1,0,5,0,"IMAGE"],[160,1,3,47,0,"*"],[176,56,0,54,1,"FL2MODEL"],[180,1,2,58,0,"*"],[183,1,0,54,0,"IMAGE"],[195,58,0,60,7,"INT"],[200,1,0,60,0,"IMAGE"],[207,60,1,61,0,"*"],[213,3,1,63,0,"*"],[214,3,1,60,1,"STRING"],[219,64,0,66,0,"IMAGE"]],
"groups":[{"id":1,"title":"02-Processes","bounding":[-155.63722229003906,-343.1752624511719,1209.1903076171875,785.4286499023438],"color":"#3f789e","font_size":24,"flags":{}},{"id":2,"title":"01-Input","bounding":[-153.83277893066406,-1183.77490234375,1379.53271484375,819.029296875],"color":"#3f789e","font_size":24,"flags":{}},{"id":3,"title":"03 Output","bounding":[1071.0430908203125,-342.421630859375,957.2517700195312,785.1242065429688],"color":"#3f789e","font_size":24,"flags":{}}],
"config":{},"extra":{"ue_links":[],"ds":{"scale":0.7308641015660836,"offset":[369.6338178275056,1226.702304828325]},"links_added_by_ue":[]},"version":0.4}