Is there a way to merge this encoder into a single safetensors file and then use it in ComfyUI?

#1
by soymh - opened

Hi!
Thank you for this breakthrough in local music generation, in an ACE step!

I guess the most important part of music generation with this model happens in the encoder models.
The mixture of 0.6B or 1.7B CLIP models works almost fine for demonstration in English (other languages not tested yet...).
The main improvement would be finding a way to use the 4B CLIP model (or even better, its quantized version!) in ComfyUI...

My question is:
"Is there a way to merge these two files for the 4B CLIP model into one safetensors file, and hence use it in ComfyUI?"

If so, we would appreciate any guide or walkthrough!
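For reference, here is a minimal sketch of what merging two safetensors shards could look like, using only the Python standard library. It relies on the published safetensors layout (an 8-byte little-endian header length, a JSON header, then the raw tensor bytes). The function names `read_header` and `merge_safetensors` and the file names in the usage note are hypothetical, and this has not been tested against the 4B encoder or ComfyUI, so whether ComfyUI accepts the result is an open question.

```python
import json
import os
import struct


def read_header(path):
    """Return (header dict, byte offset where the tensor data begins)."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return header, 8 + header_len


def merge_safetensors(paths, out_path, chunk_size=1 << 24):
    """Concatenate several safetensors files into one (illustrative sketch)."""
    merged = {}
    sources = []  # (path, data_start, data_len) for each input file
    base = 0      # running offset of each file's data in the merged buffer
    for path in paths:
        header, data_start = read_header(path)
        header.pop("__metadata__", None)  # per-file metadata is dropped here
        data_len = os.path.getsize(path) - data_start
        for name, entry in header.items():
            if name in merged:
                raise ValueError(f"duplicate tensor name: {name}")
            begin, end = entry["data_offsets"]
            merged[name] = {
                "dtype": entry["dtype"],
                "shape": entry["shape"],
                "data_offsets": [base + begin, base + end],
            }
        sources.append((path, data_start, data_len))
        base += data_len

    header_bytes = json.dumps(merged, separators=(",", ":")).encode("utf-8")
    # Pad the header with spaces so the data section starts 8-byte aligned.
    header_bytes += b" " * (-len(header_bytes) % 8)

    with open(out_path, "wb") as out:
        out.write(struct.pack("<Q", len(header_bytes)))
        out.write(header_bytes)
        # Stream the tensor bytes so multi-gigabyte files never sit in RAM.
        for path, data_start, data_len in sources:
            with open(path, "rb") as src:
                src.seek(data_start)
                remaining = data_len
                while remaining:
                    chunk = src.read(min(remaining, chunk_size))
                    if not chunk:
                        raise ValueError(f"unexpected end of file: {path}")
                    out.write(chunk)
                    remaining -= len(chunk)
```

Usage would look something like `merge_safetensors(["part1.safetensors", "part2.safetensors"], "merged.safetensors")`, with the placeholder names replaced by the two actual files. The tensor names and dtypes are copied unchanged, so this only helps if the loader expects exactly those keys in a single file.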

wow! thanks
haven't checked since 20 hours ago

soymh changed discussion status to closed

Isn't there a GGUF workaround yet?

soymh changed discussion status to open

I think it should be possible, as the model is based on the Qwen3 LLM architecture. The 4B model was easy to quantize to GGUF for testing and even loaded in the DualClipLoader (GGUF) node, but it fails as soon as it starts encoding. I'm not sure if the problem is with ComfyUI, the GGUF node, or something else. The error message looked like this:

  File "/home/hum/aienv/ComfyUI-0.12.2/comfy/text_encoders/ace15.py", line 207, in encode_token_weights
    audio_codes = generate_audio_codes(getattr(self, self.lm_model, self.qwen3_06b), token_weight_pairs["lm_prompt"], token_weight_pairs["lm_prompt_negative"], min_tokens=lm_metadata["min_tokens"], max_tokens=lm_metadata["min_tokens"], seed=lm_metadata["seed"])
  File "/home/hum/aienv/ComfyUI-0.12.2/comfy/text_encoders/ace15.py", line 123, in generate_audio_codes
    return sample_manual_loop_no_classes(model, [positive, negative], paddings, cfg_scale=cfg_scale, seed=seed, min_tokens=min_tokens, max_new_tokens=max_tokens)
  File "/home/hum/aienv/ComfyUI-0.12.2/comfy/text_encoders/ace15.py", line 78, in sample_manual_loop_no_classes
    sorted_indices_to_remove[..., 1:] = sorted_indices_to_remove[..., :-1].clone()
    ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
  File "/home/hum/aienv/lib/python3.13/site-packages/torch/_tensor.py", line 1654, in __torch_function__
    ret = func(*args, **kwargs)
RuntimeError: unsupported operation: some elements of the input tensor and the written-to tensor refer to a single memory location. Please clone() the tensor before performing the operation.

You can find it here: https://huggingface.co/Comfy-Org/ace_step_1.5_ComfyUI_files/tree/main/split_files/text_encoders

Is there a MERGED version to use in the standalone? Or how do I add it to the standalone checkpoint folders if it's 2 files?

Have you tried cloning this folder into your ACE-Step 1.5 repo folder and then setting the directory accordingly in the startup file?

Me?? Yes it worked.
