AI & ML interests

Al & ML interests

wangbuer999 
posted an update 20 days ago
view post
Post
3560
Testing AI controlling AI with Hy3 Preview I barely lifted a finger the whole time.

One-click deployment of Hermes on WorkBuddy took some time with a few rounds of adjustments, and I finally got it up and running smoothly

Only minor issue was setting up Supermemory it was a bit slow on the uptake. I had to go over simple steps several times, guiding it patiently like teaching a kid.

The experience of AI orchestrating AI is absolutely incredible. started running Agents with Hunyuan right after its release, and it actually works perfectly.

295B parameters, 21B active parameters, with direct access to TokenHub now great cost-performance ratio too

Honestly, I used to get stuck on all kinds of environment configurations when deploying Agents locally. Using Hy3 to take command made the whole process way more streamlined.
wangbuer999 
posted an update 27 days ago
view post
Post
2411
Hands-on testing of HY-World 2.0 shows a significant improvement in end-to-end engineering maturity compared to version 1.5

The model supports direct multimodal input from text, single-frame images, and video. Inference can be launched without camera intrinsic/extrinsic calibration or additional preprocessing

After panorama generation, the built-in Spatial Agent automatically performs semantic navigation path planning. Combined with spatial consistency constraints from HY-WorldStereo, it ensures artifact-free multi-view generation and stable geometric alignment

Outputs include standard 3D asset formats such as Mesh, 3DGS, and point clouds, which can be directly imported into Unity/UE

It is suitable for engineering scenarios including game level prototyping, digital twins, and embodied simulation
wangbuer999 
posted an update 4 months ago
view post
Post
2652
HunyuanImage 3.0-Instruct just dropped

fresh -sourceImage 3.0model! Spent 20 mins testing it on a Messi + retro scrambler fusion case

Ran on diffusers v0.26.3 + CUDA 12.1 | 8B MoE params (1.3B activated) | zero VRAM issues

strength=0.9 Messi #10 kit/tattoo sharp, moto’s rusted metal texture blurred (classic open-source pain)
strength=0.7 Moto/cobblestone background crisp, Messi’s jersey details faded completely

strength=0.75 + prompt "Blend seamlessly, keep all original details": both subject & background sharp
No ControlNet, no manual masking the model’s chain-of-thought reasoning parses image+prompt first
Already outperforms Qwen-Image-Edit 2511 (GSB eval +25.7% on single-image edits) | 100% open-source

👉 Repo: https://hunyuan.tencent.com/chat/HunyuanDefault?from=modelSquare&modelId=Hunyuan-Image-3.0-Instruct

technical report:https://arxiv.org/abs/2509.23951

Anyone else struggled with strength tweaks for fusion? This fixed it for my Messi+moto case did it work as well for yours?
  • 6 replies
·
wangbuer999 
posted an update 4 months ago
view post
Post
3223
HY-MT1.5-1.8B Lightweight Translation Model Open-Source Game-Changer

Tencent raised the bar for lightweight translation!

Supports bidirectional translation across 36 languages total—33 mainstream languages + 5 ethnic/minority dialects

With only 1.8B parameters (less than 1/3 the size of HY-MT1.5-7B), it delivers performance on par with the 7B counterpart and outperforms most commercial translation APIs.

✅ Quantized versions (FP8/GPTQ-Int4) available for edge device deployment, perfect for real-time translation
✅ Full support for terminology intervention, context-aware translation, and formatted output
✅ Ready-to-use prompt templates + seamless integration with Hugging Face Transformers
✅ Recommended transformers ≥ 4.56.0 (FP8 model requires compressed-tensors 0.11.0)

10+ Hugging Face Spaces already integrated this model!

👉 Model Repo: tencent/HY-MT1.5-1.8B
👉 Technical Report: https://arxiv.org/abs/2512.24092
wangbuer999 
posted an update 4 months ago
view post
Post
3171
Qwen-Image-Edit LoRA 96 Camera Angles for 3D-Consistent Image Tweaks

fal/Qwen-Image-Edit-2511-Multiple-Angles-LoRA levels up perspective editing

96 poses (4 elevations × 8 azimuths × 3 distances) – close-ups, wide shots, all angles covered

Trained on 3000+ Gaussian Splatting renders – 3D consistency holds even for -30° low-angle shots

Works with Qwen/Qwen-Image-Edit-2511 base models (LoRA strength 0.8-1.0) + ComfyUI workflow included
Tested it – plug-and-play, no fussy setup.

fal/Qwen-Image-Edit-2511-Multiple-Angles-LoRA
wangbuer999 
published a Space 4 months ago