NJX-njx 
posted an update 10 days ago
Hello, my friends,

I recently built a tool for AI-assisted learning. When a user enters any topic, an agent writes an animation storyboard for it, a coding agent generates the animation code for that storyboard, and the Manim engine renders the result into a video.

The overall effect is similar to 3Blue1Brown's videos. I hope a tool like this lets everyone learn freely from videos of the same quality as 3b1b's.

However, I have recently run into a problem with the video content: it is hard to place geometric figures, symbols, and other elements at the correct positions in the frame. I tried extracting frames from the generated video and submitting them to a VLM (vision-language model) for review to identify visual issues, and I kept revising the prompts to improve generation quality, but the results were not satisfactory.
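To make the positioning problem concrete, here is a minimal sketch of a geometric check that could run alongside the VLM review. It is not taken from the project. It assumes each element's bounding box can be read out before rendering (in Manim, for example, from `get_left()`, `get_right()`, `get_bottom()`, and `get_top()`), and the `Box` class, the function names, and the 0.25-unit margin are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations

# Manim Community's default frame is 8 units tall and 16:9 wide
# (about 14.22 units), centered on the origin.
FRAME_HEIGHT = 8.0
FRAME_WIDTH = FRAME_HEIGHT * 16 / 9


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of one on-screen element (hypothetical helper)."""
    name: str
    left: float
    right: float
    bottom: float
    top: float


def out_of_frame(box: Box, margin: float = 0.25) -> bool:
    """True if the box crosses the frame edge, shrunk by a safety margin."""
    half_w = FRAME_WIDTH / 2 - margin
    half_h = FRAME_HEIGHT / 2 - margin
    return (box.left < -half_w or box.right > half_w
            or box.bottom < -half_h or box.top > half_h)


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share any interior area."""
    return (a.left < b.right and b.left < a.right
            and a.bottom < b.top and b.bottom < a.top)


def layout_issues(boxes: list[Box], margin: float = 0.25) -> list[str]:
    """Collect human-readable layout problems for a single frame."""
    issues = []
    for box in boxes:
        if out_of_frame(box, margin):
            issues.append(f"{box.name} leaves the safe frame")
    for a, b in combinations(boxes, 2):
        if overlaps(a, b):
            issues.append(f"{a.name} overlaps {b.name}")
    return issues
```

A list of issues like this could be passed back to the coding agent as text. Whether that works better than frame review is untested here, and some overlaps (a label placed inside its figure, for instance) are intentional and would need to be whitelisted.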

Does anyone have a good method for solving this positioning problem?

Here is the project link: https://github.com/NJX-njx/code2video#