AI & ML interests

all of ai

NJX-njx
posted an update 5 days ago
Friends of the community, I have recently had some new ideas.

Some time ago, I came across an analysis by two investors at a16z. Over the course of 2025, ChatGPT tried to roll out new AI features in areas such as shopping, but the results were underwhelming.

I think the fundamental reason lies in the user's mindset, or more precisely, in how users expect to interact within vertical domains. ChatGPT's most distinctive feature is its all-purpose chat box, which is also a shared weakness of many look-alike AI products today (as if the AI's capabilities were sealed off without a chat box). Although a chat box can be adapted to many scenarios, it feels dull in more vertical ones.

Ask yourself: in a shopping scenario, would you prefer the image-and-text waterfall feed of an app like Xiaohongshu, or ChatGPT's monotonous search box? The answer is obvious.

In every vertical scenario, the interaction patterns were already mature before AI emerged. The user experience those patterns deliver is not something a single chat box can replace.

So if we want to build a good AI product in a vertical field, we should think more about how to embed AI's capabilities quietly into the existing interaction, and keep iterating to give users a better experience. @lilianweng @clem @AdinaY
·
NJX-njx
posted an update 7 days ago
Hello, my friends,

Recently, I developed a tool for AI-assisted learning: when a user enters any knowledge point, an agent writes an animation storyboard for it, a coding agent generates the corresponding animation code from the storyboard, and the Manim engine renders the result into a video.
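
To make the flow concrete, here is a minimal sketch of the three stages chained together. It is not the code from the project: `Shot`, `build_video`, and the three stage callables are hypothetical names, and each stage is passed in as a plain function so the sketch runs without an LLM or Manim installed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Shot:
    """One storyboard entry (hypothetical shape): narration plus what is on screen."""
    narration: str
    visual: str


def build_video(
    topic: str,
    write_storyboard: Callable[[str], List[Shot]],   # assumed: storyboard agent
    write_scene_code: Callable[[List[Shot]], str],   # assumed: coding agent
    render: Callable[[str], str],                    # assumed: wraps the Manim render step
) -> str:
    """Run the three stages in order and return whatever the render stage returns."""
    shots = write_storyboard(topic)
    if not shots:
        # Fail early rather than asking the coding agent to animate nothing.
        raise ValueError(f"storyboard for {topic!r} is empty")
    scene_code = write_scene_code(shots)
    return render(scene_code)
```

Keeping each stage behind a callable means a single stage can be swapped or stubbed out while the others are being debugged.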

The overall effect is similar to 3Blue1Brown's videos. I hope that with such a tool, everyone can freely learn from videos of the same quality as 3b1b's.

However, I have recently run into a problem with the video content: it is difficult to place geometric figures, symbols, and other elements at the correct positions in the frame. I tried extracting frames from the generated video, submitting them to a VLM for review to identify visual issues, and repeatedly revising the prompts to improve generation quality, but the results were not satisfactory.
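
For reference, the review loop described above can be sketched roughly as follows. This is a simplified illustration under assumptions, not the project's actual code: `generate` stands in for "render the video and return its duration in seconds", `review_frame` stands in for "extract the frame at time t and ask the VLM for a list of issues", and both are hypothetical callables supplied by the caller.

```python
from typing import Callable, List


def sample_times(duration_s: float, n_frames: int) -> List[float]:
    """Return n_frames timestamps at the midpoints of equal slices of the video."""
    if duration_s <= 0 or n_frames <= 0:
        return []
    step = duration_s / n_frames
    return [round(step * (i + 0.5), 3) for i in range(n_frames)]


def refine_prompt(
    prompt: str,
    generate: Callable[[str], float],            # hypothetical: render, return duration
    review_frame: Callable[[float], List[str]],  # hypothetical: VLM issues at time t
    max_rounds: int = 3,
    n_frames: int = 4,
) -> str:
    """Regenerate until the reviewer reports no issues or max_rounds is reached."""
    for _ in range(max_rounds):
        duration = generate(prompt)
        issues: List[str] = []
        for t in sample_times(duration, n_frames):
            for issue in review_frame(t):
                if issue not in issues:  # report each distinct issue once
                    issues.append(issue)
        if not issues:
            break
        # Feed the reviewer's findings back into the next generation prompt.
        prompt += "\nFix these layout issues:\n" + "\n".join(f"- {i}" for i in issues)
    return prompt
```

Sampling at slice midpoints avoids the very first and last frames, which are often blank or mid-transition; whether four frames per round is enough is an open assumption.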

Does anyone have a good method for solving this positioning problem?

Here is the project link: https://github.com/NJX-njx/code2video#