unmodeled-tyler 
posted an update Mar 2
Link to Repo: https://github.com/unmodeled-tyler/thought-tracer

I had a great time at Mistral's Hackathon in SF over the weekend! There were a lot of incredibly talented builders there and it was an honor to be a part of it! 😄

I built Thought Tracer - a TUI-based logit-lens application for Ministral 3B/8B with optional AI analysis from Mistral Large on the Mistral API.

Thought Tracer lets you see what the model "believes" at each layer before it arrives at its final next-token prediction. The Entropy tab displays entropy at each layer and also provides both token-level and prompt-level hallucination-risk estimates.
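For anyone curious what a logit-lens pass looks like under the hood, here's a minimal toy sketch. This is not the Thought Tracer code - the matrices, vocabulary, and names are made up for illustration, and a real implementation would run an actual transformer and apply the model's final layer norm before unembedding:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D logit vector.
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy setup: 3 layers, hidden size 4, vocab of 5 tokens (all hypothetical).
vocab = ["ales", "guardians", "France", "Paris", "the"]
rng = np.random.default_rng(0)
W_U = rng.normal(size=(4, 5))               # unembedding matrix: hidden -> vocab logits
hidden_per_layer = rng.normal(size=(3, 4))  # hidden state at the selected token, per layer

# The logit-lens idea: project each intermediate hidden state straight
# through the unembedding to see what the model "believes" at that depth.
for layer, h in enumerate(hidden_per_layer):
    probs = softmax(h @ W_U)
    print(f"layer {layer}: {vocab[int(probs.argmax())]} (p={probs.max():.2f})")
```

With a real model, the early-layer guesses are typically irrelevant and the prediction sharpens toward the final token as depth increases.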

If you have a Mistral API key, the AI analysis section is pretty cool: it's returned as rendered markdown in plain, easily understandable language, offering commentary on how the model likely arrived at its final prediction, plus diagnostics for model developers. That commentary makes the tool pretty beginner-friendly for anyone exploring AI research tools for the first time.

Check it out if you're interested!

I'm actually going to continue building this out after the dust settles from the hackathon so expect more features!

I'm sorry, I might not have fully understood the capabilities of the entire project. From my understanding, for example, if I have an open-source pre-trained model and deploy it locally, your project can monitor the state of tokens as they are passed through each layer of the model. Is that what it means?


Yep, that’s the core of it! For example, if I use the prompt “Paris is the capital of France” and then highlight “of” in my prompt, the layer predictions tab will show what the model believes the next token to be at each layer.

You can watch the model start its guess in the very first layer (usually with something completely irrelevant), and as it progresses through each layer it gets closer and closer until it converges on “France” as the most likely next token given the context leading up to the selected token “of.” The model basically interprets it as “Paris is the capital of -> ? -> France.”

You can see that in the 1st layer the model was thinking “Paris is the capital of ales,” in a deeper layer “Paris is the capital of guardians,” before it finally ended in the last layer with the correct prediction (again, given “Paris is the capital of”): “France.”

The entropy tab calculates a few different metrics that feed token-level and prompt-level hallucination risk assessments, so you can see which tokens are higher risk for inducing hallucination in that particular model.
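If it helps, the entropy side boils down to something like this toy sketch. These are not the actual Thought Tracer metrics - just the basic intuition that a flatter next-token distribution means higher entropy and, plausibly, higher hallucination risk:

```python
import numpy as np

def entropy(probs):
    # Shannon entropy in bits of a next-token probability distribution.
    p = probs[probs > 0]
    return float(-(p * np.log2(p)).sum())

# Toy distributions over a 5-token vocab (values made up for illustration):
confident = np.array([0.9, 0.05, 0.03, 0.01, 0.01])  # model is sure of the next token
uncertain = np.full(5, 0.2)                          # uniform = maximum entropy

print(entropy(confident))   # low entropy: low hallucination risk
print(entropy(uncertain))   # log2(5) ~= 2.32 bits: higher hallucination risk
```

Computing this per layer (via the logit-lens projections) gives token-level signals, and aggregating over all tokens in the prompt gives a prompt-level view.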