Support local chat LLM via Ollama. (#14)
- README.md +26 -9
- requirements.txt +2 -0
- src/chat.py +6 -5
- src/llm.py +21 -0
README.md
CHANGED
@@ -1,9 +1,9 @@
 # What is this?
 
-
-
 *TL;DR*: `repo2vec` is a simple-to-use, modular library enabling you to chat with any public or private codebase.
 
+
+
 **Ok, but why chat with a codebase?**
 
 Sometimes you just want to learn how a codebase works and how to integrate it, without spending hours sifting through
@@ -14,7 +14,10 @@ the code itself.
 Features:
 - **Dead-simple set-up.** Run *two scripts* and you have a functional chat interface for your code. That's really it.
 - **Heavily documented answers.** Every response shows where in the code the context for the answer was pulled from. Let's build trust in the AI.
-- **
+- **Runs locally or on the cloud.**
+  - Want privacy? No problem: you can use [Marqo](https://github.com/marqo-ai/marqo) for embeddings + vector store and [Ollama](https://ollama.com) for the chat LLM.
+  - Want speed and high performance? Also no problem. We support OpenAI batch embeddings + [Pinecone](https://www.pinecone.io/) for the vector store + OpenAI or Anthropic for the chat LLM.
+- **Plug-and-play.** Want to improve the algorithms powering the code understanding/generation? We've made every component of the pipeline easily swappable. Google-grade engineering standards allow you to customize to your heart's content.
 
 # How to run it
 ## Indexing the codebase
@@ -56,20 +59,34 @@ We currently support two options for indexing the codebase:
 We are planning on adding more providers soon, so that you can mix and match them. Contributions are also welcome!
 
 ## Chatting with the codebase
-
+We provide a `gradio` app where you can chat with your codebase. You can use either a local LLM (via [Ollama](https://ollama.com)), or a cloud provider like OpenAI or Anthropic.
+
+To chat with a local LLM:
+1. Head over to [ollama.com](https://ollama.com) to download the appropriate binary for your machine.
+2. Pull the desired model, e.g. `ollama pull llama3.1`.
+3. Start the `gradio` app:
+```
+python src/chat.py \
+    github-repo-name \  # e.g. Storia-AI/repo2vec
+    --llm_provider=ollama \
+    --llm_model=llama3.1 \
+    --vector_store_type=marqo \  # or pinecone
+    --index_name=your-index-name
+```
 
+To chat with a cloud-based LLM, for instance Anthropic's Claude:
 ```
-export
+export ANTHROPIC_API_KEY=...
 
 python src/chat.py \
     github-repo-name \  # e.g. Storia-AI/repo2vec
+    --llm_provider=anthropic \
+    --llm_model=claude-3-opus-20240229 \
     --vector_store_type=marqo \  # or pinecone
     --index_name=your-index-name
 ```
 To get a public URL for your chat app, set `--share=true`.
 
-Currently, the chat will use OpenAI's GPT-4, but we are working on adding support for other providers and local LLMs. Stay tuned!
-
 # Peeking under the hood
 
 ## Indexing the repo
@@ -79,7 +96,7 @@ The `src/index.py` script performs the following steps:
 2. **Chunks files**. See [Chunker](src/chunker.py).
     - For code files, we implement a special `CodeChunker` that takes the parse tree into account.
 3. **Batch-embeds chunks**. See [Embedder](src/embedder.py). We currently support:
-    - [Marqo](https://github.com/marqo-ai/marqo) as an embedder, which allows you to specify your favorite Hugging Face embedding model
+    - [Marqo](https://github.com/marqo-ai/marqo) as an embedder, which allows you to specify your favorite Hugging Face embedding model, and
     - OpenAI's [batch embedding API](https://platform.openai.com/docs/guides/batch/overview), which is much faster and cheaper than the regular synchronous embedding API.
 4. **Stores embeddings in a vector store**. See [VectorStore](src/vector_store.py).
     - We currently support [Marqo](https://github.com/marqo-ai/marqo) and [Pinecone](https://pinecone.io), but you can easily plug in your own.
@@ -100,7 +117,7 @@ The `src/chat.py` brings up a [Gradio app](https://www.gradio.app/) with a chat
 1. Rewrites the query to be self-contained based on previous queries
 2. Embeds the rewritten query using OpenAI embeddings
 3. Retrieves relevant documents from the vector store
-4. Calls
+4. Calls a chat LLM to respond to the user query based on the retrieved documents.
 
 The sources are conveniently surfaced in the chat and linked directly to GitHub.
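The four-step chat flow described in the README (rewrite → embed → retrieve → generate) can be sketched end to end in plain Python. Every function below is a hypothetical stub standing in for the real LangChain retriever and LLM objects used by `src/chat.py`; only the control flow mirrors the documented pipeline.

```python
# Minimal sketch of the RAG chat loop; all components are stubs.

def rewrite_query(query: str, history: list) -> str:
    """Step 1: make the query self-contained using prior turns (stubbed)."""
    return f"{history[-1]} -> {query}" if history else query

def embed(text: str) -> list:
    """Step 2: embed the rewritten query (toy numeric embedding)."""
    return [ord(c) / 255.0 for c in text[:8]]

def retrieve(embedding: list, store: dict, k: int = 2) -> list:
    """Step 3: retrieve documents (stub: return the first k chunks)."""
    return list(store.values())[:k]

def generate_answer(query: str, docs: list) -> str:
    """Step 4: call the chat LLM with the retrieved context (stubbed)."""
    return f"Answer to {query!r} based on {len(docs)} source chunk(s)."

def chat_turn(query: str, history: list, store: dict) -> str:
    rewritten = rewrite_query(query, history)
    docs = retrieve(embed(rewritten), store)
    return generate_answer(rewritten, docs)

store = {"chunker.py": "class CodeChunker: ...", "llm.py": "def build_llm..."}
print(chat_turn("How does chunking work?", [], store))
```

In the real app, step 1 is a history-aware retriever chain, step 2 uses OpenAI embeddings, step 3 queries Marqo or Pinecone, and step 4 is the provider-selected chat LLM.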
requirements.txt
CHANGED
@@ -3,6 +3,8 @@ Pygments==2.18.0
 gradio==4.42.0
 langchain==0.2.14
 langchain-community==0.2.12
+langchain-anthropic==0.1.23
+langchain-ollama==0.1.2
 langchain-openai==0.1.22
 marqo==3.7.0
 nbformat==5.10.4
src/chat.py
CHANGED
@@ -12,9 +12,9 @@ from langchain.chains import (create_history_aware_retriever,
 from langchain.chains.combine_documents import create_stuff_documents_chain
 from langchain.schema import AIMessage, HumanMessage
 from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
-from langchain_openai import ChatOpenAI
 
 import vector_store
+from llm import build_llm_via_langchain
 from repo_manager import RepoManager
 
 load_dotenv()
@@ -22,7 +22,7 @@ load_dotenv()
 
 def build_rag_chain(args):
     """Builds a RAG chain via LangChain."""
-    llm =
+    llm = build_llm_via_langchain(args.llm_provider, args.llm_model)
     retriever = vector_store.build_from_args(args).to_langchain().as_retriever()
 
     # Prompt to contextualize the latest query based on the chat history.
@@ -74,10 +74,11 @@ def append_sources_to_response(response):
 if __name__ == "__main__":
     parser = argparse.ArgumentParser(description="UI to chat with your codebase")
     parser.add_argument("repo_id", help="The ID of the repository to index")
+    parser.add_argument("--llm_provider", default="anthropic", choices=["openai", "anthropic", "ollama"])
     parser.add_argument(
-        "--
-        default="
-        help="The
+        "--llm_model",
+        default="claude-3-opus-20240229",
+        help="The LLM name. Must be supported by the provider specified via --llm_provider.",
     )
     parser.add_argument("--vector_store_type", default="pinecone", choices=["pinecone", "marqo"])
     parser.add_argument("--index_name", required=True, help="Vector store index name")
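The new flags can be exercised without the rest of the chat stack. This standalone snippet reproduces the argument surface from the diff above and checks that the README's Ollama invocation parses as expected (the sample argv values are illustrative):

```python
import argparse

# Mirror of the parser defined in src/chat.py after this commit.
parser = argparse.ArgumentParser(description="UI to chat with your codebase")
parser.add_argument("repo_id", help="The ID of the repository to index")
parser.add_argument("--llm_provider", default="anthropic", choices=["openai", "anthropic", "ollama"])
parser.add_argument(
    "--llm_model",
    default="claude-3-opus-20240229",
    help="The LLM name. Must be supported by the provider specified via --llm_provider.",
)
parser.add_argument("--vector_store_type", default="pinecone", choices=["pinecone", "marqo"])
parser.add_argument("--index_name", required=True, help="Vector store index name")

# Parse the local-LLM invocation from the README (sample values).
args = parser.parse_args([
    "Storia-AI/repo2vec",
    "--llm_provider=ollama",
    "--llm_model=llama3.1",
    "--vector_store_type=marqo",
    "--index_name=my-index",
])
print(args.llm_provider, args.llm_model)
```

Note that `--llm_provider` defaults to `anthropic`, so omitting both LLM flags requires `ANTHROPIC_API_KEY` to be set.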
src/llm.py
ADDED
@@ -0,0 +1,21 @@
+import os
+
+from langchain_anthropic import ChatAnthropic
+from langchain_ollama import ChatOllama
+from langchain_openai import ChatOpenAI
+
+
+def build_llm_via_langchain(provider: str, model: str):
+    """Builds a language model via LangChain."""
+    if provider == "openai":
+        if "OPENAI_API_KEY" not in os.environ:
+            raise ValueError("Please set the OPENAI_API_KEY environment variable.")
+        return ChatOpenAI(model=model)
+    elif provider == "anthropic":
+        if "ANTHROPIC_API_KEY" not in os.environ:
+            raise ValueError("Please set the ANTHROPIC_API_KEY environment variable.")
+        return ChatAnthropic(model=model)
+    elif provider == "ollama":
+        return ChatOllama(model=model)
+    else:
+        raise ValueError(f"Unrecognized LLM provider {provider}. Contributions are welcome!")