Support local chat LLM via Ollama. (#14)
- README.md +26 -9
- requirements.txt +2 -0
- src/chat.py +6 -5
- src/llm.py +21 -0
README.md
CHANGED
@@ -1,9 +1,9 @@
 # What is this?
 
-
-
 *TL;DR*: `repo2vec` is a simple-to-use, modular library enabling you to chat with any public or private codebase.
 
+
+
 **Ok, but why chat with a codebase?**
 
 Sometimes you just want to learn how a codebase works and how to integrate it, without spending hours sifting through
@@ -14,7 +14,10 @@ the code itself.
 Features:
 - **Dead-simple set-up.** Run *two scripts* and you have a functional chat interface for your code. That's really it.
 - **Heavily documented answers.** Every response shows where in the code the context for the answer was pulled from. Let's build trust in the AI.
-- **
+- **Runs locally or on the cloud.**
+  - Want privacy? No problem: you can use [Marqo](https://github.com/marqo-ai/marqo) for embeddings + vector store and [Ollama](https://ollama.com) for the chat LLM.
+  - Want speed and high performance? Also no problem. We support OpenAI batch embeddings + [Pinecone](https://www.pinecone.io/) for the vector store + OpenAI or Anthropic for the chat LLM.
+- **Plug-and-play.** Want to improve the algorithms powering the code understanding/generation? We've made every component of the pipeline easily swappable. Google-grade engineering standards allow you to customize to your heart's content.
 
 # How to run it
 ## Indexing the codebase
@@ -56,20 +59,34 @@ We currently support two options for indexing the codebase:
 We are planning on adding more providers soon, so that you can mix and match them. Contributions are also welcome!
 
 ## Chatting with the codebase
-
+We provide a `gradio` app where you can chat with your codebase. You can use either a local LLM (via [Ollama](https://ollama.com)), or a cloud provider like OpenAI or Anthropic.
+
+To chat with a local LLM:
+1. Head over to [ollama.com](https://ollama.com) to download the appropriate binary for your machine.
+2. Pull the desired model, e.g. `ollama pull llama3.1`.
+3. Start the `gradio` app:
+```
+python src/chat.py \
+    github-repo-name \  # e.g. Storia-AI/repo2vec
+    --llm_provider=ollama \
+    --llm_model=llama3.1 \
+    --vector_store_type=marqo \  # or pinecone
+    --index_name=your-index-name
+```
 
+To chat with a cloud-based LLM, for instance Anthropic's Claude:
 ```
-export
+export ANTHROPIC_API_KEY=...
 
 python src/chat.py \
     github-repo-name \  # e.g. Storia-AI/repo2vec
+    --llm_provider=anthropic \
+    --llm_model=claude-3-opus-20240229 \
     --vector_store_type=marqo \  # or pinecone
     --index_name=your-index-name
 ```
 To get a public URL for your chat app, set `--share=true`.
 
-Currently, the chat will use OpenAI's GPT-4, but we are working on adding support for other providers and local LLMs. Stay tuned!
-
 # Peeking under the hood
 
 ## Indexing the repo
@@ -79,7 +96,7 @@ The `src/index.py` script performs the following steps:
 2. **Chunks files**. See [Chunker](src/chunker.py).
     - For code files, we implement a special `CodeChunker` that takes the parse tree into account.
 3. **Batch-embeds chunks**. See [Embedder](src/embedder.py). We currently support:
-    - [Marqo](https://github.com/marqo-ai/marqo) as an embedder, which allows you to specify your favorite Hugging Face embedding model
+    - [Marqo](https://github.com/marqo-ai/marqo) as an embedder, which allows you to specify your favorite Hugging Face embedding model, and
     - OpenAI's [batch embedding API](https://platform.openai.com/docs/guides/batch/overview), which is much faster and cheaper than the regular synchronous embedding API.
 4. **Stores embeddings in a vector store**. See [VectorStore](src/vector_store.py).
     - We currently support [Marqo](https://github.com/marqo-ai/marqo) and [Pinecone](https://pinecone.io), but you can easily plug in your own.
@@ -100,7 +117,7 @@ The `src/chat.py` brings up a [Gradio app](https://www.gradio.app/) with a chat
 1. Rewrites the query to be self-contained based on previous queries
 2. Embeds the rewritten query using OpenAI embeddings
 3. Retrieves relevant documents from the vector store
-4. Calls
+4. Calls a chat LLM to respond to the user query based on the retrieved documents.
 
 The sources are conveniently surfaced in the chat and linked directly to GitHub.
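The four-step chat flow described in the README (rewrite → embed → retrieve → generate) can be sketched end to end in plain Python. Every function below is a hypothetical stub standing in for the real LangChain retriever and LLM objects used by `src/chat.py`; only the control flow mirrors the documented pipeline.

```python
# Minimal sketch of the RAG chat loop; all components are stubs.

def rewrite_query(query: str, history: list) -> str:
    """Step 1: make the query self-contained using prior turns (stubbed)."""
    return f"{history[-1]} -> {query}" if history else query

def embed(text: str) -> list:
    """Step 2: embed the rewritten query (toy numeric embedding)."""
    return [ord(c) / 255.0 for c in text[:8]]

def retrieve(embedding: list, store: dict, k: int = 2) -> list:
    """Step 3: retrieve documents (stub: return the first k chunks)."""
    return list(store.values())[:k]

def generate_answer(query: str, docs: list) -> str:
    """Step 4: call the chat LLM with the retrieved context (stubbed)."""
    return f"Answer to {query!r} based on {len(docs)} source chunk(s)."

def chat_turn(query: str, history: list, store: dict) -> str:
    rewritten = rewrite_query(query, history)
    docs = retrieve(embed(rewritten), store)
    return generate_answer(rewritten, docs)

store = {"chunker.py": "class CodeChunker: ...", "llm.py": "def build_llm..."}
print(chat_turn("How does chunking work?", [], store))
```

In the real app, step 1 is a history-aware retriever chain, step 2 uses OpenAI embeddings, step 3 queries Marqo or Pinecone, and step 4 is the provider-selected chat LLM.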
requirements.txt
CHANGED
@@ -3,6 +3,8 @@ Pygments==2.18.0
 gradio==4.42.0
 langchain==0.2.14
 langchain-community==0.2.12
+langchain-anthropic==0.1.23
+langchain-ollama==0.1.2
 langchain-openai==0.1.22
 marqo==3.7.0
 nbformat==5.10.4
src/chat.py
CHANGED
@@ -12,9 +12,9 @@ from langchain.chains import (create_history_aware_retriever,
 from langchain.chains.combine_documents import create_stuff_documents_chain
 from langchain.schema import AIMessage, HumanMessage
 from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
-from langchain_openai import ChatOpenAI
 
 import vector_store
+from llm import build_llm_via_langchain
 from repo_manager import RepoManager
 
 load_dotenv()
@@ -22,7 +22,7 @@ load_dotenv()
 
 def build_rag_chain(args):
     """Builds a RAG chain via LangChain."""
-    llm =
+    llm = build_llm_via_langchain(args.llm_provider, args.llm_model)
     retriever = vector_store.build_from_args(args).to_langchain().as_retriever()
 
     # Prompt to contextualize the latest query based on the chat history.
@@ -74,10 +74,11 @@ def append_sources_to_response(response):
 if __name__ == "__main__":
     parser = argparse.ArgumentParser(description="UI to chat with your codebase")
     parser.add_argument("repo_id", help="The ID of the repository to index")
+    parser.add_argument("--llm_provider", default="anthropic", choices=["openai", "anthropic", "ollama"])
     parser.add_argument(
-        "--
-        default="
-        help="The
+        "--llm_model",
+        default="claude-3-opus-20240229",
+        help="The LLM name. Must be supported by the provider specified via --llm_provider.",
     )
     parser.add_argument("--vector_store_type", default="pinecone", choices=["pinecone", "marqo"])
     parser.add_argument("--index_name", required=True, help="Vector store index name")
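The new flags can be exercised without the rest of the chat stack. This standalone snippet reproduces the argument surface from the diff above and checks that the README's Ollama invocation parses as expected (the sample argv values are illustrative):

```python
import argparse

# Mirror of the parser defined in src/chat.py after this commit.
parser = argparse.ArgumentParser(description="UI to chat with your codebase")
parser.add_argument("repo_id", help="The ID of the repository to index")
parser.add_argument("--llm_provider", default="anthropic", choices=["openai", "anthropic", "ollama"])
parser.add_argument(
    "--llm_model",
    default="claude-3-opus-20240229",
    help="The LLM name. Must be supported by the provider specified via --llm_provider.",
)
parser.add_argument("--vector_store_type", default="pinecone", choices=["pinecone", "marqo"])
parser.add_argument("--index_name", required=True, help="Vector store index name")

# Parse the local-LLM invocation from the README (sample values).
args = parser.parse_args([
    "Storia-AI/repo2vec",
    "--llm_provider=ollama",
    "--llm_model=llama3.1",
    "--vector_store_type=marqo",
    "--index_name=my-index",
])
print(args.llm_provider, args.llm_model)
```

Note that `--llm_provider` defaults to `anthropic`, so omitting both LLM flags requires `ANTHROPIC_API_KEY` to be set.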
src/llm.py
ADDED
@@ -0,0 +1,21 @@
+import os
+
+from langchain_anthropic import ChatAnthropic
+from langchain_ollama import ChatOllama
+from langchain_openai import ChatOpenAI
+
+
+def build_llm_via_langchain(provider: str, model: str):
+    """Builds a language model via LangChain."""
+    if provider == "openai":
+        if "OPENAI_API_KEY" not in os.environ:
+            raise ValueError("Please set the OPENAI_API_KEY environment variable.")
+        return ChatOpenAI(model=model)
+    elif provider == "anthropic":
+        if "ANTHROPIC_API_KEY" not in os.environ:
+            raise ValueError("Please set the ANTHROPIC_API_KEY environment variable.")
+        return ChatAnthropic(model=model)
+    elif provider == "ollama":
+        return ChatOllama(model=model)
+    else:
+        raise ValueError(f"Unrecognized LLM provider {provider}. Contributions are welcome!")