juliaturc committed on
Commit fc7dede · 1 Parent(s): 2db1bb0

Support local chat LLM via Ollama. (#14)

Files changed (4)
  1. README.md +26 -9
  2. requirements.txt +2 -0
  3. src/chat.py +6 -5
  4. src/llm.py +21 -0
README.md CHANGED
@@ -1,9 +1,9 @@
 # What is this?
 
-![screenshot](assets/chat_screenshot.png)
-
 *TL;DR*: `repo2vec` is a simple-to-use, modular library enabling you to chat with any public or private codebase.
 
+![screenshot](assets/chat_screenshot.png)
+
 **Ok, but why chat with a codebase?**
 
 Sometimes you just want to learn how a codebase works and how to integrate it, without spending hours sifting through
@@ -14,7 +14,10 @@ the code itself.
 Features:
 - **Dead-simple set-up.** Run *two scripts* and you have a functional chat interface for your code. That's really it.
 - **Heavily documented answers.** Every response shows where in the code the context for the answer was pulled from. Let's build trust in the AI.
-- **Plug-and-play.** Want to improve the algorithms powering the code understanding/generation? We've made every component of the pipeline easily swappable. Customize to your heart's content.
+- **Runs locally or on the cloud.**
+    - Want privacy? No problem: you can use [Marqo](https://github.com/marqo-ai/marqo) for embeddings + vector store and [Ollama](https://ollama.com) for the chat LLM.
+    - Want speed and high performance? Also no problem. We support OpenAI batch embeddings + [Pinecone](https://www.pinecone.io/) for the vector store + OpenAI or Anthropic for the chat LLM.
+- **Plug-and-play.** Want to improve the algorithms powering the code understanding/generation? We've made every component of the pipeline easily swappable. Google-grade engineering standards allow you to customize to your heart's content.
 
 # How to run it
 ## Indexing the codebase
@@ -56,20 +59,34 @@ We currently support two options for indexing the codebase:
 We are planning on adding more providers soon, so that you can mix and match them. Contributions are also welcome!
 
 ## Chatting with the codebase
-To bring up a `gradio` app where you can chat with your codebase, simply point it to your vector store:
+We provide a `gradio` app where you can chat with your codebase. You can use either a local LLM (via [Ollama](https://ollama.com)) or a cloud provider like OpenAI or Anthropic.
+
+To chat with a local LLM:
+1. Head over to [ollama.com](https://ollama.com) to download the appropriate binary for your machine.
+2. Pull the desired model, e.g. `ollama pull llama3.1`.
+3. Start the `gradio` app:
+```
+python src/chat.py \
+    github-repo-name \  # e.g. Storia-AI/repo2vec
+    --llm_provider=ollama \
+    --llm_model=llama3.1 \
+    --vector_store_type=marqo \  # or pinecone
+    --index_name=your-index-name
+```
 
+To chat with a cloud-based LLM, for instance Anthropic's Claude:
 ```
-export OPENAI_API_KEY=...
+export ANTHROPIC_API_KEY=...
 
 python src/chat.py \
     github-repo-name \  # e.g. Storia-AI/repo2vec
+    --llm_provider=anthropic \
+    --llm_model=claude-3-opus-20240229 \
     --vector_store_type=marqo \  # or pinecone
     --index_name=your-index-name
 ```
 To get a public URL for your chat app, set `--share=true`.
 
-Currently, the chat will use OpenAI's GPT-4, but we are working on adding support for other providers and local LLMs. Stay tuned!
-
 # Peeking under the hood
 
 ## Indexing the repo
@@ -79,7 +96,7 @@ The `src/index.py` script performs the following steps:
 2. **Chunks files**. See [Chunker](src/chunker.py).
     - For code files, we implement a special `CodeChunker` that takes the parse tree into account.
 3. **Batch-embeds chunks**. See [Embedder](src/embedder.py). We currently support:
-    - [Marqo](https://github.com/marqo-ai/marqo) as an embedder, which allows you to specify your favorite Hugging Face embedding model;
+    - [Marqo](https://github.com/marqo-ai/marqo) as an embedder, which allows you to specify your favorite Hugging Face embedding model, and
     - OpenAI's [batch embedding API](https://platform.openai.com/docs/guides/batch/overview), which is much faster and cheaper than the regular synchronous embedding API.
 4. **Stores embeddings in a vector store**. See [VectorStore](src/vector_store.py).
     - We currently support [Marqo](https://github.com/marqo-ai/marqo) and [Pinecone](https://pinecone.io), but you can easily plug in your own.
@@ -100,7 +117,7 @@ The `src/chat.py` brings up a [Gradio app](https://www.gradio.app/) with a chat
 1. Rewrites the query to be self-contained based on previous queries
 2. Embeds the rewritten query using OpenAI embeddings
 3. Retrieves relevant documents from the vector store
-4. Calls an OpenAI LLM to respond to the user query based on the retrieved documents.
+4. Calls a chat LLM to respond to the user query based on the retrieved documents.
 
 The sources are conveniently surfaced in the chat and linked directly to GitHub.
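The four-step chat flow described in the README above (rewrite, embed, retrieve, respond) can be illustrated with a dependency-free sketch. Every helper below is a hypothetical stand-in: the real pipeline uses OpenAI embeddings, a Marqo/Pinecone vector store, and a LangChain chat model.

```python
# Toy sketch of the four chat steps; helper names are hypothetical.

def rewrite_query(history: list[str], query: str) -> str:
    # Step 1: make the query self-contained by folding in recent turns.
    return " ".join(history[-2:] + [query])

def embed(text: str) -> set[str]:
    # Step 2: stand-in "embedding" = bag of lowercase words.
    return set(text.lower().split())

def retrieve(query_vec: set[str], docs: list[str], k: int = 2) -> list[str]:
    # Step 3: rank documents by overlap with the query "embedding".
    ranked = sorted(docs, key=lambda d: len(query_vec & embed(d)), reverse=True)
    return ranked[:k]

def respond(query: str, context: list[str]) -> str:
    # Step 4: a real system would prompt a chat LLM with the context.
    return f"Answer to {query!r} based on {len(context)} retrieved chunk(s)."

docs = ["chunker splits code files", "embedder batches chunks", "gradio chat app"]
q = rewrite_query(["how is code indexed?"], "and how are chunks embedded?")
print(respond(q, retrieve(embed(q), docs)))
```

Swapping any stage for a real implementation (e.g. a chat model in step 4) preserves the same data flow.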
requirements.txt CHANGED
@@ -3,6 +3,8 @@ Pygments==2.18.0
 gradio==4.42.0
 langchain==0.2.14
 langchain-community==0.2.12
+langchain-anthropic==0.1.23
+langchain-ollama==0.1.2
 langchain-openai==0.1.22
 marqo==3.7.0
 nbformat==5.10.4
src/chat.py CHANGED
@@ -12,9 +12,9 @@ from langchain.chains import (create_history_aware_retriever,
 from langchain.chains.combine_documents import create_stuff_documents_chain
 from langchain.schema import AIMessage, HumanMessage
 from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
-from langchain_openai import ChatOpenAI
 
 import vector_store
+from llm import build_llm_via_langchain
 from repo_manager import RepoManager
 
 load_dotenv()
@@ -22,7 +22,7 @@ load_dotenv()
 
 def build_rag_chain(args):
     """Builds a RAG chain via LangChain."""
-    llm = ChatOpenAI(model=args.openai_model)
+    llm = build_llm_via_langchain(args.llm_provider, args.llm_model)
     retriever = vector_store.build_from_args(args).to_langchain().as_retriever()
 
     # Prompt to contextualize the latest query based on the chat history.
@@ -74,10 +74,11 @@ def append_sources_to_response(response):
 if __name__ == "__main__":
     parser = argparse.ArgumentParser(description="UI to chat with your codebase")
     parser.add_argument("repo_id", help="The ID of the repository to index")
+    parser.add_argument("--llm_provider", default="anthropic", choices=["openai", "anthropic", "ollama"])
     parser.add_argument(
-        "--openai_model",
-        default="gpt-4",
-        help="The OpenAI model to use for response generation",
+        "--llm_model",
+        default="claude-3-opus-20240229",
+        help="The LLM name. Must be supported by the provider specified via --llm_provider.",
     )
     parser.add_argument("--vector_store_type", default="pinecone", choices=["pinecone", "marqo"])
     parser.add_argument("--index_name", required=True, help="Vector store index name")
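The new flags can be exercised in isolation. A minimal sketch of just the LLM-related part of the parser, mirroring the defaults in the diff above (the full script also takes `repo_id`, `--vector_store_type`, and `--index_name`):

```python
import argparse

# Minimal reproduction of the LLM-related flags added to src/chat.py.
parser = argparse.ArgumentParser(description="UI to chat with your codebase")
parser.add_argument("--llm_provider", default="anthropic",
                    choices=["openai", "anthropic", "ollama"])
parser.add_argument(
    "--llm_model",
    default="claude-3-opus-20240229",
    help="The LLM name. Must be supported by the provider specified via --llm_provider.",
)

# With no flags, the chat defaults to Anthropic's Claude 3 Opus.
args = parser.parse_args([])
print(args.llm_provider, args.llm_model)  # anthropic claude-3-opus-20240229

# Selecting the local Ollama path shown in the README.
local = parser.parse_args(["--llm_provider=ollama", "--llm_model=llama3.1"])
print(local.llm_provider, local.llm_model)  # ollama llama3.1
```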
src/llm.py ADDED
@@ -0,0 +1,21 @@
+import os
+
+from langchain_anthropic import ChatAnthropic
+from langchain_ollama import ChatOllama
+from langchain_openai import ChatOpenAI
+
+
+def build_llm_via_langchain(provider: str, model: str):
+    """Builds a language model via LangChain."""
+    if provider == "openai":
+        if "OPENAI_API_KEY" not in os.environ:
+            raise ValueError("Please set the OPENAI_API_KEY environment variable.")
+        return ChatOpenAI(model=model)
+    elif provider == "anthropic":
+        if "ANTHROPIC_API_KEY" not in os.environ:
+            raise ValueError("Please set the ANTHROPIC_API_KEY environment variable.")
+        return ChatAnthropic(model=model)
+    elif provider == "ollama":
+        return ChatOllama(model=model)
+    else:
+        raise ValueError(f"Unrecognized LLM provider {provider}. Contributions are welcome!")
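The if/elif chain above could equivalently be written table-driven, keeping the provider list and the API-key requirements in one place. A sketch under that assumption, with a hypothetical stub class standing in for the LangChain chat models:

```python
import os
from dataclasses import dataclass

@dataclass
class StubChatModel:
    # Hypothetical stand-in for ChatOpenAI / ChatAnthropic / ChatOllama.
    provider: str
    model: str

# Provider -> required environment variable (None for local Ollama).
PROVIDERS = {"openai": "OPENAI_API_KEY", "anthropic": "ANTHROPIC_API_KEY", "ollama": None}

def build_llm(provider: str, model: str) -> StubChatModel:
    """Same contract as build_llm_via_langchain, but table-driven."""
    if provider not in PROVIDERS:
        raise ValueError(f"Unrecognized LLM provider {provider}. Contributions are welcome!")
    env_var = PROVIDERS[provider]
    if env_var and env_var not in os.environ:
        raise ValueError(f"Please set the {env_var} environment variable.")
    return StubChatModel(provider, model)
```

Adding a new provider then means one dictionary entry plus a constructor mapping, rather than another elif branch.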