wfmedeiros3 committed
Commit ebeba12 · 0 parent(s)

Initial, clean commit for Hugging Face

This view is limited to 50 files because the commit contains too many changes. See the raw diff for the full set.
Files changed (50)
  1. .docker/router.yml +16 -0
  2. .dockerignore +12 -0
  3. .github/ISSUE_TEMPLATE/bug.yml +105 -0
  4. .github/ISSUE_TEMPLATE/config.yml +8 -0
  5. .github/ISSUE_TEMPLATE/docs.yml +19 -0
  6. .github/ISSUE_TEMPLATE/feature.yml +37 -0
  7. .github/ISSUE_TEMPLATE/question.yml +19 -0
  8. .github/pull_request_template.md +37 -0
  9. .github/release_please/.release-please-config.json +19 -0
  10. .github/release_please/.release-please-manifest.json +3 -0
  11. .github/workflows/actions/install_dependencies/action.yml +30 -0
  12. .github/workflows/fern-check.yml +21 -0
  13. .github/workflows/generate-release.yml +83 -0
  14. .github/workflows/preview-docs.yml +54 -0
  15. .github/workflows/publish-docs.yml +26 -0
  16. .github/workflows/release-please.yml +20 -0
  17. .github/workflows/stale.yml +30 -0
  18. .github/workflows/tests.yml +67 -0
  19. .gitignore +31 -0
  20. .pre-commit-config.yaml +43 -0
  21. CHANGELOG.md +173 -0
  22. CITATION.cff +16 -0
  23. Dockerfile.llamacpp-cpu +62 -0
  24. Dockerfile.ollama +51 -0
  25. LICENSE +201 -0
  26. Makefile +78 -0
  27. README.md +160 -0
  28. docker-compose.yaml +116 -0
  29. fern/README.md +39 -0
  30. fern/docs.yml +129 -0
  31. fern/docs/assets/favicon.ico +0 -0
  32. fern/docs/assets/header.jpeg +0 -0
  33. fern/docs/assets/logo_dark.png +0 -0
  34. fern/docs/assets/logo_light.png +0 -0
  35. fern/docs/pages/api-reference/api-reference.mdx +14 -0
  36. fern/docs/pages/api-reference/sdks.mdx +38 -0
  37. fern/docs/pages/installation/concepts.mdx +67 -0
  38. fern/docs/pages/installation/installation.mdx +433 -0
  39. fern/docs/pages/installation/troubleshooting.mdx +64 -0
  40. fern/docs/pages/manual/ingestion-reset.mdx +14 -0
  41. fern/docs/pages/manual/ingestion.mdx +137 -0
  42. fern/docs/pages/manual/llms.mdx +234 -0
  43. fern/docs/pages/manual/nodestore.mdx +66 -0
  44. fern/docs/pages/manual/reranker.mdx +36 -0
  45. fern/docs/pages/manual/settings.mdx +85 -0
  46. fern/docs/pages/manual/vectordb.mdx +187 -0
  47. fern/docs/pages/overview/welcome.mdx +42 -0
  48. fern/docs/pages/quickstart/quickstart.mdx +105 -0
  49. fern/docs/pages/recipes/quickstart.mdx +23 -0
  50. fern/docs/pages/recipes/summarize.mdx +20 -0
.docker/router.yml ADDED
@@ -0,0 +1,16 @@
+ http:
+   services:
+     ollama:
+       loadBalancer:
+         healthCheck:
+           interval: 5s
+           path: /
+         servers:
+           - url: http://ollama-cpu:11434
+           - url: http://ollama-cuda:11434
+           - url: http://host.docker.internal:11434
+
+   routers:
+     ollama-router:
+       rule: "PathPrefix(`/`)"
+       service: ollama
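This is a Traefik dynamic-configuration file: requests reaching the router are load-balanced across whichever Ollama backend (CPU container, CUDA container, or the host's Ollama) passes the health check. A minimal sketch of how such a file might be mounted into a Traefik service via the file provider — the service name, image tag, and port mapping here are assumptions, not taken from this commit:

    # Hypothetical docker-compose excerpt; only the traefik side is sketched.
    services:
      traefik:
        image: traefik:v2.10                               # assumed version
        command:
          - "--providers.file.filename=/etc/router.yml"    # load this dynamic config
          - "--entrypoints.web.address=:11434"
        volumes:
          - ./.docker/router.yml:/etc/router.yml:ro
        ports:
          - "11434:11434"

The commit's own docker-compose.yaml (not shown in this 50-file view in full) is the authoritative wiring.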
.dockerignore ADDED
@@ -0,0 +1,12 @@
+ .venv
+ models
+ .github
+ .vscode
+ .DS_Store
+ .mypy_cache
+ .ruff_cache
+ local_data
+ terraform
+ tests
+ Dockerfile
+ Dockerfile.*
.github/ISSUE_TEMPLATE/bug.yml ADDED
@@ -0,0 +1,105 @@
+ name: Bug Report
+ description: Report a bug or issue with the project.
+ title: "[BUG] "
+ labels: ["bug"]
+
+ body:
+   - type: markdown
+     attributes:
+       value: |
+         **Please describe the bug you encountered.**
+
+   - type: checkboxes
+     id: pre-check
+     attributes:
+       label: Pre-check
+       description: Please confirm that you have searched for duplicate issues before creating this one.
+       options:
+         - label: I have searched the existing issues and none cover this bug.
+           required: true
+
+   - type: textarea
+     id: description
+     attributes:
+       label: Description
+       description: Provide a detailed description of the bug.
+       placeholder: "Detailed description of the bug"
+     validations:
+       required: true
+
+   - type: textarea
+     id: steps
+     attributes:
+       label: Steps to Reproduce
+       description: Provide the steps to reproduce the bug.
+       placeholder: "1. Step one\n2. Step two\n3. Step three"
+     validations:
+       required: true
+
+   - type: input
+     id: expected
+     attributes:
+       label: Expected Behavior
+       description: Describe what you expected to happen.
+       placeholder: "Expected behavior"
+     validations:
+       required: true
+
+   - type: input
+     id: actual
+     attributes:
+       label: Actual Behavior
+       description: Describe what actually happened.
+       placeholder: "Actual behavior"
+     validations:
+       required: true
+
+   - type: input
+     id: environment
+     attributes:
+       label: Environment
+       description: Provide details about your environment (e.g., OS, GPU, profile, etc.).
+       placeholder: "Environment details"
+     validations:
+       required: true
+
+   - type: input
+     id: additional
+     attributes:
+       label: Additional Information
+       description: Provide any additional information that may be relevant (e.g., logs, screenshots).
+       placeholder: "Any additional information that may be relevant"
+
+   - type: input
+     id: version
+     attributes:
+       label: Version
+       description: Provide the version of the project where you encountered the bug.
+       placeholder: "Version number"
+
+   - type: markdown
+     attributes:
+       value: |
+         **Please ensure the following setup checklist has been reviewed before submitting the bug report.**
+
+   - type: checkboxes
+     id: general-setup-checklist
+     attributes:
+       label: Setup Checklist
+       description: Verify the following general aspects of your setup.
+       options:
+         - label: Confirm that you have followed the installation instructions in the project’s documentation.
+         - label: Check that you are using the latest version of the project.
+         - label: Verify disk space availability for model storage and data processing.
+         - label: Ensure that you have the necessary permissions to run the project.
+
+   - type: checkboxes
+     id: nvidia-setup-checklist
+     attributes:
+       label: NVIDIA GPU Setup Checklist
+       description: Verify the following aspects of your NVIDIA GPU setup.
+       options:
+         - label: Check that all the CUDA dependencies are installed and are compatible with your GPU (refer to [CUDA's documentation](https://docs.nvidia.com/deploy/cuda-compatibility/#frequently-asked-questions))
+         - label: Ensure an NVIDIA GPU is installed and recognized by the system (run `nvidia-smi` to verify).
+         - label: Ensure proper permissions are set for accessing GPU resources.
+         - label: Docker users - Verify that the NVIDIA Container Toolkit is configured correctly (e.g. run `sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi`)
.github/ISSUE_TEMPLATE/config.yml ADDED
@@ -0,0 +1,8 @@
+ blank_issues_enabled: false
+ contact_links:
+   - name: Documentation
+     url: https://docs.privategpt.dev
+     about: Please refer to our documentation for more details and guidance.
+   - name: Discord
+     url: https://discord.gg/bK6mRVpErU
+     about: Join our Discord community to ask questions and get help.
.github/ISSUE_TEMPLATE/docs.yml ADDED
@@ -0,0 +1,19 @@
+ name: Documentation
+ description: Suggest a change or addition to the documentation.
+ title: "[DOCS] "
+ labels: ["documentation"]
+
+ body:
+   - type: markdown
+     attributes:
+       value: |
+         **Please describe the documentation change or addition you would like to suggest.**
+
+   - type: textarea
+     id: description
+     attributes:
+       label: Description
+       description: Provide a detailed description of the documentation change.
+       placeholder: "Detailed description of the documentation change"
+     validations:
+       required: true
.github/ISSUE_TEMPLATE/feature.yml ADDED
@@ -0,0 +1,37 @@
+ name: Enhancement
+ description: Suggest an enhancement or improvement to the project.
+ title: "[FEATURE] "
+ labels: ["enhancement"]
+
+ body:
+   - type: markdown
+     attributes:
+       value: |
+         **Please describe the enhancement or improvement you would like to suggest.**
+
+   - type: textarea
+     id: feature_description
+     attributes:
+       label: Feature Description
+       description: Provide a detailed description of the enhancement.
+       placeholder: "Detailed description of the enhancement"
+     validations:
+       required: true
+
+   - type: textarea
+     id: reason
+     attributes:
+       label: Reason
+       description: Explain the reason for this enhancement.
+       placeholder: "Reason for the enhancement"
+     validations:
+       required: true
+
+   - type: textarea
+     id: value
+     attributes:
+       label: Value of Feature
+       description: Describe the value or benefits this feature will bring.
+       placeholder: "Value or benefits of the feature"
+     validations:
+       required: true
.github/ISSUE_TEMPLATE/question.yml ADDED
@@ -0,0 +1,19 @@
+ name: Question
+ description: Ask a question about the project.
+ title: "[QUESTION] "
+ labels: ["question"]
+
+ body:
+   - type: markdown
+     attributes:
+       value: |
+         **Please describe your question in detail.**
+
+   - type: textarea
+     id: question
+     attributes:
+       label: Question
+       description: Provide a detailed description of your question.
+       placeholder: "Detailed description of the question"
+     validations:
+       required: true
.github/pull_request_template.md ADDED
@@ -0,0 +1,37 @@
+ # Description
+
+ Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
+
+ ## Type of Change
+
+ Please delete options that are not relevant.
+
+ - [ ] Bug fix (non-breaking change which fixes an issue)
+ - [ ] New feature (non-breaking change which adds functionality)
+ - [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
+ - [ ] This change requires a documentation update
+
+ ## How Has This Been Tested?
+
+ Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration.
+
+ - [ ] Added new unit/integration tests
+ - [ ] I stared at the code and made sure it makes sense
+
+ **Test Configuration**:
+ * Firmware version:
+ * Hardware:
+ * Toolchain:
+ * SDK:
+
+ ## Checklist:
+
+ - [ ] My code follows the style guidelines of this project
+ - [ ] I have performed a self-review of my code
+ - [ ] I have commented my code, particularly in hard-to-understand areas
+ - [ ] I have made corresponding changes to the documentation
+ - [ ] My changes generate no new warnings
+ - [ ] I have added tests that prove my fix is effective or that my feature works
+ - [ ] New and existing unit tests pass locally with my changes
+ - [ ] Any dependent changes have been merged and published in downstream modules
+ - [ ] I ran `make check; make test` to ensure mypy and tests pass
.github/release_please/.release-please-config.json ADDED
@@ -0,0 +1,19 @@
+ {
+   "$schema": "https://raw.githubusercontent.com/googleapis/release-please/main/schemas/config.json",
+   "release-type": "simple",
+   "version-file": "version.txt",
+   "extra-files": [
+     {
+       "type": "toml",
+       "path": "pyproject.toml",
+       "jsonpath": "$.tool.poetry.version"
+     },
+     {
+       "type": "generic",
+       "path": "docker-compose.yaml"
+     }
+   ],
+   "packages": {
+     ".": {}
+   }
+ }
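For context on the `generic` extra-file entry above: release-please's generic updater only rewrites lines that carry an `x-release-please-version` annotation, so docker-compose.yaml presumably tags the lines it wants bumped. A hypothetical annotated line (the image reference is illustrative, not copied from this commit):

    image: zylonai/private-gpt:0.6.2 # x-release-please-version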
.github/release_please/.release-please-manifest.json ADDED
@@ -0,0 +1,3 @@
+ {
+   ".": "0.6.2"
+ }
.github/workflows/actions/install_dependencies/action.yml ADDED
@@ -0,0 +1,30 @@
+ name: "Install Dependencies"
+ description: "Action to build the project dependencies from the main versions"
+ inputs:
+   python_version:
+     required: true
+     type: string
+     default: "3.11.4"
+   poetry_version:
+     required: true
+     type: string
+     default: "1.8.3"
+
+ runs:
+   using: composite
+   steps:
+     - name: Install Poetry
+       uses: snok/install-poetry@v1
+       with:
+         version: ${{ inputs.poetry_version }}
+         virtualenvs-create: true
+         virtualenvs-in-project: false
+         installer-parallel: true
+     - uses: actions/setup-python@v4
+       with:
+         python-version: ${{ inputs.python_version }}
+         cache: "poetry"
+     - name: Install Dependencies
+       run: poetry install --extras "ui vector-stores-qdrant" --no-root
+       shell: bash
+
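For reference, a workflow consumes this composite action with a local `uses:` path after checking out the repository, exactly as tests.yml later in this commit does. A minimal sketch (the explicit `with:` inputs are optional here, since both inputs declare defaults):

    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/workflows/actions/install_dependencies
        with:
          python_version: "3.11.4"
          poetry_version: "1.8.3"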
.github/workflows/fern-check.yml ADDED
@@ -0,0 +1,21 @@
+ name: fern check
+
+ on:
+   pull_request:
+     branches:
+       - main
+     paths:
+       - "fern/**"
+
+ jobs:
+   fern-check:
+     runs-on: ubuntu-latest
+     steps:
+       - name: Checkout repo
+         uses: actions/checkout@v4
+
+       - name: Install Fern
+         run: npm install -g fern-api
+
+       - name: Check Fern API is valid
+         run: fern check
.github/workflows/generate-release.yml ADDED
@@ -0,0 +1,83 @@
+ name: generate-release
+
+ on:
+   release:
+     types: [ published ]
+   workflow_dispatch:
+
+ env:
+   REGISTRY: docker.io
+   IMAGE_NAME: zylonai/private-gpt
+   platforms: linux/amd64,linux/arm64
+   DEFAULT_TYPE: "ollama"
+
+ jobs:
+   build-and-push-image:
+     runs-on: ubuntu-latest
+
+     strategy:
+       matrix:
+         type: [ llamacpp-cpu, ollama ]
+
+     permissions:
+       contents: read
+       packages: write
+
+     outputs:
+       version: ${{ steps.version.outputs.version }}
+
+     steps:
+       - name: Free Disk Space (Ubuntu)
+         uses: jlumbroso/free-disk-space@main
+         with:
+           tool-cache: false
+           android: true
+           dotnet: true
+           haskell: true
+           large-packages: true
+           docker-images: false
+           swap-storage: true
+
+       - name: Checkout repository
+         uses: actions/checkout@v4
+
+       - name: Set up QEMU
+         uses: docker/setup-qemu-action@v3
+
+       - name: Set up Docker Buildx
+         uses: docker/setup-buildx-action@v3
+
+       - name: Log in to Docker Hub
+         uses: docker/login-action@v3
+         with:
+           username: ${{ secrets.DOCKER_USERNAME }}
+           password: ${{ secrets.DOCKER_PASSWORD }}
+
+       - name: Extract metadata (tags, labels) for Docker
+         id: meta
+         uses: docker/metadata-action@v5
+         with:
+           images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
+           tags: |
+             type=semver,pattern={{version}},enable=${{ matrix.type == env.DEFAULT_TYPE }}
+             type=semver,pattern={{version}}-${{ matrix.type }}
+             type=semver,pattern={{major}}.{{minor}},enable=${{ matrix.type == env.DEFAULT_TYPE }}
+             type=semver,pattern={{major}}.{{minor}}-${{ matrix.type }}
+             type=raw,value=latest,enable=${{ matrix.type == env.DEFAULT_TYPE }}
+             type=sha
+           flavor: |
+             latest=false
+
+       - name: Build and push Docker image
+         uses: docker/build-push-action@v6
+         with:
+           context: .
+           file: Dockerfile.${{ matrix.type }}
+           platforms: ${{ env.platforms }}
+           push: true
+           tags: ${{ steps.meta.outputs.tags }}
+           labels: ${{ steps.meta.outputs.labels }}
+
+       - name: Version output
+         id: version
+         run: echo "version=${{ steps.meta.outputs.version }}" >> "$GITHUB_OUTPUT"
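To make the tagging rules concrete: for a hypothetical v0.6.2 release, the `ollama` matrix leg (the DEFAULT_TYPE) should produce roughly the tags below, while the `llamacpp-cpu` leg gets only the `-llamacpp-cpu` suffixed variants plus a sha tag:

    zylonai/private-gpt:0.6.2              # unsuffixed tags: default type only
    zylonai/private-gpt:0.6.2-ollama
    zylonai/private-gpt:0.6                # default type only
    zylonai/private-gpt:0.6-ollama
    zylonai/private-gpt:latest             # default type only
    zylonai/private-gpt:sha-<short-sha>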
.github/workflows/preview-docs.yml ADDED
@@ -0,0 +1,54 @@
+ name: deploy preview docs
+
+ on:
+   pull_request_target:
+     branches:
+       - main
+     paths:
+       - "fern/**"
+
+ jobs:
+   preview-docs:
+     runs-on: ubuntu-latest
+
+     permissions:
+       contents: read
+       pull-requests: write
+
+     steps:
+       - name: Checkout repository
+         uses: actions/checkout@v4
+         with:
+           ref: refs/pull/${{ github.event.pull_request.number }}/merge
+
+       - name: Setup Node.js
+         uses: actions/setup-node@v4
+         with:
+           node-version: "18"
+
+       - name: Install Fern
+         run: npm install -g fern-api
+
+       - name: Generate Documentation Preview with Fern
+         id: generate_docs
+         env:
+           FERN_TOKEN: ${{ secrets.FERN_TOKEN }}
+         run: |
+           output=$(fern generate --docs --preview --log-level debug)
+           echo "$output"
+           # Extract the preview URL from Fern's output
+           preview_url=$(echo "$output" | grep -oP '(?<=Published docs to )https://[^\s]*')
+           # Set the step output ("::set-output" is deprecated and disabled on
+           # current runners; $GITHUB_OUTPUT is the supported mechanism)
+           echo "preview_url=$preview_url" >> "$GITHUB_OUTPUT"
+       - name: Comment PR with URL using github-actions bot
+         uses: actions/github-script@v7
+         if: ${{ steps.generate_docs.outputs.preview_url }}
+         with:
+           script: |
+             const preview_url = '${{ steps.generate_docs.outputs.preview_url }}';
+             github.rest.issues.createComment({
+               issue_number: context.issue.number,
+               owner: context.repo.owner,
+               repo: context.repo.repo,
+               body: `Published docs preview URL: ${preview_url}`
+             })
.github/workflows/publish-docs.yml ADDED
@@ -0,0 +1,26 @@
+ name: publish docs
+
+ on:
+   push:
+     branches:
+       - main
+     paths:
+       - "fern/**"
+
+ jobs:
+   publish-docs:
+     runs-on: ubuntu-latest
+     steps:
+       - name: Checkout repo
+         uses: actions/checkout@v4
+
+       - name: Setup node
+         uses: actions/setup-node@v3
+
+       - name: Download Fern
+         run: npm install -g fern-api
+
+       - name: Generate and Publish Docs
+         env:
+           FERN_TOKEN: ${{ secrets.FERN_TOKEN }}
+         run: fern generate --docs --log-level debug
.github/workflows/release-please.yml ADDED
@@ -0,0 +1,20 @@
+ name: release-please
+
+ on:
+   push:
+     branches:
+       - main
+
+ permissions:
+   contents: write
+   pull-requests: write
+
+ jobs:
+   release-please:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: google-github-actions/release-please-action@v4
+         id: release
+         with:
+           config-file: .github/release_please/.release-please-config.json
+           manifest-file: .github/release_please/.release-please-manifest.json
.github/workflows/stale.yml ADDED
@@ -0,0 +1,30 @@
+ # This workflow warns and then closes issues and PRs that have had no activity for a specified amount of time.
+ #
+ # You can adjust the behavior by modifying this file.
+ # For more information, see:
+ # https://github.com/actions/stale
+ name: Mark stale issues and pull requests
+
+ on:
+   schedule:
+     - cron: '42 5 * * *'
+
+ jobs:
+   stale:
+
+     runs-on: ubuntu-latest
+     permissions:
+       issues: write
+       pull-requests: write
+
+     steps:
+       - uses: actions/stale@v8
+         with:
+           repo-token: ${{ secrets.GITHUB_TOKEN }}
+           days-before-stale: 15
+           stale-issue-message: 'Stale issue'
+           stale-pr-message: 'Stale pull request'
+           stale-issue-label: 'stale'
+           stale-pr-label: 'stale'
+           exempt-issue-labels: 'autorelease: pending'
+           exempt-pr-labels: 'autorelease: pending'
.github/workflows/tests.yml ADDED
@@ -0,0 +1,67 @@
+ name: tests
+
+ on:
+   push:
+     branches:
+       - main
+   pull_request:
+
+ concurrency:
+   group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.head_ref || github.ref }}
+   cancel-in-progress: ${{ github.event_name == 'pull_request' }}
+
+ jobs:
+   setup:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+       - uses: ./.github/workflows/actions/install_dependencies
+
+   checks:
+     needs: setup
+     runs-on: ubuntu-latest
+     name: ${{ matrix.quality-command }}
+     strategy:
+       matrix:
+         quality-command:
+           - black
+           - ruff
+           - mypy
+     steps:
+       - uses: actions/checkout@v4
+       - uses: ./.github/workflows/actions/install_dependencies
+       - name: run ${{ matrix.quality-command }}
+         run: make ${{ matrix.quality-command }}
+
+   test:
+     needs: setup
+     runs-on: ubuntu-latest
+     name: test
+     steps:
+       - uses: actions/checkout@v4
+       - uses: ./.github/workflows/actions/install_dependencies
+       - name: run test
+         run: make test-coverage
+       # Run even if make test fails for coverage reports
+       # TODO: select a better xml results displayer
+       - name: Archive test results
+         uses: actions/upload-artifact@v3
+         if: always()
+         with:
+           name: test_results
+           path: tests-results.xml
+       - name: Archive code coverage results
+         uses: actions/upload-artifact@v3
+         if: always()
+         with:
+           name: code-coverage-report
+           path: htmlcov/
+
+   all_checks_passed:
+     # Used to easily force requirements checks in GitHub
+     needs:
+       - checks
+       - test
+     runs-on: ubuntu-latest
+     steps:
+       - run: echo "All checks passed"
.gitignore ADDED
@@ -0,0 +1,31 @@
+ .venv
+ .env
+ venv
+
+ settings-me.yaml
+
+ .ruff_cache
+ .pytest_cache
+ .mypy_cache
+
+ # byte-compiled / optimized / DLL files
+ __pycache__/
+ *.py[cod]
+
+ # unit tests / coverage reports
+ /tests-results.xml
+ /.coverage
+ /coverage.xml
+ /htmlcov/
+
+ # pyenv
+ /.python-version
+
+ # IDE
+ .idea/
+ .vscode/
+ /.run/
+ .fleet/
+
+ # macOS
+ .DS_Store
.pre-commit-config.yaml ADDED
@@ -0,0 +1,43 @@
+ default_install_hook_types:
+   # Mandatory to install both pre-commit and pre-push hooks (see https://pre-commit.com/#top_level-default_install_hook_types)
+   # Add new hook types here to ensure automatic installation when running `pre-commit install`
+   - pre-commit
+   - pre-push
+ repos:
+   - repo: https://github.com/pre-commit/pre-commit-hooks
+     rev: v4.3.0
+     hooks:
+       - id: trailing-whitespace
+       - id: end-of-file-fixer
+       - id: check-yaml
+       - id: check-json
+       - id: check-added-large-files
+
+   - repo: local
+     hooks:
+       - id: black
+         name: Formatting (black)
+         entry: black
+         language: system
+         types: [python]
+         stages: [commit]
+       - id: ruff
+         name: Linter (ruff)
+         entry: ruff
+         language: system
+         types: [python]
+         stages: [commit]
+       - id: mypy
+         name: Type checking (mypy)
+         entry: make mypy
+         pass_filenames: false
+         language: system
+         types: [python]
+         stages: [commit]
+       - id: test
+         name: Unit tests (pytest)
+         entry: make test
+         pass_filenames: false
+         language: system
+         types: [python]
+         stages: [push]
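Given `default_install_hook_types` above, a single `pre-commit install` registers both the commit-stage hooks (black, ruff, mypy) and the push-stage test hook. A typical local setup might be:

    pip install pre-commit       # assumes pre-commit is not already available
    pre-commit install           # installs both hook types listed above
    pre-commit run --all-files   # optional: run every commit-stage hook once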
CHANGELOG.md ADDED
@@ -0,0 +1,173 @@
+ # Changelog
+
+ ## [0.6.2](https://github.com/zylon-ai/private-gpt/compare/v0.6.1...v0.6.2) (2024-08-08)
+
+
+ ### Bug Fixes
+
+ * add numpy issue to troubleshooting ([#2048](https://github.com/zylon-ai/private-gpt/issues/2048)) ([4ca6d0c](https://github.com/zylon-ai/private-gpt/commit/4ca6d0cb556be7a598f7d3e3b00d2a29214ee1e8))
+ * auto-update version ([#2052](https://github.com/zylon-ai/private-gpt/issues/2052)) ([7fefe40](https://github.com/zylon-ai/private-gpt/commit/7fefe408b4267684c6e3c1a43c5dc2b73ec61fe4))
+ * publish image name ([#2043](https://github.com/zylon-ai/private-gpt/issues/2043)) ([b1acf9d](https://github.com/zylon-ai/private-gpt/commit/b1acf9dc2cbca2047cd0087f13254ff5cda6e570))
+ * update matplotlib to 3.9.1-post1 to fix win install ([b16abbe](https://github.com/zylon-ai/private-gpt/commit/b16abbefe49527ac038d235659854b98345d5387))
+
+ ## [0.6.1](https://github.com/zylon-ai/private-gpt/compare/v0.6.0...v0.6.1) (2024-08-05)
+
+
+ ### Bug Fixes
+
+ * add built image from DockerHub ([#2042](https://github.com/zylon-ai/private-gpt/issues/2042)) ([f09f6dd](https://github.com/zylon-ai/private-gpt/commit/f09f6dd2553077d4566dbe6b48a450e05c2f049e))
+ * Adding azopenai to model list ([#2035](https://github.com/zylon-ai/private-gpt/issues/2035)) ([1c665f7](https://github.com/zylon-ai/private-gpt/commit/1c665f7900658144f62814b51f6e3434a6d7377f))
+ * **deploy:** generate docker release when new version is released ([#2038](https://github.com/zylon-ai/private-gpt/issues/2038)) ([1d4c14d](https://github.com/zylon-ai/private-gpt/commit/1d4c14d7a3c383c874b323d934be01afbaca899e))
+ * **deploy:** improve Docker-Compose and quickstart on Docker ([#2037](https://github.com/zylon-ai/private-gpt/issues/2037)) ([dae0727](https://github.com/zylon-ai/private-gpt/commit/dae0727a1b4abd35d2b0851fe30e0a4ed67e0fbb))
+
+ ## [0.6.0](https://github.com/zylon-ai/private-gpt/compare/v0.5.0...v0.6.0) (2024-08-02)
+
+
+ ### Features
+
+ * bump dependencies ([#1987](https://github.com/zylon-ai/private-gpt/issues/1987)) ([b687dc8](https://github.com/zylon-ai/private-gpt/commit/b687dc852413404c52d26dcb94536351a63b169d))
+ * **docs:** add privategpt-ts sdk ([#1924](https://github.com/zylon-ai/private-gpt/issues/1924)) ([d13029a](https://github.com/zylon-ai/private-gpt/commit/d13029a046f6e19e8ee65bef3acd96365c738df2))
+ * **docs:** Fix setup docu ([#1926](https://github.com/zylon-ai/private-gpt/issues/1926)) ([067a5f1](https://github.com/zylon-ai/private-gpt/commit/067a5f144ca6e605c99d7dbe9ca7d8207ac8808d))
+ * **docs:** update doc for ipex-llm ([#1968](https://github.com/zylon-ai/private-gpt/issues/1968)) ([19a7c06](https://github.com/zylon-ai/private-gpt/commit/19a7c065ef7f42b37f289dd28ac945f7afc0e73a))
+ * **docs:** update documentation and fix preview-docs ([#2000](https://github.com/zylon-ai/private-gpt/issues/2000)) ([4523a30](https://github.com/zylon-ai/private-gpt/commit/4523a30c8f004aac7a7ae224671e2c45ec0cb973))
+ * **llm:** add progress bar when ollama is pulling models ([#2031](https://github.com/zylon-ai/private-gpt/issues/2031)) ([cf61bf7](https://github.com/zylon-ai/private-gpt/commit/cf61bf780f8d122e4057d002abf03563bb45614a))
+ * **llm:** autopull ollama models ([#2019](https://github.com/zylon-ai/private-gpt/issues/2019)) ([20bad17](https://github.com/zylon-ai/private-gpt/commit/20bad17c9857809158e689e9671402136c1e3d84))
+ * **llm:** Support for Google Gemini LLMs and Embeddings ([#1965](https://github.com/zylon-ai/private-gpt/issues/1965)) ([fc13368](https://github.com/zylon-ai/private-gpt/commit/fc13368bc72d1f4c27644677431420ed77731c03))
+ * make llama3.1 as default ([#2022](https://github.com/zylon-ai/private-gpt/issues/2022)) ([9027d69](https://github.com/zylon-ai/private-gpt/commit/9027d695c11fbb01e62424b855665de71d513417))
+ * prompt_style applied to all LLMs + extra LLM params. ([#1835](https://github.com/zylon-ai/private-gpt/issues/1835)) ([e21bf20](https://github.com/zylon-ai/private-gpt/commit/e21bf20c10938b24711d9f2c765997f44d7e02a9))
+ * **recipe:** add our first recipe `Summarize` ([#2028](https://github.com/zylon-ai/private-gpt/issues/2028)) ([8119842](https://github.com/zylon-ai/private-gpt/commit/8119842ae6f1f5ecfaf42b06fa0d1ffec675def4))
+ * **vectordb:** Milvus vector db Integration ([#1996](https://github.com/zylon-ai/private-gpt/issues/1996)) ([43cc31f](https://github.com/zylon-ai/private-gpt/commit/43cc31f74015f8d8fcbf7a8ea7d7d9ecc66cf8c9))
+ * **vectorstore:** Add clickhouse support as vectore store ([#1883](https://github.com/zylon-ai/private-gpt/issues/1883)) ([2612928](https://github.com/zylon-ai/private-gpt/commit/26129288394c7483e6fc0496a11dc35679528cc1))
+
+
+ ### Bug Fixes
+
+ * "no such group" error in Dockerfile, added docx2txt and cryptography deps ([#1841](https://github.com/zylon-ai/private-gpt/issues/1841)) ([947e737](https://github.com/zylon-ai/private-gpt/commit/947e737f300adf621d2261d527192f36f3387f8e))
+ * **config:** make tokenizer optional and include a troubleshooting doc ([#1998](https://github.com/zylon-ai/private-gpt/issues/1998)) ([01b7ccd](https://github.com/zylon-ai/private-gpt/commit/01b7ccd0648be032846647c9a184925d3682f612))
+ * **docs:** Fix concepts.mdx referencing to installation page ([#1779](https://github.com/zylon-ai/private-gpt/issues/1779)) ([dde0224](https://github.com/zylon-ai/private-gpt/commit/dde02245bcd51a7ede7b6789c82ae217cac53d92))
+ * **docs:** Update installation.mdx ([#1866](https://github.com/zylon-ai/private-gpt/issues/1866)) ([c1802e7](https://github.com/zylon-ai/private-gpt/commit/c1802e7cf0e56a2603213ec3b6a4af8fadb8a17a))
+ * ffmpy dependency ([#2020](https://github.com/zylon-ai/private-gpt/issues/2020)) ([dabf556](https://github.com/zylon-ai/private-gpt/commit/dabf556dae9cb00fe0262270e5138d982585682e))
+ * light mode ([#2025](https://github.com/zylon-ai/private-gpt/issues/2025)) ([1020cd5](https://github.com/zylon-ai/private-gpt/commit/1020cd53288af71a17882781f392512568f1b846))
+ * **LLM:** mistral ignoring assistant messages ([#1954](https://github.com/zylon-ai/private-gpt/issues/1954)) ([c7212ac](https://github.com/zylon-ai/private-gpt/commit/c7212ac7cc891f9e3c713cc206ae9807c5dfdeb6))
+ * **llm:** special tokens and leading space ([#1831](https://github.com/zylon-ai/private-gpt/issues/1831)) ([347be64](https://github.com/zylon-ai/private-gpt/commit/347be643f7929c56382a77c3f45f0867605e0e0a))
+ * make embedding_api_base match api_base when on docker ([#1859](https://github.com/zylon-ai/private-gpt/issues/1859)) ([2a432bf](https://github.com/zylon-ai/private-gpt/commit/2a432bf9c5582a94eb4052b1e80cabdb118d298e))
+ * nomic embeddings ([#2030](https://github.com/zylon-ai/private-gpt/issues/2030)) ([5465958](https://github.com/zylon-ai/private-gpt/commit/54659588b5b109a3dd17cca835e275240464d275))
+ * prevent to ingest local files (by default) ([#2010](https://github.com/zylon-ai/private-gpt/issues/2010)) ([e54a8fe](https://github.com/zylon-ai/private-gpt/commit/e54a8fe0433252808d0a60f6a08a43c9f5a42f3b))
+ * Replacing unsafe `eval()` with `json.loads()` ([#1890](https://github.com/zylon-ai/private-gpt/issues/1890)) ([9d0d614](https://github.com/zylon-ai/private-gpt/commit/9d0d614706581a8bfa57db45f62f84ab23d26f15))
+ * **settings:** enable cors by default so it will work when using ts sdk (spa) ([#1925](https://github.com/zylon-ai/private-gpt/issues/1925)) ([966af47](https://github.com/zylon-ai/private-gpt/commit/966af4771dbe5cf3fdf554b5fdf8f732407859c4))
+ * **ui:** gradio bug fixes ([#2021](https://github.com/zylon-ai/private-gpt/issues/2021)) ([d4375d0](https://github.com/zylon-ai/private-gpt/commit/d4375d078f18ba53562fd71651159f997fff865f))
+ * unify embedding models ([#2027](https://github.com/zylon-ai/private-gpt/issues/2027)) ([40638a1](https://github.com/zylon-ai/private-gpt/commit/40638a18a5713d60fec8fe52796dcce66d88258c))
+
+ ## [0.5.0](https://github.com/zylon-ai/private-gpt/compare/v0.4.0...v0.5.0) (2024-04-02)
+
+
+ ### Features
+
+ * **code:** improve concat of strings in ui ([#1785](https://github.com/zylon-ai/private-gpt/issues/1785)) ([bac818a](https://github.com/zylon-ai/private-gpt/commit/bac818add51b104cda925b8f1f7b51448e935ca1))
+ * **docker:** set default Docker to use Ollama ([#1812](https://github.com/zylon-ai/private-gpt/issues/1812)) ([f83abff](https://github.com/zylon-ai/private-gpt/commit/f83abff8bc955a6952c92cc7bcb8985fcec93afa))
+ * **docs:** Add guide Llama-CPP Linux AMD GPU support ([#1782](https://github.com/zylon-ai/private-gpt/issues/1782)) ([8a836e4](https://github.com/zylon-ai/private-gpt/commit/8a836e4651543f099c59e2bf497ab8c55a7cd2e5))
+ * **docs:** Feature/upgrade docs ([#1741](https://github.com/zylon-ai/private-gpt/issues/1741)) ([5725181](https://github.com/zylon-ai/private-gpt/commit/572518143ac46532382db70bed6f73b5082302c1))
+ * **docs:** upgrade fern ([#1596](https://github.com/zylon-ai/private-gpt/issues/1596)) ([84ad16a](https://github.com/zylon-ai/private-gpt/commit/84ad16af80191597a953248ce66e963180e8ddec))
+ * **ingest:** Created a faster ingestion mode - pipeline ([#1750](https://github.com/zylon-ai/private-gpt/issues/1750)) ([134fc54](https://github.com/zylon-ai/private-gpt/commit/134fc54d7d636be91680dc531f5cbe2c5892ac56))
+ * **llm - embed:** Add support for Azure OpenAI ([#1698](https://github.com/zylon-ai/private-gpt/issues/1698)) ([1efac6a](https://github.com/zylon-ai/private-gpt/commit/1efac6a3fe19e4d62325e2c2915cd84ea277f04f))
+ * **llm:** adds serveral settings for llamacpp and ollama ([#1703](https://github.com/zylon-ai/private-gpt/issues/1703)) ([02dc83e](https://github.com/zylon-ai/private-gpt/commit/02dc83e8e9f7ada181ff813f25051bbdff7b7c6b))
+ * **llm:** Ollama LLM-Embeddings decouple + longer keep_alive settings ([#1800](https://github.com/zylon-ai/private-gpt/issues/1800)) ([b3b0140](https://github.com/zylon-ai/private-gpt/commit/b3b0140e244e7a313bfaf4ef10eb0f7e4192710e))
+ * **llm:** Ollama timeout setting ([#1773](https://github.com/zylon-ai/private-gpt/issues/1773)) ([6f6c785](https://github.com/zylon-ai/private-gpt/commit/6f6c785dac2bbad37d0b67fda215784298514d39))
+ * **local:** tiktoken cache within repo for offline ([#1467](https://github.com/zylon-ai/private-gpt/issues/1467)) ([821bca3](https://github.com/zylon-ai/private-gpt/commit/821bca32e9ee7c909fd6488445ff6a04463bf91b))
+ * **nodestore:** add Postgres for the doc and index store ([#1706](https://github.com/zylon-ai/private-gpt/issues/1706)) ([68b3a34](https://github.com/zylon-ai/private-gpt/commit/68b3a34b032a08ca073a687d2058f926032495b3))
+ * **rag:** expose similarity_top_k and similarity_score to settings ([#1771](https://github.com/zylon-ai/private-gpt/issues/1771)) ([087cb0b](https://github.com/zylon-ai/private-gpt/commit/087cb0b7b74c3eb80f4f60b47b3a021c81272ae1))
+ * **RAG:** Introduce SentenceTransformer Reranker ([#1810](https://github.com/zylon-ai/private-gpt/issues/1810)) ([83adc12](https://github.com/zylon-ai/private-gpt/commit/83adc12a8ef0fa0c13a0dec084fa596445fc9075))
+ * **scripts:** Wipe qdrant and obtain db Stats command ([#1783](https://github.com/zylon-ai/private-gpt/issues/1783)) ([ea153fb](https://github.com/zylon-ai/private-gpt/commit/ea153fb92f1f61f64c0d04fff0048d4d00b6f8d0))
+ * **ui:** Add Model Information to ChatInterface label ([f0b174c](https://github.com/zylon-ai/private-gpt/commit/f0b174c097c2d5e52deae8ef88de30a0d9013a38))
+ * **ui:** add sources check to not repeat identical sources ([#1705](https://github.com/zylon-ai/private-gpt/issues/1705)) ([290b9fb](https://github.com/zylon-ai/private-gpt/commit/290b9fb084632216300e89bdadbfeb0380724b12))
+ * **UI:** Faster startup and document listing ([#1763](https://github.com/zylon-ai/private-gpt/issues/1763)) ([348df78](https://github.com/zylon-ai/private-gpt/commit/348df781b51606b2f9810bcd46f850e54192fd16))
+ * **ui:** maintain score order when curating sources ([#1643](https://github.com/zylon-ai/private-gpt/issues/1643)) ([410bf7a](https://github.com/zylon-ai/private-gpt/commit/410bf7a71f17e77c4aec723ab80c233b53765964))
+ * unify settings for vector and nodestore connections to PostgreSQL ([#1730](https://github.com/zylon-ai/private-gpt/issues/1730)) ([63de7e4](https://github.com/zylon-ai/private-gpt/commit/63de7e4930ac90dd87620225112a22ffcbbb31ee))
+ * wipe per storage type ([#1772](https://github.com/zylon-ai/private-gpt/issues/1772)) ([c2d6948](https://github.com/zylon-ai/private-gpt/commit/c2d694852b4696834962a42fde047b728722ad74))
+
+
+ ### Bug Fixes
+
+ * **docs:** Minor documentation amendment ([#1739](https://github.com/zylon-ai/private-gpt/issues/1739)) ([258d02d](https://github.com/zylon-ai/private-gpt/commit/258d02d87c5cb81d6c3a6f06aa69339b670dffa9))
+ * Fixed docker-compose ([#1758](https://github.com/zylon-ai/private-gpt/issues/1758)) ([774e256](https://github.com/zylon-ai/private-gpt/commit/774e2560520dc31146561d09a2eb464c68593871))
+ * **ingest:** update script label ([#1770](https://github.com/zylon-ai/private-gpt/issues/1770)) ([7d2de5c](https://github.com/zylon-ai/private-gpt/commit/7d2de5c96fd42e339b26269b3155791311ef1d08))
+ * **settings:** set default tokenizer to avoid running make setup fail ([#1709](https://github.com/zylon-ai/private-gpt/issues/1709)) ([d17c34e](https://github.com/zylon-ai/private-gpt/commit/d17c34e81a84518086b93605b15032e2482377f7))
+
+ ## [0.4.0](https://github.com/imartinez/privateGPT/compare/v0.3.0...v0.4.0) (2024-03-06)
+
+
+ ### Features
+
+ * Upgrade to LlamaIndex to 0.10 ([#1663](https://github.com/imartinez/privateGPT/issues/1663)) ([45f0571](https://github.com/imartinez/privateGPT/commit/45f05711eb71ffccdedb26f37e680ced55795d44))
+ * **Vector:** support pgvector ([#1624](https://github.com/imartinez/privateGPT/issues/1624)) ([cd40e39](https://github.com/imartinez/privateGPT/commit/cd40e3982b780b548b9eea6438c759f1c22743a8))
+
+ ## [0.3.0](https://github.com/imartinez/privateGPT/compare/v0.2.0...v0.3.0) (2024-02-16)
+
+
+ ### Features
+
+ * add mistral + chatml prompts ([#1426](https://github.com/imartinez/privateGPT/issues/1426)) ([e326126](https://github.com/imartinez/privateGPT/commit/e326126d0d4cd7e46a79f080c442c86f6dd4d24b))
+ * Add stream information to generate SDKs ([#1569](https://github.com/imartinez/privateGPT/issues/1569)) ([24fae66](https://github.com/imartinez/privateGPT/commit/24fae660e6913aac6b52745fb2c2fe128ba2eb79))
+ * **API:** Ingest plain text ([#1417](https://github.com/imartinez/privateGPT/issues/1417)) ([6eeb95e](https://github.com/imartinez/privateGPT/commit/6eeb95ec7f17a618aaa47f5034ee5bccae02b667))
+ * **bulk-ingest:** Add --ignored Flag to Exclude Specific Files and Directories During Ingestion ([#1432](https://github.com/imartinez/privateGPT/issues/1432)) ([b178b51](https://github.com/imartinez/privateGPT/commit/b178b514519550e355baf0f4f3f6beb73dca7df2))
+ * **llm:** Add openailike llm mode ([#1447](https://github.com/imartinez/privateGPT/issues/1447)) ([2d27a9f](https://github.com/imartinez/privateGPT/commit/2d27a9f956d672cb1fe715cf0acdd35c37f378a5)), closes [#1424](https://github.com/imartinez/privateGPT/issues/1424)
+ * **llm:** Add support for Ollama LLM ([#1526](https://github.com/imartinez/privateGPT/issues/1526)) ([6bbec79](https://github.com/imartinez/privateGPT/commit/6bbec79583b7f28d9bea4b39c099ebef149db843))
+ * **settings:** Configurable context_window and tokenizer ([#1437](https://github.com/imartinez/privateGPT/issues/1437)) ([4780540](https://github.com/imartinez/privateGPT/commit/47805408703c23f0fd5cab52338142c1886b450b))
+ * **settings:** Update default model to TheBloke/Mistral-7B-Instruct-v0.2-GGUF ([#1415](https://github.com/imartinez/privateGPT/issues/1415)) ([8ec7cf4](https://github.com/imartinez/privateGPT/commit/8ec7cf49f40701a4f2156c48eb2fad9fe6220629))
+ * **ui:** make chat area stretch to fill the screen ([#1397](https://github.com/imartinez/privateGPT/issues/1397)) ([c71ae7c](https://github.com/imartinez/privateGPT/commit/c71ae7cee92463bbc5ea9c434eab9f99166e1363))
+ * **UI:** Select file to Query or Delete + Delete ALL ([#1612](https://github.com/imartinez/privateGPT/issues/1612)) ([aa13afd](https://github.com/imartinez/privateGPT/commit/aa13afde07122f2ddda3942f630e5cadc7e4e1ee))
+
+
+ ### Bug Fixes
+
+ * Adding an LLM param to fix broken generator from llamacpp ([#1519](https://github.com/imartinez/privateGPT/issues/1519)) ([869233f](https://github.com/imartinez/privateGPT/commit/869233f0e4f03dc23e5fae43cf7cb55350afdee9))
+ * **deploy:** fix local and external dockerfiles ([fde2b94](https://github.com/imartinez/privateGPT/commit/fde2b942bc03688701ed563be6d7d597c75e4e4e))
+ * **docker:** docker broken copy ([#1419](https://github.com/imartinez/privateGPT/issues/1419)) ([059f358](https://github.com/imartinez/privateGPT/commit/059f35840adbc3fb93d847d6decf6da32d08670c))
+ * **docs:** Update quickstart doc and set version in pyproject.toml to 0.2.0 ([0a89d76](https://github.com/imartinez/privateGPT/commit/0a89d76cc5ed4371ffe8068858f23dfbb5e8cc37))
+ * minor bug in chat stream output - python error being serialized ([#1449](https://github.com/imartinez/privateGPT/issues/1449)) ([6191bcd](https://github.com/imartinez/privateGPT/commit/6191bcdbd6e92b6f4d5995967dc196c9348c5954))
+ * **settings:** correct yaml multiline string ([#1403](https://github.com/imartinez/privateGPT/issues/1403)) ([2564f8d](https://github.com/imartinez/privateGPT/commit/2564f8d2bb8c4332a6a0ab6d722a2ac15006b85f))
+ * **tests:** load the test settings only when running tests ([d3acd85](https://github.com/imartinez/privateGPT/commit/d3acd85fe34030f8cfd7daf50b30c534087bdf2b))
+ * **UI:** Updated ui.py. Frees up the CPU to not be bottlenecked. ([24fb80c](https://github.com/imartinez/privateGPT/commit/24fb80ca38f21910fe4fd81505d14960e9ed4faa))
+
+ ## [0.2.0](https://github.com/imartinez/privateGPT/compare/v0.1.0...v0.2.0) (2023-12-10)
+
+
+ ### Features
+
+ * **llm:** drop default_system_prompt ([#1385](https://github.com/imartinez/privateGPT/issues/1385)) ([a3ed14c](https://github.com/imartinez/privateGPT/commit/a3ed14c58f77351dbd5f8f2d7868d1642a44f017))
+ * **ui:** Allows User to Set System Prompt via "Additional Options" in Chat Interface ([#1353](https://github.com/imartinez/privateGPT/issues/1353)) ([145f3ec](https://github.com/imartinez/privateGPT/commit/145f3ec9f41c4def5abf4065a06fb0786e2d992a))
+
+ ## [0.1.0](https://github.com/imartinez/privateGPT/compare/v0.0.2...v0.1.0) (2023-11-30)
+
+
+ ### Features
+
+ * Disable Gradio Analytics ([#1165](https://github.com/imartinez/privateGPT/issues/1165)) ([6583dc8](https://github.com/imartinez/privateGPT/commit/6583dc84c082773443fc3973b1cdf8095fa3fec3))
+ * Drop loguru and use builtin `logging` ([#1133](https://github.com/imartinez/privateGPT/issues/1133)) ([64c5ae2](https://github.com/imartinez/privateGPT/commit/64c5ae214a9520151c9c2d52ece535867d799367))
+ * enable resume download for hf_hub_download ([#1249](https://github.com/imartinez/privateGPT/issues/1249)) ([4197ada](https://github.com/imartinez/privateGPT/commit/4197ada6267c822f32c1d7ba2be6e7ce145a3404))
+ * move torch and transformers to local group ([#1172](https://github.com/imartinez/privateGPT/issues/1172)) ([0d677e1](https://github.com/imartinez/privateGPT/commit/0d677e10b970aec222ec04837d0f08f1631b6d4a))
+ * Qdrant support ([#1228](https://github.com/imartinez/privateGPT/issues/1228)) ([03d1ae6](https://github.com/imartinez/privateGPT/commit/03d1ae6d70dffdd2411f0d4e92f65080fff5a6e2))
+
+
+ ### Bug Fixes
+
+ * Docker and sagemaker setup ([#1118](https://github.com/imartinez/privateGPT/issues/1118)) ([895588b](https://github.com/imartinez/privateGPT/commit/895588b82a06c2bc71a9e22fb840c7f6442a3b5b))
+ * fix pytorch version to avoid wheel bug ([#1123](https://github.com/imartinez/privateGPT/issues/1123)) ([24cfddd](https://github.com/imartinez/privateGPT/commit/24cfddd60f74aadd2dade4c63f6012a2489938a1))
+ * Remove global state ([#1216](https://github.com/imartinez/privateGPT/issues/1216)) ([022bd71](https://github.com/imartinez/privateGPT/commit/022bd718e3dfc197027b1e24fb97e5525b186db4))
+ * sagemaker config and chat methods ([#1142](https://github.com/imartinez/privateGPT/issues/1142)) ([a517a58](https://github.com/imartinez/privateGPT/commit/a517a588c4927aa5c5c2a93e4f82a58f0599d251))
+ * typo in README.md ([#1091](https://github.com/imartinez/privateGPT/issues/1091)) ([ba23443](https://github.com/imartinez/privateGPT/commit/ba23443a70d323cd4f9a242b33fd9dce1bacd2db))
+ * Windows 11 failing to auto-delete tmp file ([#1260](https://github.com/imartinez/privateGPT/issues/1260)) ([0d52002](https://github.com/imartinez/privateGPT/commit/0d520026a3d5b08a9b8487be992d3095b21e710c))
+ * Windows permission error on ingest service tmp files ([#1280](https://github.com/imartinez/privateGPT/issues/1280)) ([f1cbff0](https://github.com/imartinez/privateGPT/commit/f1cbff0fb7059432d9e71473cbdd039032dab60d))
+
+ ## [0.0.2](https://github.com/imartinez/privateGPT/compare/v0.0.1...v0.0.2) (2023-10-20)
+
+
+ ### Bug Fixes
+
+ * chromadb max batch size ([#1087](https://github.com/imartinez/privateGPT/issues/1087)) ([f5a9bf4](https://github.com/imartinez/privateGPT/commit/f5a9bf4e374b2d4c76438cf8a97cccf222ec8e6f))
+
+ ## 0.0.1 (2023-10-20)
+
+ ### Miscellaneous Chores
+
+ * Initial version ([490d93f](https://github.com/imartinez/privateGPT/commit/490d93fdc1977443c92f6c42e57a1c585aa59430))
CITATION.cff ADDED
@@ -0,0 +1,16 @@
+ # This CITATION.cff file was generated with cffinit.
+ # Visit https://bit.ly/cffinit to generate yours today!
+
+ cff-version: 1.2.0
+ title: PrivateGPT
+ message: >-
+   If you use this software, please cite it using the
+   metadata from this file.
+ type: software
+ authors:
+   - name: Zylon by PrivateGPT
+     address: hello@zylon.ai
+     website: 'https://www.zylon.ai/'
+ repository-code: 'https://github.com/zylon-ai/private-gpt'
+ license: Apache-2.0
+ date-released: '2023-05-02'
Dockerfile.llamacpp-cpu ADDED
@@ -0,0 +1,62 @@
+ ### IMPORTANT: THIS IMAGE CAN ONLY BE RUN IN LINUX DOCKER
+ ### You will run into a segfault on macOS
+ FROM python:3.11.6-slim-bookworm AS base
+
+ # Install poetry
+ RUN pip install pipx
+ RUN python3 -m pipx ensurepath
+ RUN pipx install poetry==1.8.3
+ ENV PATH="/root/.local/bin:$PATH"
+ ENV PATH=".venv/bin/:$PATH"
+
+ # Dependencies to build llama-cpp
+ RUN apt update && apt install -y \
+     libopenblas-dev \
+     ninja-build \
+     build-essential \
+     pkg-config \
+     wget
+
+ # https://python-poetry.org/docs/configuration/#virtualenvsin-project
+ ENV POETRY_VIRTUALENVS_IN_PROJECT=true
+
+ FROM base AS dependencies
+ WORKDIR /home/worker/app
+ COPY pyproject.toml poetry.lock ./
+
+ ARG POETRY_EXTRAS="ui embeddings-huggingface llms-llama-cpp vector-stores-qdrant"
+ RUN poetry install --no-root --extras "${POETRY_EXTRAS}"
+
+ FROM base AS app
+
+ ENV PYTHONUNBUFFERED=1
+ ENV PORT=8080
+ ENV APP_ENV=prod
+ ENV PYTHONPATH="$PYTHONPATH:/home/worker/app/private_gpt/"
+ EXPOSE 8080
+
+ # Prepare a non-root user
+ # More info about how to configure UIDs and GIDs in Docker:
+ # https://github.com/systemd/systemd/blob/main/docs/UIDS-GIDS.md
+
+ # Define the User ID (UID) for the non-root user
+ # UID 100 is chosen to avoid conflicts with existing system users
+ ARG UID=100
+
+ # Define the Group ID (GID) for the non-root user
+ # GID 65534 is often used for the 'nogroup' or 'nobody' group
+ ARG GID=65534
+
+ RUN adduser --system --gid ${GID} --uid ${UID} --home /home/worker worker
+ WORKDIR /home/worker/app
+
+ RUN chown worker /home/worker/app
+ RUN mkdir local_data && chown worker local_data
+ RUN mkdir models && chown worker models
+ COPY --chown=worker --from=dependencies /home/worker/app/.venv/ .venv
+ COPY --chown=worker private_gpt/ private_gpt
+ COPY --chown=worker *.yaml ./
+ COPY --chown=worker scripts/ scripts
+
+ USER worker
+ ENTRYPOINT python -m private_gpt
Dockerfile.ollama ADDED
@@ -0,0 +1,51 @@
+ FROM python:3.11.6-slim-bookworm AS base
+
+ # Install poetry
+ RUN pip install pipx
+ RUN python3 -m pipx ensurepath
+ RUN pipx install poetry==1.8.3
+ ENV PATH="/root/.local/bin:$PATH"
+ ENV PATH=".venv/bin/:$PATH"
+
+ # https://python-poetry.org/docs/configuration/#virtualenvsin-project
+ ENV POETRY_VIRTUALENVS_IN_PROJECT=true
+
+ FROM base AS dependencies
+ WORKDIR /home/worker/app
+ COPY pyproject.toml poetry.lock ./
+
+ ARG POETRY_EXTRAS="ui vector-stores-qdrant llms-ollama embeddings-ollama"
+ RUN poetry install --no-root --extras "${POETRY_EXTRAS}"
+
+ FROM base AS app
+ ENV PYTHONUNBUFFERED=1
+ ENV PORT=8080
+ ENV APP_ENV=prod
+ ENV PYTHONPATH="$PYTHONPATH:/home/worker/app/private_gpt/"
+ EXPOSE 8080
+
+ # Prepare a non-root user
+ # More info about how to configure UIDs and GIDs in Docker:
+ # https://github.com/systemd/systemd/blob/main/docs/UIDS-GIDS.md
+
+ # Define the User ID (UID) for the non-root user
+ # UID 100 is chosen to avoid conflicts with existing system users
+ ARG UID=100
+
+ # Define the Group ID (GID) for the non-root user
+ # GID 65534 is often used for the 'nogroup' or 'nobody' group
+ ARG GID=65534
+
+ RUN adduser --system --gid ${GID} --uid ${UID} --home /home/worker worker
+ WORKDIR /home/worker/app
+
+ RUN chown worker /home/worker/app
+ RUN mkdir local_data && chown worker local_data
+ RUN mkdir models && chown worker models
+ COPY --chown=worker --from=dependencies /home/worker/app/.venv/ .venv
+ COPY --chown=worker private_gpt/ private_gpt
+ COPY --chown=worker *.yaml .
+ COPY --chown=worker scripts/ scripts
+
+ USER worker
+ ENTRYPOINT python -m private_gpt
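A minimal sketch of building and running this image by hand; the tag and port mapping are assumptions, and docker-compose.yaml in this commit is the intended entry point. Note the container still needs a reachable Ollama server, such as the backends routed by .docker/router.yml:

    docker build -f Dockerfile.ollama -t private-gpt:ollama .
    docker run --rm -p 8080:8080 private-gpt:ollama   # app listens on PORT=8080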
LICENSE ADDED
@@ -0,0 +1,201 @@
+                                  Apache License
+                            Version 2.0, January 2004
+                         http://www.apache.org/licenses/
+
+    TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+    1. Definitions.
+
+       "License" shall mean the terms and conditions for use, reproduction,
+       and distribution as defined by Sections 1 through 9 of this document.
+
+       "Licensor" shall mean the copyright owner or entity authorized by
+       the copyright owner that is granting the License.
+
+       "Legal Entity" shall mean the union of the acting entity and all
+       other entities that control, are controlled by, or are under common
+       control with that entity. For the purposes of this definition,
+       "control" means (i) the power, direct or indirect, to cause the
+       direction or management of such entity, whether by contract or
+       otherwise, or (ii) ownership of fifty percent (50%) or more of the
+       outstanding shares, or (iii) beneficial ownership of such entity.
+
+       "You" (or "Your") shall mean an individual or Legal Entity
+       exercising permissions granted by this License.
+
+       "Source" form shall mean the preferred form for making modifications,
+       including but not limited to software source code, documentation
+       source, and configuration files.
+
+       "Object" form shall mean any form resulting from mechanical
+       transformation or translation of a Source form, including but
+       not limited to compiled object code, generated documentation,
+       and conversions to other media types.
+
+       "Work" shall mean the work of authorship, whether in Source or
+       Object form, made available under the License, as indicated by a
+       copyright notice that is included in or attached to the work
+       (an example is provided in the Appendix below).
+
+       "Derivative Works" shall mean any work, whether in Source or Object
+       form, that is based on (or derived from) the Work and for which the
+       editorial revisions, annotations, elaborations, or other modifications
+       represent, as a whole, an original work of authorship. For the purposes
+       of this License, Derivative Works shall not include works that remain
+       separable from, or merely link (or bind by name) to the interfaces of,
+       the Work and Derivative Works thereof.
+
+       "Contribution" shall mean any work of authorship, including
+       the original version of the Work and any modifications or additions
+       to that Work or Derivative Works thereof, that is intentionally
+       submitted to Licensor for inclusion in the Work by the copyright owner
+       or by an individual or Legal Entity authorized to submit on behalf of
+       the copyright owner. For the purposes of this definition, "submitted"
+       means any form of electronic, verbal, or written communication sent
+       to the Licensor or its representatives, including but not limited to
+       communication on electronic mailing lists, source code control systems,
+       and issue tracking systems that are managed by, or on behalf of, the
+       Licensor for the purpose of discussing and improving the Work, but
+       excluding communication that is conspicuously marked or otherwise
+       designated in writing by the copyright owner as "Not a Contribution."
+
+       "Contributor" shall mean Licensor and any individual or Legal Entity
+       on behalf of whom a Contribution has been received by Licensor and
+       subsequently incorporated within the Work.
+
+    2. Grant of Copyright License. Subject to the terms and conditions of
+       this License, each Contributor hereby grants to You a perpetual,
+       worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+       copyright license to reproduce, prepare Derivative Works of,
+       publicly display, publicly perform, sublicense, and distribute the
+       Work and such Derivative Works in Source or Object form.
+
+    3. Grant of Patent License. Subject to the terms and conditions of
+       this License, each Contributor hereby grants to You a perpetual,
+       worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+       (except as stated in this section) patent license to make, have made,
+       use, offer to sell, sell, import, and otherwise transfer the Work,
+       where such license applies only to those patent claims licensable
+       by such Contributor that are necessarily infringed by their
+       Contribution(s) alone or by combination of their Contribution(s)
+       with the Work to which such Contribution(s) was submitted. If You
+       institute patent litigation against any entity (including a
+       cross-claim or counterclaim in a lawsuit) alleging that the Work
+       or a Contribution incorporated within the Work constitutes direct
+       or contributory patent infringement, then any patent licenses
+       granted to You under this License for that Work shall terminate
+       as of the date such litigation is filed.
+
+    4. Redistribution. You may reproduce and distribute copies of the
+       Work or Derivative Works thereof in any medium, with or without
+       modifications, and in Source or Object form, provided that You
+       meet the following conditions:
+
+       (a) You must give any other recipients of the Work or
+           Derivative Works a copy of this License; and
+
+       (b) You must cause any modified files to carry prominent notices
+           stating that You changed the files; and
+
+       (c) You must retain, in the Source form of any Derivative Works
+           that You distribute, all copyright, patent, trademark, and
+           attribution notices from the Source form of the Work,
+           excluding those notices that do not pertain to any part of
+           the Derivative Works; and
+
+       (d) If the Work includes a "NOTICE" text file as part of its
+           distribution, then any Derivative Works that You distribute must
+           include a readable copy of the attribution notices contained
+           within such NOTICE file, excluding those notices that do not
+           pertain to any part of the Derivative Works, in at least one
+           of the following places: within a NOTICE text file distributed
+           as part of the Derivative Works; within the Source form or
+           documentation, if provided along with the Derivative Works; or,
+           within a display generated by the Derivative Works, if and
+           wherever such third-party notices normally appear. The contents
+           of the NOTICE file are for informational purposes only and
+           do not modify the License. You may add Your own attribution
+           notices within Derivative Works that You distribute, alongside
+           or as an addendum to the NOTICE text from the Work, provided
+           that such additional attribution notices cannot be construed
+           as modifying the License.
+
+       You may add Your own copyright statement to Your modifications and
+       may provide additional or different license terms and conditions
+       for use, reproduction, or distribution of Your modifications, or
+       for any such Derivative Works as a whole, provided Your use,
+       reproduction, and distribution of the Work otherwise complies with
+       the conditions stated in this License.
+
+    5. Submission of Contributions. Unless You explicitly state otherwise,
+       any Contribution intentionally submitted for inclusion in the Work
+       by You to the Licensor shall be under the terms and conditions of
+       this License, without any additional terms or conditions.
+       Notwithstanding the above, nothing herein shall supersede or modify
+       the terms of any separate license agreement you may have executed
+       with Licensor regarding such Contributions.
+
+    6. Trademarks. This License does not grant permission to use the trade
+       names, trademarks, service marks, or product names of the Licensor,
+       except as required for reasonable and customary use in describing the
+       origin of the Work and reproducing the content of the NOTICE file.
+
+    7. Disclaimer of Warranty. Unless required by applicable law or
+       agreed to in writing, Licensor provides the Work (and each
+       Contributor provides its Contributions) on an "AS IS" BASIS,
+       WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+       implied, including, without limitation, any warranties or conditions
+       of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+       PARTICULAR PURPOSE. You are solely responsible for determining the
+       appropriateness of using or redistributing the Work and assume any
+       risks associated with Your exercise of permissions under this License.
+
+    8. Limitation of Liability. In no event and under no legal theory,
+       whether in tort (including negligence), contract, or otherwise,
+       unless required by applicable law (such as deliberate and grossly
+       negligent acts) or agreed to in writing, shall any Contributor be
+       liable to You for damages, including any direct, indirect, special,
+       incidental, or consequential damages of any character arising as a
+       result of this License or out of the use or inability to use the
+       Work (including but not limited to damages for loss of goodwill,
+       work stoppage, computer failure or malfunction, or any and all
+       other commercial damages or losses), even if such Contributor
+       has been advised of the possibility of such damages.
+
+    9. Accepting Warranty or Additional Liability. While redistributing
+       the Work or Derivative Works thereof, You may choose to offer,
+       and charge a fee for, acceptance of support, warranty, indemnity,
168
+ or other liability obligations and/or rights consistent with this
169
+ License. However, in accepting such obligations, You may act only
170
+ on Your own behalf and on Your sole responsibility, not on behalf
171
+ of any other Contributor, and only if You agree to indemnify,
172
+ defend, and hold each Contributor harmless for any liability
173
+ incurred by, or claims asserted against, such Contributor by reason
174
+ of your accepting any such warranty or additional liability.
175
+
176
+ END OF TERMS AND CONDITIONS
177
+
178
+ APPENDIX: How to apply the Apache License to your work.
179
+
180
+ To apply the Apache License to your work, attach the following
181
+ boilerplate notice, with the fields enclosed by brackets "[]"
182
+ replaced with your own identifying information. (Don't include
183
+ the brackets!) The text should be enclosed in the appropriate
184
+ comment syntax for the file format. We also recommend that a
185
+ file or class name and description of purpose be included on the
186
+ same "printed page" as the copyright notice for easier
187
+ identification within third-party archives.
188
+
189
+ Copyright [yyyy] [name of copyright owner]
190
+
191
+ Licensed under the Apache License, Version 2.0 (the "License");
192
+ you may not use this file except in compliance with the License.
193
+ You may obtain a copy of the License at
194
+
195
+ http://www.apache.org/licenses/LICENSE-2.0
196
+
197
+ Unless required by applicable law or agreed to in writing, software
198
+ distributed under the License is distributed on an "AS IS" BASIS,
199
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200
+ See the License for the specific language governing permissions and
201
+ limitations under the License.
Makefile ADDED
@@ -0,0 +1,78 @@
+ # Any args passed to the make script, use with $(call args, default_value)
+ args = `arg="$(filter-out $@,$(MAKECMDGOALS))" && echo $${arg:-${1}}`
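+ # Example (hypothetical path): `make ingest ./docs` forwards "./docs" to the
+ # ingest target below via $(call args); with no extra goal, the default is used.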
+
+ ########################################################################################################################
+ # Quality checks
+ ########################################################################################################################
+
+ test:
+ 	PYTHONPATH=. poetry run pytest tests
+
+ test-coverage:
+ 	PYTHONPATH=. poetry run pytest tests --cov private_gpt --cov-report term --cov-report=html --cov-report xml --junit-xml=tests-results.xml
+
+ black:
+ 	poetry run black . --check
+
+ ruff:
+ 	poetry run ruff check private_gpt tests
+
+ format:
+ 	poetry run black .
+ 	poetry run ruff check private_gpt tests --fix
+
+ mypy:
+ 	poetry run mypy private_gpt
+
+ check:
+ 	make format
+ 	make mypy
+
+ ########################################################################################################################
+ # Run
+ ########################################################################################################################
+
+ run:
+ 	poetry run python -m private_gpt
+
+ dev-windows:
+ 	(set PGPT_PROFILES=local & poetry run python -m uvicorn private_gpt.main:app --reload --port 8001)
+
+ dev:
+ 	PYTHONUNBUFFERED=1 PGPT_PROFILES=local poetry run python -m uvicorn private_gpt.main:app --reload --port 8001
+
+ ########################################################################################################################
+ # Misc
+ ########################################################################################################################
+
+ api-docs:
+ 	PGPT_PROFILES=mock poetry run python scripts/extract_openapi.py private_gpt.main:app --out fern/openapi/openapi.json
+
+ ingest:
+ 	@poetry run python scripts/ingest_folder.py $(call args)
+
+ stats:
+ 	poetry run python scripts/utils.py stats
+
+ wipe:
+ 	poetry run python scripts/utils.py wipe
+
+ setup:
+ 	poetry run python scripts/setup
+
+ list:
+ 	@echo "Available commands:"
+ 	@echo "  test            : Run tests using pytest"
+ 	@echo "  test-coverage   : Run tests with coverage report"
+ 	@echo "  black           : Check code format with black"
+ 	@echo "  ruff            : Check code with ruff"
+ 	@echo "  format          : Format code with black and ruff"
+ 	@echo "  mypy            : Run mypy for type checking"
+ 	@echo "  check           : Run format and mypy commands"
+ 	@echo "  run             : Run the application"
+ 	@echo "  dev-windows     : Run the application in development mode on Windows"
+ 	@echo "  dev             : Run the application in development mode"
+ 	@echo "  api-docs        : Generate API documentation"
+ 	@echo "  ingest          : Ingest data using specified script"
+ 	@echo "  stats           : Print ingested-data statistics"
+ 	@echo "  wipe            : Wipe data using specified script"
+ 	@echo "  setup           : Setup the application"
README.md ADDED
@@ -0,0 +1,160 @@
+ # PrivateGPT
+
+ <a href="https://trendshift.io/repositories/2601" target="_blank"><img src="https://trendshift.io/api/badge/repositories/2601" alt="imartinez%2FprivateGPT | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
+
+ [![Tests](https://github.com/zylon-ai/private-gpt/actions/workflows/tests.yml/badge.svg)](https://github.com/zylon-ai/private-gpt/actions/workflows/tests.yml?query=branch%3Amain)
+ [![Website](https://img.shields.io/website?up_message=check%20it&down_message=down&url=https%3A%2F%2Fdocs.privategpt.dev%2F&label=Documentation)](https://docs.privategpt.dev/)
+ [![Discord](https://img.shields.io/discord/1164200432894234644?logo=discord&label=PrivateGPT)](https://discord.gg/bK6mRVpErU)
+ [![X (formerly Twitter) Follow](https://img.shields.io/twitter/follow/ZylonPrivateGPT)](https://twitter.com/ZylonPrivateGPT)
+
+ ![Gradio UI](/fern/docs/assets/ui.png?raw=true)
+
+ PrivateGPT is a production-ready AI project that allows you to ask questions about your documents using the power
+ of Large Language Models (LLMs), even in scenarios without an Internet connection. 100% private: no data leaves your
+ execution environment at any point.
+
+ >[!TIP]
+ > If you are looking for an **enterprise-ready, fully private AI workspace**,
+ > check out [Zylon's website](https://zylon.ai) or [request a demo](https://cal.com/zylon/demo?source=pgpt-readme).
+ > Crafted by the team behind PrivateGPT, Zylon is a best-in-class AI collaborative
+ > workspace that can be easily deployed on-premise (data center, bare metal...) or in your private cloud (AWS, GCP, Azure...).
+
+ The project provides an API offering all the primitives required to build private, context-aware AI applications.
+ It follows and extends the [OpenAI API standard](https://openai.com/blog/openai-api),
+ and supports both normal and streaming responses.
+
+ The API is divided into two logical blocks:
+
+ **High-level API**, which abstracts all the complexity of a RAG (Retrieval Augmented Generation)
+ pipeline implementation:
+ - Ingestion of documents: internally managing document parsing,
+ splitting, metadata extraction, embedding generation and storage.
+ - Chat & Completions using context from ingested documents:
+ abstracting the retrieval of context, the prompt engineering and the response generation.
+
+ **Low-level API**, which allows advanced users to implement their own complex pipelines:
+ - Embeddings generation: based on a piece of text.
+ - Contextual chunks retrieval: given a query, returns the most relevant chunks of text from the ingested documents.
+
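+ For illustration, a minimal sketch of calling the high-level API (endpoint path and body assumed from the OpenAI-style scheme; see the [API reference](https://docs.privategpt.dev/) for the authoritative contract):
+
+ ```bash
+ # Ask a question against the ingested documents (server assumed on localhost:8001)
+ curl http://localhost:8001/v1/chat/completions \
+   -H "Content-Type: application/json" \
+   -d '{"messages": [{"role": "user", "content": "Summarize my documents"}], "use_context": true}'
+ ```
+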
+ In addition to this, a working [Gradio UI](https://www.gradio.app/)
+ client is provided to test the API, together with a set of useful tools such as a bulk model
+ download script, an ingestion script, a documents folder watch, etc.
+
+ ## 🎞️ Overview
+ >[!WARNING]
+ > This README is not updated as frequently as the [documentation](https://docs.privategpt.dev/).
+ > Please check it out for the latest updates!
+
+ ### Motivation behind PrivateGPT
+ Generative AI is a game changer for our society, but adoption in companies of all sizes and in data-sensitive
+ domains like healthcare or legal is limited by a clear concern: **privacy**.
+ Not being able to ensure that your data is fully under your control when using third-party AI tools
+ is a risk those industries cannot take.
+
+ ### Primordial version
+ The first version of PrivateGPT was launched in May 2023 as a novel approach to address privacy
+ concerns by using LLMs in a completely offline way.
+
+ That version, which rapidly became a go-to project for privacy-sensitive setups and served as the seed
+ for thousands of local-focused generative AI projects, was the foundation of what PrivateGPT is becoming nowadays:
+ a simpler and more educational implementation for understanding the basic concepts required
+ to build a fully local -and therefore, private- chatGPT-like tool.
+
+ If you want to keep experimenting with it, we have saved it in the
+ [primordial branch](https://github.com/zylon-ai/private-gpt/tree/primordial) of the project.
+
+ > It is strongly recommended to do a clean clone and install of this new version of
+ PrivateGPT if you come from the previous, primordial version.
+
+ ### Present and Future of PrivateGPT
+ PrivateGPT is now evolving towards becoming a gateway to generative AI models and primitives, including
+ completions, document ingestion, RAG pipelines and other low-level building blocks.
+ We want to make it easier for any developer to build AI applications and experiences, as well as provide
+ a suitably extensible architecture for the community to keep contributing.
+
+ Stay tuned to our [releases](https://github.com/zylon-ai/private-gpt/releases) to check out all the new features and changes included.
+
+ ## 📄 Documentation
+ Full documentation on installation, dependencies, configuration, running the server, deployment options,
+ ingesting local documents, API details and UI features can be found here: https://docs.privategpt.dev/
+
+ ## 🧩 Architecture
+ Conceptually, PrivateGPT is an API that wraps a RAG pipeline and exposes its
+ primitives.
+ * The API is built using [FastAPI](https://fastapi.tiangolo.com/) and follows
+ [OpenAI's API scheme](https://platform.openai.com/docs/api-reference).
+ * The RAG pipeline is based on [LlamaIndex](https://www.llamaindex.ai/).
+
+ The design of PrivateGPT makes it easy to extend and adapt both the API and the
+ RAG implementation. Some key architectural decisions are:
+ * Dependency Injection, decoupling the different components and layers.
+ * Usage of LlamaIndex abstractions such as `LLM`, `BaseEmbedding` or `VectorStore`,
+ making it immediate to change the actual implementations of those abstractions.
+ * Simplicity, adding as few layers and new abstractions as possible.
+ * Ready to use, providing a full implementation of the API and RAG
+ pipeline.
+
+ Main building blocks:
+ * APIs are defined in `private_gpt:server:<api>`. Each package contains an
+ `<api>_router.py` (FastAPI layer) and an `<api>_service.py` (the
+ service implementation). Each *Service* uses LlamaIndex base abstractions instead
+ of specific implementations,
+ decoupling the actual implementation from its usage.
+ * Components are placed in
+ `private_gpt:components:<component>`. Each *Component* is in charge of providing
+ actual implementations to the base abstractions used in the Services - for example
+ `LLMComponent` is in charge of providing an actual implementation of an `LLM`
+ (for example `LlamaCPP` or `OpenAI`). A sketch of this pattern is shown below.
+
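+ To make the component pattern concrete, here is a minimal, hypothetical sketch (illustrative names only, not the exact PrivateGPT source; it assumes the `injector` and `llama-index` packages):
+
+ ```python
+ # Hypothetical sketch of the Component/Service split (names are illustrative).
+ from injector import inject, singleton
+ from llama_index.core.llms import LLM, MockLLM
+
+
+ @singleton
+ class LLMComponent:
+     """Owns the concrete LLM implementation behind the LlamaIndex abstraction."""
+
+     def __init__(self) -> None:
+         # In the real project the concrete LLM (LlamaCPP, OpenAI, Ollama, ...)
+         # is chosen from settings; MockLLM keeps this sketch self-contained.
+         self.llm: LLM = MockLLM()
+
+
+ class ChatService:
+     @inject
+     def __init__(self, llm_component: LLMComponent) -> None:
+         # The service depends only on the abstraction, so swapping the
+         # concrete implementation never touches service code.
+         self.llm = llm_component.llm
+ ```
+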
+ ## 💡 Contributing
+ Contributions are welcome! To ensure code quality, we have enabled several format and
+ typing checks; just run `make check` before committing to make sure your code is OK.
+ Remember to test your code! You'll find a tests folder with helpers, and you can run
+ tests using the `make test` command.
+
+ Don't know what to contribute? Here is the public
+ [Project Board](https://github.com/users/imartinez/projects/3) with several ideas.
+
+ Head over to the Discord
+ #contributors channel and ask for write permissions on that GitHub project.
+
+ ## 💬 Community
+ Join the conversation around PrivateGPT on our:
+ - [Twitter (aka X)](https://twitter.com/PrivateGPT_AI)
+ - [Discord](https://discord.gg/bK6mRVpErU)
+
+ ## 📖 Citation
+ If you use PrivateGPT in a paper, check out the [Citation file](CITATION.cff) for the correct citation.
+ You can also use the "Cite this repository" button in this repo to get the citation in different formats.
+
+ Here are a couple of examples:
+
+ #### BibTeX
+ ```bibtex
+ @software{Zylon_PrivateGPT_2023,
+   author = {Zylon by PrivateGPT},
+   license = {Apache-2.0},
+   month = may,
+   title = {{PrivateGPT}},
+   url = {https://github.com/zylon-ai/private-gpt},
+   year = {2023}
+ }
+ ```
+
+ #### APA
+ ```
+ Zylon by PrivateGPT (2023). PrivateGPT [Computer software]. https://github.com/zylon-ai/private-gpt
+ ```
+
+ ## 🤗 Partners & Supporters
+ PrivateGPT is actively supported by the teams behind:
+ * [Qdrant](https://qdrant.tech/), providing the default vector database
+ * [Fern](https://buildwithfern.com/), providing Documentation and SDKs
+ * [LlamaIndex](https://www.llamaindex.ai/), providing the base RAG framework and abstractions
+
+ This project has been strongly influenced and supported by other amazing projects like
+ [LangChain](https://github.com/hwchase17/langchain),
+ [GPT4All](https://github.com/nomic-ai/gpt4all),
+ [LlamaCpp](https://github.com/ggerganov/llama.cpp),
+ [Chroma](https://www.trychroma.com/)
+ and [SentenceTransformers](https://www.sbert.net/).
docker-compose.yaml ADDED
@@ -0,0 +1,116 @@
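+ # Usage sketch (assuming the Docker Compose v2 CLI): `docker compose up` starts the
+ # default profile below (Ollama on CPU, API and UI on http://localhost:8001), while
+ # `docker compose --profile ollama-cuda up` selects the CUDA-enabled Ollama service.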
+ services:
+
+   #-----------------------------------
+   #---- Private-GPT services ---------
+   #-----------------------------------
+
+   # Private-GPT service for the Ollama CPU and GPU modes
+   # This service builds from an external Dockerfile and runs the Ollama mode.
+   private-gpt-ollama:
+     image: ${PGPT_IMAGE:-zylonai/private-gpt}:${PGPT_TAG:-0.6.2}-ollama # x-release-please-version
+     user: root
+     build:
+       context: .
+       dockerfile: Dockerfile.ollama
+     volumes:
+       - ./local_data:/home/worker/app/local_data
+     ports:
+       - "8001:8001"
+     environment:
+       PORT: 8001
+       PGPT_PROFILES: docker
+       PGPT_MODE: ollama
+       PGPT_EMBED_MODE: ollama
+       PGPT_OLLAMA_API_BASE: http://ollama:11434
+       HF_TOKEN: ${HF_TOKEN:-}
+     profiles:
+       - ""
+       - ollama-cpu
+       - ollama-cuda
+       - ollama-api
+     depends_on:
+       ollama:
+         condition: service_healthy
+
+   # Private-GPT service for the local mode
+   # This service builds from a local Dockerfile and runs the application in local mode.
+   private-gpt-llamacpp-cpu:
+     image: ${PGPT_IMAGE:-zylonai/private-gpt}:${PGPT_TAG:-0.6.2}-llamacpp-cpu # x-release-please-version
+     user: root
+     build:
+       context: .
+       dockerfile: Dockerfile.llamacpp-cpu
+     volumes:
+       - ./local_data/:/home/worker/app/local_data
+       - ./models/:/home/worker/app/models
+     entrypoint: sh -c ".venv/bin/python scripts/setup && .venv/bin/python -m private_gpt"
+     ports:
+       - "8001:8001"
+     environment:
+       PORT: 8001
+       PGPT_PROFILES: local
+       HF_TOKEN: ${HF_TOKEN:-}
+     profiles:
+       - llamacpp-cpu
+
+   #-----------------------------------
+   #---- Ollama services --------------
+   #-----------------------------------
+
+   # Traefik reverse proxy for the Ollama service
+   # This will route requests to the Ollama service based on the profile.
+   ollama:
+     image: traefik:v2.10
+     healthcheck:
+       test: ["CMD", "sh", "-c", "wget -q --spider http://ollama:11434 || exit 1"]
+       interval: 10s
+       retries: 3
+       start_period: 5s
+       timeout: 5s
+     ports:
+       - "8080:8080"
+     command:
+       - "--providers.file.filename=/etc/router.yml"
+       - "--log.level=ERROR"
+       - "--api.insecure=true"
+       - "--providers.docker=true"
+       - "--providers.docker.exposedbydefault=false"
+       - "--entrypoints.web.address=:11434"
+     volumes:
+       - /var/run/docker.sock:/var/run/docker.sock:ro
+       - ./.docker/router.yml:/etc/router.yml:ro
+     extra_hosts:
+       - "host.docker.internal:host-gateway"
+     profiles:
+       - ""
+       - ollama-cpu
+       - ollama-cuda
+       - ollama-api
+
+   # Ollama service for the CPU mode
+   ollama-cpu:
+     image: ollama/ollama:latest
+     ports:
+       - "11434:11434"
+     volumes:
+       - ./models:/root/.ollama
+     profiles:
+       - ""
+       - ollama-cpu
+
+   # Ollama service for the CUDA mode
+   ollama-cuda:
+     image: ollama/ollama:latest
+     ports:
+       - "11434:11434"
+     volumes:
+       - ./models:/root/.ollama
+     deploy:
+       resources:
+         reservations:
+           devices:
+             - driver: nvidia
+               count: 1
+               capabilities: [gpu]
+     profiles:
+       - ollama-cuda
fern/README.md ADDED
@@ -0,0 +1,39 @@
+ # Documentation of PrivateGPT
+
+ The documentation of this project is rendered thanks to [Fern](https://github.com/fern-api/fern).
+
+ Fern transforms your `.md` and `.mdx` files into a static website: your documentation.
+
+ The configuration of your documentation is done in the `./docs.yml` file.
+ There, you can configure the navbar, tabs, sections and pages being rendered.
+
+ The documentation of Fern (and the syntax of its `docs.yml` configuration) is
+ available at [docs.buildwithfern.com](https://docs.buildwithfern.com/).
+
+ ## How to run fern
+
+ **You cannot render your documentation locally without fern credentials.**
+
+ To see how your documentation looks, you **have to** use the CI/CD of this
+ repository (by opening a PR, a CI/CD job will be executed, and a preview of
+ your PR's documentation will be deployed to Vercel automatically, through Fern).
+
+ The only thing you can do locally is run `fern check`, which checks the syntax of
+ your `docs.yml` file.
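+
+ For example (assuming the Fern CLI is installed, e.g. via `npm install -g fern-api`):
+
+ ```bash
+ # Validate the docs configuration from the fern/ directory
+ fern check
+ ```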
+
+ ## How to add a new page
+ Add a new `page` entry to `docs.yml`, with the following syntax:
+
+ ```yml
+ navigation:
+   # ...
+   - tab: my-existing-tab
+     layout:
+       # ...
+       - section: My Existing Section
+         contents:
+           # ...
+           - page: My new page display name
+             # The path of the page, relative to `fern/`
+             path: ./docs/pages/my-existing-tab/new-page-content.mdx
+ ```
fern/docs.yml ADDED
@@ -0,0 +1,129 @@
+ # Main Fern configuration file
+ instances:
+   - url: privategpt.docs.buildwithfern.com
+     custom-domain: docs.privategpt.dev
+
+ title: PrivateGPT | Docs
+
+ # The tabs definition, in the top left corner
+ tabs:
+   overview:
+     display-name: Overview
+     icon: "fa-solid fa-home"
+   quickstart:
+     display-name: Quickstart
+     icon: "fa-solid fa-rocket"
+   installation:
+     display-name: Installation
+     icon: "fa-solid fa-download"
+   manual:
+     display-name: Manual
+     icon: "fa-solid fa-book"
+   recipes:
+     display-name: Recipes
+     icon: "fa-solid fa-flask"
+   api-reference:
+     display-name: API Reference
+     icon: "fa-solid fa-file-contract"
+
+ # Definition of each tab's contents, displayed on the left side of the page, below all tabs
+ navigation:
+   # The default tab
+   - tab: overview
+     layout:
+       - section: Welcome
+         contents:
+           - page: Introduction
+             path: ./docs/pages/overview/welcome.mdx
+   - tab: quickstart
+     layout:
+       - section: Getting started
+         contents:
+           - page: Quickstart
+             path: ./docs/pages/quickstart/quickstart.mdx
+   # How to install PrivateGPT, with FAQ and troubleshooting
+   - tab: installation
+     layout:
+       - section: Getting started
+         contents:
+           - page: Main Concepts
+             path: ./docs/pages/installation/concepts.mdx
+           - page: Installation
+             path: ./docs/pages/installation/installation.mdx
+           - page: Troubleshooting
+             path: ./docs/pages/installation/troubleshooting.mdx
+   # Manual of PrivateGPT: how to use it and configure it
+   - tab: manual
+     layout:
+       - section: General configuration
+         contents:
+           - page: Configuration
+             path: ./docs/pages/manual/settings.mdx
+       - section: Document management
+         contents:
+           - page: Ingestion
+             path: ./docs/pages/manual/ingestion.mdx
+           - page: Deletion
+             path: ./docs/pages/manual/ingestion-reset.mdx
+       - section: Storage
+         contents:
+           - page: Vector Stores
+             path: ./docs/pages/manual/vectordb.mdx
+           - page: Node Stores
+             path: ./docs/pages/manual/nodestore.mdx
+       - section: Advanced Setup
+         contents:
+           - page: LLM Backends
+             path: ./docs/pages/manual/llms.mdx
+           - page: Reranking
+             path: ./docs/pages/manual/reranker.mdx
+       - section: User Interface
+         contents:
+           - page: Gradio Manual
+             path: ./docs/pages/ui/gradio.mdx
+           - page: Alternatives
+             path: ./docs/pages/ui/alternatives.mdx
+   - tab: recipes
+     layout:
+       - section: Getting started
+         contents:
+           - page: Quickstart
+             path: ./docs/pages/recipes/quickstart.mdx
+       - section: General use cases
+         contents:
+           - page: Summarize
+             path: ./docs/pages/recipes/summarize.mdx
+   # More advanced usage of PrivateGPT, by API
+   - tab: api-reference
+     layout:
+       - section: Overview
+         contents:
+           - page: API Reference overview
+             path: ./docs/pages/api-reference/api-reference.mdx
+           - page: SDKs
+             path: ./docs/pages/api-reference/sdks.mdx
+       - api: API Reference
+
+ # Definition of the navbar, displayed in the top right corner.
+ # `type: primary` is always displayed at the far right side of the navbar
+ navbar-links:
+   - type: secondary
+     text: Contact us
+     url: "mailto:hello@zylon.ai"
+   - type: github
+     value: "https://github.com/zylon-ai/private-gpt"
+   - type: primary
+     text: Join the Discord
+     url: https://discord.com/invite/bK6mRVpErU
+
+ colors:
+   accentPrimary:
+     dark: "#C6BBFF"
+     light: "#756E98"
+
+ logo:
+   dark: ./docs/assets/logo_light.png
+   light: ./docs/assets/logo_dark.png
+   height: 50
+
+ favicon: ./docs/assets/favicon.ico
fern/docs/assets/favicon.ico ADDED
fern/docs/assets/header.jpeg ADDED
fern/docs/assets/logo_dark.png ADDED
fern/docs/assets/logo_light.png ADDED
fern/docs/pages/api-reference/api-reference.mdx ADDED
@@ -0,0 +1,14 @@
+ # API Reference
+
+ The API is divided into two logical blocks:
+
+ 1. High-level API, abstracting all the complexity of a RAG (Retrieval Augmented Generation) pipeline implementation:
+     - Ingestion of documents: internally managing document parsing, splitting, metadata extraction,
+     embedding generation and storage.
+     - Chat & Completions using context from ingested documents: abstracting the retrieval of context, the prompt
+     engineering and the response generation.
+
+ 2. Low-level API, allowing advanced users to implement their own complex pipelines:
+     - Embeddings generation: based on a piece of text.
+     - Contextual chunks retrieval: given a query, returns the most relevant chunks of text from the ingested
+     documents.
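+
+ For illustration, a minimal low-level call sketch (endpoint path and body assumed from the OpenAI-style scheme; the generated reference in this section is authoritative):
+
+ ```bash
+ # Request embeddings for a piece of text (server assumed on localhost:8001)
+ curl http://localhost:8001/v1/embeddings \
+   -H "Content-Type: application/json" \
+   -d '{"input": "What is PrivateGPT?"}'
+ ```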
fern/docs/pages/api-reference/sdks.mdx ADDED
@@ -0,0 +1,38 @@
+ We use [Fern](https://www.buildwithfern.com) to offer API clients for Node.js, Python, Go, and Java.
+ We recommend using these clients to interact with our endpoints.
+ The clients are kept up to date automatically, so we encourage you to use the latest version.
+
+ ## SDKs
+
+ *Coming soon!*
+
+ <Cards>
+   <Card
+     title="TypeScript"
+     icon="fa-brands fa-node"
+     href="https://github.com/zylon-ai/privategpt-ts"
+   />
+   <Card
+     title="Python"
+     icon="fa-brands fa-python"
+     href="https://github.com/zylon-ai/pgpt-python"
+   />
+   <br />
+ </Cards>
+
+
23
+ <br />
24
+
25
+ <Cards>
26
+ <Card
27
+ title="Java - WIP"
28
+ icon="fa-brands fa-java"
29
+ href="https://github.com/zylon-ai/private-gpt-java"
30
+ />
31
+ <Card
32
+ title="Go - WIP"
33
+ icon="fa-brands fa-golang"
34
+ href="https://github.com/zylon-ai/private-gpt-go"
35
+ />
36
+ </Cards>
37
+
38
+ <br />
fern/docs/pages/installation/concepts.mdx ADDED
@@ -0,0 +1,67 @@
+ PrivateGPT is a service that wraps a set of AI RAG primitives in a comprehensive set of APIs, providing a private, secure, customizable and easy-to-use GenAI development framework.
+
+ It uses FastAPI and LlamaIndex as its core frameworks. Those can be customized by changing the codebase itself.
+
+ It supports a variety of LLM providers, embeddings providers, and vector stores, both local and remote. Those can be easily changed without touching the codebase.
+
+ # Different Setups support
+
+ ## Setup configurations available
+ You get to decide the setup for these 3 main components:
+ - **LLM**: the large language model provider used for inference. It can be local or remote, or even OpenAI.
+ - **Embeddings**: the embeddings provider used to encode the input, the documents and the users' queries. Same as the LLM, it can be local or remote, or even OpenAI.
+ - **Vector store**: the store used to index and retrieve the documents.
+
+ There is an extra component that can be enabled or disabled: the UI. It is a Gradio UI that lets you interact with the API in a more user-friendly way.
+
+ <Callout intent="warning">
+ A working **Gradio UI client** is provided to test the API, together with a set of useful tools such as a bulk
+ model download script, an ingestion script, a documents folder watch, etc. Please refer to the [UI alternatives](/manual/user-interface/alternatives) page for more UI alternatives.
+ </Callout>
+
+ ### Setups and Dependencies
+ Your setup will be the combination of the different options available. You'll find recommended setups in the [installation](./installation) section.
+ PrivateGPT uses Poetry to manage its dependencies. You can install the dependencies for the different setups by running `poetry install --extras "<extra1> <extra2>..."`.
+ Extras are the different options available for each component. For example, to install the dependencies for a local setup with the UI and Qdrant as the vector database, Ollama as the LLM and local embeddings, you would run:
+
+ ```bash
+ poetry install --extras "ui vector-stores-qdrant llms-ollama embeddings-ollama"
+ ```
+
+ Refer to the [installation](./installation) section for more details.
+
+ ### Setups and Configuration
+ PrivateGPT uses yaml to define its configuration in files named `settings-<profile>.yaml`.
+ Different configuration files can be created in the root directory of the project.
+ PrivateGPT will load the configuration at startup from the profile specified in the `PGPT_PROFILES` environment variable.
+ For example, running:
+ ```bash
+ PGPT_PROFILES=ollama make run
+ ```
+ will load the configuration from `settings.yaml` and `settings-ollama.yaml`.
+ - `settings.yaml` is always loaded and contains the default configuration.
+ - `settings-ollama.yaml` is loaded if the `ollama` profile is specified in the `PGPT_PROFILES` environment variable. It can override configuration from the default `settings.yaml`.
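+
+ For example, a minimal sketch of how a profile file overrides a default (keys shown are illustrative; see the Configuration page of the manual for the real schema):
+
+ ```yaml
+ # settings-ollama.yaml: only the keys being overridden need to appear
+ llm:
+   mode: ollama
+ embedding:
+   mode: ollama
+ ```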
+
+ ## About Fully Local Setups
+ In order to run PrivateGPT in a fully local setup, you will need to run the LLM, Embeddings and Vector Store locally.
+
+ ### LLM
+ For a local LLM there are two options:
+ * (Recommended) You can use the 'ollama' option in PrivateGPT, which will connect to your local Ollama instance. Ollama greatly simplifies the installation of local LLMs.
+ * You can use the 'llms-llama-cpp' option in PrivateGPT, which will use LlamaCPP. It works great on Mac with Metal most of the time (it leverages the Metal GPU), but it can be tricky on certain Linux and Windows distributions, depending on the GPU. In the installation document you'll find guides and troubleshooting.
+
+ In order for the LlamaCPP-powered LLM to work (the second option), you need to download the LLM model to the `models` folder. You can do so by running the `setup` script:
+ ```bash
+ poetry run python scripts/setup
+ ```
+ ### Embeddings
+ For local Embeddings there are two options:
+ * (Recommended) You can use the 'ollama' option in PrivateGPT, which will connect to your local Ollama instance. Ollama greatly simplifies the installation of local embedding models as well.
+ * You can use the 'embeddings-huggingface' option in PrivateGPT, which will use HuggingFace.
+
+ In order for HuggingFace embeddings to work (the second option), you need to download the embeddings model to the `models` folder. You can do so by running the `setup` script:
+ ```bash
+ poetry run python scripts/setup
+ ```
+ ### Vector stores
+ The vector stores supported (Qdrant, Milvus, ChromaDB and Postgres) run locally by default.
fern/docs/pages/installation/installation.mdx ADDED
@@ -0,0 +1,433 @@
+ It is important that you review the [Main Concepts](../concepts) section to understand the different components of PrivateGPT and how they interact with each other.
+
+ ## Base requirements to run PrivateGPT
+
+ ### 1. Clone the PrivateGPT Repository
+ Clone the repository and navigate to it:
+ ```bash
+ git clone https://github.com/zylon-ai/private-gpt
+ cd private-gpt
+ ```
+
+ ### 2. Install Python 3.11
+ If you do not have Python 3.11 installed, install it using a Python version manager like `pyenv`. Earlier Python versions are not supported.
+ #### macOS/Linux
+ Install and set Python 3.11 using [pyenv](https://github.com/pyenv/pyenv):
+ ```bash
+ pyenv install 3.11
+ pyenv local 3.11
+ ```
+ #### Windows
+ Install and set Python 3.11 using [pyenv-win](https://github.com/pyenv-win/pyenv-win):
+ ```bash
+ pyenv install 3.11
+ pyenv local 3.11
+ ```
+
+ ### 3. Install `Poetry`
+ Install [Poetry](https://python-poetry.org/docs/#installing-with-the-official-installer) for dependency management,
+ following the instructions on the official Poetry website.
+
+ <Callout intent="warning">
+ A bug exists in Poetry versions 1.7.0 and earlier. We strongly recommend upgrading to a tested version.
+ To upgrade Poetry to the latest tested version, run `poetry self update 1.8.3` after installing it.
+ </Callout>
+
+ ### 4. Optional: Install `make`
+ To run various scripts, you need to install `make`. Follow the instructions for your operating system:
+ #### macOS
+ (Using Homebrew):
+ ```bash
+ brew install make
+ ```
+ #### Windows
+ (Using Chocolatey):
+ ```bash
+ choco install make
+ ```
+
+ ## Install and Run Your Desired Setup
+
+ PrivateGPT allows customization of the setup, from fully local to cloud-based, by deciding the modules to use. To install only the required dependencies, PrivateGPT offers different `extras` that can be combined during the installation process:
+
+ ```bash
+ poetry install --extras "<extra1> <extra2>..."
+ ```
+ Where `<extra>` can be any of the options described below.
+
+ ### Available Modules
+
+ You need to choose one option per category (LLM, Embeddings, Vector Stores, UI). Below are the tables listing the available options for each category.
+
+ #### LLM
+
+ | **Option** | **Description** | **Extra** |
+ |--------------|------------------------------------------------------------------------|---------------------|
+ | **ollama** | Adds support for Ollama LLM, requires Ollama running locally | llms-ollama |
+ | llama-cpp | Adds support for local LLM using LlamaCPP | llms-llama-cpp |
+ | sagemaker | Adds support for Amazon Sagemaker LLM, requires Sagemaker endpoints | llms-sagemaker |
+ | openai | Adds support for OpenAI LLM, requires OpenAI API key | llms-openai |
+ | openailike | Adds support for 3rd party LLM providers compatible with OpenAI's API | llms-openai-like |
+ | azopenai | Adds support for Azure OpenAI LLM, requires Azure endpoints | llms-azopenai |
+ | gemini | Adds support for Gemini LLM, requires Gemini API key | llms-gemini |
+
+ #### Embeddings
+
+ | **Option** | **Description** | **Extra** |
+ |------------------|--------------------------------------------------------------------------------|-------------------------|
+ | **ollama** | Adds support for Ollama Embeddings, requires Ollama running locally | embeddings-ollama |
+ | huggingface | Adds support for local Embeddings using HuggingFace | embeddings-huggingface |
+ | openai | Adds support for OpenAI Embeddings, requires OpenAI API key | embeddings-openai |
+ | sagemaker | Adds support for Amazon Sagemaker Embeddings, requires Sagemaker endpoints | embeddings-sagemaker |
+ | azopenai | Adds support for Azure OpenAI Embeddings, requires Azure endpoints | embeddings-azopenai |
+ | gemini | Adds support for Gemini Embeddings, requires Gemini API key | embeddings-gemini |
+
+ #### Vector Stores
+
+ | **Option** | **Description** | **Extra** |
+ |------------------|-----------------------------------------|-------------------------|
+ | **qdrant** | Adds support for Qdrant vector store | vector-stores-qdrant |
+ | milvus | Adds support for Milvus vector store | vector-stores-milvus |
+ | chroma | Adds support for Chroma DB vector store | vector-stores-chroma |
+ | postgres | Adds support for Postgres vector store | vector-stores-postgres |
+ | clickhouse | Adds support for Clickhouse vector store| vector-stores-clickhouse|
+
+ #### UI
+
+ | **Option** | **Description** | **Extra** |
+ |--------------|------------------------------------------|-----------|
+ | Gradio | Adds support for UI using Gradio | ui |
+
+ <Callout intent="warning">
+ A working **Gradio UI client** is provided to test the API, together with a set of useful tools such as a bulk
+ model download script, an ingestion script, a documents folder watch, etc. Please refer to the [UI alternatives](/manual/user-interface/alternatives) page for more UI alternatives.
+ </Callout>
+
+ ## Recommended Setups
+
+ These are just some examples of recommended setups. You can mix and match the different options to fit your needs.
+ You'll find more information in the Manual section of the documentation.
+
+ > **Important for Windows**: In the examples below showing how to run PrivateGPT with `make run`, the `PGPT_PROFILES` env var is set inline following Unix command-line syntax (works on macOS and Linux).
+ If you are using Windows, you'll need to set the env var in a different way, for example:
+
+ ```powershell
+ # Powershell
+ $env:PGPT_PROFILES="ollama"
+ make run
+ ```
+
+ or
+
+ ```cmd
+ # CMD
+ set PGPT_PROFILES=ollama
+ make run
+ ```
+
+ Refer to the [troubleshooting](./troubleshooting) section for specific issues you might encounter.
+
+ ### Local, Ollama-powered setup - RECOMMENDED
+
+ **The easiest way to run PrivateGPT fully locally** is to depend on Ollama for the LLM. Ollama makes local LLMs and Embeddings super easy to install and use, abstracting the complexity of GPU support. It's the recommended setup for local development.
+
+ Go to [ollama.ai](https://ollama.ai/) and follow the instructions to install Ollama on your machine.
+
+ After the installation, make sure the Ollama desktop app is closed.
+
+ Now, start the Ollama service (it will start a local inference server, serving both the LLM and the Embeddings):
+ ```bash
+ ollama serve
+ ```
+
+ Install the models to be used; the default `settings-ollama.yaml` is configured to use the llama3.1 8b LLM (~4GB) and the nomic-embed-text Embeddings (~275MB).
+
+ By default, PGPT will automatically pull models as needed. This behavior can be changed by modifying the `ollama.autopull_models` property.
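+
+ For instance, a sketch of disabling automatic pulling (the key is the property named above; its placement in `settings-ollama.yaml` is assumed):
+
+ ```yaml
+ ollama:
+   autopull_models: false
+ ```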
+
+ In any case, if you want to manually pull models, run the following commands:
+
+ ```bash
+ ollama pull llama3.1
+ ollama pull nomic-embed-text
+ ```
+
+ Once done, on a different terminal, you can install PrivateGPT with the following command:
+ ```bash
+ poetry install --extras "ui llms-ollama embeddings-ollama vector-stores-qdrant"
+ ```
+
+ Once installed, you can run PrivateGPT. Make sure you have a working Ollama running locally before running the following command.
+
+ ```bash
+ PGPT_PROFILES=ollama make run
+ ```
+
+ PrivateGPT will use the already existing `settings-ollama.yaml` settings file, which is already configured to use Ollama LLM and Embeddings, and Qdrant. Review it and adapt it to your needs (different models, different Ollama port, etc.).
+
+ The UI will be available at http://localhost:8001
+
+ ### Private, Sagemaker-powered setup
+
+ If you need more performance, you can run a version of PrivateGPT that relies on powerful AWS Sagemaker machines to serve the LLM and Embeddings.
+
+ You need to have access to Sagemaker inference endpoints for the LLM and/or the embeddings, and have AWS credentials properly configured.
+
+ Edit the `settings-sagemaker.yaml` file to include the correct Sagemaker endpoints.
+
+ Then, install PrivateGPT with the following command:
+ ```bash
+ poetry install --extras "ui llms-sagemaker embeddings-sagemaker vector-stores-qdrant"
+ ```
+
+ Once installed, you can run PrivateGPT:
+
+ ```bash
+ PGPT_PROFILES=sagemaker make run
+ ```
+
+ PrivateGPT will use the already existing `settings-sagemaker.yaml` settings file, which is already configured to use Sagemaker LLM and Embeddings endpoints, and Qdrant.
+
+ The UI will be available at http://localhost:8001
+
+ ### Non-Private, OpenAI-powered test setup
+
+ If you want to test PrivateGPT with OpenAI's LLM and Embeddings -taking into account your data is going to OpenAI!- you can use the following setup.
+
+ You need an OpenAI API key to run this setup.
+
+ Edit the `settings-openai.yaml` file to include the correct API key. Never commit it! It's a secret! As an alternative to editing `settings-openai.yaml`, you can just set the env var OPENAI_API_KEY.
+
+ Then, install PrivateGPT with the following command:
+ ```bash
+ poetry install --extras "ui llms-openai embeddings-openai vector-stores-qdrant"
+ ```
+
+ Once installed, you can run PrivateGPT:
+
+ ```bash
+ PGPT_PROFILES=openai make run
+ ```
+
+ PrivateGPT will use the already existing `settings-openai.yaml` settings file, which is already configured to use OpenAI LLM and Embeddings endpoints, and Qdrant.
+
+ The UI will be available at http://localhost:8001
+
+ ### Non-Private, Azure OpenAI-powered test setup
+
+ If you want to test PrivateGPT with Azure OpenAI's LLM and Embeddings -taking into account your data is going to Azure OpenAI!- you can use the following setup.
+
+ You need to have access to Azure OpenAI inference endpoints for the LLM and/or the embeddings, and have Azure OpenAI credentials properly configured.
+
+ Edit the `settings-azopenai.yaml` file to include the correct Azure OpenAI endpoints.
+
+ Then, install PrivateGPT with the following command:
+ ```bash
+ poetry install --extras "ui llms-azopenai embeddings-azopenai vector-stores-qdrant"
+ ```
+
+ Once installed, you can run PrivateGPT:
+
+ ```bash
+ PGPT_PROFILES=azopenai make run
+ ```
+
+ PrivateGPT will use the already existing `settings-azopenai.yaml` settings file, which is already configured to use Azure OpenAI LLM and Embeddings endpoints, and Qdrant.
+
+ The UI will be available at http://localhost:8001
+
+ ### Local, Llama-CPP powered setup
+
+ If you want to run PrivateGPT fully locally without relying on Ollama, install it with the following command:
+
+ ```bash
+ poetry install --extras "ui llms-llama-cpp embeddings-huggingface vector-stores-qdrant"
+ ```
+
+ In order for the local LLM and embeddings to work, you need to download the models to the `models` folder. You can do so by running the `setup` script:
+ ```bash
+ poetry run python scripts/setup
+ ```
+
+ Once installed, you can run PrivateGPT with the following command:
+
+ ```bash
+ PGPT_PROFILES=local make run
+ ```
+
+ PrivateGPT will load the already existing `settings-local.yaml` file, which is already configured to use LlamaCPP LLM, HuggingFace embeddings and Qdrant.
+
+ The UI will be available at http://localhost:8001
+
+ #### Llama-CPP support
+
+ For PrivateGPT to run fully locally without Ollama, Llama.cpp is required and in
+ particular [llama-cpp-python](https://github.com/abetlen/llama-cpp-python)
+ is used.
+
+ You'll need to have a valid C++ compiler like gcc installed. See [Troubleshooting: C++ Compiler](#troubleshooting-c-compiler) for more details.
+
+ > It's highly encouraged that you fully read the llama-cpp and llama-cpp-python documentation relevant to your platform.
+ > Running into installation issues is very likely, and you'll need to troubleshoot them yourself.
+
+ ##### Llama-CPP OSX GPU support
+
+ You will need to build [llama.cpp](https://github.com/ggerganov/llama.cpp) with Metal support.
+
+ To do that, you need to install the `llama.cpp` Python binding `llama-cpp-python` through pip, with the compilation flag
+ that activates `METAL`: you have to pass `-DLLAMA_METAL=on` to the CMake command that `pip` runs for you (see below).
+
+ In other words, one should simply run:
+ ```bash
+ CMAKE_ARGS="-DLLAMA_METAL=on" pip install --force-reinstall --no-cache-dir llama-cpp-python
+ ```
+
+ The above command will force the re-installation of `llama-cpp-python` with `METAL` support by compiling
+ `llama.cpp` locally with your `METAL` libraries (shipped by default with your macOS).
+
+ More information is available in the documentation of the libraries themselves:
+ * [llama-cpp-python](https://github.com/abetlen/llama-cpp-python#installation-with-hardware-acceleration)
+ * [llama-cpp-python's documentation](https://llama-cpp-python.readthedocs.io/en/latest/#installation-with-hardware-acceleration)
+ * [llama.cpp](https://github.com/ggerganov/llama.cpp#build)
+
+ ##### Llama-CPP Windows NVIDIA GPU support
+
+ Windows GPU support is done through CUDA.
+ Follow the instructions on the original [llama.cpp](https://github.com/ggerganov/llama.cpp) repo to install the required
+ dependencies.
+
+ Some tips to get it working with an NVIDIA card and CUDA (tested on Windows 10 with CUDA 11.5 and an RTX 3070):
+
+ * Install the latest VS2022 (and build tools): https://visualstudio.microsoft.com/vs/community/
+ * Install the CUDA toolkit: https://developer.nvidia.com/cuda-downloads
+ * Verify your installation is correct by running `nvcc --version` and `nvidia-smi`; ensure your CUDA version is up to
+ date and your GPU is detected.
+ * [Optional] Install CMake to troubleshoot building issues by compiling llama.cpp directly: https://cmake.org/download/
+
+ If you have all required dependencies properly configured, running the
+ following PowerShell command should succeed.
+
+ ```powershell
+ $env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
+ ```
+
+ If your installation was correct, you should see a message similar to the following, including `BLAS = 1`, the next
+ time you start the server. If there is some issue, please refer to the
+ [troubleshooting](/installation/getting-started/troubleshooting#building-llama-cpp-with-nvidia-gpu-support) section.
+
+ ```console
+ llama_new_context_with_model: total VRAM used: 4857.93 MB (model: 4095.05 MB, context: 762.87 MB)
+ AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 |
+ ```
+
+ Note that llama.cpp offloads matrix calculations to the GPU, but performance is
+ still hit heavily due to latency between CPU and GPU communication. You might need to tweak
+ batch sizes and other parameters to get the best performance for your particular system.
+
+ ##### Llama-CPP Linux NVIDIA GPU support and Windows-WSL
+
+ Linux GPU support is done through CUDA.
+ Follow the instructions on the original [llama.cpp](https://github.com/ggerganov/llama.cpp) repo to install the required
+ external dependencies.
+
+ Some tips:
+
+ * Make sure you have an up-to-date C++ compiler
+ * Install the CUDA toolkit: https://developer.nvidia.com/cuda-downloads
+ * Verify your installation is correct by running `nvcc --version` and `nvidia-smi`; ensure your CUDA version is up to
+ date and your GPU is detected.
+
+ After that, running the following command in the repository will install llama.cpp with GPU support:
+
+ ```bash
+ CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
+ ```
+
+ If your installation was correct, you should see a message similar to the following, including `BLAS = 1`, the next
+ time you start the server. If there is some issue, please refer to the
+ [troubleshooting](/installation/getting-started/troubleshooting#building-llama-cpp-with-nvidia-gpu-support) section.
+
+ ```
+ llama_new_context_with_model: total VRAM used: 4857.93 MB (model: 4095.05 MB, context: 762.87 MB)
+ AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 |
+ ```
+
+ ##### Llama-CPP Linux AMD GPU support
+
+ Linux GPU support is done through ROCm.
+ Some tips:
+ * Install ROCm from the [quick-start install guide](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/tutorial/quick-start.html)
+ * [Install PyTorch for ROCm](https://rocm.docs.amd.com/projects/radeon/en/latest/docs/install/install-pytorch.html)
+ ```bash
+ wget https://repo.radeon.com/rocm/manylinux/rocm-rel-6.0/torch-2.1.1%2Brocm6.0-cp311-cp311-linux_x86_64.whl
+ poetry run pip install --force-reinstall --no-cache-dir torch-2.1.1+rocm6.0-cp311-cp311-linux_x86_64.whl
+ ```
+ * Install bitsandbytes for ROCm
+ ```bash
+ PYTORCH_ROCM_ARCH=gfx900,gfx906,gfx908,gfx90a,gfx1030,gfx1100,gfx1101,gfx940,gfx941,gfx942
+ BITSANDBYTES_VERSION=62353b0200b8557026c176e74ac48b84b953a854
+ git clone https://github.com/arlo-phoenix/bitsandbytes-rocm-5.6
+ cd bitsandbytes-rocm-5.6
+ git checkout ${BITSANDBYTES_VERSION}
+ make hip ROCM_TARGET=${PYTORCH_ROCM_ARCH} ROCM_HOME=/opt/rocm/
+ pip install . --extra-index-url https://download.pytorch.org/whl/nightly
+ ```
+
+ After that, running the following command in the repository will install llama.cpp with GPU support (note the quotes around the target list: an unquoted `;` would split the shell assignment):
+ ```bash
+ LLAMA_CPP_PYTHON_VERSION=0.2.56
+ DAMDGPU_TARGETS='gfx900;gfx906;gfx908;gfx90a;gfx1030;gfx1100;gfx1101;gfx940;gfx941;gfx942'
+ CMAKE_ARGS="-DLLAMA_HIPBLAS=ON -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ -DAMDGPU_TARGETS=${DAMDGPU_TARGETS}" poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python==${LLAMA_CPP_PYTHON_VERSION}
+ ```
+
+ If your installation was correct, you should see a message similar to the following, including `BLAS = 1`, the next time you start the server.
+
+ ```
+ AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 |
+ ```
+
+ ##### Llama-CPP Known issues and Troubleshooting
+
+ Execution of LLMs locally still has a lot of sharp edges, especially when running on non-Linux platforms.
+ You might encounter several issues:
+
+ * Performance: RAM or VRAM usage is very high; your computer might experience slowdowns or even crashes.
+ * GPU virtualization on Windows and OSX: simply not possible with Docker Desktop; you have to run the server directly on
+ the host.
+ * Building errors: some of PrivateGPT's dependencies need to build native code, and they might fail on some platforms.
+ Most likely you are missing some dev tools on your machine (updated C++ compiler, CUDA not on PATH, etc.).
+ If you encounter any of these issues, please open an issue and we'll try to help.
+
+ A good first reflex to adopt is: get more information.
+ If, during your installation, something does not go as planned, retry in *verbose* mode, and see what goes wrong.
+
+ For example, when installing packages with `pip install`, you can add the option `-vvv` to show the details of the installation.
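+
+ For instance, re-running the Metal build from above in verbose mode (same flags as the earlier command; `-vvv` simply raises pip's output verbosity):
+
+ ```bash
+ CMAKE_ARGS="-DLLAMA_METAL=on" pip install -vvv --force-reinstall --no-cache-dir llama-cpp-python
+ ```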
+
406
+ ##### Llama-CPP Troubleshooting: C++ Compiler
407
+
408
+ If you encounter an error while building a wheel during the `pip install` process, you may need to install a C++
409
+ compiler on your computer.
410
+
411
+ **For Windows 10/11**
412
+
413
+ To install a C++ compiler on Windows 10/11, follow these steps:
414
+
415
+ 1. Install Visual Studio 2022.
416
+ 2. Make sure the following components are selected:
417
+ * Universal Windows Platform development
418
+ * C++ CMake tools for Windows
419
+ 3. Download the MinGW installer from the [MinGW website](https://sourceforge.net/projects/mingw/).
420
+ 4. Run the installer and select the `gcc` component.
421
+
422
+ **For OSX**
423
+
424
+ 1. Check if you have a C++ compiler installed, `Xcode` should have done it for you. To install Xcode, go to the App
425
+ Store and search for Xcode and install it. **Or** you can install the command line tools by running `xcode-select --install`.
426
+ 2. If not, you can install clang or gcc with homebrew `brew install gcc`
427
+
428
+ ##### Llama-CPP Troubleshooting: Mac Running Intel
429
+
430
+ When running a Mac with Intel hardware (not M1), you may run into _clang: error: the clang compiler does not support '
431
+ -march=native'_ during pip install.
432
+
433
+ If so, set your archflags during pip install, e.g.: _ARCHFLAGS="-arch x86_64" pip3 install -r requirements.txt_
fern/docs/pages/installation/troubleshooting.mdx ADDED
@@ -0,0 +1,64 @@
1
+ # Downloading Gated and Private Models
2
+ Many models are gated or private, requiring special access to use them. Follow these steps to gain access and set up your environment for using these models.
3
+ ## Accessing Gated Models
4
+ 1. **Request Access:**
5
+ Follow the instructions provided [here](https://huggingface.co/docs/hub/en/models-gated) to request access to the gated model.
6
+ 2. **Generate a Token:**
7
+ Once you have access, generate a token by following the instructions [here](https://huggingface.co/docs/hub/en/security-tokens).
8
+ 3. **Set the Token:**
9
+ Add the generated token to your `settings.yaml` file:
10
+ ```yaml
11
+ huggingface:
12
+ access_token: <your-token>
13
+ ```
14
+ Alternatively, set the `HF_TOKEN` environment variable:
15
+ ```bash
16
+ export HF_TOKEN=<your-token>
17
+ ```
18
+
19
+ # Tokenizer Setup
20
+ PrivateGPT uses the `AutoTokenizer` class from Hugging Face's `transformers` library to tokenize input text accurately. It connects to HuggingFace's API to download the appropriate tokenizer for the specified model.
21
+
22
+ ## Configuring the Tokenizer
23
+ 1. **Specify the Model:**
24
+ In your `settings.yaml` file, specify the model you want to use:
25
+ ```yaml
26
+ llm:
27
+ tokenizer: meta-llama/Meta-Llama-3.1-8B-Instruct
28
+ ```
29
+ 2. **Set Access Token for Gated Models:**
30
+ If you are using a gated model, ensure the `access_token` is set as mentioned in the previous section.
31
+ This configuration ensures that PrivateGPT can download and use the correct tokenizer for the model you are working with.
32
+
33
+ # Embedding dimensions mismatch
34
+ If you encounter an error message like `Embedding dimensions mismatch`, it is likely due to a mismatch between the
35
+ embedding model's output and the current vector dimension. To resolve this issue, ensure that the embedding model and the previously stored embeddings use the same vector dimensions.
36
+
37
+ By default, PrivateGPT uses `nomic-embed-text` embeddings, which have a vector dimension of 768.
38
+ If you are using a different embedding model, ensure that the vector dimensions match the model's output.
39
+
40
+ <Callout intent = "warning">
41
+ In versions below 0.6.0, the default embedding model was `BAAI/bge-small-en-v1.5` in the `huggingface` setup.
42
+ If you plan to reuse the old generated embeddings, you need to update the `settings.yaml` file to use the correct embedding model:
43
+ ```yaml
44
+ huggingface:
45
+ embedding_hf_model_name: BAAI/bge-small-en-v1.5
46
+ embedding:
47
+ embed_dim: 384
48
+ ```
49
+ </Callout>
50
+
51
+ # Building Llama-cpp with NVIDIA GPU support
52
+
53
+ ## Out-of-memory error
54
+
55
+ If you encounter an out-of-memory error while running `llama-cpp` with CUDA, you can try the following steps to resolve the issue:
56
+ 1. **Set the following environment variable:**
57
+ ```bash
58
+ TOKENIZERS_PARALLELISM=true
59
+ ```
60
+ 2. **Run PrivateGPT:**
61
+ ```bash
62
+ poetry run python -m private_gpt
63
+ ```
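+
+ Or, as a single command (a minimal sketch, assuming a POSIX shell where the variable can be set inline):
+
+ ```bash
+ TOKENIZERS_PARALLELISM=true poetry run python -m private_gpt
+ ```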
64
+ Thanks to [MarioRossiGithub](https://github.com/MarioRossiGithub) for providing this solution.
fern/docs/pages/manual/ingestion-reset.mdx ADDED
@@ -0,0 +1,14 @@
1
+ # Reset Local documents database
2
+
3
+ When running in a local setup, you can remove all ingested documents by simply
4
+ deleting all contents of the `local_data` folder (except `.gitignore`).
5
+
6
+ To simplify this process, you can use the command:
7
+ ```bash
8
+ make wipe
9
+ ```
10
+
11
+ # Advanced usage
12
+
13
+ You can also delete individual documents from your storage by using the
14
+ `DELETE` endpoint of the Ingestion API.
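+
+ As a minimal sketch (assuming the server runs on `http://localhost:8001`; check the API Reference for the exact routes and identifiers):
+
+ ```bash
+ # List ingested documents to find the id of the document to remove
+ curl http://localhost:8001/v1/ingest/list
+
+ # Delete a single document by its id
+ curl -X DELETE http://localhost:8001/v1/ingest/<doc_id>
+ ```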
fern/docs/pages/manual/ingestion.mdx ADDED
@@ -0,0 +1,137 @@
1
+ # Ingesting & Managing Documents
2
+
3
+ The ingestion of documents can be done in different ways:
4
+
5
+ * Using the `/ingest` API
6
+ * Using the Gradio UI
7
+ * Using the Bulk Local Ingestion functionality (check next section)
8
+
9
+ ## Bulk Local Ingestion
10
+
11
+ You will need to activate `data.local_ingestion.enabled` in your settings file to use this feature. Additionally,
12
+ it is probably a good idea to set `data.local_ingestion.allow_ingest_from` to specify which folders are allowed to be ingested.
13
+
14
+ <Callout intent = "warning">
15
+ Be careful enabling this feature in a production environment: it can be a security risk, since it allows users to
16
+ ingest any local file the server process has permission to read.
17
+ </Callout>
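+
+ As a sketch, the resulting settings might look like this (the folder path is a placeholder, and the exact shape of `allow_ingest_from` is an assumption; check the default `settings.yaml` for reference):
+
+ ```yaml
+ data:
+   local_ingestion:
+     enabled: true
+     allow_ingest_from:
+       - /path/to/your/documents
+ ```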
18
+
19
+ When you are running PrivateGPT in a fully local setup, you can ingest a complete folder for convenience (containing
20
+ pdf, text files, etc.)
21
+ and optionally watch changes on it with the command:
22
+
23
+ ```bash
24
+ make ingest /path/to/folder -- --watch
25
+ ```
26
+
27
+ To log the processed and failed files to an additional file, use:
28
+
29
+ ```bash
30
+ make ingest /path/to/folder -- --watch --log-file /path/to/log/file.log
31
+ ```
32
+
33
+ **Note for Windows Users:** Depending on your Windows version and whether you are using PowerShell to execute
34
+ PrivateGPT API calls, you may need to include the parameter name before passing the folder path for consumption:
35
+
36
+ ```bash
37
+ make ingest arg=/path/to/folder -- --watch --log-file /path/to/log/file.log
38
+ ```
39
+
40
+ After ingestion is complete, you should be able to chat with your documents
41
+ by navigating to http://localhost:8001 and using the option `Query documents`,
42
+ or using the completions / chat API.
43
+
44
+ ## Ingestion troubleshooting
45
+
46
+ ### Running out of memory
47
+
48
+ To avoid running out of memory, you should ingest your documents without the LLM loaded in your (video) memory.
49
+ To do so, you should change your configuration to set `llm.mode: mock`.
50
+
51
+ You can also use the existing `mock` profile (`PGPT_PROFILES=mock`), which will set the following configuration for you:
52
+
53
+ ```yaml
54
+ llm:
55
+ mode: mock
56
+ embedding:
57
+ mode: local
58
+ ```
59
+
60
+ This configuration allows you to use hardware acceleration for creating embeddings while avoiding loading the full LLM into (video) memory.
61
+
62
+ Once your documents are ingested, you can set the `llm.mode` value back to `local` (or your previous custom value).
63
+
64
+ ### Ingestion speed
65
+
66
+ The ingestion speed depends on the number of documents you are ingesting, and the size of each document.
67
+ To speed up the ingestion, you can change the ingestion mode in configuration.
68
+
69
+ The following ingestion modes exist:
70
+ * `simple`: historic behavior, ingest one document at a time, sequentially
71
+ * `batch`: read, parse, and embed multiple documents using batches (batch read, and then batch parse, and then batch embed)
72
+ * `parallel`: read, parse, and embed multiple documents in parallel. This is the fastest ingestion mode for local setup.
73
+ * `pipeline`: Alternative to parallel.
74
+ To change the ingestion mode, you can use the `embedding.ingest_mode` configuration value. The default value is `simple`.
75
+
76
+ To configure the number of workers used for parallel or batched ingestion, you can use
77
+ the `embedding.count_workers` configuration value. If you set this value too high, you might run out of
78
+ memory, so be mindful when setting this value. The default value is `2`.
79
+ For `batch` mode, you can easily set this value to the number of threads available on your CPU without
80
+ running out of memory. For `parallel` mode, you should be more careful, and set this value to a lower value.
81
+
82
+ The configuration below should be enough for users who want to stress their hardware more:
83
+ ```yaml
84
+ embedding:
85
+ ingest_mode: parallel
86
+ count_workers: 4
87
+ ```
88
+
89
+ If your hardware is powerful enough and you are loading heavy documents, you can increase the number of workers.
90
+ It is recommended to do your own tests to find the optimal value for your hardware.
91
+
92
+ If you have a `bash` shell, you can use this set of commands to run your own benchmark:
93
+
94
+ ```bash
95
+ # Wipe your local data, to put yourself in a clean state
96
+ # This will delete all your ingested documents
97
+ make wipe
98
+
99
+ time PGPT_PROFILES=mock python ./scripts/ingest_folder.py ~/my-dir/to-ingest/
100
+ ```
101
+
102
+ ## Supported file formats
103
+
104
+ PrivateGPT by default supports all the file formats that contain clear text (for example, `.txt` files, `.html`, etc.).
105
+ However, these text-based file formats are only treated as plain text files, and are not pre-processed in any other way.
106
+
107
+ It also supports the following file formats:
108
+ * `.hwp`
109
+ * `.pdf`
110
+ * `.docx`
111
+ * `.pptx`
112
+ * `.ppt`
113
+ * `.pptm`
114
+ * `.jpg`
115
+ * `.png`
116
+ * `.jpeg`
117
+ * `.mp3`
118
+ * `.mp4`
119
+ * `.csv`
120
+ * `.epub`
121
+ * `.md`
122
+ * `.mbox`
123
+ * `.ipynb`
124
+ * `.json`
125
+
126
+ <Callout intent = "info">
127
+ While `PrivateGPT` supports these file formats, it **might** require additional
128
+ dependencies to be installed in your Python virtual environment.
129
+ For example, if you try to ingest `.epub` files, `PrivateGPT` might fail to do so, and will instead display an
130
+ explanatory error asking you to download the necessary dependencies to install this file format.
131
+ </Callout>
132
+
133
+ <Callout intent = "info">
134
+ **Other file formats might work**, but they will be considered as plain text
135
+ files (in other words, they will be ingested as `.txt` files).
136
+ </Callout>
137
+
fern/docs/pages/manual/llms.mdx ADDED
@@ -0,0 +1,234 @@
1
+ ## Running the Server
2
+
3
+ PrivateGPT supports running with different LLMs & setups.
4
+
5
+ ### Local models
6
+
7
+ Both the LLM and the Embeddings model will run locally.
8
+
9
+ Make sure you have followed the *Local LLM requirements* section before moving on.
10
+
11
+ This command will start PrivateGPT using the `settings.yaml` (default profile) together with the `settings-local.yaml`
12
+ configuration files. By default, it will enable both the API and the Gradio UI. Run:
13
+
14
+ ```bash
15
+ PGPT_PROFILES=local make run
16
+ ```
17
+
18
+ or
19
+
20
+ ```bash
21
+ PGPT_PROFILES=local poetry run python -m private_gpt
22
+ ```
23
+
24
+ When the server is started it will print a log *Application startup complete*.
25
+ Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:8001/docs (API section) to try the API
26
+ using Swagger UI.
27
+
28
+ #### Customizing low level parameters
29
+
30
+ Currently, not all the parameters of `llama.cpp` and `llama-cpp-python` are exposed in PrivateGPT's `settings.yaml` file.
31
+ In case you need to customize parameters such as the number of layers loaded into the GPU, you can change
32
+ them directly in the `llm_component.py` file at `private_gpt/components/llm/llm_component.py`.
33
+
34
+ ##### Available LLM config options
35
+
36
+ The `llm` section of the settings allows for the following configurations:
37
+
38
+ - `mode`: how to run your LLM
39
+ - `max_new_tokens`: this lets you configure the number of new tokens the LLM will generate and add to the context window (by default Llama.cpp uses `256`)
40
+
41
+ Example:
42
+
43
+ ```yaml
44
+ llm:
45
+ mode: local
46
+ max_new_tokens: 256
47
+ ```
48
+
49
+ If you are getting an out-of-memory error, you might also try a smaller model or stick to the
50
+ recommended models, instead of custom-tuning the parameters.
51
+
52
+ ### Using OpenAI
53
+
54
+ If you cannot run a local model (because you don't have a GPU, for example) or for testing purposes, you may
55
+ decide to run PrivateGPT using OpenAI as the LLM and Embeddings model.
56
+
57
+ In order to do so, create a profile `settings-openai.yaml` with the following contents:
58
+
59
+ ```yaml
60
+ llm:
61
+ mode: openai
62
+
63
+ openai:
64
+ api_base: <openai-api-base-url> # Defaults to https://api.openai.com/v1
65
+ api_key: <your_openai_api_key> # You could skip this configuration and use the OPENAI_API_KEY env var instead
66
+ model: <openai_model_to_use> # Optional model to use. Default is "gpt-3.5-turbo"
67
+ # Note: Open AI Models are listed here: https://platform.openai.com/docs/models
68
+ ```
69
+
70
+ And run PrivateGPT loading that profile you just created:
71
+
72
+ `PGPT_PROFILES=openai make run`
73
+
74
+ or
75
+
76
+ `PGPT_PROFILES=openai poetry run python -m private_gpt`
77
+
78
+ When the server is started it will print a log *Application startup complete*.
79
+ Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:8001/docs (API section) to try the API.
80
+ You'll notice the speed and quality of responses are higher, since you are using OpenAI's servers for the heavy
81
+ computation.
82
+
83
+ ### Using OpenAI compatible API
84
+
85
+ Many tools, including [LocalAI](https://localai.io/) and [vLLM](https://docs.vllm.ai/en/latest/),
86
+ support serving local models with an OpenAI compatible API. Even when overriding the `api_base`,
87
+ using the `openai` mode doesn't allow you to use custom models. Instead, you should use the `openailike` mode:
88
+
89
+ ```yaml
90
+ llm:
91
+ mode: openailike
92
+ ```
93
+
94
+ This mode uses the same settings as the `openai` mode.
95
+
96
+ As an example, you can follow the [vLLM quickstart guide](https://docs.vllm.ai/en/latest/getting_started/quickstart.html#openai-compatible-server)
97
+ to run an OpenAI compatible server. Then, you can run PrivateGPT using the `settings-vllm.yaml` profile:
98
+
99
+ `PGPT_PROFILES=vllm make run`
100
+
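+ As a sketch, a `settings-vllm.yaml` profile might look like this (assuming vLLM serves its OpenAI-compatible API on `localhost:8000`, the default in the vLLM quickstart; the API key value and model name are placeholders):
+
+ ```yaml
+ llm:
+   mode: openailike
+
+ openai:
+   api_base: http://localhost:8000/v1
+   api_key: EMPTY
+   model: <model_served_by_vllm>
+ ```
+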
101
+ ### Using Azure OpenAI
102
+
103
+ If you cannot run a local model (because you don't have a GPU, for example) or for testing purposes, you may
104
+ decide to run PrivateGPT using Azure OpenAI as the LLM and Embeddings model.
105
+
106
+ In order to do so, create a profile `settings-azopenai.yaml` with the following contents:
107
+
108
+ ```yaml
109
+ llm:
110
+ mode: azopenai
111
+
112
+ embedding:
113
+ mode: azopenai
114
+
115
+ azopenai:
116
+ api_key: <your_azopenai_api_key> # You could skip this configuration and use the AZ_OPENAI_API_KEY env var instead
117
+ azure_endpoint: <your_azopenai_endpoint> # You could skip this configuration and use the AZ_OPENAI_ENDPOINT env var instead
118
+ api_version: <api_version> # The API version to use. Default is "2023_05_15"
119
+ embedding_deployment_name: <your_embedding_deployment_name> # You could skip this configuration and use the AZ_OPENAI_EMBEDDING_DEPLOYMENT_NAME env var instead
120
+ embedding_model: <openai_embeddings_to_use> # Optional model to use. Default is "text-embedding-ada-002"
121
+ llm_deployment_name: <your_model_deployment_name> # You could skip this configuration and use the AZ_OPENAI_LLM_DEPLOYMENT_NAME env var instead
122
+ llm_model: <openai_model_to_use> # Optional model to use. Default is "gpt-35-turbo"
123
+ ```
124
+
125
+ And run PrivateGPT loading that profile you just created:
126
+
127
+ `PGPT_PROFILES=azopenai make run`
128
+
129
+ or
130
+
131
+ `PGPT_PROFILES=azopenai poetry run python -m private_gpt`
132
+
133
+ When the server is started it will print a log *Application startup complete*.
134
+ Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:8001/docs (API section) to try the API.
135
+ You'll notice the speed and quality of responses are higher, since you are using Azure OpenAI's servers for the heavy
136
+ computation.
137
+
138
+ ### Using AWS Sagemaker
139
+
140
+ For a fully private & performant setup, you can choose to have both your LLM and Embeddings model deployed using Sagemaker.
141
+
142
+ Note: how to deploy models on Sagemaker is out of the scope of this documentation.
143
+
144
+ In order to do so, create a profile `settings-sagemaker.yaml` with the following contents (remember to
145
+ update the values of `llm_endpoint_name` and `embedding_endpoint_name` to yours):
146
+
147
+ ```yaml
148
+ llm:
149
+ mode: sagemaker
150
+
151
+ sagemaker:
152
+ llm_endpoint_name: huggingface-pytorch-tgi-inference-2023-09-25-19-53-32-140
153
+ embedding_endpoint_name: huggingface-pytorch-inference-2023-11-03-07-41-36-479
154
+ ```
155
+
156
+ And run PrivateGPT loading that profile you just created:
157
+
158
+ `PGPT_PROFILES=sagemaker make run`
159
+
160
+ or
161
+
162
+ `PGPT_PROFILES=sagemaker poetry run python -m private_gpt`
163
+
164
+ When the server is started it will print a log *Application startup complete*.
165
+ Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:8001/docs (API section) to try the API.
166
+
167
+ ### Using Ollama
168
+
169
+ Another option for a fully private setup is using [Ollama](https://ollama.ai/).
170
+
171
+ Note: how to deploy Ollama and pull models onto it is out of the scope of this documentation.
172
+
173
+ In order to do so, create a profile `settings-ollama.yaml` with the following contents:
174
+
175
+ ```yaml
176
+ llm:
177
+ mode: ollama
178
+
179
+ ollama:
180
+ model: <ollama_model_to_use> # Required Model to use.
181
+ # Note: Ollama Models are listed here: https://ollama.ai/library
182
+ # Be sure to pull the model to your Ollama server
183
+ api_base: <ollama-api-base-url> # Defaults to http://localhost:11434
184
+ ```
185
+
186
+ And run PrivateGPT loading that profile you just created:
187
+
188
+ `PGPT_PROFILES=ollama make run`
189
+
190
+ or
191
+
192
+ `PGPT_PROFILES=ollama poetry run python -m private_gpt`
193
+
194
+ When the server is started it will print a log *Application startup complete*.
195
+ Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:8001/docs (API section) to try the API.
196
+
197
+ ### Using IPEX-LLM
198
+
199
+ For a fully private setup on Intel GPUs (such as a local PC with an iGPU, or discrete GPUs like Arc, Flex, and Max), you can use [IPEX-LLM](https://github.com/intel-analytics/ipex-llm).
200
+
201
+ To deploy Ollama and pull models using IPEX-LLM, please refer to [this guide](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/ollama_quickstart.html). Then, follow the same steps outlined in the [Using Ollama](#using-ollama) section to create a `settings-ollama.yaml` profile and run the PrivateGPT server.
202
+
203
+ ### Using Gemini
204
+
205
+ If you cannot run a local model (because you don't have a GPU, for example) or for testing purposes, you may
206
+ decide to run PrivateGPT using Gemini as the LLM and Embeddings model. In addition, you will benefit from
207
+ multimodal inputs, such as text and images, in a very large context window.
208
+
209
+ In order to do so, create a profile `settings-gemini.yaml` with the following contents:
210
+
211
+ ```yaml
212
+ llm:
213
+ mode: gemini
214
+
215
+ embedding:
216
+ mode: gemini
217
+
218
+ gemini:
219
+ api_key: <your_gemini_api_key> # You could skip this configuration and use the GEMINI_API_KEY env var instead
220
+ model: <gemini_model_to_use> # Optional model to use. Default is "models/gemini-pro"
221
+ embedding_model: <gemini_embeddings_to_use> # Optional model to use. Default is "models/embedding-001"
222
+ ```
223
+
224
+ And run PrivateGPT loading that profile you just created:
225
+
226
+ `PGPT_PROFILES=gemini make run`
227
+
228
+ or
229
+
230
+ `PGPT_PROFILES=gemini poetry run python -m private_gpt`
231
+
232
+ When the server is started it will print a log *Application startup complete*.
233
+ Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:8001/docs (API section) to try the API.
234
+
fern/docs/pages/manual/nodestore.mdx ADDED
@@ -0,0 +1,66 @@
1
+ ## NodeStores
2
+ PrivateGPT supports **Simple** and [Postgres](https://www.postgresql.org/) providers. Simple being the default.
3
+
4
+ In order to select one or the other, set the `nodestore.database` property in the `settings.yaml` file to `simple` or `postgres`.
5
+
6
+ ```yaml
7
+ nodestore:
8
+ database: simple
9
+ ```
10
+
11
+ ### Simple Document Store
12
+
13
+ Setting up the simple document store: persist data with in-memory processing backed by disk storage.
14
+
15
+ Enabling the simple document store is an excellent choice for small projects or proofs of concept where you need to persist data while maintaining minimal setup complexity. To get started, set the `nodestore.database` property in your `settings.yaml` file as follows:
16
+
17
+ ```yaml
18
+ nodestore:
19
+ database: simple
20
+ ```
21
+ The beauty of the simple document store is its flexibility and ease of implementation. It provides a solid foundation for managing and retrieving data without the need for complex setup or configuration. The combination of in-memory processing and disk persistence ensures that you can efficiently handle small to medium-sized datasets while maintaining data consistency across runs.
22
+
23
+ ### Postgres Document Store
24
+
25
+ To enable Postgres, set the `nodestore.database` property in the `settings.yaml` file to `postgres` and install the `storage-nodestore-postgres` extra. Note: vector embeddings storage in Postgres is configured separately.
26
+
27
+ ```bash
28
+ poetry install --extras storage-nodestore-postgres
29
+ ```
30
+
31
+ The available configuration options are:
32
+ | Field | Description |
33
+ |---------------|-----------------------------------------------------------|
34
+ | **host** | The server hosting the Postgres database. Default is `localhost` |
35
+ | **port** | The port on which the Postgres database is accessible. Default is `5432` |
36
+ | **database** | The specific database to connect to. Default is `postgres` |
37
+ | **user** | The username for database access. Default is `postgres` |
38
+ | **password** | The password for database access. (Required) |
39
+ | **schema_name** | The database schema to use. Default is `private_gpt` |
40
+
41
+ For example:
42
+ ```yaml
43
+ nodestore:
44
+ database: postgres
45
+
46
+ postgres:
47
+ host: localhost
48
+ port: 5432
49
+ database: postgres
50
+ user: postgres
51
+ password: <PASSWORD>
52
+ schema_name: private_gpt
53
+ ```
54
+
55
+ Given the above configuration, two PostgreSQL tables will be created upon successful connection: one for storing metadata related to the index and another for the document data itself.
56
+
57
+ ```
58
+ postgres=# \dt private_gpt.*
59
+ List of relations
60
+ Schema | Name | Type | Owner
61
+ -------------+-----------------+-------+--------------
62
+ private_gpt | data_docstore | table | postgres
63
+ private_gpt | data_indexstore | table | postgres
64
+
65
+ postgres=#
66
+ ```
fern/docs/pages/manual/reranker.mdx ADDED
@@ -0,0 +1,36 @@
1
+ ## Enhancing Response Quality with Reranking
2
+
3
+ PrivateGPT offers a reranking feature aimed at optimizing response generation by filtering out irrelevant documents, potentially leading to faster response times and enhanced relevance of answers generated by the LLM.
4
+
5
+ ### Enabling Reranking
6
+
7
+ Document reranking can significantly improve the efficiency and quality of the responses by pre-selecting the most relevant documents before generating an answer. To leverage this feature, ensure that it is enabled in the RAG settings and consider adjusting the parameters to best fit your use case.
8
+
9
+ #### Additional Requirements
10
+
11
+ Before enabling reranking, you must install additional dependencies:
12
+
13
+ ```bash
14
+ poetry install --extras rerank-sentence-transformers
15
+ ```
16
+
17
+ This command installs dependencies for the cross-encoder reranker from sentence-transformers, which is currently the only method PrivateGPT supports for document reranking.
18
+
19
+ #### Configuration
20
+
21
+ To enable and configure reranking, adjust the `rag` section within the `settings.yaml` file. Here are the key settings to consider:
22
+
23
+ - `similarity_top_k`: Determines the number of documents to initially retrieve and consider for reranking. This value should be larger than `top_n`.
24
+ - `rerank`:
25
+ - `enabled`: Set to `true` to activate the reranking feature.
26
+ - `top_n`: Specifies the number of documents to use in the final answer generation process, chosen from the top-ranked documents provided by `similarity_top_k`.
27
+
28
+ Example configuration snippet:
29
+
30
+ ```yaml
31
+ rag:
32
+ similarity_top_k: 10 # Number of documents to retrieve and consider for reranking
33
+ rerank:
34
+ enabled: true
35
+ top_n: 3 # Number of top-ranked documents to use for generating the answer
36
+ ```
fern/docs/pages/manual/settings.mdx ADDED
@@ -0,0 +1,85 @@
1
+ # Settings and profiles for your private GPT
2
+
3
+ The configuration of your private GPT server is managed through `settings` files (more precisely, `settings.yaml`).
4
+ These text files are written using the [YAML](https://en.wikipedia.org/wiki/YAML) syntax.
5
+
6
+ While PrivateGPT ships with safe and universal configuration files, you might want to quickly customize your
7
+ PrivateGPT installation, and this can be done using the `settings` files.
8
+
9
+ This project defines the concept of **profiles** (or configuration profiles).
10
+ This mechanism, driven by environment variables, gives you the ability to easily switch between
11
+ configurations you've made.
12
+
13
+ A typical use case for profiles is to easily switch between LLMs and embedding models.
14
+ To be a bit more precise, you can change the language (to French, Spanish, Italian, English, etc.) by simply changing
15
+ the profile you've selected; no code changes required!
16
+
17
+ PrivateGPT is configured through *profiles* that are defined using yaml files, and selected through env variables.
18
+ The full list of properties configurable can be found in `settings.yaml`.
19
+
20
+ ## How to know which profiles exist
21
+ Given that a profile `foo_bar` points to the file `settings-foo_bar.yaml` and vice-versa, you simply have to look
22
+ at the files starting with `settings` and ending in `.yaml`.
23
+
24
+ ## How to use an existing profile
25
+ **Please note that the syntax to set the value of an environment variable depends on your OS.**
26
+ You have to set environment variable `PGPT_PROFILES` to the name of the profile you want to use.
27
+
28
+ For example, on **Linux and macOS**, this gives:
29
+ ```bash
30
+ export PGPT_PROFILES=my_profile_name_here
31
+ ```
32
+
33
+ Windows Command Prompt (cmd) has a different syntax:
34
+ ```shell
35
+ set PGPT_PROFILES=my_profile_name_here
36
+ ```
37
+
38
+ Windows PowerShell has a different syntax:
39
+ ```shell
40
+ $env:PGPT_PROFILES="my_profile_name_here"
41
+ ```
42
+ If the above is not working, you might want to try other ways to set an env variable in your Windows terminal.
43
+
44
+ ---
45
+
46
+ Once you've set this environment variable to the desired profile, you can simply launch your PrivateGPT,
47
+ and it will run using your profile on top of the default configuration.
48
+
49
+ ## Reference
50
+ Additional details on the profiles are described in this section.
51
+
52
+ ### Environment variable `PGPT_SETTINGS_FOLDER`
53
+
54
+ The location of the settings folder. Defaults to the root of the project.
55
+ Should contain the default `settings.yaml` and any other `settings-{profile}.yaml`.
56
+
57
+ ### Environment variable `PGPT_PROFILES`
58
+
59
+ By default, the profile definition in `settings.yaml` is loaded.
60
+ Using this env var you can load additional profiles; format is a comma separated list of profile names.
61
+ This will merge `settings-{profile}.yaml` on top of the base settings file.
62
+
63
+ For example:
64
+ `PGPT_PROFILES=local,cuda` will load `settings-local.yaml`
65
+ and `settings-cuda.yaml`; their contents will be merged, with
66
+ properties from later profiles overriding values of earlier ones, including the base `settings.yaml`.
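+
+ As a hedged illustration of the merge order (the property values here are just examples):
+
+ ```yaml
+ # settings.yaml (base) might contain:
+ llm:
+   mode: mock
+
+ # settings-local.yaml, loaded via PGPT_PROFILES=local, overrides it:
+ llm:
+   mode: local
+ ```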
67
+
68
+ During testing, the `test` profile will be active along with the default; therefore, a `settings-test.yaml`
69
+ file is required.
70
+
71
+ ### Environment variables expansion
72
+
73
+ Configuration files can contain environment variables;
74
+ they will be expanded at runtime.
75
+
76
+ Expansion must follow the pattern `${VARIABLE_NAME:default_value}`.
77
+
78
+ For example, the following configuration will use the value of the `PORT`
79
+ environment variable or `8001` if it's not set.
80
+ Missing variables with no default will produce an error.
81
+
82
+ ```yaml
83
+ server:
84
+ port: ${PORT:8001}
85
+ ```
fern/docs/pages/manual/vectordb.mdx ADDED
@@ -0,0 +1,187 @@
1
+ ## Vectorstores
2
+ PrivateGPT supports [Qdrant](https://qdrant.tech/), [Milvus](https://milvus.io/), [Chroma](https://www.trychroma.com/), [PGVector](https://github.com/pgvector/pgvector) and [ClickHouse](https://github.com/ClickHouse/ClickHouse) as vectorstore providers. Qdrant being the default.
3
+
4
+ In order to select one of them, set the `vectorstore.database` property in the `settings.yaml` file to `qdrant`, `milvus`, `chroma`, `postgres`, or `clickhouse`.
5
+
6
+ ```yaml
7
+ vectorstore:
8
+ database: qdrant
9
+ ```
10
+
11
+ ### Qdrant configuration
12
+
13
+ To enable Qdrant, set the `vectorstore.database` property in the `settings.yaml` file to `qdrant`.
14
+
15
+ Qdrant settings can be configured by setting values to the `qdrant` property in the `settings.yaml` file.
16
+
17
+ The available configuration options are:
18
+ | Field | Description |
19
+ |--------------|-------------|
20
+ | location | If `:memory:` - use in-memory Qdrant instance. If `str` - use it as a `url` parameter.|
21
+ | url | Either host or str of 'Optional[scheme], host, Optional[port], Optional[prefix]'. Eg. `http://localhost:6333` |
22
+ | port | Port of the REST API interface. Default: `6333` |
23
+ | grpc_port | Port of the gRPC interface. Default: `6334` |
24
+ | prefer_grpc | If `true` - use gRPC interface whenever possible in custom methods. |
25
+ | https | If `true` - use HTTPS(SSL) protocol.|
26
+ | api_key | API key for authentication in Qdrant Cloud.|
27
+ | prefix | If set, add `prefix` to the REST URL path. Example: `service/v1` will result in `http://localhost:6333/service/v1/{qdrant-endpoint}` for REST API.|
28
+ | timeout | Timeout for REST and gRPC API requests. Default: 5.0 seconds for REST and unlimited for gRPC |
29
+ | host | Host name of Qdrant service. If url and host are not set, defaults to 'localhost'.|
30
+ | path | Persistence path for QdrantLocal. Eg. `local_data/private_gpt/qdrant`|
31
+ | force_disable_check_same_thread | Force disable check_same_thread for QdrantLocal sqlite connection, defaults to True.|
32
+
33
+ By default, Qdrant tries to connect to an instance of the Qdrant server at `http://localhost:6333` (the default REST port listed above).
34
+
35
+ To obtain a local setup (disk-based database) without running a Qdrant server, configure the `qdrant.path` value in settings.yaml:
36
+
37
+ ```yaml
38
+ qdrant:
39
+ path: local_data/private_gpt/qdrant
40
+ ```
41
+
42
+ ### Milvus configuration
43
+
44
+ To enable Milvus, set the `vectorstore.database` property in the `settings.yaml` file to `milvus` and install the `milvus` extra.
45
+
46
+ ```bash
47
+ poetry install --extras vector-stores-milvus
48
+ ```
49
+
50
+ The available configuration options are:
51
+ | Field | Description |
52
+ |--------------|-------------|
53
+ | uri | Default is "local_data/private_gpt/milvus/milvus_local.db", a local file. You can also point it at a more performant Milvus server running on Docker or Kubernetes, e.g. http://localhost:19530. To use Zilliz Cloud, set the uri and token to the Endpoint and API key from Zilliz Cloud.|
54
+ | token | The token paired with the Milvus server on Docker/Kubernetes, or the Zilliz Cloud API key.|
55
+ | collection_name | The name of the collection. Default is "milvus_db".|
56
+ | overwrite | Overwrite the data in the collection if it already exists. Default is True. |
57
+
58
+ To obtain a local setup (disk-based database) without running a Milvus server, configure the `uri` value in `settings.yaml` to store data in `local_data/private_gpt/milvus/milvus_local.db`.
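+
+ As a sketch, combining the settings above (the values simply restate the defaults from the table):
+
+ ```yaml
+ vectorstore:
+   database: milvus
+
+ milvus:
+   uri: local_data/private_gpt/milvus/milvus_local.db
+   collection_name: milvus_db
+ ```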
59
+
60
+ ### Chroma configuration
61
+
62
+ To enable Chroma, set the `vectorstore.database` property in the `settings.yaml` file to `chroma` and install the `chroma` extra.
63
+
64
+ ```bash
65
+ poetry install --extras chroma
66
+ ```
67
+
68
+ By default, `chroma` will use a disk-based database stored in `local_data_path / "chroma_db"` (with `local_data_path` defined in `settings.yaml`).
69
+
70
+ ### PGVector
71
+ To use the PGVector store, a [PostgreSQL](https://www.postgresql.org/) database with the PGVector extension is required.
72
+
73
+ To enable PGVector, set the `vectorstore.database` property in the `settings.yaml` file to `postgres` and install the `vector-stores-postgres` extra.
74
+
75
+ ```bash
76
+ poetry install --extras vector-stores-postgres
77
+ ```
78
+
79
+ PGVector settings can be configured by setting values to the `postgres` property in the `settings.yaml` file.
80
+
81
+ The available configuration options are:
82
+ | Field | Description |
83
+ |---------------|-----------------------------------------------------------|
84
+ | **host** | The server hosting the Postgres database. Default is `localhost` |
85
+ | **port** | The port on which the Postgres database is accessible. Default is `5432` |
86
+ | **database** | The specific database to connect to. Default is `postgres` |
87
+ | **user** | The username for database access. Default is `postgres` |
88
+ | **password** | The password for database access. (Required) |
89
+ | **schema_name** | The database schema to use. Default is `private_gpt` |
90
+
91
+ For example:
92
+ ```yaml
93
+ vectorstore:
94
+ database: postgres
95
+
96
+ postgres:
97
+ host: localhost
98
+ port: 5432
99
+ database: postgres
100
+ user: postgres
101
+ password: <PASSWORD>
102
+ schema_name: private_gpt
103
+ ```
104
+
105
+ The following table will be created in the database:
106
+ ```
107
+ postgres=# \d private_gpt.data_embeddings
108
+ Table "private_gpt.data_embeddings"
109
+ Column | Type | Collation | Nullable | Default
110
+ -----------+-------------------+-----------+----------+---------------------------------------------------------
111
+ id | bigint | | not null | nextval('private_gpt.data_embeddings_id_seq'::regclass)
112
+ text | character varying | | not null |
113
+ metadata_ | json | | |
114
+ node_id | character varying | | |
115
+ embedding | vector(768) | | |
116
+ Indexes:
117
+ "data_embeddings_pkey" PRIMARY KEY, btree (id)
118
+
119
+ postgres=#
120
+ ```
121
+ The dimensions of the embeddings columns will be set based on the `embedding.embed_dim` value. If the embedding model changes, this table may need to be dropped and recreated to avoid a dimension mismatch.
122
+
123
+ ### ClickHouse
124
+
125
+ To utilize ClickHouse as the vector store, a [ClickHouse](https://github.com/ClickHouse/ClickHouse) database must be employed.
126
+
127
+ To enable ClickHouse, set the `vectorstore.database` property in the `settings.yaml` file to `clickhouse` and install the `vector-stores-clickhouse` extra.
128
+
129
+ ```bash
130
+ poetry install --extras vector-stores-clickhouse
131
+ ```
132
+
133
+ ClickHouse settings can be configured by setting values to the `clickhouse` property in the `settings.yaml` file.
134
+
135
+ The available configuration options are:
136
+ | Field | Description |
137
+ |----------------------|----------------------------------------------------------------|
138
+ | **host** | The server hosting the ClickHouse database. Default is `localhost` |
139
+ | **port** | The port on which the ClickHouse database is accessible. Default is `8123` |
140
+ | **username** | The username for database access. Default is `default` |
141
+ | **password** | The password for database access. (Optional) |
142
+ | **database** | The specific database to connect to. Default is `__default__` |
143
+ | **secure** | Use https/TLS for secure connection to the server. Default is `false` |
144
+ | **interface** | The protocol used for the connection, either 'http' or 'https'. (Optional) |
145
+ | **settings** | Specific ClickHouse server settings to be used with the session. (Optional) |
146
+ | **connect_timeout** | Timeout in seconds for establishing a connection. (Optional) |
147
+ | **send_receive_timeout** | Read timeout in seconds for http connection. (Optional) |
148
+ | **verify** | Verify the server certificate in secure/https mode. (Optional) |
149
+ | **ca_cert** | Path to Certificate Authority root certificate (.pem format). (Optional) |
150
+ | **client_cert** | Path to TLS Client certificate (.pem format). (Optional) |
151
+ | **client_cert_key** | Path to the private key for the TLS Client certificate. (Optional) |
152
+ | **http_proxy** | HTTP proxy address. (Optional) |
153
+ | **https_proxy** | HTTPS proxy address. (Optional) |
154
+ | **server_host_name** | Server host name to be checked against the TLS certificate. (Optional) |
155
+
156
+ For example:
157
+ ```yaml
158
+ vectorstore:
159
+ database: clickhouse
160
+
161
+ clickhouse:
162
+ host: localhost
163
+ port: 8443
164
+ username: admin
165
+ password: <PASSWORD>
166
+ database: embeddings
167
+ secure: false
168
+ ```
169
+
170
+ The following table will be created in the database:
171
+ ```
172
+ clickhouse-client
173
+ :) \d embeddings.llama_index
174
+ Table "llama_index"
175
+ № | name | type | default_type | default_expression | comment | codec_expression | ttl_expression
176
+ ----|-----------|----------------------------------------------|--------------|--------------------|---------|------------------|---------------
177
+ 1 | id | String | | | | |
178
+ 2 | doc_id | String | | | | |
179
+ 3 | text | String | | | | |
180
+ 4 | vector | Array(Float32) | | | | |
181
+ 5 | node_info | Tuple(start Nullable(UInt64), end Nullable(UInt64)) | | | | |
182
+ 6 | metadata | String | | | | |
183
+
184
+ clickhouse-client
185
+ ```
186
+
187
+ The dimensions of the embeddings columns will be set based on the `embedding.embed_dim` value. If the embedding model changes, this table may need to be dropped and recreated to avoid a dimension mismatch.
fern/docs/pages/overview/welcome.mdx ADDED
@@ -0,0 +1,42 @@
1
+ PrivateGPT provides an **API** containing all the building blocks required to
2
+ build **private, context-aware AI applications**.
3
+
4
+ <Callout intent = "tip">
5
+ If you are looking for an **enterprise-ready, fully private AI workspace**
6
+ check out [Zylon's website](https://zylon.ai) or [request a demo](https://cal.com/zylon/demo?source=pgpt-docs).
7
+ Crafted by the team behind PrivateGPT, Zylon is a best-in-class AI collaborative
8
+ workspace that can be easily deployed on-premise (data center, bare metal...) or in your private cloud (AWS, GCP, Azure...).
9
+ </Callout>
10
+
11
+ The API follows and extends the OpenAI API standard, and supports both normal and streaming responses.
12
+ That means that, if you can use OpenAI API in one of your tools, you can use your own PrivateGPT API instead,
13
+ with no code changes, **and for free** if you are running PrivateGPT in a `local` setup.
14
+
15
+ Get started by understanding the [Main Concepts and Installation](/installation) and then dive into the [API Reference](/api-reference).
16
+
17
+ ## Frequently Visited Resources
18
+
19
+ <Cards>
20
+ <Card
21
+ title="Main Concepts"
22
+ icon="fa-solid fa-lines-leaning"
23
+ href="/installation"
24
+ />
25
+ <Card
26
+ title="API Reference"
27
+ icon="fa-solid fa-code"
28
+ href="/api-reference"
29
+ />
30
+ <Card
31
+ title="Twitter"
32
+ icon="fa-brands fa-twitter"
33
+ href="https://twitter.com/PrivateGPT_AI"
34
+ />
35
+ <Card
36
+ title="Discord Server"
37
+ icon="fa-brands fa-discord"
38
+ href="https://discord.gg/bK6mRVpErU"
39
+ />
40
+ </Cards>
41
+
42
+ <br />
fern/docs/pages/quickstart/quickstart.mdx ADDED
@@ -0,0 +1,105 @@
1
+ This guide provides a quick start for running different profiles of PrivateGPT using Docker Compose.
2
+ The profiles cater to various environments, including Ollama setups (CPU, CUDA, MacOS), and a fully local setup.
3
+
4
+ By default, Docker Compose will download pre-built images from a remote registry when starting the services. However, you have the option to build the images locally if needed. Details on building the Docker images locally are provided at the end of this guide.
5
+
6
+ If you want to run PrivateGPT locally without Docker, refer to the [Local Installation Guide](/installation).
7
+
8
+ ## Prerequisites
9
+ - **Docker and Docker Compose:** Ensure both are installed on your system.
10
+ [Installation Guide for Docker](https://docs.docker.com/get-docker/), [Installation Guide for Docker Compose](https://docs.docker.com/compose/install/).
11
+ - **Clone PrivateGPT Repository:** Clone the PrivateGPT repository to your machine and navigate to the directory:
12
+ ```sh
13
+ git clone https://github.com/zylon-ai/private-gpt.git
14
+ cd private-gpt
15
+ ```
16
+
17
+ ## Setups
18
+
19
+ ### Ollama Setups (Recommended)
20
+
21
+ #### 1. Default/Ollama CPU
22
+
23
+ **Description:**
24
+ This profile runs the Ollama service using CPU resources. It is the standard configuration for running Ollama-based PrivateGPT services without GPU acceleration.
25
+
26
+ **Run:**
27
+ To start the services using pre-built images, run:
28
+ ```sh
29
+ docker-compose up
30
+ ```
31
+ or with a specific profile:
32
+ ```sh
33
+ docker-compose --profile ollama-cpu up
34
+ ```
35
+
36
+ #### 2. Ollama Nvidia CUDA
37
+
38
+ **Description:**
39
+ This profile leverages GPU acceleration with CUDA support, suitable for computationally intensive tasks that benefit from GPU resources.
40
+
41
+ **Requirements:**
42
+ Ensure that your system has compatible GPU hardware and the necessary NVIDIA drivers installed. The installation process is detailed [here](https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html).
43
+
44
+ **Run:**
45
+ To start the services with CUDA support using pre-built images, run:
46
+ ```sh
47
+ docker-compose --profile ollama-cuda up
48
+ ```
49
+
50
+ #### 3. Ollama External API
51
+
52
+ **Description:**
53
+ This profile is designed for running PrivateGPT using Ollama installed on the host machine. This setup is particularly useful for macOS users, as Docker does not yet support Metal GPU acceleration.
54
+
55
+ **Requirements:**
56
+ Install Ollama on your machine by following the instructions at [ollama.ai](https://ollama.ai/).
57
+
58
+ **Run:**
59
+ To start the Ollama service, use:
60
+ ```sh
61
+ OLLAMA_HOST=0.0.0.0 ollama serve
62
+ ```
63
+ To start the services with the host configuration using pre-built images, run:
64
+ ```sh
65
+ docker-compose --profile ollama-api up
66
+ ```
67
+
68
+ ### Fully Local Setups
69
+
70
+ #### 1. LlamaCPP CPU
71
+
72
+ **Description:**
73
+ This profile runs the PrivateGPT services locally using `llama-cpp` and Hugging Face models.
74
+
75
+ **Requirements:**
76
+ A **Hugging Face Token (HF_TOKEN)** is required for accessing Hugging Face models. Obtain your token following [this guide](/installation/getting-started/troubleshooting#downloading-gated-and-private-models).
77
+
78
+ **Run:**
79
+ Start the services with your Hugging Face token using pre-built images:
80
+ ```sh
81
+ HF_TOKEN=<your_hf_token> docker-compose --profile llamacpp-cpu up
82
+ ```
83
+ Replace `<your_hf_token>` with your actual Hugging Face token.
84
+
85
+ ## Building Locally
86
+
87
+ If you prefer to build Docker images locally, which is useful when making changes to the codebase or the Dockerfiles, follow these steps:
88
+
89
+ ### Building the Images
90
+ To build the Docker images locally, navigate to the cloned repository directory and run:
91
+ ```sh
92
+ docker-compose build
93
+ ```
94
+ This command compiles the necessary Docker images based on the current codebase and Dockerfile configurations.
95
+
96
+ ### Forcing a Rebuild with --build
97
+ If you have made changes and need to ensure these changes are reflected in the Docker images, you can force a rebuild before starting the services:
98
+ ```sh
99
+ docker-compose up --build
100
+ ```
101
+ or with a specific profile:
102
+ ```sh
103
+ docker-compose --profile <profile_name> up --build
104
+ ```
105
+ Replace `<profile_name>` with the desired profile.
fern/docs/pages/recipes/quickstart.mdx ADDED
@@ -0,0 +1,23 @@
1
+ # Recipes
2
+
3
+ Recipes are predefined use cases that help users solve very specific tasks using PrivateGPT.
4
+ They provide a streamlined approach to achieve common goals with the platform, offering both a starting point and inspiration for further exploration.
5
+ The main goal of Recipes is to empower the community to create and share solutions, expanding the capabilities of PrivateGPT.
6
+
7
+ ## How to Create a New Recipe
8
+
9
+ 1. **Identify the Task**: Define a specific task or problem that the Recipe will address.
10
+ 2. **Develop the Solution**: Create a clear and concise guide, including any necessary code snippets or configurations.
11
+ 3. **Submit a PR**: Fork the PrivateGPT repository, add your Recipe to the appropriate section, and submit a PR for review.
12
+
13
+ We encourage you to be creative and think outside the box! Your contributions help shape the future of PrivateGPT.
14
+
15
+ ## Available Recipes
16
+
17
+ <Cards>
18
+ <Card
19
+ title="Summarize"
20
+ icon="fa-solid fa-file-alt"
21
+ href="/recipes/general-use-cases/summarize"
22
+ />
23
+ </Cards>
fern/docs/pages/recipes/summarize.mdx ADDED
@@ -0,0 +1,20 @@
1
+ The Summarize Recipe provides a method to extract concise summaries from ingested documents or texts using PrivateGPT.
2
+ This tool is particularly useful for quickly understanding large volumes of information by distilling key points and main ideas.
3
+
4
+ ## Use Case
5
+
6
+ The primary use case for the `Summarize` tool is to automate the summarization of lengthy documents,
7
+ making it easier for users to grasp the essential information without reading through entire texts.
8
+ This can be applied in various scenarios, such as summarizing research papers, news articles, or business reports.
9
+
10
+ ## Key Features
11
+
12
+ 1. **Ingestion-compatible**: The user provides the text to be summarized. The text can be directly inputted or retrieved from ingested documents within the system.
13
+ 2. **Customization**: The summary generation can be influenced by providing specific `instructions` or a `prompt`. These inputs guide the model on how to frame the summary, allowing for customization according to user needs.
14
+ 3. **Streaming Support**: The tool supports streaming, allowing for real-time summary generation, which can be particularly useful for handling large texts or providing immediate feedback.
15
+
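+ To make the above concrete, a request might look like the sketch below. This is a hypothetical example: it assumes the server runs on `localhost:8001` and exposes a `/v1/summarize` endpoint accepting `text`, `instructions`, and `stream` fields; check the API Reference for the exact route and schema.
+
+ ```bash
+ curl -X POST http://localhost:8001/v1/summarize \
+   -H "Content-Type: application/json" \
+   -d '{
+     "text": "A long article to summarize...",
+     "instructions": "Summarize in three bullet points",
+     "stream": false
+   }'
+ ```
+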
16
+ ## Contributing
17
+
18
+ If you have ideas for improving the Summarize Recipe or want to add new features, feel free to contribute!
19
+ You can submit your enhancements via a pull request on our [GitHub repository](https://github.com/zylon-ai/private-gpt).
20
+