chasemcdo committed
Commit a69b4ce · Parent: 634c119

add localai

This view is limited to 50 files because the commit contains too many changes.
Files changed (50)
  1. .dockerignore +6 -0
  2. .env +30 -0
  3. .github/ISSUE_TEMPLATE/bug_report.md +31 -0
  4. .github/ISSUE_TEMPLATE/config.yml +8 -0
  5. .github/ISSUE_TEMPLATE/feature_request.md +22 -0
  6. .github/PULL_REQUEST_TEMPLATE.md +23 -0
  7. .github/bump_deps.sh +9 -0
  8. .github/release.yml +24 -0
  9. .github/stale.yml +18 -0
  10. .github/workflows/bump_deps.yaml +51 -0
  11. .github/workflows/image.yml +109 -0
  12. .github/workflows/release.yaml +79 -0
  13. .github/workflows/test.yml +44 -0
  14. .gitignore +32 -0
  15. .vscode/launch.json +33 -0
  16. Dockerfile +113 -0
  17. Earthfile +5 -0
  18. LICENSE +21 -0
  19. Makefile +303 -0
  20. api/api.go +167 -0
  21. api/api_test.go +514 -0
  22. api/apt_suite_test.go +13 -0
  23. api/config.go +368 -0
  24. api/config_test.go +54 -0
  25. api/gallery.go +237 -0
  26. api/localai.go +78 -0
  27. api/openai.go +772 -0
  28. api/options.go +153 -0
  29. api/prediction.go +647 -0
  30. assets.go +6 -0
  31. docker-compose.yaml +15 -0
  32. entrypoint.sh +11 -0
  33. examples/README.md +145 -0
  34. examples/autoGPT/.env +5 -0
  35. examples/autoGPT/README.md +32 -0
  36. examples/autoGPT/docker-compose.yaml +42 -0
  37. examples/chatbot-ui-manual/README.md +48 -0
  38. examples/chatbot-ui-manual/docker-compose.yaml +24 -0
  39. examples/chatbot-ui-manual/models/completion.tmpl +1 -0
  40. examples/chatbot-ui-manual/models/gpt-3.5-turbo.yaml +16 -0
  41. examples/chatbot-ui-manual/models/gpt4all.tmpl +4 -0
  42. examples/chatbot-ui/README.md +44 -0
  43. examples/chatbot-ui/docker-compose.yaml +37 -0
  44. examples/discord-bot/.env.example +6 -0
  45. examples/discord-bot/README.md +76 -0
  46. examples/discord-bot/docker-compose.yaml +21 -0
  47. examples/discord-bot/models +1 -0
  48. examples/flowise/README.md +30 -0
  49. examples/flowise/docker-compose.yaml +37 -0
  50. examples/k8sgpt/README.md +70 -0
.dockerignore ADDED
@@ -0,0 +1,6 @@
+ .git
+ .idea
+ models
+ examples/chatbot-ui/models
+ examples/rwkv/models
+ examples/**/models
.env ADDED
@@ -0,0 +1,30 @@
+ ## Set number of threads.
+ ## Note: prefer the number of physical cores. Overbooking the CPU degrades performance notably.
+ # THREADS=14
+
+ ## Specify a different bind address (defaults to ":8080")
+ # ADDRESS=127.0.0.1:8080
+
+ ## Default models context size
+ # CONTEXT_SIZE=512
+
+ ## Default path for models
+ MODELS_PATH=/models
+
+ ## Enable debug mode
+ # DEBUG=true
+
+ ## Specify a build type. Available: cublas, openblas.
+ # BUILD_TYPE=openblas
+
+ ## Uncomment and set to false to disable rebuilding from source
+ # REBUILD=false
+
+ ## Enable image generation with stablediffusion (requires REBUILD=true)
+ # GO_TAGS=stablediffusion
+
+ ## Path where to store generated images
+ # IMAGE_PATH=/tmp
+
+ ## Specify a default upload limit in MB (whisper)
+ # UPLOAD_LIMIT
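As a quick illustration of how this file behaves when sourced (a sketch using a throwaway copy at the hypothetical path `/tmp/localai.env`): entries commented out with `#`, such as `# THREADS=14`, are documentation of the defaults only, so only uncommented assignments become environment variables.

```shell
# Sketch: only uncommented lines become environment variables.
# /tmp/localai.env is a throwaway copy for illustration.
cat > /tmp/localai.env <<'EOF'
## Set number of threads.
# THREADS=14
MODELS_PATH=/models
EOF
set -a            # export everything assigned while sourcing
. /tmp/localai.env
set +a
echo "MODELS_PATH=$MODELS_PATH"   # → MODELS_PATH=/models
echo "THREADS=${THREADS:-unset}"  # → THREADS=unset (still commented out)
```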
.github/ISSUE_TEMPLATE/bug_report.md ADDED
@@ -0,0 +1,31 @@
+ ---
+ name: Bug report
+ about: Create a report to help us improve
+ title: ''
+ labels: bug
+ assignees: mudler
+
+ ---
+
+ <!-- Thanks for helping us to improve LocalAI! We welcome all bug reports. Please fill out each area of the template so we can better help you. Comments like this will be hidden when you post but you can delete them if you wish. -->
+
+ **LocalAI version:**
+ <!-- Container Image or LocalAI tag/commit -->
+
+ **Environment, CPU architecture, OS, and Version:**
+ <!-- Provide the output from "uname -a", HW specs, if it's a VM -->
+
+ **Describe the bug**
+ <!-- A clear and concise description of what the bug is. -->
+
+ **To Reproduce**
+ <!-- Steps to reproduce the behavior, including the LocalAI command used, if any -->
+
+ **Expected behavior**
+ <!-- A clear and concise description of what you expected to happen. -->
+
+ **Logs**
+ <!-- If applicable, add logs while running LocalAI in debug mode (`--debug` or `DEBUG=true`) to help explain your problem. -->
+
+ **Additional context**
+ <!-- Add any other context about the problem here. -->
.github/ISSUE_TEMPLATE/config.yml ADDED
@@ -0,0 +1,8 @@
+ blank_issues_enabled: false
+ contact_links:
+   - name: Community Support
+     url: https://github.com/go-skynet/LocalAI/discussions
+     about: Please ask and answer questions here.
+   - name: Discord
+     url: https://discord.gg/uJAeKSAGDy
+     about: Join our community on Discord!
.github/ISSUE_TEMPLATE/feature_request.md ADDED
@@ -0,0 +1,22 @@
+ ---
+ name: Feature request
+ about: Suggest an idea for this project
+ title: ''
+ labels: enhancement
+ assignees: mudler
+
+ ---
+
+ <!-- Thanks for helping us to improve LocalAI! We welcome all feature requests. Please fill out each area of the template so we can better help you. Comments like this will be hidden when you post but you can delete them if you wish. -->
+
+ **Is your feature request related to a problem? Please describe.**
+ <!-- A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] -->
+
+ **Describe the solution you'd like**
+ <!-- A clear and concise description of what you want to happen. -->
+
+ **Describe alternatives you've considered**
+ <!-- A clear and concise description of any alternative solutions or features you've considered. -->
+
+ **Additional context**
+ <!-- Add any other context or screenshots about the feature request here. -->
.github/PULL_REQUEST_TEMPLATE.md ADDED
@@ -0,0 +1,23 @@
+ **Description**
+
+ This PR fixes #
+
+ **Notes for Reviewers**
+
+
+ **[Signed commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
+ - [ ] Yes, I signed my commits.
+
+
+ <!--
+ Thank you for contributing to LocalAI!
+
+ Contributing Conventions:
+
+ 1. Include descriptive PR titles with [<component-name>] prepended.
+ 2. Build and test your changes before submitting a PR.
+ 3. Sign your commits.
+
+ If you follow the community's contribution conventions upfront, the review
+ process will be faster and your PR will be merged more quickly.
+ -->
.github/bump_deps.sh ADDED
@@ -0,0 +1,9 @@
+ #!/bin/bash
+ set -xe
+ REPO=$1
+ BRANCH=$2
+ VAR=$3
+
+ LAST_COMMIT=$(curl -s -H "Accept: application/vnd.github.VERSION.sha" "https://api.github.com/repos/$REPO/commits/$BRANCH")
+
+ sed -i Makefile -e "s/$VAR?=.*/$VAR?=$LAST_COMMIT/"
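The script's last line is a plain `sed` in-place rewrite of the pinned commit SHA in the Makefile. A minimal offline sketch of that substitution (stub Makefile and hard-coded values, no GitHub API call):

```shell
# Same substitution bump_deps.sh applies, against a stub Makefile.
# VAR and LAST_COMMIT are hard-coded here instead of coming from $3/curl.
VAR=GOLLAMA_VERSION
LAST_COMMIT=abc123
printf 'GOLLAMA_VERSION?=oldsha\n' > /tmp/Makefile.stub
sed -i -e "s/$VAR?=.*/$VAR?=$LAST_COMMIT/" /tmp/Makefile.stub
cat /tmp/Makefile.stub   # → GOLLAMA_VERSION?=abc123
```

Note the `?` is a literal character in `sed`'s basic regular expressions, so the pattern matches the Makefile's `GOLLAMA_VERSION?=` assignment syntax exactly.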
.github/release.yml ADDED
@@ -0,0 +1,24 @@
+ # .github/release.yml
+
+ changelog:
+   exclude:
+     labels:
+       - ignore-for-release
+   categories:
+     - title: Breaking Changes 🛠
+       labels:
+         - Semver-Major
+         - breaking-change
+     - title: "Bug fixes :bug:"
+       labels:
+         - bug
+     - title: Exciting New Features 🎉
+       labels:
+         - Semver-Minor
+         - enhancement
+     - title: 👒 Dependencies
+       labels:
+         - dependencies
+     - title: Other Changes
+       labels:
+         - "*"
.github/stale.yml ADDED
@@ -0,0 +1,18 @@
+ # Number of days of inactivity before an issue becomes stale
+ daysUntilStale: 45
+ # Number of days of inactivity before a stale issue is closed
+ daysUntilClose: 10
+ # Issues with these labels will never be considered stale
+ exemptLabels:
+   - issue/willfix
+ # Label to use when marking an issue as stale
+ staleLabel: issue/stale
+ # Comment to post when marking an issue as stale. Set to `false` to disable
+ markComment: >
+   This issue has been automatically marked as stale because it has not had
+   recent activity. It will be closed if no further activity occurs. Thank you
+   for your contributions.
+ # Comment to post when closing a stale issue. Set to `false` to disable
+ closeComment: >
+   This issue is being automatically closed due to inactivity.
+   However, you may choose to reopen this issue.
.github/workflows/bump_deps.yaml ADDED
@@ -0,0 +1,51 @@
+ name: Bump dependencies
+ on:
+   schedule:
+     - cron: 0 20 * * *
+   workflow_dispatch:
+ jobs:
+   bump:
+     strategy:
+       fail-fast: false
+       matrix:
+         include:
+           - repository: "go-skynet/go-llama.cpp"
+             variable: "GOLLAMA_VERSION"
+             branch: "master"
+           - repository: "go-skynet/go-ggml-transformers.cpp"
+             variable: "GOGGMLTRANSFORMERS_VERSION"
+             branch: "master"
+           - repository: "donomii/go-rwkv.cpp"
+             variable: "RWKV_VERSION"
+             branch: "main"
+           - repository: "ggerganov/whisper.cpp"
+             variable: "WHISPER_CPP_VERSION"
+             branch: "master"
+           - repository: "go-skynet/go-bert.cpp"
+             variable: "BERT_VERSION"
+             branch: "master"
+           - repository: "go-skynet/bloomz.cpp"
+             variable: "BLOOMZ_VERSION"
+             branch: "main"
+           - repository: "nomic-ai/gpt4all"
+             variable: "GPT4ALL_VERSION"
+             branch: "main"
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v3
+       - name: Bump dependencies 🔧
+         run: |
+           bash .github/bump_deps.sh ${{ matrix.repository }} ${{ matrix.branch }} ${{ matrix.variable }}
+       - name: Create Pull Request
+         uses: peter-evans/create-pull-request@v5
+         with:
+           token: ${{ secrets.UPDATE_BOT_TOKEN }}
+           push-to-fork: ci-forks/LocalAI
+           commit-message: ':arrow_up: Update ${{ matrix.repository }}'
+           title: ':arrow_up: Update ${{ matrix.repository }}'
+           branch: "update/${{ matrix.variable }}"
+           body: Bump of ${{ matrix.repository }} version
+           signoff: true
.github/workflows/image.yml ADDED
@@ -0,0 +1,109 @@
+ ---
+ name: 'build container images'
+
+ on:
+   pull_request:
+   push:
+     branches:
+       - master
+     tags:
+       - '*'
+
+ concurrency:
+   group: ci-${{ github.head_ref || github.ref }}-${{ github.repository }}
+   cancel-in-progress: true
+
+ jobs:
+   docker:
+     strategy:
+       matrix:
+         include:
+           - build-type: ''
+             platforms: 'linux/amd64,linux/arm64'
+             tag-latest: 'auto'
+             tag-suffix: ''
+             ffmpeg: ''
+           - build-type: 'cublas'
+             cuda-major-version: 11
+             cuda-minor-version: 7
+             platforms: 'linux/amd64'
+             tag-latest: 'false'
+             tag-suffix: '-cublas-cuda11'
+             ffmpeg: ''
+           - build-type: 'cublas'
+             cuda-major-version: 12
+             cuda-minor-version: 1
+             platforms: 'linux/amd64'
+             tag-latest: 'false'
+             tag-suffix: '-cublas-cuda12'
+             ffmpeg: ''
+           - build-type: ''
+             platforms: 'linux/amd64,linux/arm64'
+             tag-latest: 'false'
+             tag-suffix: '-ffmpeg'
+             ffmpeg: 'true'
+           - build-type: 'cublas'
+             cuda-major-version: 11
+             cuda-minor-version: 7
+             platforms: 'linux/amd64'
+             tag-latest: 'false'
+             tag-suffix: '-cublas-cuda11-ffmpeg'
+             ffmpeg: 'true'
+           - build-type: 'cublas'
+             cuda-major-version: 12
+             cuda-minor-version: 1
+             platforms: 'linux/amd64'
+             tag-latest: 'false'
+             tag-suffix: '-cublas-cuda12-ffmpeg'
+             ffmpeg: 'true'
+
+     runs-on: ubuntu-latest
+     steps:
+       - name: Checkout
+         uses: actions/checkout@v3
+
+       - name: Docker meta
+         id: meta
+         uses: docker/metadata-action@v4
+         with:
+           images: quay.io/go-skynet/local-ai
+           tags: |
+             type=ref,event=branch
+             type=semver,pattern={{raw}}
+             type=sha
+           flavor: |
+             latest=${{ matrix.tag-latest }}
+             suffix=${{ matrix.tag-suffix }}
+
+       - name: Set up QEMU
+         uses: docker/setup-qemu-action@master
+         with:
+           platforms: all
+
+       - name: Set up Docker Buildx
+         id: buildx
+         uses: docker/setup-buildx-action@master
+
+       - name: Login to DockerHub
+         if: github.event_name != 'pull_request'
+         uses: docker/login-action@v2
+         with:
+           registry: quay.io
+           username: ${{ secrets.LOCALAI_REGISTRY_USERNAME }}
+           password: ${{ secrets.LOCALAI_REGISTRY_PASSWORD }}
+
+       - name: Build and push
+         uses: docker/build-push-action@v4
+         with:
+           builder: ${{ steps.buildx.outputs.name }}
+           build-args: |
+             BUILD_TYPE=${{ matrix.build-type }}
+             CUDA_MAJOR_VERSION=${{ matrix.cuda-major-version }}
+             CUDA_MINOR_VERSION=${{ matrix.cuda-minor-version }}
+             FFMPEG=${{ matrix.ffmpeg }}
+           context: .
+           file: ./Dockerfile
+           platforms: ${{ matrix.platforms }}
+           push: ${{ github.event_name != 'pull_request' }}
+           tags: ${{ steps.meta.outputs.tags }}
+           labels: ${{ steps.meta.outputs.labels }}
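For intuition about the matrix above (an illustration of the tag scheme, not the metadata-action itself): each entry's `tag-suffix` is appended to every tag the action generates, so a cublas/CUDA 12/ffmpeg build of the `master` branch would be pushed under roughly this name:

```shell
# Hypothetical composition of one resulting image tag; the real names are
# produced by docker/metadata-action from the tags/flavor settings above.
image=quay.io/go-skynet/local-ai
ref=master                      # from type=ref,event=branch
suffix=-cublas-cuda12-ffmpeg    # from matrix.tag-suffix
echo "${image}:${ref}${suffix}" # → quay.io/go-skynet/local-ai:master-cublas-cuda12-ffmpeg
```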
.github/workflows/release.yaml ADDED
@@ -0,0 +1,79 @@
+ name: Build and Release
+
+ on: push
+
+ permissions:
+   contents: write
+
+ jobs:
+   build-linux:
+     strategy:
+       matrix:
+         include:
+           - build: 'avx2'
+             defines: ''
+           - build: 'avx'
+             defines: '-DLLAMA_AVX2=OFF'
+           - build: 'avx512'
+             defines: '-DLLAMA_AVX512=ON'
+     runs-on: ubuntu-latest
+     steps:
+       - name: Clone
+         uses: actions/checkout@v3
+         with:
+           submodules: true
+       - name: Dependencies
+         run: |
+           sudo apt-get update
+           sudo apt-get install build-essential ffmpeg
+       - name: Build
+         id: build
+         env:
+           CMAKE_ARGS: "${{ matrix.defines }}"
+           BUILD_ID: "${{ matrix.build }}"
+         run: |
+           STATIC=true make dist
+       - uses: actions/upload-artifact@v3
+         with:
+           name: ${{ matrix.build }}
+           path: release/
+       - name: Release
+         uses: softprops/action-gh-release@v1
+         if: startsWith(github.ref, 'refs/tags/')
+         with:
+           files: |
+             release/*
+
+   build-macOS:
+     strategy:
+       matrix:
+         include:
+           - build: 'avx2'
+             defines: ''
+           - build: 'avx'
+             defines: '-DLLAMA_AVX2=OFF'
+           - build: 'avx512'
+             defines: '-DLLAMA_AVX512=ON'
+     runs-on: macOS-latest
+     steps:
+       - name: Clone
+         uses: actions/checkout@v3
+         with:
+           submodules: true
+       - name: Build
+         id: build
+         env:
+           CMAKE_ARGS: "${{ matrix.defines }}"
+           BUILD_ID: "${{ matrix.build }}"
+         run: |
+           make dist
+       - uses: actions/upload-artifact@v3
+         with:
+           name: ${{ matrix.build }}
+           path: release/
+       - name: Release
+         uses: softprops/action-gh-release@v1
+         if: startsWith(github.ref, 'refs/tags/')
+         with:
+           files: |
+             release/*
.github/workflows/test.yml ADDED
@@ -0,0 +1,44 @@
+ ---
+ name: 'tests'
+
+ on:
+   pull_request:
+   push:
+     branches:
+       - master
+     tags:
+       - '*'
+
+ concurrency:
+   group: ci-tests-${{ github.head_ref || github.ref }}-${{ github.repository }}
+   cancel-in-progress: true
+
+ jobs:
+   ubuntu-latest:
+     runs-on: ubuntu-latest
+
+     steps:
+       - name: Clone
+         uses: actions/checkout@v3
+         with:
+           submodules: true
+       - name: Dependencies
+         run: |
+           sudo apt-get update
+           sudo apt-get install build-essential ffmpeg
+       - name: Test
+         run: |
+           make test
+
+   macOS-latest:
+     runs-on: macOS-latest
+
+     steps:
+       - name: Clone
+         uses: actions/checkout@v3
+         with:
+           submodules: true
+
+       - name: Test
+         run: |
+           CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF" make test
.gitignore ADDED
@@ -0,0 +1,32 @@
+ # go-llama build artifacts
+ go-llama
+ gpt4all
+ go-stable-diffusion
+ go-ggml-transformers
+ go-gpt2
+ go-rwkv
+ whisper.cpp
+ bloomz
+ go-bert
+
+ # LocalAI build binary
+ LocalAI
+ local-ai
+ # prevent above rules from omitting the helm chart
+ !charts/*
+
+ # Ignore models
+ models/*
+ test-models/
+ test-dir/
+
+ release/
+
+ # just in case
+ .DS_Store
+ .idea
+
+ # Generated during build
+ backend-assets/
+
+ /ggml-metal.metal
.vscode/launch.json ADDED
@@ -0,0 +1,33 @@
+ {
+   "version": "0.2.0",
+   "configurations": [
+     {
+       "name": "Python: Current File",
+       "type": "python",
+       "request": "launch",
+       "program": "${file}",
+       "console": "integratedTerminal",
+       "justMyCode": false,
+       "cwd": "${workspaceFolder}/examples/langchain-chroma",
+       "env": {
+         "OPENAI_API_BASE": "http://localhost:8080/v1",
+         "OPENAI_API_KEY": "abc"
+       }
+     },
+     {
+       "name": "Launch LocalAI API",
+       "type": "go",
+       "request": "launch",
+       "mode": "debug",
+       "program": "${workspaceFolder}/main.go",
+       "args": [
+         "api"
+       ],
+       "env": {
+         "C_INCLUDE_PATH": "${workspaceFolder}/go-llama:${workspaceFolder}/go-stable-diffusion/:${workspaceFolder}/gpt4all/gpt4all-bindings/golang/:${workspaceFolder}/go-gpt2:${workspaceFolder}/go-rwkv:${workspaceFolder}/whisper.cpp:${workspaceFolder}/go-bert:${workspaceFolder}/bloomz",
+         "LIBRARY_PATH": "${workspaceFolder}/go-llama:${workspaceFolder}/go-stable-diffusion/:${workspaceFolder}/gpt4all/gpt4all-bindings/golang/:${workspaceFolder}/go-gpt2:${workspaceFolder}/go-rwkv:${workspaceFolder}/whisper.cpp:${workspaceFolder}/go-bert:${workspaceFolder}/bloomz",
+         "DEBUG": "true"
+       }
+     }
+   ]
+ }
Dockerfile ADDED
@@ -0,0 +1,113 @@
+ ARG GO_VERSION=1.20-bullseye
+
+ FROM golang:$GO_VERSION as requirements
+
+ ARG BUILD_TYPE
+ ARG CUDA_MAJOR_VERSION=11
+ ARG CUDA_MINOR_VERSION=7
+ ARG SPDLOG_VERSION="1.11.0"
+ ARG PIPER_PHONEMIZE_VERSION='1.0.0'
+ ARG TARGETARCH
+ ARG TARGETVARIANT
+
+ ENV BUILD_TYPE=${BUILD_TYPE}
+ ARG GO_TAGS="stablediffusion tts"
+
+ RUN apt-get update && \
+     apt-get install -y ca-certificates cmake curl patch
+
+ # CuBLAS requirements
+ RUN if [ "${BUILD_TYPE}" = "cublas" ]; then \
+     apt-get install -y software-properties-common && \
+     apt-add-repository contrib && \
+     curl -O https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/cuda-keyring_1.0-1_all.deb && \
+     dpkg -i cuda-keyring_1.0-1_all.deb && \
+     rm -f cuda-keyring_1.0-1_all.deb && \
+     apt-get update && \
+     apt-get install -y cuda-nvcc-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} libcublas-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} \
+     ; fi
+ ENV PATH /usr/local/cuda/bin:${PATH}
+
+ WORKDIR /build
+
+ # OpenBLAS requirements
+ RUN apt-get install -y libopenblas-dev
+
+ # Stable Diffusion requirements
+ RUN apt-get install -y libopencv-dev && \
+     ln -s /usr/include/opencv4/opencv2 /usr/include/opencv2
+
+ # piper requirements
+ # Use pre-compiled Piper phonemization library (includes onnxruntime)
+ #RUN if echo "${GO_TAGS}" | grep -q "tts"; then \
+ RUN test -n "$TARGETARCH" \
+     || (echo 'missing $TARGETARCH, either set this `ARG` manually, or run using `docker buildkit`' && false)
+ RUN curl -L "https://github.com/gabime/spdlog/archive/refs/tags/v${SPDLOG_VERSION}.tar.gz" | \
+     tar -xzvf - && \
+     mkdir -p "spdlog-${SPDLOG_VERSION}/build" && \
+     cd "spdlog-${SPDLOG_VERSION}/build" && \
+     cmake .. && \
+     make -j8 && \
+     cmake --install . --prefix /usr && mkdir -p "lib/Linux-$(uname -m)" && \
+     cd /build && \
+     mkdir -p "lib/Linux-$(uname -m)/piper_phonemize" && \
+     curl -L "https://github.com/rhasspy/piper-phonemize/releases/download/v${PIPER_PHONEMIZE_VERSION}/libpiper_phonemize-${TARGETARCH}${TARGETVARIANT}.tar.gz" | \
+     tar -C "lib/Linux-$(uname -m)/piper_phonemize" -xzvf - && ls -liah /build/lib/Linux-$(uname -m)/piper_phonemize/ && \
+     cp -rfv /build/lib/Linux-$(uname -m)/piper_phonemize/lib/. /lib64/ && \
+     cp -rfv /build/lib/Linux-$(uname -m)/piper_phonemize/lib/. /usr/lib/ && \
+     cp -rfv /build/lib/Linux-$(uname -m)/piper_phonemize/include/. /usr/include/
+ # \
+ # ; fi
+
+ ###################################
+ ###################################
+
+ FROM requirements as builder
+
+ ARG GO_TAGS="stablediffusion tts"
+
+ ENV GO_TAGS=${GO_TAGS}
+ ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility
+ ENV NVIDIA_REQUIRE_CUDA="cuda>=${CUDA_MAJOR_VERSION}.0"
+ ENV NVIDIA_VISIBLE_DEVICES=all
+
+ WORKDIR /build
+
+ COPY Makefile .
+ RUN make get-sources
+ COPY go.mod .
+ RUN make prepare
+ COPY . .
+ RUN ESPEAK_DATA=/build/lib/Linux-$(uname -m)/piper_phonemize/lib/espeak-ng-data make build
+
+ ###################################
+ ###################################
+
+ FROM requirements
+
+ ARG FFMPEG
+
+ ENV REBUILD=true
+ ENV HEALTHCHECK_ENDPOINT=http://localhost:8080/readyz
+
+ # Add FFmpeg
+ RUN if [ "${FFMPEG}" = "true" ]; then \
+     apt-get install -y ffmpeg \
+     ; fi
+
+ WORKDIR /build
+
+ # we start fresh & re-copy all assets because `make build` does not clean up nicely after itself
+ # so when `entrypoint.sh` runs `make build` again (which it does by default), the build would fail
+ # see https://github.com/go-skynet/LocalAI/pull/658#discussion_r1241971626 and
+ # https://github.com/go-skynet/LocalAI/pull/434
+ COPY . .
+ RUN make prepare-sources
+ COPY --from=builder /build/local-ai ./
+
+ # Define the health check command
+ HEALTHCHECK --interval=1m --timeout=10m --retries=10 \
+     CMD curl -f $HEALTHCHECK_ENDPOINT || exit 1
+
+ EXPOSE 8080
+ ENTRYPOINT [ "/build/entrypoint.sh" ]
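The `RUN test -n "$TARGETARCH" ...` step in this Dockerfile fails the build early when the architecture ARG is empty (i.e. when the image is not built with BuildKit, which is what populates `TARGETARCH`). A standalone sketch of that guard's behavior, with `check` as a made-up helper name:

```shell
# Reproduces the Dockerfile's guard outside Docker: empty input fails.
check() {
    test -n "$1" \
        || (echo 'missing $TARGETARCH, either set this `ARG` manually, or run using `docker buildkit`' && false)
}
check amd64 && echo "amd64: build continues"
check ""    || echo "empty: build aborts"
```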
Earthfile ADDED
@@ -0,0 +1,5 @@
+ VERSION 0.7
+
+ build:
+     FROM DOCKERFILE -f Dockerfile .
+     SAVE ARTIFACT /usr/bin/local-ai AS LOCAL local-ai
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2023 Ettore Di Giacinto
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
Makefile ADDED
@@ -0,0 +1,303 @@
+ GOCMD=go
+ GOTEST=$(GOCMD) test
+ GOVET=$(GOCMD) vet
+ BINARY_NAME=local-ai
+
+ GOLLAMA_VERSION?=f104111358e8098aea69ce408b85b707528179ef
+ GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all
+ GPT4ALL_VERSION?=c1794597a7559d5616567e280b722231c624a57b
+ GOGGMLTRANSFORMERS_VERSION?=a459d2726792132541152c981ed9fbfe28f4fd20
+ RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp
+ RWKV_VERSION?=f5a8c45396741470583f59b916a2a7641e63bcd0
+ WHISPER_CPP_VERSION?=72deb41eb26300f71c50febe29db8ffcce09256c
+ BERT_VERSION?=6069103f54b9969c02e789d0fb12a23bd614285f
+ PIPER_VERSION?=56b8a81b4760a6fbee1a82e62f007ae7e8f010a7
+ BLOOMZ_VERSION?=1834e77b83faafe912ad4092ccf7f77937349e2f
+ export BUILD_TYPE?=
+ CGO_LDFLAGS?=
+ CUDA_LIBPATH?=/usr/local/cuda/lib64/
+ STABLEDIFFUSION_VERSION?=d89260f598afb809279bc72aa0107b4292587632
+ GO_TAGS?=
+ BUILD_ID?=git
+
+ VERSION?=$(shell git describe --always --tags --dirty || echo "dev" )
+ # go tool nm ./local-ai | grep Commit
+ LD_FLAGS?=
+ override LD_FLAGS += -X "github.com/go-skynet/LocalAI/internal.Version=$(VERSION)"
+ override LD_FLAGS += -X "github.com/go-skynet/LocalAI/internal.Commit=$(shell git rev-parse HEAD)"
+
+ OPTIONAL_TARGETS?=
+ ESPEAK_DATA?=
+
+ OS := $(shell uname -s)
+ ARCH := $(shell uname -m)
+ GREEN := $(shell tput -Txterm setaf 2)
+ YELLOW := $(shell tput -Txterm setaf 3)
+ WHITE := $(shell tput -Txterm setaf 7)
+ CYAN := $(shell tput -Txterm setaf 6)
+ RESET := $(shell tput -Txterm sgr0)
+
+ C_INCLUDE_PATH=$(shell pwd)/go-llama:$(shell pwd)/go-stable-diffusion/:$(shell pwd)/gpt4all/gpt4all-bindings/golang/:$(shell pwd)/go-ggml-transformers:$(shell pwd)/go-rwkv:$(shell pwd)/whisper.cpp:$(shell pwd)/go-bert:$(shell pwd)/bloomz
+ LIBRARY_PATH=$(shell pwd)/go-piper:$(shell pwd)/go-llama:$(shell pwd)/go-stable-diffusion/:$(shell pwd)/gpt4all/gpt4all-bindings/golang/:$(shell pwd)/go-ggml-transformers:$(shell pwd)/go-rwkv:$(shell pwd)/whisper.cpp:$(shell pwd)/go-bert:$(shell pwd)/bloomz
+
+ ifeq ($(BUILD_TYPE),openblas)
+     CGO_LDFLAGS+=-lopenblas
+ endif
+
+ ifeq ($(BUILD_TYPE),cublas)
+     CGO_LDFLAGS+=-lcublas -lcudart -L$(CUDA_LIBPATH)
+     export LLAMA_CUBLAS=1
+ endif
+
+ ifeq ($(BUILD_TYPE),metal)
+     CGO_LDFLAGS+=-framework Foundation -framework Metal -framework MetalKit -framework MetalPerformanceShaders
+     export LLAMA_METAL=1
+ endif
+
+ ifeq ($(BUILD_TYPE),clblas)
+     CGO_LDFLAGS+=-lOpenCL -lclblast
+ endif
+
+ # glibc-static or glibc-devel-static required
+ ifeq ($(STATIC),true)
+     LD_FLAGS=-linkmode external -extldflags -static
+ endif
+
+ ifeq ($(findstring stablediffusion,$(GO_TAGS)),stablediffusion)
+     OPTIONAL_TARGETS+=go-stable-diffusion/libstablediffusion.a
+ endif
+
+ ifeq ($(findstring tts,$(GO_TAGS)),tts)
+     OPTIONAL_TARGETS+=go-piper/libpiper_binding.a
+     OPTIONAL_TARGETS+=backend-assets/espeak-ng-data
+ endif
+
+ .PHONY: all test build vendor
+
+ all: help
+
+ ## GPT4ALL
+ gpt4all:
+ 	git clone --recurse-submodules $(GPT4ALL_REPO) gpt4all
+ 	cd gpt4all && git checkout -b build $(GPT4ALL_VERSION) && git submodule update --init --recursive --depth 1
+ 	# This is hackish, but needed as both go-llama and go-gpt4allj have their own version of ggml..
+ 	@find ./gpt4all -type f -name "*.c" -exec sed -i'' -e 's/ggml_/ggml_gpt4all_/g' {} +
+ 	@find ./gpt4all -type f -name "*.cpp" -exec sed -i'' -e 's/ggml_/ggml_gpt4all_/g' {} +
+ 	@find ./gpt4all -type f -name "*.m" -exec sed -i'' -e 's/ggml_/ggml_gpt4all_/g' {} +
+ 	@find ./gpt4all -type f -name "*.h" -exec sed -i'' -e 's/ggml_/ggml_gpt4all_/g' {} +
+ 	@find ./gpt4all -type f -name "*.c" -exec sed -i'' -e 's/llama_/llama_gpt4all_/g' {} +
+ 	@find ./gpt4all -type f -name "*.cpp" -exec sed -i'' -e 's/llama_/llama_gpt4all_/g' {} +
+ 	@find ./gpt4all -type f -name "*.h" -exec sed -i'' -e 's/llama_/llama_gpt4all_/g' {} +
+ 	@find ./gpt4all/gpt4all-backend -type f -name "llama_util.h" -execdir mv {} "llama_gpt4all_util.h" \;
+ 	@find ./gpt4all -type f -name "*.cmake" -exec sed -i'' -e 's/llama_util/llama_gpt4all_util/g' {} +
+ 	@find ./gpt4all -type f -name "*.txt" -exec sed -i'' -e 's/llama_util/llama_gpt4all_util/g' {} +
+ 	@find ./gpt4all/gpt4all-bindings/golang -type f -name "*.cpp" -exec sed -i'' -e 's/load_model/load_gpt4all_model/g' {} +
+ 	@find ./gpt4all/gpt4all-bindings/golang -type f -name "*.go" -exec sed -i'' -e 's/load_model/load_gpt4all_model/g' {} +
+ 	@find ./gpt4all/gpt4all-bindings/golang -type f -name "*.h" -exec sed -i'' -e 's/load_model/load_gpt4all_model/g' {} +
+
+ ## go-piper
+ go-piper:
+ 	git clone --recurse-submodules https://github.com/mudler/go-piper go-piper
+ 	cd go-piper && git checkout -b build $(PIPER_VERSION) && git submodule update --init --recursive --depth 1
+
+ ## BERT embeddings
+ go-bert:
+ 	git clone --recurse-submodules https://github.com/go-skynet/go-bert.cpp go-bert
+ 	cd go-bert && git checkout -b build $(BERT_VERSION) && git submodule update --init --recursive --depth 1
+ 	@find ./go-bert -type f -name "*.c" -exec sed -i'' -e 's/ggml_/ggml_bert_/g' {} +
+ 	@find ./go-bert -type f -name "*.cpp" -exec sed -i'' -e 's/ggml_/ggml_bert_/g' {} +
+ 	@find ./go-bert -type f -name "*.h" -exec sed -i'' -e 's/ggml_/ggml_bert_/g' {} +
+
+ ## stable diffusion
+ go-stable-diffusion:
+ 	git clone --recurse-submodules https://github.com/mudler/go-stable-diffusion go-stable-diffusion
+ 	cd go-stable-diffusion && git checkout -b build $(STABLEDIFFUSION_VERSION) && git submodule update --init --recursive --depth 1
+
+ go-stable-diffusion/libstablediffusion.a:
+ 	$(MAKE) -C go-stable-diffusion libstablediffusion.a
+
+ ## RWKV
+ go-rwkv:
+ 	git clone --recurse-submodules $(RWKV_REPO) go-rwkv
+ 	cd go-rwkv && git checkout -b build $(RWKV_VERSION) && git submodule update --init --recursive --depth 1
+ 	@find ./go-rwkv -type f -name "*.c" -exec sed -i'' -e 's/ggml_/ggml_rwkv_/g' {} +
+ 	@find ./go-rwkv -type f -name "*.cpp" -exec sed -i'' -e 's/ggml_/ggml_rwkv_/g' {} +
+ 	@find ./go-rwkv -type f -name "*.h" -exec sed -i'' -e 's/ggml_/ggml_rwkv_/g' {} +
+
+ go-rwkv/librwkv.a: go-rwkv
+ 	cd go-rwkv && cd rwkv.cpp && cmake . -DRWKV_BUILD_SHARED_LIBRARY=OFF && cmake --build . && cp librwkv.a ..
+
+ ## bloomz
+ bloomz:
+ 	git clone --recurse-submodules https://github.com/go-skynet/bloomz.cpp bloomz
+ 	@find ./bloomz -type f -name "*.c" -exec sed -i'' -e 's/ggml_/ggml_bloomz_/g' {} +
+ 	@find ./bloomz -type f -name "*.cpp" -exec sed -i'' -e 's/ggml_/ggml_bloomz_/g' {} +
+ 	@find ./bloomz -type f -name "*.h" -exec sed -i'' -e 's/ggml_/ggml_bloomz_/g' {} +
+ 	@find ./bloomz -type f -name "*.cpp" -exec sed -i'' -e 's/gpt_/gpt_bloomz_/g' {} +
+ 	@find ./bloomz -type f -name "*.h" -exec sed -i'' -e 's/gpt_/gpt_bloomz_/g' {} +
+ 	@find ./bloomz -type f -name "*.cpp" -exec sed -i'' -e 's/void replace/void json_bloomz_replace/g' {} +
+ 	@find ./bloomz -type f -name "*.cpp" -exec sed -i'' -e 's/::replace/::json_bloomz_replace/g' {} +
+
+ bloomz/libbloomz.a: bloomz
+ 	cd bloomz && make libbloomz.a
+
+ go-bert/libgobert.a: go-bert
+ 	$(MAKE) -C go-bert libgobert.a
+
+ backend-assets/gpt4all: gpt4all/gpt4all-bindings/golang/libgpt4all.a
+ 	mkdir -p backend-assets/gpt4all
+ 	@cp gpt4all/gpt4all-bindings/golang/buildllm/*.so backend-assets/gpt4all/ || true
+ 	@cp gpt4all/gpt4all-bindings/golang/buildllm/*.dylib backend-assets/gpt4all/ || true
+ 	@cp gpt4all/gpt4all-bindings/golang/buildllm/*.dll backend-assets/gpt4all/ || true
+
+ backend-assets/espeak-ng-data:
+ 	mkdir -p backend-assets/espeak-ng-data
+ ifdef ESPEAK_DATA
+ 	@cp -rf $(ESPEAK_DATA)/. backend-assets/espeak-ng-data
+ else
+ 	@touch backend-assets/espeak-ng-data/keep
+ endif
+
+ gpt4all/gpt4all-bindings/golang/libgpt4all.a: gpt4all
+ 	$(MAKE) -C gpt4all/gpt4all-bindings/golang/ libgpt4all.a
+
+ ## CEREBRAS GPT
+ go-ggml-transformers:
+ 	git clone --recurse-submodules https://github.com/go-skynet/go-ggml-transformers.cpp go-ggml-transformers
+ 	cd go-ggml-transformers && git checkout -b build $(GOGGMLTRANSFORMERS_VERSION) && git submodule update --init --recursive --depth 1
+ 	# This is hackish, but needed as both go-llama and go-gpt4allj have their own version of ggml..
+ 	@find ./go-ggml-transformers -type f -name "*.c" -exec sed -i'' -e 's/ggml_/ggml_gpt2_/g' {} +
+ 	@find ./go-ggml-transformers -type f -name "*.cpp" -exec sed -i'' -e 's/ggml_/ggml_gpt2_/g' {} +
+ 	@find ./go-ggml-transformers -type f -name "*.h" -exec sed -i'' -e 's/ggml_/ggml_gpt2_/g' {} +
+ 	@find ./go-ggml-transformers -type f -name "*.cpp" -exec sed -i'' -e 's/gpt_print_usage/gpt2_print_usage/g' {} +
+ 	@find ./go-ggml-transformers -type f -name "*.h" -exec sed -i'' -e 's/gpt_print_usage/gpt2_print_usage/g' {} +
+ 	@find ./go-ggml-transformers -type f -name "*.cpp" -exec sed -i'' -e 's/gpt_params_parse/gpt2_params_parse/g' {} +
+ 	@find ./go-ggml-transformers -type f -name "*.h" -exec sed -i'' -e 's/gpt_params_parse/gpt2_params_parse/g' {} +
+ 	@find ./go-ggml-transformers -type f -name "*.cpp" -exec sed -i'' -e 's/gpt_random_prompt/gpt2_random_prompt/g' {} +
177
+ @find ./go-ggml-transformers -type f -name "*.h" -exec sed -i'' -e 's/gpt_random_prompt/gpt2_random_prompt/g' {} +
178
+ @find ./go-ggml-transformers -type f -name "*.cpp" -exec sed -i'' -e 's/json_/json_gpt2_/g' {} +
179
+
180
+ go-ggml-transformers/libtransformers.a: go-ggml-transformers
181
+ $(MAKE) -C go-ggml-transformers libtransformers.a
182
+
183
+ whisper.cpp:
184
+ git clone https://github.com/ggerganov/whisper.cpp.git
185
+ cd whisper.cpp && git checkout -b build $(WHISPER_CPP_VERSION) && git submodule update --init --recursive --depth 1
186
+ @find ./whisper.cpp -type f -name "*.c" -exec sed -i'' -e 's/ggml_/ggml_whisper_/g' {} +
187
+ @find ./whisper.cpp -type f -name "*.cpp" -exec sed -i'' -e 's/ggml_/ggml_whisper_/g' {} +
188
+ @find ./whisper.cpp -type f -name "*.h" -exec sed -i'' -e 's/ggml_/ggml_whisper_/g' {} +
189
+
190
+ whisper.cpp/libwhisper.a: whisper.cpp
191
+ cd whisper.cpp && make libwhisper.a
192
+
193
+ go-llama:
194
+ git clone --recurse-submodules https://github.com/go-skynet/go-llama.cpp go-llama
195
+ cd go-llama && git checkout -b build $(GOLLAMA_VERSION) && git submodule update --init --recursive --depth 1
196
+
197
+ go-llama/libbinding.a: go-llama
198
+ $(MAKE) -C go-llama BUILD_TYPE=$(BUILD_TYPE) libbinding.a
199
+
200
+ go-piper/libpiper_binding.a:
201
+ $(MAKE) -C go-piper libpiper_binding.a example/main
202
+
203
+ get-sources: go-llama go-ggml-transformers gpt4all go-piper go-rwkv whisper.cpp go-bert bloomz go-stable-diffusion
204
+ touch $@
205
+
206
+ replace:
207
+ $(GOCMD) mod edit -replace github.com/go-skynet/go-llama.cpp=$(shell pwd)/go-llama
208
+ $(GOCMD) mod edit -replace github.com/nomic-ai/gpt4all/gpt4all-bindings/golang=$(shell pwd)/gpt4all/gpt4all-bindings/golang
209
+ $(GOCMD) mod edit -replace github.com/go-skynet/go-ggml-transformers.cpp=$(shell pwd)/go-ggml-transformers
210
+ $(GOCMD) mod edit -replace github.com/donomii/go-rwkv.cpp=$(shell pwd)/go-rwkv
211
+ $(GOCMD) mod edit -replace github.com/ggerganov/whisper.cpp=$(shell pwd)/whisper.cpp
212
+ $(GOCMD) mod edit -replace github.com/go-skynet/go-bert.cpp=$(shell pwd)/go-bert
213
+ $(GOCMD) mod edit -replace github.com/go-skynet/bloomz.cpp=$(shell pwd)/bloomz
214
+ $(GOCMD) mod edit -replace github.com/mudler/go-stable-diffusion=$(shell pwd)/go-stable-diffusion
215
+ $(GOCMD) mod edit -replace github.com/mudler/go-piper=$(shell pwd)/go-piper
216
+
217
+ prepare-sources: get-sources replace
218
+ $(GOCMD) mod download
219
+
220
+ ## GENERIC
221
+ rebuild: ## Rebuilds the project
222
+ $(MAKE) -C go-llama clean
223
+ $(MAKE) -C gpt4all/gpt4all-bindings/golang/ clean
224
+ $(MAKE) -C go-ggml-transformers clean
225
+ $(MAKE) -C go-rwkv clean
226
+ $(MAKE) -C whisper.cpp clean
227
+ $(MAKE) -C go-stable-diffusion clean
228
+ $(MAKE) -C go-bert clean
229
+ $(MAKE) -C bloomz clean
230
+ $(MAKE) -C go-piper clean
231
+ $(MAKE) build
232
+
233
+ prepare: prepare-sources backend-assets/gpt4all $(OPTIONAL_TARGETS) go-llama/libbinding.a go-bert/libgobert.a go-ggml-transformers/libtransformers.a go-rwkv/librwkv.a whisper.cpp/libwhisper.a bloomz/libbloomz.a ## Prepares for building
234
+ touch $@
235
+
236
+ clean: ## Remove build related file
237
+ rm -fr ./go-llama
238
+ rm -rf ./gpt4all
239
+ rm -rf ./go-gpt2
240
+ rm -rf ./go-stable-diffusion
241
+ rm -rf ./go-ggml-transformers
242
+ rm -rf ./backend-assets
243
+ rm -rf ./go-rwkv
244
+ rm -rf ./go-bert
245
+ rm -rf ./bloomz
246
+ rm -rf ./whisper.cpp
247
+ rm -rf ./go-piper
248
+ rm -rf $(BINARY_NAME)
249
+ rm -rf release/
250
+
251
+ ## Build:
252
+
253
+ build: prepare ## Build the project
254
+ $(info ${GREEN}I local-ai build info:${RESET})
255
+ $(info ${GREEN}I BUILD_TYPE: ${YELLOW}$(BUILD_TYPE)${RESET})
256
+ $(info ${GREEN}I GO_TAGS: ${YELLOW}$(GO_TAGS)${RESET})
257
+ $(info ${GREEN}I LD_FLAGS: ${YELLOW}$(LD_FLAGS)${RESET})
258
+
259
+ CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=${C_INCLUDE_PATH} LIBRARY_PATH=${LIBRARY_PATH} $(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o $(BINARY_NAME) ./
260
+ ifeq ($(BUILD_TYPE),metal)
261
+ cp go-llama/build/bin/ggml-metal.metal .
262
+ endif
263
+
264
+ dist: build
265
+ mkdir -p release
266
+ cp $(BINARY_NAME) release/$(BINARY_NAME)-$(BUILD_ID)-$(OS)-$(ARCH)
267
+
268
+ generic-build: ## Build the project using generic
269
+ BUILD_TYPE="generic" $(MAKE) build
270
+
271
+ ## Run
272
+ run: prepare ## run local-ai
273
+ CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=${C_INCLUDE_PATH} LIBRARY_PATH=${LIBRARY_PATH} $(GOCMD) run ./
274
+
275
+ test-models/testmodel:
276
+ mkdir test-models
277
+ mkdir test-dir
278
+ wget https://huggingface.co/nnakasato/ggml-model-test/resolve/main/ggml-model-q4.bin -O test-models/testmodel
279
+ wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin -O test-models/whisper-en
280
+ wget https://huggingface.co/skeskinen/ggml/resolve/main/all-MiniLM-L6-v2/ggml-model-q4_0.bin -O test-models/bert
281
+ wget https://cdn.openai.com/whisper/draft-20220913a/micro-machines.wav -O test-dir/audio.wav
282
+ wget https://huggingface.co/mudler/rwkv-4-raven-1.5B-ggml/resolve/main/RWKV-4-Raven-1B5-v11-Eng99%2525-Other1%2525-20230425-ctx4096_Q4_0.bin -O test-models/rwkv
283
+ wget https://raw.githubusercontent.com/saharNooby/rwkv.cpp/5eb8f09c146ea8124633ab041d9ea0b1f1db4459/rwkv/20B_tokenizer.json -O test-models/rwkv.tokenizer.json
284
+ cp tests/models_fixtures/* test-models
285
+
286
+ test: prepare test-models/testmodel
287
+ cp -r backend-assets api
288
+ cp tests/models_fixtures/* test-models
289
+ C_INCLUDE_PATH=${C_INCLUDE_PATH} LIBRARY_PATH=${LIBRARY_PATH} TEST_DIR=$(abspath ./)/test-dir/ FIXTURES=$(abspath ./)/tests/fixtures CONFIG_FILE=$(abspath ./)/test-models/config.yaml MODELS_PATH=$(abspath ./)/test-models $(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --label-filter="!gpt4all && !llama" --flake-attempts 5 -v -r ./api ./pkg
290
+ C_INCLUDE_PATH=${C_INCLUDE_PATH} LIBRARY_PATH=${LIBRARY_PATH} TEST_DIR=$(abspath ./)/test-dir/ FIXTURES=$(abspath ./)/tests/fixtures CONFIG_FILE=$(abspath ./)/test-models/config.yaml MODELS_PATH=$(abspath ./)/test-models $(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --label-filter="gpt4all" --flake-attempts 5 -v -r ./api ./pkg
291
+ C_INCLUDE_PATH=${C_INCLUDE_PATH} LIBRARY_PATH=${LIBRARY_PATH} TEST_DIR=$(abspath ./)/test-dir/ FIXTURES=$(abspath ./)/tests/fixtures CONFIG_FILE=$(abspath ./)/test-models/config.yaml MODELS_PATH=$(abspath ./)/test-models $(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --label-filter="llama" --flake-attempts 5 -v -r ./api ./pkg
292
+
293
+ ## Help:
294
+ help: ## Show this help.
295
+ @echo ''
296
+ @echo 'Usage:'
297
+ @echo ' ${YELLOW}make${RESET} ${GREEN}<target>${RESET}'
298
+ @echo ''
299
+ @echo 'Targets:'
300
+ @awk 'BEGIN {FS = ":.*?## "} { \
301
+ if (/^[a-zA-Z_-]+:.*?##.*$$/) {printf " ${YELLOW}%-20s${GREEN}%s${RESET}\n", $$1, $$2} \
302
+ else if (/^## .*$$/) {printf " ${CYAN}%s${RESET}\n", substr($$1,4)} \
303
+ }' $(MAKEFILE_LIST)
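The repeated `sed -i'' -e 's/ggml_/ggml_<backend>_/g'` runs above exist because several vendored backends each bundle their own copy of ggml; prefixing every `ggml_` symbol per backend avoids duplicate-symbol errors when all the static libraries are linked into one binary. A minimal sketch of that rewrite, using a throwaway file (the path and contents below are made up for illustration; the real targets run the same substitution over every `*.c`/`*.cpp`/`*.h` in the vendored tree):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

func main() {
	// A throwaway source file standing in for a vendored backend's ggml copy.
	dir, err := os.MkdirTemp("", "prefix-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	src := filepath.Join(dir, "demo.c")
	os.WriteFile(src, []byte("int ggml_init(void);\nint ggml_free(void);\n"), 0644)

	// Equivalent of: sed -i'' -e 's/ggml_/ggml_bert_/g' demo.c
	data, _ := os.ReadFile(src)
	os.WriteFile(src, []byte(strings.ReplaceAll(string(data), "ggml_", "ggml_bert_")), 0644)

	out, _ := os.ReadFile(src)
	fmt.Print(string(out))
}
```

After the rewrite every declaration carries the backend prefix (`ggml_bert_init`, `ggml_bert_free`), so this copy of ggml no longer clashes with any other vendored copy at link time.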
api/api.go ADDED
@@ -0,0 +1,167 @@
+ package api
+
+ import (
+ "errors"
+
+ "github.com/go-skynet/LocalAI/internal"
+ "github.com/go-skynet/LocalAI/pkg/assets"
+ "github.com/gofiber/fiber/v2"
+ "github.com/gofiber/fiber/v2/middleware/cors"
+ "github.com/gofiber/fiber/v2/middleware/logger"
+ "github.com/gofiber/fiber/v2/middleware/recover"
+ "github.com/rs/zerolog"
+ "github.com/rs/zerolog/log"
+ )
+
+ func App(opts ...AppOption) (*fiber.App, error) {
+ options := newOptions(opts...)
+
+ zerolog.SetGlobalLevel(zerolog.InfoLevel)
+ if options.debug {
+ zerolog.SetGlobalLevel(zerolog.DebugLevel)
+ }
+
+ // Return errors as JSON responses
+ app := fiber.New(fiber.Config{
+ BodyLimit: options.uploadLimitMB * 1024 * 1024, // overrides Fiber's default 4MB body limit
+ DisableStartupMessage: options.disableMessage,
+ // Override default error handler
+ ErrorHandler: func(ctx *fiber.Ctx, err error) error {
+ // Status code defaults to 500
+ code := fiber.StatusInternalServerError
+
+ // Retrieve the custom status code if it's a *fiber.Error
+ var e *fiber.Error
+ if errors.As(err, &e) {
+ code = e.Code
+ }
+
+ // Send custom error page
+ return ctx.Status(code).JSON(
+ ErrorResponse{
+ Error: &APIError{Message: err.Error(), Code: code},
+ },
+ )
+ },
+ })
+
+ if options.debug {
+ app.Use(logger.New(logger.Config{
+ Format: "[${ip}]:${port} ${status} - ${method} ${path}\n",
+ }))
+ }
+
+ cm := NewConfigMerger()
+ if err := cm.LoadConfigs(options.loader.ModelPath); err != nil {
+ log.Error().Msgf("error loading config files: %s", err.Error())
+ }
+
+ if options.configFile != "" {
+ if err := cm.LoadConfigFile(options.configFile); err != nil {
+ log.Error().Msgf("error loading config file: %s", err.Error())
+ }
+ }
+
+ if options.debug {
+ for _, v := range cm.ListConfigs() {
+ cfg, _ := cm.GetConfig(v)
+ log.Debug().Msgf("Model: %s (config: %+v)", v, cfg)
+ }
+ }
+
+ if options.assetsDestination != "" {
+ // Extract files from the embedded FS
+ err := assets.ExtractFiles(options.backendAssets, options.assetsDestination)
+ if err != nil {
+ log.Warn().Msgf("Failed extracting backend assets files: %s (might be required for some backends to work properly, like gpt4all)", err)
+ }
+ }
+
+ // Default middleware config
+ app.Use(recover.New())
+
+ if options.preloadJSONModels != "" {
+ if err := ApplyGalleryFromString(options.loader.ModelPath, options.preloadJSONModels, cm, options.galleries); err != nil {
+ return nil, err
+ }
+ }
+
+ if options.preloadModelsFromPath != "" {
+ if err := ApplyGalleryFromFile(options.loader.ModelPath, options.preloadModelsFromPath, cm, options.galleries); err != nil {
+ return nil, err
+ }
+ }
+
+ if options.cors {
+ if options.corsAllowOrigins == "" {
+ app.Use(cors.New())
+ } else {
+ app.Use(cors.New(cors.Config{
+ AllowOrigins: options.corsAllowOrigins,
+ }))
+ }
+ }
+
+ // LocalAI API endpoints
+ applier := newGalleryApplier(options.loader.ModelPath)
+ applier.start(options.context, cm)
+
+ app.Get("/version", func(c *fiber.Ctx) error {
+ return c.JSON(struct {
+ Version string `json:"version"`
+ }{Version: internal.PrintableVersion()})
+ })
+
+ app.Post("/models/apply", applyModelGallery(options.loader.ModelPath, cm, applier.C, options.galleries))
+ app.Get("/models/available", listModelFromGallery(options.galleries, options.loader.ModelPath))
+ app.Get("/models/jobs/:uuid", getOpStatus(applier))
+
+ // openAI compatible API endpoints
+
+ // chat
+ app.Post("/v1/chat/completions", chatEndpoint(cm, options))
+ app.Post("/chat/completions", chatEndpoint(cm, options))
+
+ // edit
+ app.Post("/v1/edits", editEndpoint(cm, options))
+ app.Post("/edits", editEndpoint(cm, options))
+
+ // completion
+ app.Post("/v1/completions", completionEndpoint(cm, options))
+ app.Post("/completions", completionEndpoint(cm, options))
+ app.Post("/v1/engines/:model/completions", completionEndpoint(cm, options))
+
+ // embeddings
+ app.Post("/v1/embeddings", embeddingsEndpoint(cm, options))
+ app.Post("/embeddings", embeddingsEndpoint(cm, options))
+ app.Post("/v1/engines/:model/embeddings", embeddingsEndpoint(cm, options))
+
+ // audio
+ app.Post("/v1/audio/transcriptions", transcriptEndpoint(cm, options))
+ app.Post("/tts", ttsEndpoint(cm, options))
+
+ // images
+ app.Post("/v1/images/generations", imageEndpoint(cm, options))
+
+ if options.imageDir != "" {
+ app.Static("/generated-images", options.imageDir)
+ }
+
+ if options.audioDir != "" {
+ app.Static("/generated-audio", options.audioDir)
+ }
+
+ ok := func(c *fiber.Ctx) error {
+ return c.SendStatus(200)
+ }
+
+ // Kubernetes health checks
+ app.Get("/healthz", ok)
+ app.Get("/readyz", ok)
+
+ // models
+ app.Get("/v1/models", listModels(options.loader, cm))
+ app.Get("/models", listModels(options.loader, cm))
+
+ return app, nil
+ }
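The `ErrorHandler` registered in `App` above wraps every failure in a JSON envelope, `{"error": {"message": ..., "code": ...}}`, which is what clients of any of the registered routes see on error. A small self-contained sketch of that envelope from the client's side, using a stdlib `httptest` server as a stand-in for the fiber app (the failing endpoint and error message below are hypothetical):

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// APIError and ErrorResponse mirror the types used by api.go's error handler.
type APIError struct {
	Message string `json:"message"`
	Code    int    `json:"code"`
}

type ErrorResponse struct {
	Error *APIError `json:"error"`
}

func main() {
	// A stand-in endpoint that always fails, encoded the way the
	// fiber ErrorHandler would encode it.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusInternalServerError)
		json.NewEncoder(w).Encode(ErrorResponse{Error: &APIError{Message: "could not load model", Code: 500}})
	}))
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/v1/chat/completions")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Printf("%d %s", resp.StatusCode, string(body))
}
```

This matches what `api_test.go` asserts on below: failed completions surface as a status 500 plus a `message` string the client can inspect.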
api/api_test.go ADDED
@@ -0,0 +1,514 @@
+ package api_test
+
+ import (
+ "bytes"
+ "context"
+ "embed"
+ "encoding/json"
+ "fmt"
+ "io/ioutil"
+ "net/http"
+ "os"
+ "path/filepath"
+ "runtime"
+
+ . "github.com/go-skynet/LocalAI/api"
+ "github.com/go-skynet/LocalAI/pkg/gallery"
+ "github.com/go-skynet/LocalAI/pkg/model"
+ "github.com/go-skynet/LocalAI/pkg/utils"
+ "github.com/gofiber/fiber/v2"
+ . "github.com/onsi/ginkgo/v2"
+ . "github.com/onsi/gomega"
+ "gopkg.in/yaml.v3"
+
+ openaigo "github.com/otiai10/openaigo"
+ "github.com/sashabaranov/go-openai"
+ )
+
+ type modelApplyRequest struct {
+ ID string `json:"id"`
+ URL string `json:"url"`
+ Name string `json:"name"`
+ Overrides map[string]string `json:"overrides"`
+ }
+
+ func getModelStatus(url string) (response map[string]interface{}) {
+ // Create the HTTP request
+ resp, err := http.Get(url)
+ if err != nil {
+ fmt.Println("Error creating request:", err)
+ return
+ }
+ defer resp.Body.Close()
+
+ body, err := ioutil.ReadAll(resp.Body)
+ if err != nil {
+ fmt.Println("Error reading response body:", err)
+ return
+ }
+
+ // Unmarshal the response into a map[string]interface{}
+ err = json.Unmarshal(body, &response)
+ if err != nil {
+ fmt.Println("Error unmarshaling JSON response:", err)
+ return
+ }
+ return
+ }
+
+ func getModels(url string) (response []gallery.GalleryModel) {
+ utils.GetURI(url, func(url string, i []byte) error {
+ // Unmarshal YAML data into a struct
+ return json.Unmarshal(i, &response)
+ })
+ return
+ }
+
+ func postModelApplyRequest(url string, request modelApplyRequest) (response map[string]interface{}) {
+
+ //url := "http://localhost:AI/models/apply"
+
+ // Create the request payload
+
+ payload, err := json.Marshal(request)
+ if err != nil {
+ fmt.Println("Error marshaling JSON:", err)
+ return
+ }
+
+ // Create the HTTP request
+ req, err := http.NewRequest("POST", url, bytes.NewBuffer(payload))
+ if err != nil {
+ fmt.Println("Error creating request:", err)
+ return
+ }
+ req.Header.Set("Content-Type", "application/json")
+
+ // Make the request
+ client := &http.Client{}
+ resp, err := client.Do(req)
+ if err != nil {
+ fmt.Println("Error making request:", err)
+ return
+ }
+ defer resp.Body.Close()
+
+ body, err := ioutil.ReadAll(resp.Body)
+ if err != nil {
+ fmt.Println("Error reading response body:", err)
+ return
+ }
+
+ // Unmarshal the response into a map[string]interface{}
+ err = json.Unmarshal(body, &response)
+ if err != nil {
+ fmt.Println("Error unmarshaling JSON response:", err)
+ return
+ }
+ return
+ }
+
+ //go:embed backend-assets/*
+ var backendAssets embed.FS
+
+ var _ = Describe("API test", func() {
+
+ var app *fiber.App
+ var modelLoader *model.ModelLoader
+ var client *openai.Client
+ var client2 *openaigo.Client
+ var c context.Context
+ var cancel context.CancelFunc
+ var tmpdir string
+
+ Context("API with ephemeral models", func() {
+ BeforeEach(func() {
+ var err error
+ tmpdir, err = os.MkdirTemp("", "")
+ Expect(err).ToNot(HaveOccurred())
+
+ modelLoader = model.NewModelLoader(tmpdir)
+ c, cancel = context.WithCancel(context.Background())
+
+ g := []gallery.GalleryModel{
+ {
+ Name: "bert",
+ URL: "https://raw.githubusercontent.com/go-skynet/model-gallery/main/bert-embeddings.yaml",
+ },
+ {
+ Name: "bert2",
+ URL: "https://raw.githubusercontent.com/go-skynet/model-gallery/main/bert-embeddings.yaml",
+ Overrides: map[string]interface{}{"foo": "bar"},
+ AdditionalFiles: []gallery.File{gallery.File{Filename: "foo.yaml", URI: "https://raw.githubusercontent.com/go-skynet/model-gallery/main/bert-embeddings.yaml"}},
+ },
+ }
+ out, err := yaml.Marshal(g)
+ Expect(err).ToNot(HaveOccurred())
+ err = ioutil.WriteFile(filepath.Join(tmpdir, "gallery_simple.yaml"), out, 0644)
+ Expect(err).ToNot(HaveOccurred())
+
+ galleries := []gallery.Gallery{
+ {
+ Name: "test",
+ URL: "file://" + filepath.Join(tmpdir, "gallery_simple.yaml"),
+ },
+ }
+
+ app, err = App(WithContext(c),
+ WithGalleries(galleries),
+ WithModelLoader(modelLoader), WithBackendAssets(backendAssets), WithBackendAssetsOutput(tmpdir))
+ Expect(err).ToNot(HaveOccurred())
+ go app.Listen("127.0.0.1:9090")
+
+ defaultConfig := openai.DefaultConfig("")
+ defaultConfig.BaseURL = "http://127.0.0.1:9090/v1"
+
+ client2 = openaigo.NewClient("")
+ client2.BaseURL = defaultConfig.BaseURL
+
+ // Wait for API to be ready
+ client = openai.NewClientWithConfig(defaultConfig)
+ Eventually(func() error {
+ _, err := client.ListModels(context.TODO())
+ return err
+ }, "2m").ShouldNot(HaveOccurred())
+ })
+
+ AfterEach(func() {
+ cancel()
+ app.Shutdown()
+ os.RemoveAll(tmpdir)
+ })
+
+ Context("Applying models", func() {
+ It("applies models from a gallery", func() {
+
+ models := getModels("http://127.0.0.1:9090/models/available")
+ Expect(len(models)).To(Equal(2), fmt.Sprint(models))
+ Expect(models[0].Installed).To(BeFalse(), fmt.Sprint(models))
+ Expect(models[1].Installed).To(BeFalse(), fmt.Sprint(models))
+
+ response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{
+ ID: "test@bert2",
+ })
+
+ Expect(response["uuid"]).ToNot(BeEmpty(), fmt.Sprint(response))
+
+ uuid := response["uuid"].(string)
+ resp := map[string]interface{}{}
+ Eventually(func() bool {
+ response := getModelStatus("http://127.0.0.1:9090/models/jobs/" + uuid)
+ fmt.Println(response)
+ resp = response
+ return response["processed"].(bool)
+ }, "360s").Should(Equal(true))
+ Expect(resp["message"]).ToNot(ContainSubstring("error"))
+
+ dat, err := os.ReadFile(filepath.Join(tmpdir, "bert2.yaml"))
+ Expect(err).ToNot(HaveOccurred())
+
+ _, err = os.ReadFile(filepath.Join(tmpdir, "foo.yaml"))
+ Expect(err).ToNot(HaveOccurred())
+
+ content := map[string]interface{}{}
+ err = yaml.Unmarshal(dat, &content)
+ Expect(err).ToNot(HaveOccurred())
+ Expect(content["backend"]).To(Equal("bert-embeddings"))
+ Expect(content["foo"]).To(Equal("bar"))
+
+ models = getModels("http://127.0.0.1:9090/models/available")
+ Expect(len(models)).To(Equal(2), fmt.Sprint(models))
+ Expect(models[0].Name).To(Or(Equal("bert"), Equal("bert2")))
+ Expect(models[1].Name).To(Or(Equal("bert"), Equal("bert2")))
+ for _, m := range models {
+ if m.Name == "bert2" {
+ Expect(m.Installed).To(BeTrue())
+ } else {
+ Expect(m.Installed).To(BeFalse())
+ }
+ }
+ })
+ It("overrides models", func() {
+ response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{
+ URL: "https://raw.githubusercontent.com/go-skynet/model-gallery/main/bert-embeddings.yaml",
+ Name: "bert",
+ Overrides: map[string]string{
+ "backend": "llama",
+ },
+ })
+
+ Expect(response["uuid"]).ToNot(BeEmpty(), fmt.Sprint(response))
+
+ uuid := response["uuid"].(string)
+
+ Eventually(func() bool {
+ response := getModelStatus("http://127.0.0.1:9090/models/jobs/" + uuid)
+ fmt.Println(response)
+ return response["processed"].(bool)
+ }, "360s").Should(Equal(true))
+
+ dat, err := os.ReadFile(filepath.Join(tmpdir, "bert.yaml"))
+ Expect(err).ToNot(HaveOccurred())
+
+ content := map[string]interface{}{}
+ err = yaml.Unmarshal(dat, &content)
+ Expect(err).ToNot(HaveOccurred())
+ Expect(content["backend"]).To(Equal("llama"))
+ })
+ It("applies models without overrides", func() {
+ response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{
+ URL: "https://raw.githubusercontent.com/go-skynet/model-gallery/main/bert-embeddings.yaml",
+ Name: "bert",
+ Overrides: map[string]string{},
+ })
+
+ Expect(response["uuid"]).ToNot(BeEmpty(), fmt.Sprint(response))
+
+ uuid := response["uuid"].(string)
+
+ Eventually(func() bool {
+ response := getModelStatus("http://127.0.0.1:9090/models/jobs/" + uuid)
+ fmt.Println(response)
+ return response["processed"].(bool)
+ }, "360s").Should(Equal(true))
+
+ dat, err := os.ReadFile(filepath.Join(tmpdir, "bert.yaml"))
+ Expect(err).ToNot(HaveOccurred())
+
+ content := map[string]interface{}{}
+ err = yaml.Unmarshal(dat, &content)
+ Expect(err).ToNot(HaveOccurred())
+ Expect(content["backend"]).To(Equal("bert-embeddings"))
+ })
+
+ It("runs openllama", Label("llama"), func() {
+ if runtime.GOOS != "linux" {
+ Skip("test supported only on linux")
+ }
+ response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{
+ URL: "github:go-skynet/model-gallery/openllama_3b.yaml",
+ Name: "openllama_3b",
+ Overrides: map[string]string{},
+ })
+
+ Expect(response["uuid"]).ToNot(BeEmpty(), fmt.Sprint(response))
+
+ uuid := response["uuid"].(string)
+
+ Eventually(func() bool {
+ response := getModelStatus("http://127.0.0.1:9090/models/jobs/" + uuid)
+ fmt.Println(response)
+ return response["processed"].(bool)
+ }, "360s").Should(Equal(true))
+
+ resp, err := client.CreateCompletion(context.TODO(), openai.CompletionRequest{Model: "openllama_3b", Prompt: "Count up to five: one, two, three, four, "})
+ Expect(err).ToNot(HaveOccurred())
+ Expect(len(resp.Choices)).To(Equal(1))
+ Expect(resp.Choices[0].Text).To(ContainSubstring("five"))
+ })
+
+ It("runs gpt4all", Label("gpt4all"), func() {
+ if runtime.GOOS != "linux" {
+ Skip("test supported only on linux")
+ }
+
+ response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{
+ URL: "github:go-skynet/model-gallery/gpt4all-j.yaml",
+ Name: "gpt4all-j",
+ Overrides: map[string]string{},
+ })
+
+ Expect(response["uuid"]).ToNot(BeEmpty(), fmt.Sprint(response))
+
+ uuid := response["uuid"].(string)
+
+ Eventually(func() bool {
+ response := getModelStatus("http://127.0.0.1:9090/models/jobs/" + uuid)
+ fmt.Println(response)
+ return response["processed"].(bool)
+ }, "360s").Should(Equal(true))
+
+ resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{Model: "gpt4all-j", Messages: []openai.ChatCompletionMessage{openai.ChatCompletionMessage{Role: "user", Content: "How are you?"}}})
+ Expect(err).ToNot(HaveOccurred())
+ Expect(len(resp.Choices)).To(Equal(1))
+ Expect(resp.Choices[0].Message.Content).To(ContainSubstring("well"))
+ })
+ })
+ })
+
+ Context("API query", func() {
+ BeforeEach(func() {
+ modelLoader = model.NewModelLoader(os.Getenv("MODELS_PATH"))
+ c, cancel = context.WithCancel(context.Background())
+
+ var err error
+ app, err = App(WithContext(c), WithModelLoader(modelLoader))
+ Expect(err).ToNot(HaveOccurred())
+ go app.Listen("127.0.0.1:9090")
+
+ defaultConfig := openai.DefaultConfig("")
+ defaultConfig.BaseURL = "http://127.0.0.1:9090/v1"
+
+ client2 = openaigo.NewClient("")
+ client2.BaseURL = defaultConfig.BaseURL
+
+ // Wait for API to be ready
+ client = openai.NewClientWithConfig(defaultConfig)
+ Eventually(func() error {
+ _, err := client.ListModels(context.TODO())
+ return err
+ }, "2m").ShouldNot(HaveOccurred())
+ })
+ AfterEach(func() {
+ cancel()
+ app.Shutdown()
+ })
+ It("returns the models list", func() {
+ models, err := client.ListModels(context.TODO())
+ Expect(err).ToNot(HaveOccurred())
+ Expect(len(models.Models)).To(Equal(10))
+ })
+ It("can generate completions", func() {
+ resp, err := client.CreateCompletion(context.TODO(), openai.CompletionRequest{Model: "testmodel", Prompt: "abcdedfghikl"})
+ Expect(err).ToNot(HaveOccurred())
+ Expect(len(resp.Choices)).To(Equal(1))
+ Expect(resp.Choices[0].Text).ToNot(BeEmpty())
+ })
+
+ It("can generate chat completions", func() {
+ resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{Model: "testmodel", Messages: []openai.ChatCompletionMessage{openai.ChatCompletionMessage{Role: "user", Content: "abcdedfghikl"}}})
+ Expect(err).ToNot(HaveOccurred())
+ Expect(len(resp.Choices)).To(Equal(1))
+ Expect(resp.Choices[0].Message.Content).ToNot(BeEmpty())
+ })
+
+ It("can generate completions from model configs", func() {
+ resp, err := client.CreateCompletion(context.TODO(), openai.CompletionRequest{Model: "gpt4all", Prompt: "abcdedfghikl"})
+ Expect(err).ToNot(HaveOccurred())
+ Expect(len(resp.Choices)).To(Equal(1))
+ Expect(resp.Choices[0].Text).ToNot(BeEmpty())
+ })
+
+ It("can generate chat completions from model configs", func() {
+ resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{Model: "gpt4all-2", Messages: []openai.ChatCompletionMessage{openai.ChatCompletionMessage{Role: "user", Content: "abcdedfghikl"}}})
+ Expect(err).ToNot(HaveOccurred())
+ Expect(len(resp.Choices)).To(Equal(1))
+ Expect(resp.Choices[0].Message.Content).ToNot(BeEmpty())
+ })
+
+ It("returns errors", func() {
+ _, err := client.CreateCompletion(context.TODO(), openai.CompletionRequest{Model: "foomodel", Prompt: "abcdedfghikl"})
+ Expect(err).To(HaveOccurred())
+ Expect(err.Error()).To(ContainSubstring("error, status code: 500, message: could not load model - all backends returned error: 11 errors occurred:"))
+ })
+ It("transcribes audio", func() {
+ if runtime.GOOS != "linux" {
+ Skip("test supported only on linux")
+ }
+ resp, err := client.CreateTranscription(
+ context.Background(),
+ openai.AudioRequest{
+ Model: openai.Whisper1,
+ FilePath: filepath.Join(os.Getenv("TEST_DIR"), "audio.wav"),
+ },
+ )
+ Expect(err).ToNot(HaveOccurred())
+ Expect(resp.Text).To(ContainSubstring("This is the Micro Machine Man presenting"))
+ })
+
+ It("calculates embeddings", func() {
+ if runtime.GOOS != "linux" {
+ Skip("test supported only on linux")
+ }
+ resp, err := client.CreateEmbeddings(
+ context.Background(),
+ openai.EmbeddingRequest{
+ Model: openai.AdaEmbeddingV2,
+ Input: []string{"sun", "cat"},
+ },
+ )
+ Expect(err).ToNot(HaveOccurred())
+ Expect(len(resp.Data[0].Embedding)).To(BeNumerically("==", 384))
+ Expect(len(resp.Data[1].Embedding)).To(BeNumerically("==", 384))
+
+ sunEmbedding := resp.Data[0].Embedding
+ resp2, err := client.CreateEmbeddings(
+ context.Background(),
+ openai.EmbeddingRequest{
+ Model: openai.AdaEmbeddingV2,
+ Input: []string{"sun"},
+ },
+ )
+ Expect(err).ToNot(HaveOccurred())
+ Expect(resp2.Data[0].Embedding).To(Equal(sunEmbedding))
+ })
+
+ Context("backends", func() {
+ It("runs rwkv", func() {
+ if runtime.GOOS != "linux" {
+ Skip("test supported only on linux")
+ }
+ resp, err := client.CreateCompletion(context.TODO(), openai.CompletionRequest{Model: "rwkv_test", Prompt: "Count up to five: one, two, three, four,"})
+ Expect(err).ToNot(HaveOccurred())
+ Expect(len(resp.Choices) > 0).To(BeTrue())
+ Expect(resp.Choices[0].Text).To(Equal(" five."))
+ })
+ })
+ })
+
+ Context("Config file", func() {
+ BeforeEach(func() {
+ modelLoader = model.NewModelLoader(os.Getenv("MODELS_PATH"))
+ c, cancel = context.WithCancel(context.Background())
+
+ var err error
+ app, err = App(WithContext(c), WithModelLoader(modelLoader), WithConfigFile(os.Getenv("CONFIG_FILE")))
+ Expect(err).ToNot(HaveOccurred())
+ go app.Listen("127.0.0.1:9090")
+
+ defaultConfig := openai.DefaultConfig("")
+ defaultConfig.BaseURL = "http://127.0.0.1:9090/v1"
+ client2 = openaigo.NewClient("")
+ client2.BaseURL = defaultConfig.BaseURL
+ // Wait for API to be ready
+ client = openai.NewClientWithConfig(defaultConfig)
+ Eventually(func() error {
+ _, err := client.ListModels(context.TODO())
+ return err
+ }, "2m").ShouldNot(HaveOccurred())
+ })
+ AfterEach(func() {
+ cancel()
+ app.Shutdown()
+ })
+ It("returns the models list from the config file", func() {
+ models, err := client.ListModels(context.TODO())
+ Expect(err).ToNot(HaveOccurred())
+ Expect(len(models.Models)).To(Equal(12))
+ })
+ It("can generate chat completions from config file", func() {
+ resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{Model: "list1", Messages: []openai.ChatCompletionMessage{openai.ChatCompletionMessage{Role: "user", Content: "abcdedfghikl"}}})
+ Expect(err).ToNot(HaveOccurred())
+ Expect(len(resp.Choices)).To(Equal(1))
+ Expect(resp.Choices[0].Message.Content).ToNot(BeEmpty())
+ })
+ It("can generate chat completions from config file", func() {
+ resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{Model: "list2", Messages: []openai.ChatCompletionMessage{openai.ChatCompletionMessage{Role: "user", Content: "abcdedfghikl"}}})
+ Expect(err).ToNot(HaveOccurred())
+ Expect(len(resp.Choices)).To(Equal(1))
+ Expect(resp.Choices[0].Message.Content).ToNot(BeEmpty())
+ })
+ It("can generate edit completions from config file", func() {
+ request := openaigo.EditCreateRequestBody{
+ Model: "list2",
+ Instruction: "foo",
+ Input: "bar",
506
+ }
507
+ resp, err := client2.CreateEdit(context.Background(), request)
508
+ Expect(err).ToNot(HaveOccurred())
509
+ Expect(len(resp.Choices)).To(Equal(1))
510
+ Expect(resp.Choices[0].Text).ToNot(BeEmpty())
511
+ })
512
+
513
+ })
514
+ })
api/apt_suite_test.go ADDED
@@ -0,0 +1,13 @@
+ package api_test
+
+ import (
+ 	"testing"
+
+ 	. "github.com/onsi/ginkgo/v2"
+ 	. "github.com/onsi/gomega"
+ )
+
+ func TestLocalAI(t *testing.T) {
+ 	RegisterFailHandler(Fail)
+ 	RunSpecs(t, "LocalAI test suite")
+ }
api/config.go ADDED
@@ -0,0 +1,368 @@
+ package api
+
+ import (
+ 	"encoding/json"
+ 	"fmt"
+ 	"io/fs"
+ 	"os"
+ 	"path/filepath"
+ 	"strings"
+ 	"sync"
+
+ 	model "github.com/go-skynet/LocalAI/pkg/model"
+ 	"github.com/gofiber/fiber/v2"
+ 	"github.com/rs/zerolog/log"
+ 	"gopkg.in/yaml.v3"
+ )
+
+ type Config struct {
+ 	OpenAIRequest  `yaml:"parameters"`
+ 	Name           string            `yaml:"name"`
+ 	StopWords      []string          `yaml:"stopwords"`
+ 	Cutstrings     []string          `yaml:"cutstrings"`
+ 	TrimSpace      []string          `yaml:"trimspace"`
+ 	ContextSize    int               `yaml:"context_size"`
+ 	F16            bool              `yaml:"f16"`
+ 	NUMA           bool              `yaml:"numa"`
+ 	Threads        int               `yaml:"threads"`
+ 	Debug          bool              `yaml:"debug"`
+ 	Roles          map[string]string `yaml:"roles"`
+ 	Embeddings     bool              `yaml:"embeddings"`
+ 	Backend        string            `yaml:"backend"`
+ 	TemplateConfig TemplateConfig    `yaml:"template"`
+ 	MirostatETA    float64           `yaml:"mirostat_eta"`
+ 	MirostatTAU    float64           `yaml:"mirostat_tau"`
+ 	Mirostat       int               `yaml:"mirostat"`
+ 	NGPULayers     int               `yaml:"gpu_layers"`
+ 	MMap           bool              `yaml:"mmap"`
+ 	MMlock         bool              `yaml:"mmlock"`
+ 	LowVRAM        bool              `yaml:"low_vram"`
+
+ 	TensorSplit           string `yaml:"tensor_split"`
+ 	MainGPU               string `yaml:"main_gpu"`
+ 	ImageGenerationAssets string `yaml:"asset_dir"`
+
+ 	PromptCachePath string `yaml:"prompt_cache_path"`
+ 	PromptCacheAll  bool   `yaml:"prompt_cache_all"`
+ 	PromptCacheRO   bool   `yaml:"prompt_cache_ro"`
+
+ 	PromptStrings, InputStrings []string
+ 	InputToken                  [][]int
+ }
+
+ type TemplateConfig struct {
+ 	Completion string `yaml:"completion"`
+ 	Chat       string `yaml:"chat"`
+ 	Edit       string `yaml:"edit"`
+ }
+
+ type ConfigMerger struct {
+ 	configs map[string]Config
+ 	sync.Mutex
+ }
+
+ func defaultConfig(modelFile string) *Config {
+ 	return &Config{
+ 		OpenAIRequest: defaultRequest(modelFile),
+ 	}
+ }
+
+ func NewConfigMerger() *ConfigMerger {
+ 	return &ConfigMerger{
+ 		configs: make(map[string]Config),
+ 	}
+ }
+
+ func ReadConfigFile(file string) ([]*Config, error) {
+ 	c := &[]*Config{}
+ 	f, err := os.ReadFile(file)
+ 	if err != nil {
+ 		return nil, fmt.Errorf("cannot read config file: %w", err)
+ 	}
+ 	if err := yaml.Unmarshal(f, c); err != nil {
+ 		return nil, fmt.Errorf("cannot unmarshal config file: %w", err)
+ 	}
+
+ 	return *c, nil
+ }
+
+ func ReadConfig(file string) (*Config, error) {
+ 	c := &Config{}
+ 	f, err := os.ReadFile(file)
+ 	if err != nil {
+ 		return nil, fmt.Errorf("cannot read config file: %w", err)
+ 	}
+ 	if err := yaml.Unmarshal(f, c); err != nil {
+ 		return nil, fmt.Errorf("cannot unmarshal config file: %w", err)
+ 	}
+
+ 	return c, nil
+ }
+
+ func (cm *ConfigMerger) LoadConfigFile(file string) error {
+ 	cm.Lock()
+ 	defer cm.Unlock()
+ 	c, err := ReadConfigFile(file)
+ 	if err != nil {
+ 		return fmt.Errorf("cannot load config file: %w", err)
+ 	}
+
+ 	for _, cc := range c {
+ 		cm.configs[cc.Name] = *cc
+ 	}
+ 	return nil
+ }
+
+ func (cm *ConfigMerger) LoadConfig(file string) error {
+ 	cm.Lock()
+ 	defer cm.Unlock()
+ 	c, err := ReadConfig(file)
+ 	if err != nil {
+ 		return fmt.Errorf("cannot read config file: %w", err)
+ 	}
+
+ 	cm.configs[c.Name] = *c
+ 	return nil
+ }
+
+ func (cm *ConfigMerger) GetConfig(m string) (Config, bool) {
+ 	cm.Lock()
+ 	defer cm.Unlock()
+ 	v, exists := cm.configs[m]
+ 	return v, exists
+ }
+
+ func (cm *ConfigMerger) ListConfigs() []string {
+ 	cm.Lock()
+ 	defer cm.Unlock()
+ 	var res []string
+ 	for k := range cm.configs {
+ 		res = append(res, k)
+ 	}
+ 	return res
+ }
+
+ func (cm *ConfigMerger) LoadConfigs(path string) error {
+ 	cm.Lock()
+ 	defer cm.Unlock()
+ 	entries, err := os.ReadDir(path)
+ 	if err != nil {
+ 		return err
+ 	}
+ 	files := make([]fs.FileInfo, 0, len(entries))
+ 	for _, entry := range entries {
+ 		info, err := entry.Info()
+ 		if err != nil {
+ 			return err
+ 		}
+ 		files = append(files, info)
+ 	}
+ 	for _, file := range files {
+ 		// Only process YAML config files; skip templates, model blobs and .keep files
+ 		if !strings.Contains(file.Name(), ".yaml") {
+ 			continue
+ 		}
+ 		c, err := ReadConfig(filepath.Join(path, file.Name()))
+ 		if err == nil {
+ 			cm.configs[c.Name] = *c
+ 		}
+ 	}
+
+ 	return nil
+ }
+
+ func updateConfig(config *Config, input *OpenAIRequest) {
+ 	if input.Echo {
+ 		config.Echo = input.Echo
+ 	}
+ 	if input.TopK != 0 {
+ 		config.TopK = input.TopK
+ 	}
+ 	if input.TopP != 0 {
+ 		config.TopP = input.TopP
+ 	}
+
+ 	if input.Temperature != 0 {
+ 		config.Temperature = input.Temperature
+ 	}
+
+ 	if input.Maxtokens != 0 {
+ 		config.Maxtokens = input.Maxtokens
+ 	}
+
+ 	switch stop := input.Stop.(type) {
+ 	case string:
+ 		if stop != "" {
+ 			config.StopWords = append(config.StopWords, stop)
+ 		}
+ 	case []interface{}:
+ 		for _, pp := range stop {
+ 			if s, ok := pp.(string); ok {
+ 				config.StopWords = append(config.StopWords, s)
+ 			}
+ 		}
+ 	}
+
+ 	if input.RepeatPenalty != 0 {
+ 		config.RepeatPenalty = input.RepeatPenalty
+ 	}
+
+ 	if input.Keep != 0 {
+ 		config.Keep = input.Keep
+ 	}
+
+ 	if input.Batch != 0 {
+ 		config.Batch = input.Batch
+ 	}
+
+ 	if input.F16 {
+ 		config.F16 = input.F16
+ 	}
+
+ 	if input.IgnoreEOS {
+ 		config.IgnoreEOS = input.IgnoreEOS
+ 	}
+
+ 	if input.Seed != 0 {
+ 		config.Seed = input.Seed
+ 	}
+
+ 	if input.Mirostat != 0 {
+ 		config.Mirostat = input.Mirostat
+ 	}
+
+ 	if input.MirostatETA != 0 {
+ 		config.MirostatETA = input.MirostatETA
+ 	}
+
+ 	if input.MirostatTAU != 0 {
+ 		config.MirostatTAU = input.MirostatTAU
+ 	}
+
+ 	if input.TypicalP != 0 {
+ 		config.TypicalP = input.TypicalP
+ 	}
+
+ 	switch inputs := input.Input.(type) {
+ 	case string:
+ 		if inputs != "" {
+ 			config.InputStrings = append(config.InputStrings, inputs)
+ 		}
+ 	case []interface{}:
+ 		for _, pp := range inputs {
+ 			switch i := pp.(type) {
+ 			case string:
+ 				config.InputStrings = append(config.InputStrings, i)
+ 			case []interface{}:
+ 				tokens := []int{}
+ 				for _, ii := range i {
+ 					tokens = append(tokens, int(ii.(float64)))
+ 				}
+ 				config.InputToken = append(config.InputToken, tokens)
+ 			}
+ 		}
+ 	}
+
+ 	switch p := input.Prompt.(type) {
+ 	case string:
+ 		config.PromptStrings = append(config.PromptStrings, p)
+ 	case []interface{}:
+ 		for _, pp := range p {
+ 			if s, ok := pp.(string); ok {
+ 				config.PromptStrings = append(config.PromptStrings, s)
+ 			}
+ 		}
+ 	}
+ }
+
+ func readInput(c *fiber.Ctx, loader *model.ModelLoader, randomModel bool) (string, *OpenAIRequest, error) {
+ 	input := new(OpenAIRequest)
+ 	// Get input data from the request body
+ 	if err := c.BodyParser(input); err != nil {
+ 		return "", nil, err
+ 	}
+
+ 	modelFile := input.Model
+
+ 	if c.Params("model") != "" {
+ 		modelFile = c.Params("model")
+ 	}
+
+ 	received, _ := json.Marshal(input)
+
+ 	log.Debug().Msgf("Request received: %s", string(received))
+
+ 	// Set model from bearer token, if available.
+ 	// Note: TrimPrefix strips only the literal "Bearer " prefix, whereas
+ 	// TrimLeft would also eat leading characters of the token itself.
+ 	bearer := strings.TrimPrefix(c.Get("authorization"), "Bearer ")
+ 	bearerExists := bearer != "" && loader.ExistsInModelPath(bearer)
+
+ 	// If no model was specified, take the first available
+ 	if modelFile == "" && !bearerExists && randomModel {
+ 		models, _ := loader.ListModels()
+ 		if len(models) > 0 {
+ 			modelFile = models[0]
+ 			log.Debug().Msgf("No model specified, using: %s", modelFile)
+ 		} else {
+ 			log.Debug().Msgf("No model specified, returning error")
+ 			return "", nil, fmt.Errorf("no model specified")
+ 		}
+ 	}
+
+ 	// A model found in the bearer token takes precedence
+ 	if bearerExists {
+ 		log.Debug().Msgf("Using model from bearer token: %s", bearer)
+ 		modelFile = bearer
+ 	}
+ 	return modelFile, input, nil
+ }
+
+ func readConfig(modelFile string, input *OpenAIRequest, cm *ConfigMerger, loader *model.ModelLoader, debug bool, threads, ctx int, f16 bool) (*Config, *OpenAIRequest, error) {
+ 	// Load a config file if present after the model name
+ 	modelConfig := filepath.Join(loader.ModelPath, modelFile+".yaml")
+
+ 	var config *Config
+
+ 	defaults := func() {
+ 		config = defaultConfig(modelFile)
+ 		config.ContextSize = ctx
+ 		config.Threads = threads
+ 		config.F16 = f16
+ 		config.Debug = debug
+ 	}
+
+ 	cfg, exists := cm.GetConfig(modelFile)
+ 	if !exists {
+ 		if _, err := os.Stat(modelConfig); err == nil {
+ 			if err := cm.LoadConfig(modelConfig); err != nil {
+ 				return nil, nil, fmt.Errorf("failed loading model config (%s) %s", modelConfig, err.Error())
+ 			}
+ 			cfg, exists = cm.GetConfig(modelFile)
+ 			if exists {
+ 				config = &cfg
+ 			} else {
+ 				defaults()
+ 			}
+ 		} else {
+ 			defaults()
+ 		}
+ 	} else {
+ 		config = &cfg
+ 	}
+
+ 	// Set the parameters for the language model prediction
+ 	updateConfig(config, input)
+
+ 	// Don't allow 0 as setting
+ 	if config.Threads == 0 {
+ 		if threads != 0 {
+ 			config.Threads = threads
+ 		} else {
+ 			config.Threads = 4
+ 		}
+ 	}
+
+ 	// Enforce debug flag if passed from CLI
+ 	if debug {
+ 		config.Debug = true
+ 	}
+
+ 	return config, input, nil
+ }
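
The `Config` struct above is deserialized from per-model YAML definitions; a sketch of such an entry follows, using the `yaml` tags declared on the struct. All model, file, and template names here are hypothetical, and only a subset of fields is shown.

```yaml
# Hypothetical model definition matching the yaml tags in Config above.
name: gpt-3.5-turbo        # Config.Name: the model name exposed over the API
backend: llama             # Config.Backend
context_size: 1024         # Config.ContextSize
threads: 4                 # Config.Threads
f16: true                  # Config.F16
stopwords:                 # Config.StopWords
  - "HUMAN:"
parameters:                # inlined OpenAIRequest (see the embedded field)
  model: ggml-model.bin    # backing model file on disk
  temperature: 0.2
  top_p: 0.7
template:                  # Config.TemplateConfig
  completion: completion   # refers to a completion.tmpl file
  chat: chat               # refers to a chat.tmpl file
```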
api/config_test.go ADDED
@@ -0,0 +1,54 @@
+ package api
+
+ import (
+ 	"os"
+
+ 	"github.com/go-skynet/LocalAI/pkg/model"
+ 	. "github.com/onsi/ginkgo/v2"
+ 	. "github.com/onsi/gomega"
+ )
+
+ var _ = Describe("Test cases for config related functions", func() {
+
+ 	var (
+ 		configFile string
+ 	)
+
+ 	Context("Test Read configuration functions", func() {
+ 		configFile = os.Getenv("CONFIG_FILE")
+ 		It("Test ReadConfigFile", func() {
+ 			config, err := ReadConfigFile(configFile)
+ 			Expect(err).To(BeNil())
+ 			Expect(config).ToNot(BeNil())
+ 			// two configs in config.yaml
+ 			Expect(config[0].Name).To(Equal("list1"))
+ 			Expect(config[1].Name).To(Equal("list2"))
+ 		})
+
+ 		It("Test LoadConfigs", func() {
+ 			cm := NewConfigMerger()
+ 			options := newOptions()
+ 			modelLoader := model.NewModelLoader(os.Getenv("MODELS_PATH"))
+ 			WithModelLoader(modelLoader)(options)
+
+ 			err := cm.LoadConfigs(options.loader.ModelPath)
+ 			Expect(err).To(BeNil())
+ 			Expect(cm.configs).ToNot(BeNil())
+
+ 			// configs should include the gpt4all model's config
+ 			Expect(cm.configs).To(HaveKey("gpt4all"))
+
+ 			// configs should include the gpt4all-2 model's config
+ 			Expect(cm.configs).To(HaveKey("gpt4all-2"))
+
+ 			// configs should include the text-embedding-ada-002 model's config
+ 			Expect(cm.configs).To(HaveKey("text-embedding-ada-002"))
+
+ 			// configs should include the rwkv_test model's config
+ 			Expect(cm.configs).To(HaveKey("rwkv_test"))
+
+ 			// configs should include the whisper-1 model's config
+ 			Expect(cm.configs).To(HaveKey("whisper-1"))
+ 		})
+ 	})
+ })
api/gallery.go ADDED
@@ -0,0 +1,237 @@
+ package api
+
+ import (
+ 	"context"
+ 	"fmt"
+ 	"os"
+ 	"sync"
+ 	"time"
+
+ 	json "github.com/json-iterator/go"
+
+ 	"github.com/go-skynet/LocalAI/pkg/gallery"
+ 	"github.com/gofiber/fiber/v2"
+ 	"github.com/google/uuid"
+ 	"github.com/rs/zerolog/log"
+ )
+
+ type galleryOp struct {
+ 	req         gallery.GalleryModel
+ 	id          string
+ 	galleries   []gallery.Gallery
+ 	galleryName string
+ }
+
+ type galleryOpStatus struct {
+ 	Error              error   `json:"error"`
+ 	Processed          bool    `json:"processed"`
+ 	Message            string  `json:"message"`
+ 	Progress           float64 `json:"progress"`
+ 	TotalFileSize      string  `json:"file_size"`
+ 	DownloadedFileSize string  `json:"downloaded_size"`
+ }
+
+ type galleryApplier struct {
+ 	modelPath string
+ 	sync.Mutex
+ 	C        chan galleryOp
+ 	statuses map[string]*galleryOpStatus
+ }
+
+ func newGalleryApplier(modelPath string) *galleryApplier {
+ 	return &galleryApplier{
+ 		modelPath: modelPath,
+ 		C:         make(chan galleryOp),
+ 		statuses:  make(map[string]*galleryOpStatus),
+ 	}
+ }
+
+ // prepareModel fetches a gallery config from req.URL and installs the model
+ // files (plus any additional files) into modelPath.
+ func prepareModel(modelPath string, req gallery.GalleryModel, cm *ConfigMerger, downloadStatus func(string, string, string, float64)) error {
+
+ 	config, err := gallery.GetGalleryConfigFromURL(req.URL)
+ 	if err != nil {
+ 		return err
+ 	}
+
+ 	config.Files = append(config.Files, req.AdditionalFiles...)
+
+ 	return gallery.InstallModel(modelPath, req.Name, &config, req.Overrides, downloadStatus)
+ }
+
+ func (g *galleryApplier) updateStatus(s string, op *galleryOpStatus) {
+ 	g.Lock()
+ 	defer g.Unlock()
+ 	g.statuses[s] = op
+ }
+
+ func (g *galleryApplier) getStatus(s string) *galleryOpStatus {
+ 	g.Lock()
+ 	defer g.Unlock()
+
+ 	return g.statuses[s]
+ }
+
+ func (g *galleryApplier) start(c context.Context, cm *ConfigMerger) {
+ 	go func() {
+ 		for {
+ 			select {
+ 			case <-c.Done():
+ 				return
+ 			case op := <-g.C:
+ 				g.updateStatus(op.id, &galleryOpStatus{Message: "processing", Progress: 0})
+
+ 				// updates the status with an error
+ 				updateError := func(e error) {
+ 					g.updateStatus(op.id, &galleryOpStatus{Error: e, Processed: true, Message: "error: " + e.Error()})
+ 				}
+
+ 				// progressCallback records the download progress in the
+ 				// status map and logs it via displayDownload
+ 				progressCallback := func(fileName string, current string, total string, percentage float64) {
+ 					g.updateStatus(op.id, &galleryOpStatus{Message: "processing", Progress: percentage, TotalFileSize: total, DownloadedFileSize: current})
+ 					displayDownload(fileName, current, total, percentage)
+ 				}
+
+ 				var err error
+ 				// if the request contains a gallery name, we apply the gallery from the gallery list
+ 				if op.galleryName != "" {
+ 					err = gallery.InstallModelFromGallery(op.galleries, op.galleryName, g.modelPath, op.req, progressCallback)
+ 				} else {
+ 					err = prepareModel(g.modelPath, op.req, cm, progressCallback)
+ 				}
+
+ 				if err != nil {
+ 					updateError(err)
+ 					continue
+ 				}
+
+ 				// Reload models
+ 				err = cm.LoadConfigs(g.modelPath)
+ 				if err != nil {
+ 					updateError(err)
+ 					continue
+ 				}
+
+ 				g.updateStatus(op.id, &galleryOpStatus{Processed: true, Message: "completed", Progress: 100})
+ 			}
+ 		}
+ 	}()
+ }
+
+ var lastProgress time.Time = time.Now()
+ var startTime time.Time = time.Now()
+
+ func displayDownload(fileName string, current string, total string, percentage float64) {
+ 	currentTime := time.Now()
+
+ 	if currentTime.Sub(lastProgress) >= 5*time.Second {
+
+ 		lastProgress = currentTime
+
+ 		// calculate ETA based on percentage and elapsed time
+ 		var eta time.Duration
+ 		if percentage > 0 {
+ 			elapsed := currentTime.Sub(startTime)
+ 			eta = time.Duration(float64(elapsed)*(100/percentage) - float64(elapsed))
+ 		}
+
+ 		if total != "" {
+ 			log.Debug().Msgf("Downloading %s: %s/%s (%.2f%%) ETA: %s", fileName, current, total, percentage, eta)
+ 		} else {
+ 			log.Debug().Msgf("Downloading: %s", current)
+ 		}
+ 	}
+ }
+
+ type galleryModel struct {
+ 	gallery.GalleryModel
+ 	ID string `json:"id"`
+ }
+
+ func ApplyGalleryFromFile(modelPath, s string, cm *ConfigMerger, galleries []gallery.Gallery) error {
+ 	dat, err := os.ReadFile(s)
+ 	if err != nil {
+ 		return err
+ 	}
+ 	return ApplyGalleryFromString(modelPath, string(dat), cm, galleries)
+ }
+
+ func ApplyGalleryFromString(modelPath, s string, cm *ConfigMerger, galleries []gallery.Gallery) error {
+ 	var requests []galleryModel
+ 	err := json.Unmarshal([]byte(s), &requests)
+ 	if err != nil {
+ 		return err
+ 	}
+
+ 	for _, r := range requests {
+ 		if r.ID == "" {
+ 			err = prepareModel(modelPath, r.GalleryModel, cm, displayDownload)
+ 		} else {
+ 			err = gallery.InstallModelFromGallery(galleries, r.ID, modelPath, r.GalleryModel, displayDownload)
+ 		}
+ 	}
+
+ 	return err
+ }
+
+ func getOpStatus(g *galleryApplier) func(c *fiber.Ctx) error {
+ 	return func(c *fiber.Ctx) error {
+
+ 		status := g.getStatus(c.Params("uuid"))
+ 		if status == nil {
+ 			return fmt.Errorf("could not find any status for ID")
+ 		}
+
+ 		return c.JSON(status)
+ 	}
+ }
+
+ type GalleryModel struct {
+ 	ID string `json:"id"`
+ 	gallery.GalleryModel
+ }
+
+ func applyModelGallery(modelPath string, cm *ConfigMerger, g chan galleryOp, galleries []gallery.Gallery) func(c *fiber.Ctx) error {
+ 	return func(c *fiber.Ctx) error {
+ 		input := new(GalleryModel)
+ 		// Get input data from the request body
+ 		if err := c.BodyParser(input); err != nil {
+ 			return err
+ 		}
+
+ 		uuid, err := uuid.NewUUID()
+ 		if err != nil {
+ 			return err
+ 		}
+ 		g <- galleryOp{
+ 			req:         input.GalleryModel,
+ 			id:          uuid.String(),
+ 			galleryName: input.ID,
+ 			galleries:   galleries,
+ 		}
+ 		return c.JSON(struct {
+ 			ID        string `json:"uuid"`
+ 			StatusURL string `json:"status"`
+ 		}{ID: uuid.String(), StatusURL: c.BaseURL() + "/models/jobs/" + uuid.String()})
+ 	}
+ }
+
+ func listModelFromGallery(galleries []gallery.Gallery, basePath string) func(c *fiber.Ctx) error {
+ 	return func(c *fiber.Ctx) error {
+ 		log.Debug().Msgf("Listing models from galleries: %+v", galleries)
+
+ 		models, err := gallery.AvailableGalleryModels(galleries, basePath)
+ 		if err != nil {
+ 			return err
+ 		}
+ 		log.Debug().Msgf("Models found from galleries: %+v", models)
+ 		for _, m := range models {
+ 			log.Debug().Msgf("Model found from galleries: %+v", m)
+ 		}
+ 		dat, err := json.Marshal(models)
+ 		if err != nil {
+ 			return err
+ 		}
+ 		return c.Send(dat)
+ 	}
+ }
api/localai.go ADDED
@@ -0,0 +1,78 @@
+ package api
+
+ import (
+ 	"fmt"
+ 	"os"
+ 	"path/filepath"
+
+ 	model "github.com/go-skynet/LocalAI/pkg/model"
+ 	"github.com/go-skynet/LocalAI/pkg/tts"
+ 	"github.com/go-skynet/LocalAI/pkg/utils"
+ 	llama "github.com/go-skynet/go-llama.cpp"
+ 	"github.com/gofiber/fiber/v2"
+ )
+
+ type TTSRequest struct {
+ 	Model string `json:"model" yaml:"model"`
+ 	Input string `json:"input" yaml:"input"`
+ }
+
+ func generateUniqueFileName(dir, baseName, ext string) string {
+ 	counter := 1
+ 	fileName := baseName + ext
+
+ 	for {
+ 		filePath := filepath.Join(dir, fileName)
+ 		_, err := os.Stat(filePath)
+ 		if os.IsNotExist(err) {
+ 			return fileName
+ 		}
+
+ 		counter++
+ 		fileName = fmt.Sprintf("%s_%d%s", baseName, counter, ext)
+ 	}
+ }
+
+ func ttsEndpoint(cm *ConfigMerger, o *Option) func(c *fiber.Ctx) error {
+ 	return func(c *fiber.Ctx) error {
+
+ 		input := new(TTSRequest)
+ 		// Get input data from the request body
+ 		if err := c.BodyParser(input); err != nil {
+ 			return err
+ 		}
+
+ 		piperModel, err := o.loader.BackendLoader(model.PiperBackend, input.Model, []llama.ModelOption{}, uint32(0), o.assetsDestination)
+ 		if err != nil {
+ 			return err
+ 		}
+
+ 		if piperModel == nil {
+ 			return fmt.Errorf("could not load piper model")
+ 		}
+
+ 		w, ok := piperModel.(*tts.Piper)
+ 		if !ok {
+ 			return fmt.Errorf("loader returned non-piper object %+v", w)
+ 		}
+
+ 		if err := os.MkdirAll(o.audioDir, 0755); err != nil {
+ 			return err
+ 		}
+
+ 		fileName := generateUniqueFileName(o.audioDir, "piper", ".wav")
+ 		filePath := filepath.Join(o.audioDir, fileName)
+
+ 		modelPath := filepath.Join(o.loader.ModelPath, input.Model)
+
+ 		if err := utils.VerifyPath(modelPath, o.loader.ModelPath); err != nil {
+ 			return err
+ 		}
+
+ 		if err := w.TTS(input.Input, modelPath, filePath); err != nil {
+ 			return err
+ 		}
+
+ 		return c.Download(filePath)
+ 	}
+ }
api/openai.go ADDED
@@ -0,0 +1,772 @@
+ package api
+
+ import (
+ 	"bufio"
+ 	"bytes"
+ 	"encoding/base64"
+ 	"encoding/json"
+ 	"errors"
+ 	"fmt"
+ 	"io"
+ 	"io/ioutil"
+ 	"net/http"
+ 	"os"
+ 	"path"
+ 	"path/filepath"
+ 	"strconv"
+ 	"strings"
+
+ 	"github.com/ggerganov/whisper.cpp/bindings/go/pkg/whisper"
+ 	model "github.com/go-skynet/LocalAI/pkg/model"
+ 	whisperutil "github.com/go-skynet/LocalAI/pkg/whisper"
+ 	llama "github.com/go-skynet/go-llama.cpp"
+ 	"github.com/gofiber/fiber/v2"
+ 	"github.com/rs/zerolog/log"
+ 	"github.com/valyala/fasthttp"
+ )
+
+ // APIError provides error information returned by the OpenAI API.
+ type APIError struct {
+ 	Code    any     `json:"code,omitempty"`
+ 	Message string  `json:"message"`
+ 	Param   *string `json:"param,omitempty"`
+ 	Type    string  `json:"type"`
+ }
+
+ type ErrorResponse struct {
+ 	Error *APIError `json:"error,omitempty"`
+ }
+
+ type OpenAIUsage struct {
+ 	PromptTokens     int `json:"prompt_tokens"`
+ 	CompletionTokens int `json:"completion_tokens"`
+ 	TotalTokens      int `json:"total_tokens"`
+ }
+
+ type Item struct {
+ 	Embedding []float32 `json:"embedding"`
+ 	Index     int       `json:"index"`
+ 	Object    string    `json:"object,omitempty"`
+
+ 	// Images
+ 	URL     string `json:"url,omitempty"`
+ 	B64JSON string `json:"b64_json,omitempty"`
+ }
+
+ type OpenAIResponse struct {
+ 	Created int      `json:"created,omitempty"`
+ 	Object  string   `json:"object,omitempty"`
+ 	ID      string   `json:"id,omitempty"`
+ 	Model   string   `json:"model,omitempty"`
+ 	Choices []Choice `json:"choices,omitempty"`
+ 	Data    []Item   `json:"data,omitempty"`
+
+ 	Usage OpenAIUsage `json:"usage"`
+ }
+
+ type Choice struct {
+ 	Index        int      `json:"index,omitempty"`
+ 	FinishReason string   `json:"finish_reason,omitempty"`
+ 	Message      *Message `json:"message,omitempty"`
+ 	Delta        *Message `json:"delta,omitempty"`
+ 	Text         string   `json:"text,omitempty"`
+ }
+
+ type Message struct {
+ 	Role    string `json:"role,omitempty" yaml:"role"`
+ 	Content string `json:"content,omitempty" yaml:"content"`
+ }
+
+ type OpenAIModel struct {
+ 	ID     string `json:"id"`
+ 	Object string `json:"object"`
+ }
+
+ type OpenAIRequest struct {
+ 	Model string `json:"model" yaml:"model"`
+
+ 	// whisper
+ 	File     string `json:"file" validate:"required"`
+ 	Language string `json:"language"`
+ 	// whisper/image
+ 	ResponseFormat string `json:"response_format"`
+ 	// image
+ 	Size string `json:"size"`
+ 	// Prompt is read only by completion/image API calls
+ 	Prompt interface{} `json:"prompt" yaml:"prompt"`
+
+ 	// Edit endpoint
+ 	Instruction string      `json:"instruction" yaml:"instruction"`
+ 	Input       interface{} `json:"input" yaml:"input"`
+
+ 	Stop interface{} `json:"stop" yaml:"stop"`
+
+ 	// Messages is read only by chat/completion API calls
+ 	Messages []Message `json:"messages" yaml:"messages"`
+
+ 	Stream bool `json:"stream"`
+ 	Echo   bool `json:"echo"`
+ 	// Common options between all the API calls
+ 	TopP        float64 `json:"top_p" yaml:"top_p"`
+ 	TopK        int     `json:"top_k" yaml:"top_k"`
+ 	Temperature float64 `json:"temperature" yaml:"temperature"`
+ 	Maxtokens   int     `json:"max_tokens" yaml:"max_tokens"`
+
+ 	N int `json:"n"`
+
+ 	// Custom parameters - not present in the OpenAI API
+ 	Batch         int     `json:"batch" yaml:"batch"`
+ 	F16           bool    `json:"f16" yaml:"f16"`
+ 	IgnoreEOS     bool    `json:"ignore_eos" yaml:"ignore_eos"`
+ 	RepeatPenalty float64 `json:"repeat_penalty" yaml:"repeat_penalty"`
+ 	Keep          int     `json:"n_keep" yaml:"n_keep"`
+
+ 	MirostatETA float64 `json:"mirostat_eta" yaml:"mirostat_eta"`
+ 	MirostatTAU float64 `json:"mirostat_tau" yaml:"mirostat_tau"`
+ 	Mirostat    int     `json:"mirostat" yaml:"mirostat"`
+
+ 	FrequencyPenalty float64 `json:"frequency_penalty" yaml:"frequency_penalty"`
+ 	TFZ              float64 `json:"tfz" yaml:"tfz"`
+
+ 	Seed int `json:"seed" yaml:"seed"`
+
+ 	// Image (not supported by OpenAI)
+ 	Mode int `json:"mode"`
+ 	Step int `json:"step"`
+
+ 	TypicalP float64 `json:"typical_p" yaml:"typical_p"`
+ }
+
+ func defaultRequest(modelFile string) OpenAIRequest {
+ 	return OpenAIRequest{
+ 		TopP:        0.7,
+ 		TopK:        80,
+ 		Maxtokens:   512,
+ 		Temperature: 0.9,
+ 		Model:       modelFile,
+ 	}
+ }
+
+ // https://platform.openai.com/docs/api-reference/completions
+ func completionEndpoint(cm *ConfigMerger, o *Option) func(c *fiber.Ctx) error {
+ 	process := func(s string, req *OpenAIRequest, config *Config, loader *model.ModelLoader, responses chan OpenAIResponse) {
+ 		ComputeChoices(s, req, config, o, loader, func(s string, c *[]Choice) {}, func(s string) bool {
+ 			resp := OpenAIResponse{
+ 				Model: req.Model, // we have to return what the user sent here, due to OpenAI spec.
+ 				Choices: []Choice{
+ 					{
+ 						Index: 0,
+ 						Text:  s,
+ 					},
+ 				},
+ 				Object: "text_completion",
+ 			}
+ 			log.Debug().Msgf("Sending goroutine: %s", s)
+
+ 			responses <- resp
+ 			return true
+ 		})
+ 		close(responses)
+ 	}
+
+ 	return func(c *fiber.Ctx) error {
+ 		model, input, err := readInput(c, o.loader, true)
+ 		if err != nil {
+ 			return fmt.Errorf("failed reading parameters from request: %w", err)
+ 		}
+
+ 		log.Debug().Msgf("`input`: %+v", input)
+
+ 		config, input, err := readConfig(model, input, cm, o.loader, o.debug, o.threads, o.ctxSize, o.f16)
+ 		if err != nil {
+ 			return fmt.Errorf("failed reading parameters from request: %w", err)
+ 		}
+
+ 		log.Debug().Msgf("Parameter Config: %+v", config)
+
+ 		if input.Stream {
+ 			log.Debug().Msgf("Stream request received")
+ 			c.Context().SetContentType("text/event-stream")
+ 			//c.Response().Header.SetContentType(fiber.MIMETextHTMLCharsetUTF8)
+ 			//c.Set("Content-Type", "text/event-stream")
+ 			c.Set("Cache-Control", "no-cache")
+ 			c.Set("Connection", "keep-alive")
+ 			c.Set("Transfer-Encoding", "chunked")
+ 		}
+
+ 		templateFile := config.Model
+
+ 		if config.TemplateConfig.Completion != "" {
+ 			templateFile = config.TemplateConfig.Completion
+ 		}
+
+ 		if input.Stream {
+ 			if len(config.PromptStrings) > 1 {
+ 				return errors.New("cannot handle more than 1 `PromptStrings` when `Stream`ing")
+ 			}
+
+ 			predInput := config.PromptStrings[0]
+
+ 			// A model can have a "file.bin.tmpl" file associated with a prompt template prefix
+ 			templatedInput, err := o.loader.TemplatePrefix(templateFile, struct {
+ 				Input string
+ 			}{Input: predInput})
+ 			if err == nil {
+ 				predInput = templatedInput
+ 				log.Debug().Msgf("Template found, input modified to: %s", predInput)
+ 			}
+
+ 			responses := make(chan OpenAIResponse)
+
+ 			go process(predInput, input, config, o.loader, responses)
+
+ 			c.Context().SetBodyStreamWriter(fasthttp.StreamWriter(func(w *bufio.Writer) {
+
+ 				for ev := range responses {
+ 					var buf bytes.Buffer
+ 					enc := json.NewEncoder(&buf)
+ 					enc.Encode(ev)
+
+ 					log.Debug().Msgf("Sending chunk: %s", buf.String())
+ 					fmt.Fprintf(w, "data: %v\n", buf.String())
+ 					w.Flush()
+ 				}
+
+ 				resp := &OpenAIResponse{
+ 					Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
+ 					Choices: []Choice{
+ 						{
+ 							Index:        0,
+ 							FinishReason: "stop",
+ 						},
+ 					},
+ 					Object: "text_completion",
+ 				}
+ 				respData, _ := json.Marshal(resp)
+
+ 				w.WriteString(fmt.Sprintf("data: %s\n\n", respData))
+ 				w.WriteString("data: [DONE]\n\n")
+ 				w.Flush()
+ 			}))
+ 			return nil
+ 		}
+
+ 		var result []Choice
255
+ for _, i := range config.PromptStrings {
256
+ // A model can have a "file.bin.tmpl" file associated with a prompt template prefix
257
+ templatedInput, err := o.loader.TemplatePrefix(templateFile, struct {
258
+ Input string
259
+ }{Input: i})
260
+ if err == nil {
261
+ i = templatedInput
262
+ log.Debug().Msgf("Template found, input modified to: %s", i)
263
+ }
264
+
265
+ r, err := ComputeChoices(i, input, config, o, o.loader, func(s string, c *[]Choice) {
266
+ *c = append(*c, Choice{Text: s})
267
+ }, nil)
268
+ if err != nil {
269
+ return err
270
+ }
271
+
272
+ result = append(result, r...)
273
+ }
274
+
275
+ resp := &OpenAIResponse{
276
+ Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
277
+ Choices: result,
278
+ Object: "text_completion",
279
+ }
280
+
281
+ jsonResult, _ := json.Marshal(resp)
282
+ log.Debug().Msgf("Response: %s", jsonResult)
283
+
284
+ // Return the prediction in the response body
285
+ return c.JSON(resp)
286
+ }
287
+ }
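The streaming branch above frames each JSON chunk as a server-sent-events `data:` line before flushing it to the client. A minimal, self-contained sketch of that framing (the `chunk` struct and `sseFrame` helper are illustrative names, not part of the codebase; the real handler streams the richer `OpenAIResponse` struct and uses a single `\n` between intermediate chunks):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// chunk mirrors the minimal shape of a streamed completion chunk.
type chunk struct {
	Object string `json:"object"`
	Text   string `json:"text"`
}

// sseFrame renders one server-sent-events frame: a "data: " prefix,
// the JSON payload, and a blank line terminating the event.
func sseFrame(c chunk) string {
	b, _ := json.Marshal(c)
	return fmt.Sprintf("data: %s\n\n", b)
}

func main() {
	fmt.Print(sseFrame(chunk{Object: "text_completion", Text: "hello"}))
	// A stream is conventionally terminated with a sentinel frame.
	fmt.Print("data: [DONE]\n\n")
}
```

Clients reading the stream split on blank lines and stop at the `[DONE]` sentinel, which is the contract OpenAI-compatible SDKs expect.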
288
+
+ // https://platform.openai.com/docs/api-reference/embeddings
+ func embeddingsEndpoint(cm *ConfigMerger, o *Option) func(c *fiber.Ctx) error {
+	return func(c *fiber.Ctx) error {
+		model, input, err := readInput(c, o.loader, true)
+		if err != nil {
+			return fmt.Errorf("failed reading parameters from request:%w", err)
+		}
+
+		config, input, err := readConfig(model, input, cm, o.loader, o.debug, o.threads, o.ctxSize, o.f16)
+		if err != nil {
+			return fmt.Errorf("failed reading parameters from request:%w", err)
+		}
+
+		log.Debug().Msgf("Parameter Config: %+v", config)
+		items := []Item{}
+
+		for i, s := range config.InputToken {
+			// get the model function to call for the result
+			embedFn, err := ModelEmbedding("", s, o.loader, *config, o)
+			if err != nil {
+				return err
+			}
+
+			embeddings, err := embedFn()
+			if err != nil {
+				return err
+			}
+			items = append(items, Item{Embedding: embeddings, Index: i, Object: "embedding"})
+		}
+
+		for i, s := range config.InputStrings {
+			// get the model function to call for the result
+			embedFn, err := ModelEmbedding(s, []int{}, o.loader, *config, o)
+			if err != nil {
+				return err
+			}
+
+			embeddings, err := embedFn()
+			if err != nil {
+				return err
+			}
+			items = append(items, Item{Embedding: embeddings, Index: i, Object: "embedding"})
+		}
+
+		resp := &OpenAIResponse{
+			Model:  input.Model, // we have to return what the user sent here, due to OpenAI spec.
+			Data:   items,
+			Object: "list",
+		}
+
+		jsonResult, _ := json.Marshal(resp)
+		log.Debug().Msgf("Response: %s", jsonResult)
+
+		// Return the prediction in the response body
+		return c.JSON(resp)
+	}
+ }
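Before the vectors reach this endpoint, `ModelEmbedding` (in api/prediction.go further down in this diff) strips trailing `0.0` entries that some backends pad onto the embedding. A standalone sketch of that cleanup (`trimTrailingZeros` is an illustrative name; the real code inlines the loop):

```go
package main

import "fmt"

// trimTrailingZeros drops 0.0 entries from the end of an embedding
// vector, stopping at the first non-zero value. Interior zeros are
// preserved; an all-zero vector collapses to empty.
func trimTrailingZeros(embeds []float32) []float32 {
	for i := len(embeds) - 1; i >= 0; i-- {
		if embeds[i] != 0.0 {
			return embeds[:i+1]
		}
	}
	return embeds[:0]
}

func main() {
	fmt.Println(trimTrailingZeros([]float32{0.1, 0.2, 0, 0, 0})) // [0.1 0.2]
}
```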
346
+
+ func chatEndpoint(cm *ConfigMerger, o *Option) func(c *fiber.Ctx) error {
+
+	process := func(s string, req *OpenAIRequest, config *Config, loader *model.ModelLoader, responses chan OpenAIResponse) {
+		initialMessage := OpenAIResponse{
+			Model:   req.Model, // we have to return what the user sent here, due to OpenAI spec.
+			Choices: []Choice{{Delta: &Message{Role: "assistant"}}},
+			Object:  "chat.completion.chunk",
+		}
+		responses <- initialMessage
+
+		ComputeChoices(s, req, config, o, loader, func(s string, c *[]Choice) {}, func(s string) bool {
+			resp := OpenAIResponse{
+				Model:   req.Model, // we have to return what the user sent here, due to OpenAI spec.
+				Choices: []Choice{{Delta: &Message{Content: s}, Index: 0}},
+				Object:  "chat.completion.chunk",
+			}
+			log.Debug().Msgf("Sending goroutine: %s", s)
+
+			responses <- resp
+			return true
+		})
+		close(responses)
+	}
+	return func(c *fiber.Ctx) error {
+		model, input, err := readInput(c, o.loader, true)
+		if err != nil {
+			return fmt.Errorf("failed reading parameters from request:%w", err)
+		}
+
+		config, input, err := readConfig(model, input, cm, o.loader, o.debug, o.threads, o.ctxSize, o.f16)
+		if err != nil {
+			return fmt.Errorf("failed reading parameters from request:%w", err)
+		}
+
+		log.Debug().Msgf("Parameter Config: %+v", config)
+
+		var predInput string
+
+		mess := []string{}
+		for _, i := range input.Messages {
+			var content string
+			r := config.Roles[i.Role]
+			if r != "" {
+				content = fmt.Sprint(r, " ", i.Content)
+			} else {
+				content = i.Content
+			}
+
+			mess = append(mess, content)
+		}
+
+		predInput = strings.Join(mess, "\n")
+
+		if input.Stream {
+			log.Debug().Msgf("Stream request received")
+			c.Context().SetContentType("text/event-stream")
+			//c.Response().Header.SetContentType(fiber.MIMETextHTMLCharsetUTF8)
+			// c.Set("Content-Type", "text/event-stream")
+			c.Set("Cache-Control", "no-cache")
+			c.Set("Connection", "keep-alive")
+			c.Set("Transfer-Encoding", "chunked")
+		}
+
+		templateFile := config.Model
+
+		if config.TemplateConfig.Chat != "" {
+			templateFile = config.TemplateConfig.Chat
+		}
+
+		// A model can have a "file.bin.tmpl" file associated with a prompt template prefix
+		templatedInput, err := o.loader.TemplatePrefix(templateFile, struct {
+			Input string
+		}{Input: predInput})
+		if err == nil {
+			predInput = templatedInput
+			log.Debug().Msgf("Template found, input modified to: %s", predInput)
+		}
+
+		if input.Stream {
+			responses := make(chan OpenAIResponse)
+
+			go process(predInput, input, config, o.loader, responses)
+
+			c.Context().SetBodyStreamWriter(fasthttp.StreamWriter(func(w *bufio.Writer) {
+
+				for ev := range responses {
+					var buf bytes.Buffer
+					enc := json.NewEncoder(&buf)
+					enc.Encode(ev)
+
+					log.Debug().Msgf("Sending chunk: %s", buf.String())
+					fmt.Fprintf(w, "data: %v\n", buf.String())
+					w.Flush()
+				}
+
+				resp := &OpenAIResponse{
+					Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
+					Choices: []Choice{
+						{
+							FinishReason: "stop",
+							Index:        0,
+							Delta:        &Message{},
+						}},
+					Object: "chat.completion.chunk",
+				}
+				respData, _ := json.Marshal(resp)
+
+				w.WriteString(fmt.Sprintf("data: %s\n\n", respData))
+				w.WriteString("data: [DONE]\n\n")
+				w.Flush()
+			}))
+			return nil
+		}
+
+		result, err := ComputeChoices(predInput, input, config, o, o.loader, func(s string, c *[]Choice) {
+			*c = append(*c, Choice{Message: &Message{Role: "assistant", Content: s}})
+		}, nil)
+		if err != nil {
+			return err
+		}
+
+		resp := &OpenAIResponse{
+			Model:   input.Model, // we have to return what the user sent here, due to OpenAI spec.
+			Choices: result,
+			Object:  "chat.completion",
+		}
+		respData, _ := json.Marshal(resp)
+		log.Debug().Msgf("Response: %s", respData)
+
+		// Return the prediction in the response body
+		return c.JSON(resp)
+	}
+ }
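`chatEndpoint` flattens the chat history into a single prompt, prefixing each message with the role alias configured in the model's YAML (`roles:` map) when one exists. A self-contained sketch of that flattening (the `message` struct and `flattenMessages` name are illustrative; the real code works on `input.Messages` and `config.Roles` inline):

```go
package main

import (
	"fmt"
	"strings"
)

// message mirrors the role/content pair of a chat request message.
type message struct{ Role, Content string }

// flattenMessages joins chat messages into one newline-separated prompt,
// prefixing each with the configured role alias when one is set.
func flattenMessages(msgs []message, roles map[string]string) string {
	mess := []string{}
	for _, m := range msgs {
		content := m.Content
		if r := roles[m.Role]; r != "" {
			content = fmt.Sprint(r, " ", m.Content)
		}
		mess = append(mess, content)
	}
	return strings.Join(mess, "\n")
}

func main() {
	out := flattenMessages(
		[]message{{"user", "Hi"}, {"assistant", "Hello!"}},
		map[string]string{"user": "### User:", "assistant": "### Assistant:"},
	)
	fmt.Println(out)
}
```

The flattened string is then run through the model's `.tmpl` prompt template before inference, so the role aliases end up embedded in the final prompt.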
480
+
+ func editEndpoint(cm *ConfigMerger, o *Option) func(c *fiber.Ctx) error {
+	return func(c *fiber.Ctx) error {
+		model, input, err := readInput(c, o.loader, true)
+		if err != nil {
+			return fmt.Errorf("failed reading parameters from request:%w", err)
+		}
+
+		config, input, err := readConfig(model, input, cm, o.loader, o.debug, o.threads, o.ctxSize, o.f16)
+		if err != nil {
+			return fmt.Errorf("failed reading parameters from request:%w", err)
+		}
+
+		log.Debug().Msgf("Parameter Config: %+v", config)
+
+		templateFile := config.Model
+
+		if config.TemplateConfig.Edit != "" {
+			templateFile = config.TemplateConfig.Edit
+		}
+
+		var result []Choice
+		for _, i := range config.InputStrings {
+			// A model can have a "file.bin.tmpl" file associated with a prompt template prefix
+			templatedInput, err := o.loader.TemplatePrefix(templateFile, struct {
+				Input       string
+				Instruction string
+			}{Input: i})
+			if err == nil {
+				i = templatedInput
+				log.Debug().Msgf("Template found, input modified to: %s", i)
+			}
+
+			r, err := ComputeChoices(i, input, config, o, o.loader, func(s string, c *[]Choice) {
+				*c = append(*c, Choice{Text: s})
+			}, nil)
+			if err != nil {
+				return err
+			}
+
+			result = append(result, r...)
+		}
+
+		resp := &OpenAIResponse{
+			Model:   input.Model, // we have to return what the user sent here, due to OpenAI spec.
+			Choices: result,
+			Object:  "edit",
+		}
+
+		jsonResult, _ := json.Marshal(resp)
+		log.Debug().Msgf("Response: %s", jsonResult)
+
+		// Return the prediction in the response body
+		return c.JSON(resp)
+	}
+ }
536
+
+ // https://platform.openai.com/docs/api-reference/images/create
+
+ /*
+ *
+
+	curl http://localhost:8080/v1/images/generations \
+	  -H "Content-Type: application/json" \
+	  -d '{
+	    "prompt": "A cute baby sea otter",
+	    "n": 1,
+	    "size": "512x512"
+	  }'
+
+ *
+ */
+ func imageEndpoint(cm *ConfigMerger, o *Option) func(c *fiber.Ctx) error {
+	return func(c *fiber.Ctx) error {
+		m, input, err := readInput(c, o.loader, false)
+		if err != nil {
+			return fmt.Errorf("failed reading parameters from request:%w", err)
+		}
+
+		if m == "" {
+			m = model.StableDiffusionBackend
+		}
+		log.Debug().Msgf("Loading model: %+v", m)
+
+		config, input, err := readConfig(m, input, cm, o.loader, o.debug, 0, 0, false)
+		if err != nil {
+			return fmt.Errorf("failed reading parameters from request:%w", err)
+		}
+
+		log.Debug().Msgf("Parameter Config: %+v", config)
+
+		// XXX: Only stablediffusion is supported for now
+		if config.Backend == "" {
+			config.Backend = model.StableDiffusionBackend
+		}
+
+		sizeParts := strings.Split(input.Size, "x")
+		if len(sizeParts) != 2 {
+			return fmt.Errorf("Invalid value for 'size'")
+		}
+		width, err := strconv.Atoi(sizeParts[0])
+		if err != nil {
+			return fmt.Errorf("Invalid value for 'size'")
+		}
+		height, err := strconv.Atoi(sizeParts[1])
+		if err != nil {
+			return fmt.Errorf("Invalid value for 'size'")
+		}
+
+		b64JSON := false
+		if input.ResponseFormat == "b64_json" {
+			b64JSON = true
+		}
+
+		var result []Item
+		for _, i := range config.PromptStrings {
+			n := input.N
+			if input.N == 0 {
+				n = 1
+			}
+			for j := 0; j < n; j++ {
+				prompts := strings.Split(i, "|")
+				positive_prompt := prompts[0]
+				negative_prompt := ""
+				if len(prompts) > 1 {
+					negative_prompt = prompts[1]
+				}
+
+				mode := 0
+				step := 15
+
+				if input.Mode != 0 {
+					mode = input.Mode
+				}
+
+				if input.Step != 0 {
+					step = input.Step
+				}
+
+				tempDir := ""
+				if !b64JSON {
+					tempDir = o.imageDir
+				}
+				// Create a temporary file
+				outputFile, err := ioutil.TempFile(tempDir, "b64")
+				if err != nil {
+					return err
+				}
+				outputFile.Close()
+				output := outputFile.Name() + ".png"
+				// Rename the temporary file
+				err = os.Rename(outputFile.Name(), output)
+				if err != nil {
+					return err
+				}
+
+				baseURL := c.BaseURL()
+
+				fn, err := ImageGeneration(height, width, mode, step, input.Seed, positive_prompt, negative_prompt, output, o.loader, *config, o)
+				if err != nil {
+					return err
+				}
+				if err := fn(); err != nil {
+					return err
+				}
+
+				item := &Item{}
+
+				if b64JSON {
+					defer os.RemoveAll(output)
+					data, err := os.ReadFile(output)
+					if err != nil {
+						return err
+					}
+					item.B64JSON = base64.StdEncoding.EncodeToString(data)
+				} else {
+					base := filepath.Base(output)
+					item.URL = baseURL + "/generated-images/" + base
+				}
+
+				result = append(result, *item)
+			}
+		}
+
+		resp := &OpenAIResponse{
+			Data: result,
+		}
+
+		jsonResult, _ := json.Marshal(resp)
+		log.Debug().Msgf("Response: %s", jsonResult)
+
+		// Return the prediction in the response body
+		return c.JSON(resp)
+	}
+ }
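The image endpoint validates the OpenAI-style `"WIDTHxHEIGHT"` size string before dispatching to the backend. A standalone sketch of that parsing step (`parseSize` is an illustrative helper name; the real handler inlines the `strings.Split`/`strconv.Atoi` calls and assigns width from the first part, height from the second):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSize splits an OpenAI-style "WIDTHxHEIGHT" size string into its
// two integer dimensions, rejecting anything that is not exactly
// NUMBER x NUMBER.
func parseSize(size string) (width, height int, err error) {
	parts := strings.Split(size, "x")
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("invalid value for 'size'")
	}
	if width, err = strconv.Atoi(parts[0]); err != nil {
		return 0, 0, fmt.Errorf("invalid value for 'size'")
	}
	if height, err = strconv.Atoi(parts[1]); err != nil {
		return 0, 0, fmt.Errorf("invalid value for 'size'")
	}
	return width, height, nil
}

func main() {
	w, h, err := parseSize("512x512")
	fmt.Println(w, h, err)
}
```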
675
+
+ // https://platform.openai.com/docs/api-reference/audio/create
+ func transcriptEndpoint(cm *ConfigMerger, o *Option) func(c *fiber.Ctx) error {
+	return func(c *fiber.Ctx) error {
+		m, input, err := readInput(c, o.loader, false)
+		if err != nil {
+			return fmt.Errorf("failed reading parameters from request:%w", err)
+		}
+
+		config, input, err := readConfig(m, input, cm, o.loader, o.debug, o.threads, o.ctxSize, o.f16)
+		if err != nil {
+			return fmt.Errorf("failed reading parameters from request:%w", err)
+		}
+		// retrieve the file data from the request
+		file, err := c.FormFile("file")
+		if err != nil {
+			return err
+		}
+		f, err := file.Open()
+		if err != nil {
+			return err
+		}
+		defer f.Close()
+
+		dir, err := os.MkdirTemp("", "whisper")
+
+		if err != nil {
+			return err
+		}
+		defer os.RemoveAll(dir)
+
+		dst := filepath.Join(dir, path.Base(file.Filename))
+		dstFile, err := os.Create(dst)
+		if err != nil {
+			return err
+		}
+
+		if _, err := io.Copy(dstFile, f); err != nil {
+			log.Debug().Msgf("Audio file copying error %+v - %+v - err %+v", file.Filename, dst, err)
+			return err
+		}
+
+		log.Debug().Msgf("Audio file copied to: %+v", dst)
+
+		whisperModel, err := o.loader.BackendLoader(model.WhisperBackend, config.Model, []llama.ModelOption{}, uint32(config.Threads), o.assetsDestination)
+		if err != nil {
+			return err
+		}
+
+		if whisperModel == nil {
+			return fmt.Errorf("could not load whisper model")
+		}
+
+		w, ok := whisperModel.(whisper.Model)
+		if !ok {
+			return fmt.Errorf("loader returned non-whisper object")
+		}
+
+		tr, err := whisperutil.Transcript(w, dst, input.Language, uint(config.Threads))
+		if err != nil {
+			return err
+		}
+
+		log.Debug().Msgf("Transcribed: %+v", tr)
+		// TODO: handle different outputs here
+		return c.Status(http.StatusOK).JSON(fiber.Map{"text": tr})
+	}
+ }
743
+
+ func listModels(loader *model.ModelLoader, cm *ConfigMerger) func(ctx *fiber.Ctx) error {
+	return func(c *fiber.Ctx) error {
+		models, err := loader.ListModels()
+		if err != nil {
+			return err
+		}
+		var mm map[string]interface{} = map[string]interface{}{}
+
+		dataModels := []OpenAIModel{}
+		for _, m := range models {
+			mm[m] = nil
+			dataModels = append(dataModels, OpenAIModel{ID: m, Object: "model"})
+		}
+
+		for _, k := range cm.ListConfigs() {
+			if _, exists := mm[k]; !exists {
+				dataModels = append(dataModels, OpenAIModel{ID: k, Object: "model"})
+			}
+		}
+
+		return c.JSON(struct {
+			Object string        `json:"object"`
+			Data   []OpenAIModel `json:"data"`
+		}{
+			Object: "list",
+			Data:   dataModels,
+		})
+	}
+ }
api/options.go ADDED
@@ -0,0 +1,153 @@
+ package api
+
+ import (
+	"context"
+	"embed"
+
+	"github.com/go-skynet/LocalAI/pkg/gallery"
+	model "github.com/go-skynet/LocalAI/pkg/model"
+ )
+
+ type Option struct {
+	context                         context.Context
+	configFile                      string
+	loader                          *model.ModelLoader
+	uploadLimitMB, threads, ctxSize int
+	f16                             bool
+	debug, disableMessage           bool
+	imageDir                        string
+	audioDir                        string
+	cors                            bool
+	preloadJSONModels               string
+	preloadModelsFromPath           string
+	corsAllowOrigins                string
+
+	galleries []gallery.Gallery
+
+	backendAssets     embed.FS
+	assetsDestination string
+ }
+
+ type AppOption func(*Option)
+
+ func newOptions(o ...AppOption) *Option {
+	opt := &Option{
+		context:        context.Background(),
+		uploadLimitMB:  15,
+		threads:        1,
+		ctxSize:        512,
+		debug:          true,
+		disableMessage: true,
+	}
+	for _, oo := range o {
+		oo(opt)
+	}
+	return opt
+ }
+
+ func WithCors(b bool) AppOption {
+	return func(o *Option) {
+		o.cors = b
+	}
+ }
+
+ func WithCorsAllowOrigins(b string) AppOption {
+	return func(o *Option) {
+		o.corsAllowOrigins = b
+	}
+ }
+
+ func WithBackendAssetsOutput(out string) AppOption {
+	return func(o *Option) {
+		o.assetsDestination = out
+	}
+ }
+
+ func WithBackendAssets(f embed.FS) AppOption {
+	return func(o *Option) {
+		o.backendAssets = f
+	}
+ }
+
+ func WithGalleries(galleries []gallery.Gallery) AppOption {
+	return func(o *Option) {
+		o.galleries = append(o.galleries, galleries...)
+	}
+ }
+
+ func WithContext(ctx context.Context) AppOption {
+	return func(o *Option) {
+		o.context = ctx
+	}
+ }
+
+ func WithYAMLConfigPreload(configFile string) AppOption {
+	return func(o *Option) {
+		o.preloadModelsFromPath = configFile
+	}
+ }
+
+ func WithJSONStringPreload(configFile string) AppOption {
+	return func(o *Option) {
+		o.preloadJSONModels = configFile
+	}
+ }
+
+ func WithConfigFile(configFile string) AppOption {
+	return func(o *Option) {
+		o.configFile = configFile
+	}
+ }
+
+ func WithModelLoader(loader *model.ModelLoader) AppOption {
+	return func(o *Option) {
+		o.loader = loader
+	}
+ }
+
+ func WithUploadLimitMB(limit int) AppOption {
+	return func(o *Option) {
+		o.uploadLimitMB = limit
+	}
+ }
+
+ func WithThreads(threads int) AppOption {
+	return func(o *Option) {
+		o.threads = threads
+	}
+ }
+
+ func WithContextSize(ctxSize int) AppOption {
+	return func(o *Option) {
+		o.ctxSize = ctxSize
+	}
+ }
+
+ func WithF16(f16 bool) AppOption {
+	return func(o *Option) {
+		o.f16 = f16
+	}
+ }
+
+ func WithDebug(debug bool) AppOption {
+	return func(o *Option) {
+		o.debug = debug
+	}
+ }
+
+ func WithDisableMessage(disableMessage bool) AppOption {
+	return func(o *Option) {
+		o.disableMessage = disableMessage
+	}
+ }
+
+ func WithAudioDir(audioDir string) AppOption {
+	return func(o *Option) {
+		o.audioDir = audioDir
+	}
+ }
+
+ func WithImageDir(imageDir string) AppOption {
+	return func(o *Option) {
+		o.imageDir = imageDir
+	}
+ }
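api/options.go is a standard functional-options setup: `newOptions` fills in defaults, then applies each `AppOption` closure on top. A stripped-down, runnable sketch of the same pattern (the `settings`/`withThreads`/`withF16` names are illustrative, not the real exported API; the defaults shown match `newOptions`):

```go
package main

import "fmt"

// settings is a cut-down stand-in for the Option struct.
type settings struct {
	threads, ctxSize int
	f16              bool
}

// setting mirrors the AppOption closure type.
type setting func(*settings)

func withThreads(n int) setting { return func(s *settings) { s.threads = n } }
func withF16(b bool) setting    { return func(s *settings) { s.f16 = b } }

// newSettings applies caller overrides on top of defaults,
// exactly as newOptions does.
func newSettings(opts ...setting) *settings {
	s := &settings{threads: 1, ctxSize: 512}
	for _, o := range opts {
		o(s)
	}
	return s
}

func main() {
	s := newSettings(withThreads(4), withF16(true))
	fmt.Println(s.threads, s.ctxSize, s.f16) // 4 512 true
}
```

The benefit of this pattern is that new knobs (galleries, asset destinations, CORS origins) can be added without breaking existing callers of the constructor.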
api/prediction.go ADDED
@@ -0,0 +1,647 @@
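api/prediction.go (below) serializes all backend calls through a lazily created per-backend/per-model mutex, guarded by a global registry lock, because the upstream bindings are not safe for concurrent inference (see the linked ggerganov/llama.cpp discussion #784). A self-contained sketch of that locking scheme (`lockFor` and `registry` are illustrative names; the real code uses the `mutexMap`/`mutexes` package variables):

```go
package main

import (
	"fmt"
	"sync"
)

var (
	registryMu sync.Mutex
	registry   = map[string]*sync.Mutex{}
)

// lockFor returns the mutex for a given backend/model key, creating it
// on first use under the global registry lock.
func lockFor(key string) *sync.Mutex {
	registryMu.Lock()
	defer registryMu.Unlock()
	l, ok := registry[key]
	if !ok {
		l = &sync.Mutex{}
		registry[key] = l
	}
	return l
}

func main() {
	var wg sync.WaitGroup
	counter := 0
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			l := lockFor("llama")
			l.Lock()
			counter++ // protected by the shared per-backend mutex
			l.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println(counter) // 100
}
```

Requests against different models still run in parallel, since each key gets its own mutex; only calls into the same backend instance are serialized.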
+ package api
+
+ import (
+	"fmt"
+	"os"
+	"path/filepath"
+	"regexp"
+	"strings"
+	"sync"
+
+	"github.com/donomii/go-rwkv.cpp"
+	"github.com/go-skynet/LocalAI/pkg/langchain"
+	model "github.com/go-skynet/LocalAI/pkg/model"
+	"github.com/go-skynet/LocalAI/pkg/stablediffusion"
+	"github.com/go-skynet/bloomz.cpp"
+	bert "github.com/go-skynet/go-bert.cpp"
+	transformers "github.com/go-skynet/go-ggml-transformers.cpp"
+	llama "github.com/go-skynet/go-llama.cpp"
+	gpt4all "github.com/nomic-ai/gpt4all/gpt4all-bindings/golang"
+ )
+
+ // mutex still needed, see: https://github.com/ggerganov/llama.cpp/discussions/784
+ var mutexMap sync.Mutex
+ var mutexes map[string]*sync.Mutex = make(map[string]*sync.Mutex)
+
+ func defaultLLamaOpts(c Config) []llama.ModelOption {
+	llamaOpts := []llama.ModelOption{}
+	if c.ContextSize != 0 {
+		llamaOpts = append(llamaOpts, llama.SetContext(c.ContextSize))
+	}
+	if c.F16 {
+		llamaOpts = append(llamaOpts, llama.EnableF16Memory)
+	}
+	if c.Embeddings {
+		llamaOpts = append(llamaOpts, llama.EnableEmbeddings)
+	}
+
+	if c.NGPULayers != 0 {
+		llamaOpts = append(llamaOpts, llama.SetGPULayers(c.NGPULayers))
+	}
+
+	llamaOpts = append(llamaOpts, llama.SetMMap(c.MMap))
+	llamaOpts = append(llamaOpts, llama.SetMainGPU(c.MainGPU))
+	llamaOpts = append(llamaOpts, llama.SetTensorSplit(c.TensorSplit))
+	if c.Batch != 0 {
+		llamaOpts = append(llamaOpts, llama.SetNBatch(c.Batch))
+	} else {
+		llamaOpts = append(llamaOpts, llama.SetNBatch(512))
+	}
+
+	if c.NUMA {
+		llamaOpts = append(llamaOpts, llama.EnableNUMA)
+	}
+
+	if c.LowVRAM {
+		llamaOpts = append(llamaOpts, llama.EnabelLowVRAM)
+	}
+
+	return llamaOpts
+ }
+
+ func ImageGeneration(height, width, mode, step, seed int, positive_prompt, negative_prompt, dst string, loader *model.ModelLoader, c Config, o *Option) (func() error, error) {
+	if c.Backend != model.StableDiffusionBackend {
+		return nil, fmt.Errorf("endpoint only working with stablediffusion models")
+	}
+	inferenceModel, err := loader.BackendLoader(c.Backend, c.ImageGenerationAssets, []llama.ModelOption{}, uint32(c.Threads), o.assetsDestination)
+	if err != nil {
+		return nil, err
+	}
+
+	var fn func() error
+	switch model := inferenceModel.(type) {
+	case *stablediffusion.StableDiffusion:
+		fn = func() error {
+			return model.GenerateImage(height, width, mode, step, seed, positive_prompt, negative_prompt, dst)
+		}
+
+	default:
+		fn = func() error {
+			return fmt.Errorf("creation of images not supported by the backend")
+		}
+	}
+
+	return func() error {
+		// This is still needed, see: https://github.com/ggerganov/llama.cpp/discussions/784
+		mutexMap.Lock()
+		l, ok := mutexes[c.Backend]
+		if !ok {
+			m := &sync.Mutex{}
+			mutexes[c.Backend] = m
+			l = m
+		}
+		mutexMap.Unlock()
+		l.Lock()
+		defer l.Unlock()
+
+		return fn()
+	}, nil
+ }
+
+ func ModelEmbedding(s string, tokens []int, loader *model.ModelLoader, c Config, o *Option) (func() ([]float32, error), error) {
+	if !c.Embeddings {
+		return nil, fmt.Errorf("endpoint disabled for this model by API configuration")
+	}
+
+	modelFile := c.Model
+
+	llamaOpts := defaultLLamaOpts(c)
+
+	var inferenceModel interface{}
+	var err error
+	if c.Backend == "" {
+		inferenceModel, err = loader.GreedyLoader(modelFile, llamaOpts, uint32(c.Threads), o.assetsDestination)
+	} else {
+		inferenceModel, err = loader.BackendLoader(c.Backend, modelFile, llamaOpts, uint32(c.Threads), o.assetsDestination)
+	}
+	if err != nil {
+		return nil, err
+	}
+
+	var fn func() ([]float32, error)
+	switch model := inferenceModel.(type) {
+	case *llama.LLama:
+		fn = func() ([]float32, error) {
+			predictOptions := buildLLamaPredictOptions(c, loader.ModelPath)
+			if len(tokens) > 0 {
+				return model.TokenEmbeddings(tokens, predictOptions...)
+			}
+			return model.Embeddings(s, predictOptions...)
+		}
+	// bert embeddings
+	case *bert.Bert:
+		fn = func() ([]float32, error) {
+			if len(tokens) > 0 {
+				return model.TokenEmbeddings(tokens, bert.SetThreads(c.Threads))
+			}
+			return model.Embeddings(s, bert.SetThreads(c.Threads))
+		}
+	default:
+		fn = func() ([]float32, error) {
+			return nil, fmt.Errorf("embeddings not supported by the backend")
+		}
+	}
+
+	return func() ([]float32, error) {
+		// This is still needed, see: https://github.com/ggerganov/llama.cpp/discussions/784
+		mutexMap.Lock()
+		l, ok := mutexes[modelFile]
+		if !ok {
+			m := &sync.Mutex{}
+			mutexes[modelFile] = m
+			l = m
+		}
+		mutexMap.Unlock()
+		l.Lock()
+		defer l.Unlock()
+
+		embeds, err := fn()
+		if err != nil {
+			return embeds, err
+		}
+		// Remove trailing 0s
+		for i := len(embeds) - 1; i >= 0; i-- {
+			if embeds[i] == 0.0 {
+				embeds = embeds[:i]
+			} else {
+				break
+			}
+		}
+		return embeds, nil
+	}, nil
+ }
+
+ func buildLLamaPredictOptions(c Config, modelPath string) []llama.PredictOption {
+	// Generate the prediction using the language model
+	predictOptions := []llama.PredictOption{
+		llama.SetTemperature(c.Temperature),
+		llama.SetTopP(c.TopP),
+		llama.SetTopK(c.TopK),
+		llama.SetTokens(c.Maxtokens),
+		llama.SetThreads(c.Threads),
+	}
+
+	if c.PromptCacheAll {
+		predictOptions = append(predictOptions, llama.EnablePromptCacheAll)
+	}
+
+	if c.PromptCacheRO {
+		predictOptions = append(predictOptions, llama.EnablePromptCacheRO)
+	}
+
+	if c.PromptCachePath != "" {
+		// Create parent directory
+		p := filepath.Join(modelPath, c.PromptCachePath)
+		os.MkdirAll(filepath.Dir(p), 0755)
+		predictOptions = append(predictOptions, llama.SetPathPromptCache(p))
+	}
+
+	if c.Mirostat != 0 {
+		predictOptions = append(predictOptions, llama.SetMirostat(c.Mirostat))
+	}
+
+	if c.MirostatETA != 0 {
+		predictOptions = append(predictOptions, llama.SetMirostatETA(c.MirostatETA))
+	}
+
+	if c.MirostatTAU != 0 {
+		predictOptions = append(predictOptions, llama.SetMirostatTAU(c.MirostatTAU))
+	}
+
+	if c.Debug {
+		predictOptions = append(predictOptions, llama.Debug)
+	}
+
+	predictOptions = append(predictOptions, llama.SetStopWords(c.StopWords...))
+
+	if c.RepeatPenalty != 0 {
+		predictOptions = append(predictOptions, llama.SetPenalty(c.RepeatPenalty))
+	}
+
+	if c.Keep != 0 {
+		predictOptions = append(predictOptions, llama.SetNKeep(c.Keep))
+	}
+
+	if c.Batch != 0 {
+		predictOptions = append(predictOptions, llama.SetBatch(c.Batch))
+	}
+
+	if c.F16 {
+		predictOptions = append(predictOptions, llama.EnableF16KV)
+	}
+
+	if c.IgnoreEOS {
+		predictOptions = append(predictOptions, llama.IgnoreEOS)
+	}
+
+	if c.Seed != 0 {
+		predictOptions = append(predictOptions, llama.SetSeed(c.Seed))
+	}
+
+	//predictOptions = append(predictOptions, llama.SetLogitBias(c.Seed))
+
+	predictOptions = append(predictOptions, llama.SetFrequencyPenalty(c.FrequencyPenalty))
+	predictOptions = append(predictOptions, llama.SetMlock(c.MMlock))
+	predictOptions = append(predictOptions, llama.SetMemoryMap(c.MMap))
+	predictOptions = append(predictOptions, llama.SetPredictionMainGPU(c.MainGPU))
+	predictOptions = append(predictOptions, llama.SetPredictionTensorSplit(c.TensorSplit))
+	predictOptions = append(predictOptions, llama.SetTailFreeSamplingZ(c.TFZ))
+	predictOptions = append(predictOptions, llama.SetTypicalP(c.TypicalP))
+
+	return predictOptions
+ }
+
+ func ModelInference(s string, loader *model.ModelLoader, c Config, o *Option, tokenCallback func(string) bool) (func() (string, error), error) {
+	supportStreams := false
+	modelFile := c.Model
+
+	llamaOpts := defaultLLamaOpts(c)
+
+	var inferenceModel interface{}
+	var err error
+	if c.Backend == "" {
+		inferenceModel, err = loader.GreedyLoader(modelFile, llamaOpts, uint32(c.Threads), o.assetsDestination)
+	} else {
+		inferenceModel, err = loader.BackendLoader(c.Backend, modelFile, llamaOpts, uint32(c.Threads), o.assetsDestination)
+	}
+	if err != nil {
+		return nil, err
+	}
+
+	var fn func() (string, error)
+
+	switch model := inferenceModel.(type) {
+	case *rwkv.RwkvState:
+		supportStreams = true
+
+		fn = func() (string, error) {
+			stopWord := "\n"
+			if len(c.StopWords) > 0 {
+				stopWord = c.StopWords[0]
+			}
+
+			if err := model.ProcessInput(s); err != nil {
+				return "", err
+			}
+
+			response := model.GenerateResponse(c.Maxtokens, stopWord, float32(c.Temperature), float32(c.TopP), tokenCallback)
+
+			return response, nil
+		}
+	case *transformers.GPTNeoX:
+		fn = func() (string, error) {
+			// Generate the prediction using the language model
+			predictOptions := []transformers.PredictOption{
+				transformers.SetTemperature(c.Temperature),
+				transformers.SetTopP(c.TopP),
+				transformers.SetTopK(c.TopK),
+				transformers.SetTokens(c.Maxtokens),
+				transformers.SetThreads(c.Threads),
+			}
+
+			if c.Batch != 0 {
+				predictOptions = append(predictOptions, transformers.SetBatch(c.Batch))
+			}
+
+			if c.Seed != 0 {
+				predictOptions = append(predictOptions, transformers.SetSeed(c.Seed))
+			}
+
+			return model.Predict(
+				s,
+				predictOptions...,
+			)
+		}
+	case *transformers.Replit:
+		fn = func() (string, error) {
+			// Generate the prediction using the language model
+			predictOptions := []transformers.PredictOption{
+				transformers.SetTemperature(c.Temperature),
+				transformers.SetTopP(c.TopP),
+				transformers.SetTopK(c.TopK),
+				transformers.SetTokens(c.Maxtokens),
+				transformers.SetThreads(c.Threads),
+			}
+
+			if c.Batch != 0 {
+				predictOptions = append(predictOptions, transformers.SetBatch(c.Batch))
+			}
+
+			if c.Seed != 0 {
+				predictOptions = append(predictOptions, transformers.SetSeed(c.Seed))
+			}
+
+			return model.Predict(
+				s,
+				predictOptions...,
+			)
+		}
+	case *transformers.Starcoder:
+		fn = func() (string, error) {
+			// Generate the prediction using the language model
+			predictOptions := []transformers.PredictOption{
+				transformers.SetTemperature(c.Temperature),
+				transformers.SetTopP(c.TopP),
+				transformers.SetTopK(c.TopK),
+				transformers.SetTokens(c.Maxtokens),
+				transformers.SetThreads(c.Threads),
+			}
+
+			if c.Batch != 0 {
+				predictOptions = append(predictOptions, transformers.SetBatch(c.Batch))
+			}
+
+			if c.Seed != 0 {
+				predictOptions = append(predictOptions, transformers.SetSeed(c.Seed))
+			}
+
+			return model.Predict(
+				s,
+				predictOptions...,
+			)
+		}
+	case *transformers.MPT:
+		fn = func() (string, error) {
+			// Generate the prediction using the language model
+			predictOptions := []transformers.PredictOption{
367
+ transformers.SetTemperature(c.Temperature),
368
+ transformers.SetTopP(c.TopP),
369
+ transformers.SetTopK(c.TopK),
370
+ transformers.SetTokens(c.Maxtokens),
371
+ transformers.SetThreads(c.Threads),
372
+ }
373
+
374
+ if c.Batch != 0 {
375
+ predictOptions = append(predictOptions, transformers.SetBatch(c.Batch))
376
+ }
377
+
378
+ if c.Seed != 0 {
379
+ predictOptions = append(predictOptions, transformers.SetSeed(c.Seed))
380
+ }
381
+
382
+ return model.Predict(
383
+ s,
384
+ predictOptions...,
385
+ )
386
+ }
387
+ case *bloomz.Bloomz:
388
+ fn = func() (string, error) {
389
+ // Generate the prediction using the language model
390
+ predictOptions := []bloomz.PredictOption{
391
+ bloomz.SetTemperature(c.Temperature),
392
+ bloomz.SetTopP(c.TopP),
393
+ bloomz.SetTopK(c.TopK),
394
+ bloomz.SetTokens(c.Maxtokens),
395
+ bloomz.SetThreads(c.Threads),
396
+ }
397
+
398
+ if c.Seed != 0 {
399
+ predictOptions = append(predictOptions, bloomz.SetSeed(c.Seed))
400
+ }
401
+
402
+ return model.Predict(
403
+ s,
404
+ predictOptions...,
405
+ )
406
+ }
407
+ case *transformers.Falcon:
408
+ fn = func() (string, error) {
409
+ // Generate the prediction using the language model
410
+ predictOptions := []transformers.PredictOption{
411
+ transformers.SetTemperature(c.Temperature),
412
+ transformers.SetTopP(c.TopP),
413
+ transformers.SetTopK(c.TopK),
414
+ transformers.SetTokens(c.Maxtokens),
415
+ transformers.SetThreads(c.Threads),
416
+ }
417
+
418
+ if c.Batch != 0 {
419
+ predictOptions = append(predictOptions, transformers.SetBatch(c.Batch))
420
+ }
421
+
422
+ if c.Seed != 0 {
423
+ predictOptions = append(predictOptions, transformers.SetSeed(c.Seed))
424
+ }
425
+
426
+ return model.Predict(
427
+ s,
428
+ predictOptions...,
429
+ )
430
+ }
431
+ case *transformers.GPTJ:
432
+ fn = func() (string, error) {
433
+ // Generate the prediction using the language model
434
+ predictOptions := []transformers.PredictOption{
435
+ transformers.SetTemperature(c.Temperature),
436
+ transformers.SetTopP(c.TopP),
437
+ transformers.SetTopK(c.TopK),
438
+ transformers.SetTokens(c.Maxtokens),
439
+ transformers.SetThreads(c.Threads),
440
+ }
441
+
442
+ if c.Batch != 0 {
443
+ predictOptions = append(predictOptions, transformers.SetBatch(c.Batch))
444
+ }
445
+
446
+ if c.Seed != 0 {
447
+ predictOptions = append(predictOptions, transformers.SetSeed(c.Seed))
448
+ }
449
+
450
+ return model.Predict(
451
+ s,
452
+ predictOptions...,
453
+ )
454
+ }
455
+ case *transformers.Dolly:
456
+ fn = func() (string, error) {
457
+ // Generate the prediction using the language model
458
+ predictOptions := []transformers.PredictOption{
459
+ transformers.SetTemperature(c.Temperature),
460
+ transformers.SetTopP(c.TopP),
461
+ transformers.SetTopK(c.TopK),
462
+ transformers.SetTokens(c.Maxtokens),
463
+ transformers.SetThreads(c.Threads),
464
+ }
465
+
466
+ if c.Batch != 0 {
467
+ predictOptions = append(predictOptions, transformers.SetBatch(c.Batch))
468
+ }
469
+
470
+ if c.Seed != 0 {
471
+ predictOptions = append(predictOptions, transformers.SetSeed(c.Seed))
472
+ }
473
+
474
+ return model.Predict(
475
+ s,
476
+ predictOptions...,
477
+ )
478
+ }
479
+ case *transformers.GPT2:
480
+ fn = func() (string, error) {
481
+ // Generate the prediction using the language model
482
+ predictOptions := []transformers.PredictOption{
483
+ transformers.SetTemperature(c.Temperature),
484
+ transformers.SetTopP(c.TopP),
485
+ transformers.SetTopK(c.TopK),
486
+ transformers.SetTokens(c.Maxtokens),
487
+ transformers.SetThreads(c.Threads),
488
+ }
489
+
490
+ if c.Batch != 0 {
491
+ predictOptions = append(predictOptions, transformers.SetBatch(c.Batch))
492
+ }
493
+
494
+ if c.Seed != 0 {
495
+ predictOptions = append(predictOptions, transformers.SetSeed(c.Seed))
496
+ }
497
+
498
+ return model.Predict(
499
+ s,
500
+ predictOptions...,
501
+ )
502
+ }
503
+ case *gpt4all.Model:
504
+ supportStreams = true
505
+
506
+ fn = func() (string, error) {
507
+ if tokenCallback != nil {
508
+ model.SetTokenCallback(tokenCallback)
509
+ }
510
+
511
+ // Generate the prediction using the language model
512
+ predictOptions := []gpt4all.PredictOption{
513
+ gpt4all.SetTemperature(c.Temperature),
514
+ gpt4all.SetTopP(c.TopP),
515
+ gpt4all.SetTopK(c.TopK),
516
+ gpt4all.SetTokens(c.Maxtokens),
517
+ }
518
+
519
+ if c.Batch != 0 {
520
+ predictOptions = append(predictOptions, gpt4all.SetBatch(c.Batch))
521
+ }
522
+
523
+ str, er := model.Predict(
524
+ s,
525
+ predictOptions...,
526
+ )
527
+ // Seems that if we don't free the callback explicitly we leave functions registered (that might try to send on closed channels)
528
+ // For instance otherwise the API returns: {"error":{"code":500,"message":"send on closed channel","type":""}}
529
+ // after a stream event has occurred
530
+ model.SetTokenCallback(nil)
531
+ return str, er
532
+ }
533
+ case *llama.LLama:
534
+ supportStreams = true
535
+ fn = func() (string, error) {
536
+
537
+ if tokenCallback != nil {
538
+ model.SetTokenCallback(tokenCallback)
539
+ }
540
+
541
+ predictOptions := buildLLamaPredictOptions(c, loader.ModelPath)
542
+
543
+ str, er := model.Predict(
544
+ s,
545
+ predictOptions...,
546
+ )
547
+ // Seems that if we don't free the callback explicitly we leave functions registered (that might try to send on closed channels)
548
+ // For instance otherwise the API returns: {"error":{"code":500,"message":"send on closed channel","type":""}}
549
+ // after a stream event has occurred
550
+ model.SetTokenCallback(nil)
551
+ return str, er
552
+ }
553
+ case *langchain.HuggingFace:
554
+ fn = func() (string, error) {
555
+
556
+ // Generate the prediction using the language model
557
+ predictOptions := []langchain.PredictOption{
558
+ langchain.SetModel(c.Model),
559
+ langchain.SetMaxTokens(c.Maxtokens),
560
+ langchain.SetTemperature(c.Temperature),
561
+ langchain.SetStopWords(c.StopWords),
562
+ }
563
+
564
+ pred, er := model.PredictHuggingFace(s, predictOptions...)
565
+ if er != nil {
566
+ return "", er
567
+ }
568
+ return pred.Completion, nil
569
+ }
570
+ }
571
+
572
+ return func() (string, error) {
573
+ // This is still needed, see: https://github.com/ggerganov/llama.cpp/discussions/784
574
+ mutexMap.Lock()
575
+ l, ok := mutexes[modelFile]
576
+ if !ok {
577
+ m := &sync.Mutex{}
578
+ mutexes[modelFile] = m
579
+ l = m
580
+ }
581
+ mutexMap.Unlock()
582
+ l.Lock()
583
+ defer l.Unlock()
584
+
585
+ res, err := fn()
586
+ if tokenCallback != nil && !supportStreams {
587
+ tokenCallback(res)
588
+ }
589
+ return res, err
590
+ }, nil
591
+ }
592
+
593
+ func ComputeChoices(predInput string, input *OpenAIRequest, config *Config, o *Option, loader *model.ModelLoader, cb func(string, *[]Choice), tokenCallback func(string) bool) ([]Choice, error) {
594
+ result := []Choice{}
595
+
596
+ n := input.N
597
+
598
+ if input.N == 0 {
599
+ n = 1
600
+ }
601
+
602
+ // get the model function to call for the result
603
+ predFunc, err := ModelInference(predInput, loader, *config, o, tokenCallback)
604
+ if err != nil {
605
+ return result, err
606
+ }
607
+
608
+ for i := 0; i < n; i++ {
609
+ prediction, err := predFunc()
610
+ if err != nil {
611
+ return result, err
612
+ }
613
+
614
+ prediction = Finetune(*config, predInput, prediction)
615
+ cb(prediction, &result)
616
+
617
+ //result = append(result, Choice{Text: prediction})
618
+
619
+ }
620
+ return result, err
621
+ }
622
+
623
+ var cutstrings map[string]*regexp.Regexp = make(map[string]*regexp.Regexp)
624
+ var mu sync.Mutex = sync.Mutex{}
625
+
626
+ func Finetune(config Config, input, prediction string) string {
627
+ if config.Echo {
628
+ prediction = input + prediction
629
+ }
630
+
631
+ for _, c := range config.Cutstrings {
632
+ mu.Lock()
633
+ reg, ok := cutstrings[c]
634
+ if !ok {
635
+ cutstrings[c] = regexp.MustCompile(c)
636
+ reg = cutstrings[c]
637
+ }
638
+ mu.Unlock()
639
+ prediction = reg.ReplaceAllString(prediction, "")
640
+ }
641
+
642
+ for _, c := range config.TrimSpace {
643
+ prediction = strings.TrimSpace(strings.TrimPrefix(prediction, c))
644
+ }
645
+ return prediction
646
+
647
+ }
assets.go ADDED
@@ -0,0 +1,6 @@
 
 
 
 
 
 
 
1
+ package main
2
+
3
+ import "embed"
4
+
5
+ //go:embed backend-assets/*
6
+ var backendAssets embed.FS
docker-compose.yaml ADDED
@@ -0,0 +1,15 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ version: '3.6'
2
+
3
+ services:
4
+ api:
5
+ image: quay.io/go-skynet/local-ai:latest
6
+ build:
7
+ context: .
8
+ dockerfile: Dockerfile
9
+ ports:
10
+ - 8080:8080
11
+ env_file:
12
+ - .env
13
+ volumes:
14
+ - ./models:/models:cached
15
+ command: ["/usr/bin/local-ai" ]
entrypoint.sh ADDED
@@ -0,0 +1,11 @@
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/bin/bash
2
+ set -e
3
+
4
+ cd /build
5
+
6
+ if [ "$REBUILD" != "false" ]; then
7
+ rm -rf ./local-ai
8
+ ESPEAK_DATA=/build/lib/Linux-$(uname -m)/piper_phonemize/lib/espeak-ng-data make build -j${THREADS:-1}
9
+ fi
10
+
11
+ ./local-ai "$@"
examples/README.md ADDED
@@ -0,0 +1,145 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Examples
2
+
3
+ Here is a list of projects that can easily be integrated with the LocalAI backend.
4
+
5
+ ### Projects
6
+
7
+ ### AutoGPT
8
+
9
+ _by [@mudler](https://github.com/mudler)_
10
+
11
+ This example shows how to use AutoGPT with LocalAI.
12
+
13
+ [Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/autoGPT/)
14
+
15
+ ### Chatbot-UI
16
+
17
+ _by [@mkellerman](https://github.com/mkellerman)_
18
+
19
+ ![Screenshot from 2023-04-26 23-59-55](https://user-images.githubusercontent.com/2420543/234715439-98d12e03-d3ce-4f94-ab54-2b256808e05e.png)
20
+
21
+ This integration shows how to use LocalAI with [mckaywrigley/chatbot-ui](https://github.com/mckaywrigley/chatbot-ui).
22
+
23
+ [Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/chatbot-ui/)
24
+
25
+ There is also a separate example to show how to manually setup a model: [example](https://github.com/go-skynet/LocalAI/tree/master/examples/chatbot-ui-manual/)
26
+
27
+ ### K8sGPT
28
+
29
+ _by [@mudler](https://github.com/mudler)_
30
+
31
+ This example shows how to use LocalAI inside Kubernetes with [k8sgpt](https://k8sgpt.ai).
32
+
33
+ ![Screenshot from 2023-06-19 23-58-47](https://github.com/go-skynet/go-ggml-transformers.cpp/assets/2420543/cab87409-ee68-44ae-8d53-41627fb49509)
34
+
35
+ ### Flowise
36
+
37
+ _by [@mudler](https://github.com/mudler)_
38
+
39
+ This example shows how to use [FlowiseAI/Flowise](https://github.com/FlowiseAI/Flowise) with LocalAI.
40
+
41
+ [Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/flowise/)
42
+
43
+ ### Discord bot
44
+
45
+ _by [@mudler](https://github.com/mudler)_
46
+
47
+ Run a Discord bot that lets you talk directly with a model.
48
+
49
+ [Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/discord-bot/), or for a live demo you can talk with our bot in #random-bot in our discord server.
50
+
51
+ ### Langchain
52
+
53
+ _by [@dave-gray101](https://github.com/dave-gray101)_
54
+
55
+ A ready-to-use example showing end-to-end how to integrate LocalAI with langchain.
56
+
57
+ [Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/langchain/)
58
+
59
+ ### Langchain Python
60
+
61
+ _by [@mudler](https://github.com/mudler)_
62
+
63
+ A ready-to-use example showing end-to-end how to integrate LocalAI with langchain.
64
+
65
+ [Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/langchain-python/)
66
+
67
+ ### LocalAI WebUI
68
+
69
+ _by [@dhruvgera](https://github.com/dhruvgera)_
70
+
71
+ ![image](https://user-images.githubusercontent.com/42107491/235344183-44b5967d-ba22-4331-804c-8da7004a5d35.png)
72
+
73
+ A light, community-maintained web interface for LocalAI
74
+
75
+ [Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/localai-webui/)
76
+
77
+ ### How to run rwkv models
78
+
79
+ _by [@mudler](https://github.com/mudler)_
80
+
81
+ A full example showing how to run RWKV models with LocalAI.
82
+
83
+ [Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/rwkv/)
84
+
85
+ ### PrivateGPT
86
+
87
+ _by [@mudler](https://github.com/mudler)_
88
+
89
+ A full example showing how to run PrivateGPT with LocalAI.
90
+
91
+ [Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/privateGPT/)
92
+
93
+ ### Slack bot
94
+
95
+ _by [@mudler](https://github.com/mudler)_
96
+
97
+ Run a Slack bot that lets you talk directly with a model.
98
+
99
+ [Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/slack-bot/)
100
+
101
+ ### Slack bot (Question answering)
102
+
103
+ _by [@mudler](https://github.com/mudler)_
104
+
105
+ Run a Slack bot, ideal for teams, that lets you ask questions about a documentation website or a GitHub repository.
106
+
107
+ [Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/slack-qa-bot/)
108
+
109
+ ### Question answering on documents with llama-index
110
+
111
+ _by [@mudler](https://github.com/mudler)_
112
+
113
+ Shows how to integrate with [Llama-Index](https://gpt-index.readthedocs.io/en/stable/getting_started/installation.html) to enable question answering on a set of documents.
114
+
115
+ [Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/query_data/)
116
+
117
+ ### Question answering on documents with langchain and chroma
118
+
119
+ _by [@mudler](https://github.com/mudler)_
120
+
121
+ Shows how to integrate with `Langchain` and `Chroma` to enable question answering on a set of documents.
122
+
123
+ [Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/langchain-chroma/)
124
+
125
+ ### Telegram bot
126
+
127
+ _by [@mudler](https://github.com/mudler)_
128
+
129
+ ![Screenshot from 2023-06-09 00-36-26](https://github.com/go-skynet/LocalAI/assets/2420543/e98b4305-fa2d-41cf-9d2f-1bb2d75ca902)
130
+
131
+ Use LocalAI to power a Telegram bot assistant, with Image generation and audio support!
132
+
133
+ [Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/telegram-bot/)
134
+
135
+ ### Template for Runpod.io
136
+
137
+ _by [@fHachenberg](https://github.com/fHachenberg)_
138
+
139
+ Allows running any LocalAI-compatible model as a backend on the servers of https://runpod.io.
140
+
141
+ [Check it out here](https://runpod.io/gsc?template=uv9mtqnrd0&ref=984wlcra)
142
+
143
+ ## Want to contribute?
144
+
145
+ Create an issue, and put `Example: <description>` in the title! We will post your examples here.
examples/autoGPT/.env ADDED
@@ -0,0 +1,5 @@
 
 
 
 
 
 
1
+ OPENAI_API_KEY=sk---anystringhere
2
+ OPENAI_API_BASE=http://api:8080/v1
3
+ # Models to preload at start
4
+ # Here we configure gpt4all as gpt-3.5-turbo and bert as embeddings
5
+ PRELOAD_MODELS=[{"url": "github:go-skynet/model-gallery/gpt4all-j.yaml", "name": "gpt-3.5-turbo"}, { "url": "github:go-skynet/model-gallery/bert-embeddings.yaml", "name": "text-embedding-ada-002"}]
examples/autoGPT/README.md ADDED
@@ -0,0 +1,32 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # AutoGPT
2
+
3
+ Example of integration with [AutoGPT](https://github.com/Significant-Gravitas/Auto-GPT).
4
+
5
+ ## Run
6
+
7
+ ```bash
8
+ # Clone LocalAI
9
+ git clone https://github.com/go-skynet/LocalAI
10
+
11
+ cd LocalAI/examples/autoGPT
12
+
13
+ docker-compose run --rm auto-gpt
14
+ ```
15
+
16
+ Note: the example automatically downloads the `gpt4all` model, as it is under a permissive license. However, GPT4All does not seem capable enough to drive AutoGPT; WizardLM-7b-uncensored seems to perform better (with `f16: true`).
17
+
18
+ See the `.env` configuration file to set a different model with the [model-gallery](https://github.com/go-skynet/model-gallery) by editing `PRELOAD_MODELS`.
19
+
20
+ ## Without docker
21
+
22
+ Run AutoGPT with `OPENAI_API_BASE` pointing to the LocalAI endpoint. If you run it locally for instance:
23
+
24
+ ```
25
+ OPENAI_API_BASE=http://localhost:8080 python ...
26
+ ```
27
+
28
+ Note: you need models named `gpt-3.5-turbo` and `text-embedding-ada-002`. You can preload those in LocalAI at start by setting in the env:
29
+
30
+ ```
31
+ PRELOAD_MODELS=[{"url": "github:go-skynet/model-gallery/gpt4all-j.yaml", "name": "gpt-3.5-turbo"}, { "url": "github:go-skynet/model-gallery/bert-embeddings.yaml", "name": "text-embedding-ada-002"}]
32
+ ```
examples/autoGPT/docker-compose.yaml ADDED
@@ -0,0 +1,42 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ version: "3.9"
2
+ services:
3
+ api:
4
+ image: quay.io/go-skynet/local-ai:latest
5
+ ports:
6
+ - 8080:8080
7
+ env_file:
8
+ - .env
9
+ environment:
10
+ - DEBUG=true
11
+ - MODELS_PATH=/models
12
+ volumes:
13
+ - ./models:/models:cached
14
+ command: ["/usr/bin/local-ai" ]
15
+ auto-gpt:
16
+ image: significantgravitas/auto-gpt
17
+ depends_on:
18
+ api:
19
+ condition: service_healthy
20
+ redis:
21
+ condition: service_started
22
+ env_file:
23
+ - .env
24
+ environment:
25
+ MEMORY_BACKEND: ${MEMORY_BACKEND:-redis}
26
+ REDIS_HOST: ${REDIS_HOST:-redis}
27
+ profiles: ["exclude-from-up"]
28
+ volumes:
29
+ - ./auto_gpt_workspace:/app/autogpt/auto_gpt_workspace
30
+ - ./data:/app/data
31
+ ## allow auto-gpt to write logs to disk
32
+ - ./logs:/app/logs
33
+ ## uncomment following lines if you want to make use of these files
34
+ ## you must have them existing in the same folder as this docker-compose.yml
35
+ #- type: bind
36
+ # source: ./azure.yaml
37
+ # target: /app/azure.yaml
38
+ #- type: bind
39
+ # source: ./ai_settings.yaml
40
+ # target: /app/ai_settings.yaml
41
+ redis:
42
+ image: "redis/redis-stack-server:latest"
examples/chatbot-ui-manual/README.md ADDED
@@ -0,0 +1,48 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # chatbot-ui
2
+
3
+ Example of integration with [mckaywrigley/chatbot-ui](https://github.com/mckaywrigley/chatbot-ui).
4
+
5
+ ![Screenshot from 2023-04-26 23-59-55](https://user-images.githubusercontent.com/2420543/234715439-98d12e03-d3ce-4f94-ab54-2b256808e05e.png)
6
+
7
+ ## Setup
8
+
9
+ ```bash
10
+ # Clone LocalAI
11
+ git clone https://github.com/go-skynet/LocalAI
12
+
13
+ cd LocalAI/examples/chatbot-ui
14
+
15
+ # (optional) Checkout a specific LocalAI tag
16
+ # git checkout -b build <TAG>
17
+
18
+ # Download gpt4all-j to models/
19
+ wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O models/ggml-gpt4all-j
20
+
21
+ # start with docker-compose
22
+ docker-compose up -d --pull always
23
+ # or you can build the images with:
24
+ # docker-compose up -d --build
25
+ ```
26
+
27
+ ## Pointing chatbot-ui to a separately managed LocalAI service
28
+
29
+ If you want to use the [chatbot-ui example](https://github.com/go-skynet/LocalAI/tree/master/examples/chatbot-ui) with an externally managed LocalAI service, you can alter the `docker-compose` file so that it looks like the below. You will notice the file is smaller, because we have removed the section that would normally start the LocalAI service. Take care to update the IP address (or FQDN) that the chatbot-ui service tries to access (marked `<<LOCALAI_IP>>` below):
30
+ ```
31
+ version: '3.6'
32
+
33
+ services:
34
+ chatgpt:
35
+ image: ghcr.io/mckaywrigley/chatbot-ui:main
36
+ ports:
37
+ - 3000:3000
38
+ environment:
39
+ - 'OPENAI_API_KEY=sk-XXXXXXXXXXXXXXXXXXXX'
40
+ - 'OPENAI_API_HOST=http://<<LOCALAI_IP>>:8080'
41
+ ```
42
+
43
+ Once you've edited the `docker-compose` file, you can start it with `docker compose up`, then browse to `http://localhost:3000`.
44
+
45
+ ## Accessing chatbot-ui
46
+
47
+ Open http://localhost:3000 for the Web UI.
48
+
examples/chatbot-ui-manual/docker-compose.yaml ADDED
@@ -0,0 +1,24 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ version: '3.6'
2
+
3
+ services:
4
+ api:
5
+ image: quay.io/go-skynet/local-ai:latest
6
+ build:
7
+ context: ../../
8
+ dockerfile: Dockerfile
9
+ ports:
10
+ - 8080:8080
11
+ environment:
12
+ - DEBUG=true
13
+ - MODELS_PATH=/models
14
+ volumes:
15
+ - ./models:/models:cached
16
+ command: ["/usr/bin/local-ai" ]
17
+
18
+ chatgpt:
19
+ image: ghcr.io/mckaywrigley/chatbot-ui:main
20
+ ports:
21
+ - 3000:3000
22
+ environment:
23
+ - 'OPENAI_API_KEY=sk-XXXXXXXXXXXXXXXXXXXX'
24
+ - 'OPENAI_API_HOST=http://api:8080'
examples/chatbot-ui-manual/models/completion.tmpl ADDED
@@ -0,0 +1 @@
 
 
1
+ {{.Input}}
examples/chatbot-ui-manual/models/gpt-3.5-turbo.yaml ADDED
@@ -0,0 +1,16 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ name: gpt-3.5-turbo
2
+ parameters:
3
+ model: ggml-gpt4all-j
4
+ top_k: 80
5
+ temperature: 0.2
6
+ top_p: 0.7
7
+ context_size: 1024
8
+ stopwords:
9
+ - "HUMAN:"
10
+ - "GPT:"
11
+ roles:
12
+ user: " "
13
+ system: " "
14
+ template:
15
+ completion: completion
16
+ chat: gpt4all
examples/chatbot-ui-manual/models/gpt4all.tmpl ADDED
@@ -0,0 +1,4 @@
 
 
 
 
 
1
+ The prompt below is a question to answer, a task to complete, or a conversation to respond to; decide which and write an appropriate response.
2
+ ### Prompt:
3
+ {{.Input}}
4
+ ### Response:
examples/chatbot-ui/README.md ADDED
@@ -0,0 +1,44 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # chatbot-ui
2
+
3
+ Example of integration with [mckaywrigley/chatbot-ui](https://github.com/mckaywrigley/chatbot-ui).
4
+
5
+ ![Screenshot from 2023-04-26 23-59-55](https://user-images.githubusercontent.com/2420543/234715439-98d12e03-d3ce-4f94-ab54-2b256808e05e.png)
6
+
7
+ ## Run
8
+
9
+ In this example LocalAI will download the gpt4all model and set it up as `gpt-3.5-turbo`. See the `docker-compose.yaml`.
10
+ ```bash
11
+ # Clone LocalAI
12
+ git clone https://github.com/go-skynet/LocalAI
13
+
14
+ cd LocalAI/examples/chatbot-ui
15
+
16
+ # start with docker-compose
17
+ docker-compose up --pull always
18
+
19
+ # or you can build the images with:
20
+ # docker-compose up -d --build
21
+ ```
22
+
23
+ ## Pointing chatbot-ui to a separately managed LocalAI service
24
+
25
+ If you want to use the [chatbot-ui example](https://github.com/go-skynet/LocalAI/tree/master/examples/chatbot-ui) with an externally managed LocalAI service, you can alter the `docker-compose` file so that it looks like the below. You will notice the file is smaller, because we have removed the section that would normally start the LocalAI service. Take care to update the IP address (or FQDN) that the chatbot-ui service tries to access (marked `<<LOCALAI_IP>>` below):
26
+ ```
27
+ version: '3.6'
28
+
29
+ services:
30
+ chatgpt:
31
+ image: ghcr.io/mckaywrigley/chatbot-ui:main
32
+ ports:
33
+ - 3000:3000
34
+ environment:
35
+ - 'OPENAI_API_KEY=sk-XXXXXXXXXXXXXXXXXXXX'
36
+ - 'OPENAI_API_HOST=http://<<LOCALAI_IP>>:8080'
37
+ ```
38
+
39
+ Once you've edited the `docker-compose` file, you can start it with `docker compose up`, then browse to `http://localhost:3000`.
40
+
41
+ ## Accessing chatbot-ui
42
+
43
+ Open http://localhost:3000 for the Web UI.
44
+
examples/chatbot-ui/docker-compose.yaml ADDED
@@ -0,0 +1,37 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ version: '3.6'
2
+
3
+ services:
4
+ api:
5
+ image: quay.io/go-skynet/local-ai:latest
6
+ # As initially LocalAI will download the models defined in PRELOAD_MODELS
7
+ # you might need to tweak the healthcheck values here according to your network connection.
8
+ # Here we give a timespan of 20m to download all the required files.
9
+ healthcheck:
10
+ test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
11
+ interval: 1m
12
+ timeout: 20m
13
+ retries: 20
14
+ build:
15
+ context: ../../
16
+ dockerfile: Dockerfile
17
+ ports:
18
+ - 8080:8080
19
+ environment:
20
+ - DEBUG=true
21
+ - MODELS_PATH=/models
22
+ # You can preload different models here as well.
23
+ # See: https://github.com/go-skynet/model-gallery
24
+ - 'PRELOAD_MODELS=[{"url": "github:go-skynet/model-gallery/gpt4all-j.yaml", "name": "gpt-3.5-turbo"}]'
25
+ volumes:
26
+ - ./models:/models:cached
27
+ command: ["/usr/bin/local-ai" ]
28
+ chatgpt:
29
+ depends_on:
30
+ api:
31
+ condition: service_healthy
32
+ image: ghcr.io/mckaywrigley/chatbot-ui:main
33
+ ports:
34
+ - 3000:3000
35
+ environment:
36
+ - 'OPENAI_API_KEY=sk-XXXXXXXXXXXXXXXXXXXX'
37
+ - 'OPENAI_API_HOST=http://api:8080'
examples/discord-bot/.env.example ADDED
@@ -0,0 +1,6 @@
 
 
 
 
 
 
 
1
+ OPENAI_API_KEY=x
2
+ DISCORD_BOT_TOKEN=x
3
+ DISCORD_CLIENT_ID=x
4
+ OPENAI_API_BASE=http://api:8080
5
+ ALLOWED_SERVER_IDS=x
6
+ SERVER_TO_MODERATION_CHANNEL=1:1
examples/discord-bot/README.md ADDED
@@ -0,0 +1,76 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # discord-bot
2
+
3
+ ![Screenshot from 2023-05-01 07-58-19](https://user-images.githubusercontent.com/2420543/235413924-0cb2e75b-f2d6-4119-8610-44386e44afb8.png)
4
+
5
+ ## Setup
6
+
7
+ ```bash
8
+ # Clone LocalAI
9
+ git clone https://github.com/go-skynet/LocalAI
10
+
11
+ cd LocalAI/examples/discord-bot
12
+
13
+ # (optional) Checkout a specific LocalAI tag
14
+ # git checkout -b build <TAG>
15
+
16
+ # Download gpt4all-j to models/
17
+ wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O models/ggml-gpt4all-j
18
+
19
+ # Set the discord bot options (see: https://github.com/go-skynet/gpt-discord-bot#setup)
20
+ cp -rfv .env.example .env
21
+ vim .env
22
+
23
+ # start with docker-compose
24
+ docker-compose up -d --build
25
+ ```
26
+
27
+ Note: see setup options here: https://github.com/go-skynet/gpt-discord-bot#setup
28
+
29
+ Open up the URL in the console and give permission to the bot in your server. Start a thread with `/chat ..`
30
+
31
+ ## Kubernetes
32
+
33
+ - install the local-ai chart first
34
+ - change OPENAI_API_BASE to point to the API address and apply the discord-bot manifest:
35
+
36
+ ```yaml
37
+ apiVersion: v1
38
+ kind: Namespace
39
+ metadata:
40
+ name: discord-bot
41
+ ---
42
+ apiVersion: apps/v1
43
+ kind: Deployment
44
+ metadata:
45
+ name: localai
46
+ namespace: discord-bot
47
+ labels:
48
+ app: localai
49
+ spec:
50
+ selector:
51
+ matchLabels:
52
+ app: localai
53
+ replicas: 1
54
+ template:
55
+ metadata:
56
+ labels:
57
+ app: localai
58
+ name: localai
59
+ spec:
60
+ containers:
61
+ - name: localai-discord
62
+ env:
63
+ - name: OPENAI_API_KEY
64
+ value: "x"
65
+ - name: DISCORD_BOT_TOKEN
66
+ value: ""
67
+ - name: DISCORD_CLIENT_ID
68
+ value: ""
69
+ - name: OPENAI_API_BASE
70
+ value: "http://local-ai.default.svc.cluster.local:8080"
71
+ - name: ALLOWED_SERVER_IDS
72
+ value: "xx"
73
+ - name: SERVER_TO_MODERATION_CHANNEL
74
+ value: "1:1"
75
+ image: quay.io/go-skynet/gpt-discord-bot:main
76
+ ```
examples/discord-bot/docker-compose.yaml ADDED
@@ -0,0 +1,21 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ version: '3.6'
2
+
3
+ services:
4
+ api:
5
+ image: quay.io/go-skynet/local-ai:latest
6
+ build:
7
+ context: ../../
8
+ dockerfile: Dockerfile
9
+ ports:
10
+ - 8080:8080
11
+ environment:
12
+ - DEBUG=true
13
+ - MODELS_PATH=/models
14
+ volumes:
15
+ - ./models:/models:cached
16
+ command: ["/usr/bin/local-ai" ]
17
+
18
+ bot:
19
+ image: quay.io/go-skynet/gpt-discord-bot:main
20
+ env_file:
21
+ - .env
examples/discord-bot/models ADDED
@@ -0,0 +1 @@
 
 
1
+ ../chatbot-ui/models/
examples/flowise/README.md ADDED
@@ -0,0 +1,30 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # flowise
2
+
3
+ Example of integration with [FlowiseAI/Flowise](https://github.com/FlowiseAI/Flowise).
4
+
5
+ ![Screenshot from 2023-05-30 18-01-03](https://github.com/go-skynet/LocalAI/assets/2420543/02458782-0549-4131-971c-95ee56ec1af8)
6
+
7
+ You can check a demo video in the Flowise PR: https://github.com/FlowiseAI/Flowise/pull/123
8
+
9
+ ## Run
10
+
11
+ In this example LocalAI will download the gpt4all model and set it up as `gpt-3.5-turbo`. See the `docker-compose.yaml`.
12
+ ```bash
13
+ # Clone LocalAI
14
+ git clone https://github.com/go-skynet/LocalAI
15
+
16
+ cd LocalAI/examples/flowise
17
+
18
+ # start with docker-compose
19
+ docker-compose up --pull always
20
+
21
+ ```
22
+
23
+ ## Accessing flowise
24
+
25
+ Open http://localhost:3000.
26
+
27
+ ## Using LocalAI
28
+
29
+ Search for LocalAI in the integrations, and use `http://api:8080/` as the URL.
30
+
examples/flowise/docker-compose.yaml ADDED
@@ -0,0 +1,37 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ version: '3.6'
2
+
3
+ services:
4
+ api:
5
+ image: quay.io/go-skynet/local-ai:latest
6
+ # As initially LocalAI will download the models defined in PRELOAD_MODELS
7
+ # you might need to tweak the healthcheck values here according to your network connection.
8
+ # Here we give a timespan of 20m to download all the required files.
9
+ healthcheck:
10
+ test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
11
+ interval: 1m
12
+ timeout: 20m
13
+ retries: 20
14
+ build:
15
+ context: ../../
16
+ dockerfile: Dockerfile
17
+ ports:
18
+ - 8080:8080
19
+ environment:
20
+ - DEBUG=true
21
+ - MODELS_PATH=/models
22
+ # You can preload different models here as well.
23
+ # See: https://github.com/go-skynet/model-gallery
24
+ - 'PRELOAD_MODELS=[{"url": "github:go-skynet/model-gallery/gpt4all-j.yaml", "name": "gpt-3.5-turbo"}]'
25
+ volumes:
26
+ - ./models:/models:cached
27
+ command: ["/usr/bin/local-ai" ]
28
+ flowise:
29
+ depends_on:
30
+ api:
31
+ condition: service_healthy
32
+ image: flowiseai/flowise
33
+ ports:
34
+ - 3000:3000
35
+ volumes:
36
+ - ~/.flowise:/root/.flowise
37
+ command: /bin/sh -c "sleep 3; flowise start"
examples/k8sgpt/README.md ADDED
@@ -0,0 +1,70 @@
+ # k8sgpt example
+
+ This example shows how to use LocalAI with k8sgpt.
+
+ ![Screenshot from 2023-06-19 23-58-47](https://github.com/go-skynet/go-ggml-transformers.cpp/assets/2420543/cab87409-ee68-44ae-8d53-41627fb49509)
+
+ ## Create the cluster locally with Kind (optional)
+
+ If you want to test this locally without a remote Kubernetes cluster, you can use kind.
+
+ Install [kind](https://kind.sigs.k8s.io/) and create a cluster:
+
+ ```
+ kind create cluster
+ ```
+
+ ## Setup LocalAI
+
+ We will use [helm](https://helm.sh/docs/intro/install/):
+
+ ```
+ helm repo add go-skynet https://go-skynet.github.io/helm-charts/
+ helm repo update
+
+ # Clone LocalAI
+ git clone https://github.com/go-skynet/LocalAI
+
+ cd LocalAI/examples/k8sgpt
+
+ # Modify the preload_models list in values.yaml with the models you want to install.
+ # CHANGE the URL to a model in huggingface.
+ helm install local-ai go-skynet/local-ai --create-namespace --namespace local-ai --values values.yaml
+ ```
+
+ ## Setup K8sGPT
+
+ ```
+ # Install k8sgpt
+ helm repo add k8sgpt https://charts.k8sgpt.ai/
+ helm repo update
+ helm install release k8sgpt/k8sgpt-operator -n k8sgpt-operator-system --create-namespace
+ ```
+
+ Apply the k8sgpt-operator configuration:
+
+ ```
+ kubectl apply -f - << EOF
+ apiVersion: core.k8sgpt.ai/v1alpha1
+ kind: K8sGPT
+ metadata:
+   name: k8sgpt-local-ai
+   namespace: default
+ spec:
+   backend: localai
+   baseUrl: http://local-ai.local-ai.svc.cluster.local:8080/v1
+   model: gpt-3.5-turbo
+   noCache: false
+   version: v0.3.0
+   enableAI: true
+ EOF
+ ```
+
+ ## Test
+
+ Apply a broken pod:
+
+ ```
+ kubectl apply -f broken-pod.yaml
+ ```
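The `broken-pod.yaml` shipped with the example is the workload k8sgpt should diagnose. If you want to craft your own test case, any pod that cannot start will do; a hypothetical sketch (image tag invented here) that leaves the pod stuck in `ImagePullBackOff`:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: broken-pod
  namespace: default
spec:
  containers:
    - name: broken
      # Hypothetical tag that does not exist, so the image pull fails
      image: nginx:1.a.b.c
      ports:
        - containerPort: 80
```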