jaothan committed on
Commit
891fbea
·
verified ·
1 Parent(s): 7f48a8c

Upload 36 files

.gitignore ADDED
@@ -0,0 +1,11 @@
+ .env
+ test_endpoint.py
+ app/llm_workflows.py
+ build
+ __pycache__
+ databricks_llm.egg-info
+ dist
+ hf_endpoint.py
+ azureOAI_endpoint.py
+ **/prompt.py
+ prompts
LICENSE ADDED
@@ -0,0 +1,201 @@
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by
+ the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based on (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to reproduce, prepare Derivative Works of,
+ publicly display, publicly perform, sublicense, and distribute the
+ Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, patent, trademark, and
+ attribution notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" text file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE text file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and
+ may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "[]"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+ Copyright [yyyy] [name of copyright owner]
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
README - raw.md ADDED
@@ -0,0 +1,42 @@
+ # Introduction
+ This repo contains a CLI which can interact with the Databricks API using natural language.
+ This functionality is enabled by LangChain and Large Language Models (LLMs).
+ The code is tested with text-davinci-003 from OpenAI using the configuration in llm.yaml. Any OpenAI model can be configured in llm.yaml.
+
+ # General
+ You can either install the CLI or run the repository directly with python databricks_llm.py \[OPTIONS\] COMMAND \[ARGS\].
+
+ # Getting Started
+ 1. Dependencies
+ * All necessary dependencies are listed in requirements.txt and should be installed in a virtual environment (e.g. pip install -r requirements.txt).
+ 2. Installation of the CLI:
+ * python setup.py bdist_wheel
+ * pip install .\dist\databricks_llm-0.0.1-py3-none-any.whl
+ 3. The following environment variables always need to be set:
+ * OPENAI_API_KEY = ... (Your access key from OpenAI or Azure OpenAI)
+ * DATABRICKS_HOST = ... (The host name in the form "adb-\<workspace-id\>.azuredatabricks.net")
+ * DBR_BEARER_TOKEN = ... (A personal access token for your workspace)
+ 4. If you want to use the **Azure OpenAI** Service, please also set the following environment variables:
+ * OPENAI_API_TYPE = "azure" (example)
+ * OPENAI_API_VERSION = "2023-05-15" (example)
+ * OPENAI_API_BASE = "https://\<resource-name\>.openai.azure.com/"
+ 5. The **llm.yaml** config:
+ * This contains the parameters for the Large Language Model.
+ * **For Azure OpenAI the entry "deployment_name" needs to be present; otherwise please remove it.**
+
+ # Functionalities
+ This repo contains several functionalities around Databricks Jobs/Workflows and their permissions, as well as around MLflow models stored within a Databricks workspace. The commands and their descriptions can be accessed with the --help option.
+
+ # Example workflows
+ * Jobs:
+ * Get particular job infos: databricks_llm jobs "Please return a list of all jobs; Figure out individual job configuration and only return the job names with tags 'NLP' and their jobID summarized:"
+ * Permissions:
+ * Add a permission entry: databricks_llm permissions "Please add an entry in the list for the group: '\<group_name\>' with permission 'CAN_VIEW'" "\<workflow_id\>"
+ * Modify a permission entry: databricks_llm permissions "Please modify the entry in the list for the group: '\<group_name\>' to permission 'CAN_MANAGE'" "\<workflow_id\>"
+ * Machine Learning:
+ * Get model infos: databricks_llm ml get-model-info "Please give me all infos for the model '\<model_name\>'"
+ * Get infos about a run: databricks_llm ml get-run-info "\<run-id\>" "Please return the f1 score"
+ * Transition a model: databricks_llm ml transition-model "Please transition model '\<model_name\>' version 6 into stage 'Production' and archive the current model in production"
+
+ # Overview of the workflows
+ ![api_chatter](./assets/API_Chatter_Overview.PNG)
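The README refers to an llm.yaml that holds the model parameters but does not show its contents. As a hedged sketch only: the exact keys are whatever the langchain `OpenAI`/`AzureOpenAI` constructors accept, since app.py unpacks `config['model']` directly into them; the values below are illustrative assumptions, not the shipped file.

```yaml
# Hypothetical llm.yaml sketch — keys are passed as **kwargs to the
# langchain OpenAI/AzureOpenAI constructor, so only use keys those
# constructors actually accept.
model:
  model_name: text-davinci-003
  temperature: 0.0
  max_tokens: 512
  # Only for Azure OpenAI — remove this entry for plain OpenAI,
  # since its presence is what switches app.py to AzureOpenAI:
  # deployment_name: my-davinci-deployment
```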
app.py ADDED
@@ -0,0 +1,103 @@
+ import click
+ import logging
+ import sys
+ import yaml
+ import os
+ from langchain.llms import AzureOpenAI, OpenAI
+ from app.api_funcs import get_job_infos, get_run, get_model, \
+     trans_model, batch_mod_permission, prepare_api_docs
+ from pathlib import Path
+
+ logger = logging.getLogger(__name__)
+ logger.setLevel(logging.DEBUG)
+ handler = logging.StreamHandler(sys.stdout)
+ handler.setLevel(logging.DEBUG)
+ logger.addHandler(handler)
+
+ PATH = Path(os.path.abspath(os.path.dirname(__file__)))
+
+ # Open the YAML config file
+ conf_path = PATH / "app" / "llm.yaml"
+ with open(conf_path) as f:
+     config = yaml.safe_load(f)
+
+ # Use AzureOpenAI if the config contains a deployment name, otherwise OpenAI
+ if config['model'].get('deployment_name', False):
+     llm = AzureOpenAI(**config['model'])
+ else:
+     llm = OpenAI(**config['model'])
+
+ headers = {"Authorization": f"Bearer {os.getenv('DBR_BEARER_TOKEN')}"}
+ updated_api_docs = prepare_api_docs()
+
+
+ def comma_list(comma_str: str):
+     return comma_str.split(',')
+
+
+ def determine_api_text(updated_api_docs: dict, query: str):
+     pick_api_prompt = """Please return the file name from the list {api_docs}
+     that best corresponds to the following query: {query}. \
+     DO NOT EXPLAIN your answer!
+     """
+     api_docs = os.listdir(PATH / "app" / "dbr_api_docs")
+     selected_api_doc = llm(pick_api_prompt.format(api_docs=api_docs, query=query)).strip()
+     logger.info(f"\nSelecting the following api document: {selected_api_doc}")
+     api_text = updated_api_docs[selected_api_doc]
+     return api_text, selected_api_doc
+
+
+ # Add subcommands for commands
+ @click.group()
+ def cli():
+     pass
+
+
+ @cli.group(help='Run machine learning model.')
+ def ml():
+     pass
+
+
+ # Add commands for specific subcommands of 'ml'
+ @ml.command(help='Get information about a model.')
+ @click.argument('query', type=str)
+ def get_model_info(query):
+     # Instruction to get model infos
+     api_text, _ = determine_api_text(updated_api_docs, query)
+     logger.info(get_model(llm, query, api_text, headers))
+
+
+ @ml.command(help='Get information about a model run.')
+ @click.argument('run_id', type=str)
+ @click.argument('query', type=str)
+ def get_run_info(query, run_id):
+     # run_id: ID of the model run for which you'd like information.
+     # query: which information should be pulled from the run?
+     api_text, _ = determine_api_text(updated_api_docs, query)
+     logger.info(get_run(llm, run_id, query, api_text, headers))
+
+
+ @ml.command(help='Transition a model from one state to another.')
+ @click.argument('query', type=str)
+ def transition_model(query):
+     # Instruction to transition a model.
+     api_text, _ = determine_api_text(updated_api_docs, query)
+     trans_model(llm, query, api_text, headers)
+
+
+ @cli.command(help='View job history.')
+ @click.argument('query', type=str)
+ def jobs(query):
+     if ";" not in query:
+         query = query + ";"
+     # The query for the LLM + an optional query for the API response
+     query, response_query = query.split(";", 1)
+     api_text, _ = determine_api_text(updated_api_docs, query)
+     logger.info(get_job_infos(llm, query, response_query, api_text, headers))
+
+
+ @cli.command(help='Manage user permissions.')
+ @click.argument('query', type=str)
+ @click.argument('jobs', type=comma_list)
+ def permissions(jobs, query):
+     # Add/Get user permissions.
+     api_text, api_name = determine_api_text(updated_api_docs, query)
+     batch_mod_permission(
+         logger, llm, updated_api_docs, api_text, api_name, headers,
+         query, jobs=jobs
+     )
+
+
+ if __name__ == '__main__':
+     cli()
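The client selection in app.py hinges on one rule: if the loaded config's `model` section carries a `deployment_name` entry, the Azure OpenAI client is constructed, otherwise the plain OpenAI one. A minimal sketch of just that dispatch rule, with string stand-ins instead of the real langchain classes:

```python
# Sketch of app.py's config-driven client dispatch. The return values
# are stand-in names, not real client objects.

def pick_client(config: dict) -> str:
    # Presence of a (truthy) deployment_name selects the Azure path.
    if config["model"].get("deployment_name", False):
        return "AzureOpenAI"
    return "OpenAI"

azure_cfg = {"model": {"deployment_name": "my-deployment", "temperature": 0.0}}
plain_cfg = {"model": {"model_name": "text-davinci-003", "temperature": 0.0}}

print(pick_client(azure_cfg))  # AzureOpenAI
print(pick_client(plain_cfg))  # OpenAI
```

Note that the rule is purely structural: removing the `deployment_name` key (as the README instructs for non-Azure use) is what flips the branch.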
app/__init__.py ADDED
File without changes
app/__main__.py ADDED
File without changes
app/api_funcs.py ADDED
@@ -0,0 +1,206 @@
+ import json
+ import os
+ import ast
+ from pathlib import Path
+ from typing import List
+ import langchain.llms as LLM
+ from app.custom_langchain import ModAPIChain, APIResponse, FlexAPIChain, FlexAPIChainPayload
+ from app.utils import find_value, custom_api_prompt
+ from app.custom_api_prompts import API_REQUEST_PROMPT, \
+     API_REQUEST_PROMPT2, API_RESPONSE_PROMPT, API_RESPONSE_PROMPT2, PERMISSION_PROMPT
+
+ PATH = Path(os.path.abspath(os.path.dirname(__file__)))
+
+ # Directory containing the API doc text files
+ api_doc_path = PATH / 'dbr_api_docs'
+
+
+ def prepare_api_docs():
+     """
+     Add environment variables to the individual API docs.
+     """
+     updated_api_docs = {}
+     for filename in os.listdir(api_doc_path):
+         with open(os.path.join(api_doc_path, filename)) as f:
+             updated_api_docs[filename] = f.read()
+         updated_api_docs[filename] = updated_api_docs[filename].replace(
+             "{DATABRICKS_HOST}", os.getenv('DATABRICKS_HOST'))
+     return updated_api_docs
+
+
+ def get_run(llm: LLM, runID: str, query: str, api_text: str, headers: dict) -> str:
+     """
+     Get infos from an MLflow run.
+
+     Args:
+         llm (LLM): Large language model
+         runID (str): Run ID
+         query (str): Query
+         api_text (str): API documentation for a particular endpoint
+         headers (dict): Authorization for the Databricks API
+
+     Return:
+         (str): the response of the language model
+     """
+     mlflow_run_get_chain = APIResponse.from_llm_and_api_docs(llm, api_text, headers=headers, verbose=True)
+     run_infos = json.loads(mlflow_run_get_chain.run(
+         f"""Please give me the infos about the run '{runID}'"""))
+     # e.g. query = "Please return the f1 score"
+     return llm(f"""{query} of the metrics : {find_value(run_infos, 'metrics')}""")
+
+
+ def trans_model(llm: LLM, query: str, api_text: str, headers: dict):
+     """
+     Transition an MLflow model to another stage.
+
+     Args:
+         llm (LLM): Large language model
+         query (str): Query for the model
+         api_text (str): API documentation for a particular endpoint
+         headers (dict): Authorization for the Databricks API
+     """
+     mlflow_model_transition_chain = FlexAPIChainPayload.from_llm_and_api_docs(
+         llm, api_text, headers=headers, api_url_prompt=API_REQUEST_PROMPT2,
+         api_response_prompt=API_RESPONSE_PROMPT2, verbose=True
+     )
+     mlflow_model_transition_chain.run(query)
+
+
+ def get_model(llm: LLM, model_query: str, api_text: str, headers: dict) -> str:
+     """
+     Get all infos about a model and its particular stages.
+
+     Args:
+         llm (LLM): Large language model
+         model_query (str): Query
+         api_text (str): API documentation for a particular endpoint
+         headers (dict): Authorization for the Databricks API
+
+     Return:
+         (str): response of the language model
+     """
+     mlflow_model_get_chain = APIResponse.from_llm_and_api_docs(llm, api_text, headers=headers, verbose=True)
+     api_result = mlflow_model_get_chain.run(model_query)
+
+     return llm(
+         f"""What are the runID and version of the latest version with stage 'None' \
+         and stage 'Production' respectively, given the following info: {api_result}"""
+     )
+
+
+ def batch_mod_permission(
+     logger, llm: LLM, updated_api_docs: dict, api_text: str, api_name: str,
+     headers: dict, permission_mod: str, jobs: List[str]
+ ):
+     """
+     Get or modify permissions for a batch of jobs.
+
+     Args:
+         logger (Logger): Logger object
+         llm (LLM): Large language model
+         updated_api_docs (dict): Updated API docs
+         api_text (str): API documentation for a particular endpoint
+         api_name (str): API name
+         headers (dict): Authorization for the Databricks API
+         permission_mod (str): Permission modification
+         jobs (list): List of jobs
+     """
+     get_permission_txt = updated_api_docs['get_permissions.txt']
+     for job in jobs:
+         logger.info(f"Permission for job: {job}")
+         mod_permissions(
+             logger, llm, permission_mod, job, headers,
+             get_permission_txt, api_text, api_name
+         )
+
+
+ def get_permissions(logger, llm: LLM, api_text: str, headers: dict, jobID: str) -> dict:
+     """
+     Get the permissions of a particular job.
+
+     Args:
+         logger (Logger): Logger object
+         llm (LLM): Large language model
+         api_text (str): API documentation for a particular endpoint
+         headers (dict): Authorization for the Databricks API
+         jobID (str): ID of the Databricks job
+
+     Return:
+         (dict): Access control list
+     """
+     init_query = f"""Get the jobs permissions for jobID {jobID}"""
+     permission_get_chain = APIResponse.from_llm_and_api_docs(llm, api_text, headers=headers, verbose=True)
+
+     acc_control_list = {'access_control_list': json.loads(permission_get_chain.run(init_query))['access_control_list']}
+     logger.info(acc_control_list)
+     return acc_control_list
+
+
+ def mod_permissions(
+     logger,
+     llm: LLM,
+     permission_mod: str,
+     jobID: str, headers: dict,
+     get_permission_txt: str,
+     api_text: str, api_name: str
+ ):
+     """
+     Modify permissions for a job.
+
+     Args:
+         logger (Logger): Logger object
+         llm (LLM): Large language model
+         permission_mod (str): Permission modification
+         jobID (str): ID of the Databricks job
+         headers (dict): Authorization for the Databricks API
+         get_permission_txt (str): API text for getting permissions
+         api_text (str): API documentation for a particular endpoint
+         api_name (str): file name of the API documentation
+     """
+     acc_control_list = get_permissions(logger, llm, get_permission_txt, headers, jobID)
+
+     # Update the permissions, if the query demands it
+     if api_name == 'update_permissions.txt':
+         permission_prompt = PERMISSION_PROMPT.format(
+             acc_control_list=acc_control_list, permission_mod=permission_mod)
+
+         output_acc_list = llm(permission_prompt)
+         patch_payload = ast.literal_eval(output_acc_list.strip())
+         # Reformat the general permission response into the form for the patch payload
+         for idx, _ in enumerate(patch_payload['access_control_list']):
+             patch_payload['access_control_list'][idx]['permission_level'] = \
+                 patch_payload['access_control_list'][idx]['all_permissions'][0]['permission_level']
+             patch_payload['access_control_list'][idx].pop('all_permissions')
+
+         # Use the extended API chain, which can also do POST, PUT and PATCH
+         permission_update_chain = FlexAPIChain.from_llm_and_api_docs(
+             llm, api_text, headers=headers, verbose=True,
+             api_url_prompt=API_REQUEST_PROMPT, api_response_prompt=API_RESPONSE_PROMPT
+         )
+
+         permission_update_chain.body = patch_payload
+         logger.info(f"Payload for permission update: {patch_payload}")
+         permission_update_chain(f"""Update the jobs permissions of jobID {jobID}""")
+
+
+ def get_job_infos(
+     llm: LLM, query: str, response_query: str, api_text: str, headers: dict
+ ) -> str:
+     """
+     Get particular infos about the list of existing jobs.
+
+     Args:
+         llm (LLM): Large language model
+         query (str): Query to get the list of jobs
+         response_query (str): Query to get the response
+         api_text (str): API documentation for a particular endpoint
+         headers (dict): Authorization for the Databricks API
+
+     Return:
+         (str): response from the large language model
+     """
+     job_chain = ModAPIChain.from_llm_and_api_docs(
+         llm, api_text, headers=headers, verbose=True
+     )
+
+     custom_resp = f"This response contains a list of databricks jobs. {response_query}"
+     jobs_result = custom_api_prompt(llm, job_chain, query, custom_resp)
+     return jobs_result
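The payload reshaping inside mod_permissions is the one fiddly data transformation in this file: the permissions GET response nests each entry's level under `all_permissions`, while the PATCH payload wants a flat `permission_level` key. A standalone sketch of exactly that loop (the sample ACL below is an illustrative assumption, not real API output):

```python
# Sketch of mod_permissions' payload reshaping: move the nested
# all_permissions[0]["permission_level"] up to a flat key and drop
# the all_permissions list, matching the PATCH payload shape.

def flatten_acl(patch_payload: dict) -> dict:
    for entry in patch_payload["access_control_list"]:
        entry["permission_level"] = entry["all_permissions"][0]["permission_level"]
        entry.pop("all_permissions")
    return patch_payload

# Hypothetical GET-style response entry
acl = {"access_control_list": [
    {"group_name": "data-team",
     "all_permissions": [{"permission_level": "CAN_VIEW", "inherited": False}]},
]}
print(flatten_acl(acl))
```

Note this keeps only the first `all_permissions` entry per principal, mirroring the `[0]` index in the original loop.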
app/convert_api_txt.py ADDED
@@ -0,0 +1 @@
+ # Script for converting raw api texts (crawled, copy&paste from webpage etc.)
app/custom_api_prompts.py ADDED
@@ -0,0 +1,83 @@
+ from langchain.prompts.prompt import PromptTemplate
+
+ API_URL_PROMPT_TEMPLATE = """You are given the below API Documentation:
+ {api_docs}
+ Using this documentation, generate the full API url to call for answering the user question.
+ You should build the API url in order to get a response that is as short as possible, while still getting the necessary information to answer the question. Pay attention to deliberately exclude any unnecessary pieces of data in the API call.
+ AFTER the API url, you should extract the request METHOD (can be GET, POST, PATCH or PUT) from doc.
+
+ Question:{question}
+ """
+
+ API_REQUEST_PROMPT_TEMPLATE = API_URL_PROMPT_TEMPLATE + """Output the API url and METHOD, join them with `|`. DO NOT GIVE ANY EXPLANATION."""
+
+ API_REQUEST_PROMPT = PromptTemplate(
+     input_variables=[
+         "api_docs",
+         "question",
+     ],
+     template=API_REQUEST_PROMPT_TEMPLATE,
+ )
+
+ API_RESPONSE_PROMPT_TEMPLATE = (
+     API_URL_PROMPT_TEMPLATE
+     + """API url: {api_url}
+
+ Here is the response from the API:
+
+ {api_response}
+
+ Summarize this response to answer the original question.
+
+ Summary:"""
+ )
+
+ API_RESPONSE_PROMPT = PromptTemplate(
+     input_variables=["api_docs", "question", "api_url", "api_response"],
+     template=API_RESPONSE_PROMPT_TEMPLATE,
+ )
+
+ API_URL_PROMPT_TEMPLATE2 = """You are given the below API Documentation:
+ {api_docs}
+ Using this documentation, generate the full API url to call for answering the user question.
+ You should build the API url in order to get a response that is as short as possible, while still getting the necessary information to answer the question. Pay attention to deliberately exclude any unnecessary pieces of data in the API call.
+ AFTER the API url, you should extract the request METHOD from doc, and generate the BODY data in JSON format according to the user question if necessary. The BODY data could be empty dict.
+
+ Question:{question}
+ """
+
+ API_REQUEST_PROMPT_TEMPLATE2 = API_URL_PROMPT_TEMPLATE2 + """Output the API url, METHOD and BODY, join them with `|`. DO NOT GIVE ANY EXPLANATION."""
+
+ API_REQUEST_PROMPT2 = PromptTemplate(
+     input_variables=[
+         "api_docs",
+         "question",
+     ],
+     template=API_REQUEST_PROMPT_TEMPLATE2,
+ )
+
+ API_RESPONSE_PROMPT_TEMPLATE2 = (
+     API_URL_PROMPT_TEMPLATE2
+     + """API url: {api_url}
+
+ Here is the response from the API:
+
+ {api_response}
+
+ Summarize this response to answer the original question.
+
+ Summary:"""
+ )
+
+ API_RESPONSE_PROMPT2 = PromptTemplate(
+     input_variables=["api_docs", "question", "api_url", "api_response"],
+     template=API_RESPONSE_PROMPT_TEMPLATE2,
+ )
+
+ PERMISSION_PROMPT = """
+ The following list is the content for an 'access control list' of Databricks Jobs:
+ {acc_control_list}
+ Each entry of the 'access_control_list' denotes either a group or user and their permission level.
+ {permission_mod} (This permission is not inherited!):
+ """
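The request prompts above instruct the LLM to emit the url, METHOD (and, in the second variant, BODY) joined with `|` and nothing else. A downstream chain then has to split that string back apart; a hedged sketch of such a parser (the real parsing lives in the custom chains, and `parse_request` here is a hypothetical helper):

```python
# Sketch of parsing the `|`-joined output requested by
# API_REQUEST_PROMPT_TEMPLATE / ..._TEMPLATE2: "url|METHOD" or
# "url|METHOD|BODY". Whitespace around the separators is stripped.

def parse_request(llm_output: str):
    parts = [p.strip() for p in llm_output.strip().split("|")]
    url, method = parts[0], parts[1]
    body = parts[2] if len(parts) > 2 else ""
    return url, method, body

url, method, body = parse_request(
    "https://adb-123.azuredatabricks.net/api/2.1/jobs/list | GET"
)
print(method)  # GET
```

This also illustrates why the prompts end with "DO NOT GIVE ANY EXPLANATION": any extra prose from the model would corrupt the `|`-delimited structure.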
app/custom_langchain.py ADDED
@@ -0,0 +1,104 @@
+ import json
+ from typing import Any, Dict, Optional
+
+ from langchain.chains import APIChain
+ from langchain.callbacks.manager import (
+     CallbackManagerForChainRun,
+ )
+
+
+ class ModAPIChain(APIChain):
+     """Chain that makes API calls and summarizes the responses to answer a question,
+     but doesn't display any API outputs, for confidentiality (the only difference to APIChain)."""
+
+     def _call(
+         self,
+         inputs: Dict[str, Any],
+         run_manager: Optional[CallbackManagerForChainRun] = None,
+     ) -> Dict[str, str]:
+         _run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager()
+         question = inputs[self.question_key]
+         api_url = self.api_request_chain.predict(
+             question=question,
+             api_docs=self.api_docs,
+             callbacks=_run_manager.get_child(),
+         )
+         _run_manager.on_text(api_url, color="green", end="\n", verbose=self.verbose)
+         api_response = self.requests_wrapper.get(api_url)
+         answer = self.api_answer_chain.predict(
+             question=question,
+             api_docs=self.api_docs,
+             api_url=api_url,
+             api_response=api_response,
+             callbacks=_run_manager.get_child(),
+         )
+         return {self.output_key: answer}
+
+
+ class APIResponse(APIChain):
+     """Chain that makes an API call and returns the raw API response (no summarization)."""
+
+     def _call(
+         self,
+         inputs: Dict[str, Any],
+         run_manager: Optional[CallbackManagerForChainRun] = None,
+     ) -> Dict[str, str]:
+         _run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager()
+         question = inputs[self.question_key]
+         api_url = self.api_request_chain.predict(
+             question=question,
+             api_docs=self.api_docs,
+             callbacks=_run_manager.get_child(),
+         )
+         _run_manager.on_text(api_url, color="green", end="\n", verbose=self.verbose)
+         api_response = self.requests_wrapper.get(api_url)
+         return {self.output_key: api_response}
+
+
+ class FlexAPIChain(APIChain):
+     """
+     Flexible API chain that can create all request types,
+     whereby the body is passed as an attribute. Relies on
+     specific prompt templates.
+     """
+
+     body: dict = {}
+
+     def _call(
+         self,
+         inputs: Dict[str, str],
+         run_manager: Optional[CallbackManagerForChainRun] = None,
+     ) -> Dict[str, str]:
+         question = inputs[self.question_key]
+         request_info = self.api_request_chain.predict(
+             question=question, api_docs=self.api_docs
+         )
+         _run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager()
+
+         api_url, request_method = request_info.split('|')
+         _run_manager.on_text(request_method, color="green", end="\n", verbose=self.verbose)
+         _run_manager.on_text(api_url, color="green", end="\n", verbose=self.verbose)
+
+         # Look up the requests-wrapper method matching the request method name
+         request_func = getattr(self.requests_wrapper, request_method.lower().strip())
+
+         api_response = request_func(api_url, self.body)
+         return {self.output_key: api_response}
+
+
+ class FlexAPIChainPayload(APIChain):
+     """
+     Flexible API chain that can create all request types,
+     including the necessary payload. Relies on
+     specific prompt templates.
+     """
+
+     def _call(
+         self,
+         inputs: Dict[str, str],
+         run_manager: Optional[CallbackManagerForChainRun] = None,
+     ) -> Dict[str, str]:
+         question = inputs[self.question_key]
+         request_info = self.api_request_chain.predict(
+             question=question, api_docs=self.api_docs
+         )
+         _run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager()
+
+         api_url, request_method, body = request_info.split('|')
+         _run_manager.on_text(body, color="green", end="\n", verbose=self.verbose)
+         _run_manager.on_text(request_method, color="green", end="\n", verbose=self.verbose)
+         _run_manager.on_text(api_url, color="green", end="\n", verbose=self.verbose)
+
+         # Look up the requests-wrapper method matching the request method name
+         request_func = getattr(self.requests_wrapper, request_method.lower().strip())
+
+         api_response = request_func(api_url, json.loads(body))
+         return {self.output_key: api_response}
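The getattr-based dispatch used by FlexAPIChain and FlexAPIChainPayload can be sketched in isolation. `DummyRequestsWrapper` below is a hypothetical stand-in for the real requests wrapper, used only to show how the method name produced by the LLM selects the matching wrapper method:

```python
# Hypothetical stand-in for the chain's requests wrapper; only here to
# illustrate getattr-based dispatch on the LLM-chosen request method.
class DummyRequestsWrapper:
    def get(self, url, body):
        return f"GET {url}"

    def post(self, url, body):
        return f"POST {url} with {body}"


wrapper = DummyRequestsWrapper()
request_method = " POST \n"  # raw LLM output often carries stray whitespace

# Normalize the name, then look up the matching method on the wrapper.
request_func = getattr(wrapper, request_method.lower().strip())
result = request_func("https://example.net/api", {"job_id": 1})
```

Normalizing with `.lower().strip()` before the lookup is what keeps a slightly messy model output (extra spaces, trailing newline) from raising an `AttributeError`.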
app/dbr_api_docs/get_permissions.txt ADDED
@@ -0,0 +1,21 @@
+ Endpoint PREFIX : https://{DATABRICKS_HOST}/
+
+
+ API Endpoint: Get Object Permissions
+
+ Path:
+ GET /api/2.0/permissions/{request_object_type}/{request_object_id}
+
+ Path Parameters:
+ - request_object_type (required, string)
+ - request_object_id (required, string)
+
+ Responses:
+ - 200
+   - object_id (string)
+   - object_type (string)
+   - access_control_list (Array of objects):
+     - user_name (string): Name of the user
+     - group_name (string): Name of the group
+     - service_principal_name (string): Name of the service principal
+     - all_permissions (Array of objects): All permissions
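As a quick illustration, the path parameters above slot into the endpoint prefix with plain string formatting; the host and object values below are hypothetical example values:

```python
# Hypothetical example values for the documented path parameters.
DATABRICKS_HOST = "adb-1234567890123456.7.azuredatabricks.net"
request_object_type = "jobs"
request_object_id = "123"

# Fill the documented path template into the endpoint prefix.
url = (
    f"https://{DATABRICKS_HOST}/api/2.0/permissions/"
    f"{request_object_type}/{request_object_id}"
)
```

This is the shape of URL the request chain is expected to produce from this doc for a question about job permissions.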
app/dbr_api_docs/jobs_cancel_now.txt ADDED
@@ -0,0 +1,16 @@
+ Endpoint PREFIX : https://{DATABRICKS_HOST}/
+
+
+ ## Cancel a Job Run
+
+ **POST** `/api/2.1/jobs/runs/cancel`
+
+ Cancels a job run. The run is canceled asynchronously, so it may still be running when this request completes.
+
+ **Parameters**
+
+ - `run_id` (required, integer, int64): This field is required.
+
+ **Responses**
+
+ - `200` Run was cancelled successfully.
app/dbr_api_docs/jobs_create.txt ADDED
@@ -0,0 +1,13 @@
+ Endpoint PREFIX : https://{DATABRICKS_HOST}/
+
+
+ **POST /api/2.1/jobs/create**
+ Create a new job.
+
+ **Payload**
+ A valid job dictionary.
+
+ **Responses**
+ - 200 Job was created successfully
+   - job_id: integer, int64. The canonical identifier for the newly created job.
app/dbr_api_docs/jobs_get.txt ADDED
@@ -0,0 +1,21 @@
+ Endpoint PREFIX : https://{DATABRICKS_HOST}/
+
+
+ ## Get a Single Job
+
+ **GET** `/api/2.1/jobs/get`
+
+ Retrieves the details for a single job.
+
+ **Query Parameters**
+
+ - `job_id` (required, integer, int64): The canonical identifier of the job to retrieve information about. This field is required.
+
+ **Responses**
+
+ - `200` Job was retrieved successfully.
+   - `created_time` (integer, int64): The time at which this job was created in epoch milliseconds (milliseconds since 1/1/1970 UTC).
+   - `job_id` (integer, int64): The canonical identifier for this job.
+   - `trigger_history` (object): History of the file arrival trigger associated with the job.
+   - `creator_user_name` (string): The creator user name. This field won’t be included in the response if the user has already been deleted.
+   - `run_as_user_name` (string): The user name that the job runs as. run_as_user_name is based on the current job settings, and is set to the creator of the job if job access control is disabled, or the is_owner permission if job access control is enabled.
app/dbr_api_docs/jobs_list.txt ADDED
@@ -0,0 +1,23 @@
+ Endpoint PREFIX : https://{DATABRICKS_HOST}/
+
+
+ **List all jobs**
+
+ **GET** /api/2.1/jobs/list
+
+ Retrieves a list of jobs.
+
+ **Query Parameters**
+
+ - `limit`: integer [1..25], default: 20. The number of jobs to return. This value must be greater than 0 and less than or equal to 25.
+ - `offset`: integer, default: 0. The offset of the first job to return, relative to the most recently created job.
+ - `name`: string. A filter on the list based on the exact (case insensitive) job name.
+ - `expand_tasks`: boolean, default: false. Whether to include task and cluster details in the response.
+
+ **Responses**
+
+ - 200 List of jobs was retrieved successfully.
+   - `jobs`: Array of objects. The list of jobs.
+     - `job_id`: integer, int64. The canonical identifier for this job.
+     - `creator_user_name`: string. The creator user name. This field won’t be included in the response if the user has already been deleted.
+     - `settings`: object
app/dbr_api_docs/jobs_run_now.txt ADDED
@@ -0,0 +1,13 @@
+ Endpoint PREFIX : https://{DATABRICKS_HOST}/
+
+
+ POST /api/2.1/jobs/run-now
+ Run a job and return the run_id of the triggered run.
+
+ Parameters:
+ - job_id (required, integer, int64): The ID of the job to be executed
+
+ Responses:
+ - 200 Run was started successfully.
+   - run_id (integer, int64): The globally unique ID of the newly triggered run.
+   - number_in_job (integer, int64): Deprecated. A unique identifier for this job run. This is set to the same value as run_id.
app/dbr_api_docs/mlflow_registered_models_get.txt ADDED
@@ -0,0 +1,24 @@
+ Endpoint PREFIX : https://{DATABRICKS_HOST}/
+
+
+ ## API Documentation
+ **Endpoint**
+
+ GET /api/2.0/mlflow/databricks/registered-models/get
+
+ **Query Parameters**
+
+ - `name` (required): string, Registered model unique name identifier.
+
+ **Responses**
+
+ - `200` Model details were returned successfully.
+
+ **Response Body**
+
+ - `registered_model` (object):
+   - `name` (string): [1..100] characters, Name of the model.
+   - `description` (string): <= 65535 characters, User-specified description for the object.
+   - `permission_level` (string): Enum: "CAN_MANAGE" "CAN_EDIT" "CAN_READ" "CAN_MANAGE_STAGING_VERSIONS" "CAN_MANAGE_PRODUCTION_VERSIONS", Permission level of the requesting user on the object. For what is allowed at each level, see MLflow Model permissions.
+   - `tags` (Array of objects): Array of tags associated with the model.
+   - `latest_versions` (Array of objects): Array of model versions, each the latest version for its stage.
app/dbr_api_docs/mlflow_runs_get.txt ADDED
@@ -0,0 +1,24 @@
+ Endpoint PREFIX : https://{DATABRICKS_HOST}/
+
+
+ ## API Documentation
+
+ **Get a run**
+
+ **GET** `/api/2.0/mlflow/runs/get`
+
+ Gets the metadata, metrics, params, and tags for a run. In the case where multiple metrics with the same key are logged for a run, return only the value with the latest timestamp. If there are multiple values with the latest timestamp, return the maximum of these values.
+
+ **Query Parameters**
+
+ - `run_id` (required, string): ID of the run to fetch. Must be provided.
+ - `run_uuid` (string): [Deprecated, use run_id instead] ID of the run to fetch. This field will be removed in a future MLflow version.
+
+ **Responses**
+
+ - `200`:
+   - `run` (object): Run metadata (name, start time, etc) and data (metrics, params, and tags).
+     - `info` (object): Run metadata.
+     - `data` (object): Run data.
+       - `metrics` (Array of objects): Run metrics.
+       - `params` (Array of objects): Run parameters.
app/dbr_api_docs/mlflow_transition_stage.txt ADDED
@@ -0,0 +1,16 @@
+ Endpoint PREFIX : https://{DATABRICKS_HOST}/
+
+
+ API Endpoint: POST /api/2.0/mlflow/databricks/model-versions/transition-stage
+
+ Description: Transition a model version's stage. This is a Databricks workspace version of the MLflow endpoint that also accepts a comment associated with the transition to be recorded.
+
+ Parameters:
+ - stage (required, string, Enum: "None" "Staging" "Production" "Archived"): Target stage of the transition. Valid values are: None (The initial stage of a model version), Staging (Staging or pre-production stage), Production (Production stage), Archived (Archived stage).
+ - name (required, string, [1..100] characters): Name of the model.
+ - archive_existing_versions (required, boolean): Specifies whether to archive all current model versions in the target stage.
+ - version (required, string): Version of the model.
+ - comment (string, [1..65535] characters): User-provided comment on the action.
+
+ Responses:
+ - 200: Model version's stage was updated successfully.
app/dbr_api_docs/update_permissions.txt ADDED
@@ -0,0 +1,12 @@
+ Endpoint PREFIX : https://{DATABRICKS_HOST}/
+
+
+ ### Update Permission
+
+ **PATCH** `/api/2.0/permissions/{request_object_type}/{request_object_id}`
+
+ Updates the permissions on an object. Payload needs to be a valid access control list.
+
+ **Responses**
+
+ - `200` - Permissions updated successfully.
app/dbr_api_docs_raw/get_permissions.txt ADDED
@@ -0,0 +1,40 @@
+ Get object permissions
+ GET
+ /api/2.0/permissions/{request_object_type}/{request_object_id}
+ Gets the permission of an object. Objects can inherit permissions from their parent objects or root objects.
+
+ Path Parameters
+ request_object_type
+ required
+ string
+ request_object_id
+ required
+ string
+ Responses
+ 200
+ object_id
+ string
+
+ object_type
+ string
+
+ access_control_list
+ Array of objects
+ Array[
+ user_name
+ string
+ name of the user
+
+ group_name
+ string
+ name of the group
+
+ service_principal_name
+ string
+ name of the service principal
+
+ all_permissions
+ Array of objects
+ All permissions.
+ ]
+
app/dbr_api_docs_raw/jobs_cancel_now.txt ADDED
@@ -0,0 +1,13 @@
+ Cancel a job run
+ POST
+ /api/2.1/jobs/runs/cancel
+ Cancels a job run. The run is canceled asynchronously, so it may still be running when this request completes.
+
+ run_id
+ required
+ integer
+ int64
+ This field is required.
+
+ Responses
+ 200 Run was cancelled successfully.
app/dbr_api_docs_raw/jobs_create.txt ADDED
@@ -0,0 +1,13 @@
+ Create a new job
+ POST
+ /api/2.1/jobs/create
+ Create a new job.
+
+ The payload needs to be a valid job dictionary.
+
+ Responses
+ 200 Job was created successfully
+ job_id
+ integer
+ int64
+ The canonical identifier for the newly created job.
app/dbr_api_docs_raw/jobs_get.txt ADDED
@@ -0,0 +1,39 @@
+ Get a single job
+ GET
+ /api/2.1/jobs/get
+ Retrieves the details for a single job.
+
+ Query Parameters
+ job_id
+ required
+ integer
+ int64
+ The canonical identifier of the job to retrieve information about. This field is required.
+
+ Responses
+ 200 Job was retrieved successfully.
+ created_time
+ integer
+ int64
+ The time at which this job was created in epoch milliseconds (milliseconds since 1/1/1970 UTC).
+
+ job_id
+ integer
+ int64
+ The canonical identifier for this job.
+
+ trigger_history
+ object
+ History of the file arrival trigger associated with the job.
+
+ creator_user_name
+ string
+ The creator user name. This field won’t be included in the response if the user has already been deleted.
+
+ run_as_user_name
+ string
+ The user name that the job runs as. run_as_user_name is based on the current job settings, and is set to the creator of the job if job access control is disabled, or the is_owner permission if job access control is enabled.
+
+ settings
+ object
+ Settings for this job and all of its runs. These settings can be updated using the resetJob method.
app/dbr_api_docs_raw/jobs_list.txt ADDED
@@ -0,0 +1,53 @@
+ List all jobs
+ GET
+ /api/2.1/jobs/list
+ Retrieves a list of jobs.
+
+ Query Parameters
+ limit
+ integer
+ [ 1 .. 25 ]
+ Default: 20
+ The number of jobs to return. This value must be greater than 0 and less or equal to 25. The default value is 20.
+
+ offset
+ integer
+ Default: 0
+ The offset of the first job to return, relative to the most recently created job.
+
+ name
+ string
+ A filter on the list based on the exact (case insensitive) job name.
+
+ expand_tasks
+ boolean
+ Default: false
+ Whether to include task and cluster details in the response.
+
+ Responses
+ 200 List of jobs was retrieved successfully.
+ jobs
+ Array of objects
+ The list of jobs.
+
+ Array
+ job_id
+ integer
+ int64
+ The canonical identifier for this job.
+
+ creator_user_name
+ string
+ The creator user name. This field won’t be included in the response if the user has already been deleted.
+
+ settings
+ object
+ Settings for this job and all of its runs. These settings can be updated using the resetJob method.
+
+ created_time
+ integer
+ int64
+ The time at which this job was created in epoch milliseconds (milliseconds since 1/1/1970 UTC).
+
+ has_more
+ boolean
app/dbr_api_docs_raw/jobs_run_now.txt ADDED
@@ -0,0 +1,23 @@
+ Trigger a new job run
+ POST
+ /api/2.1/jobs/run-now
+ Run a job and return the run_id of the triggered run.
+
+ job_id
+ required
+ integer
+ int64
+ The ID of the job to be executed
+
+ Responses
+ 200 Run was started successfully.
+ run_id
+ integer
+ int64
+ The globally unique ID of the newly triggered run.
+
+ number_in_job
+ integer
+ int64
+ Deprecated
+ A unique identifier for this job run. This is set to the same value as run_id.
app/dbr_api_docs_raw/mlflow_registered_models_get.txt ADDED
@@ -0,0 +1,57 @@
+ Get model
+ GET
+ /api/2.0/mlflow/databricks/registered-models/get
+ Get the details of a model. This is a Databricks workspace version of the MLflow endpoint that also returns the model's Databricks workspace ID and the permission level of the requesting user on the model.
+
+ Query Parameters
+ name
+ required
+ string
+ Registered model unique name identifier.
+
+ Responses
+ 200 Model details were returned successfully.
+ registered_model
+ object
+ name
+ string
+ [ 1 .. 100 ] characters
+ Name of the model.
+
+ description
+ string
+ <= 65535 characters
+ User-specified description for the object.
+
+ permission_level
+ string
+ Enum: "CAN_MANAGE" "CAN_EDIT" "CAN_READ" "CAN_MANAGE_STAGING_VERSIONS" "CAN_MANAGE_PRODUCTION_VERSIONS"
+ Permission level of the requesting user on the object. For what is allowed at each level, see MLflow Model permissions.
+
+ tags
+ Array of objects
+ Array of tags associated with the model.
+
+ latest_versions
+ Array of objects
+ Array of model versions, each the latest version for its stage.
+
+ user_id
+ string
+ email
+ The username of the user that created the object.
+
+ creation_timestamp
+ integer
+ int64
+ Creation time of the object, as a Unix timestamp in milliseconds.
+
+ id
+ string
+ uuid
+ Unique identifier for the object.
+
+ last_updated_timestamp
+ integer
+ int64
+ Time of the object at last update, as a Unix timestamp in milliseconds.
app/dbr_api_docs_raw/mlflow_runs_get.txt ADDED
@@ -0,0 +1,47 @@
+ Get a run
+ GET
+ /api/2.0/mlflow/runs/get
+ Gets the metadata, metrics, params, and tags for a run. In the case where multiple metrics with the same key are logged for a run, return only the value with the latest timestamp.
+
+ If there are multiple values with the latest timestamp, return the maximum of these values.
+
+ Query Parameters
+ run_id
+ required
+ string
+ ID of the run to fetch. Must be provided.
+
+ run_uuid
+ string
+ Deprecated
+ [Deprecated, use run_id instead] ID of the run to fetch. This field will be removed in a future MLflow version.
+
+ Responses
+ 200
+ run
+ object
+ Run metadata (name, start time, etc) and data (metrics, params, and tags).
+
+ info
+ object
+ Run metadata.
+
+ data
+ object
+ Run data.
+
+ metrics
+ Array of objects
+ Run metrics.
+
+ params
+ Array of objects
+ Run parameters.
+
+ tags
+ Array of objects
+ Additional metadata key-value pairs.
+
+ inputs
+ object
+ Run inputs.
app/dbr_api_docs_raw/mlflow_transition_stage.txt ADDED
@@ -0,0 +1,44 @@
+ Transition a stage
+ POST
+ /api/2.0/mlflow/databricks/model-versions/transition-stage
+ Transition a model version's stage. This is a Databricks workspace version of the MLflow endpoint that also accepts a comment associated with the transition to be recorded.
+
+ Details required to transition a model version's stage.
+
+ stage
+ required
+ string
+ Enum: "None" "Staging" "Production" "Archived"
+ Target stage of the transition. Valid values are:
+
+ None: The initial stage of a model version.
+
+ Staging: Staging or pre-production stage.
+
+ Production: Production stage.
+
+ Archived: Archived stage.
+
+ name
+ required
+ string
+ [ 1 .. 100 ] characters
+ Name of the model.
+
+ archive_existing_versions
+ required
+ boolean
+ Specifies whether to archive all current model versions in the target stage.
+
+ version
+ required
+ string
+ Version of the model.
+
+ comment
+ string
+ [ 1 .. 65535 ] characters
+ User-provided comment on the action.
+
+ Responses
+ 200 Model version's stage was updated successfully.
app/dbr_api_docs_raw/update_permissions.txt ADDED
@@ -0,0 +1,7 @@
+ Update permission
+ PATCH
+ /api/2.0/permissions/{request_object_type}/{request_object_id}
+ Updates the permissions on an object. Payload needs to be a valid access control list.
+
+ Responses
+ 200
app/llm.yaml ADDED
@@ -0,0 +1,5 @@
+ # Model configuration
+ model:
+   model_name: text-davinci-003
+   deployment_name: ptc-davinci-003
+   temperature: 0
app/utils.py ADDED
@@ -0,0 +1,46 @@
+ from langchain import LLMChain
+ from langchain.prompts.prompt import PromptTemplate
+ from langchain.chains.api.prompt import API_URL_PROMPT_TEMPLATE
+
+
+ def find_value(dictionary: dict, key):
+     """
+     Recursive function to return the value of a key in a (nested) dictionary,
+     or None if the key is absent.
+     (Code was also created by an LLM)
+     """
+     if key in dictionary:
+         return dictionary[key]
+     for k, v in dictionary.items():
+         if isinstance(v, dict):
+             item = find_value(v, key)
+             if item is not None:
+                 return item
+
+
+ def create_custom_response_template(custom_resp: str):
+     # Customized prompt: append the user-supplied response instruction
+     # to the standard API response template.
+     API_RESPONSE_PROMPT_TEMPLATE = (
+         API_URL_PROMPT_TEMPLATE
+         + """ {api_url}
+
+ Here is the response from the API:
+
+ {api_response}""" + f"\n\n{custom_resp}"
+     )
+
+     return PromptTemplate(
+         input_variables=["api_docs", "question", "api_url", "api_response"],
+         template=API_RESPONSE_PROMPT_TEMPLATE,
+     )
+
+
+ def custom_api_prompt(llm, chain, question, custom_resp):
+     """
+     Helper function for separating the API-link creation from the post-processing of the response.
+     """
+     resp_prompt = create_custom_response_template(custom_resp)
+     get_answer_chain = LLMChain(llm=llm, prompt=resp_prompt)
+     chain.api_answer_chain = get_answer_chain
+     return chain.run(question)
+
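A usage sketch for find_value, restated here so the demo is self-contained: the first match found while walking nested dicts depth-first is returned, and the result is None when the key is absent anywhere in the structure.

```python
def find_value(dictionary: dict, key):
    # Depth-first search through nested dicts; returns None when absent.
    if key in dictionary:
        return dictionary[key]
    for v in dictionary.values():
        if isinstance(v, dict):
            item = find_value(v, key)
            if item is not None:
                return item


# Shape loosely modelled on an MLflow runs/get response (example values).
run = {"run": {"data": {"metrics": [0.9, 0.92]}, "info": {"run_id": "abc"}}}

metrics = find_value(run, "metrics")          # found two levels deep
missing = find_value(run, "lifecycle_stage")  # absent everywhere -> None
```

This is why the API helpers can pull a single field out of a deeply nested JSON response without knowing the exact path in advance.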
assets/API_Chatter_Overview.PNG ADDED
databricks_llm - raw.py ADDED
@@ -0,0 +1,103 @@
+ import click
+ import logging
+ import sys
+ import yaml
+ import os
+ from langchain.llms import AzureOpenAI, OpenAI
+ from app.api_funcs import get_job_infos, get_run, get_model, \
+     trans_model, batch_mod_permission, prepare_api_docs
+ from pathlib import Path
+
+ logger = logging.getLogger(__name__)
+ logger.setLevel(logging.DEBUG)
+ handler = logging.StreamHandler(sys.stdout)
+ handler.setLevel(logging.DEBUG)
+ logger.addHandler(handler)
+
+ PATH = Path(os.path.abspath(os.path.dirname(__file__)))
+
+ # Open the YAML file
+ conf_path = PATH / "app" / 'llm.yaml'
+ with open(conf_path) as f:
+     config = yaml.safe_load(f)
+
+ # Use AzureOpenAI if the config contains a deployment name, otherwise OpenAI
+ if config['model'].get('deployment_name', False):
+     llm = AzureOpenAI(**config['model'])
+ else:
+     llm = OpenAI(**config['model'])
+
+
+ headers = {"Authorization": f"Bearer {os.getenv('DBR_BEARER_TOKEN')}"}
+ updated_api_docs = prepare_api_docs()
+
+
+ def comma_list(comma_str: str):
+     return comma_str.split(',')
+
+
+ def determine_api_text(updated_api_docs: dict, query: str):
+     pick_api_prompt = """Please return the file name from the list {api_docs} \
+ that best corresponds to the following query: {query}. \
+ DO NOT EXPLAIN your answer!
+ """
+     api_docs = os.listdir(PATH / "app" / "dbr_api_docs")
+     selected_api_doc = llm(pick_api_prompt.format(api_docs=api_docs, query=query)).strip()
+     logger.info(f"\nSelecting the following api document: {selected_api_doc}")
+     api_text = updated_api_docs[selected_api_doc]
+     return api_text, selected_api_doc
+
+
+ # Add subcommands for commands
+ @click.group()
+ def cli():
+     pass
+
+
+ @cli.group(help='Run machine learning model.')
+ def ml():
+     pass
+
+
+ # Add commands for specific subcommands of 'ml'
+ @ml.command(help='Get information about a model.')
+ @click.argument('query', type=str)
+ def get_model_info(query):
+     # Instruction to get model infos
+     api_text, _ = determine_api_text(updated_api_docs, query)
+     logger.info(get_model(llm, query, api_text, headers))
+
+
+ @ml.command(help='Get information about a model run.')
+ @click.argument('run_id', type=str)
+ @click.argument('query', type=str)
+ def get_run_info(query, run_id):
+     # ID of the model run for which you'd like information.
+     # Which information should be pulled from the run?
+     api_text, _ = determine_api_text(updated_api_docs, query)
+     logger.info(get_run(llm, run_id, query, api_text, headers))
+
+
+ @ml.command(help='Transition a model from one stage to another.')
+ @click.argument('query', type=str)
+ def transition_model(query):
+     # Instruction to transition a model.
+     api_text, _ = determine_api_text(updated_api_docs, query)
+     trans_model(llm, query, api_text, headers)
+
+
+ @cli.command(help='View job history.')
+ @click.argument('query', type=str)
+ def jobs(query):
+     if ";" not in query:
+         query = query + ";"
+     query, response_query = query.split(";")
+     api_text, _ = determine_api_text(updated_api_docs, query)
+     # The query for the LLM + an optional query for the API response
+     logger.info(get_job_infos(llm, query, response_query, api_text, headers))
+
+
+ @cli.command(help='Manage user permissions.')
+ @click.argument('query', type=str)
+ @click.argument('jobs', type=comma_list)
+ def permissions(jobs, query):
+     api_text, api_name = determine_api_text(updated_api_docs, query)
+     # Add/Get user permissions.
+     batch_mod_permission(
+         logger, llm, updated_api_docs, api_text, api_name, headers,
+         query, jobs=jobs
+     )
+
+
+ if __name__ == '__main__':
+     cli()
requirements.txt ADDED
@@ -0,0 +1,3 @@
+ click==8.1.2
+ langchain==0.0.171
+ openai==0.27.6
setup.py ADDED
@@ -0,0 +1,17 @@
+ from setuptools import setup, find_packages
+
+ setup(
+     name='databricks_llm',  # name of the package
+     version='0.0.1',  # version number
+     py_modules=['databricks_llm', 'app'],  # top-level modules shipped with the package
+     install_requires=[
+         'Click',
+     ],
+     packages=find_packages(),
+     include_package_data=True,
+     package_data={"app": ["dbr_api_docs/*.txt", "llm.yaml"]},
+     entry_points='''
+         [console_scripts]
+         databricks_llm=databricks_llm:cli
+     ''',
+ )