	---
	id: plugin_intro
	description: Plugin introduction
	slug: /plugin/plugin_intro
	---

	# Plugin Introduction

	Plugins are the units that TaskWeaver orchestrates. You can think of plugins as tools that the LLM uses to
	accomplish specific tasks.

	In TaskWeaver, each plugin is represented as a Python function that can be called within a code snippet.
	Orchestration is essentially the process of generating Python code snippets that call one or more plugins.
	A concrete example is pulling data from a database and applying anomaly detection to it. The generated code
	(simplified) looks as follows:

	```python
	df, data_description = sql_pull_data(query="pull data from time_series table")
	anomaly_df, anomaly_description = anomaly_detection(df, time_col_name="ts", value_col_name="val")
	```
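To make the call shape concrete, here is a minimal sketch in which both plugins are replaced by hypothetical stand-in functions. The stand-ins return fixed data rather than querying a database or running a detector (both are assumptions for illustration), but they show how each plugin call yields a result plus a description that the next line of generated code can use.

```python
from typing import Dict, List, Tuple

Rows = List[Dict[str, object]]


def sql_pull_data(query: str) -> Tuple[Rows, str]:
    # Hypothetical stand-in: returns fixed rows instead of querying a database.
    rows: Rows = [
        {"ts": "2024-01-01", "val": 1.0},
        {"ts": "2024-01-02", "val": 1.2},
    ]
    return rows, "Pulled {} rows for query: {}".format(len(rows), query)


def anomaly_detection(rows: Rows, time_col_name: str, value_col_name: str) -> Tuple[Rows, str]:
    # Hypothetical stand-in: marks every row as normal; it only shows the call shape.
    flagged = [dict(row, Is_Anomaly=False) for row in rows]
    return flagged, "There are 0 anomalies in the time series data"


# The generated snippet simply chains the two plugin calls.
df, data_description = sql_pull_data(query="pull data from time_series table")
anomaly_df, anomaly_description = anomaly_detection(df, time_col_name="ts", value_col_name="val")
print(data_description)
print(anomaly_description)
```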

	## Plugin Structure

	A plugin has two files:

	* Plugin Implementation: a Python file that defines the plugin
	* Plugin Schema: a YAML file that defines the schema of the plugin

	## Plugin Implementation

	The plugin function must be implemented in Python.
	To work with TaskWeaver's orchestration, a plugin Python file consists of two parts:

	- Plugin function implementation code
	- TaskWeaver plugin decorator

	The following code shows the anomaly detection plugin as an example:

	```python
	import pandas as pd
	from pandas.api.types import is_numeric_dtype

	from taskweaver.plugin import Plugin, register_plugin


	@register_plugin
	class AnomalyDetectionPlugin(Plugin):
	    def __call__(self, df: pd.DataFrame, time_col_name: str, value_col_name: str):
	        """
	        anomaly_detection function identifies anomalies from an input dataframe of time series.
	        It will add a new column "Is_Anomaly", where each entry will be marked with "True" if the value is an anomaly
	        or "False" otherwise.

	        :param df: the input data, must be a dataframe
	        :param time_col_name: name of the column that contains the datetime
	        :param value_col_name: name of the column that contains the numeric values.
	        :return df: a new df that adds an additional "Is_Anomaly" column based on the input df.
	        :return description: the description about the anomaly detection results.
	        """
	        try:
	            df[time_col_name] = pd.to_datetime(df[time_col_name])
	        except Exception:
	            print("Time column is not datetime")
	            return

	        if not is_numeric_dtype(df[value_col_name]):
	            try:
	                df[value_col_name] = df[value_col_name].astype(float)
	            except ValueError:
	                print("Value column is not numeric")
	                return

	        # 3-sigma rule: values more than three standard deviations from the mean are anomalies
	        mean, std = df[value_col_name].mean(), df[value_col_name].std()
	        cutoff = std * 3
	        lower, upper = mean - cutoff, mean + cutoff
	        df["Is_Anomaly"] = df[value_col_name].apply(lambda x: x < lower or x > upper)
	        anomaly_count = df["Is_Anomaly"].sum()
	        description = "There are {} anomalies in the time series data".format(anomaly_count)

	        self.ctx.add_artifact(
	            name="anomaly_detection_results",  # a brief description of the artifact
	            file_name="anomaly_detection_results.csv",  # artifact file name
	            type="df",  # artifact data type, support chart/df/file/txt/svg
	            val=df,  # variable to be dumped
	        )

	        return df, description

	```

	Follow these steps to implement your own plugin:

	1. Import the TaskWeaver plugin base class and decorator: `from taskweaver.plugin import Plugin, register_plugin`.
	2. Create your plugin class, inheriting from the `Plugin` parent class (e.g., `AnomalyDetectionPlugin(Plugin)`), and
	decorate it with `@register_plugin`.
	3. Implement your plugin function in the `__call__` method of the plugin class. **Most importantly, you must
	include a description of the execution results in the return values of your plugin function.** The LLM uses
	this description to summarize your execution results.

	> 💡 A key difference between a plugin implementation and a normal Python function is that a plugin always returns a
	> description of the result in natural language. Because LLMs only understand natural language, the model needs this
	> description to know what the execution result is. In the example implementation above, the description states how
	> many anomalies were detected. Behind the scenes, only the description is passed to the LLM. In contrast, the
	> execution result (e.g., `df` in the example above) is not handled by the LLM.
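The split between the execution result and its description can be illustrated with a standard-library version of the same 3-sigma rule. This is a sketch under stated assumptions: `detect_anomalies` and `message_for_llm` are hypothetical names, plain lists replace the DataFrame, and the last line only mimics the idea that the text alone is forwarded to the model.

```python
import statistics
from typing import List, Tuple


def detect_anomalies(values: List[float]) -> Tuple[List[bool], str]:
    """Flag values more than three sample standard deviations from the mean."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation
    lower, upper = mean - 3 * std, mean + 3 * std
    flags = [v < lower or v > upper for v in values]
    description = "There are {} anomalies in the time series data".format(sum(flags))
    return flags, description


# Twenty ordinary readings and one obvious outlier.
values = [10.0] * 20 + [100.0]
flags, description = detect_anomalies(values)

# Assumption for illustration: only the text goes to the model,
# while the flags stay in the execution environment for later code to use.
message_for_llm = description
print(message_for_llm)
```

Note that with very few data points a single outlier inflates the standard deviation so much that it cannot exceed the 3-sigma bound, which is why the example uses 21 values.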

	### Important Notes

	1. If your plugin depends on additional libraries or packages, make sure they are installed before you run the
	plugin.

	2. If you want to persist intermediate results, such as data, figures, or prompts, from your plugin implementation,
	TaskWeaver provides an `add_artifact` API that stores these results in the workspace. In the example above, the
	anomaly detection results are saved as a CSV artifact by calling `add_artifact`. The artifacts are stored in
	the `project/workspace/session_id/cwd` folder in the project directory.

	```python
	self.ctx.add_artifact(
	    name="anomaly_detection_results",  # a brief description of the artifact
	    file_name="anomaly_detection_results.csv",  # artifact file name
	    type="df",  # artifact data type, support chart/df/file/txt/svg
	    val=df,  # variable to be dumped
	)
	```
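When exercising a plugin outside TaskWeaver, for example in a unit test, a small test double can take the place of `self.ctx`. The sketch below is one possible shape for it. `FakeContext` is a hypothetical class, it only handles the `txt` type, and its behavior (writing the value to a file in a given directory and recording the call) is an assumption for illustration, not a description of how TaskWeaver stores artifacts.

```python
import tempfile
from pathlib import Path
from typing import Any, Dict, List


class FakeContext:
    """Hypothetical test double; it is not TaskWeaver's real context class."""

    def __init__(self, cwd: Path) -> None:
        self.cwd = cwd
        self.artifacts: List[Dict[str, str]] = []

    def add_artifact(self, name: str, file_name: str, type: str, val: Any) -> None:
        # Assumption: only the "txt" type is handled, by writing str(val) to a file.
        if type != "txt":
            raise ValueError("this sketch only handles type='txt', got {!r}".format(type))
        path = self.cwd / file_name
        path.write_text(str(val), encoding="utf-8")
        self.artifacts.append({"name": name, "file_name": file_name, "type": type})


with tempfile.TemporaryDirectory() as tmp:
    ctx = FakeContext(Path(tmp))
    ctx.add_artifact(
        name="anomaly_summary",
        file_name="anomaly_summary.txt",
        type="txt",
        val="There are 1 anomalies in the time series data",
    )
    # Read the file back while the temporary directory still exists.
    saved_text = (Path(tmp) / "anomaly_summary.txt").read_text(encoding="utf-8")
    recorded = list(ctx.artifacts)
```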

	## Plugin Schema

	The plugin schema is composed of several parts:

	1. name: The main function name of the Python code.
	2. enabled: Determines whether the plugin is enabled for selection during conversations. The default value is true.
	3. description: A brief description that introduces the plugin function.
	4. parameters: Lists all input parameters. Each entry includes the parameter's name, type,
	whether it is required or optional, and a description providing more details about the parameter.
	5. returns: Lists all return values. Each entry includes the return value's name, type, and
	a description of the value that the function returns.

	Note: Adding any extra fields to the plugin schema causes a validation failure.

	The plugin schema must be written in YAML. Here is the schema for the anomaly detection plugin above:

	```yaml
	name: anomaly_detection
	enabled: true
	required: false
	description: >-
	  anomaly_detection function identifies anomalies from an input DataFrame of
	  time series. It will add a new column "Is_Anomaly", where each entry will be
	  marked with "True" if the value is an anomaly or "False" otherwise.

	parameters:
	  - name: df
	    type: DataFrame
	    required: true
	    description: >-
	      the input data from which we can identify the anomalies with the 3-sigma
	      algorithm.
	  - name: time_col_name
	    type: str
	    required: true
	    description: name of the column that contains the datetime
	  - name: value_col_name
	    type: str
	    required: true
	    description: name of the column that contains the numeric values.

	returns:
	  - name: df
	    type: DataFrame
	    description: >-
	      This DataFrame extends the input DataFrame with a newly-added column
	      "Is_Anomaly" containing the anomaly detection result.
	  - name: description
	    type: str
	    description: This is a string describing the anomaly detection results.

	```

	In addition, the schema supports two optional fields:

	1. code: When multiple plugins map to the same Python code (i.e., the plugin name is different from the
	code name), specify the code name (code file) explicitly in the plugin schema.
	2. configurations: When shared code requires different configuration parameters for different
	plugins, specify these configuration parameters in the plugin schema.
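As a rough illustration of the "no extra fields" rule and the two optional fields, the sketch below checks an already-parsed schema (a plain dictionary) against the field names mentioned in this document. It is not TaskWeaver's validator: the `check_schema` helper, the split into required and optional sets, and the sample `code` and `configurations` values are all assumptions made for the example.

```python
from typing import Any, Dict, List

# Assumption: field names taken from this document; the required/optional split is a guess.
REQUIRED_FIELDS = {"name", "description", "parameters", "returns"}
OPTIONAL_FIELDS = {"enabled", "required", "code", "configurations"}


def check_schema(schema: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the top-level fields look fine."""
    problems = []
    for key in sorted(set(schema) - (REQUIRED_FIELDS | OPTIONAL_FIELDS)):
        problems.append("unexpected field: {}".format(key))
    for key in sorted(REQUIRED_FIELDS - set(schema)):
        problems.append("missing field: {}".format(key))
    return problems


# Hypothetical schema for a second plugin that reuses the anomaly detection code file.
strict_anomaly_detection = {
    "name": "strict_anomaly_detection",
    "code": "anomaly_detection",               # hypothetical shared code file name
    "configurations": {"cutoff_sigma": 2},     # hypothetical configuration parameter
    "description": "Flags anomalies with a tighter cutoff.",
    "parameters": [{"name": "df", "type": "DataFrame", "required": True, "description": "input data"}],
    "returns": [{"name": "description", "type": "str", "description": "summary of the results"}],
}

with_extra_field = dict(strict_anomaly_detection, author="someone")

print(check_schema(strict_anomaly_detection))
print(check_schema(with_extra_field))
```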