---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Responsible-AI-Moderation Layer

## Table of Contents

- [Introduction](#introduction)
- [Features](#features)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Set Configuration Variables](#set-configuration-variables)
- [Running the Application](#running-the-application)
- [License](#license)
- [Contact](#contact)

## Introduction
The **Moderation Layer** module provides model-based and template-based guardrails that moderate inputs by passing them through checks such as toxicity, profanity, prompt injection, jailbreak, and privacy. The module also provides additional features such as chain of thought, chain of verification, token importance, and LLM explanation.

This application is built with the Flask web framework. Leveraging the flexibility and scalability of Flask, it provides the features described below.

## Features
1. **Model-based guardrail** : Traditional AI models and libraries test the input against checks such as prompt injection, jailbreak, and toxicity.

2. **Template-based guardrail** : A prompt template is used for each evaluation check (prompt injection, jailbreak, etc.). Prompt templates are a way to define reusable structures for prompting LLMs: a base prompt contains placeholders that can be filled with different values to generate specific outputs.

3. **Translate option** : Because our model-based guardrails are optimized for English, we offer a translator option. When prompts are in languages other than English, the translator converts them to English before passing them to the guardrails, which also provides extra protection against multilingual jailbreak attacks.

4. **Multimodal** : The multimodal capability of GPT-4o is combined with our model-based and template-based guardrails to perform moderation checks such as prompt injection and jailbreak on uploaded text and images, respectively.

5. **Response comparison** : We also provide a comparison between the plain LLM output and the output from the LLM with our guardrails applied.

6. **Multiple LLM support** : Various LLMs are supported for generating responses and performing template-based checks: gpt-4o-mini, gpt-35-turbo, GPT-4-Turbo, Llama3-70b, anthropic.claude-3-sonnet, gemini-2.5-pro, and gemini-2.5-flash. You need access to at least one of these models.

## Prerequisites

1. Before installing the repo for the Moderation Layer, first install the repo for the Moderation Models.
Please find the link for the [Moderation Model](https://github.com/Infosys/Infosys-Responsible-AI-Toolkit/tree/main/responsible-ai-ModerationModel).
If you want to use template-based guardrails, also set up the [Admin Module](https://github.com/Infosys/Infosys-Responsible-AI-Toolkit/tree/master/responsible-ai-admin).

2. **Installation of Python** : Install Python (version 3.11.x) from the [official website](https://www.python.org/downloads/) and ensure it is added to your system PATH.

3. **Installation of pip** :

**Linux:**
1. Check if pip is already installed: open a terminal and run the following command:
```sh
pip --version
```
2. If pip is not installed, you'll see an error message.
3. Install pip using get-pip.py: download the `get-pip.py` script as follows:
```sh
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
```
4. Run the script with Python:
```sh
python3 get-pip.py
```
5. This will install pip and its dependencies.

**Windows:**
1. Download the Python installer: visit the official [Python website](https://www.python.org/downloads/) and download the latest Python installer for Windows.

2. Install Python: run the installer and follow the on-screen instructions. Make sure to check the "Add Python to PATH" option during the installation.

3. Verify the pip installation: open a command prompt and run the following command:
```sh
pip --version
```
If pip is installed, you'll see its version.

**macOS:**

**Approach 1**
1. Check if pip is already installed: open a Terminal window and type
```sh
pip --version
```
and press Enter. If pip is installed, you'll see its version.

2. Install pip using Homebrew (recommended): if pip isn't installed or you're unsure, install `Homebrew`, a popular package manager for macOS:
```sh
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
```
3. Once Homebrew is installed, use it to install pip:
```sh
brew install python
```
4. Verify the installation: type
```sh
pip --version
```
again. You should see the installed version of pip.

**Approach 2**

If you encounter issues or prefer a manual installation:

1. Download the get-pip.py script: use curl to download the script:
```sh
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
```
2. Run the script: make sure you're using the correct Python interpreter (usually python3):
```sh
python3 get-pip.py
```
3. Enter your password (if prompted): the installation process may require elevated privileges. Type your password and press Enter (the characters won't be displayed for security reasons).

4. Verify the installation: once the script finishes running, verify that pip is installed by typing:
```sh
pip --version
```
If pip is installed correctly, you should see the installed version number displayed.

## Installation

### Steps for Installation :
**Step 1** : Clone the repository `responsible-ai-moderationLayer`:
```sh
git clone <repository-url>
```

**Step 2** : Navigate to the `responsible-ai-moderationLayer` directory:
```sh
cd responsible-ai-moderationLayer
```

**Step 3** : Use the link below to download the `en_core_web_lg` whl file:

[Download Link](https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.7.1/en_core_web_lg-3.7.1-py3-none-any.whl)

The download may take 30 to 40 minutes. Once done, put the file inside the `lib` folder of the `responsible-ai-moderationLayer` repo.

**Step 4** : Create and activate a virtual environment on your OS.

**Windows:**
1. Open Command Prompt or PowerShell: find and open the appropriate command-line interface.
2. Create a virtual environment using the following python command:
```sh
> python -m venv <Name of your Virtual Environment>
```
Let's say your virtual environment name is `myenv` and it is located in `C:\Users\your_username`.

3. Navigate to the virtual environment directory and activate it using the commands below:
```sh
> cd C:\Users\your_username\myenv\Scripts
> .\activate
```
4. You should see a prompt that indicates the activated environment, such as (myenv) before your usual prompt, like this:
```sh
(myenv) C:\Users\your_username\myenv\Scripts>
```

**Linux:**
1. Open a terminal: find and open a terminal window.
2. To create a virtual environment, install the relevant version of the `venv` module. Since we are using Python 3.11, install the 3.11 variant of the package, which is named python3.11-venv:
```sh
abc@demo:~/$ sudo apt install python3.11-venv
```
3. Create a virtual environment like this:
```sh
abc@demo:~/Projects/MyCoolApp$ python3.11 -m venv <Name of your Virtual Environment>
```
Let's say your virtual environment name is `myenv`.

4. Activate the environment: run the following command:
```sh
abc@demo:~/Projects/MyCoolApp$ source myenv/bin/activate
```
5. You should see a prompt that indicates the activated environment, such as (myenv) before your usual prompt, like this:
```sh
(myenv) abc@demo:~/Projects/MyCoolApp$
```

**MacOS:**
1. Open Terminal: find and open the Terminal.
2. Create and activate the environment: run the following commands:
```sh
python3 -m pip install virtualenv
python3 -m virtualenv <Name of your Virtual Environment>
```
Let's say your virtual environment name is `myenv`.
```sh
source ./myenv/bin/activate
```

**Step 5** : Go to the `requirements` directory where the `requirement.txt` file is present, then install the requirements as shown below:
```sh
cd requirements
pip install -r requirement.txt
```

## Set Configuration Variables
After installing all the required packages, configure the environment variables necessary to run the APIs.

We maintain an environment file that keeps the Moderation Model URLs, OpenAI credentials, and the other attributes required for the proper functioning of the Moderation Layer.

1. Navigate to the `src` directory:
```sh
cd ..
```
2. Locate the file named `.env`. This file contains key-value pairs. Some parameters are mandatory and some are optional.

**Mandatory Parameters**
-------------------------------------------------------------------------------------------------------

1. **Passing Moderation Model URLs** : We have the following env variables for passing Moderation Models:
```sh
SIMILARITYMODEL="${similaritymodel}" #[MANDATORY]
RESTRICTEDMODEL="${restrictedmodel}" #[MANDATORY]
DETOXIFYMODEL="${detoxifymodel}" #[MANDATORY]
PROMPTINJECTIONMODEL="${promptinjectionmodel}" #[MANDATORY]
JAILBREAKMODEL="${jailbreakmodel}" #[MANDATORY]
PRIVACY="${privacy}" #[MANDATORY]
SENTIMENT="${sentiment}" #[MANDATORY]
INVISIBLETEXT="${invisibletext}" #[MANDATORY]
GIBBERISH="${gibberish}" #[MANDATORY]
BANCODE="${bancode}" #[MANDATORY]
```
We need to pass the model URLs in the same env file; these are simply the APIs for each model exposed in the Moderation Model repo.

```sh
similaritymodel=<Put the model url needed for similarity>
restrictedmodel=<Put the model url needed for restricted topic>
detoxifymodel=<Put the model url needed for toxicity>
promptinjectionmodel=<Put the model url needed for prompt injection>
jailbreakmodel=<Put the model url needed for jailbreak>
privacy=<Put the model url needed for privacy>
sentiment=<Put the model url needed for sentiment>
invisibletext=<Put the model url needed for invisibletext>
gibberish=<Put the model url needed for gibberish>
bancode=<Put the model url needed for bancode>
```
For example :
```sh
jailbreakmodel="https://loremipsum.io/generator"
JAILBREAKMODEL="${jailbreakmodel}"
```

2. **Passing LLM Credentials** : Each LLM model is optional.

**GPT Credentials** : (Optional)

For this, you need the Azure OpenAI service. For details, refer to [Azure OpenAI Pricing](https://azure.microsoft.com/en-us/pricing/details/cognitive-services/openai-service/).

**AWS Credentials** : (Optional)

To use the Claude foundation model via AWS Bedrock, you'll need:
1. An AWS account with Bedrock access
2. IAM permissions to invoke Bedrock models

For more information, refer to
[Amazon Bedrock Documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html) and
[AWS Bedrock Pricing](https://aws.amazon.com/bedrock/pricing/).

**LLAMA** : (Optional)

We have hosted the LLaMA 3-70B model on our own infrastructure.
You can interact with it by hosting the model on your own infrastructure.
[LLaMA 3 Model Card (HuggingFace)](https://huggingface.co/meta-llama/Meta-Llama-3-70B)

**Gemini** : (Optional)

For the Gemini 2.5 Flash and Gemini 2.5 Pro models, we're using the Google Generative AI SDK (google.generativeai). This only requires:
1. The model name (e.g., "gemini-2.5-pro")
2. Your Google API key

[Gemini API Quickstart (Python)](https://ai.google.dev/gemini-api/docs/quickstart?lang=python)
[Google Generative AI Models List](https://ai.google.dev/gemini-api/docs/models)

We pass the LLM credentials in the env file as follows. You should have at least one model deployment endpoint or key. For example, if you have support for the gemini-2.5-pro model, then only the gemini-2.5-pro fields are mandatory and the other model fields remain optional.

For gpt4o-mini : (Optional)
```sh
apibase_gpt4=<Enter the Azure OpenAI endpoint for gpt4o-mini; it should look like https://YOUR_RESOURCE_NAME.openai.azure.com> #[OPTIONAL]
apiversion_gpt4=<Enter the Azure OpenAI version for gpt4o-mini> #[OPTIONAL]
openaimodel_gpt4=<Enter the Model name for gpt4o-mini> #[OPTIONAL]
apikey_gpt4=<Enter the Azure OpenAI API key for gpt4o-mini> #[OPTIONAL]
```
For gpt3.5 Turbo : (Optional)
```sh
apibase_gpt3=<Enter the Azure OpenAI endpoint for gpt3.5 Turbo> #[OPTIONAL]
apiversion_gpt3=<Enter the Azure OpenAI version for gpt3.5 Turbo> #[OPTIONAL]
openaimodel_gpt3=<Enter the Model name for gpt3.5 Turbo> #[OPTIONAL]
apikey_gpt3=<Enter the Azure OpenAI API key for gpt3.5 Turbo> #[OPTIONAL]
```
For GPT-4o (multimodal functionality) : (Optional)
```sh
api_base=<Enter the Azure OpenAI endpoint for GPT-4o> #[OPTIONAL]
api_key=<Enter the Azure OpenAI API key for GPT-4o> #[OPTIONAL]
api_version=<Enter the Azure OpenAI version for GPT-4o> #[OPTIONAL]
model=<Enter the Model name for GPT-4o> #[OPTIONAL]
```

For llama3-70b : (Optional)
```sh
llamaendpoint3_70b=<Enter the endpoint where you have hosted the llama model> #[OPTIONAL]
aicloud_model_auth=<Give the endpoint to generate the authorization token> #[OPTIONAL]
```

For the AWS Anthropic Bedrock model : (Optional)
```sh
awsservicename=<Enter the AWS service name> #[OPTIONAL]
awsmodelid=<Enter the AWS model id you are using> #[OPTIONAL]
accept=<You can give application/json> #[OPTIONAL]
contentType=<You can give application/json> #[OPTIONAL]
region_name=<Enter the region name for your service> #[OPTIONAL]
anthropicversion=<Enter the AWS model version> #[OPTIONAL]
AWS_KEY_ADMIN_PATH=<Enter the endpoint to generate the auth token to access the model> #[OPTIONAL]
```

For the gemini-2.5-pro model : (Optional)
```sh
gemini_pro_model_name=<Enter the complete gemini model name here> #[OPTIONAL]
gemini_pro_api_key=<Enter the API key> #[OPTIONAL]
```

For the gemini-2.5-flash model : (Optional)
```sh
gemini_flash_model_name=<Enter the complete gemini model name here> #[OPTIONAL]
gemini_flash_api_key=<Enter the API key> #[OPTIONAL]
```

The values above are used here:
```sh
OPENAI_API_BASE_GPT3 = "${apibase_gpt3}"
OPENAI_API_KEY_GPT3 = "${apikey_gpt3}"
OPENAI_API_VERSION_GPT3 = "${apiversion_gpt3}"
OPENAI_MODEL_GPT3 = "${openaimodel_gpt3}"
OPENAI_API_BASE_GPT4 = "${apibase_gpt4}"
OPENAI_API_KEY_GPT4 = "${apikey_gpt4}"
OPENAI_API_VERSION_GPT4 = "${apiversion_gpt4}"
OPENAI_MODEL_GPT4 = "${openaimodel_gpt4}"
OPENAI_API_BASE_GPT4_O = "${api_base}"
OPENAI_API_KEY_GPT4_O = "${api_key}"
OPENAI_API_VERSION_GPT4_O = "${api_version}"
OPENAI_MODEL_GPT4_O = "${model}"
LLAMA_ENDPOINT3_70b = "${llamaendpoint3_70b}"
AICLOUD_MODEL_AUTH = "${aicloud_model_auth}"
AWS_SERVICE_NAME = "${awsservicename}"
AWS_MODEL_ID = "${awsmodelid}"
ACCEPT = "${accept}"
CONTENTTYPE = "${contentType}"
REGION_NAME = "${region_name}"
ANTHROPIC_VERSION = "${anthropicversion}"
GEMINI_PRO_MODEL_NAME = "${gemini_pro_model_name}"
GEMINI_PRO_API_KEY = "${gemini_pro_api_key}"
GEMINI_FLASH_MODEL_NAME = "${gemini_flash_model_name}"
GEMINI_FLASH_API_KEY = "${gemini_flash_api_key}"
```

3. **Passing DB-related details** : Pass the configuration details related to the database.
```sh
dbtype=<Mention 'mongo' for MongoDB, 'psql' for PostgreSQL, or 'cosmos' for Cosmos DB> #[OPTIONAL]
APP_MONGO_HOST=<Hostname for MongoDB or PostgreSQL> #[OPTIONAL]
username=<Enter the username for the database> #[OPTIONAL]
password=<Enter the password for the database> #[OPTIONAL]
APP_MONGO_DBNAME=<Enter the database name for Mongo/PostgreSQL> #[OPTIONAL]

APP_MONGO_DBNAME="${APP_MONGO_DBNAME}" # If you are using a DB then this is mandatory; mention the dbname
DBTYPE="${dbtype}" #[MANDATORY] if you are using a DB (supported values: mongo, psql, cosmos)

# If the DB type is psql or mongo, you also need to define the username, password and host
APP_MONGO_HOST="${APP_MONGO_HOST}"
DB_USERNAME="${username}"
DB_PWD="${password}"

# If you are using MongoDB, you also need to define the Mongo path
MONGO_PATH="mongodb://${DB_USERNAME}:${DB_PWD}@${APP_MONGO_HOST}/" #[OPTIONAL]
```

4. **Passing cache details** : Pass the details for the caching mechanism.

**Note :** The number of cache entries depends on the number of times the cache is invoked.
As per our design, a single prompt invokes the cache 15 times.
So, if you want to store entries:

for 10 user prompts ----> set cache_size to 150

for 20 user prompts ----> set cache_size to 300
```sh
cache_ttl=<Time (in seconds) for which entries are kept in the cache>
cache_size=<Total entries in the cache>
cache_flag=<Cache enablement flag; set it to True to apply caching, otherwise False>

CACHE_TTL="${cache_ttl}" #[MANDATORY]
CACHE_SIZE="${cache_size}" #[MANDATORY]
CACHE_FLAG="${cache_flag}" #[MANDATORY]
```
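
The sizing rule above can be written down as a tiny helper; the constant 15 comes from the design note (one prompt invokes the cache 15 times) and should be adjusted if your design differs:

```python
# Each user prompt invokes the cache 15 times, per the design note above.
CACHE_INVOCATIONS_PER_PROMPT = 15

def cache_size_for(expected_prompts: int) -> int:
    """cache_size needed to keep entries for the given number of user prompts."""
    return expected_prompts * CACHE_INVOCATIONS_PER_PROMPT

print(cache_size_for(10))  # 150
print(cache_size_for(20))  # 300
```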

5. **Setting the port number** : This dynamically configures the port number for the Moderation Layer application.

Since this app is based on Flask, you may use port `5000` (Flask's default port) or any port number of your choice.

Below are the entries for the same:
```sh
ports=<Set your port no. here>
PORT="${ports}" #[MANDATORY]
```

6. **Telemetry-related details** : If you are setting up this application locally, configure as follows:
```sh
TELEMETRY_ENVIRONMENT="${telemetryenviron}" #[MANDATORY]
telemetryenviron=<set it as "AZURE">
```
```sh
TEL_FLAG="${tel_flag}" #[MANDATORY]
tel_flag=<set it as False>
```
Otherwise, if you are going to set up Elasticsearch/Kibana telemetry on your system, use the configurations below:
- telemetrypath -> moderation telemetry path URL
- coupledtelemetrypath -> coupled moderation telemetry path URL
- adminTemplatepath -> admin telemetry path URL
- evalllmtelemetrypath -> eval telemetry path URL
```sh
tel_flag=<set it as True>
TELEMETRY_PATH="http://<host:PORT>/path/v1/telemtry/<moderation telemetry api url>"
COUPLEDTELEMETRYPATH="http://<host:PORT>/path/v1/telemtry/<coupled moderation telemetry api url>"
ADMINTEMPLATEPATH="http://<host:PORT>/path/v1/telemtry/<admin telemetry api url>"
EVALLLMTELEMETRYPATH="http://<host:PORT>/path/v1/telemtry/<eval moderation telemetry api url>"
```

7. **Target environment** : Set TARGETENVIRONMENT to azure:
```sh
TARGETENVIRONMENT="${environmentname}" #[MANDATORY]
environmentname=<set it as azure>
```

**Optional Parameters**
-------------------------------------------------------------------------------------------------------

1. **Setting Azure AI Translator API details** : The following is needed if you want to use the Azure Translate API for translating the user prompt. For this, you need the Azure AI Translator service. For details, refer to [Azure AI Translator Pricing](https://azure.microsoft.com/en-us/pricing/details/cognitive-services/translator/#resources).

Set the Azure Translate details as follows:
```sh
AZURE_TRANSLATE_KEY = "${azuretranslatekey}" #[OPTIONAL]
AZURE_TRANSLATE_ENDPOINT = "${azuretranslateendpoint}" #[OPTIONAL]
AZURE_TRANSLATE_REGION = "${azuretranslateregion}" #[OPTIONAL]
azuretranslatekey=<Enter Azure Translate Key>
azuretranslateendpoint=<Enter Azure Translate Endpoint>
azuretranslateregion=<Enter the Region for Azure Translate>
```

2. **BLOOM and LLama credentials** : Mention the endpoints for the Bloom or Llama models.
```sh
BLOOM_ENDPOINT="${bloomendpoint}" #[OPTIONAL]
LLAMA_ENDPOINT="${llamaendpoint}" #[OPTIONAL]
bloomendpoint=<Mention Bloom endpoint>
llamaendpoint=<Mention Llama endpoint>
```

3. **Setting details for OAuth2 authentication** : This is required to generate a Bearer token (for OAuth2), which is optional.
```sh
TENANT_ID = "${tenant_id}" #[OPTIONAL]
CLIENT_ID = "${client_id}" #[OPTIONAL]
CLIENT_SECRET = "${client_secret}" #[OPTIONAL]
AUTH_URL = "${auth_url}" #[OPTIONAL]
auth_url=<Mention the authentication url for token generation>
client_secret=<Client secret key for token generation>
client_id=<Client id for token generation>
tenant_id=<Tenant id for token generation>
```

`SCOPE` is optional; it is required only if you are using Microsoft's or Google's support for token generation.
```sh
SCOPE = "${scope}" #[OPTIONAL]
scope=<Set the scope for Service Providers>
```
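
As a hedged sketch (not taken from this codebase), a standard OAuth2 client-credentials token request built from these `.env` values would carry the following form fields; the exact flow and endpoint depend on your identity provider:

```python
# Form body for a standard OAuth2 client-credentials token request.
# auth_url, client_id, client_secret and scope are the .env values above.
def client_credentials_form(client_id: str, client_secret: str, scope: str = "") -> dict:
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }
    if scope:  # SCOPE is only needed for Microsoft/Google token generation
        form["scope"] = scope
    return form

# Posting this form to AUTH_URL typically returns a JSON body whose
# "access_token" field is the Bearer token used later in this README.
```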

4. **EXE creation** : This is required for creating an executable of the application. Set the following variables:
```sh
exe_creation=<Set it to True or False based on user choice> #[OPTIONAL]
EXE_CREATION = "${exe_creation}"
```

5. **Setting Vault** : This is required for setting up the vault.

For a local setup:
```sh
ISVAULT="${isvault}" #[OPTIONAL]
isvault=<set it as False>
```
Otherwise:
```sh
isvault=<set it as True>
```

6. **Health check for the application** : This configuration is required for the health check of the application. It's typically used by external monitoring tools or systems to verify that the application is up and running correctly. When a monitoring tool sends a request to the `/health` API in the `router.py` file, the application responds with the status `Health check Success`, indicating that the application is healthy.

To enable the health check, set the following flag:

For a local setup:
```sh
log=<set it to false>
```
Otherwise:
```sh
log=<set it to true>
LOGCHECK="${log}" #[OPTIONAL]
```
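
A minimal probe for the `/health` route might look like the sketch below; the host and scheme are assumptions for a local run:

```python
# Build the health-check URL from the PORT configured in .env.
# A monitoring tool would GET this URL and expect "Health check Success".
def health_url(port: int, host: str = "localhost") -> str:
    return f"http://{host}:{port}/health"

print(health_url(5000))  # http://localhost:5000/health
# import requests
# requests.get(health_url(5000)).text  # expected: "Health check Success"
```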

7. **SSL verify** : If you want to bypass SSL verification, set the variable to False; otherwise set it to True:

```sh
verify_ssl=<set it to True or False as required>
VERIFY_SSL="${verify_ssl}" #[OPTIONAL]
```

8. All remaining environment variables mentioned in the `.env` file but not covered above are completely optional and need not be set.

## Running the Application

**Note** : Please don't run the feedback API, i.e. `/rai/v1/moderations/feedback`, as this endpoint will be deprecated from the next release onwards.

Once all the above steps are completed, we can start the service.

1. Navigate to the `src` directory.

2. Run the `main.py` file:
```sh
python main.py
```

3. PORT_NO : Use the port number configured in the `.env` file.

Open the following URL in your browser:

`http://localhost:<PORT_NO>/rai/v1/moderations/docs`

4. **For PII entity detection and blocking :**
For the `rai/v1/moderations` and `rai/v1/moderations/coupledmoderations` APIs, the parameters to be blocked are configured under `PiientitiesConfiguredToBlock` within `ModerationCheckThresholds` in the request payload.

These parameters are configurable. For instance, here is a list of entities to block:
```sh
"AADHAR_NUMBER" : to block Aadhar Number (the Aadhar number should not have spaces in between)
"PAN_Number" : to block PAN Card Number
"IN_AADHAAR" : to block Indian Aadhar number (added due to the updated presidio analyzer, which has a wide set of entities to detect; make sure the Indian Aadhar number has no spaces in between)
"IN_PAN" : to block Indian PAN Card number (added due to the updated presidio analyzer, which has a wide set of entities to detect)
"US_PASSPORT" : to block US Passport Number
"US_SSN" : to block US SSN Number
```
These can be added as below:
```sh
"ModerationCheckThresholds": {
"PiientitiesConfiguredToBlock": [
"AADHAR_NUMBER",
"PAN_Number",
"IN_PAN",
"IN_AADHAAR",
"US_PASSPORT",
"US_SSN"
]
}
```

5. **Using a Bearer token (OAuth2) for coupledModeration**
- You need to use the OAuth2 authentication token provided by the Azure or GCP platform (refer to step 3 under **Optional Parameters** in the `Set Configuration Variables` section for how to generate an OAuth2 token).
- Use that token in 2 places:

a) `Authorize` at the top of the Swagger UI: on clicking it, you will get an option **BearerAuth (http, Bearer)** and then **Value**; in the empty text box, paste the token, click `Authorize`, and then `Close`.

b) For the Coupled Moderation API: on expanding it, you will find `Parameters`, with a text box beside `authorization`. Mention the token in the format below:
```sh
Bearer <token>
```
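
A small illustrative helper for building that value (the lowercase parameter name `authorization` matches the Swagger parameter described above):

```python
def bearer_header(token: str) -> dict:
    """Format an OAuth2 access token as the 'Bearer <token>' value expected above."""
    return {"authorization": f"Bearer {token}"}

print(bearer_header("abc123")["authorization"])  # Bearer abc123
```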

6. **For the API /rai/v1/moderations/getTemplates/{userid}**
- First, clone the admin repository: [Admin Repo](https://github.com/Infosys/Infosys-Responsible-AI-Toolkit/tree/main/responsible-ai-admin)
- Once done, follow the necessary steps to run the admin repo codebase locally (as mentioned in the admin README file).
- Go to the API ```api/v1/rai/admin/createCustomeTemplate``` and provide the necessary details to create a custom template. The payload looks like this:

For text-based templates:
```sh
{
"userId": "123",
"mode": "Master_Template/Private_Template", -> Master templates are accessible to all; Private templates only to the particular user with this userid
"category": "SingleModel",
"templateName": "Template1", <------> Name of the Prompt Template
"description": "Template1", <-----> Short description of what the template is about
"subTemplates": [
{
"subtemplate": "evaluation_criteria",
"templateData": ""
},
{
"subtemplate": "prompting_instructions",
"templateData": ""
},
{
"subtemplate": "few_shot_examples",
"templateData": ""
}
]
}
```

For image-based templates:
```sh
{
"userId": "123",
"mode": "Master_Template/Private_Template", -> Master templates are accessible to all; Private templates only to the particular user with this userid
"category": "MultiModel",
"templateName": "Template1", <------> Name of the Prompt Template
"description": "Template1", <-----> Short description of what the template is about
"subTemplates": [
{
"subtemplate": "evaluation_criteria",
"templateData": ""
},
{
"subtemplate": "prompting_instructions",
"templateData": ""
},
{
"subtemplate": "few_shot_examples",
"templateData": ""
}
]
}
```

- Once the above is done, mention the locally running admin API in the `.env` file for the field shown below:
```sh
adminTemplatepath = "<admin_api_url>" `[ Ex : http://localhost:8019//api/v1/rai/admin/getCustomeTemplate/"]`
ADMINTEMPLATEPATH="${adminTemplatepath}"
```
- After this, you can run the API ```api/v1/rai/admin/createCustomeTemplate``` and, on success, you will get a response like this:
```sh
Templates Retrieved
```

7. **For template-based checks**
- There are 2 exposed APIs that make use of prompt templates to evaluate adversarial content in text (text moderation) or images (image moderation).
- Some Master Templates we are using for the API `/rai/v1/moderations/evalllm`:
```sh
1. Prompt Injection Check : to check for prompt injection
2. Jailbreak Check : to check for jailbreak
3. Fairness and Bias Check : to check for bias
4. Privacy Check : to check for privacy
5. Language Critique Coherence Check : the LLM evaluates the quality of the provided text, focusing on the coherence aspect.
6. Language Critique Fluency Check : the LLM evaluates the quality of the provided text, focusing on the fluency aspect.
7. Language Critique Grammar Check : the LLM evaluates the quality of the provided text, focusing on the grammar aspect.
8. Language Critique Politeness Check : the LLM evaluates the quality of the provided text, focusing on the politeness aspect.
9. Response Completeness Check : to check if the LLM response is complete w.r.t. the user prompt
10. Response Conciseness Check : to check if the LLM response is concise and brief w.r.t. the user prompt
11. Response Language Critique Coherence Check : the LLM evaluates the quality of the LLM response, focusing on the coherence aspect.
12. Response Language Critique Fluency Check : the LLM evaluates the quality of the LLM response, focusing on the fluency aspect.
13. Response Language Critique Grammar Check : the LLM evaluates the quality of the LLM response, focusing on the grammar aspect.
14. Response Language Critique Politeness Check : the LLM evaluates the quality of the LLM response, focusing on the politeness aspect.
```

- Some Master Templates we are using for the API `/rai/v1/moderations/multimodal`:
```sh
1. Image Restricted Topic Check : to restrict certain topics like "terrorism", "explosives"
2. Image Prompt Injection Check : to check for prompt injection
3. Image Jailbreak Check : to check for jailbreak attacks
4. Image Toxicity Check : to check for toxicity
5. Image Profanity Check : to check for profanity
```

We need to use these template names in our request payload as shown below:
```sh
{
"AccountName": "None",
"PortfolioName": "None",
"userid": "None",
"lotNumber": 1,
"Prompt": "Which is the biggest country in the world?",
"model_name": "gpt4",
"temperature": "0",
"PromptTemplate": "GoalPriority",
"template_name": "PROMPT_INJECTION" <----------> template name as mentioned above
}
```
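
As an illustrative sketch, the same payload can be built and posted from Python; the host and port are assumptions based on the docs URL mentioned earlier in this README:

```python
# Request payload for the template-based text moderation check, as shown above.
payload = {
    "AccountName": "None",
    "PortfolioName": "None",
    "userid": "None",
    "lotNumber": 1,
    "Prompt": "Which is the biggest country in the world?",
    "model_name": "gpt4",
    "temperature": "0",
    "PromptTemplate": "GoalPriority",
    "template_name": "PROMPT_INJECTION",  # one of the template names listed above
}

# import requests
# r = requests.post(f"http://localhost:{PORT_NO}/rai/v1/moderations/evalllm", json=payload)
```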

**Notes :**

1. Change `model_name` in the payload according to the model you want to use:
- gpt4 for the GPT4o-mini or GPT4-Turbo model
- gpt3 for the GPT35-Turbo model
- Llama3-70b for the Llama3-70b model
- AWS_CLAUDE_V3_5 for the AWS Bedrock Claude model
- Gemini-Pro for the Gemini 2.5 Pro model
- Gemini-Flash for the Gemini 2.5 Flash model
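
For quick reference, the same mapping as an illustrative lookup table (keys are the accepted `model_name` values):

```python
# model_name values accepted in the payload, mapped to the deployed models listed above.
MODEL_NAME_MAP = {
    "gpt4": "GPT4o-mini / GPT4-Turbo",
    "gpt3": "GPT35-Turbo",
    "Llama3-70b": "Llama3-70b",
    "AWS_CLAUDE_V3_5": "AWS Bedrock Claude",
    "Gemini-Pro": "Gemini 2.5 Pro",
    "Gemini-Flash": "Gemini 2.5 Flash",
}

print(MODEL_NAME_MAP["gpt3"])  # GPT35-Turbo
```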

2. The Bancode check in Moderation is designed to specifically identify and block prompts that consist purely of code. It intelligently distinguishes between natural language (NL) and code, and classifies mixed inputs (text + code) as natural language. This ensures that only code-only prompts are restricted, while allowing flexibility for mixed or textual inputs.

## License
The source code for the project is licensed under the MIT license, which you can find in the [LICENSE.txt](LICENSE.txt) file.

## Contact
If you have more questions or need further insights, please feel free to connect with us at
DL : Infosys Responsible AI
Mail id : Infosysraitoolkit@infosys.com