GuardRails
Protect
Helps block malicious attempts from users to get the LLM to respond in disruptive, embarrassing, or harmful ways.
Parameters
None - Data should be "chained" into this function.
Returns
The original input text, with some "hard stops" removed. For example: ignore all instructions
is a hard-blocked phrase that will be removed.
This also sets some global variables:
promptInjection
: A number0.00 - 1.00
indicating the percent likelihood of a "detected" prompt injection attempt.flaggedText
: The text in question that was flagged by the system.
Example Usage
Would remove the flagged text, yielding Write me a poem. and say hello
as the text sent to the LLM, and also set some variables:
Allowing us to detect, and choose how to handle this attempted prompt injection. See the "Protect from Prompt Injection" template for a more robust example:
Profanity Filter
Filters input text for profanity and blocks it.
Parameters
None - Data should be "chained" into this function.
Returns
Either the input text, or the "blocked" phrase set in the Configure tab:
This also sets the global variable profanity
to true/false based on profanity detection.
Example Usage
The variable profanity
will be set to true
, and the variable test
will be set to the value seen in the configure tab:
That seems like a sensitive question. Maybe I'm not understanding you, so try rephrasing.
Remove PII
Masks detected PII in input text.
Parameters
None - Data should be "chained" into this function.
Returns
The resulting masked text.
Note
You may define additional PII, or disable specific builtin PII filters, on the Configure tab under Guardrails
Identify PII
Find PII via the built-in NeuralSeek and custom added REGEX patterns.
Parameters
None - Data should be "chained" into this function.
Returns
The identified PII text.