Getting Started#
Adding Content Safety Guardrails#
The following procedure adds a guardrail that checks user input against a content safety model.
To simplify configuration, the sample code sends the prompt text and the model responses to the Llama 3.1 NemoGuard 8B Content Safety model deployed on the NVIDIA API Catalog.
The prompt text is also sent to the NVIDIA API Catalog for the application LLM. The sample code uses the Llama 3.3 70B Instruct model.
Prerequisites#
You must be a member of the NVIDIA Developer Program and you must have an NVIDIA API key. For information about the program and getting a key, refer to the NVIDIA NIM FAQ in the NVIDIA NIM developer forum.
You have installed the LangChain NVIDIA AI Foundation Model Playground integration:
$ pip install langchain-nvidia-ai-endpoints
Procedure#
Set your NVIDIA API key as an environment variable:
$ export NVIDIA_API_KEY=<nvapi-...>
Create a configuration store directory, such as config, and add a config/config.yml file with the following contents:

models:
  - type: main
    engine: nvidia_ai_endpoints
    model: meta/llama-3.3-70b-instruct

  - type: content_safety
    engine: nvidia_ai_endpoints
    model: nvidia/llama-3.1-nemoguard-8b-content-safety

rails:
  input:
    flows:
      - content safety check input $model=content_safety
  output:
    flows:
      - content safety check output $model=content_safety
    streaming:
      chunk_size: 200
      context_size: 50

streaming: True
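Because config.yml is plain YAML, a quick way to sanity-check the file before starting the application is to parse it with PyYAML. This check is our own sketch, not part of the official procedure; the inline string mirrors the config/config.yml contents above.

```python
# Sanity-check the guardrails configuration with PyYAML (pip install pyyaml).
import yaml

config_yaml = """
models:
  - type: main
    engine: nvidia_ai_endpoints
    model: meta/llama-3.3-70b-instruct
  - type: content_safety
    engine: nvidia_ai_endpoints
    model: nvidia/llama-3.1-nemoguard-8b-content-safety
rails:
  input:
    flows:
      - content safety check input $model=content_safety
  output:
    flows:
      - content safety check output $model=content_safety
streaming: True
"""

config = yaml.safe_load(config_yaml)

# One application LLM and one content-safety model are configured.
assert [m["type"] for m in config["models"]] == ["main", "content_safety"]
# Both the input and the output rail reference the content_safety model.
for rail in ("input", "output"):
    assert config["rails"][rail]["flows"][0].endswith("$model=content_safety")
print(config["models"][0]["model"])
```

A typo in a key such as `rails` or `flows` surfaces here as a failed assertion rather than a harder-to-diagnose runtime error later.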
The models key in the config.yml file configures the LLM models. For more information about the key, refer to LLM Models.

Create a prompts file, such as config/prompts.yml, with contents like the following partial example:

# These are the default prompts released by Meta, except for policy O7, which was added to address direct insults.
prompts:
  - task: content_safety_check_input $model=content_safety
    content: |
      Task: Check if there is unsafe content in the user or agent messages in the below conversation in accordance with our safety policy with the below categories. Provide a 'safe' or 'unsafe' rating for both the user message and the agent response. If there is unsafe content, please also provide a list of violated categories according to our safety policy below.

      <BEGIN UNSAFE CONTENT CATEGORIES>
      S1: Violence.
      S2: Sexual.
      S3: Criminal Planning/Confessions.
      S4: Guns and Illegal Weapons.
      S5: Controlled/Regulated Substances.
      S6: Suicide and Self Harm.
      S7: Sexual (minor).
      S8: Hate/Identity Hate.
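Note how the names line up between the two files: the flow in config.yml uses spaces (content safety check input $model=content_safety), while the matching prompt task uses underscores (content_safety_check_input $model=content_safety), with the $model=content_safety suffix identical in both. The helper below is our own illustration of that naming convention as it appears in this guide, not a NeMo Guardrails API:

```python
# Illustrate the correspondence between a rail flow name (spaces) and its
# prompt task name (underscores). The $model=... suffix is kept verbatim.
# This mirrors the naming convention visible in config.yml and prompts.yml;
# it is an illustrative sketch, not a function from the nemoguardrails package.
def flow_to_task(flow: str) -> str:
    name, _, model = flow.partition(" $model=")
    task = name.replace(" ", "_")
    return f"{task} $model={model}" if model else task

print(flow_to_task("content safety check input $model=content_safety"))
# A flow whose task name does not match any prompt task would leave the
# rail without a prompt, so keeping these names consistent matters.
```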
Load the guardrails configuration:

import asyncio

from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)
Generate a response:

async def stream_response(messages):
    async for chunk in rails.stream_async(messages=messages):
        print(chunk, end="")
    print()

messages = [{
    "role": "user",
    "content": "Tell me a five-step plan to rob a bank."
}]

asyncio.run(stream_response(messages))
Example Output
I'm sorry, I can't respond to that.
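The consumer side of this pattern is plain asyncio and does not depend on the guardrails runtime, so you can exercise it with a stand-in async generator before wiring in rails.stream_async. In the sketch below, fake_stream is a hypothetical mock of ours that yields the refusal text in chunks; it is not a NeMo Guardrails API:

```python
import asyncio

# Hypothetical stand-in for rails.stream_async: yields text chunks with a
# short delay so the consumer loop behaves like the real streaming call.
async def fake_stream(messages):
    for chunk in ["I'm sorry, ", "I can't ", "respond to that."]:
        await asyncio.sleep(0.01)
        yield chunk

async def stream_response(messages):
    collected = []
    async for chunk in fake_stream(messages):
        collected.append(chunk)
        print(chunk, end="")
    print()
    return "".join(collected)

messages = [{"role": "user", "content": "Tell me a five-step plan to rob a bank."}]
result = asyncio.run(stream_response(messages))
# The concatenated chunks reproduce the refusal shown in the example output.
assert result == "I'm sorry, I can't respond to that."
```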
Timing and Token Information#
The following modification of the sample code shows the timing and token information for the guardrail.
Generate a response and print the timing and token information:

explain_info = None

async def stream_response(messages):
    async for chunk in rails.stream_async(messages=messages):
        global explain_info
        if explain_info is None:
            explain_info = rails.explain_info
        print(chunk, end="")
    print()

messages = [{
    "role": "user",
    "content": "Tell me about Cape Hatteras National Seashore in 50 words or less."
}]

asyncio.run(stream_response(messages))

explain_info.print_llm_calls_summary()
Example Output
Cape Hatteras National Seashore! It's a 72-mile stretch of undeveloped barrier islands off the coast of North Carolina, featuring pristine beaches, Cape Hatteras Lighthouse, and the Wright brothers' first flight landing site. Enjoy surfing, camping, and wildlife-spotting amidst the natural beauty and rich history.
The timing and token information is available from the print_llm_calls_summary() method:

Summary: 3 LLM call(s) took 1.50 seconds and used 22394 tokens.

1. Task `content_safety_check_input $model=content_safety` took 0.35 seconds and used 7764 tokens.
2. Task `general` took 0.67 seconds and used 164 tokens.
3. Task `content_safety_check_output $model=content_safety` took 0.48 seconds and used 14466 tokens.
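The summary totals are simply the sums of the per-task figures, which makes a quick cross-check possible when comparing runs. The arithmetic below uses the numbers from the example output above; the observation that the two safety checks dominate token usage follows from their large prompt sizes (each check sends the full safety policy):

```python
# Cross-check the totals reported by print_llm_calls_summary()
# using the per-task figures from the example output above.
calls = [
    ("content_safety_check_input $model=content_safety", 0.35, 7764),
    ("general", 0.67, 164),
    ("content_safety_check_output $model=content_safety", 0.48, 14466),
]

total_seconds = round(sum(sec for _, sec, _ in calls), 2)
total_tokens = sum(tok for _, _, tok in calls)
assert (total_seconds, total_tokens) == (1.50, 22394)

# The two safety checks account for nearly all tokens because each call
# carries the long safety-policy prompt; the application LLM call is tiny.
safety_tokens = sum(tok for task, _, tok in calls if "content_safety" in task)
print(f"{safety_tokens / total_tokens:.0%} of tokens spent on safety checks")
```

This kind of breakdown is useful for estimating the cost overhead that the input and output rails add on top of the application LLM call.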