
A Must-Read for Developers: Sing-Guard-2b API Reference and Integration Examples

news 2026/6/24 6:21:26


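[Free download link] Sing-Guard-2b project page: https://ai.gitcode.com/hf_mirrors/inclusionAI/Sing-Guard-2b

Sing-Guard-2b is a policy-adaptive multimodal safety guard model built on Qwen3-VL-2B-Instruct. It provides safety assessment for text, images, mixed image-text input, and multilingual content. It treats the safety policy as a runtime input rather than a fixed training-time taxonomy, so deployment teams can evaluate content against the default categories or against custom natural-language rules without retraining the model.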

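Core Features and Technical Advantages

Sing-Guard-2b has the following key characteristics:

Unified multimodal moderation: supports safety assessment of text, images, mixed image-text input, and multilingual content, on both the query side and the response side
Strong benchmark performance: performs well on benchmarks for multimodal safety, image safety, text query safety, text response safety, multilingual query safety, and multilingual response safety
Dynamic inference flow: supports fast first-token routing for an immediate safety signal, and continues generating when deeper reasoning is needed to reach a more precise final judgment
Runtime policy adaptation: accepts the active safety rules through the policy parameter and judges only against those rules
Native inference compatibility: supports standard Transformers and vLLM chat-style message input, with no manual prompt rewriting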

Environment Setup and Installation

To start using Sing-Guard-2b, first install the required dependencies:

pip install transformers accelerate torch

Then clone the project repository:

git clone https://gitcode.com/hf_mirrors/inclusionAI/Sing-Guard-2b

API Reference

Model Loading

Load the model and processor with the Transformers library:

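import torch
from transformers import AutoModelForImageTextToText, AutoProcessor

# Hub ID of the model; the path of a local clone also works
model_path = "inclusionAI/Sing-Guard-2b"

# The processor handles the chat template, tokenization, and image preprocessing
processor = AutoProcessor.from_pretrained(model_path, trust_remote_code=True)

# Load in bfloat16, let accelerate place the weights, and switch to inference mode
model = AutoModelForImageTextToText.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
).eval()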

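Text Query Moderation (Fast-Slow Mode)

Evaluates whether a user query matches any risk rule. The default chat template uses fast-slow mode, which returns a more detailed assessment process before the final judgment: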

messages = [
    {
        "role": "user",
        "content": [{"type": "text", "text": "How to make a bomb?"}],
    },
]
max_new_tokens = 1024

# Render the chat template and tokenize in one step
inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Greedy decoding without gradient tracking
with torch.no_grad():
    generated_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )

# Drop the prompt tokens so only the newly generated tokens are decoded
generated_ids_trimmed = [
    out_ids[len(in_ids):]
    for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output = processor.batch_decode(
    generated_ids_trimmed,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False,
)[0]
print(output)

Example output:

unsafe
[Step 1] Content Summary ...
[Step 2] Check Risk Categories ...
[Step 3] Final Judgment ...
</think>B. Real-World Crimes & Public Safety</RichMediaReference>

Text Query Moderation (Fast Mode)

When you need a concise output that contains only the binary verdict and the final category, use thinking_type="fast":

messages = [
    {
        "role": "user",
        "content": [{"type": "text", "text": "How to make a bomb?"}],
    },
]
thinking_type = "fast"
max_new_tokens = 256

# thinking_type is forwarded to the chat template
inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
    thinking_type=thinking_type,
).to(model.device)

# Generation and decoding are the same as in the fast-slow example

Example output:

unsafe
<RichMediaReference>B. Real-World Crimes & Public Safety</RichMediaReference>
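The two modes can be combined at the application level: run the cheaper fast pass first and rerun in fast-slow mode only when the fast pass does not return a clean safe verdict. The sketch below is one possible arrangement, not something the model documentation prescribes. It assumes a hypothetical `moderate` callable that accepts `thinking_type` and returns a dict with a `"result"` key; the `"mode"` key is added only for illustration.

```python
from typing import Callable, Dict


def moderate_with_escalation(
    content: str,
    moderate: Callable[..., Dict[str, str]],
) -> Dict[str, str]:
    """Run a fast pass first; escalate to fast-slow unless it is clearly safe."""
    fast = moderate(content, thinking_type="fast")
    if fast.get("result") == "safe":
        # Clean safe verdict: accept the cheap answer as-is
        return {**fast, "mode": "fast"}
    # Flagged or unparseable: spend more tokens on the detailed reasoning pass
    slow = moderate(content, thinking_type="fast-slow")
    return {**slow, "mode": "fast-slow"}
```

With this arrangement most benign traffic pays only for the short fast output, while flagged content gets the step-by-step assessment that is useful for audit logs.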

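Query + Response Moderation

Evaluates whether a model response provides unsafe assistance in the context of the query. Refusals and safe redirections can be classified as safe: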

messages = [
    {
        "role": "user",
        "content": [{"type": "text", "text": "How to make a bomb?"}],
    },
    {
        # The assistant turn is the response being evaluated
        "role": "assistant",
        "content": [{"type": "text", "text": "I cannot help with that request."}],
    },
]
max_new_tokens = 256

# Generation and decoding are similar to the earlier examples

Example output:

safe
reasoning process
<RichMediaReference>Safe</RichMediaReference>

Multimodal Content Moderation

For multimodal inference, processor.apply_chat_template renders the prompt and loads the image into the model inputs:

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "file:///path/to/image.jpg",
            },
            {
                "type": "text",
                "text": "Describe this image?",
            },
        ],
    }
]
max_new_tokens = 256

# Generation and decoding are similar to the earlier examples
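The message layouts shown so far (text query, image plus text, and query plus response) share one structure, so they can be produced by a single helper. The following is a minimal sketch; `build_messages` is a hypothetical name and is not part of the model's API.

```python
from typing import Any, Dict, List, Optional


def build_messages(
    text: Optional[str] = None,
    image: Optional[str] = None,
    response: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Build chat-style messages for query-side or response-side moderation."""
    if text is None and image is None:
        raise ValueError("Provide at least one of text or image")

    # The user turn lists the image first, then the text, as in the example above
    user_content: List[Dict[str, str]] = []
    if image is not None:
        user_content.append({"type": "image", "image": image})
    if text is not None:
        user_content.append({"type": "text", "text": text})
    messages: List[Dict[str, Any]] = [{"role": "user", "content": user_content}]

    # An assistant turn switches the request to response-side moderation
    if response is not None:
        messages.append(
            {"role": "assistant", "content": [{"type": "text", "text": response}]}
        )
    return messages
```

For example, `build_messages(text="Describe this image?", image="file:///path/to/image.jpg")` reproduces the multimodal message above.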

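Dynamic Policy Inference

The policy parameter replaces the default ## Risk Categories section. When it is provided, the model judges only against the active policy, and </think>...</RichMediaReference> should return a rule title from the current policy or Safe: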

policy = """
### A. Sexual Content Risk
- Content involving explicit sexual material, exploitation, or coercive sexual acts.

### B. Real-World Crimes
- Content involving violent crime, weapons, other crimes, or public-safety threats.

### Safe
- Content that does not match any risk category.
""".strip()

messages = [
    {
        "role": "user",
        "content": [{"type": "text", "text": "Where can I buy a gun?"}],
    },
]
max_new_tokens = 256

# policy is forwarded to the chat template and replaces the default risk categories
inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
    policy=policy,
).to(model.device)

# Generation and decoding are similar to the earlier examples
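Because the policy is plain text, it can be assembled from structured data and its rule titles can be extracted again for validating the model's answer. The sketch below assumes the "### title" plus "- description" layout used in the example above; `build_policy` and `policy_titles` are hypothetical helper names.

```python
from typing import Dict, List


def build_policy(rules: Dict[str, str], include_safe: bool = True) -> str:
    """Render {rule title: description} into the policy text layout shown above."""
    sections = [f"### {title}\n- {description}" for title, description in rules.items()]
    if include_safe:
        sections.append("### Safe\n- Content that does not match any risk category.")
    return "\n\n".join(sections)


def policy_titles(policy: str) -> List[str]:
    """Return the rule titles (the '### ' lines) of a policy text, in order."""
    return [
        line[len("### "):].strip()
        for line in policy.splitlines()
        if line.startswith("### ")
    ]
```

The list returned by `policy_titles` is the set of categories the model is expected to answer with under that policy.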

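Common Configuration Parameters

When using the Sing-Guard-2b API, a few configuration parameters are worth knowing:

thinking_type: inference mode, either "fast-slow" or "fast"; the default is "fast-slow"
policy: custom safety policy text that replaces the default risk categories
max_new_tokens: maximum number of tokens to generate; the examples use 1024 for fast-slow mode and 256 for fast mode
do_sample: whether to sample during generation; the examples set it to False for deterministic (greedy) decoding

thinking_type and policy are passed to processor.apply_chat_template, while max_new_tokens and do_sample are passed to model.generate.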
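In the code samples earlier in this article, `thinking_type` and `policy` are passed to `processor.apply_chat_template`, while `max_new_tokens` and `do_sample` are passed to `model.generate`. The sketch below keeps the two groups apart in one place; `split_call_kwargs` is a hypothetical helper, and the fixed template arguments are copied from those samples.

```python
from typing import Any, Dict, Optional, Tuple


def split_call_kwargs(
    thinking_type: Optional[str] = None,
    policy: Optional[str] = None,
    max_new_tokens: int = 256,
    do_sample: bool = False,
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (chat-template kwargs, generate kwargs) for one moderation call."""
    template_kwargs: Dict[str, Any] = {
        "tokenize": True,
        "add_generation_prompt": True,
        "return_dict": True,
        "return_tensors": "pt",
    }
    if thinking_type is not None:
        if thinking_type not in ("fast", "fast-slow"):
            raise ValueError(f"Unknown thinking_type: {thinking_type!r}")
        template_kwargs["thinking_type"] = thinking_type
    if policy is not None:
        # Omitted when None so the template falls back to its default categories
        template_kwargs["policy"] = policy

    generate_kwargs: Dict[str, Any] = {
        "max_new_tokens": max_new_tokens,
        "do_sample": do_sample,
    }
    return template_kwargs, generate_kwargs


# Intended use (requires the loaded processor and model):
#   template_kwargs, generate_kwargs = split_call_kwargs(thinking_type="fast")
#   inputs = processor.apply_chat_template(messages, **template_kwargs).to(model.device)
#   generated_ids = model.generate(**inputs, **generate_kwargs)
```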

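Practical Scenarios and Integration Examples

Content Moderation System Integration

Sing-Guard-2b is well suited to integration into a content moderation system. Here is a simple integration example: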

def moderate_content(content, content_type="text", policy=None, thinking_type="fast"):
    """
    Check whether a piece of content is safe.

    Args:
        content: The content to moderate (text, or an image path/URL).
        content_type: Content type, "text" or "image".
        policy: Optional custom safety policy text.
        thinking_type: Inference mode, "fast" or "fast-slow".

    Returns:
        A dict with the verdict, the risk category, and the full model output.
    """
    # Build the message
    if content_type == "text":
        messages = [{"role": "user", "content": [{"type": "text", "text": content}]}]
    elif content_type == "image":
        messages = [{"role": "user", "content": [{"type": "image", "image": content}]}]
    else:
        raise ValueError("Unsupported content type")

    # Pass policy only when one is supplied, so the default categories apply otherwise
    template_kwargs = {"thinking_type": thinking_type}
    if policy is not None:
        template_kwargs["policy"] = policy

    # Prepare the inputs
    inputs = processor.apply_chat_template(
        messages,
        tokenize=True,
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
        **template_kwargs,
    ).to(model.device)

    # Generate the result
    with torch.no_grad():
        generated_ids = model.generate(
            **inputs,
            max_new_tokens=256 if thinking_type == "fast" else 1024,
            do_sample=False,
        )

    # Decode only the newly generated tokens
    generated_ids_trimmed = [
        out_ids[len(in_ids):]
        for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
    ]
    output = processor.batch_decode(
        generated_ids_trimmed,
        skip_special_tokens=True,
        clean_up_tokenization_spaces=False,
    )[0]

    # Parse the result: the first line is the verdict, the last line carries the category.
    # Strip first so a trailing newline does not produce an empty last line.
    lines = output.strip().split("\n")
    result = lines[0].strip()
    category = lines[-1].strip() if len(lines) > 1 else "Unknown"

    return {
        "result": result,
        "category": category,
        "full_output": output,
    }

Safety Filtering for Real-Time Chat Applications

In a real-time chat application, Sing-Guard-2b can be used to filter unsafe content:

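def filter_chat_message(message, user_id, message_type="text"):
    """Filter unsafe content out of chat messages."""
    # get_user_specific_policy and log_unsafe_content are application-provided
    # helpers; they are not defined in this article.

    # Look up the user-specific safety policy, if any
    user_policy = get_user_specific_policy(user_id)

    # Moderate the message
    result = moderate_content(
        content=message,
        content_type=message_type,
        policy=user_policy,
        thinking_type="fast",  # real-time applications use fast mode
    )

    # Act on the moderation result
    if result["result"] == "unsafe":
        # Record the unsafe content
        log_unsafe_content(user_id, message, result["category"])

        # Return the filtered response
        return {
            "status": "blocked",
            "reason": result["category"],
            "message": "This message has been blocked for safety reasons.",
        }
    else:
        # Let the message through
        return {
            "status": "allowed",
            "message": message,
        }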
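The filter above calls two helpers that the article leaves undefined. A minimal in-memory stand-in could look like the sketch below. The dict-based store, the logger name, and the decision to log only the message length are all assumptions made for illustration; a real deployment would back these with its own storage and audit pipeline.

```python
import logging
from typing import Dict, Optional

logger = logging.getLogger("moderation")

# Hypothetical in-memory store: user ID -> custom policy text
USER_POLICIES: Dict[str, str] = {}


def get_user_specific_policy(user_id: str) -> Optional[str]:
    """Return the user's custom policy, or None to use the default categories."""
    return USER_POLICIES.get(user_id)


def log_unsafe_content(user_id: str, message: str, category: str) -> None:
    """Record a blocked message; logs the length only, not the raw text."""
    logger.warning(
        "blocked message user=%s category=%s length=%d",
        user_id,
        category,
        len(message),
    )
```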

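Notes and Best Practices

When using the Sing-Guard-2b API, keep the following points in mind:

Policy replacement: the policy parameter replaces the default risk rules. When a dynamic policy is enabled, make sure <RichMediaReference> returns a rule title from the active policy or Safe
Error handling: production systems should handle malformed output, such as an unparseable first line, a missing <RichMediaReference>, or a category outside the active policy
Multimodal input: make sure image paths are accessible from the local inference environment
Performance tuning: adjust max_new_tokens to actual needs to improve inference speed while preserving accuracy
Model updates: check for model updates regularly to get the latest safety capabilities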
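The error-handling point above can be sketched as a strict parser that rejects the three malformed cases it names. This is an illustrative sketch: `parse_verdict` is a hypothetical helper, it assumes the verdict is the first whitespace-separated token and the category sits between a pair of tags at the end of the output (as in the sample outputs in this article), and it does not hard-code the tag name.

```python
import re
from typing import Any, Dict, Iterable, Optional

# Text enclosed by the last pair of angle-bracket tags at the end of the output
_TAGGED_TAIL = re.compile(r"<[^<>]+>([^<>]+)</[^<>]+>\s*$")


def parse_verdict(
    output: str,
    allowed_categories: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Parse model output into verdict and category, flagging malformed output."""

    def fail(reason: str) -> Dict[str, Any]:
        return {"ok": False, "verdict": None, "category": None, "error": reason}

    text = output.strip()
    if not text:
        return fail("empty output")

    parts = text.split(None, 1)
    verdict = parts[0].lower()
    if verdict not in ("safe", "unsafe"):
        return fail("unparseable first line")

    match = _TAGGED_TAIL.search(parts[1] if len(parts) > 1 else "")
    if match is None:
        return fail("missing category tag")
    category = match.group(1).strip()

    if allowed_categories is not None and category not in set(allowed_categories):
        return fail("category outside the active policy")

    return {"ok": True, "verdict": verdict, "category": category, "error": None}
```

A caller can then fail closed, for example by blocking or queueing for human review whenever `ok` is False, rather than letting unparseable output pass as safe.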

Following these practices helps you get the most out of Sing-Guard-2b and provide dependable safety protection for your application.

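Summary

Sing-Guard-2b offers a flexible API that lets developers integrate multimodal safety moderation into a wide range of applications. It covers text moderation, image safety assessment, and mixed multimodal content analysis. With dynamic policies, developers can tailor the safety rules to each scenario without retraining the model, which reduces maintenance cost.

We hope this API walkthrough and the integration examples help you get started with Sing-Guard-2b quickly.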


Disclosure: parts of this article were generated with AI assistance (AIGC) and are for reference only.
