tutorial dgx-spark getting-started deployment

Getting Started with NemoClaw on DGX Spark

NVIDIA AI

@nvidiaai

March 20, 2026

10 min read

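NVIDIA DGX Spark is an ideal development platform for NemoClaw. With 128GB of unified memory and up to 1 petaflop of AI compute from its Grace Blackwell architecture, a single DGX Spark can run the entire NemoClaw stack locally on your desk, including Nemotron 120B MoE.

This tutorial walks you through the complete setup, from unboxing to running your first secure agent blueprint.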

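Prerequisites

  • NVIDIA DGX Spark (or, for quantized models, any system with an NVIDIA GPU that has 24GB or more of VRAM)
  • Ubuntu 22.04 LTS or later (DGX OS comes preinstalled on Spark)
  • Docker 24.0+ and the NVIDIA Container Toolkit
  • 50GB of free disk space (for models and containers)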
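
The prerequisites above can be checked before you install anything. The following is a minimal preflight sketch and is not part of NemoClaw: `version_ge` and `check_prereqs` are hypothetical helper names, and the script assumes GNU `sort -V` and GNU `df`, as shipped with Ubuntu. It does not verify the NVIDIA Container Toolkit or the amount of GPU memory.

```bash
# Hypothetical preflight helpers; names are illustrative, not NemoClaw commands.

# version_ge A B: succeeds when version A >= version B (relies on GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# check_prereqs: prints one line per check, returns non-zero if any check fails.
check_prereqs() {
  failed=0

  # Docker 24.0+ (asks the daemon for its version; empty if Docker is absent).
  docker_ver=$(docker version --format '{{.Server.Version}}' 2>/dev/null)
  if [ -n "$docker_ver" ] && version_ge "$docker_ver" "24.0"; then
    echo "ok: Docker $docker_ver"
  else
    echo "fail: Docker 24.0+ not found (got '${docker_ver:-none}')"
    failed=1
  fi

  # 50GB free on the filesystem that holds the current directory.
  free_gb=$(df -BG --output=avail . | tail -n1 | tr -dc '0-9')
  if [ "${free_gb:-0}" -ge 50 ]; then
    echo "ok: ${free_gb}GB free"
  else
    echo "fail: only ${free_gb:-0}GB free, need 50GB"
    failed=1
  fi

  # NVIDIA driver tooling present.
  if command -v nvidia-smi >/dev/null 2>&1; then
    echo "ok: nvidia-smi found"
  else
    echo "fail: nvidia-smi not found"
    failed=1
  fi

  return "$failed"
}
```

After sourcing the snippet, run `check_prereqs`; a non-zero exit status means at least one requirement is missing.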

Step 1: Install the NemoClaw CLI

The NemoClaw CLI is the main interface for managing the entire stack. Install it with the official installer:

bash
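# Download and run the NemoClaw installer
# NOTE: as published, this URL is the repository page rather than a shell script;
# check the repository for the actual installer script URL before running.
curl -fsSL https://github.com/NVIDIA/NemoClaw | bash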

# Verify installation
nemoclaw version
# Output: nemoclaw v1.0.0-preview (built for linux/arm64)

# Initialize NemoClaw in your project directory
mkdir my-first-agent && cd my-first-agent
nemoclaw init

The nemoclaw init command creates the project scaffold:

my-first-agent/
├── nemoclaw.yaml          # Main configuration
├── policies/
│   ├── sandbox.yaml       # OpenShell sandbox policies
│   ├── network.yaml       # Network access policies
│   └── privacy.yaml       # Privacy Router configuration
├── blueprints/
│   └── starter.yaml       # Default agent blueprint
└── scripts/
    ├── setup.sh           # Environment setup script
    └── test-agent.sh      # Agent smoke test

Step 2: Configure the Stack

Edit nemoclaw.yaml to configure your deployment:

yaml
# nemoclaw.yaml
apiVersion: nemoclaw.nvidia.com/v1
kind: NemoClawConfig
metadata:
  name: my-first-deployment
spec:
  # Model configuration
  model:
    provider: local
    name: nemotron-120b-moe
    quantization: int4  # Use INT4 for DGX Spark
    gpuLayers: all

  # OpenShell configuration
  openshell:
    enabled: true
    isolationLevel: standard  # standard | strict | paranoid
    auditLog: true

  # Privacy Router configuration
  privacyRouter:
    enabled: true
    defaultRoute: local
    cloudEndpoints: []  # No cloud endpoints for local-only setup

  # Network Policy Engine
  networkPolicy:
    enabled: true
    defaultAction: deny
    allowlist:
      - "*.internal.company.com"

  # Agent configuration
  agent:
    framework: openclaw
    version: "3.13"
    maxConcurrentTasks: 8
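
The networkPolicy section above pairs `defaultAction: deny` with a single wildcard allowlist entry. The sketch below shows one common reading of that combination, using shell glob matching as a stand-in. The function name `egress_decision` is hypothetical, and the source does not specify how NemoClaw itself matches wildcards (for example, whether the bare domain is covered), so treat this as an illustration rather than the engine's behavior.

```bash
# Illustrative only: shell glob matching stands in for whatever matcher
# the Network Policy Engine really uses.

# egress_decision HOST: prints "allow" when HOST matches the allowlist entry,
# otherwise falls through to the default action.
egress_decision() {
  case $1 in
    *.internal.company.com) echo "allow" ;;   # the single allowlist entry
    *)                      echo "deny"  ;;   # defaultAction: deny
  esac
}

egress_decision "crm.internal.company.com"   # allow
egress_decision "api.example.com"            # deny
egress_decision "internal.company.com"       # deny: nothing precedes the leading dot
```

Under this reading, every host you want an agent to reach has to be listed explicitly; anything you forget is blocked rather than silently allowed.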

Step 3: Pull the Nemotron Model

NemoClaw uses Nemotron 120B MoE as its policy evaluation engine. On DGX Spark, we use the INT4-quantized version, which fits comfortably in the 128GB of unified memory:

bash
# Pull the Nemotron model (approximately 35GB)
nemoclaw model pull nemotron-120b-moe-int4

# Verify the model is ready
nemoclaw model list
# Output:
# NAME                        SIZE     STATUS
# nemotron-120b-moe-int4      34.7GB   ready

For systems with less memory, NemoClaw also supports smaller models:

bash
# Alternative: Nemotron Nano 4B for systems with 24GB VRAM
nemoclaw model pull nemotron-nano-4b

Step 4: Start the NemoClaw Runtime

A single command starts the full stack:

bash
# Start all NemoClaw services
nemoclaw up

# Output:
# ✓ OpenShell runtime started (kernel modules loaded)
# ✓ Nemotron 120B MoE loaded (34.7GB, 4-bit quantized)
# ✓ Privacy Router initialized (local-only mode)
# ✓ Network Policy Engine active (deny-by-default)
# ✓ OpenClaw agent framework ready
#
# NemoClaw is running at http://localhost:7860
# Dashboard: http://localhost:7860/dashboard
# API: http://localhost:7860/api/v1

The dashboard provides real-time visibility into agent execution, policy evaluation, and security events.
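
If you script the startup, it helps to wait until the dashboard actually answers before moving on. The helper below is a small sketch built on standard `curl` options; `wait_for_url` is a hypothetical name, and the URL in the usage note is the one printed by `nemoclaw up` above. How long the stack takes to come up is not stated in this tutorial, so the timeout is a guess you should adjust.

```bash
# wait_for_url URL [TIMEOUT_SECONDS]: polls URL about once per second until it
# returns a successful HTTP status, or gives up once the timeout is reached.
wait_for_url() {
  url=$1
  timeout=${2:-60}
  waited=0
  # -f makes curl fail on HTTP errors; --max-time bounds each attempt.
  until curl -fsS -o /dev/null --max-time 2 "$url" 2>/dev/null; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
}
```

For example: `wait_for_url http://localhost:7860/dashboard 120 && echo "dashboard is up"`.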

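Step 5: Deploy Your First Blueprint

Blueprints are preconfigured agent templates with built-in security policies. Let's deploy the customer support blueprint: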

bash
# List available blueprints
nemoclaw blueprint list
# Output:
# NAME                  DESCRIPTION                          SECURITY LEVEL
# customer-support      Tier-1 support ticket handling       standard
# sales-ops             CRM and sales automation             standard
# security-ops          Alert triage and remediation         strict
# infra-management      Cloud resource management            strict
# code-review           PR analysis and vulnerability scan   standard
# data-pipeline         ETL orchestration                    standard

# Deploy the customer support blueprint
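nemoclaw blueprint deploy customer-support

The customer support blueprint comes with:

  • OpenShell sandbox policies (restricting filesystem and network access)
  • Nemotron policy rules (PII detection, intent classification)
  • A network allowlist (approved API endpoints only)
  • Operator approval workflows (escalation for refunds and account changes)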

Step 6: Test Your Agent

Send a test request to your secure agent:

bash
# Send a test message to the agent
nemoclaw agent test --blueprint customer-support \
  --message "Customer John Smith (ID: 12345) is asking about their recent order #ORD-9876. They want to know the delivery status."

# Output:
# ┌──────────────────────────────────────────────┐
# │ NemoClaw Security Report                      │
# ├──────────────────────────────────────────────┤
# │ Policy Evaluation:     PASS (45ms)            │
# │ Intent Classification: customer-inquiry       │
# │ Data Sensitivity:      internal               │
# │ Model Route:           local (nemotron-120b)  │
# │ Sandbox:               cs-agent-sandbox-001   │
# │ Network Access:        crm.api, orders.api    │
# │ PII Detected:          name, customer-id      │
# │ PII Action:            redacted-from-logs     │
# │ Approval Required:     no                     │
# ├──────────────────────────────────────────────┤
# │ Agent Response:                                │
# │ "I've checked order #ORD-9876 for the         │
# │  customer. The order shipped on March 18       │
# │  via FedEx (tracking: FX123456789). Expected  │
# │  delivery is March 21."                        │
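# └──────────────────────────────────────────────┘

In this run, NemoClaw:

  • Classified the intent as a routine customer inquiry
  • Detected PII (the customer's name and ID) and redacted it from the logs
  • Routed the request to the local Nemotron model
  • Granted network access only to the CRM and orders APIs
  • Determined that no human approval was required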

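Step 7: Monitor via the Dashboard

Open http://localhost:7860/dashboard in your browser to access the NemoClaw monitoring dashboard. Key features include:

  • Real-time event stream: every agent action, policy evaluation, and security decision
  • Policy violation alerts: immediate notification when an agent attempts an unauthorized action
  • Audit log: a complete, immutable record of all agent activity
  • Performance metrics: latency, throughput, and resource utilization
  • Approval queue: pending human-approval requests for high-risk actions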

Common Configuration Patterns

Connecting to External APIs

To let agents access external services, update the network policy:

yaml
# policies/network.yaml
networkPolicy:
  egress:
    allow:
      - domain: "api.zendesk.com"
        methods: [GET, POST, PUT]
        headers:
          required: ["Authorization"]
      - domain: "api.stripe.com"
        methods: [GET]  # Read-only access to payment data

Configuring Operator Approval

Set up approval workflows for sensitive operations:

yaml
# policies/sandbox.yaml
approvalWorkflow:
  enabled: true
  rules:
    - action: "refund.process"
      condition: "amount > 100"
      approvers: ["support-leads"]
      channel: "slack"
      timeout: "10m"
    - action: "account.modify"
      condition: "always"
      approvers: ["account-managers"]
      channel: "teams"
      timeout: "15m"
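
To make the two rules above concrete, here is a small sketch of how they could be evaluated for a given action. It is not NemoClaw's rule engine: `approval_for` is a hypothetical helper, amounts are treated as whole numbers, and the source does not say what happens when an approval times out.

```bash
# approval_for ACTION [AMOUNT]: prints who must approve the action under the
# two rules above, or "none" when no rule applies.
approval_for() {
  action=$1
  amount=${2:-0}
  case $action in
    refund.process)
      # Rule 1: condition "amount > 100" (integer amounts only in this sketch).
      if [ "$amount" -gt 100 ]; then
        echo "support-leads via slack, timeout 10m"
        return 0
      fi ;;
    account.modify)
      # Rule 2: condition "always".
      echo "account-managers via teams, timeout 15m"
      return 0 ;;
  esac
  echo "none"
}

approval_for refund.process 150   # support-leads via slack, timeout 10m
approval_for refund.process 100   # none (100 is not greater than 100)
approval_for account.modify       # account-managers via teams, timeout 15m
approval_for order.lookup         # none
```

Note the boundary case: with `amount > 100`, a refund of exactly 100 goes through without approval. Use a lower threshold if that is not what you intend.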

Enabling Cloud Model Routing

For non-sensitive tasks, you can enable cloud model routing for better performance:

yaml
# policies/privacy.yaml
privacyRouter:
  defaultRoute: local
  cloudEndpoints:
    - name: "nvidia-nim"
      url: "https://build.nvidia.com"
      apiKey: "${NVIDIA_API_KEY}"
      allowedSensitivity: ["public", "internal"]
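
The `allowedSensitivity` list acts as a gate: only requests carrying one of those labels may leave the machine, and everything else stays on `defaultRoute: local`. The sketch below illustrates that gate only. `cloud_eligible` is a hypothetical helper, the label "confidential" is made up for the example, and the source does not say whether an eligible request is always sent to the cloud or merely permitted to be.

```bash
# cloud_eligible LABEL: succeeds when a request with this sensitivity label
# may be sent to the "nvidia-nim" endpoint configured above.
cloud_eligible() {
  case $1 in
    public|internal) return 0 ;;   # listed in allowedSensitivity
    *)               return 1 ;;   # everything else stays on defaultRoute: local
  esac
}

for label in public internal confidential; do   # "confidential" is a made-up label
  if cloud_eligible "$label"; then
    echo "$label: cloud allowed"
  else
    echo "$label: local only"
  fi
done
```

Because the config reads the key from `${NVIDIA_API_KEY}`, make sure that environment variable is set before starting the stack.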

Troubleshooting

OpenShell Kernel Module Fails to Load

bash
# Check kernel module status
nemoclaw diagnose openshell

# If using a custom kernel, ensure eBPF is enabled
# and the kernel version is 5.15+
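
The two conditions mentioned in the comments above (kernel 5.15 or later, eBPF enabled) can be checked with standard tools. This is a rough sketch independent of `nemoclaw diagnose`: `kernel_at_least` is a hypothetical helper, and the presence of `/sys/fs/bpf` is only a typical indicator of BPF support, not proof that every feature OpenShell needs is available.

```bash
# kernel_at_least RELEASE MAJOR MINOR: succeeds when a `uname -r` style
# release string (e.g. "6.8.0-1012-nvidia") is at least MAJOR.MINOR.
kernel_at_least() {
  major=${1%%.*}            # text before the first dot
  rest=${1#*.}              # everything after "MAJOR."
  minor=${rest%%[!0-9]*}    # leading digits of the remainder
  [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "${minor:-0}" -ge "$3" ]; }
}

if kernel_at_least "$(uname -r)" 5 15; then
  echo "kernel $(uname -r): OK (5.15+)"
else
  echo "kernel $(uname -r): older than 5.15"
fi

# The BPF filesystem mount point typically exists only on kernels with BPF support.
if [ -d /sys/fs/bpf ]; then
  echo "bpffs: present"
else
  echo "bpffs: not found"
fi
```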

Out of Memory When Loading the Model

bash
# Check available GPU memory
nemoclaw diagnose gpu

# Switch to a smaller quantization or model
nemoclaw model pull nemotron-120b-moe-int2  # Smaller but less accurate
nemoclaw model pull nemotron-nano-4b  # Much smaller

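Next Steps

You now have a fully working NemoClaw deployment on DGX Spark. From here you can:

  1. Customize the security policies for your specific use case
  2. Build custom blueprints for your organization's agent workflows
  3. Integrate with your existing SIEM and observability tools
  4. Scale to multi-node deployments with NemoClaw cluster mode

See the next article in this series for a deep dive into OpenShell's secure runtime.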
