NemoClaw Architecture Deep Dive

Eric Ericsson

@eericsson

March 19, 2026

NemoClaw is not a single product — it is a layered security architecture designed to make autonomous AI agents safe for production deployment. This post walks through each layer, explains how they interact, and provides the technical context you need to evaluate NemoClaw for your own infrastructure.

Architecture Overview

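The NemoClaw stack consists of four primary layers, each addressing a different dimension of agent security. Three of them appear in the diagram below, between the agent application and the host. The fourth, the Network Policy Engine, filters traffic to and from agent sandboxes and is described under Layer 4.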

┌─────────────────────────────────────────────┐
│              Agent Application               │
│         (OpenClaw Agent Framework)           │
├─────────────────────────────────────────────┤
│             Privacy Router                    │
│    (Local vs. Cloud Model Routing)           │
├─────────────────────────────────────────────┤
│           Nemotron Policy Engine              │
│    (120B MoE Intent Classification)          │
├─────────────────────────────────────────────┤
│             OpenShell Runtime                 │
│    (Kernel-Level Sandbox + Isolation)        │
├─────────────────────────────────────────────┤
│          Host OS / Hardware (DGX)             │
└─────────────────────────────────────────────┘

Each layer operates independently and can be deployed standalone, but the full stack provides defense-in-depth security that no single layer can achieve alone.

Layer 1: OpenShell Security Runtime

OpenShell is the foundation of NemoClaw's security model. It provides kernel-level sandboxing for agent execution, ensuring that even a compromised agent cannot escape its security boundary.

How OpenShell Works

OpenShell uses a combination of Linux kernel namespaces, seccomp-BPF filters, and NVIDIA's custom eBPF programs to create isolated execution environments for each agent task:

yaml
# openshell-policy.yaml
apiVersion: openshell.nvidia.com/v1
kind: SandboxPolicy
metadata:
  name: customer-support-agent
spec:
  isolation:
    network: restricted
    filesystem: read-only
    syscalls: minimal
  resources:
    maxMemory: 4Gi
    maxCPU: 2
    gpuAccess: inference-only
  permissions:
    allowedAPIs:
      - crm.read
      - crm.update
      - ticket.create
      - ticket.resolve
    deniedAPIs:
      - admin.*
      - billing.*
      - user.delete
  auditLog:
    enabled: true
    destination: siem://security-events

Every system call made by the agent is intercepted by OpenShell's eBPF layer, classified against the policy, and then allowed, denied, or escalated for human approval. The allow/deny decision runs in kernel space and adds less than 50 microseconds of latency per system call. An escalated call additionally waits on the operator approval workflow described in the next section.
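
The allow/deny part of that decision can be illustrated in user space with the `permissions` section of the policy above. This is a minimal sketch, not OpenShell's implementation: the function name `evaluate_api_call` is hypothetical, and it assumes that deny patterns take precedence over allow patterns and that anything not explicitly allowed is denied.

```python
from fnmatch import fnmatchcase

# Values copied from the permissions section of openshell-policy.yaml above.
ALLOWED_APIS = ["crm.read", "crm.update", "ticket.create", "ticket.resolve"]
DENIED_APIS = ["admin.*", "billing.*", "user.delete"]


def evaluate_api_call(api: str) -> str:
    """Return "allow" or "deny" for an API name.

    Assumed semantics (not confirmed by the policy format itself):
    deny patterns win over allow patterns, and the default is deny.
    """
    if any(fnmatchcase(api, pattern) for pattern in DENIED_APIS):
        return "deny"
    if any(fnmatchcase(api, pattern) for pattern in ALLOWED_APIS):
        return "allow"
    return "deny"
```

Under these assumptions, `crm.read` is allowed, `billing.refund` is denied by the `billing.*` pattern, and an unlisted name such as `crm.delete` falls through to the default deny.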

Operator Approval Workflows

For high-risk operations — deleting data, modifying infrastructure, sending external communications — OpenShell can pause agent execution and route the action to a human operator for approval:

typescript
// Approval workflow configuration
const approvalPolicy = {
  triggers: [
    { action: 'data.delete', threshold: 'always' },
    { action: 'infra.modify', threshold: 'always' },
    { action: 'email.send', threshold: 'external-only' },
    { action: 'payment.process', threshold: 'above-100-usd' },
  ],
  channels: ['slack', 'teams', 'pagerduty'],
  timeout: '15m',
  defaultAction: 'deny',
};

This ensures that agents can operate autonomously for routine tasks while maintaining human oversight for consequential actions.
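
To make the trigger semantics concrete, here is a small Python sketch of how such a policy might be evaluated. The `Action` fields and the interpretation of each threshold string are assumptions made for illustration; the configuration above does not define them.

```python
from dataclasses import dataclass
from typing import Optional

# Triggers copied from the approval workflow configuration above.
TRIGGERS = {
    "data.delete": "always",
    "infra.modify": "always",
    "email.send": "external-only",
    "payment.process": "above-100-usd",
}


@dataclass
class Action:
    name: str                 # e.g. "email.send"
    external: bool = False    # hypothetical: recipient is outside the organization
    amount_usd: float = 0.0   # hypothetical: payment amount, if any


def requires_approval(action: Action) -> bool:
    """Decide whether an action must pause for a human operator."""
    threshold = TRIGGERS.get(action.name)
    if threshold is None:
        return False          # no trigger: routine action
    if threshold == "external-only":
        return action.external
    if threshold == "above-100-usd":
        return action.amount_usd > 100
    return True               # "always", or an unknown threshold (fail closed)


def resolve(action: Action, operator_decision: Optional[str]) -> str:
    """operator_decision is "approve", "deny", or None if the timeout elapsed."""
    if not requires_approval(action):
        return "allow"
    if operator_decision == "approve":
        return "allow"
    return "deny"             # explicit denial, or timeout with defaultAction 'deny'
```

The fail-closed branches mirror `defaultAction: 'deny'`: when no operator responds within the timeout, the action does not proceed.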

Layer 2: Nemotron 120B MoE Policy Engine

The Nemotron 120B Mixture-of-Experts model serves as NemoClaw's intelligent policy evaluation engine. Unlike traditional rule-based security systems, Nemotron can understand the intent behind agent actions and evaluate them against natural-language security policies.

Intent Classification

When an agent requests an action, Nemotron classifies the intent across multiple dimensions:

  • Sensitivity: How sensitive is the data involved? (public, internal, confidential, restricted)
  • Reversibility: Can this action be undone? (fully reversible, partially reversible, irreversible)
  • Scope: How many systems or users are affected? (single, team, organization, external)
  • Compliance: Does this action fall under any regulatory frameworks? (GDPR, HIPAA, SOC 2, PCI-DSS)

Classification runs in under 200ms on a single A100 GPU, and at a median of under 50ms on a DGX Spark with the quantized model variant (see Performance Benchmarks for the p50 and p99 figures).
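
One way to picture the classifier's output is as a typed record with one field per dimension. The sketch below is an assumption about shape only: the class names and the `is_high_risk` rule are hypothetical and are not taken from Nemotron's actual output format.

```python
from dataclasses import dataclass
from enum import Enum


class Sensitivity(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"
    RESTRICTED = "restricted"


class Reversibility(Enum):
    FULLY_REVERSIBLE = "fully reversible"
    PARTIALLY_REVERSIBLE = "partially reversible"
    IRREVERSIBLE = "irreversible"


class Scope(Enum):
    SINGLE = "single"
    TEAM = "team"
    ORGANIZATION = "organization"
    EXTERNAL = "external"


@dataclass(frozen=True)
class IntentClassification:
    sensitivity: Sensitivity
    reversibility: Reversibility
    scope: Scope
    compliance: frozenset = frozenset()  # e.g. frozenset({"GDPR", "HIPAA"})

    def is_high_risk(self) -> bool:
        # Hypothetical rule for illustration: any single "worst case"
        # value on a dimension marks the action as high risk.
        return (
            self.reversibility is Reversibility.IRREVERSIBLE
            or self.sensitivity is Sensitivity.RESTRICTED
            or self.scope is Scope.EXTERNAL
        )
```

A record like this gives downstream components, such as the approval workflow, a structured value to branch on instead of free-form model output.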

Natural Language Policies

Security teams can define policies in plain English, which Nemotron interprets and enforces:

Policy: "Customer support agents may access customer records for active tickets only.
         They may not access financial data, modify account settings, or communicate
         with customers outside of the ticketing system. All PII must be redacted
         from internal logs."

Nemotron converts these natural-language policies into executable security rules, bridging the gap between security team intent and technical enforcement.

Layer 3: Privacy Router

The Privacy Router is NemoClaw's intelligent model routing layer. It determines whether each agent task should be processed by a local model (Nemotron running on-premises) or routed to a cloud model endpoint, based on the data sensitivity classification.

Routing Logic

python
# Simplified Privacy Router logic
def route_request(request: AgentRequest) -> ModelEndpoint:
    sensitivity = classify_sensitivity(request.context)

    if sensitivity in ['restricted', 'confidential']:
        # Highly sensitive data stays local
        return local_nemotron_endpoint

    if sensitivity == 'internal':
        # Internal data can use cloud with encryption
        return cloud_endpoint_with_e2e_encryption

    if sensitivity == 'public':
        # Public data can use any endpoint for best performance
        return optimal_cloud_endpoint

    # Default: local processing
    return local_nemotron_endpoint

The Privacy Router maintains a real-time classification cache, so repeated requests with similar context are routed without re-evaluation. In benchmarks, the router adds less than 5ms of latency to the request pipeline.
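
The caching idea can be sketched as follows. How the router decides that two contexts are "similar" is not specified, so this example substitutes a much simpler stand-in: it treats contexts as equal when they match after lowercasing and whitespace normalization. The class name `ClassificationCache` and the injected `classify` callable are hypothetical.

```python
import hashlib
from typing import Callable, Dict


class ClassificationCache:
    """Memoize sensitivity labels so repeated contexts skip re-evaluation."""

    def __init__(self, classify: Callable[[str], str]) -> None:
        self._classify = classify          # the expensive classifier call
        self._cache: Dict[str, str] = {}
        self.misses = 0                    # how often the classifier actually ran

    @staticmethod
    def _key(context: str) -> str:
        # Stand-in for "similar context": case and whitespace are ignored.
        normalized = " ".join(context.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def sensitivity(self, context: str) -> str:
        key = self._key(context)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._classify(context)
        return self._cache[key]
```

A production cache would also need eviction and invalidation when policies change; both are omitted here to keep the sketch short.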

Data Residency Compliance

For organizations operating under data residency requirements (EU GDPR, China's PIPL, etc.), the Privacy Router can enforce geographic routing constraints:

yaml
privacyRouter:
  residencyRules:
    - region: EU
      dataTypes: [personalData, financialData]
      allowedEndpoints: [eu-west-1-local, eu-central-1-local]
    - region: CN
      dataTypes: [all]
      allowedEndpoints: [cn-north-1-local]
  fallback: local-only
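
A short Python sketch shows one plausible reading of these rules. The matching semantics are assumptions: a rule applies when its region matches and its `dataTypes` list contains either the request's data type or `all`, and the `fallback` value is returned when no rule applies. The function name `allowed_endpoints` is hypothetical.

```python
from typing import List

# Rules copied from the residency configuration above.
RESIDENCY_RULES = [
    {
        "region": "EU",
        "dataTypes": ["personalData", "financialData"],
        "allowedEndpoints": ["eu-west-1-local", "eu-central-1-local"],
    },
    {
        "region": "CN",
        "dataTypes": ["all"],
        "allowedEndpoints": ["cn-north-1-local"],
    },
]
FALLBACK = "local-only"


def allowed_endpoints(region: str, data_type: str) -> List[str]:
    """Return the endpoints a request may use, or the fallback marker."""
    for rule in RESIDENCY_RULES:
        if rule["region"] != region:
            continue
        if "all" in rule["dataTypes"] or data_type in rule["dataTypes"]:
            return list(rule["allowedEndpoints"])
    # Assumed: no matching rule means the request is handled locally only.
    return [FALLBACK]
```

Under this reading, EU personal data may use either EU endpoint, every data type in the CN region is pinned to `cn-north-1-local`, and anything unmatched falls back to `local-only`.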

Layer 4: Network Policy Engine

The Network Policy Engine controls what external resources an agent can access. It operates as a transparent proxy, inspecting and filtering all outbound network requests from agent sandboxes.

Policy Definition

yaml
networkPolicy:
  name: sales-ops-agent
  egress:
    allow:
      - domain: "*.salesforce.com"
        methods: [GET, POST, PATCH]
      - domain: "api.hubspot.com"
        methods: [GET]
      - domain: "smtp.company.com"
        ports: [587]
    deny:
      - domain: "*"  # deny all other outbound traffic
  ingress:
    allow:
      - source: "webhook.salesforce.com"
        path: "/api/v1/events"
  inspection:
    tlsDecrypt: true
    logPayloads: false
    scanForPII: true

The Network Policy Engine supports TLS interception for outbound requests (with proper certificate management), allowing it to scan request payloads for accidental PII leakage.
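
The egress rules above can be approximated in a few lines of Python. This is a sketch under assumptions, not the engine's matching algorithm: it uses shell-style wildcard matching from the standard library's `fnmatch` module, treats a rule without `methods` or `ports` as unconstrained on that field, and models the catch-all `deny` entry as the default result. The function name `egress_allowed` is hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Allow rules copied from the sales-ops-agent network policy above.
EGRESS_ALLOW = [
    {"domain": "*.salesforce.com", "methods": ["GET", "POST", "PATCH"]},
    {"domain": "api.hubspot.com", "methods": ["GET"]},
    {"domain": "smtp.company.com", "ports": [587]},
]


def egress_allowed(host: str,
                   method: Optional[str] = None,
                   port: Optional[int] = None) -> bool:
    """Return True if an outbound request matches any allow rule."""
    host = host.lower()
    for rule in EGRESS_ALLOW:
        if not fnmatchcase(host, rule["domain"]):
            continue
        if "methods" in rule and method not in rule["methods"]:
            continue
        if "ports" in rule and port not in rule["ports"]:
            continue
        return True
    # Equivalent to the final deny rule: domain "*" blocks everything else.
    return False
```

Note that with this style of matching, `*.salesforce.com` covers subdomains such as `na1.salesforce.com` but not the bare `salesforce.com`, which would need its own entry.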

Putting It All Together

When an OpenClaw agent receives a task, the request flows through the NemoClaw stack as follows:

  1. OpenClaw receives the task and constructs an execution plan
  2. Privacy Router classifies the data sensitivity and selects the appropriate model endpoint
  3. Nemotron evaluates the execution plan against security policies and classifies intent
  4. OpenShell creates an isolated sandbox for the task execution
  5. Network Policy Engine configures the sandbox's network access based on the agent's role
  6. The agent executes within the sandbox, with every action audited
  7. High-risk actions are escalated to human operators for approval
  8. Results are returned through the Privacy Router, with PII redacted from logs

This entire pipeline adds approximately 300ms of latency to the first request in a session, and under 100ms for subsequent requests (due to caching). For most enterprise workloads, this overhead is negligible compared to the model inference time.
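
As a rough sanity check, the first-request figure can be compared against the per-stage DGX Spark numbers quoted elsewhere in this post. The sketch assumes the stages run sequentially and that nothing else contributes to the overhead, neither of which is stated explicitly.

```python
# Per-stage figures quoted in this post for a DGX Spark, in milliseconds.
ROUTER_MAX_MS = 5            # Privacy Router: "less than 5ms"
POLICY_EVAL_P50_MS = 45      # policy evaluation latency, p50
POLICY_EVAL_P99_MS = 180     # policy evaluation latency, p99
SANDBOX_CREATION_MS = 120    # sandbox creation time
NETWORK_POLICY_MS = 15       # network policy application


def first_request_overhead_ms(policy_eval_ms: int) -> int:
    """Sum the stages, assuming they run one after another."""
    return (ROUTER_MAX_MS + policy_eval_ms
            + SANDBOX_CREATION_MS + NETWORK_POLICY_MS)


typical_ms = first_request_overhead_ms(POLICY_EVAL_P50_MS)  # 185
tail_ms = first_request_overhead_ms(POLICY_EVAL_P99_MS)     # 320
```

The quoted figure of approximately 300ms falls between the median-case sum (185ms) and the tail-case sum (320ms), so the per-stage numbers and the end-to-end estimate are at least mutually consistent.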

Performance Benchmarks

On a single DGX Spark:

Metric                              Value
Policy evaluation latency (p50)     45ms
Policy evaluation latency (p99)     180ms
Sandbox creation time               120ms
Network policy application          15ms
Throughput (concurrent agents)      64
Memory overhead per sandbox         256MB

On a DGX H100 cluster (8 GPUs):

Metric                              Value
Policy evaluation latency (p50)     12ms
Policy evaluation latency (p99)     45ms
Throughput (concurrent agents)      512

Getting Started

The NemoClaw architecture documentation is available on GitHub at nvidia/nemoclaw. Each layer can be deployed independently, so you can adopt NemoClaw incrementally — starting with OpenShell sandboxing and adding the other layers as your security requirements evolve.

In the next post, we'll walk through a hands-on tutorial for deploying the full NemoClaw stack on a DGX Spark.
