114 changes: 64 additions & 50 deletions compliance/virtualmachines/roxagent/README.md
@@ -1,91 +1,105 @@
# VM Agent
# roxagent

Runs inside VMs to scan for vulnerabilities and report back to the host via vsock.
While not directly related to the `compliance` feature, the agent utilizes compliance
node scanning code for package scanning in the virtual machine.
Runs inside KubeVirt VMs to scan for vulnerabilities and serve reports to Sensor
via VSOCK (pull mode). While co-located under `compliance/`, the agent reuses
compliance node-scanning code for package indexing — it is not part of the
Compliance Operator feature.

## What it does

Scans the VM for installed packages (`rpm`/`dnf` databases), creates vulnerability reports, and sends them to the host over vsock. Can run once or continuously in daemon mode. Requires `sudo` privileges to scan packages.
1. Scans the VM for installed packages (`rpm`/`dnf` databases) using the same
Scanner V4 indexer as node scanning.
2. Caches the scan report in memory with a generation counter.
3. Listens on a VSOCK port for incoming connections from Sensor.
4. When Sensor connects, serves the cached report via the `VMServiceRequest` /
`VMServiceResponse` framed protobuf protocol.
5. Periodically rescans to pick up package changes.

Sensor pulls reports from all running VMs on a timer and forwards them to
Central for vulnerability matching.
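
The framed exchange in step 4 can be illustrated with a minimal sketch. It assumes a 4-byte big-endian length prefix in front of each serialized message; the README does not specify the wire framing, so this layout, the `maxFrameSize` limit, and the helper names `writeFrame`, `readFrame`, and `roundTrip` are hypothetical stand-ins rather than roxagent's actual implementation. Plain byte slices stand in for serialized `VMServiceRequest` / `VMServiceResponse` messages.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// maxFrameSize is a hypothetical upper bound that guards against
// oversized or corrupt length prefixes.
const maxFrameSize = 16 << 20

// writeFrame writes a 4-byte big-endian length followed by the payload.
// The framing layout is an assumption made for this sketch.
func writeFrame(w io.Writer, payload []byte) error {
	if len(payload) > maxFrameSize {
		return errors.New("frame too large")
	}
	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], uint32(len(payload)))
	if _, err := w.Write(hdr[:]); err != nil {
		return err
	}
	_, err := w.Write(payload)
	return err
}

// readFrame reads one length-prefixed frame and returns its payload.
func readFrame(r io.Reader) ([]byte, error) {
	var hdr [4]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, err
	}
	n := binary.BigEndian.Uint32(hdr[:])
	if n > maxFrameSize {
		return nil, errors.New("frame too large")
	}
	payload := make([]byte, n)
	if _, err := io.ReadFull(r, payload); err != nil {
		return nil, err
	}
	return payload, nil
}

// roundTrip frames msg into an in-memory buffer and reads it back,
// standing in for one request or response crossing the VSOCK connection.
func roundTrip(msg string) (string, error) {
	var buf bytes.Buffer
	if err := writeFrame(&buf, []byte(msg)); err != nil {
		return "", err
	}
	got, err := readFrame(&buf)
	return string(got), err
}

func main() {
	got, err := roundTrip("serialized response bytes")
	fmt.Println(got, err)
}
```

Bounding the frame size before allocating means a bad length prefix from the peer cannot force a large allocation inside the VM.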

## Usage

```bash
# Single scan
sudo ./roxagent

# Daemon mode (scans every 4 hours by default)
sudo ./roxagent --daemon
# Start pull-mode server (production mode)
sudo ./roxagent serve

# Custom settings
sudo ./roxagent --daemon --index-interval 10m --host-path /custom/path --port 2048
sudo ./roxagent serve --port 818 --host-path / --rescan-interval 4h
```

## Flags
## Flags (`serve` subcommand)

- `--daemon` - Run continuously (default: false).
- `--index-interval` - Time between scans in daemon mode (default: 4h, minimum: 10m).
- `--host-path` - Where to look for package databases (default: /).
- `--max-initial-report-delay` - Max delay before starting to send in daemon mode (default: 20m).
- `--port` - VSock port (default: 818).
- `--repo-cpe-url` - URL for the repository to CPE mapping.
- `--timeout` - VSock client timeout when sending index reports.
- `--verbose` - Prints the index reports to stdout.
- `--port` — VSOCK port to listen on (default: 818).
- `--host-path` — Root filesystem path for package indexing (default: /).
- `--repo-cpe-url` — URL for the repository-to-CPE mapping file.
- `--rescan-interval` — Interval between periodic rescans (default: 4h).
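
As a rough illustration of these flags and their defaults, the sketch below parses them with Go's standard `flag` package. This is a stand-in only: roxagent itself builds its commands with cobra, and the `serveConfig` type and `parseServeFlags` function are hypothetical names invented for this example. The `--repo-cpe-url` default is taken from the `repoToCPEMappingURL` constant in `cmd.go`.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// serveConfig is a hypothetical holder for the serve subcommand's settings.
type serveConfig struct {
	port           uint
	hostPath       string
	repoCPEURL     string
	rescanInterval time.Duration
}

// parseServeFlags parses serve-style flags using the defaults listed above.
// The standard flag package accepts both -flag and --flag spellings.
func parseServeFlags(args []string) (serveConfig, error) {
	var cfg serveConfig
	fs := flag.NewFlagSet("serve", flag.ContinueOnError)
	fs.UintVar(&cfg.port, "port", 818, "VSOCK port to listen on")
	fs.StringVar(&cfg.hostPath, "host-path", "/", "root filesystem path for package indexing")
	fs.StringVar(&cfg.repoCPEURL, "repo-cpe-url",
		"https://security.access.redhat.com/data/metrics/repository-to-cpe.json",
		"URL for the repository-to-CPE mapping file")
	fs.DurationVar(&cfg.rescanInterval, "rescan-interval", 4*time.Hour, "interval between periodic rescans")
	err := fs.Parse(args)
	return cfg, err
}

func main() {
	cfg, err := parseServeFlags([]string{"--port", "2048", "--rescan-interval", "10m"})
	fmt.Println(cfg.port, cfg.hostPath, cfg.rescanInterval, err)
}
```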

## How it works

1. Scans filesystem for `rpm`/`dnf` package databases.
2. Pulls repo-to-CPE mappings from Red Hat. Network connection to the public Internet or to Sensor is required.
3. Creates protobuf index report.
4. Sends report to host via vsock.
1. Performs an initial scan of the VM filesystem for package databases.
2. Fetches repo-to-CPE mappings from Red Hat (requires network access).
3. Starts a VSOCK listener with optional mTLS (KubeVirt CA).
4. On each Sensor connection: reads a `VMServiceRequest`, dispatches by method,
returns the cached `VMServiceResponse` with the index report.
5. On rescan timer: re-indexes the filesystem and atomically swaps the cached
report, incrementing the generation counter.
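
The atomic swap in step 5 might look roughly like the sketch below. It is not roxagent's actual code: the `snapshot` and `reportCache` types and their methods are hypothetical, a byte slice stands in for the index report, and the sketch assumes a single writer (the rescan loop) with any number of concurrent readers (Sensor connections).

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// snapshot pairs a report with the generation it was produced in.
// Snapshots are immutable once published.
type snapshot struct {
	generation uint64
	report     []byte // stand-in for the serialized index report
}

// reportCache holds the most recent snapshot behind an atomic pointer,
// so readers never observe a half-updated report.
type reportCache struct {
	current atomic.Pointer[snapshot]
}

// swap publishes a new report and returns its generation.
// It assumes only one goroutine calls swap at a time.
func (c *reportCache) swap(report []byte) uint64 {
	next := &snapshot{generation: 1, report: report}
	if prev := c.current.Load(); prev != nil {
		next.generation = prev.generation + 1
	}
	c.current.Store(next)
	return next.generation
}

// load returns the current generation and report, or ok=false if no
// scan has completed yet.
func (c *reportCache) load() (gen uint64, report []byte, ok bool) {
	s := c.current.Load()
	if s == nil {
		return 0, nil, false
	}
	return s.generation, s.report, true
}

func main() {
	var cache reportCache
	cache.swap([]byte("initial scan"))
	cache.swap([]byte("rescan"))
	gen, report, ok := cache.load()
	fmt.Println(gen, string(report), ok)
}
```

Publishing a fresh immutable snapshot on every rescan keeps the read path lock-free, and the generation number lets a caller tell whether the report changed between two pulls.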

The host receives these reports and forwards them to StackRox Central for vulnerability analysis.
### TLS

## Single-instance protection
When running inside a KubeVirt VM with TLS enabled:
- roxagent fetches the KubeVirt CA from the host (VSOCK CID 2, port 1).
- Connections from Sensor (via virt-handler) present a client cert signed by
the KubeVirt CA, which roxagent validates.
- roxagent uses a self-signed server cert (virt-handler does not validate it).
- The CA is refreshed hourly to support rotation.

- roxagent uses a file lock at `/run/lock/roxagent/roxagent.lock` to prevent overlapping scans on the same VM.
- In single-scan mode, if another agent is already running, the new invocation exits with an error.
- In daemon mode, if another agent is scanning, the current cycle is skipped and retried at the next interval.
- If the lock cannot be acquired (permissions or missing directory), a warning is logged and the scan proceeds without protection.
- The lock is shared between host-binary and Quadlet/Podman runs via a bind mount.
If the KubeVirt CA is unavailable, roxagent falls back to plaintext VSOCK
(RBAC on the KubeVirt subresource still gates access).
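
A rough sketch of the mTLS-or-plaintext decision described above, using Go's standard `crypto/tls`. The names `serverTLSConfig` and `usesTLS` are hypothetical and not roxagent's API; fetching the CA from CID 2 port 1 and the hourly refresh are left out, and the sketch only assumes the CA arrives as PEM bytes.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
)

// serverTLSConfig returns a TLS config that requires client certificates
// signed by the given CA. It returns nil when no usable CA is available,
// in which case the caller would serve plaintext VSOCK instead.
func serverTLSConfig(caPEM []byte, serverCert tls.Certificate) *tls.Config {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		// No certificate could be parsed from the PEM input.
		return nil
	}
	return &tls.Config{
		Certificates: []tls.Certificate{serverCert},
		ClientCAs:    pool,
		ClientAuth:   tls.RequireAndVerifyClientCert,
		MinVersion:   tls.VersionTLS12,
	}
}

// usesTLS reports whether the given CA bytes would enable mTLS.
func usesTLS(caPEM []byte) bool {
	return serverTLSConfig(caPEM, tls.Certificate{}) != nil
}

func main() {
	// With no CA available, the sketch falls back to plaintext.
	fmt.Println("TLS enabled:", usesTLS(nil))
}
```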

## Deployment

### Using Quadlet (Recommended for RHEL VMs)
### Native systemd service (CI / dev)

For RHEL 9 VMs, use Podman Quadlet to run roxagent as a periodic systemd service.
See [quadlet/README.md](quadlet/README.md) for detailed instructions.
The CI script `scripts/ci/add-vms/install-agent-native.sh` builds roxagent,
copies it into the VM via `virtctl scp`, and enables a systemd service:

```bash
cd quadlet
./install.sh # Install locally
./install.sh user@host # Install on remote VM
# roxagent-serve.service runs: /usr/local/bin/roxagent serve
```

### Building from Source
### Quadlet (RHEL VMs)

See [quadlet/README.md](quadlet/README.md) for Podman Quadlet deployment.
Note: Quadlet units may still reference the old push-mode entrypoint and need
updating for pull mode.

### Building from source

```bash
# For the current platform
go build -o roxagent .

# For Linux VMs
GOOS=linux GOARCH=amd64 go build -o roxagent-linux .
# Cross-compile for Linux VMs
GOOS=linux GOARCH=amd64 go build -o roxagent .
```

## Troubleshooting

### Can't connect to host
### Can't connect / dial failures from Sensor

- Check if vsock is enabled in the VM.
- Verify the port isn't in use.
- Make sure vsock kernel modules are loaded.
- Verify VSOCK is enabled on the VMI spec (`spec.domain.devices.autoattachVSOCK`).
- Check that the VSOCK port isn't in use by another process inside the VM.
- Ensure Sensor has RBAC for `virtualmachineinstances/vsock` on `subresources.kubevirt.io`.

### No packages found
### No packages found (zero-package reports)

- Check `--host-path` points to the right place.
- Check `--host-path` points to the correct root filesystem.
- Verify `rpm`/`dnf` databases exist and are readable.
- Use `--verbose` to examine the index report and compare with the content from `rpm`/`dnf` databases.
- Check Sensor logs for `reportcheck` warnings.

### Scan failures
### TLS handshake failures

- Check internet access for repo-to-CPE downloads.
- Look at logs for specific errors.
- Verify KubeVirt has TLS enabled (check virt-handler logs).
- Check that roxagent can reach CID 2 port 1 (KubeVirt CA service).
- Look for "Rejected plaintext connection" in roxagent logs (Sensor not using TLS).
90 changes: 2 additions & 88 deletions compliance/virtualmachines/roxagent/cmd/cmd.go
@@ -2,111 +2,25 @@ package cmd

import (
	"context"
	"fmt"
	"time"

	"github.com/spf13/cobra"
	"github.com/stackrox/rox/compliance/virtualmachines/roxagent/common"
	"github.com/stackrox/rox/compliance/virtualmachines/roxagent/index"
	"github.com/stackrox/rox/compliance/virtualmachines/roxagent/lock"
	"github.com/stackrox/rox/compliance/virtualmachines/roxagent/vsock"
	"github.com/stackrox/rox/pkg/logging"
)

var log = logging.LoggerForModule()

const (
	minDaemonIndexInterval = 10 * time.Minute
	repoToCPEMappingURL = "https://security.access.redhat.com/data/metrics/repository-to-cpe.json"
	repoToCPEMappingURL = "https://security.access.redhat.com/data/metrics/repository-to-cpe.json"
)

// RootCmd returns the root cobra command that dispatches to subcommands.
func RootCmd(ctx context.Context) *cobra.Command {
	cmd := cobra.Command{
		Use:          "agent",
		Short:        "Collects index reports for vulnerability scanning of virtual machines.",
		SilenceUsage: true,
	}
	cmd.SetContext(ctx)
	cfg := &common.Config{}
	cmd.Flags().BoolVar(&cfg.DaemonMode, "daemon", false,
		"Run in daemon mode. Sends index reports continuously.",
	)

	// Shortening this interval results in more frequent scans and therefore more load,
	// which, assuming the throughput continues to be limited by scanning capacity,
	// reduces the number of VMs that Stackrox can handle.
	//
	// As of February 2026, the measured capacity of a default Stackrox deployment is:
	// - 4500 VMs if the report interval is 4 hours
	// - 1100 VMs if the report interval is 1 hour
	//
	// See the documentation for more details.
	cmd.Flags().DurationVar(&cfg.IndexInterval, "index-interval", 240*time.Minute,
		fmt.Sprintf(
			"Interval at which index reports are sent in daemon mode (minimum: %v). "+
				"Shorter intervals increase scanning load and reduce the overall number of VMs that can be scanned.",
			minDaemonIndexInterval,
		),
	)
	cmd.Flags().StringVar(&cfg.IndexHostPath, "host-path", "/",
		"Path where the indexer starts searching for the RPM and DNF databases.",
	)
	cmd.Flags().DurationVar(&cfg.MaxInitialReportDelay, "max-initial-report-delay", 20*time.Minute,
		"Max delay before starting to send in daemon mode.",
	)
	cmd.Flags().StringVar(&cfg.RepoToCPEMappingURL, "repo-cpe-url", repoToCPEMappingURL,
		"URL for the repository to CPE mapping.",
	)
	cmd.Flags().DurationVar(&cfg.Timeout, "timeout", 30*time.Second,
		"VSock client timeout when sending index reports.",
	)
	cmd.Flags().BoolVar(&cfg.Verbose, "verbose", false,
		"Prints the index reports to stdout.",
	)
	cmd.Flags().Uint32Var(&cfg.VsockPort, "port", 818,
		"VSock port to connect with the virtual machine host.",
	)
	cmd.RunE = func(cmd *cobra.Command, _ []string) error {
		if err := validateDaemonConfig(cfg); err != nil {
			return err
		}

		client := &vsock.Client{
			Port:     cfg.VsockPort,
			HostPath: cfg.IndexHostPath,
			Timeout:  cfg.Timeout,
			Verbose:  cfg.Verbose,
		}
		if cfg.DaemonMode {
			if err := index.RunDaemon(ctx, cfg, client, lock.DefaultLockPath); err != nil {
				return fmt.Errorf("running indexer daemon: %w", err)
			}
			return nil
		}

		scanFn := func() error { return index.RunSingle(ctx, cfg, client) }
		onHeld := func() error {
			return fmt.Errorf("roxagent is already running (lock file is held at %s); exiting", lock.DefaultLockPath)
		}
		onUnavailable := func(lockErr error) error {
			log.Warnf("could not acquire lock at %s: %v; continuing without single-instance protection; concurrent runs may cause high host load", lock.DefaultLockPath, lockErr)
			return scanFn()
		}
		if err := lock.RunWithLock(lock.DefaultLockPath, scanFn, onHeld, onUnavailable); err != nil {
			return fmt.Errorf("running indexer: %w", err)
		}
		return nil
	}
	cmd.AddCommand(ServeCmd(ctx))
	return &cmd
}

func validateDaemonConfig(cfg *common.Config) error {
	if !cfg.DaemonMode {
		return nil
	}
	if cfg.IndexInterval < minDaemonIndexInterval {
		return fmt.Errorf("index interval must be at least %s in daemon mode (got %s)", minDaemonIndexInterval, cfg.IndexInterval)
	}
	return nil
}
14 changes: 0 additions & 14 deletions compliance/virtualmachines/roxagent/common/config.go

This file was deleted.

112 changes: 0 additions & 112 deletions compliance/virtualmachines/roxagent/index/index.go

This file was deleted.
