Automate sprint four runtime races
Authored by mfwolffe <wolffemf@dukes.jmu.edu>
- SHA: 08a18d9ae4692efc2a56b02ab5ef38087b744dae
- Parents: 649608a
- Tree: 8b3968a

| Status | File | + | - |
|---|---|---|---|
| M | README.md | 1 | 0 |
| M | RELEASE_NOTES.md | 1 | 0 |
| M | examples/sprint-04-validation-report-2026-02-20.md | 10 | 2 |
| M | examples/sprint-04-validation.md | 13 | 12 |
| A | examples/validate-sprint-04-runtime.sh | 158 | 0 |

README.md (modified)
@@ -51,6 +51,7 @@ when the X11 prompt backend is unavailable.
| 51 | 51 | 4. `examples/validate-sprint-02.sh` |
| 52 | 52 | 5. `examples/validate-sprint-03-integration.sh` |
| 53 | 53 | 6. `examples/validate-sprint-04.sh` |
| 54 | +7. `examples/validate-sprint-04-runtime.sh` | |
| 54 | 55 | |
| 55 | 56 | ## Troubleshooting |
| 56 | 57 | 1. `Authorization requires authentication but no agent is available` |

RELEASE_NOTES.md (modified)
@@ -18,6 +18,7 @@
| 18 | 18 | - `examples/sprint-03-validation-report-2026-02-18.md` |
| 19 | 19 | 3. Sprint 04 reliability harness/checklist: |
| 20 | 20 | - `examples/validate-sprint-04.sh` |
| 21 | + - `examples/validate-sprint-04-runtime.sh` | |
| 21 | 22 | - `examples/sprint-04-validation.md` |
| 22 | 23 | |
| 23 | 24 | ## Known Limitations |

examples/sprint-04-validation-report-2026-02-20.md (modified)
@@ -3,10 +3,12 @@
| 3 | 3 | ## Scope |
| 4 | 4 | 1. Hardening regression checks after Sprint 04 code changes. |
| 5 | 5 | 2. Automated reliability checks for daemon restart resilience. |
| 6 | +3. Runtime race validation for active prompt interruption paths. | |
| 6 | 7 | |
| 7 | 8 | ## Commands |
| 8 | 9 | 1. `cargo test --workspace` |
| 9 | 10 | 2. `./examples/validate-sprint-04.sh` (executed with default `stub` backend) |
| 11 | +3. `./examples/validate-sprint-04-runtime.sh` (executed with `polkit` backend) | |
| 10 | 12 | |
| 11 | 13 | ## Results |
| 12 | 14 | 1. Workspace tests passed (`39` garcard tests + workspace crates). |
@@ -16,6 +18,13 @@
| 16 | 18 | - post-restart status and auth summary remained healthy (`idle`) |
| 17 | 19 | 3. Optional interactive `pkcheck` loop was intentionally skipped in this run: |
| 18 | 20 | - requires live polkit challenge flow and operator interaction. |
| 21 | +4. Runtime race harness passed for both previously manual checks: | |
| 22 | + - active prompt + daemon restart (`garcardctl quit`) | |
| 23 | + - active prompt + `SIGTERM` | |
| 24 | +5. Runtime log evidence (`target/garcard-sprint04-runtime.log`) confirms: | |
| 25 | + - auth request reached active processing before interruption | |
| 26 | + - daemon shutdown/termination unregistered cleanly | |
| 27 | + - relaunch succeeded with healthy `status` and `auth-summary` | |
| 19 | 28 | |
| 20 | 29 | ## Hardening Outcomes Confirmed |
| 21 | 30 | 1. IPC control path now validates same-UID peer credentials. |
@@ -23,5 +32,4 @@
| 23 | 32 | 3. Helper response buffers are scrubbed after sending to helper socket. |
| 24 | 33 | |
| 25 | 34 | ## Remaining Manual Sprint 04 Checks |
| 26 | -1. Daemon restart during an active prompt in polkit mode. | |
| 27 | -2. Session shutdown/logout race while prompt is active. | |
| 35 | +1. Optional interactive acceptance pass (enter valid credentials, wrong-then-retry, explicit cancel) in full desktop session. | |

examples/sprint-04-validation.md (modified)
@@ -19,24 +19,25 @@ Expected:
| 19 | 19 | 2. `garcardctl auth-summary` updates and remains responsive across iterations. |
| 20 | 20 | |
| 21 | 21 | ## Daemon Restart During Active Prompt |
| 22 | -1. Start daemon in one terminal: | |
| 23 | - - `RUST_LOG=garcard=debug GARCARD_AGENT_BACKEND=polkit cargo run -p garcard -- daemon` | |
| 24 | -2. Trigger challenge in another terminal: | |
| 25 | - - `pkcheck --allow-user-interaction --process $$ --action-id com.mesonbuild.install.run` | |
| 26 | -3. While prompt is visible, restart daemon: | |
| 27 | - - `cargo run -q -p garcardctl -- quit` | |
| 28 | - - relaunch daemon command from step 1. | |
| 29 | -4. Re-run the same `pkcheck` command. | |
| 22 | +1. Preferred automated execution: | |
| 23 | + - `./examples/validate-sprint-04-runtime.sh` | |
| 24 | +2. Manual fallback: | |
| 25 | + - start daemon with `GARCARD_AGENT_BACKEND=polkit` | |
| 26 | + - trigger `pkcheck --allow-user-interaction --process $$ --action-id com.mesonbuild.install.run` | |
| 27 | + - while prompt is visible, issue `garcardctl quit`, relaunch daemon, and retry probe. | |
| 30 | 28 | |
| 31 | 29 | Expected: |
| 32 | 30 | 1. Active prompt interruption does not wedge daemon state. |
| 33 | 31 | 2. Relaunched daemon accepts new requests with clean `auth-summary`. |
| 34 | 32 | |
| 35 | 33 | ## Session Shutdown/Logout Race |
| 36 | -1. Start daemon with debug logs. | |
| 37 | -2. Trigger an auth prompt. | |
| 38 | -3. Send `SIGTERM` to daemon PID while request is active. | |
| 39 | -4. Relaunch daemon and run `garcardctl status`. | |
| 34 | +1. Preferred automated execution: | |
| 35 | + - `./examples/validate-sprint-04-runtime.sh` | |
| 36 | +2. Manual fallback: | |
| 37 | + - start daemon with debug logs | |
| 38 | + - trigger auth prompt | |
| 39 | + - send `SIGTERM` to daemon PID while request is active | |
| 40 | + - relaunch daemon and confirm `garcardctl status`. | |
| 40 | 41 | |
| 41 | 42 | Expected: |
| 42 | 43 | 1. Daemon exits cleanly without stale socket. |
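
The "exits cleanly without stale socket" expectation above can be checked mechanically during the manual fallback. The following is a minimal sketch, not part of this commit: `assert_socket_removed` is a hypothetical helper name, and it assumes that "no stale socket" simply means nothing is left at the socket path once the daemon has exited.

```shell
# Hypothetical helper (not in the commit): succeeds only when nothing
# remains at the given socket path after the daemon has exited.
assert_socket_removed() {
  sock_path="$1"
  if [ -e "${sock_path}" ]; then
    # Something is still present at the path; report it and fail.
    echo "stale socket left behind: ${sock_path}" >&2
    return 1
  fi
  return 0
}
```

After sending `SIGTERM` and waiting for the daemon to exit, a call such as `assert_socket_removed "${PWD}/target/garcard-sprint04-runtime.sock"` (the harness's default socket path) would fail loudly if the file were still there.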

examples/validate-sprint-04-runtime.sh (added)
@@ -0,0 +1,158 @@
| 1 | +#!/usr/bin/env bash | |
| 2 | +set -euo pipefail | |
| 3 | + | |
| 4 | +SOCKET_PATH="${GARCARD_SPRINT04_RUNTIME_SOCKET:-${PWD}/target/garcard-sprint04-runtime.sock}" | |
| 5 | +LOG_FILE="${GARCARD_SPRINT04_RUNTIME_LOG:-${PWD}/target/garcard-sprint04-runtime.log}" | |
| 6 | +ACTION_ID="${GARCARD_SPRINT04_RUNTIME_ACTION_ID:-com.mesonbuild.install.run}" | |
| 7 | +PROMPT_TIMEOUT_SECS="${GARCARD_SPRINT04_RUNTIME_PROMPT_TIMEOUT_SECS:-20}" | |
| 8 | + | |
| 9 | +if ! command -v pkcheck >/dev/null 2>&1; then | |
| 10 | + echo "pkcheck not found; install polkit tools to run runtime validation" | |
| 11 | + exit 1 | |
| 12 | +fi | |
| 13 | + | |
| 14 | +if command -v garcard >/dev/null 2>&1; then | |
| 15 | + DAEMON_CMD=(garcard daemon) | |
| 16 | +else | |
| 17 | + DAEMON_CMD=(cargo run -q -p garcard -- daemon) | |
| 18 | +fi | |
| 19 | + | |
| 20 | +if command -v garcardctl >/dev/null 2>&1; then | |
| 21 | + CTL_CMD=(garcardctl) | |
| 22 | +else | |
| 23 | + CTL_CMD=(cargo run -q -p garcardctl --) | |
| 24 | +fi | |
| 25 | + | |
| 26 | +DAEMON_PID=0 | |
| 27 | +PROBE_PID=0 | |
| 28 | + | |
| 29 | +run_ctl() { | |
| 30 | + GARCARD_SOCKET="${SOCKET_PATH}" "${CTL_CMD[@]}" "$@" | |
| 31 | +} | |
| 32 | + | |
| 33 | +wait_for_daemon() { | |
| 34 | + local tries=120 | |
| 35 | + while (( tries > 0 )); do | |
| 36 | + if run_ctl ping >/dev/null 2>&1; then | |
| 37 | + return 0 | |
| 38 | + fi | |
| 39 | + sleep 0.2 | |
| 40 | + tries=$((tries - 1)) | |
| 41 | + done | |
| 42 | + echo "daemon did not become ready in time" | |
| 43 | + return 1 | |
| 44 | +} | |
| 45 | + | |
| 46 | +start_daemon() { | |
| 47 | + GARCARD_SOCKET="${SOCKET_PATH}" \ | |
| 48 | + GARCARD_AGENT_BACKEND=polkit \ | |
| 49 | + GARCARD_PROMPT_TIMEOUT_SECS="${PROMPT_TIMEOUT_SECS}" \ | |
| 50 | + RUST_LOG=garcard=debug \ | |
| 51 | + "${DAEMON_CMD[@]}" >>"${LOG_FILE}" 2>&1 & | |
| 52 | + DAEMON_PID=$! | |
| 53 | + wait_for_daemon | |
| 54 | +} | |
| 55 | + | |
| 56 | +stop_daemon_graceful() { | |
| 57 | + if [[ "${DAEMON_PID}" -gt 0 ]] && kill -0 "${DAEMON_PID}" 2>/dev/null; then | |
| 58 | + run_ctl quit >/dev/null 2>&1 || true | |
| 59 | + wait "${DAEMON_PID}" 2>/dev/null || true | |
| 60 | + fi | |
| 61 | + DAEMON_PID=0 | |
| 62 | +} | |
| 63 | + | |
| 64 | +stop_daemon_sigterm() { | |
| 65 | + if [[ "${DAEMON_PID}" -gt 0 ]] && kill -0 "${DAEMON_PID}" 2>/dev/null; then | |
| 66 | + kill -TERM "${DAEMON_PID}" 2>/dev/null || true | |
| 67 | + wait "${DAEMON_PID}" 2>/dev/null || true | |
| 68 | + fi | |
| 69 | + DAEMON_PID=0 | |
| 70 | +} | |
| 71 | + | |
| 72 | +cleanup() { | |
| 73 | + stop_daemon_graceful | |
| 74 | + rm -f "${SOCKET_PATH}" | |
| 75 | +} | |
| 76 | +trap cleanup EXIT | |
| 77 | + | |
| 78 | +wait_for_active_request() { | |
| 79 | + local tries=120 | |
| 80 | + while (( tries > 0 )); do | |
| 81 | + local summary | |
| 82 | + summary="$(run_ctl auth-summary 2>/dev/null || true)" | |
| 83 | + if [[ "${summary}" == *'"active_requests": 1'* ]]; then | |
| 84 | + return 0 | |
| 85 | + fi | |
| 86 | + sleep 0.2 | |
| 87 | + tries=$((tries - 1)) | |
| 88 | + done | |
| 89 | + echo "auth request never became active" | |
| 90 | + return 1 | |
| 91 | +} | |
| 92 | + | |
| 93 | +run_probe_background() { | |
| 94 | + local out_file="$1" | |
| 95 | + local err_file="$2" | |
| 96 | + pkcheck --allow-user-interaction --process "$$" --action-id "${ACTION_ID}" \ | |
| 97 | + >"${out_file}" 2>"${err_file}" & | |
| 98 | + PROBE_PID=$! | |
| 99 | +} | |
| 100 | + | |
| 101 | +wait_probe() { | |
| 102 | + local probe_pid="$1" | |
| 103 | + local rc_file="$2" | |
| 104 | + set +e | |
| 105 | + wait "${probe_pid}" | |
| 106 | + local rc=$? | |
| 107 | + set -e | |
| 108 | + echo "${rc}" > "${rc_file}" | |
| 109 | +} | |
| 110 | + | |
| 111 | +mkdir -p "$(dirname "${SOCKET_PATH}")" | |
| 112 | +mkdir -p "$(dirname "${LOG_FILE}")" | |
| 113 | +rm -f "${SOCKET_PATH}" "${LOG_FILE}" | |
| 114 | + | |
| 115 | +echo "Sprint 04 runtime validation" | |
| 116 | +echo " socket: ${SOCKET_PATH}" | |
| 117 | +echo " log: ${LOG_FILE}" | |
| 118 | +echo " action: ${ACTION_ID}" | |
| 119 | + | |
| 120 | +echo "[0/3] Reset temporary authorizations" | |
| 121 | +pkcheck --revoke-temp || true | |
| 122 | + | |
| 123 | +echo "[1/3] Active prompt + daemon restart" | |
| 124 | +start_daemon | |
| 125 | +probe1_out="${PWD}/target/sprint04-probe-restart.out" | |
| 126 | +probe1_err="${PWD}/target/sprint04-probe-restart.err" | |
| 127 | +probe1_rc="${PWD}/target/sprint04-probe-restart.rc" | |
| 128 | +run_probe_background "${probe1_out}" "${probe1_err}" | |
| 129 | +wait_for_active_request | |
| 130 | +run_ctl quit >/dev/null | |
| 131 | +wait "${DAEMON_PID}" 2>/dev/null || true | |
| 132 | +DAEMON_PID=0 | |
| 133 | +wait_probe "${PROBE_PID}" "${probe1_rc}" | |
| 134 | +start_daemon | |
| 135 | +run_ctl status >/dev/null | |
| 136 | +run_ctl auth-summary >/dev/null | |
| 137 | +stop_daemon_graceful | |
| 138 | + | |
| 139 | +echo "[2/3] Active prompt + SIGTERM race" | |
| 140 | +start_daemon | |
| 141 | +probe2_out="${PWD}/target/sprint04-probe-sigterm.out" | |
| 142 | +probe2_err="${PWD}/target/sprint04-probe-sigterm.err" | |
| 143 | +probe2_rc="${PWD}/target/sprint04-probe-sigterm.rc" | |
| 144 | +run_probe_background "${probe2_out}" "${probe2_err}" | |
| 145 | +wait_for_active_request | |
| 146 | +stop_daemon_sigterm | |
| 147 | +wait_probe "${PROBE_PID}" "${probe2_rc}" | |
| 148 | +start_daemon | |
| 149 | +run_ctl status | |
| 150 | +run_ctl auth-summary | |
| 151 | +stop_daemon_graceful | |
| 152 | + | |
| 153 | +echo "[3/3] Summary" | |
| 154 | +echo " restart probe rc=$(cat "${probe1_rc}")" | |
| 155 | +echo " sigterm probe rc=$(cat "${probe2_rc}")" | |
| 156 | +echo " restart probe stderr: $(tr '\n' ' ' < "${probe1_err}")" | |
| 157 | +echo " sigterm probe stderr: $(tr '\n' ' ' < "${probe2_err}")" | |
| 158 | +echo "Validation complete." | |
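
The summary step prints the raw probe exit statuses. As a reading aid, the sketch below maps a `pkcheck` exit status to a short label. `describe_probe_rc` is a hypothetical helper that is not part of the commit, and the mapping reflects the exit codes as described in pkcheck(1) to the best of my understanding, so it is worth confirming against the installed man page. Which status an interrupted prompt actually produces is not stated in this commit.

```shell
# Hypothetical helper (not in the commit): turns a pkcheck exit status
# into a human-readable label. Codes assumed from pkcheck(1).
describe_probe_rc() {
  case "$1" in
    0)   echo "authorized" ;;
    1)   echo "not authorized" ;;
    2)   echo "no agent available or interaction not allowed" ;;
    3)   echo "dismissed by user" ;;
    127) echo "error while checking authorization" ;;
    *)   echo "unexpected exit status" ;;
  esac
}
```

In the harness this could be applied to the recorded status files, for example `describe_probe_rc "$(cat "${probe1_rc}")"`. The harness itself can be pointed at other paths or actions through the environment variables it already reads, such as `GARCARD_SPRINT04_RUNTIME_SOCKET`, `GARCARD_SPRINT04_RUNTIME_LOG`, `GARCARD_SPRINT04_RUNTIME_ACTION_ID`, and `GARCARD_SPRINT04_RUNTIME_PROMPT_TIMEOUT_SECS`.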