
What Remediation Should Look Like

Detection without response is not a control — it's a report. A walk through the closed-loop remediation system: rule-level tickets, two-check auto-closure, and the audit trail that comes free with the architecture.

This is part of a series on rethinking ISO 27001 compliance from first principles. Previous articles addressed evidence structure, the questions auditors ask, and the AI system that answers them. This one tackles the gap between knowing something is wrong and actually fixing it — the remediation loop that most compliance systems leave open.

A compliance report tells you that 3 out of 100 devices aren’t encrypted. Now what?

In most organisations: nothing. The report goes into a folder. The devices stay unencrypted. Six months later, the auditor asks the same question and gets the same report with slightly different device names.

This is the gap nobody talks about. Not the evidence gap — I’ve covered that. The remediation gap. The space between knowing something is wrong and actually fixing it. Evidence collection without remediation is surveillance, not compliance. You’re watching the problem. You’re not solving it.

The standard is clear on this. Clause 10.2 requires that when a nonconformity occurs, the organisation must react, evaluate the need for corrective action, implement it, and review its effectiveness. It is essential here to distinguish between a correction — the immediate act of fixing a specific nonconformity — and corrective action — the process of eliminating the root cause to ensure the failure does not recur. That’s a closed loop: detect, fix, verify.

Most compliance implementations break the loop at the first step: they detect and report, then rely on someone reading the report and deciding to act. For endpoint devices specifically, ISO 27002 guidance suggests that as threat landscapes evolve, incident response timelines should move from manual “hours or days” to automated “minutes and real-time” capabilities to remain audit-ready.


The granularity problem

When a compliance check fails, what gets flagged? In most systems, the control. “A.8.1 is non-compliant.” That’s the notification. That’s the ticket. That’s what the IT team receives.

But A.8.1 has seven rules. Device compliance coverage. Encryption coverage. Windows EDR onboarding. Conditional Access enforcement. Linux endpoint sensor. macOS sensor health. macOS Defender configuration. Each measures something different, has a different threshold, and requires a different remediation action by a different person.

“A.8.1 is non-compliant” provides the technician with no actionable information. Which rule failed? What’s the gap? Which specific devices are affected? What admin portal do they need to open?

The right granularity is one remediation item per non-compliant rule, not per control. When A.8.1-R3 (Windows EDR onboarding) drops below 95%, the remediation item should say:

  • What failed: Windows EDR onboarding is at 91% (threshold: 95%)
  • The gap: 7 devices not onboarded (minus 2 approved exceptions = 5 actionable)
  • Which devices: List of specific device names and their users
  • Where to fix it: Microsoft Defender admin centre > Endpoints > Device inventory
  • What to do: Verify the Defender sensor is installed and reporting on each listed device

That’s actionable. A technician can read it, open the correct portal, and begin work. No interpretation required. No “what does this compliance finding actually mean?” No email thread requesting clarification from the security team.
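
As a minimal sketch of that granularity (in Python, with illustrative field names and hypothetical device names; the schema is an assumption, not the actual system), a remediation item might be modelled as:

```python
from dataclasses import dataclass

@dataclass
class RemediationItem:
    """One actionable item per non-compliant rule, never per control."""
    rule_id: str                 # e.g. "A.8.1-R3"
    what_failed: str             # measured value vs. threshold, in plain language
    gap: str                     # actionable count after approved exceptions
    affected_devices: list[str]  # specific device names and their users
    where_to_fix: str            # the exact admin portal path
    what_to_do: str              # the concrete remediation step

item = RemediationItem(
    rule_id="A.8.1-R3",
    what_failed="Windows EDR onboarding is at 91% (threshold: 95%)",
    gap="7 devices not onboarded (minus 2 approved exceptions = 5 actionable)",
    affected_devices=["DEVICE-041 (j.smith)", "DEVICE-087 (m.patel)"],  # hypothetical names
    where_to_fix="Microsoft Defender admin centre > Endpoints > Device inventory",
    what_to_do="Verify the Defender sensor is installed and reporting on each listed device",
)
```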


The ticket integration question

Here’s something that puzzles me about the compliance industry: MSPs already have ticketing systems. ConnectWise, Autotask, Halo — professional services automation tools that exist precisely to track work from identification through assignment to completion.

Compliance systems exist in a parallel universe. They produce reports. The reports are uploaded to SharePoint or a GRC platform. Someone reviews them periodically. If action is needed, someone manually creates a ticket. Maybe.

Why are these separate?

If a compliance rule fails, the expected output is a ticket in the system technicians already use. Not a finding in a compliance report that someone has to translate into a ticket. The compliance system knows which rule failed, the gap, which items are affected, and what needs to happen. The ticketing system can assign work, track progress, and verify completion. The integration point is obvious.

One ticket per non-compliant rule. Automatically created when evidence collection detects the failure. Assigned to the right team. With enough detail that the technician doesn’t need to consult the original evidence report to understand what to do.

When I first described this, I imagined the ticket detail being manually authored per rule — someone writing the “what failed, what to do, where to fix it” for each of the hundreds of rules across 93 controls. That didn’t scale. The solution was rule metadata — a structured knowledge base where each rule is annotated with its remediation context: what the rule measures in plain language, what portal to open, what steps to take, what the expected outcome looks like, and which ISO 27001 clause justifies the requirement. When a ticket is created, the system assembles the description automatically from this metadata. The technician receives a ticket that reads like it was written by someone who understood both the compliance requirement and the operational fix — because it was, once, and that understanding is now encoded and reusable across every tenant and every collection cycle.
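
A sketch of what one metadata entry and the assembly step might look like; the keys and wording here are illustrative assumptions, not the actual knowledge base:

```python
# Illustrative rule metadata entry; keys and wording are assumptions,
# not the actual knowledge base schema.
RULE_METADATA = {
    "A.8.1-R3": {
        "measures": "Percentage of Windows devices onboarded to the EDR sensor",
        "portal": "Microsoft Defender admin centre > Endpoints > Device inventory",
        "steps": [
            "Filter the device inventory for devices without a healthy sensor",
            "Re-run onboarding or repair the Defender sensor on each device",
            "Confirm each device reports a healthy sensor on the next cycle",
        ],
        "expected": "All in-scope Windows devices report a healthy EDR sensor",
        "clause": "ISO 27001 A.8.1 (user endpoint devices)",
    },
}

def build_description(rule_id: str, score: float, threshold: float) -> str:
    """Assemble a ticket description from rule metadata plus current evidence."""
    meta = RULE_METADATA[rule_id]
    steps = "\n".join(f"  {i}. {s}" for i, s in enumerate(meta["steps"], 1))
    return (
        f"What failed: {meta['measures']} is at {score:.0f}% "
        f"(threshold: {threshold:.0f}%)\n"
        f"Where to fix: {meta['portal']}\n"
        f"What to do:\n{steps}\n"
        f"Expected outcome: {meta['expected']}\n"
        f"Why it matters: {meta['clause']}"
    )
```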

And — critically — with duplicate prevention. If the same rule fails on the next evidence collection cycle, you don’t create a second ticket. Update the existing one to reflect the current gap. The technician sees one ticket per problem, not a cascade of identical findings from successive collection runs.
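
In pseudo-client form, the create-or-update logic is a few lines. The `psa` object and its methods are stand-ins, not a real ConnectWise, Autotask, or Halo API:

```python
def upsert_ticket(psa, rule_id: str, summary: str, description: str) -> int:
    """One ticket per non-compliant rule: refresh it if open, create otherwise.

    `psa` is a stand-in for your PSA client; the method names are hypothetical.
    """
    existing = psa.find_open_ticket_for_rule(rule_id)  # dedup key: the rule ID
    if existing is not None:
        # Same rule, new collection cycle: update the gap, don't duplicate.
        psa.update_ticket(existing.id, description=description)
        return existing.id
    return psa.create_ticket(summary=summary, description=description)
```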


The auto-closure problem

Creating tickets for non-compliance is the easy half. The hard half is closing them.

In most workflows, closure is manual. A technician resolves the issue and marks the ticket as resolved; someone else verifies the fix during the next review cycle. The gap between “fixed” and “verified” can run from weeks to months, and for all of that time the ticket sits in “resolved” with nobody certain the fix actually held.

What if the evidence system verified the fix automatically?

The logic is straightforward. Evidence collection runs. A previously failing rule is now passing. The evidence has changed from non-compliant to compliant. The ticket should close.

But not immediately. A single passing result might be a false positive. The device that was missing EDR might have been offline during the second scan, excluded from the denominator rather than genuinely compliant. The Conditional Access policy might have been temporarily reconfigured for testing and happened to look compliant at the moment of collection.

So you require confirmation. Not one passing result — two consecutive passing results. On two separate collection runs, the rule must pass before the system considers the remediation complete. If the rule fails again between the two checks, the counter resets. You start over.

This “two-check” logic is more than just a safeguard; it aligns with Clause 9.1’s requirement that monitoring methods must produce “comparable and reproducible results” to be considered valid under the standard. Two consecutive compliant checks are a surprisingly effective filter: they eliminate transient fixes, testing artefacts, and scan-timing coincidences. It’s not perfect — nothing is — but it’s dramatically better than “the technician said it’s fixed.”
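
A sketch of that verification logic, assuming a per-rule state record like the manifest described below (key names are illustrative):

```python
def evaluate_check(entry: dict, passed: bool) -> str:
    """Two-check auto-closure: require two consecutive passing runs."""
    if not passed:
        entry["consecutive_compliant"] = 0     # any failure resets the counter
        return "keep-open"
    entry["consecutive_compliant"] += 1
    if entry["consecutive_compliant"] >= 2:
        return "close-with-verification-note"  # verified on two separate runs
    return "await-second-check"                # one pass might be a false positive
```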

The gold standard for this feedback loop is “drift control,” a feature on platforms such as Azure Local that automatically refreshes security settings every 90 minutes to ensure any deviation from the desired state is remediated immediately.

When the second consecutive check passes, the ticket closes automatically, and a verification note documents the closure: the rule that failed, the current compliance state, the evidence timestamp, and the location of the full evidence report. The auditor can trace the original finding through remediation to verified closure without asking anyone to explain what happened.


The state manifests

This workflow requires memory. The evidence system needs to remember, per rule, the compliance state from the last check, whether there’s an open ticket, and how many consecutive passing checks have occurred.

I think of this as an evidence state manifest — a per-tenant record that tracks every rule’s compliance history. Each entry records:

  • The current status (compliant, non-compliant, collection error)
  • The associated ticket ID (if one exists)
  • The number of consecutive compliant checks
  • The last ten state transitions (for audit trail)
  • The timestamp of the last check

This isn’t complex data. It’s a JSON file that grows as rules are tracked and shrinks as tickets are closed. But it enables the entire closed-loop workflow. Without it, every evidence collection run is stateless — it knows what’s true now but has no memory of what was true before. The manifest provides the continuity that turns point-in-time snapshots into a control system.
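
A sketch of one manifest entry; the key names, values, and timestamps are illustrative, and the real schema may differ:

```python
# One entry per rule in the per-tenant evidence state manifest.
entry = {
    "rule_id": "A.8.8-R4",
    "status": "non-compliant",    # compliant | non-compliant | collection-error
    "ticket_id": 3056295,         # None when no ticket is open
    "consecutive_compliant": 0,   # drives the two-check auto-closure
    "history": [                  # recent state transitions, for the audit trail
        {"from": "compliant", "to": "non-compliant", "at": "2025-01-30T08:12:04Z"},
    ],
    "last_checked": "2025-01-30T08:12:04Z",
}
```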

The distinction matters. A point-in-time snapshot says: “A.8.8-R4 is non-compliant.” A stateful system says: “A.8.8-R4 was non-compliant on January 30th. A ticket was created (#3056295). The rule passed on February 1st (first check). It passed again on February 1st (second check). The ticket was auto-closed at 20:41:36.”

That’s an audit trail. Not a report — a trail. The auditor can see the full lifecycle of a finding from detection to resolution without asking a single question.


Resolution documentation

Auto-closure handles the simple case: the rule failed, someone fixed it, and the rule now passes. But some remediations are more complex, and the fix itself deserves documentation.

When a technician resolves a non-trivial compliance finding, the resolution should be documented in a standardised format:

  • Root cause: Why was the control failing? Not “the setting was wrong” but specifically what was wrong and why it happened.
  • Technical fix: What was changed? Numbered steps, specific enough that someone else could replicate the fix.
  • Validation: How was the fix verified? “Evidence report regenerated and uploaded” — not “checked, and it looks fine.”
  • Files modified: If the fix involved code or configuration changes, which files changed?

This sounds like overhead. It’s not. It’s Clause 10.2 made concrete. The standard requires root cause analysis, corrective action, and effectiveness review. A standardised resolution note captures all three in a format that both the technician and the auditor can understand.
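
A minimal template makes the format concrete; the four sections mirror the list above, and the example values (including the file path) are hypothetical:

```python
# A minimal resolution-note template; the sections mirror Clause 10.2's
# requirements: root cause, corrective action, effectiveness review.
RESOLUTION_TEMPLATE = """\
Root cause: {root_cause}
Technical fix:
{fix_steps}
Validation: {validation}
Files modified: {files}
"""

note = RESOLUTION_TEMPLATE.format(
    root_cause="Collection script was checking the wrong API endpoint",
    fix_steps="  1. Switched the collector to the correct endpoint\n"
              "  2. Re-ran evidence collection for the affected rule",
    validation="Rule now passes; evidence report regenerated and uploaded",
    files="collectors/a_5_13.py (hypothetical path)",
)
```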

The resolution note lives on the ticket. When the auditor asks, “I see A.5.13-R2 was non-compliant in January — what happened?”, the answer isn’t a conversation. It’s a ticket with a resolution note that says: “Root cause: the collection script was checking the wrong API endpoint. Fix: switched to the correct endpoint. Validation: rule now passes, evidence regenerated.” The auditor can verify the evidence independently. The loop is closed.


The feedback loop

Let me draw the full workflow:

  1. Collect — Evidence collection runs against the tenant. API calls, threshold evaluation, and weighted scoring.
  2. Evaluate — Rules pass or fail. The overall control compliance is calculated.
  3. Remediate — For each failing rule, a ticket is created (or an existing ticket is updated). The ticket contains specific actionable details.
  4. Fix — The technician remediates the issue. Optionally documents the resolution.
  5. Re-collect — Evidence collection runs again. The previously failing rule is re-evaluated.
  6. Verify — If the rule passes, the consecutive compliant counter increments. If it fails, the counter resets.
  7. Close — After two consecutive passing checks, the ticket auto-closes with a verification note.

This is a control system. Not a compliance exercise — a feedback loop where the output (compliance state) feeds back into the input (remediation action) until the system reaches the desired state.
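
Wired together, one pass of the loop looks roughly like this. Every function and object here is a stand-in for the pieces sketched in earlier sections, not a real API:

```python
def remediation_cycle(tenant, psa, manifest):
    """One pass of the closed loop: collect, evaluate, remediate, verify, close.

    `collect_evidence`, `evaluate`, `upsert_ticket`, and `verification_note`
    are stand-ins for the pieces sketched earlier.
    """
    results = collect_evidence(tenant)                 # 1. Collect
    for rule_id, passed in evaluate(results).items():  # 2. Evaluate
        entry = manifest[rule_id]
        if not passed:
            entry["consecutive_compliant"] = 0
            entry["ticket_id"] = upsert_ticket(        # 3. Remediate: create or
                psa, rule_id,                          #    update, never duplicate
                summary=f"{rule_id} non-compliant",
                description=results[rule_id].description,
            )                                          # 4. Fix happens out of band
            continue
        entry["consecutive_compliant"] += 1            # 5./6. Re-collect, verify
        if entry["ticket_id"] and entry["consecutive_compliant"] >= 2:
            psa.close_ticket(entry["ticket_id"],       # 7. Close, with a
                             note=verification_note(rule_id, results))
            entry["ticket_id"] = None
```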

The auditor’s question — “When a nonconformity is identified, what happens?” — has a concrete answer: a ticket is created in the next evidence-collection cycle, the technician remediates, the system re-evaluates, and the ticket closes once compliance is verified. The entire chain is logged, timestamped, and auditable.

Most organisations answer that question with: “We review the findings and create an action plan.” That’s a process description. What I’ve described is a system.


What we learned when we built it

I’ve described a system. Let me tell you what happened when we actually implemented it — because theory doesn’t predict everything.

The “safe to close” problem. Auto-closure sounds clean in theory: two consecutive checks pass, ticket closes. In practice, a technician may have started working on a ticket, added diagnostic notes, and be halfway through a complex remediation when the evidence collection happens to catch a transient compliant state. Auto-closing that ticket would destroy the technician’s work context and create confusion. The solution: only auto-close tickets that are still in their initial state — “New” or “Awaiting Schedule.” If a technician has moved the ticket to “In Progress” or any active work status, the system respects their ownership and leaves the ticket open, even if the evidence shows the rule as compliant.
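
In code form (the status names are illustrative; map them to your PSA board’s actual statuses):

```python
# Only auto-close tickets the technician hasn't taken ownership of.
SAFE_TO_CLOSE_STATUSES = {"New", "Awaiting Schedule"}  # illustrative names

def may_auto_close(ticket_status: str) -> bool:
    """Respect technician ownership: never auto-close in-progress work."""
    return ticket_status in SAFE_TO_CLOSE_STATUSES
```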

The priority derivation problem. When a rule fails, what priority should the ticket receive? Our initial instinct was to derive priority from the rule’s weight — high-weight rules get high-priority tickets. This was wrong. Weight reflects audit importance, not operational urgency. A better signal turned out to be the compliance score itself: a rule at 0% (total failure) is critical; a rule at 75% (below threshold but partially operational) is medium. The score tells you how broken things are; the weight tells you how much the auditor cares.
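
A sketch of that mapping; the bands are illustrative, not prescriptive:

```python
def derive_priority(score: float) -> str:
    """Priority from the compliance score (how broken things are),
    not the rule weight (how much the auditor cares)."""
    if score == 0:
        return "critical"  # total failure of the measured control
    if score < 50:
        return "high"
    if score < 90:
        return "medium"    # e.g. 75%: below threshold but partially operational
    return "low"           # marginal miss against the threshold
```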

The state manifest growth problem. Each tenant’s evidence state manifest tracks every rule’s compliance history. After several months, the history arrays grow. In a system with a 2MB document size limit, this matters. The solution: cap history at 20 entries per rule and trim on every update. This preserves the audit trail while keeping the document within storage constraints. The lesson: any system that accumulates history needs a retention policy from day one, not after the first storage failure.
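
The trim itself is one line on every write; the key names are illustrative:

```python
MAX_HISTORY = 20  # retention cap per rule; keeps the manifest under the size limit

def record_transition(entry: dict, new_status: str, timestamp: str) -> None:
    """Append a state transition and trim on every update, not after a failure."""
    entry["history"].append({"from": entry["status"], "to": new_status, "at": timestamp})
    entry["history"] = entry["history"][-MAX_HISTORY:]  # trim on write
    entry["status"] = new_status
    entry["last_checked"] = timestamp
```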

The duplicate ticket cascade. Our duplicate prevention logic checked for existing tickets by rule ID. But ConnectWise ticket searches are case-insensitive and match on substrings. A search for “A.8.1” could match both “A.8.1-R3” and “A.8.13-R3” if the query wasn’t bounded precisely. The fix required exact-match validation after the initial search — a reminder that ticket system APIs are not databases, and their search semantics may not match your expectations.
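
The post-filter is simple once you know you need it; this sketch assumes the rule ID leads the ticket summary:

```python
def exact_rule_matches(tickets: list[dict], rule_id: str) -> list[dict]:
    """Validate exact rule-ID matches after a loose PSA substring search."""
    token = rule_id.upper()
    return [t for t in tickets
            if t["summary"].upper().split()[:1] == [token]]
```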

These aren’t edge cases. They’re the lessons that separate a described system from an operating one. Every one of them was invisible during design and obvious in production.


The question I’ll leave you with

When your last compliance check found a gap — any gap, on any control — what happened next?

Was there a ticket? Was it specific enough for a technician to act on without further context? Was there a deadline? Was there a re-check? Was the fix verified with evidence, not just an assertion?

If the answer to any of those is “not exactly,” the problem isn’t that your compliance checks aren’t finding issues. They probably are. The problem is that findings without remediation, remediation without verification, and verification without evidence are just different flavours of the same gap.

The standard calls it “corrective action.” I’d call it closing the loop. Either way, detection without response is not a control. It’s a report.


JJ Milner is a Microsoft MVP and the founder of Global Micro Solutions, a managed services provider operating across 1,200+ Microsoft 365 tenants. He writes about rethinking compliance from first principles.
