AI & QMS | March 18, 2026 | 8 min read

Using ChatGPT to write ISO 9001 procedures: where it helps and where it fails

Generative AI can draft a passable procedure in seconds. Whether it survives an audit, gets used by your team, and reflects how work actually happens is a very different question.

Ask ChatGPT for a 'control of nonconforming output' procedure and you will get something usable in under a minute. Clean structure, references to clause 8.7, sensible roles and responsibilities. For a quality manager staring at a blank page, the productivity boost is real.

But there is a meaningful gap between a procedure that reads well and a procedure that runs your business. Understanding where that gap lives is the difference between AI as accelerator and AI as liability.

Where ChatGPT genuinely helps

  • First-draft scaffolding — purpose, scope, definitions, references, generic flow
  • Translating dense clause language into plain working English
  • Summarizing long procedures into one-page work instructions
  • Generating audit checklists from a procedure you already trust
  • Rewriting for tone, length, or a specific reading level
  • Cross-referencing requirements across ISO 9001, ISO 14001, ISO 45001, and ISO/IEC 27001

Where it consistently fails

1. It does not know your process

The model produces a plausible average of every procedure it has ever seen. Your actual workflow (the handoffs, the exceptions, the one customer who insists on a different inspection regime) is invisible to it. Procedures that do not match reality get ignored, and an ignored procedure is worse than no procedure at all: it is a documented commitment that an auditor can show you are not following.

2. It hallucinates clause references

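Confident but wrong citations are a known failure mode: a clause number that does not exist, or a real-looking one ('clause 8.5.4', say, or 'Annex SL section 6.2') cited for a requirement it does not contain. If you publish a procedure with a fabricated or misattributed reference and an auditor catches it, the entire document loses credibility.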
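Part of this check can be automated as a first pass. The sketch below is a minimal illustration, not a validated tool: it pulls every "clause N.N" citation out of a draft and flags any number that is missing from a reference list. `KNOWN_CLAUSES` is a hypothetical, deliberately partial excerpt of the ISO 9001:2015 section 8 numbering, and the function name `find_unverified_clauses` and the sample draft are assumptions made for the example. Populate the list from your own copy of the standard.

```python
import re

# Hypothetical, partial excerpt of ISO 9001:2015 clause numbers (section 8 only).
# Assumption: replace this with the full list from your own copy of the standard.
KNOWN_CLAUSES = {
    "8.1", "8.2", "8.3", "8.4",
    "8.5", "8.5.1", "8.5.2", "8.5.3", "8.5.4", "8.5.5", "8.5.6",
    "8.6", "8.7",
}

# Matches "clause 8.7", "Clause 8.5.1", "clauses 8.4" and captures the number.
CLAUSE_PATTERN = re.compile(r"\bclauses?\s+(\d+(?:\.\d+)*)", re.IGNORECASE)


def find_unverified_clauses(text, known=KNOWN_CLAUSES):
    """Return cited clause numbers missing from `known`, in order, without duplicates."""
    flagged = []
    for number in CLAUSE_PATTERN.findall(text):
        if number not in known and number not in flagged:
            flagged.append(number)
    return flagged


# Illustrative draft text; "8.5.9" is a made-up citation planted for the demo.
draft = (
    "Nonconforming outputs are controlled per clause 8.7. "
    "Preservation follows clause 8.5.4, and change records follow clause 8.5.9."
)
print(find_unverified_clauses(draft))  # prints ['8.5.9']
```

A clean result only shows that each cited number exists. It does not show that the clause says what the draft claims, so someone still has to read the cited text in the standard.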

3. It cannot make risk decisions

Risk-based thinking is a judgment about your context, your customers, your tolerance. A model averaging the internet has no view on whether your supplier qualification threshold should be 95% or 98%. It will pick whatever sounds reasonable, and 'reasonable' is not a defensible basis for a control.

A workable pattern

  • Use AI to draft the structure, never the substance
  • Have the actual process owner — not the quality team — fill in the real flow
  • Verify every clause reference against the standard itself
  • Run the draft past someone who does the work daily before it goes live
  • Treat AI-generated text as a starting prompt, not finished documented information

ChatGPT writes procedures. Your team writes the procedure that actually runs the business. Those are not the same document.