When NOT to Use AI Code Generation
Knowing when not to use AI code generation is as important as knowing how to use it effectively. AI coding assistants are powerful tools, but they are not universally appropriate. There are domains, scenarios, and code categories where AI generation introduces unacceptable risk, produces unreliable results, or is simply less efficient than manual coding.
This guide identifies those scenarios and provides a structured decision framework for evaluating whether AI assistance is appropriate for a given task.
Core Principle
AI coding assistants are pattern-matching systems trained on existing code. They excel when the task closely matches patterns in their training data. They struggle or fail when the task requires genuine novelty, precise domain expertise, formal correctness guarantees, or reasoning about security properties that are context-dependent.
The fundamental question is not "can the AI generate this code?" but "can I confidently verify the AI's output for this specific task?" If you cannot verify correctness with high confidence, AI generation is inappropriate.
Scenarios Where AI Should NOT Be Used
1. Security-Critical Code Paths
AI-generated code has been found to carry a roughly 2.74x higher vulnerability rate than comparable human-written code. For code that directly implements security controls, the risk of subtle vulnerabilities is unacceptable.
Do NOT use AI for:
- Authentication logic (login flows, session management, token validation)
- Authorization and access control (permission checks, role-based access)
- Cryptographic implementations (encryption, decryption, key derivation, signing)
- Input sanitization for injection prevention (SQL injection, XSS, command injection)
- Certificate validation and TLS configuration
- Security-critical middleware and interceptors
Why: AI may generate code that appears secure but contains subtle flaws, such as insecure defaults, omitted checks, or deprecated cryptographic patterns reproduced from its training data. These flaws can be extremely difficult to detect during code review.
Alternative: Write security-critical code manually, leveraging well-tested security libraries. Have the code reviewed by a security specialist per PRD-STD-002.
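Where this alternative applies, the manual implementation should lean on vetted primitives rather than hand-rolled or generated logic. The sketch below is illustrative only: it uses Python's standard-library `secrets` and `hmac` modules, and the helper names and token length are assumptions rather than part of this standard.

```python
import hmac
import secrets

# Illustrative helpers only. The security-sensitive work is delegated to
# vetted standard-library primitives rather than generated logic.

def generate_api_token(num_bytes: int = 32) -> str:
    """Create a URL-safe random token from a cryptographically secure source."""
    return secrets.token_urlsafe(num_bytes)

def tokens_match(presented: str, stored: str) -> bool:
    """Compare tokens in constant time to avoid timing side channels."""
    return hmac.compare_digest(presented.encode(), stored.encode())
```

Even code this small still goes through specialist review per PRD-STD-002.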
2. Novel Algorithm Development
AI excels at reproducing known patterns but cannot innovate. For genuinely novel algorithms, AI outputs will be a poor fit.
Do NOT use AI for:
- Research algorithms that do not exist in public codebases
- Custom optimization algorithms specific to your problem domain
- Novel data structures designed for specific performance characteristics
- Original machine learning model architectures
- Proprietary business logic algorithms that represent competitive advantage
Why: AI will generate code based on the closest match in its training data, which may be structurally or semantically different from what you need. The result often looks plausible but is fundamentally wrong for your use case.
Alternative: Develop novel algorithms through traditional engineering, using AI only for boilerplate and scaffolding around the core algorithm.
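One way to apply this split in practice is to keep the core algorithm under human authorship and let AI assist only with the surrounding scaffolding. The layout below is a sketch under that assumption; `optimize_schedule`, the JSON input format, and the CLI flags are hypothetical names used only to show where the boundary sits.

```python
import argparse
import json

def optimize_schedule(jobs: list[dict]) -> list[dict]:
    """Core proprietary algorithm: developed through traditional engineering,
    not AI generation. The trivial placeholder below stands in for the real logic."""
    return sorted(jobs, key=lambda job: job.get("priority", 0))

# --- Scaffolding below is the kind of boilerplate AI assistance may draft ---

def main() -> None:
    parser = argparse.ArgumentParser(description="Run the scheduling optimizer.")
    parser.add_argument("input_file", help="Path to a JSON file containing a list of jobs")
    args = parser.parse_args()

    with open(args.input_file) as fh:
        jobs = json.load(fh)

    for job in optimize_schedule(jobs):
        print(json.dumps(job))

if __name__ == "__main__":
    main()
```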
3. Regulatory and Compliance-Sensitive Code
Code that is subject to regulatory requirements demands deterministic, auditable, and precisely documented logic.
Do NOT use AI for:
- Financial calculation logic subject to accounting standards (GAAP, IFRS)
- Healthcare data processing subject to HIPAA
- Privacy-related logic subject to GDPR, CCPA, or similar regulations
- Payment processing logic subject to PCI DSS
- Code that will be submitted for regulatory review or certification
- Tax calculation engines subject to jurisdictional rules
Why: Regulatory code requires precise adherence to specific rules, often with detailed audit trails. AI-generated code may introduce subtle deviations that are difficult to detect but create compliance violations. Furthermore, regulatory auditors may not accept AI-generated code without additional verification procedures.
Alternative: Write regulatory code manually from the regulatory specification. Use AI only for non-regulated supporting code (e.g., UI, logging).
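As an illustration of writing calculation logic directly from the specification, the sketch below uses exact decimal arithmetic with an explicit rounding rule. The half-up rounding, two-decimal precision, and 8.25% rate are assumptions standing in for whatever the governing specification actually mandates.

```python
from decimal import Decimal, ROUND_HALF_UP

def sales_tax(amount: Decimal, rate: Decimal) -> Decimal:
    """Compute tax with exact decimal arithmetic and an explicit rounding rule.

    The precision and rounding mode must come from the governing specification,
    not from a language default or from generated code.
    """
    return (amount * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# Example: an 8.25% rate applied to 19.99 gives 1.649175, which rounds up to 1.65
print(sales_tax(Decimal("19.99"), Decimal("0.0825")))  # 1.65
```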
4. High-Assurance Systems
Systems where failure has severe consequences (safety, financial, environmental) require engineering rigor that AI cannot provide.
Do NOT use AI for:
- Safety-critical systems (medical devices, automotive controls, aviation)
- Infrastructure control systems (power grid, water treatment, industrial automation)
- Financial trading systems where bugs cause direct financial loss
- Systems requiring formal verification or mathematical proof of correctness
- Code that must comply with standards like DO-178C, IEC 62304, or ISO 26262
Why: High-assurance systems require formal methods, exhaustive testing, and certification processes that are fundamentally incompatible with probabilistic AI code generation.
Alternative: Follow established safety-critical development processes (formal specification, formal verification, certified toolchains).
5. Deep Domain Expertise Code
Some code requires specialized domain knowledge that is not well represented in AI training data.
Do NOT use AI for:
- Specialized scientific computing (unless using well-known libraries)
- Domain-specific protocol implementations
- Code that requires understanding of specific hardware or embedded systems
- Legacy system integration requiring knowledge of undocumented proprietary systems
Why: AI may hallucinate APIs, misunderstand domain constraints, or apply generic patterns where domain-specific expertise is needed.
Decision Matrix
Use this matrix to evaluate whether AI assistance is appropriate for a given task:
| Factor | Low Risk (AI Appropriate) | Medium Risk (AI with Caution) | High Risk (Avoid AI) |
|---|---|---|---|
| Security Impact | No direct security implications | Processes user data; defense-in-depth present | Implements security controls directly |
| Regulatory Exposure | No regulatory requirements | Tangential to regulated systems | Directly subject to regulation |
| Novelty | Well-known patterns and libraries | Moderate customization of known patterns | Novel algorithms or data structures |
| Verification Difficulty | Easy to verify with unit tests | Requires integration testing and review | Requires formal verification or audit |
| Domain Complexity | General-purpose programming | Moderate domain knowledge needed | Deep specialized domain expertise |
| Failure Impact | Low (cosmetic, non-critical) | Medium (user-facing, recoverable) | High (safety, financial, compliance) |
| Auditability Requirement | None | Desirable | Mandatory with complete lineage |
Scoring:
- If any factor is "High Risk" -- Do NOT use AI for this task
- If two or more factors are "Medium Risk" -- Use AI with extreme caution, enhanced review, and senior oversight
- If all factors are "Low Risk" -- AI assistance is appropriate; follow standard AEEF practices
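The scoring rule above can also be applied mechanically, which helps keep assessments consistent across engineers. The sketch below encodes it in Python; the factor keys and function name are illustrative, and the single-medium branch is an interpretation of the matrix column heading ("Medium Risk (AI with Caution)") rather than an explicit rule from the list above.

```python
from enum import Enum

class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

def ai_usage_decision(factor_ratings: dict[str, Risk]) -> str:
    """Apply the decision matrix to a set of per-factor risk ratings."""
    ratings = list(factor_ratings.values())
    medium_count = sum(r is Risk.MEDIUM for r in ratings)
    if any(r is Risk.HIGH for r in ratings):
        return "Do NOT use AI for this task"
    if medium_count >= 2:
        return "Use AI with extreme caution, enhanced review, and senior oversight"
    if medium_count == 1:
        return "Use AI with caution, per the matrix column guidance"
    return "AI assistance is appropriate; follow standard AEEF practices"

# Example: novelty and domain complexity are medium risk, the rest are low risk
print(ai_usage_decision({
    "security_impact": Risk.LOW,
    "regulatory_exposure": Risk.LOW,
    "novelty": Risk.MEDIUM,
    "verification_difficulty": Risk.LOW,
    "domain_complexity": Risk.MEDIUM,
    "failure_impact": Risk.LOW,
    "auditability": Risk.LOW,
}))  # -> extreme caution, enhanced review, senior oversight
```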
Partial AI Assistance
In many "do not use AI" scenarios, AI can still provide value in supporting roles:
| Task Category | AI Can Help With | AI Must NOT Do |
|---|---|---|
| Security-critical code | Test generation, documentation, boilerplate scaffolding | Core security logic |
| Regulatory code | UI/UX around regulatory functions, data formatting | Calculation logic, compliance rules |
| Novel algorithms | Scaffolding, utility functions, tests | Core algorithm implementation |
| High-assurance systems | Documentation, tooling scripts, non-critical utilities | Safety-critical logic |
This partial-assistance model captures some AI velocity benefits while keeping high-risk code under human authorship.
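As a small illustration of this split for security-critical code, AI assistance might draft test scaffolding around a function whose body remains human-authored. Everything below is hypothetical: the function reuses the tiny illustrative helper from the earlier sketch, and pytest is assumed as the test runner.

```python
import hmac
import pytest

def tokens_match(presented: str, expected: str) -> bool:
    """Human-authored core logic (kept deliberately tiny for this sketch)."""
    return hmac.compare_digest(presented.encode(), expected.encode())

# --- Test scaffolding like this is where AI assistance may contribute ---

@pytest.mark.parametrize(
    "presented, expected, should_match",
    [
        ("abc123", "abc123", True),   # exact match accepted
        ("abc124", "abc123", False),  # near-miss rejected
        ("", "abc123", False),        # empty token rejected
    ],
)
def test_tokens_match(presented, expected, should_match):
    assert tokens_match(presented, expected) is should_match
```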
Team Decision Process
When it is unclear whether AI is appropriate for a task, follow this escalation process:
1. Individual engineer assessment -- Apply the decision matrix above. If the result is clear, proceed accordingly.
2. Peer discussion -- If the result is ambiguous, discuss with a team member. Two engineers applying the matrix independently should reach the same conclusion.
3. Tech lead escalation -- If there is still disagreement, escalate to the tech lead for a decision.
4. Architecture review -- For system-level decisions about where AI may and may not be used, record the decision in an Architecture Decision Record per PRD-STD-005.
Organizational Boundaries
Organizations SHOULD maintain a formal list of code categories and modules where AI generation is prohibited. This list should be:
- Documented in the project's contributing guidelines
- Enforced through labeling and code review per PRD-STD-002
- Reviewed quarterly as AI tool capabilities evolve
- Aligned with the organization's risk framework per Pillar 2: Governance & Risk
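One way to make the prohibited list enforceable is a small CI check that fails an AI-labeled change when it touches prohibited paths. The sketch below assumes a newline-delimited `ai-prohibited-paths.txt` file, an `ai-assisted` label, and `PR_LABELS`/`CHANGED_FILES` environment variables supplied by the pipeline; these names are assumptions, not requirements of PRD-STD-002.

```python
import os
import sys
from pathlib import PurePosixPath

def load_prohibited_prefixes(path: str = "ai-prohibited-paths.txt") -> list[str]:
    """Read the organization's list of path prefixes where AI generation is prohibited."""
    with open(path) as fh:
        lines = (line.strip() for line in fh)
        return [line for line in lines if line and not line.startswith("#")]

def violations(changed_files: list[str], prefixes: list[str]) -> list[str]:
    """Return the changed files that fall under a prohibited prefix."""
    return [
        f for f in changed_files
        if any(PurePosixPath(f).is_relative_to(prefix) for prefix in prefixes)
    ]

if __name__ == "__main__":
    # Hypothetical CI convention: labels and changed files arrive via environment variables.
    labels = os.environ.get("PR_LABELS", "").split(",")
    changed = os.environ.get("CHANGED_FILES", "").split()

    if "ai-assisted" in labels:
        flagged = violations(changed, load_prohibited_prefixes())
        if flagged:
            print("AI-assisted change touches prohibited paths:")
            for path in flagged:
                print(f"  {path}")
            sys.exit(1)
```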
Common Mistakes
- Overconfidence in AI security code: "The AI generated authentication code that looks right to me" is not sufficient verification. Security code requires specialist review.
- Using AI for unfamiliar domains: If you lack the expertise to verify AI output in a specialized domain, that output is unverifiable regardless of how plausible it appears.
- Ignoring the 2.74x vulnerability rate: The statistical risk of AI-generated security vulnerabilities is well documented. Acknowledge and plan for this elevated risk.
- Assuming AI understands regulations: AI has no awareness of regulatory requirements. It generates code that matches patterns, not code that complies.
- Gradual scope creep: Starting with AI for safe tasks and gradually letting it generate increasingly risky code without reassessing. Regularly reapply the decision matrix.
For related guidance, see PRD-STD-004: Security Scanning and Pillar 2: Governance & Risk.