
Human Hardening

The Human Hardening stage is the third stage of the Operating Model Lifecycle and the most critical quality control point in AI-assisted development. All AI-generated outputs MUST undergo thorough human review, code refactoring, security analysis, performance optimization, and quality assurance before proceeding to the Governance Gate. This stage exists because AI-generated code — however impressive at first glance — carries 1.7x more issues and a 2.74x higher vulnerability rate than human-authored code. Human hardening is not a rubber stamp; it is a deliberate, skilled engineering activity that transforms a prototype into production-quality software.

Code Refactoring

Refactoring Objectives

AI-generated code frequently exhibits patterns that require human refinement:

| Issue Pattern | Description | Refactoring Action |
| --- | --- | --- |
| Over-engineering | AI generates unnecessarily complex solutions for simple problems | Simplify; remove unnecessary abstractions |
| Naming inconsistency | AI uses generic or inconsistent variable/function names | Align with organizational naming conventions |
| Dead code | AI generates unused functions, imports, or variables | Remove all dead code |
| Duplication | AI generates code that duplicates existing codebase functionality | Replace with calls to existing implementations |
| Anti-patterns | AI uses patterns that conflict with organizational standards | Refactor to approved patterns |
| Missing error handling | AI omits error handling or uses overly broad catch blocks | Add specific, appropriate error handling |
| Hardcoded values | AI embeds configuration values directly in code | Extract to configuration files or environment variables |
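Two of the patterns above, hardcoded values and overly broad error handling, often appear together. The Python sketch below shows one hardened form; the REPORT_SERVICE_URL variable, fetch_report function, and ReportUnavailableError exception are illustrative names, not part of any prescribed standard.

```python
import os
import urllib.request
import urllib.error

class ReportUnavailableError(RuntimeError):
    """Raised when the report service cannot be reached."""

# Hardcoded value extracted to configuration (environment variable with a fallback).
REPORT_URL = os.environ.get("REPORT_SERVICE_URL", "https://reports.internal.example/api")

def fetch_report() -> bytes:
    try:
        with urllib.request.urlopen(REPORT_URL, timeout=5) as response:
            return response.read()
    except urllib.error.URLError as exc:
        # Specific failure mode surfaced to callers, replacing a bare
        # `except Exception: return None` that hides every error.
        raise ReportUnavailableError(f"report service unreachable: {exc.reason}") from exc
```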

Refactoring Checklist

Before marking refactoring as complete, the developer MUST verify:

  • Code follows organizational style guide and naming conventions
  • No dead code, unused imports, or commented-out blocks remain
  • No duplication of existing codebase functionality
  • Error handling is specific and appropriate for each failure mode
  • Configuration values are externalized
  • Code is no more complex than the problem requires
  • All AI-generated comments are accurate (AI sometimes generates plausible but incorrect comments)
  • Function and method sizes are within organizational limits (typically < 30 lines)
  • Dependencies introduced by AI are validated (they exist, are maintained, and meet licensing requirements)
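The last item, dependency validation, can be partly automated. A minimal sketch using Python's standard importlib.metadata confirms that a package the AI introduced is actually installed and surfaces its declared license; maintenance status and license policy still require a human or registry check, and the package name in the usage comment is only an example.

```python
from importlib import metadata

def describe_dependency(package: str) -> dict:
    """Confirm an AI-introduced dependency exists and report its declared license."""
    try:
        info = metadata.metadata(package)
    except metadata.PackageNotFoundError:
        raise RuntimeError(f"{package} is not installed; the AI may have invented it") from None
    return {
        "name": info["Name"],
        "version": info["Version"],
        "license": info.get("License", "UNKNOWN"),
    }

# Example: describe_dependency("requests")
```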

Security Analysis

Security analysis during human hardening is the primary defense against the elevated vulnerability rate in AI-generated code. This analysis MUST be performed by a developer with security awareness training, and SHOULD be reviewed by the Security team for Risk Tier 3 and above.

Security Review Checklist

The following security checks MUST be performed on all AI-generated code:

Input Validation and Injection

  • All user inputs are validated and sanitized before use
  • Database queries use parameterized statements (no string concatenation; see the sketch after this checklist)
  • HTML output is properly escaped to prevent XSS
  • File paths are validated to prevent path traversal
  • Command execution uses parameterized APIs (no shell injection vectors)
  • XML parsing is configured to prevent XXE attacks
  • Deserialization is restricted to expected types
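As a concrete illustration of the parameterized-statement check, the Python sketch below contrasts the string-interpolation pattern AI tools frequently emit with the parameterized form; sqlite3 stands in for whatever database driver the project actually uses.

```python
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, username: str):
    # Vulnerable: crafted input such as "' OR '1'='1" changes the query itself.
    return conn.execute(f"SELECT id, name FROM users WHERE name = '{username}'").fetchone()

def find_user_safe(conn: sqlite3.Connection, username: str):
    # Parameterized: the driver binds the value, so it can never be parsed as SQL.
    return conn.execute("SELECT id, name FROM users WHERE name = ?", (username,)).fetchone()
```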

Authentication and Authorization

  • Authentication logic does not leak information (timing attacks, user enumeration)
  • Authorization checks are present on all protected endpoints
  • Session management follows organizational standards
  • Passwords are hashed with approved algorithms (bcrypt, argon2) with appropriate parameters
  • Tokens are generated with sufficient entropy and appropriate expiration
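Two of these checks, timing-safe comparison and token entropy, have simple standard-library remedies in Python. The function names below are illustrative, and token lifetime and storage are assumed to be handled elsewhere per organizational standards.

```python
import hmac
import secrets

def generate_session_token() -> str:
    # 32 random bytes (~256 bits of entropy) from the OS CSPRNG, URL-safe encoded.
    return secrets.token_urlsafe(32)

def tokens_match(stored: str, presented: str) -> bool:
    # Constant-time comparison avoids leaking how many leading characters match.
    return hmac.compare_digest(stored, presented)
```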

Data Protection

  • Sensitive data is not logged or included in error messages
  • Encryption uses current, approved algorithms and key lengths
  • TLS is enforced for all external communications
  • Secrets are not hardcoded (API keys, passwords, connection strings; see the sketch after this checklist)
  • Data at rest encryption is applied where required by data classification
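A minimal sketch of the secrets and logging checks, assuming a PAYMENTS_API_KEY environment variable and the project's standard logging setup; the redaction helper is illustrative, not a prescribed utility.

```python
import logging
import os

log = logging.getLogger(__name__)

# Secret read from the environment rather than hardcoded in source.
API_KEY = os.environ.get("PAYMENTS_API_KEY")

def redact(value: str, keep: int = 4) -> str:
    """Log-safe form of a sensitive value: only the last few characters survive."""
    return "*" * max(len(value) - keep, 0) + value[-keep:]

def log_charge(card_number: str, amount: float) -> None:
    # Sensitive data is redacted before it reaches logs or error messages.
    log.info("charging %s to card %s", amount, redact(card_number))
```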

Common AI Vulnerability Patterns

AI tools are known to produce the following vulnerability patterns with elevated frequency:

| Vulnerability | AI Tendency | What to Look For |
| --- | --- | --- |
| SQL injection | Uses string interpolation instead of parameterized queries | Any SQL query constructed with +, f"", or template literals |
| Insecure defaults | Uses permissive defaults (e.g., CORS *, debug mode on) | Configuration settings that are too open |
| Missing authentication | Generates endpoints without authentication middleware | New endpoints without auth decorators/middleware |
| Deprecated APIs | Uses deprecated or insecure API versions | Method calls flagged by IDE or linter as deprecated |
| Insufficient input validation | Trusts input data without validation | Functions that process external input without validation |
| Predictable randomness | Uses Math.random() or similar for security-sensitive operations | Random value generation for tokens, IDs, or crypto |
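The insecure-defaults row is often the easiest to address structurally: centralize settings in one place with safe defaults so that anything permissive must be opted into explicitly. The configuration class and field names below are illustrative only, not tied to any particular web framework.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AppConfig:
    # Safe-by-default settings; anything more permissive must be set explicitly.
    debug: bool = False
    allowed_origins: tuple[str, ...] = ()      # no wildcard CORS by default
    session_cookie_secure: bool = True
    session_cookie_httponly: bool = True

# Production opts into exactly the origins it needs.
production = AppConfig(allowed_origins=("https://app.example.com",))
```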

Performance Optimization

AI-generated code often prioritizes correctness and readability over performance. Human hardening MUST include performance evaluation for code on performance-sensitive paths.

Performance Review Criteria

| Criterion | Check | Action if Failed |
| --- | --- | --- |
| Algorithmic complexity | Verify time and space complexity are appropriate for expected data volumes | Optimize algorithm or add pagination/batching |
| Database query efficiency | Check for N+1 queries, missing indexes, unnecessary joins | Optimize queries; add eager loading or caching |
| Memory management | Check for memory leaks, unnecessary data copies, unbounded collections | Add resource cleanup, pagination, streaming |
| Concurrency | Check for race conditions, deadlocks, thread safety issues | Add synchronization or use concurrent data structures |
| I/O efficiency | Check for unnecessary network calls, file operations, or serialization | Batch operations; add caching; reduce I/O |
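To make the N+1 criterion concrete, the sketch below shows a per-record query loop and its batched replacement using Python's sqlite3; the orders table and its columns are assumed for illustration, and only the placeholders are interpolated, never the values.

```python
import sqlite3

def order_totals_naive(conn: sqlite3.Connection, user_ids: list[int]) -> dict[int, float]:
    # N+1 pattern AI tools often emit: one database round trip per user.
    return {
        uid: conn.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE user_id = ?", (uid,)
        ).fetchone()[0]
        for uid in user_ids
    }

def order_totals_batched(conn: sqlite3.Connection, user_ids: list[int]) -> dict[int, float]:
    # Single parameterized query; placeholders are generated, values are still bound.
    placeholders = ",".join("?" for _ in user_ids)
    rows = conn.execute(
        f"SELECT user_id, SUM(total) FROM orders"
        f" WHERE user_id IN ({placeholders}) GROUP BY user_id",
        user_ids,
    ).fetchall()
    totals = dict.fromkeys(user_ids, 0.0)
    totals.update({uid: total for uid, total in rows})
    return totals
```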

When Full Performance Review Is Required

  • Code on hot paths (called > 100 times per second)
  • Code processing large data sets (> 10,000 records)
  • Code with latency-sensitive user-facing interactions
  • Background jobs or batch processing code
  • Code that introduces new external service calls

Quality Assurance Procedures

Test Development

AI-generated code MUST be accompanied by comprehensive tests. The testing strategy for AI-assisted code includes:

| Test Type | Requirement | Who Creates |
| --- | --- | --- |
| Unit tests | REQUIRED for all AI-generated code | AI-generated (initial) + human review and augmentation |
| Integration tests | REQUIRED for code that interacts with external systems | Human-written (AI MAY assist) |
| Edge case tests | REQUIRED; AI-generated tests often miss edge cases | Human-written |
| Security tests | REQUIRED for security-sensitive code | Human-written with security expertise |
| Performance tests | RECOMMENDED for performance-sensitive code | Human-written |
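The edge-case row is where human augmentation adds the most value. Below is a minimal sketch, assuming pytest is the organization's test runner; apply_discount is an inline stand-in for the AI-generated function under review.

```python
import pytest

def apply_discount(price: float, percent: float) -> float:
    # Stand-in for the AI-generated function under test.
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (1 - percent / 100)

@pytest.mark.parametrize(
    "price, percent, expected",
    [
        (100.0, 10, 90.0),    # happy path
        (100.0, 0, 100.0),    # boundary: no discount
        (100.0, 100, 0.0),    # boundary: full discount
        (0.0, 50, 0.0),       # boundary: zero price
    ],
)
def test_apply_discount_boundaries(price, percent, expected):
    assert apply_discount(price, percent) == pytest.approx(expected)

def test_apply_discount_rejects_invalid_percent():
    # Error condition that AI-generated happy-path tests typically omit.
    with pytest.raises(ValueError):
        apply_discount(100.0, 150)
```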

Test Quality Criteria

Tests for AI-generated code MUST meet higher quality standards than typical unit tests because they serve as the primary validation that the AI's output is correct:

  • Assertion quality — Tests MUST contain meaningful assertions that verify behavior, not just that code runs without exceptions
  • Edge case coverage — Tests MUST cover boundary conditions, null/empty inputs, error conditions, and concurrent access scenarios
  • Independence — Tests MUST be independent of each other and of external state
  • Readability — Test names MUST clearly describe what is being tested and what the expected behavior is
  • Coverage threshold — AI-generated code MUST achieve code coverage at or above the organizational standard (typically >= 80% line coverage)

Documentation Requirements

The following documentation MUST be completed during human hardening:

  • Inline code comments are accurate and explain "why," not "what"
  • Public API documentation (JSDoc, Javadoc, docstrings) is complete and accurate
  • README or wiki documentation is updated for any new features or configuration
  • AI attribution metadata is present on all AI-assisted files and commits (an illustrative header follows this list)
  • Any deviations from the original Business Intent are documented and communicated to stakeholders
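As an illustration of attribution metadata, a file header along the following lines can carry the required fields; the field names and format shown here are placeholders, and the actual schema is defined by the organization's attribution standard.

```python
# AI attribution metadata (illustrative field names; use the organization's schema)
# ai-assisted: true
# ai-tool: <tool name and version>
# exploration-record: <link to the exploration-stage session or prompt log>
# human-hardened-by: <developer>
# hardening-completed: <YYYY-MM-DD>
```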

Hardening Completion Criteria

Human hardening is complete when ALL of the following criteria are met:

  • All refactoring checklist items are verified
  • All security review checklist items are verified
  • Performance review is complete (for applicable code paths)
  • All tests pass with required coverage
  • Documentation is complete
  • AI attribution metadata is in place
  • The code is ready for peer review as part of the standard PR process
  • The developer is confident the code meets production quality standards

Human hardening is where engineering discipline meets AI acceleration. The exploration stage generates raw material quickly; hardening transforms it into production-quality software. Skipping or rushing this stage negates the safety mechanisms of the entire AEEF framework. The time invested in hardening is what allows the organization to deploy AI-assisted code with the same confidence as human-authored code — and the data collected through Expanded Metrics proves it.