
Human Hardening

The Human Hardening stage is the third stage of the Operating Model Lifecycle and the most critical quality control point in AI-assisted development. All AI-generated outputs MUST undergo thorough human review, code refactoring, security analysis, performance optimization, and quality assurance before proceeding to the Governance Gate. This stage exists because AI-generated code — however impressive at first glance — carries 1.7x more issues and a 2.74x higher vulnerability rate than human-authored code. Human hardening is not a rubber stamp; it is a deliberate, skilled engineering activity that transforms a prototype into production-quality software.

Code Refactoring

Refactoring Objectives

AI-generated code frequently exhibits patterns that require human refinement:

| Issue Pattern | Description | Refactoring Action |
| --- | --- | --- |
| Over-engineering | AI generates unnecessarily complex solutions for simple problems | Simplify; remove unnecessary abstractions |
| Naming inconsistency | AI uses generic or inconsistent variable/function names | Align with organizational naming conventions |
| Dead code | AI generates unused functions, imports, or variables | Remove all dead code |
| Duplication | AI generates code that duplicates existing codebase functionality | Replace with calls to existing implementations |
| Anti-patterns | AI uses patterns that conflict with organizational standards | Refactor to approved patterns |
| Missing error handling | AI omits error handling or uses overly broad catch blocks | Add specific, appropriate error handling |
| Hardcoded values | AI embeds configuration values directly in code | Extract to configuration files or environment variables |
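Two of the patterns above, hardcoded values and overly broad error handling, often appear together. The Python sketch below shows one hardened form; the REPORT_SERVICE_URL variable, fetch_report function, and ReportUnavailableError exception are illustrative names, not part of any prescribed standard.

```python
import os
import urllib.request
import urllib.error

class ReportUnavailableError(RuntimeError):
    """Raised when the report service cannot be reached."""

# Hardcoded value extracted to configuration (environment variable with a fallback).
REPORT_URL = os.environ.get("REPORT_SERVICE_URL", "https://reports.internal.example/api")

def fetch_report() -> bytes:
    try:
        with urllib.request.urlopen(REPORT_URL, timeout=5) as response:
            return response.read()
    except urllib.error.URLError as exc:
        # Specific failure mode surfaced to callers, replacing a bare
        # `except Exception: return None` that hides every error.
        raise ReportUnavailableError(f"report service unreachable: {exc.reason}") from exc
```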

Refactoring Checklist

Before marking refactoring as complete, the developer MUST verify:

  • Code follows organizational style guide and naming conventions
  • No dead code, unused imports, or commented-out blocks remain
  • No duplication of existing codebase functionality
  • Error handling is specific and appropriate for each failure mode
  • Configuration values are externalized
  • Code is no more complex than the problem requires
  • All AI-generated comments are accurate (AI sometimes generates plausible but incorrect comments)
  • Function and method sizes are within organizational limits (typically < 30 lines)
  • Dependencies introduced by AI are validated (they exist, are maintained, and meet licensing requirements)
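The last item, dependency validation, can be partly automated. A minimal sketch using Python's standard importlib.metadata confirms that a package the AI introduced is actually installed and surfaces its declared license; maintenance status and license policy still require a human or registry check, and the package name in the usage comment is only an example.

```python
from importlib import metadata

def describe_dependency(package: str) -> dict:
    """Confirm an AI-introduced dependency exists and report its declared license."""
    try:
        info = metadata.metadata(package)
    except metadata.PackageNotFoundError:
        raise RuntimeError(f"{package} is not installed; the AI may have invented it") from None
    return {
        "name": info["Name"],
        "version": info["Version"],
        "license": info.get("License", "UNKNOWN"),
    }

# Example: describe_dependency("requests")
```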

Security Analysis

Security analysis during human hardening is the primary defense against the elevated vulnerability rate in AI-generated code. This analysis MUST be performed by a developer with security awareness training, and SHOULD be reviewed by the Security team for Risk Tier 3 and above.

Security Review Checklist

The following security checks MUST be performed on all AI-generated code:

Input Validation and Injection

  • All user inputs are validated and sanitized before use
  • Database queries use parameterized statements (no string concatenation; see the sketch after this checklist)
  • HTML output is properly escaped to prevent XSS
  • File paths are validated to prevent path traversal
  • Command execution uses parameterized APIs (no shell injection vectors)
  • XML parsing is configured to prevent XXE attacks
  • Deserialization is restricted to expected types
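As a concrete illustration of the parameterized-statement check, the Python sketch below contrasts the string-interpolation pattern AI tools frequently emit with the parameterized form; sqlite3 stands in for whatever database driver the project actually uses.

```python
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, username: str):
    # Vulnerable: crafted input such as "' OR '1'='1" changes the query itself.
    return conn.execute(f"SELECT id, name FROM users WHERE name = '{username}'").fetchone()

def find_user_safe(conn: sqlite3.Connection, username: str):
    # Parameterized: the driver binds the value, so it can never be parsed as SQL.
    return conn.execute("SELECT id, name FROM users WHERE name = ?", (username,)).fetchone()
```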

Authentication and Authorization

  • Authentication logic does not leak information (timing attacks, user enumeration)
  • Authorization checks are present on all protected endpoints
  • Session management follows organizational standards
  • Passwords are hashed with approved algorithms (bcrypt, argon2) with appropriate parameters
  • Tokens are generated with sufficient entropy and appropriate expiration
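Two of these checks, timing-safe comparison and token entropy, have simple standard-library remedies in Python. The function names below are illustrative, and token lifetime and storage are assumed to be handled elsewhere per organizational standards.

```python
import hmac
import secrets

def generate_session_token() -> str:
    # 32 random bytes (~256 bits of entropy) from the OS CSPRNG, URL-safe encoded.
    return secrets.token_urlsafe(32)

def tokens_match(stored: str, presented: str) -> bool:
    # Constant-time comparison avoids leaking how many leading characters match.
    return hmac.compare_digest(stored, presented)
```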

Data Protection

  • Sensitive data is not logged or included in error messages
  • Encryption uses current, approved algorithms and key lengths
  • TLS is enforced for all external communications
  • Secrets are not hardcoded (API keys, passwords, connection strings; see the sketch after this checklist)
  • Data at rest encryption is applied where required by data classification
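A minimal sketch of the secrets and logging checks, assuming a PAYMENTS_API_KEY environment variable and the project's standard logging setup; the redaction helper is illustrative, not a prescribed utility.

```python
import logging
import os

log = logging.getLogger(__name__)

# Secret read from the environment rather than hardcoded in source.
API_KEY = os.environ.get("PAYMENTS_API_KEY")

def redact(value: str, keep: int = 4) -> str:
    """Log-safe form of a sensitive value: only the last few characters survive."""
    return "*" * max(len(value) - keep, 0) + value[-keep:]

def log_charge(card_number: str, amount: float) -> None:
    # Sensitive data is redacted before it reaches logs or error messages.
    log.info("charging %s to card %s", amount, redact(card_number))
```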

Common AI Vulnerability Patterns

AI tools are known to produce the following vulnerability patterns with elevated frequency:

| Vulnerability | AI Tendency | What to Look For |
| --- | --- | --- |
| SQL injection | Uses string interpolation instead of parameterized queries | Any SQL query constructed with +, f"", or template literals |
| Insecure defaults | Uses permissive defaults (e.g., CORS *, debug mode on) | Configuration settings that are too open |
| Missing authentication | Generates endpoints without authentication middleware | New endpoints without auth decorators/middleware |
| Deprecated APIs | Uses deprecated or insecure API versions | Method calls flagged by IDE or linter as deprecated |
| Insufficient input validation | Trusts input data without validation | Functions that process external input without validation |
| Predictable randomness | Uses Math.random() or similar for security-sensitive operations | Random value generation for tokens, IDs, or crypto |
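The insecure-defaults row is often the easiest to address structurally: centralize settings in one place with safe defaults so that anything permissive must be opted into explicitly. The configuration class and field names below are illustrative only, not tied to any particular web framework.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AppConfig:
    # Safe-by-default settings; anything more permissive must be set explicitly.
    debug: bool = False
    allowed_origins: tuple[str, ...] = ()      # no wildcard CORS by default
    session_cookie_secure: bool = True
    session_cookie_httponly: bool = True

# Production opts into exactly the origins it needs.
production = AppConfig(allowed_origins=("https://app.example.com",))
```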

Performance Optimization

AI-generated code often prioritizes correctness and readability over performance. Human hardening MUST include performance evaluation for code on performance-sensitive paths.

Performance Review Criteria

| Criterion | Check | Action if Failed |
| --- | --- | --- |
| Algorithmic complexity | Verify time and space complexity are appropriate for expected data volumes | Optimize algorithm or add pagination/batching |
| Database query efficiency | Check for N+1 queries, missing indexes, unnecessary joins | Optimize queries; add eager loading or caching |
| Memory management | Check for memory leaks, unnecessary data copies, unbounded collections | Add resource cleanup, pagination, streaming |
| Concurrency | Check for race conditions, deadlocks, thread safety issues | Add synchronization or use concurrent data structures |
| I/O efficiency | Check for unnecessary network calls, file operations, or serialization | Batch operations; add caching; reduce I/O |
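To make the N+1 criterion concrete, the sketch below shows a per-record query loop and its batched replacement using Python's sqlite3; the orders table and its columns are assumed for illustration, and only the placeholders are interpolated, never the values.

```python
import sqlite3

def order_totals_naive(conn: sqlite3.Connection, user_ids: list[int]) -> dict[int, float]:
    # N+1 pattern AI tools often emit: one database round trip per user.
    return {
        uid: conn.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE user_id = ?", (uid,)
        ).fetchone()[0]
        for uid in user_ids
    }

def order_totals_batched(conn: sqlite3.Connection, user_ids: list[int]) -> dict[int, float]:
    # Single parameterized query; placeholders are generated, values are still bound.
    placeholders = ",".join("?" for _ in user_ids)
    rows = conn.execute(
        f"SELECT user_id, SUM(total) FROM orders"
        f" WHERE user_id IN ({placeholders}) GROUP BY user_id",
        user_ids,
    ).fetchall()
    totals = dict.fromkeys(user_ids, 0.0)
    totals.update({uid: total for uid, total in rows})
    return totals
```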

When Full Performance Review Is Required

  • Code on hot paths (called > 100 times per second)
  • Code processing large data sets (> 10,000 records)
  • Code with latency-sensitive user-facing interactions
  • Background jobs or batch processing code
  • Code that introduces new external service calls

Quality Assurance Procedures

Test Development

AI-generated code MUST be accompanied by comprehensive tests. The testing strategy for AI-assisted code includes:

| Test Type | Requirement | Who Creates |
| --- | --- | --- |
| Unit tests | REQUIRED for all AI-generated code | AI-generated (initial) + human review and augmentation |
| Integration tests | REQUIRED for code that interacts with external systems | Human-written (AI MAY assist) |
| Edge case tests | REQUIRED; AI-generated tests often miss edge cases | Human-written |
| Security tests | REQUIRED for security-sensitive code | Human-written with security expertise |
| Performance tests | RECOMMENDED for performance-sensitive code | Human-written |
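The edge-case row is where human augmentation adds the most value. Below is a minimal sketch, assuming pytest is the organization's test runner; apply_discount is an inline stand-in for the AI-generated function under review.

```python
import pytest

def apply_discount(price: float, percent: float) -> float:
    # Stand-in for the AI-generated function under test.
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (1 - percent / 100)

@pytest.mark.parametrize(
    "price, percent, expected",
    [
        (100.0, 10, 90.0),    # happy path
        (100.0, 0, 100.0),    # boundary: no discount
        (100.0, 100, 0.0),    # boundary: full discount
        (0.0, 50, 0.0),       # boundary: zero price
    ],
)
def test_apply_discount_boundaries(price, percent, expected):
    assert apply_discount(price, percent) == pytest.approx(expected)

def test_apply_discount_rejects_invalid_percent():
    # Error condition that AI-generated happy-path tests typically omit.
    with pytest.raises(ValueError):
        apply_discount(100.0, 150)
```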

Test Quality Criteria

Tests for AI-generated code MUST meet higher quality standards than typical unit tests because they serve as the primary validation that the AI's output is correct:

  • Assertion quality — Tests MUST contain meaningful assertions that verify behavior, not just that code runs without exceptions
  • Edge case coverage — Tests MUST cover boundary conditions, null/empty inputs, error conditions, and concurrent access scenarios
  • Independence — Tests MUST be independent of each other and of external state
  • Readability — Test names MUST clearly describe what is being tested and what the expected behavior is
  • Coverage threshold — AI-generated code MUST achieve code coverage at or above the organizational standard (typically >= 80% line coverage)

Documentation Requirements

The following documentation MUST be completed during human hardening:

  • Inline code comments are accurate and explain "why," not "what"
  • Public API documentation (JSDoc, Javadoc, docstrings) is complete and accurate
  • README or wiki documentation is updated for any new features or configuration
  • AI attribution metadata is present on all AI-assisted files and commits (an illustrative header follows this list)
  • Any deviations from the original Business Intent are documented and communicated to stakeholders
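As an illustration of attribution metadata, a file header along the following lines can carry the required fields; the field names and format shown here are placeholders, and the actual schema is defined by the organization's attribution standard.

```python
# AI attribution metadata (illustrative field names; use the organization's schema)
# ai-assisted: true
# ai-tool: <tool name and version>
# exploration-record: <link to the exploration-stage session or prompt log>
# human-hardened-by: <developer>
# hardening-completed: <YYYY-MM-DD>
```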

Hardening Completion Criteria

Human hardening is complete when ALL of the following criteria are met:

  • All refactoring checklist items are verified
  • All security review checklist items are verified
  • Performance review is complete (for applicable code paths)
  • All tests pass with required coverage
  • Documentation is complete
  • AI attribution metadata is in place
  • The code is ready for peer review as part of the standard PR process
  • The developer is confident the code meets production quality standards

Human hardening is where engineering discipline meets AI acceleration. The exploration stage generates raw material quickly; hardening transforms it into production-quality software. Skipping or rushing this stage negates the safety mechanisms of the entire AEEF framework. The time invested in hardening is what allows the organization to deploy AI-assisted code with the same confidence as human-authored code — and the data collected through Expanded Metrics proves it.