Pilot Project Selection Criteria

This section defines the criteria for choosing pilot projects that will serve as the initial testbed for AI-assisted development. Pilot selection is a strategic decision — choose too simple a project and the results are unconvincing; choose too complex and the risk of failure obscures the evaluation. The ideal pilot demonstrates meaningful value, contains risk within acceptable bounds, and provides clear, measurable data points to inform the go/no-go decision for Phase 2: Structured Expansion.

Ideal Project Characteristics

A well-selected pilot project exhibits the following characteristics. Organizations SHOULD aim to satisfy at least seven of these ten criteria; any criterion stated as MUST is mandatory regardless of the overall count.

Technical Characteristics

Moderate complexity — The project SHOULD involve 2-4 week development cycles with well-understood requirements. It MUST NOT be a greenfield architecture initiative or a critical system rewrite.
Established technology stack — The project MUST use languages and frameworks that are well-supported by the selected AI tools. Niche or legacy technology stacks SHOULD be avoided for initial pilots.
Strong test infrastructure — The project MUST have existing automated test suites (unit and integration) that can validate AI-generated code. Projects without test infrastructure are NOT suitable pilots.
Non-critical production path — The project SHOULD NOT be on the critical path for a major revenue-generating release. Internal tools, developer productivity features, and non-customer-facing services are RECOMMENDED for initial pilots.
Clear code review culture — The project team MUST already practice regular code review. Teams without an established review culture will not provide reliable quality signal.

Organizational Characteristics

Willing and experienced team — The team MUST include developers who volunteered for the pilot and have at least 2 years of experience in the project's technology stack.
Supportive management — The team's direct management MUST actively support the pilot, including allocating time for training, measurement overhead, and feedback sessions.
Stable scope — The project SHOULD have stable, well-defined scope for the pilot duration. Projects with rapidly shifting requirements are poor candidates.
Visible outcomes — The project SHOULD produce outcomes that are demonstrable to stakeholders, supporting the case for expansion.
Representative workload — The project SHOULD involve a mix of task types (new feature development, bug fixes, refactoring, test writing) to evaluate AI assistance across multiple scenarios.

Risk Scoring

Every candidate pilot project MUST be scored against the following risk factors. Projects with a total risk score above 35 (out of a maximum of 55) MUST NOT be selected as pilots without Steering Committee exception approval.

Risk Factor	Low Risk (1-2)	Medium Risk (3)	High Risk (4-5)	Weight
Data sensitivity	Public/Internal data only	Some Confidential data	Restricted data (PII/PHI/PCI)	3x
Production impact	Internal tool, no customer impact	Limited customer impact	Direct customer-facing, revenue impact	3x
Technical complexity	CRUD operations, well-understood patterns	Moderate business logic, some integration	Distributed systems, complex algorithms	2x
Regulatory scope	No regulatory requirements	Standard compliance (SOC 2)	Heavy regulation (HIPAA, PCI-DSS)	2x
Team experience	Senior team, deep domain knowledge	Mixed experience levels	Junior team or new domain	1x

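Risk Score Calculation: Multiply each factor's rating (1-5) by its weight and sum the results across all five factors. The weights total 11, so the minimum possible score is 11 and the maximum is 55.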
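
As a rough illustration of the calculation, the following Python sketch computes the weighted total from per-factor ratings. The weights are taken from the risk factor table; the factor keys and the function name `total_risk_score` are hypothetical names chosen for this example and are not defined by the framework.

```python
# Weights from the risk factor table. The keys are illustrative identifiers.
WEIGHTS = {
    "data_sensitivity": 3,
    "production_impact": 3,
    "technical_complexity": 2,
    "regulatory_scope": 2,
    "team_experience": 1,
}


def total_risk_score(ratings: dict[str, int]) -> int:
    """Return the sum of (rating x weight) over all five risk factors."""
    if set(ratings) != set(WEIGHTS):
        raise ValueError("ratings must cover exactly the five risk factors")
    for factor, rating in ratings.items():
        if not 1 <= rating <= 5:
            raise ValueError(f"{factor}: rating must be between 1 and 5")
    return sum(ratings[factor] * weight for factor, weight in WEIGHTS.items())


# Illustrative ratings for an internal tool with moderate business logic.
example = {
    "data_sensitivity": 2,
    "production_impact": 1,
    "technical_complexity": 3,
    "regulatory_scope": 1,
    "team_experience": 2,
}
print(total_risk_score(example))  # 2*3 + 1*3 + 3*2 + 1*2 + 2*1 = 19
```

Rejecting incomplete or out-of-range ratings up front is a design choice for this sketch; it keeps a missing factor from silently lowering the total.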

Risk Thresholds

Total Risk Score	Decision
11-20	Ideal pilot candidate — Proceed with standard approval
21-30	Acceptable pilot candidate — Proceed with documented risk mitigations
31-35	Marginal candidate — Requires additional mitigations and Director-level approval
36+	Not recommended — Select a different project or request Steering Committee exception
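
The threshold bands above can be expressed as a simple lookup. This is a minimal sketch: the function name `classify_risk` and the returned labels are hypothetical, and the range check assumes the 11 to 55 score range implied by the weights.

```python
def classify_risk(total_score: int) -> str:
    """Map a total risk score to the decision band from the thresholds table."""
    if not 11 <= total_score <= 55:
        raise ValueError("total risk score must be between 11 and 55")
    if total_score <= 20:
        return "ideal"            # proceed with standard approval
    if total_score <= 30:
        return "acceptable"       # proceed with documented risk mitigations
    if total_score <= 35:
        return "marginal"         # additional mitigations, Director-level approval
    return "not recommended"      # different project or Steering Committee exception


for score in (19, 27, 35, 36):
    print(score, classify_risk(score))
```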

Team Readiness Assessment

Before a team is approved for a pilot, the following readiness criteria MUST be verified:

Required Readiness Criteria

All pilot developers have completed the Developer Training curriculum and passed the assessment
All pilot developers have signed the Acceptable Use Policy per Baseline Security Policies
The team has an established code review process with documented standards
The project has automated test infrastructure with at least 60% code coverage
The team's Tech Lead has been briefed on AI output review responsibilities
The team manager has confirmed willingness to allocate 15-20% overhead for measurement and feedback activities

Success Metrics

Each pilot project MUST define success metrics before the pilot begins. The following metrics are REQUIRED:

Quantitative Metrics

Metric	Measurement Method	Target
Velocity change	Story points completed per sprint vs. baseline	No degradation in first 2 sprints; 10%+ improvement by sprint 4
Defect density	Defects per 1,000 lines of code (AI-assisted vs. baseline)	No increase vs. baseline
Security findings	SAST/DAST findings per release (AI-assisted vs. baseline)	No increase vs. baseline
Code review cycle time	Time from PR creation to merge	Baseline or better
AI attribution rate	Percentage of commits with AI attribution metadata	100% of AI-assisted commits
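
To make the defect density comparison concrete, the sketch below computes defects per 1,000 lines of code and the percentage change against a baseline. The function names and the sample figures are hypothetical and chosen only for illustration; how defects and lines of code are counted is an assumption each team needs to fix before the pilot starts.

```python
def defects_per_kloc(defects: int, lines_of_code: int) -> float:
    """Defect density expressed as defects per 1,000 lines of code."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)


def percent_change(baseline: float, pilot: float) -> float:
    """Relative change of the pilot value against the baseline, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (pilot - baseline) / baseline * 100


# Illustrative figures only, not measured data.
baseline_density = defects_per_kloc(defects=12, lines_of_code=8000)
pilot_density = defects_per_kloc(defects=9, lines_of_code=5000)
change = percent_change(baseline_density, pilot_density)
print(f"baseline={baseline_density:.2f}, pilot={pilot_density:.2f}, change={change:+.1f}%")
```

The same `percent_change` helper could be applied to velocity or security findings, provided the baseline is measured the same way as the pilot value.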

Qualitative Metrics

Metric	Collection Method	Target
Developer satisfaction	Survey (1-5 scale) at weeks 2, 6, and 12	Average score >= 3.5
Perceived code quality	Developer and reviewer assessment	"Same or better" for >80% of respondents
Tool usability	Survey (1-5 scale)	Average score >= 3.5
Training effectiveness	Post-training survey	Average score >= 4.0
Confidence level	Developer self-assessment of AI tool proficiency	"Confident" or "Very Confident" for >70%

Go/No-Go Criteria

At the pilot conclusion (typically 6-8 weeks into Phase 1), the Phase Lead MUST conduct a go/no-go assessment for each pilot project:

Go Criteria (ALL must be met)

No Critical or High severity security incidents attributable to AI tool usage
Defect density has not increased more than 10% relative to baseline
At least 80% of pilot developers recommend continuing AI tool usage
AI attribution metadata is present on 95%+ of AI-assisted commits
All Acceptable Use Policy violations (if any) have been resolved

No-Go Indicators (ANY triggers no-go)

A security incident involving Restricted or Confidential data leakage
Defect density increase of more than 25% relative to baseline
Fewer than 50% of pilot developers recommend continuing
Systematic failure to follow code review or governance processes
Evidence of automation complacency (AI output accepted without review)

Conditional Continuation

If the pilot neither clearly passes go criteria nor triggers a no-go indicator, the Steering Committee MAY approve a conditional continuation of up to 4 additional weeks with:

Documented remediation actions for identified issues
Enhanced monitoring and weekly status reporting
A defined re-assessment date
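
The go, no-go, and conditional outcomes described above can be sketched as a single decision function. This is an illustration under stated assumptions: the `PilotResults` fields and the `assess` function are hypothetical names, no-go indicators are checked first so that they take precedence over the go criteria, and a "conditional" result only means the pilot is eligible for a continuation that the Steering Committee MAY approve.

```python
from dataclasses import dataclass


@dataclass
class PilotResults:
    critical_or_high_ai_incidents: int     # incidents attributable to AI tool usage
    restricted_or_confidential_leak: bool  # data leakage incident occurred
    defect_density_change_pct: float       # change relative to baseline, in percent
    recommend_continuing_pct: float        # share of pilot developers, in percent
    attribution_pct: float                 # AI-assisted commits carrying metadata
    unresolved_aup_violations: int         # Acceptable Use Policy violations still open
    systematic_process_failure: bool       # code review or governance not followed
    automation_complacency: bool           # AI output accepted without review


def assess(r: PilotResults) -> str:
    """Return "go", "no-go", or "conditional" for one pilot project."""
    # Any single no-go indicator triggers a no-go (assumed to take precedence).
    if (
        r.restricted_or_confidential_leak
        or r.defect_density_change_pct > 25
        or r.recommend_continuing_pct < 50
        or r.systematic_process_failure
        or r.automation_complacency
    ):
        return "no-go"
    # All go criteria must hold for a go decision.
    if (
        r.critical_or_high_ai_incidents == 0
        and r.defect_density_change_pct <= 10
        and r.recommend_continuing_pct >= 80
        and r.attribution_pct >= 95
        and r.unresolved_aup_violations == 0
    ):
        return "go"
    return "conditional"


# Defect density rose 15%: above the go limit, below the no-go trigger.
print(assess(PilotResults(0, False, 15.0, 85.0, 97.0, 0, False, False)))
```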

Pilot project selection is one of the most consequential decisions in Phase 1. These projects will become the reference cases for expanding AI-assisted development across the organization in Phase 2 — invest the effort to choose them well.
