GAN-Coding: Vibe-Coding But More Boring

Today I was a guest in a software engineering class run by an esteemed colleague. I was asked to describe my use of AI for software development. It was the first time I'd organized my thoughts around my process, and I was surprised to learn that I had a process at all. In retrospect, it seems I've been developing and refining it, however informally, for over three years (I enjoyed beta access to ChatGPT 3.5).

I believe the process is repeatable and has so far produced useful outputs, at least for my work. I'd like to share it with you, not only in the hope of verifying its usefulness across a broader spectrum of use cases, but also in the "open-source" spirit that made me fall in love with software development in the early 00s.

"Human knowledge belongs to the world, like Shakespeare, or Aspirin." ~ Teddy Chin, AntiTrust (2001)

I'm calling the process "GAN-coding", in contrast with the widely used and overloaded term "vibe-coding".

What is GAN-coding? (and what it isn't)

Vibe-coding is a slangy term used to describe a style of software development in which a human uses natural language prompts to instruct an AI model to generate code rather than writing the code by hand.

GAN-coding is a software development methodology where code is produced through repeated adversarial cycles between generators and discriminators, with humans retaining final authority, and correctness is enforced through explicit rejection loops rather than trust.

This isn't "GAN" in the machine-learning sense, technically. I'm borrowing the generator/discriminator pattern because it matches the workflow: produce an artifact, then adversarially challenge it.

"Vibe-coding" has a broad scope. In practice it can mean hyper-rigorous AI-assisted coding very much like what I'm labeling "GAN-coding", OR it can mean a one-shot "make me a website that takes payments for a digital product" or anything in between. When I refer to vibe-coding in this article, I'm scoping down to "average-ish" entrepreneurial or professional use of AI-assisted coding.

GAN-coding can be seen as an extension of vibe-coding with some key differences:

| Characteristic | Vibe-coding | Manual human coding | GAN-coding |
| --- | --- | --- | --- |
| AI-driven: LLM generates code | Very high | Low (incidental assistance) | Medium-high |
| Natural language (not code) prompts | Very high | Low (inline code completion) | Medium-high (structured prompts including code) |
| Human comprehension of generated code | Low-high | Very high (ideally) | Medium-high (by definition) |
| Applied rigor: code reviews and testing | Low-high | Low-high | High (by definition) |

It's like the "TDD" (test-driven development) cousin of vibe-coding. If you already do strict code review + tests with AI assistance, GAN-coding is largely the same, but I've attempted to make it explicit, role-driven, and repeatable.

Core Principles

  1. No Single Agent Is Trusted in Isolation. Every meaningful artifact—design, code, tests, or review—is challenged by an independent agent. Humans and AI models alternate roles as generators and discriminators.
  2. Discrimination Is the Bottleneck. Generation is fast and inexpensive. Evaluation, understanding, and rejection are not. The process is intentionally optimized to preserve human attention for high-leverage decisions.
  3. Correctness Is Proven, Not Assumed. Code must survive adversarial scrutiny: tests that actually verify behavior, reviews that look for failure modes, and diffs that can be reasoned about line by line.
  4. Diversity Defends Against Correlated Failure. Multiple AI models are used intentionally, not interchangeably. Different priors, biases, and blind spots reduce the risk of silent, shared error (see the sketch after this list).
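
To make the fourth principle concrete, here is a minimal Python sketch of pinning distinct models to distinct roles. The model identifiers are placeholders, not recommendations, and the guard function is purely illustrative.

```python
# Sketch: assign a different (placeholder) model identifier to each role so the
# coding model never reviews its own work. Names are illustrative only.
ROLE_MODELS = {
    "design_review": "design-review-model",            # Phases 1-2
    "coding": "coding-model",                           # Phases 3-4 (and Phase 6 arbitration)
    "independent_review": "independent-review-model",   # Phase 5
}

def assert_diverse(roles: dict) -> None:
    """Fail fast if the same model would both generate and independently review."""
    if roles["coding"] == roles["independent_review"]:
        raise ValueError("Correlated-failure risk: the reviewer must differ from the coder")

assert_diverse(ROLE_MODELS)
```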

When we adopt these principles and the process I'll describe below, some interesting things happen.

| Characteristic | Vibe-coding | Manual human coding | GAN-coding |
| --- | --- | --- | --- |
| Primary failure mode | Confidently shipping broken code | Human error, blind spots, fatigue | Over-constraint or slow convergence |
| Who is accountable | Ambiguous (blame the AI) | Human author(s) | Human discriminators / co-authors |

If the human doesn't fully understand the code, then the human cannot be accountable, or at least has an escape hatch for accountability. If rigorous validation is optional, as is the case in "average-ish" vibe-coding, then failure emerges from false confidence: "The AI scores high on SWE benchmarks, so this must be good to go."

While far from perfect, manual human coding was the progenitor of most of the Internet until recently. It's the devil we know, and we have process around it. I propose we apply a similarly rigorous process to AI-assisted coding, in order to reap the considerable benefits AI can confer while maintaining some semblance of the human ownership, accountability, and authority that align outcomes with intent and values.

The GAN-coding Process

A few critical features of the GAN-coding process:

  • Explicit human rejection loops and human accountability
  • AI-assisted adversarial discrimination cycles driven by human judgement
  • Constraint-driven prompting (tests, invariants, contracts)
  • Human-owned architectural decisions

Phase 1: Design

The process begins with traditional requirements gathering, specification writing, and architectural design. This phase is intentionally aligned with best practices from manual human coding. The human generates design artifacts, specifications, requirements, etc.

Roles

  • Human: Generator
  • AI (design-review model): Discriminator

The AI is used to challenge assumptions, surface edge cases, and critique architecture—not to author it. Design concerns are resolved before implementation planning begins.


Phase 2: Implementation Prompt

A canonical prompt is produced that describes the system with sufficient clarity that either a human engineer or an AI system could implement it.

This constraint is deliberate. If only a specific model can interpret the prompt correctly, intent has leaked into the model rather than being captured in the design. In GAN-coding, prompts function as contracts, not vibes.
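
As a hedged illustration of what "prompt as contract" can mean in practice, the Python sketch below captures intent as an explicit interface plus invariants that any implementer, human or model, must satisfy. The rate-limiter example and its parameters are hypothetical, not drawn from this repository.

```python
# Hypothetical illustration of a "prompt as contract": the interface and the
# invariants below would be stated verbatim in the implementation prompt.
from typing import Protocol

class RateLimiter(Protocol):
    def allow(self, key: str, now_s: float) -> bool:
        """Return True if the request identified by `key` may proceed.

        Contract:
        - At most `limit` allowed calls per key within any `window_s` seconds.
        - Keys are independent; one key's traffic never affects another.
        - `now_s` is caller-supplied so behavior is deterministic and testable.
        """
        ...

def contract_check(limiter: RateLimiter, limit: int) -> None:
    """Executable slice of the contract: the (limit + 1)th call in one window is rejected."""
    assert all(limiter.allow("k", now_s=0.0) for _ in range(limit))
    assert not limiter.allow("k", now_s=0.0)
    assert limiter.allow("other", now_s=0.0)  # keys are independent
```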

Roles

  • AI (design-review model): Generator
  • Human: Discriminator

Phase 3: Implementation Planning

A coding-focused AI model decomposes the work into explicit phases or chunks, ideally with semantic and functional boundaries. The human reviews and approves the plan. Coding does not begin without an approved plan.

Roles

  • AI (coding model): Generator
  • Human: Discriminator

Pro tip: keep iteration phases small to prevent context drift.


Phase 4: Iterative Code Generation and Testing

For each phase:

  1. AI generates code.
  2. AI generates a test suite targeting that code.
    • Steps 1 and 2 can be done in reverse order for closer TDD alignment.
  3. The human scrutinizes the tests (see the sketch after this list):
    • Do they verify behavior or merely exercise code?
    • Do they cover critical paths and error modes?
  4. The human reviews the code for comprehension, targeting medium-high understanding. Adversarial tests and independent reviews act as safety nets for the parts the human hasn't fully parsed, but tolerate gaps with extreme caution.
  5. All changes are committed. At every step, diffs are reviewed manually by the human.
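
To illustrate the distinction in step 3, here is a toy Python example; `slugify` is invented for this sketch and is not code from this repository. The first test merely exercises the code, while the second verifies behavior, including an error mode.

```python
import re
import pytest

def slugify(text: str) -> str:
    """Toy implementation, included only so the tests below are runnable."""
    if not text.strip():
        raise ValueError("empty input")
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

def test_exercises_code_only():
    # Weak: passes for nearly any implementation that returns something.
    assert slugify("Hello, World!") is not None

def test_verifies_behavior():
    # Strong: pins the observable contract, including an error mode.
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  multiple   spaces ") == "multiple-spaces"
    with pytest.raises(ValueError):
        slugify("")
```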

Roles

  • AI (coding model): Generator
  • Human: Discriminator

Iterations continue until the phase converges.


Phase 5: Independent AI Review

A different AI model performs a full review of the code and tests. The human is responsible for selecting a discriminator model suitable to the task and for providing sufficient context for an effective review: requirements, prioritization, conventions, domain knowledge, etc. NOTE: this happens before pushing a PR for automated AI code review in CI/CD.
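
As a rough sketch of what "sufficient context" might look like, the snippet below bundles earlier-phase artifacts with the diff under review. The file paths and the priority ordering are placeholders for whatever your project actually uses.

```python
from pathlib import Path

def build_review_request(diff: str) -> str:
    """Assemble context for the independent reviewer (placeholder paths)."""
    context = {
        "requirements": Path("docs/spec.md").read_text(),              # Phase 1 artifact
        "implementation prompt": Path("docs/prompt.md").read_text(),   # Phase 2 artifact
        "conventions": Path("CONTRIBUTING.md").read_text(),
        "priorities": "correctness > security > performance > style",
    }
    sections = [f"## {name}\n{body}" for name, body in context.items()]
    sections.append("## diff under review\n" + diff)
    return "\n\n".join(sections)
```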

Roles

  • Original AI: Generator (as source of artifacts under review)
  • Independent AI: Discriminator

This phase not only defends against human fatigue and blind spots but also injects diversity of thought to reduce correlated failures.


Phase 6: Review Arbitration

The human reviews the AI reviewer's feedback. Valid concerns are verified by the original coding model.

Roles

  • Independent (reviewer) AI: Generator
  • Human + Original (coding) AI: Discriminator

While bringing the original coding AI back as a discriminator may invite defensive rejection ("my code is solid"), it also possesses unique context that can greatly reduce false-positive issues. There's an implicit additional step in which the roles are:

  • Original (coding) AI: Generator
  • Human: Discriminator

Only verified issues are addressed. All subsequent changes are subject to "Phase 4: Iterative Code Generation and Testing".


Completion and Convergence

The system is considered complete only when:

  • All planned phases are implemented
  • Tests meaningfully enforce invariants
  • The human can explain the system at an architectural and code level
  • No unresolved discriminator objections remain

| Phase | Generator | Discriminator | Artifact |
| --- | --- | --- | --- |
| 1. Design | Human | AI (Design-Review Model) | Design spec |
| 2. Implementation Prompt | AI (Design-Review Model) | Human | Implementation prompt |
| 3. Implementation Planning | AI (Coding Model) | Human | Phased implementation plan |
| 4. Code Generation & Testing | AI (Coding Model) | Human | Code + tests |
| 5. Independent AI Review | AI (Coding Model) | AI (Review Model) | Change requests |
| 6. Review Arbitration | AI (Review Model) | Human + AI (Coding Model) | Verified/rejected changes |

Recursion: Verified changes loop back to Phase 4. Upon phase completion, the cycle repeats for subsequent phases until the project is complete.

flowchart TD
    subgraph Setup["Setup"]
        A["1. Design<br/>Human → AI"] --> B{OK?}
        B -->|No| A
        B -->|Yes| C["2. Prompt<br/>AI → Human"]
        C --> D{OK?}
        D -->|No| C
        D -->|Yes| E["3. Planning<br/>AI → Human"]
        E --> F{OK?}
        F -->|No| E
    end

    F -->|Yes| G

    subgraph Cycle["Recursive Implementation"]
        G["4. Code & Test<br/>AI → Human"] --> H{OK?}
        H -->|No| G
        H -->|Yes| I["5. AI Review<br/>AI → AI"]
        I --> J["6. Arbitration<br/>AI → Human + AI"]
        J --> K{Valid<br/>Changes?}
        K -->|Yes| G
    end

    K -->|No| L{Phase<br/>Done?}
    L -->|No| G
    L -->|Yes| M{More<br/>Phases?}
    M -->|Yes| G
    M -->|No| N["✓ Complete"]
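For readers who prefer code to diagrams, here is a compressed Python sketch of the same recursion. Every callable is a stand-in for a human or model action within the process, not a real API.

```python
def gan_coding_cycle(phases, generate, run_tests, human_approves, independent_review, arbitrate):
    """Placeholder callables model the generator/discriminator roles in Phases 4-6."""
    for phase in phases:
        converged = False
        while not converged:
            artifact = generate(phase)                        # Phase 4: code + tests
            if not human_approves(artifact, run_tests(artifact)):
                continue                                      # human rejection loop: regenerate
            findings = independent_review(artifact)           # Phase 5: a different model reviews
            verified_issues = arbitrate(findings)             # Phase 6: human + coding model arbitrate
            if verified_issues:
                continue                                      # verified issues loop back to Phase 4
            converged = True                                  # phase converged; move to the next phase
```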

In Practice

Is this actually useful? Isn't it boring and expensive?

Yes. Yes.

For many software engineers, individuals or teams, this process may net out to reduced productivity. At first. But there's a similar argument to be made for TDD in general, or for the stricter quality controls that naturally accompany scaling, successful products. The argument is that the cost of rigor gets amortized, paid off, and starts returning value over time. (And by the way, some or all of your team members may be doing this already.)

That said, the GAN-coding process as it stands isn't cheap. To calibrate when it's worth the cost, let's posit some dimensions along which we can gauge whether GAN-coding could net more valuable returns.

| Dimension | Vibe-coding | Manual human coding | GAN-coding |
| --- | --- | --- | --- |
| Team size | Very small | Small–medium | Medium–large |
| Codebase growth | Fast, uneven, brittle | Slow, deliberate | Medium-fast, structured |
| Onboarding new contributors | Easy initially | Slow but deep | Moderate, principled |
| Consistency across modules | Low | Medium to High if enforced | Medium to High if enforced |
| Failure detection | Late | Medium-early | Early |
| Long-term maintainability | Low–medium | Medium–high | High |

For small teams and prototype products, GAN-coding might be a time-sink with minimal return. Say "no" to GAN-coding during rapid prototyping, or in cases where tests are disproportionately expensive relative to the risks they mitigate.

If speed is practically irrelevant, the codebase is super-stable and mature, or there's little pressure for engineers to stretch beyond their capacity, then manual human coding may be the approach that introduces the least risk. But even then, GAN-coding can provide valuable diversity of thought to detect otherwise hidden failures and tacit assumptions. And when the iterative review cycles and AI assistance inherent in GAN-coding result in more thorough documentation or better test coverage, any team can benefit from the improved maintainability.

The Human Element

GAN-coding isn't for everyone. It doesn't cater to the strengths and weaknesses of every coder equally. Let's look at a few skills that may be relevant (this list is far from exhaustive).

| Skill / Trait | Vibe-coding | Manual human coding | GAN-coding |
| --- | --- | --- | --- |
| Prompt articulation | Extremely high | Low or n/a | Extremely high |
| Syntax & language mastery | Low | Extremely high | Medium |
| Typing speed / mechanical | Low or n/a | High | Low or n/a |
| Systems thinking & architecture | Low–medium | High | High-extremely high |
| Critical evaluation & skepticism | Low–medium | Medium-high | Extremely high |
| Debugging & failure analysis | Low | High | Extremely high |
| Speed of ideation | Extremely high | Low–medium | Medium-high |

"Average-ish" vibe-coding rewards expressive intent over technical depth. You don't need to comprehend the outputs. You "load it in your browser" and if it works, it works.

Manual human coding rewards deep expertise; it sometimes rewards an architectural mindset and skepticism; it demands (but doesn't always get) a high level of debugging and failure analysis; and there's a manual dexterity component that makes it uniquely human (for now).

GAN-coding rewards judgement and curiosity.

No Single Agent Is Trusted in Isolation

This core principle requires the GAN-coder to be skeptical not only of the outputs of others, but also of their own.

Discrimination Is the Bottleneck

The GAN-coder, by virtue of selecting the methodology, is under strain to extend their innate capacity. Their curiosity drives the resilience required to review yet another line of AI-generated code and also to learn from it. It's only boring if curiosity is exhausted.

Correctness Is Proven, Not Assumed

The GAN-coder is required to set aside their biases, practice objective judgement and be curious about the truth rather than their preferences.

Takeaways

While GAN-coding may not work for everyone, I posit that to those it does work for, it will be greatly beneficial. Here's a recap:

  • No agent trusted in isolation.
  • Write a spec a human could implement.
  • Don't code without an implementation plan.
  • Correctness is proven. Use tests and always review them manually.
  • Discrimination is the bottleneck. Always run an independent model review before PR.
  • Diversity defends against correlated failure. Use different models for different roles.
  • The human is always the tiebreaker.

Note there are some nuances in terms of prompt engineering, model selection, and other bits I collectively consider "implementation details." A goal of mine for the GAN-coding process is that it can succeed regardless of such details, as long as each step of the process is implemented faithfully. The one exception is the prompt in "Phase 2: Implementation Prompt". That's a detail I feel is worth reiterating.

If you've read this far, I think it means something resonated with you. I look forward to your feedback and input. This is a work in progress, after all. I used something like the GAN-coding process to write this article: I produced outputs that I asked three different frontier AI models to review, validate, and enhance. I didn't write the mermaid diagram by hand, but I edited and iterated on it with an AI collaborator. I pasted the entire output for final review by multiple LLMs.

The final authority and discriminator though, is you :)
