Qualitative Research Analysis System

An AI-powered system for rigorous qualitative research analysis using Claude Code sub-agents. This system implements inductive coding methodology with multiple independent coders for reliability, following grounded theory principles.

Ready to use this for your research? Fork this repository and follow the setup instructions below to analyze your own interview data!

✨ Key Features

  • 🤖 3 Independent AI Coders - Each interview is coded three times so inter-rater reliability can be assessed
  • 👤 Human-in-the-Loop - You review and approve all coding decisions at checkpoints
  • 📊 Comprehensive Outputs - Analytical reports, structured codebooks, and CSV data exports
  • 🔄 Iterative Development - The code system evolves across interviews through constant comparison
  • 📄 Multiple Formats - Supports txt, json, docx, and pdf interview transcripts
  • 🔒 Privacy-First - Files stay on your machine; transcript text is sent to the Claude API only for analysis
  • 🎓 Research-Grade - Follows established qualitative methods (grounded theory, inductive coding)

📋 Prerequisites

  • Claude Code - Anthropic's official CLI for Claude (required)
  • Claude API access - An active Claude API subscription
  • Interview transcripts - At least 3-5 interviews in a supported format (txt, json, docx, pdf)

🚀 Installation & Setup

1. Fork or Clone This Repository

# Clone the repository
git clone https://github.com/YOUR_USERNAME/qualitative-research-system.git
cd qualitative-research-system

Or click the "Fork" button on GitHub to create your own copy.

2. Open in Claude Code

# Navigate to the project directory
cd qualitative-research-system

# Start Claude Code
claude

3. Verify Repository Structure

After cloning, your folder structure should look like this:

qualitative-research-system/
├── .claude/              # Agent and command configurations (pre-configured)
├── context/              # Your research context files (you add these)
├── interviews/           # Your interview transcripts (you add these)
├── state/                # System state (auto-generated during analysis)
├── outputs/              # Results (auto-generated after analysis)
├── README.md
└── QUICKSTART.md

4. Prepare Your Research Materials

The system needs two things from you:

  1. Your research question → Add to context/research_question.txt
  2. Your interview transcripts → Add to interviews/ folder

See the detailed setup guide below for formatting.

Data Privacy Note: Your files remain on your machine; transcript content is sent to Claude's API only when it is analyzed (subject to Anthropic's privacy policies). Remove any personally identifiable information from transcripts before analysis.

Git Privacy: A .gitignore file is included to protect your sensitive research data. Interview files, context files, state, and outputs are automatically excluded from git commits. This means you can safely use git for version control without accidentally committing participant data.
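For reference, the ignore rules described above amount to something like the following. This is a sketch of the idea only; the .gitignore that ships with the repository is authoritative:

```gitignore
# Illustrative sketch of the privacy-focused ignore rules
# (the repository ships its own .gitignore)
interviews/
context/
state/
outputs/
```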

🎯 What This System Does

  1. Analyzes interview transcripts using inductive coding
  2. Develops code systems iteratively through constant comparison
  3. Ensures reliability via 3 independent coders per interview
  4. Provides human oversight at critical checkpoints
  5. Generates comprehensive outputs: analytical reports, codebooks, and structured data exports

πŸ—οΈ System Architecture

Specialized Sub-Agents

The system uses 6 specialized AI agents, each with a focused role:

| Agent | Purpose | When It Runs |
| --- | --- | --- |
| format-adapter | Normalizes input formats (txt, json, docx, pdf) | Per interview, first |
| coder (×3) | Performs inductive coding independently | Per interview, in parallel |
| synthesis | Consolidates coding; identifies consensus/divergence | Per interview, after coders |
| report-writer | Generates analytical research report | After all interviews |
| codebook-generator | Creates structured codebook documentation | After all interviews |
| data-exporter | Exports data to CSV for further analysis | After all interviews |

Orchestration

The /analyze slash command orchestrates the entire workflow, managing:

  • Sequential interview processing
  • Agent coordination
  • State management
  • Human checkpoints
  • Output generation

πŸ“ Project Structure

qualitative-research-system/
├── .claude/
│   ├── commands/
│   │   └── analyze.md              # Main orchestrator
│   └── agents/
│       ├── format-adapter.md       # Input normalization
│       ├── coder.md                # Inductive coding
│       ├── synthesis.md            # Consolidation & reliability
│       ├── report-writer.md        # Research report generation
│       ├── codebook-generator.md   # Codebook documentation
│       └── data-exporter.md        # Data export to CSV
├── context/
│   ├── research_question.txt       # Your research question (add this!)
│   ├── interview_guidelines.txt    # Optional: interview protocol
│   └── theoretical_frameworks.txt  # Optional: theoretical lens
├── interviews/
│   ├── interview_01.txt            # Add your interviews here
│   ├── interview_02.txt
│   └── ...
├── state/
│   ├── code_system.json            # Evolving code system
│   ├── progress.json               # Workflow state
│   └── coder_outputs/              # Individual coder results
└── outputs/
    └── run_2025-01-07_143022/      # Timestamped results
        ├── report.md               # Analysis findings
        ├── codebook.md             # Human-readable codebook
        ├── codebook.json           # Structured codebook
        └── data/                   # Data exports
            ├── coded_excerpts.csv
            ├── code_frequencies.csv
            ├── code_cooccurrence.csv
            ├── interview_summary.csv
            └── coded_data_long.csv

🚀 Getting Started

1. Prepare Your Research Context

Create files in the context/ folder:

context/research_question.txt (required):

Research Question: [Your main research question here]

Sub-questions (optional):
- [Sub-question 1]
- [Sub-question 2]
- [Sub-question 3]

Example:
Research Question: How do professionals experience career transitions in the digital age?

Sub-questions:
- What challenges do they encounter during transitions?
- What strategies do they use to navigate uncertainty?
- How do they reconstruct their professional identity?

context/interview_guidelines.txt (optional):

Interview Protocol:
1. Opening: [Your opening question]
2. Main topic: [Core questions about your phenomenon]
3. Follow-up: [Probing questions]
4. Context: [Situational questions]
5. Reflection: [Meaning-making questions]

Example:
1. Opening: Tell me about your experience with [topic]
2. Challenges: What difficulties did you encounter?
3. Coping: How did you manage these challenges?
4. Support: What resources or support helped you?
5. Reflection: How has this experience shaped your perspective?

context/theoretical_frameworks.txt (optional):

Theoretical Lens: [Your theoretical framework]

Key concepts to consider:
- [Concept 1 and explanation]
- [Concept 2 and explanation]
- [Concept 3 and explanation]

Example:
Theoretical Lens: Identity Theory (Burke & Stets, 2009)

Key concepts to consider:
- Identity verification processes
- Self-meaning and role identities
- Identity disruption and change
- Social structure and identity formation

2. Add Your Interviews

Place interview transcripts in the interviews/ folder.

Supported formats:

  • Plain text (.txt)
  • JSON (.json)
  • Word documents (.docx)
  • PDFs (.pdf)

Example formats:

interviews/interview_01.txt (simple format - most common):

Interviewer: Can you tell me about your experience with [topic]?

Participant: Well, it was really challenging at first. I felt completely
lost and overwhelmed. The emotional weight was just so heavy, you know?
Like carrying this burden all the time.

Interviewer: What made it particularly difficult?

Participant: I think the uncertainty was the hardest part. Not knowing
when things would get better or if they ever would.

interviews/interview_02.json (structured format - if your data is in JSON):

{
  "interview_id": "interview_02",
  "participant_id": "P02",
  "date": "2024-11-20",
  "content": [
    {
      "speaker": "Interviewer",
      "text": "Tell me about your experience..."
    },
    {
      "speaker": "Participant",
      "text": "It was a really transformative time..."
    }
  ]
}

Note: The format-adapter agent will automatically normalize any of these formats into a standard structure for analysis.
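To illustrate what such normalization involves, here is a minimal Python sketch that parses the plain-text format above into the JSON-like structure shown. The function name and output schema are illustrative assumptions; the actual format-adapter agent defines its own behavior:

```python
import re

def normalize_txt(raw: str, interview_id: str) -> dict:
    """Parse a 'Speaker: text' transcript into a structured form
    like the JSON example above. Illustrative only -- the real
    format-adapter agent defines its own output schema."""
    turns = []
    current = None
    for line in raw.splitlines():
        match = re.match(r"^(Interviewer|Participant):\s*(.*)", line)
        if match:
            if current:
                turns.append(current)
            current = {"speaker": match.group(1), "text": match.group(2)}
        elif line.strip() and current:
            # Continuation of a wrapped paragraph: merge into the turn
            current["text"] += " " + line.strip()
    if current:
        turns.append(current)
    return {"interview_id": interview_id, "content": turns}
```

The key point is that multi-line answers are merged into single speaker turns, so coders always see complete utterances.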

3. Run the Analysis

In Claude Code, run:

/analyze

The system will:

  1. Load your research context
  2. Discover your interviews
  3. Process each interview sequentially:
    • 3 coders analyze independently
    • Synthesis identifies consensus
    • You review and approve codes
  4. Generate comprehensive outputs

4. Review Results

Find all outputs in a timestamped folder: outputs/run_YYYY-MM-DD_HHMMSS/

  • report.md - Read your findings here
  • codebook.md - Reference code definitions
  • data/*.csv - Use for further analysis

🔄 Workflow Detail

Iterative Coding Process

For each interview:

Interview N
    ↓
Format Adapter (normalizes input)
    ↓
Three Coders (independent, parallel)
    ├── Coder 1 → codes + definitions
    ├── Coder 2 → codes + definitions
    └── Coder 3 → codes + definitions
    ↓
Synthesis (compares, consolidates)
    ↓
🛑 HUMAN CHECKPOINT
    - Review consensus codes
    - Resolve divergences
    - Make final decisions
    ↓
Updated Code System
    ↓
Next Interview (with evolved codes)

Human Checkpoints

After each interview synthesis, you'll see:

═══════════════════════════════════════════
   CHECKPOINT: Interview 1 Coding Review
═══════════════════════════════════════════

βœ… HIGH CONFIDENCE CODES (Approved)
[Codes all 3 coders agreed on]

⚠️  CODES NEEDING YOUR REVIEW
[Unique codes, conflicts, ambiguities]

Your decisions guide the code system evolution.

📊 Output Files Explained

report.md

Comprehensive qualitative analysis report including:

  • Executive summary
  • Research context
  • Findings organized by themes
  • Representative quotes
  • Cross-cutting patterns
  • Methodology notes

codebook.md / codebook.json

Complete code documentation:

  • Code definitions
  • Application guidelines (when to use / not use)
  • Exemplar quotes
  • Related codes
  • Development history

data/coded_excerpts.csv

Main data export with columns:

  • excerpt_id, interview_id, participant_id
  • excerpt_text, codes (pipe-separated)
  • location, speaker, context fields
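Because the codes column is pipe-separated, downstream scripts need to split it before counting. A minimal Python sketch using only the standard library (column names as listed above; the file path is whatever your run produced):

```python
import csv
from collections import Counter

def code_counts(path: str) -> Counter:
    """Count code applications in coded_excerpts.csv, where the
    'codes' column is pipe-separated."""
    counts = Counter()
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            for code in row["codes"].split("|"):
                code = code.strip()
                if code:
                    counts[code] += 1
    return counts
```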

data/code_frequencies.csv

Summary statistics per code:

  • Total excerpts coded
  • Interview prevalence
  • Exemplar quote

data/code_cooccurrence.csv

Shows which codes appear together (useful for theme analysis)
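If you want to recompute co-occurrence yourself from coded_excerpts.csv, the idea is to count pairs of codes applied to the same excerpt. A sketch of that concept (not necessarily the data-exporter's exact logic):

```python
import csv
from collections import Counter
from itertools import combinations

def cooccurrence(path: str) -> Counter:
    """Count how often pairs of codes are applied to the same excerpt,
    reading the pipe-separated 'codes' column of coded_excerpts.csv."""
    pairs = Counter()
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # Deduplicate and sort so each pair has one canonical key
            codes = sorted({c.strip() for c in row["codes"].split("|") if c.strip()})
            for a, b in combinations(codes, 2):
                pairs[(a, b)] += 1
    return pairs
```

High-frequency pairs are good candidates for merging into themes during analysis.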

data/interview_summary.csv

Per-interview overview:

  • Excerpts coded
  • Unique codes used
  • Dominant themes

data/coded_data_long.csv

Long format (one row per code application) for statistical analysis in R/Python/SPSS

🎓 Methodological Rigor

This system implements qualitative research best practices:

Inductive Coding (Grounded Theory)

  • Codes emerge from data, not predetermined categories
  • Constant comparison method
  • Iterative refinement across interviews

Inter-Rater Reliability

  • 3 independent coders per interview
  • Systematic comparison of coding decisions
  • Consensus building with human oversight

Transparency

  • Clear audit trail of all decisions
  • Development history tracked in codebook
  • State saved at each step

Human-in-the-Loop

  • Critical checkpoints for review
  • Researcher maintains final authority
  • AI assists, human decides

🔧 Advanced Usage

Resuming Interrupted Analysis

If analysis is interrupted, state is saved. Run /analyze again - the system will detect existing progress and ask if you want to continue.

Refining Existing Code System

To analyze new interviews with an existing code system:

  1. Keep state/code_system.json from previous run
  2. Add new interviews to interviews/
  3. Run /analyze
  4. Choose "continue from existing code system"
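The exact schema of state/code_system.json is defined by the agents rather than documented here, but conceptually it stores the evolving codes. A purely hypothetical illustration (all field names are assumptions):

```json
{
  "codes": [
    {
      "name": "uncertainty",
      "definition": "References to not knowing outcomes or timelines",
      "first_seen": "interview_01",
      "exemplar": "Not knowing when things would get better"
    }
  ],
  "last_updated_after": "interview_03"
}
```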

Customizing Agents

Edit agent prompts in .claude/agents/ to:

  • Adjust coding style (more interpretive vs. descriptive)
  • Add domain-specific instructions
  • Modify output formats
  • Include additional quality checks

Using Different Models

In agent YAML frontmatter, you can specify:

model: sonnet  # Default - balanced speed/quality
model: opus    # Higher quality, slower
model: haiku   # Faster, lower cost

📚 Example Use Cases

Educational Research

  • Student experiences, learning processes, identity development
  • Theory: Situated learning, communities of practice

Health Research

  • Patient experiences, illness narratives, coping strategies
  • Theory: Health belief model, illness trajectory

Organizational Studies

  • Workplace experiences, organizational culture, change management
  • Theory: Organizational learning, sensemaking

Social Sciences

  • Life transitions, social phenomena, community dynamics
  • Theory: Symbolic interactionism, phenomenology

🤝 Tips for Best Results

Before Starting

  • Clearly articulate your research question
  • Have 3-5+ interviews for meaningful patterns
  • Ensure transcripts are clean and complete

During Analysis

  • Carefully review synthesis outputs at checkpoints
  • Don't rush - thoughtful decisions improve code quality
  • Use "notes" to track analytical insights

After Completion

  • Read report.md critically
  • Validate codebook against raw data
  • Use CSV exports for quantitative follow-up

⚠️ Limitations & Considerations

AI Coding Limitations

  • AI excels at pattern recognition but may miss cultural nuances
  • Human review essential for interpretive depth
  • Best for well-structured interview data

Not a Replacement for Human Analysis

  • System assists qualitative analysis
  • Researcher judgment remains critical
  • Use as a tool, not a substitute for expertise

Data Privacy

  • Transcripts stay local on your machine
  • Remove identifying information before analysis
  • Follow institutional ethics guidelines

🆘 Troubleshooting

Q: Agent isn't reading my interview files
A: Check the file format - ensure it's .txt, .json, .docx, or .pdf and placed in the interviews/ folder

Q: Coding seems too superficial
A: Edit .claude/agents/coder.md to emphasize interpretive depth

Q: I want to change a code after a checkpoint
A: Edit state/code_system.json manually, then resume

Q: Output folder is empty
A: Check state/error_log.json for errors and confirm all interviews were processed

📖 Further Reading

Qualitative Methods

  • Charmaz, K. (2006). Constructing Grounded Theory
  • Saldaña, J. (2021). The Coding Manual for Qualitative Researchers
  • Braun & Clarke (2006). "Using thematic analysis in psychology"

AI-Assisted Qualitative Analysis

  • Emerging field - use critically and transparently
  • Always report AI assistance in methodology
  • Validate findings through traditional quality criteria

🤝 Contributing

Contributions are welcome! This system can be improved in many ways:

  • Agent prompts: Enhance coding strategies, add domain-specific expertise
  • Output formats: Add new export formats or visualization options
  • Documentation: Improve guides, add examples from different fields
  • Bug fixes: Report issues or submit fixes

To contribute:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/your-feature)
  3. Commit your changes (git commit -m 'Add your feature')
  4. Push to the branch (git push origin feature/your-feature)
  5. Open a Pull Request

πŸ“ License

This project is provided as-is for research and educational purposes.

Citation: If you use this system in research, please:

  • Describe the methodology transparently in your methods section
  • Cite appropriately (e.g., "Analysis was conducted using an AI-assisted qualitative coding system with multiple independent coders and human oversight")
  • Follow your institution's guidelines for AI-assisted research

πŸ™ Acknowledgments

Built using:

  • Claude Code by Anthropic
  • Grounded theory and inductive coding methodologies
  • Best practices from qualitative research literature

🚦 Ready to Start?

  1. Fork this repository
  2. Add your research context to context/research_question.txt
  3. Add your interviews to interviews/ folder
  4. Run /analyze in Claude Code
  5. Review and approve codes at checkpoints
  6. Get your results in outputs/ folder

Questions? Check the Troubleshooting section or open an issue on GitHub!
