NLSS helps researchers run statistical analyses through natural language conversations with an AI coding agent.
You describe what you want in plain English; NLSS handles the R scripts, produces well-formatted tables, and logs everything for reproducibility.
- Quick Start (5 minutes)
- What NLSS Does
- Glossary
- Part I — Installation
- Part II — Using NLSS
- Part III — Configuration & Customization
- Part IV — For Developers
- Troubleshooting
- License & Legal
Already have VS Code, a coding agent (Codex or Claude Code), and R installed? Here's the fastest path:
Open R — from your Start menu (Windows), Applications folder (macOS), or terminal (Linux: type R) — and paste:
install.packages(c("arrow","car","curl","DHARMa","emmeans","foreign","ggplot2","haven","influence.ME","jsonlite","lavaan","lme4","lmerTest","mice","MVN","performance","psych","pwr","semPower","VIM","viridisLite","yaml"))
Codex users: Tell your agent:
"Install NLSS from https://github.com/docmh/nlss.git using $skill-installer"
Claude Code users: Download the NLSS ZIP, extract it, and move the nlss folder to:
- Windows: %USERPROFILE%\.claude\skills\nlss
- macOS/Linux: ~/.claude/skills/nlss
Restart your agent, then say:
"Run the NLSS demo to show me what it can do."
That's it! For detailed instructions, continue reading below.
┌─────────────────────────────────────────────────────────────────────┐
│ YOU (Senior Researcher) │
│ "Run descriptive stats for age and score, grouped by condition" │
└──────────────────────────────┬──────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ AI AGENT (Assistant Researcher) │
│ • Understands your request │
│ • Runs the appropriate R scripts │
│ • Asks clarifying questions if needed │
└──────────────────────────────┬──────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ NLSS OUTPUTS │
│ • report_canonical.md → Human-readable tables & narrative │
│ • analysis_log.jsonl → Machine-readable audit log │
│ • plots/ → Well-formatted figures │
│ • Full reports → Journal-style write-ups │
└─────────────────────────────────────────────────────────────────────┘
NLSS is packaged as an Agent Skill following the open Agent Skills standard. Your AI agent reads SKILL.md to discover what NLSS can do.
New to NLSS? Here are the key terms:
| Term | What it means |
|---|---|
| Workspace | A folder where NLSS stores your data, reports, and logs. Created automatically when you analyze a dataset. |
| Parquet | A fast, efficient file format. NLSS converts your data (CSV, SPSS, etc.) to Parquet for faster analysis. |
| Subskill | A single analysis module (e.g., descriptive-stats, t-test, regression). |
| Metaskill | A multi-step workflow that chains subskills together (e.g., write-full-report). |
| Agent | The AI assistant (Codex or Claude Code) that interprets your requests and runs NLSS. |
| IDE | Integrated Development Environment — the app where you write code and talk to the agent (e.g., VS Code, Cursor). |
| Skill | A plugin that teaches an AI agent new capabilities. NLSS is a skill. |
This guide walks you through setup step by step. Each step has GUI instructions (point-and-click) with terminal alternatives for those who prefer them.
┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Step 1 │ │ Step 2 │ │ Step 3 │ │ Step 4 │ │ Step 5 │
│ Install │ → │ Install │ → │ Install │ → │ Install R │ → │ Install │
│ an IDE │ │ an Agent │ │ R │ │ Packages │ │ NLSS │
└──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘
↓ ↓ ↓ ↓ ↓
VS Code or Codex or Download Open R and Copy folder
Cursor Claude Code from CRAN paste command to skills dir
An IDE is where you'll interact with the AI agent. Choose one:
Windows / macOS:
- Go to code.visualstudio.com
- Click the big Download button
- Run the installer and follow the prompts
- Launch VS Code when done
Linux:
- Go to code.visualstudio.com
- Download the .deb (Ubuntu/Debian) or .rpm (Fedora/RHEL) package
- Install via your package manager or double-click the file
- Or use Snap:
sudo snap install code --classic
- Go to cursor.com
- Download the installer for your platform (Windows, macOS, or Linux AppImage)
- Run the installer or make the AppImage executable and run it
- Cursor has AI built-in, but you'll still need to configure it for NLSS
The coding agent is the AI that understands your requests and runs NLSS. Choose one:
- Go to Codex download page
- Follow the installation instructions for your platform
- Open VS Code — Codex will appear in the sidebar
Recommended settings for NLSS:
- Use Agent mode (not Chat mode)
- Enable Auto context
- Use the GPT-5.2-Codex model
- Set reasoning effort to Medium or High for better reports
Detailed Codex configuration
Codex exposes controls in the bottom bar. For NLSS:
- Mode: Agent (so it can edit files and run commands)
- Reasoning effort: Medium or High for statistics-heavy tasks
- Network access: Required for the research-academia utility. Enable via the UI toggle, or add to your config.toml:
[sandbox_workspace_write]
network_access = true
See Codex docs for details.
- Go to Claude Code overview
- Follow the installation instructions
- Claude Code runs in your terminal or integrates with VS Code
Recommended settings for NLSS:
- Use Opus 4.5 for best results (Sonnet 4.5 also works)
- Run /config to see settings, /model to switch models
Detailed Claude Code configuration
In Claude Code's interactive mode:
- /config: opens the settings interface
- /model: switches between models
- /status: shows the current model
Settings are stored in ~/.claude/settings.json (user) and .claude/settings.json (project).
See Claude Code settings docs for details.
R is the statistical engine that powers NLSS. You don't need to know R — the agent handles it.
- Go to CRAN R for Windows
- Click "Download R-4.x.x for Windows"
- Run the installer
- Important: When prompted, choose "Yes" to modify the PATH (this lets NLSS find R)
Note: If the installer doesn't offer a PATH option, you may need to add it manually. See Troubleshooting.
- Go to CRAN R for macOS
- Download the .pkg file for your Mac (Apple Silicon or Intel)
- Double-click to install
- R is automatically added to your PATH
Ubuntu / Debian:
sudo apt update
sudo apt install r-base
Fedora:
sudo dnf install R
Arch Linux:
sudo pacman -S r
After installation, verify with: Rscript --version
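The verification step can also be scripted. The following is a minimal sketch, not part of NLSS; the helper name `on_path` is hypothetical. It relies only on the shell built-in `command -v` to report whether a program such as Rscript is reachable from the current shell.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of NLSS): succeed if a command is on PATH.
on_path() {
  command -v "$1" >/dev/null 2>&1
}

# Report whether Rscript is reachable from this shell.
if on_path Rscript; then
  echo "Rscript found at: $(command -v Rscript)"
else
  echo "Rscript not found on PATH; see Troubleshooting."
fi
```

If the second message appears right after installing R, restarting the terminal or IDE is usually the first thing to try.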
NLSS needs several R packages for statistical analyses. Install them once and you're set.
- Open R:
- Windows: Start menu → search for "R" or "R x64"
- macOS: Applications folder → R
- Linux: Open a terminal and type R, or find "R" in your applications menu
- The R Console will open
- Paste this command and press Enter:
install.packages(c("arrow","car","curl","DHARMa","emmeans","foreign","ggplot2","haven","influence.ME","jsonlite","lavaan","lme4","lmerTest","mice","MVN","performance","psych","pwr","semPower","VIM","viridisLite","yaml"))
- Wait for installation to complete (may take a few minutes)
- You can close R when done
Alternative: Install via terminal
If you prefer using the terminal:
Rscript -e "install.packages(c('arrow','car','curl','DHARMa','emmeans','foreign','ggplot2','haven','influence.ME','jsonlite','lavaan','lme4','lmerTest','mice','MVN','performance','psych','pwr','semPower','VIM','viridisLite','yaml'), repos='https://cloud.r-project.org')"
Troubleshooting: Package installation fails on Linux
Some R packages need system libraries. Install the dependencies for your distribution:
Ubuntu / Debian:
sudo apt update
sudo apt install -y libcurl4-openssl-dev libssl-dev libxml2-dev libfontconfig1-dev libharfbuzz-dev libfribidi-dev libfreetype6-dev libpng-dev libtiff5-dev libjpeg-dev
Fedora:
sudo dnf install libcurl-devel openssl-devel libxml2-devel fontconfig-devel harfbuzz-devel fribidi-devel freetype-devel libpng-devel libtiff-devel libjpeg-devel
Arch Linux:
sudo pacman -S curl openssl libxml2 fontconfig harfbuzz fribidi freetype2 libpng libtiff libjpeg-turbo
Then retry the R package installation.
NLSS is installed as a "skill" that your AI agent can use.
Easiest method: Ask your agent to install it:
"$skill-installer Install NLSS from https://github.com/docmh/nlss.git"
Manual method:
- Download NLSS:
  - Go to github.com/docmh/nlss
  - Click the green Code button → Download ZIP
- Extract the ZIP file
- Rename the extracted folder to exactly nlss
- Move the nlss folder to your Codex skills directory:
  Windows:
  - Open File Explorer
  - Type %USERPROFILE%\.codex\skills in the address bar and press Enter
  - If the skills folder doesn't exist, create it
  - Move the nlss folder here
  macOS:
  - Open Finder
  - Press Cmd+Shift+G and type ~/.codex/skills
  - If the folder doesn't exist, create it
  - Move the nlss folder here
  Linux:
  - Open your file manager (Files, Nautilus, Dolphin, etc.)
  - Press Ctrl+L to show the address bar, then type ~/.codex/skills
  - Or navigate to your home folder, show hidden files (Ctrl+H), and find/create .codex/skills
  - Move the nlss folder here
- Restart Codex
- Verify: Type /skills in Codex; you should see nlss listed
- Download NLSS:
  - Go to github.com/docmh/nlss
  - Click the green Code button → Download ZIP
- Extract the ZIP file
- Rename the extracted folder to exactly nlss
- Move the nlss folder to your Claude skills directory:
  Windows:
  - Open File Explorer
  - Type %USERPROFILE%\.claude\skills in the address bar and press Enter
  - If the skills folder doesn't exist, create it
  - Move the nlss folder here
  macOS:
  - Open Finder
  - Press Cmd+Shift+G and type ~/.claude/skills
  - If the folder doesn't exist, create it
  - Move the nlss folder here
  Linux:
  - Open your file manager (Files, Nautilus, Dolphin, etc.)
  - Press Ctrl+L to show the address bar, then type ~/.claude/skills
  - Or navigate to your home folder, show hidden files (Ctrl+H), and find/create .claude/skills
  - Move the nlss folder here
- Restart Claude Code
- Verify: Ask "What skills are available?"; you should see nlss
Alternative: Install via terminal (git clone)
Codex (macOS/Linux/WSL):
mkdir -p ~/.codex/skills
git clone https://github.com/docmh/nlss.git ~/.codex/skills/nlss
Codex (Windows PowerShell):
New-Item -ItemType Directory -Force -Path "$HOME\.codex\skills" | Out-Null
git clone https://github.com/docmh/nlss.git "$HOME\.codex\skills\nlss"
Claude Code (macOS/Linux):
mkdir -p ~/.claude/skills
git clone https://github.com/docmh/nlss.git ~/.claude/skills/nlss
Claude Code (Windows PowerShell):
New-Item -ItemType Directory -Force -Path "$HOME\.claude\skills" | Out-Null
git clone https://github.com/docmh/nlss.git "$HOME\.claude\skills\nlss"
Let's make sure NLSS is ready to go.
Ask your agent:
"Can you run Rscript --version and tell me what you see?"
You should get a response showing R version 4.x.x.
The best way to verify everything works is to run the built-in demo:
"Run the NLSS run-demo metaskill to show me what NLSS can do."
The agent will:
- Explain the NLSS workflow
- Set up a demo workspace with sample data
- Run some example analyses
- Show you the outputs
Once NLSS is installed, analyzing data is simple:
Tell the agent where your data is:
"Use NLSS to analyze C:\Users\Me\Documents\my_study.csv"
Or for SPSS files:
"Analyze data/experiment1.sav with NLSS"
Supported formats: CSV, SPSS (.sav), RDS, RData, Parquet
Use natural language:
"Run descriptive statistics for age and income, grouped by gender"
"Is there a correlation between stress and performance?"
"Compare test scores between the treatment and control groups"
NLSS creates a workspace folder with your results:
your-project/
  nlss-workspace/
    my_study/
      report_canonical.md ← Your results are here!
      analysis_log.jsonl ← Audit trail
      plots/ ← Any figures
Tip: Keep report_canonical.md open in your editor to watch results appear in real time.
When you first analyze a dataset, NLSS creates a workspace — a dedicated folder for that dataset's analyses.
nlss-workspace/ ← Workspace root
nlss-workspace.yml ← Manifest (tracks all datasets)
my_study/ ← One folder per dataset
my_study.parquet ← Fast copy of your data
scratchpad.md ← Agent's planning notes
report_canonical.md ← All results (keeps growing)
analysis_log.jsonl ← Machine-readable log
plots/ ← Saved figures
backup/ ← Data backups before changes
report_20240115_describe-sample_demographics.md ← Full reports from metaskills
| File | Purpose |
|---|---|
| report_canonical.md | Your main results file. Every analysis appends a new section. Think of it as a lab notebook. |
| analysis_log.jsonl | Machine-readable log of every analysis. Used for reproducibility and integrity checks. |
| scratchpad.md | The agent's working notes. Useful for seeing its reasoning. |
| *.parquet | Your data in a fast format. All analyses read from this copy. |
Copy and paste these to try NLSS:
"Run descriptive stats for age, income, and satisfaction, grouped by region"
"Compare anxiety scores between the treatment and control groups using a t-test"
"Is there a significant difference in performance across the three training conditions? Use ANOVA."
"What's the correlation between hours_studied and exam_score? Use Spearman."
"Run a correlation matrix for all the personality variables"
"Predict job_satisfaction from salary, work_hours, and commute_time"
"Run a hierarchical regression: first demographics, then add personality traits"
"Show me the frequency distribution for education_level"
"Create a crosstab of gender by department with chi-square test"
"Run exploratory factor analysis on items q1 through q20"
"Describe my sample demographics for a methods section"
"Check all assumptions for running a regression predicting outcome from predictors A, B, and C"
"Write a full report testing whether condition affects performance, controlling for age"
| Analysis | What it does | Example prompt |
|---|---|---|
| descriptive-stats | Means, SDs, distributions | "Descriptive stats for age and score" |
| frequencies | Frequency tables | "Frequencies for gender and education" |
| crosstabs | Cross-tabulations with chi² | "Crosstab of gender by condition" |
| correlations | Pearson, Spearman, partial | "Correlate stress with performance" |
| t-test | Group comparisons | "Compare scores between groups" |
| anova | Multi-group comparisons | "ANOVA for outcome by condition" |
| nonparametric | Mann-Whitney, Kruskal-Wallis | "Non-parametric comparison" |
| regression | Linear/logistic regression | "Predict Y from X1, X2, X3" |
| mixed-models | Multilevel/repeated measures | "Mixed model with random intercepts" |
| sem | SEM, CFA, mediation | "CFA for my scale items" |
| efa | Exploratory factor analysis | "Factor analysis on survey items" |
| scale | Reliability (alpha, omega) | "Reliability for scale items" |
| reliability | ICC, kappa | "Inter-rater reliability" |
| power | Power analysis | "Power for detecting medium effect" |
| assumptions | Check statistical assumptions | "Check regression assumptions" |
| plot | Create figures | "Scatter plot of X vs Y" |
| missings | Missing data analysis | "Analyze missing data patterns" |
| impute | Imputation | "Impute missing values" |
| data-transform | Recode, compute, standardize | "Create a mean score variable" |
| data-explorer | Data dictionary | "Show me what's in this dataset" |
| Workflow | What it does |
|---|---|
| run-demo | Guided onboarding with sample data |
| describe-sample | Write a sample description for methods section |
| explore-data | Comprehensive data exploration |
| screen-data | Data quality checks and diagnostics |
| prepare-data | Data cleaning and transformation workflow |
| check-assumptions | Verify assumptions for planned analyses |
| test-hypotheses | Run and interpret hypothesis tests |
| write-full-report | Complete analysis with journal-style report |
| explain-statistics | Plain-language stats explanations |
| explain-results | Help interpreting NLSS output |
| check-instruments | Psychometric analysis of scales |
| plan-power | Power analysis planning |
| format-document | Format text in NLSS style |
| generate-r-script | Create standalone R script from analyses |
| Utility | What it does |
|---|---|
| calc | Quick statistical calculations |
| research-academia | Search scholarly literature (requires network) |
| check-integrity | Verify log integrity |
| reconstruct-reports | Rebuild reports from logs |
Instead of: "Analyze my data"
Say: "Run descriptive statistics for age, income, and satisfaction, grouped by gender"
Instead of: "Compare the groups"
Say: "Compare anxiety_score between the treatment and control conditions"
Add: "...and interpret the results" to get plain-language explanations.
Instead of running analyses one by one, use:
"Use the write-full-report metaskill to test whether training_type affects performance"
Watch results appear in real time and catch any issues immediately.
The agent can explain what it's doing:
"Explain why you chose that test"
"What assumptions should I check?"
NLSS settings live in scripts/config.yml. Key sections:
defaults:
  output_dir: "./nlss-workspace" # Where workspaces are created
  digits: 2 # Decimal places in output
logging:
  enabled: true # Log all analyses
  include_outputs: true # Store output in logs (enables recovery)
modules:
  crosstabs:
    percent: "column" # Default percentage type
  regression:
    bootstrap: false # Bootstrap CIs off by default
CLI flags override config settings for individual runs.
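The override rule can be pictured as a simple fallback. The following is a minimal sketch of that precedence only; `resolve_setting` is a hypothetical helper and does not reflect how the NLSS scripts actually read their options.

```bash
#!/usr/bin/env bash
# Hypothetical helper: use the CLI value when one was given,
# otherwise fall back to the value from config.yml.
resolve_setting() {
  local cli_value="$1" config_value="$2"
  printf '%s\n' "${cli_value:-$config_value}"
}

config_digits=2                        # defaults.digits in scripts/config.yml
resolve_setting ""  "$config_digits"   # no flag given: prints 2
resolve_setting "3" "$config_digits"   # flag given:    prints 3
```

The config file itself is left unchanged by an override, so the next run without the flag goes back to the configured default.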
Output formatting is controlled by templates in assets/<subskill>/. Each template is a Markdown file with YAML front matter.
To customize output:
- Copy an existing template (e.g., assets/descriptive-stats/default-template.md)
- Modify the formatting
- Either replace the original or register your template in config.yml
Every analysis is logged to analysis_log.jsonl with:
- Timestamp
- Parameters used
- Results (if include_outputs: true)
- Checksums for integrity verification
Use check-integrity to verify logs haven't been modified.
Use reconstruct-reports to rebuild report_canonical.md from logs.
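For a quick look at the log outside the agent, standard shell tools are enough, because a JSON Lines file holds one JSON object per line. The sketch below is illustrative only: `count_log_entries` is a hypothetical helper, the path follows the example workspace layout shown earlier, and no field names are assumed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count entries in a JSON Lines log.
# Each non-empty line is one logged analysis.
count_log_entries() {
  grep -c -v '^[[:space:]]*$' "$1"
}

log="nlss-workspace/my_study/analysis_log.jsonl"   # example path from the layout above
if [ -f "$log" ]; then
  echo "Logged analyses: $(count_log_entries "$log")"
  echo "Most recent entry:"
  tail -n 1 "$log"
fi
```

This only reads the file. Integrity checks and report rebuilding remain the job of check-integrity and reconstruct-reports.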
NLSS follows the Agent Skills standard:
nlss/
SKILL.md ← Entry point for agents
scripts/
R/ ← R analysis scripts
config.yml ← Configuration
assets/ ← Templates and sample data
references/
subskills/ ← Documentation for each analysis
metaskills/ ← Documentation for workflows
utilities/ ← Documentation for utilities
tests/ ← Test suite
- Workspace root detected by nlss-workspace.yml (current dir, parent, or child)
- All data converted to Parquet for fast I/O
- data-transform and missings update data in place with automatic backups
- Non-nested workspaces enforced
- Paths inside workspace: shown as relative
- Paths outside workspace: masked as <external>/<filename>
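Two of the rules above, root detection and path masking, can be sketched in a few lines of shell. This is an illustration under stated assumptions, not the NLSS implementation: both function names are hypothetical, the search order (current, then parent, then children) is a guess, and only the immediate parent and immediate child folders are checked.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: locate the workspace root by looking for the manifest
# in the given directory, then its parent, then its immediate children.
find_workspace_root() {
  local start="${1:-.}" manifest="nlss-workspace.yml" child
  if [ -f "$start/$manifest" ]; then
    (cd "$start" && pwd)
  elif [ -f "$start/../$manifest" ]; then
    (cd "$start/.." && pwd)
  else
    for child in "$start"/*/; do
      if [ -f "$child$manifest" ]; then
        (cd "$child" && pwd)
        return 0
      fi
    done
    return 1
  fi
}

# Hypothetical sketch: show a relative path inside the workspace,
# and a masked <external>/<filename> form for anything outside it.
display_path() {
  local root="$1" path="$2"
  case "$path" in
    "$root"/*) printf '%s\n' "${path#"$root"/}" ;;
    *)         printf '<external>/%s\n' "$(basename "$path")" ;;
  esac
}

display_path /proj/nlss-workspace /proj/nlss-workspace/my_study/plots/fig.png
display_path /proj/nlss-workspace /home/me/Documents/data.csv
```

The two example calls print `my_study/plots/fig.png` and `<external>/data.csv`, which matches the relative and masked forms described in the list.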
Each subskill has:
- Script: scripts/R/<name>.R
- Reference: references/subskills/<name>.md
- Template(s): assets/<name>/*.md
| Subskill | Script | Templates |
|---|---|---|
| descriptive-stats | descriptive_stats.R | default, robust, distribution |
| frequencies | frequencies.R | default, grouped |
| crosstabs | crosstabs.R | default, grouped |
| correlations | correlations.R | default, cross-correlation, matrix, comparison |
| scale | scale.R | default |
| efa | efa.R | default |
| reliability | reliability.R | default |
| data-explorer | data_explorer.R | default |
| plot | plot.R | default |
| data-transform | data_transform.R | default |
| missings | missings.R | default |
| impute | impute.R | default |
| assumptions | assumptions.R | ttest, anova, regression, mixed-models, sem |
| regression | regression.R | default |
| power | power.R | default |
| mixed-models | mixed_models.R | default, emmeans |
| sem | sem.R | default, cfa, mediation, invariance |
| anova | anova.R | default, posthoc, contrasts |
| t-test | t_test.R | default |
| nonparametric | nonparametric.R | default, posthoc |
| init-workspace | init_workspace.R | default |
| metaskill-runner | metaskill_runner.R | default, finalization |
Metaskills are agent-run workflows documented in references/metaskills/. They chain subskills and produce comprehensive reports.
Metaskill completion writes:
- report_<YYYYMMDD>_<metaskill>_<intent>.md: Full report
- Synopsis appended to report_canonical.md via metaskill-runner --synopsis
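To make the full-report naming pattern concrete, here is a small sketch that assembles a file name from its three parts. `report_name` is a hypothetical helper, and the intent label `training-effect` is invented for illustration; how NLSS derives the intent part is not specified here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build a full-report file name from date, metaskill, and intent.
report_name() {
  local date_stamp="$1" metaskill="$2" intent="$3"
  printf 'report_%s_%s_%s.md\n' "$date_stamp" "$metaskill" "$intent"
}

# Reproduces the example name from the workspace layout:
report_name 20240115 describe-sample demographics
# prints: report_20240115_describe-sample_demographics.md

# The same pattern with today's date and an invented intent label:
report_name "$(date +%Y%m%d)" write-full-report training-effect
```

Because the date comes first in YYYYMMDD form, sorting the file names alphabetically also sorts the reports chronologically.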
Descriptive Statistics
Rscript scripts/R/descriptive_stats.R \
  --csv data.csv --vars age,score --group condition
Correlations
Rscript scripts/R/correlations.R \
  --csv data.csv --vars age,score,stress --method spearman
Regression
Rscript scripts/R/regression.R \
  --csv data.csv --dv outcome --blocks "age,gender;stress,trait"
ANOVA
Rscript scripts/R/anova.R \
  --csv data.csv --dv outcome --between group
SEM/CFA
Rscript scripts/R/sem.R \
  --csv data.csv --analysis cfa --factors "F1=item1,item2;F2=item3,item4"
Mixed Models
Rscript scripts/R/mixed_models.R \
  --csv data.csv --formula "score ~ time + (1|id)"
See individual reference docs in references/subskills/ for full CLI options.
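The `--blocks` value in the regression example packs several predictor blocks into one string. The sketch below shows one plausible reading of that format, assuming `;` separates blocks and `,` separates predictors within a block. `show_blocks` is a hypothetical helper, and the parsing inside regression.R may differ, so the reference docs remain authoritative.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print each block of a --blocks string on its own line.
# Assumes ";" separates blocks and "," separates predictors within a block.
show_blocks() {
  local spec="$1" block i=1
  local -a blocks
  IFS=';' read -r -a blocks <<< "$spec"
  for block in "${blocks[@]}"; do
    printf 'Block %d: %s\n' "$i" "${block//,/ }"
    i=$((i + 1))
  done
}

show_blocks "age,gender;stress,trait"
# Block 1: age gender
# Block 2: stress trait
```

Read this way, the example matches the hierarchical regression prompt shown earlier: demographics first, then the additional predictors.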
# Unix/WSL
bash cmdscripts/tests.sh smoke
# Windows PowerShell
.\cmdscripts\tests.ps1 smoke
Tests read from tests/tests.yml and output to outputs/test-runs/<timestamp>/.
Statistical modules include golden-value tests for numerical correctness:
- Generate goldens with independent R scripts in tests/values/
- Compare against analysis_log.jsonl outputs
- Python checkers in tests/values/check_<module>_golden.py
For batch testing prompts through Codex CLI:
# WSL/bash
./tests/prompt-robustness/run_prompts.sh --cd "/path/to/workspace" --effort medium
# PowerShell
.\tests\prompt-robustness\run_prompts.ps1 --cd "C:\path\to\workspace" --effort medium
NLSS was developed with AI assistance for drafting and iteration. All changes are curated, reviewed, and tested by the human maintainer.
- R 4.4+
- Rscript on PATH
- Base R packages: base, stats, utils, graphics, grDevices, tools
- CRAN packages: arrow, car, curl, DHARMa, emmeans, foreign, ggplot2, haven, influence.ME, jsonlite, lavaan, lme4, lmerTest, mice, MVN, performance, psych, pwr, semPower, VIM, viridisLite, yaml
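The R version requirement can be checked from a shell. The following is a minimal sketch rather than an NLSS tool: `version_at_least` is a hypothetical helper that compares major and minor numbers only, and the version string is obtained with base R's `getRversion()`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if version $1 is at least version $2
# (compares major and minor numbers only).
version_at_least() {
  local have_major have_minor want_major want_minor rest
  IFS=. read -r have_major have_minor rest <<< "$1"
  IFS=. read -r want_major want_minor rest <<< "$2"
  have_minor=${have_minor:-0}
  want_minor=${want_minor:-0}
  if (( have_major != want_major )); then
    (( have_major > want_major ))
  else
    (( have_minor >= want_minor ))
  fi
}

# Only runs the check when Rscript is available on PATH.
if command -v Rscript >/dev/null 2>&1; then
  r_version="$(Rscript -e 'cat(as.character(getRversion()))')"
  if version_at_least "$r_version" 4.4; then
    echo "R $r_version meets the 4.4+ requirement."
  else
    echo "R $r_version is older than 4.4; please upgrade."
  fi
fi
```

Comparing the numbers as integers avoids the string-comparison trap where "4.10" would sort before "4.4".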
R's installer doesn't always add Rscript to your PATH. Fix it:
- Find your R installation:
  - Open R from the Start menu
  - Type R.home("bin") and press Enter
  - Note the path shown (e.g., C:\Program Files\R\R-4.4.0\bin)
- Add to PATH:
  - Press Windows key, type "environment variables"
  - Click Edit the system environment variables
  - Click Environment Variables...
  - Under "User variables", select Path → Edit → New
  - Paste the path from step 1
  - Click OK three times
- Restart your IDE
After installing R, restart your terminal or IDE. If still not found:
echo 'export PATH="/usr/local/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc
R should be on your PATH after installation. If not:
- Check if R is installed: which R or whereis R
- If installed but not found, add to your shell config:
  bash (~/.bashrc):
  echo 'export PATH="/usr/bin:$PATH"' >> ~/.bashrc
  source ~/.bashrc
  zsh (~/.zshrc):
  echo 'export PATH="/usr/bin:$PATH"' >> ~/.zshrc
  source ~/.zshrc
- Restart your terminal or IDE
- Verify the folder structure:
  ~/.codex/skills/nlss/SKILL.md # Codex
  ~/.claude/skills/nlss/SKILL.md # Claude Code
  The SKILL.md file must be directly inside the nlss folder.
- Restart your agent completely (not just the conversation)
- Check for typos in the folder name: it must be exactly nlss
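The folder check above can be run from a terminal on macOS or Linux. This is a minimal sketch with a hypothetical helper named `check_skill`; it only tests whether SKILL.md sits directly inside the nlss folder and does not validate the skill itself.

```bash
#!/usr/bin/env bash
# Hypothetical helper: confirm SKILL.md sits directly inside <skills-dir>/nlss.
check_skill() {
  local skills_dir="$1"
  if [ -f "$skills_dir/nlss/SKILL.md" ]; then
    echo "OK: $skills_dir/nlss/SKILL.md"
  elif [ -d "$skills_dir/nlss" ]; then
    echo "Found $skills_dir/nlss but no SKILL.md directly inside it"
    return 1
  else
    echo "Missing folder: $skills_dir/nlss"
    return 1
  fi
}

check_skill "$HOME/.codex/skills"  || true   # Codex
check_skill "$HOME/.claude/skills" || true   # Claude Code
```

The middle message typically points to an extra nesting level left over from extracting the ZIP, in which case moving the inner folder contents up one level should help.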
- Make sure you're running R as Administrator for system-wide installs
- Or install to user library (R will prompt you)
- Install Xcode Command Line Tools:
xcode-select --install
- Install system dependencies first (see Step 4)
- Use absolute paths:
  - Windows: C:\Users\Me\Documents\data.csv
  - macOS: /Users/me/Documents/data.csv
  - Linux: /home/me/Documents/data.csv
- Or use relative paths from where the agent is running
- Check that the file actually exists at that path
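On macOS or Linux, the existence check and the absolute path can be obtained in one step. The sketch below uses a hypothetical helper, `abs_path`, built only from standard shell tools; the example file name is taken from the usage section above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the absolute path of an existing file,
# or fail with a message if the file does not exist.
abs_path() {
  local file="$1" dir
  if [ ! -f "$file" ]; then
    echo "File not found: $file" >&2
    return 1
  fi
  dir="$(cd "$(dirname "$file")" && pwd)"
  printf '%s/%s\n' "$dir" "$(basename "$file")"
}

# Prints the full path if the file exists relative to the current directory.
abs_path "data/experiment1.sav" || true
```

Pasting the printed path into the prompt removes any ambiguity about which directory the agent is running from.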
The research-academia utility needs internet access. In Codex:
- Open settings
- Enable network access, or add to config.toml:
[sandbox_workspace_write]
network_access = true
- Check scratchpad.md to see the agent's reasoning
- Look at analysis_log.jsonl for exact parameters used
- Ask the agent: "Explain what analysis you ran and why"
- Check the detailed reference docs in references/
- Ask the agent: "Help me troubleshoot NLSS"
- Report issues at github.com/docmh/nlss/issues
NLSS is licensed under the Apache License, Version 2.0. See LICENSE for details.
NLSS™ is a trademark of Mike Hammes. The Apache License 2.0 does not grant permission to use the NLSS™ name beyond reasonable use to describe origin. See TRADEMARKS.md.
NLSS uses R packages from CRAN installed by the user. No third-party code is bundled.
- Provided "AS IS" under Apache-2.0; no warranties
- Users are responsible for validating results
- Not intended for safety-critical decisions without independent verification
- Modified versions may behave differently
Mike Hammes (mike.hammes@mikehammes.name)
If you use NLSS in published research, please cite:
Hammes, M. (2026). docmh/nlss: NLSS [Software]. Zenodo. https://doi.org/10.5281/zenodo.18173833
Find detailed testing information at github.com/docmh/nlss-demo