Commit 4c6c18d

Merge pull request #678 from microsoft/copilot/fix-fc133524-4bd1-478f-96ca-5db4b0edf20c
Add AGENTS.md file for AI coding agent guidance
2 parents 6d15c1f + db1da61 commit 4c6c18d

1 file changed, +358 -0 lines changed

AGENTS.md

Lines changed: 358 additions & 0 deletions
@@ -0,0 +1,358 @@
# AGENTS.md

## Project Overview

Data Science for Beginners is a comprehensive 10-week, 20-lesson curriculum created by Microsoft Azure Cloud Advocates. The repository is a learning resource that teaches foundational data science concepts through project-based lessons, including Jupyter notebooks, interactive quizzes, and hands-on assignments.

**Key Technologies:**
- **Jupyter Notebooks**: Primary learning medium using Python 3
- **Python Libraries**: pandas, numpy, matplotlib for data analysis and visualization
- **Vue.js 2**: Quiz application (quiz-app folder)
- **Docsify**: Documentation site generator for offline access
- **Node.js/npm**: Package management for JavaScript components
- **Markdown**: All lesson content and documentation

**Architecture:**
- Multi-language educational repository with extensive translations
- Structured into lesson modules (1-Introduction through 6-Data-Science-In-Wild)
- Each lesson includes README, notebooks, assignments, and quizzes
- Standalone Vue.js quiz application for pre/post-lesson assessments
- GitHub Codespaces and VS Code dev containers support

## Setup Commands

### Repository Setup
```bash
# Clone the repository (if not already cloned)
git clone https://github.com/microsoft/Data-Science-For-Beginners.git
cd Data-Science-For-Beginners
```

### Python Environment Setup
```bash
# Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install common data science libraries (no requirements.txt exists)
pip install jupyter pandas numpy matplotlib seaborn scikit-learn
```

### Quiz Application Setup
```bash
# Navigate to quiz app
cd quiz-app

# Install dependencies
npm install

# Start development server
npm run serve

# Build for production
npm run build

# Lint and fix files
npm run lint
```

### Docsify Documentation Server
```bash
# Install Docsify globally
npm install -g docsify-cli

# Serve documentation locally
docsify serve

# Documentation will be available at localhost:3000
```

### Visualization Projects Setup
For visualization projects like meaningful-visualizations (lesson 13):
```bash
# Navigate to starter or solution folder
cd 3-Data-Visualization/13-meaningful-visualizations/starter

# Install dependencies
npm install

# Start development server
npm run serve

# Build for production
npm run build

# Lint files
npm run lint
```

## Development Workflow

### Working with Jupyter Notebooks
1. Start Jupyter in the repository root: `jupyter notebook`
2. Navigate to the desired lesson folder
3. Open `.ipynb` files to work through exercises
4. Notebooks are self-contained with explanations and code cells
5. Most notebooks use pandas, numpy, and matplotlib - ensure these are installed

### Lesson Structure
Each lesson typically contains:
- `README.md` - Main lesson content with theory and examples
- `notebook.ipynb` - Hands-on Jupyter notebook exercises
- `assignment.ipynb` or `assignment.md` - Practice assignments
- `solution/` folder - Solution notebooks and code
- `images/` folder - Supporting visual materials

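The layout above can be checked mechanically before editing a lesson. The following is a minimal sketch, not part of the repository: `missing_lesson_items` is a hypothetical helper, and because the list says "typically", a reported item may be legitimately absent from a given lesson.

```python
from pathlib import Path

# Assumption: these names are taken from the "Lesson Structure" list;
# individual lessons may omit some of them.
EXPECTED_ITEMS = ["README.md", "notebook.ipynb", "solution", "images"]
ASSIGNMENT_NAMES = ["assignment.ipynb", "assignment.md"]


def missing_lesson_items(lesson_dir):
    """Return the typical lesson items that are absent from lesson_dir."""
    lesson = Path(lesson_dir)
    missing = [name for name in EXPECTED_ITEMS if not (lesson / name).exists()]
    # Either assignment format counts as having an assignment.
    if not any((lesson / name).exists() for name in ASSIGNMENT_NAMES):
        missing.append("assignment.ipynb or assignment.md")
    return missing
```

Calling `missing_lesson_items` on a lesson folder returns an empty list when every typical item is present.
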
### Quiz Application Development
- Vue.js 2 application with hot-reload during development
- Quizzes stored in `quiz-app/src/assets/translations/`
- Each language has its own translation folder (en, fr, es, etc.)
- Quiz numbering starts at 0 and goes up to 39 (40 quizzes total)

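The numbering rule above (ids 0 through 39) lends itself to a quick completeness check after editing a translation. This is a sketch under assumptions: how quiz ids are stored inside the translation files is not described here, so the hypothetical helper below simply takes whatever ids the caller has collected.

```python
# Assumption: quiz ids are the integers 0..39, as stated in the list above.
QUIZ_COUNT = 40


def missing_quiz_ids(found_ids):
    """Return the expected quiz ids that are absent from found_ids, sorted."""
    expected = set(range(QUIZ_COUNT))
    return sorted(expected - set(found_ids))


def unexpected_quiz_ids(found_ids):
    """Return ids in found_ids that fall outside the expected 0..39 range, sorted."""
    expected = set(range(QUIZ_COUNT))
    return sorted(set(found_ids) - expected)
```
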
### Adding Translations
- Translations go in `translations/` folder at repository root
- Each language has complete lesson structure mirrored from English
- Automated translation via GitHub Actions (co-op-translator.yml)

## Testing Instructions

### Quiz Application Testing
```bash
cd quiz-app

# Run lint checks
npm run lint

# Test build process
npm run build

# Manual testing: Start dev server and verify quiz functionality
npm run serve
```

### Notebook Testing
- No automated test framework exists for notebooks
- Manual validation: Run all cells in sequence to ensure no errors
- Verify data files are accessible and outputs are generated correctly
- Check that visualizations render properly

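After a manual "run all cells" pass, the saved notebook can be scanned for error outputs that were left behind. The sketch below is an optional aid, not an existing project tool: `find_stored_errors` is a hypothetical helper that reads the `.ipynb` JSON (nbformat 4 layout) with the standard library only. It does not execute any cells, so it complements manual validation rather than replacing it.

```python
import json
from pathlib import Path


def find_stored_errors(notebook_path):
    """Return (cell_index, error_name, error_value) for each saved error output."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    errors = []
    for index, cell in enumerate(notebook.get("cells", [])):
        # Only code cells carry outputs.
        if cell.get("cell_type") != "code":
            continue
        for output in cell.get("outputs", []):
            if output.get("output_type") == "error":
                errors.append((index, output.get("ename"), output.get("evalue")))
    return errors
```

An empty result means no error outputs are stored in the file; it says nothing about cells that were never run.
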
### Documentation Testing
```bash
# Verify Docsify renders correctly
docsify serve

# Check for broken links manually by navigating through content
# Verify all lesson links work in the rendered documentation
```

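Part of the manual link check can be narrowed down with a small script. The following is a rough sketch, assuming links are written as inline Markdown (`[text](target)`); `broken_relative_links` is a hypothetical helper, it ignores external URLs and in-page anchors, and it does not decode URL-encoded paths, so its output is a list of candidates to review rather than a verdict.

```python
import re
from pathlib import Path

# Matches inline Markdown links and images: [text](target) or ![alt](target).
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)[^)]*\)")
# Matches targets that start with a URL scheme such as https: or mailto:.
SCHEME_PATTERN = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_relative_links(markdown_path):
    """Return relative link targets in markdown_path that do not exist on disk."""
    md_file = Path(markdown_path)
    text = md_file.read_text(encoding="utf-8")
    broken = []
    for target in LINK_PATTERN.findall(text):
        # Skip external URLs and in-page anchors.
        if SCHEME_PATTERN.match(target) or target.startswith("#"):
            continue
        # Drop any #fragment or ?query before checking the filesystem.
        local_part = re.split(r"[#?]", target, maxsplit=1)[0]
        if local_part and not (md_file.parent / local_part).exists():
            broken.append(target)
    return broken
```
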
### Code Quality Checks
```bash
# Vue.js projects (quiz-app and visualization projects)
cd quiz-app  # or visualization project folder
npm run lint

# Python notebooks - manual verification recommended
# Ensure imports work and cells execute without errors
```

## Code Style Guidelines

### Python (Jupyter Notebooks)
- Follow PEP 8 style guidelines for Python code
- Use clear variable names that explain the data being analyzed
- Include markdown cells with explanations before code cells
- Keep code cells focused on single concepts or operations
- Use pandas for data manipulation, matplotlib for visualization
- Common import pattern:
  ```python
  import pandas as pd
  import numpy as np
  import matplotlib.pyplot as plt
  ```

### JavaScript/Vue.js
- Follow Vue.js 2 style guide and best practices
- ESLint configuration in `quiz-app/package.json`
- Use Vue single-file components (.vue files)
- Maintain component-based architecture
- Run `npm run lint` before committing changes

### Markdown Documentation
- Use a clear heading hierarchy (`#`, `##`, `###`, etc.)
- Include code blocks with language specifiers
- Add alt text for images
- Link to related lessons and resources
- Keep line lengths reasonable for readability

### File Organization
- Lesson content in numbered folders (01-defining-data-science, etc.)
- Solutions in dedicated `solution/` subfolders
- Translations mirror English structure in `translations/` folder
- Keep data files in `data/` or lesson-specific folders

## Build and Deployment

### Quiz Application Deployment
```bash
cd quiz-app

# Build production version
npm run build

# Output is in dist/ folder
# Deploy dist/ folder to static hosting (Azure Static Web Apps, Netlify, etc.)
```

### Azure Static Web Apps Deployment
The quiz-app can be deployed to Azure Static Web Apps:
1. Create an Azure Static Web Apps resource
2. Connect it to the GitHub repository
3. Configure build settings:
   - App location: `quiz-app`
   - Output location: `dist`
4. GitHub Actions workflow will auto-deploy on push

### Documentation Site
```bash
# Build PDF from Docsify (optional)
npm run convert

# Docsify documentation is served directly from markdown files
# No build step required for deployment
# Deploy repository to static hosting with Docsify
```

### GitHub Codespaces
- Repository includes dev container configuration
- Codespaces automatically sets up Python and Node.js environment
- Open repository in Codespace via GitHub UI
- All dependencies install automatically

## Pull Request Guidelines

### Before Submitting
```bash
# For Vue.js changes in quiz-app
cd quiz-app
npm run lint
npm run build

# Test changes locally
npm run serve
```

### PR Title Format
- Use clear, descriptive titles
- Format: `[Component] Brief description`
- Examples:
  - `[Lesson 7] Fix Python notebook import error`
  - `[Quiz App] Add German translation`
  - `[Docs] Update README with new prerequisites`

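The title format above can be expressed as a regular expression. This is one possible reading, not an enforced project rule: it assumes a non-empty bracketed component, a single space, and a non-empty description, and `parse_pr_title` is a hypothetical helper name.

```python
import re

# Assumption: "[Component] Brief description" means a bracketed component,
# one space, then a description that starts with a non-space character.
PR_TITLE_PATTERN = re.compile(r"^\[(?P<component>[^\[\]]+)\] (?P<description>\S.*)$")


def parse_pr_title(title):
    """Return (component, description) if title matches the format, else None."""
    match = PR_TITLE_PATTERN.match(title.strip())
    if match is None:
        return None
    return match.group("component"), match.group("description")
```

For the first example title in the list, `parse_pr_title` returns `("Lesson 7", "Fix Python notebook import error")`; a title without a bracketed component returns `None`.
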
### Required Checks
- Ensure all code runs without errors
- Verify notebooks execute completely
- Confirm Vue.js apps build successfully
- Check that documentation links work
- Test quiz application if modified
- Verify translations maintain consistent structure

### Contribution Guidelines
- Follow existing code style and patterns
- Add explanatory comments for complex logic
- Update relevant documentation
- Test changes across different lesson modules if applicable
- Review the CONTRIBUTING.md file

## Additional Notes

### Common Libraries Used
- **pandas**: Data manipulation and analysis
- **numpy**: Numerical computing
- **matplotlib**: Data visualization and plotting
- **seaborn**: Statistical data visualization (some lessons)
- **scikit-learn**: Machine learning (advanced lessons)

### Working with Data Files
- Data files are located in the `data/` folder or lesson-specific directories
- Most notebooks expect data files at relative paths
- CSV is the primary data format
- Some lessons use JSON for non-relational data examples

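Because notebooks expect data at relative paths, a wrong working directory is a common cause of load failures. A minimal sketch of failing early with a clear message follows; `resolve_data_file` is a hypothetical helper, and the file names in any call are placeholders rather than actual repository paths.

```python
from pathlib import Path


def resolve_data_file(relative_path, base_dir="."):
    """Resolve a data file relative to base_dir and fail early if it is missing."""
    path = (Path(base_dir) / relative_path).resolve()
    if not path.is_file():
        raise FileNotFoundError(
            f"Data file not found: {path}. Check the notebook's working directory."
        )
    return path
```

The returned path can be passed directly to `pd.read_csv`, which accepts path objects.
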
### Multilingual Support
- 40+ language translations via automated GitHub Actions
- Translation workflow in `.github/workflows/co-op-translator.yml`
- Translations in `translations/` folder with language codes
- Quiz translations in `quiz-app/src/assets/translations/`

### Development Environment Options
1. **Local Development**: Install Python, Jupyter, Node.js locally
2. **GitHub Codespaces**: Cloud-based instant development environment
3. **VS Code Dev Containers**: Local container-based development
4. **Binder**: Launch notebooks in the cloud (if configured)

### Lesson Content Guidelines
- Each lesson can be completed on its own but builds on concepts from earlier lessons
- Pre-lesson quizzes test prior knowledge
- Post-lesson quizzes reinforce learning
- Assignments provide hands-on practice
- Sketchnotes provide visual summaries

### Troubleshooting Common Issues

**Jupyter Kernel Issues:**
```bash
# Register the active environment as a Jupyter kernel named "datascience"
# (requires the ipykernel package: pip install ipykernel)
python -m ipykernel install --user --name=datascience
```

**npm Install Failures:**
```bash
# Clear npm cache and retry
npm cache clean --force
rm -rf node_modules package-lock.json
npm install
```

**Import Errors in Notebooks:**
- Verify all required libraries are installed
- Check Python version compatibility (Python 3.7+ recommended)
- Ensure virtual environment is activated

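The first two checks in the list above can be run from a notebook cell or a Python prompt. This is a minimal sketch using only the standard library; the module list is an assumption that mirrors the libraries named in this guide (scikit-learn is imported as `sklearn`), and both function names are hypothetical.

```python
import importlib.util
import sys

# Assumption: this list mirrors the libraries named in this guide.
# scikit-learn is imported under the module name "sklearn".
REQUIRED_MODULES = ["pandas", "numpy", "matplotlib", "seaborn", "sklearn"]
MIN_PYTHON = (3, 7)


def missing_modules(module_names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


def python_version_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the interpreter version meets the recommended minimum."""
    return tuple(version_info[:2]) >= minimum
```

If `missing_modules()` reports names while the packages are installed, the notebook kernel is probably not using the activated virtual environment.
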
**Docsify Not Loading:**
- Verify you're serving from repository root
- Check that `index.html` exists
- Ensure proper network access (port 3000)

### Performance Considerations
- Large datasets may take time to load in notebooks
- Visualization rendering can be slow for complex plots
- Vue.js dev server enables hot-reload for quick iteration
- Production builds are optimized and minified

### Security Notes
- No sensitive data or credentials should be committed
- Use environment variables for any API keys in cloud lessons
- Azure-related lessons may require Azure account credentials
- Keep dependencies updated for security patches

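Reading a key from the environment, as the notes above recommend, might look like the sketch below. The variable name `MY_SERVICE_API_KEY` and the helper `get_api_key` are placeholders chosen for illustration; the lessons may use different names.

```python
import os


def get_api_key(variable_name="MY_SERVICE_API_KEY"):
    """Read an API key from the environment instead of hard-coding it."""
    value = os.environ.get(variable_name)
    if not value:
        # Fail with an actionable message rather than passing an empty key along.
        raise RuntimeError(
            f"Set the {variable_name} environment variable before running this code."
        )
    return value
```

Keeping the key out of the notebook source also keeps it out of saved cell contents and commit history.
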
## Contributing to Translations
- Automated translations managed via GitHub Actions
- Manual corrections welcomed for translation accuracy
- Follow existing translation folder structure
- Update quiz links to include language parameter: `?loc=fr`
- Test translated lessons for proper rendering

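Adding the `loc` parameter to many quiz links by hand is error-prone when a link already carries a query string. The sketch below shows one way to set it with the standard library; `with_language` is a hypothetical helper and the `example.com` URLs are placeholders, not the real quiz address.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_language(url, language_code):
    """Return url with its loc query parameter set to language_code."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous loc value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "loc"
    ]
    query.append(("loc", language_code))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, `with_language("https://example.com/quiz/1", "fr")` yields `https://example.com/quiz/1?loc=fr`.
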
## Related Resources
- Main curriculum: https://aka.ms/datascience-beginners
- Microsoft Learn: https://docs.microsoft.com/learn/
- Student Hub: https://docs.microsoft.com/learn/student-hub
- Discussion Forum: https://github.com/microsoft/Data-Science-For-Beginners/discussions
- Other Microsoft curricula: ML for Beginners, AI for Beginners, Web Dev for Beginners

## Project Maintenance
- Regular updates to keep content current
- Community contributions welcome
- Issues tracked on GitHub
- PRs reviewed by curriculum maintainers
- Monthly content reviews and updates
