Code standards
Note: Languages
Most historical pipelines use R; some newer work uses Python. The same principles apply.
Languages at a glance
Some projects lean on Python for geospatial work or automated ingestion; the standards below apply to both languages.
Style
- Short functions, verb names (`compute_exposure_lag`).
- Comments explain “why”, not trivial `dplyr` steps; document epidemiologic assumptions that matter.
- Shared constants first: clinical thresholds, admin cut-offs, shapefile versions in `config.*`.
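A minimal Python sketch of these three points. The constants are inlined so the example runs on its own; in a project they would sit in `config.*`. The constant names and values, and the definition of `compute_exposure_lag` as a trailing mean, are illustrative assumptions rather than project code.

```python
from statistics import mean

# Shared constants first. In a real project these would live in config.*;
# the names and values here are hypothetical placeholders.
EXPOSURE_LAG_DAYS = 3             # illustrative lag window, not a project value
SHAPEFILE_VERSION = "v0-example"  # illustrative label only


def compute_exposure_lag(daily_values, lag_days=EXPOSURE_LAG_DAYS):
    """Return the mean exposure over the `lag_days` days before each day.

    Days without a full lag window get None rather than a partial mean.
    """
    # Why: the exposure is assumed to act before the outcome, so the current
    # day is excluded from its own window. That is an epidemiologic assumption
    # worth stating; the slicing mechanics are not.
    lagged = []
    for day in range(len(daily_values)):
        if day < lag_days:
            lagged.append(None)
        else:
            lagged.append(mean(daily_values[day - lag_days:day]))
    return lagged
```

For example, `compute_exposure_lag([1, 2, 3, 4, 5], lag_days=2)` returns `[None, None, 1.5, 2.5, 3.5]`.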
Folder roles
| `Code/` | Typical content |
|---|---|
| `Process/` | Ingestion, cleaning, exposures. |
| `Descriptive/` | Exploratory tables/figures (non-final). |
| `Models/` | Main estimates, sensitivities. |
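One way to keep this layout from drifting is a small check that can run before a merge. The sketch below is a suggestion only; the helper name `find_missing_folders` is hypothetical, and the folder names are taken from the table above.

```python
from pathlib import Path

# Subfolders of Code/ as listed in the table above.
EXPECTED_SUBFOLDERS = ("Process", "Descriptive", "Models")


def find_missing_folders(code_dir):
    """Return the expected subfolder names that do not exist under code_dir."""
    code_dir = Path(code_dir)
    return [name for name in EXPECTED_SUBFOLDERS if not (code_dir / name).is_dir()]
```

Calling `find_missing_folders("Code")` from the project root returns an empty list when all three folders are present.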
Outputs
- Finals → `Output_Analysis/Graphs&Tables`.
- Avoid absolute paths; use `here::here()` (R) or equivalent.
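For Python code, one possible equivalent is to locate the project root by walking upward until a marker file is found, then build every path from there. This is a sketch in the spirit of `here::here()`, not a drop-in replacement; the marker names and both function names are assumptions.

```python
from pathlib import Path

# Hypothetical marker files; pick whatever reliably sits at your project root.
ROOT_MARKERS = (".git", "pyproject.toml")


def find_project_root(start=None):
    """Walk upward from `start` until a directory containing a marker is found."""
    current = Path(start or Path.cwd()).resolve()
    for candidate in (current, *current.parents):
        if any((candidate / marker).exists() for marker in ROOT_MARKERS):
            return candidate
    raise FileNotFoundError("No project root marker found above " + str(current))


def final_output_path(filename, start=None):
    """Build a path under Output_Analysis/Graphs&Tables relative to the root."""
    return find_project_root(start) / "Output_Analysis" / "Graphs&Tables" / filename
```

A script anywhere under the project can then call `final_output_path("fig1.png")` and get the same location regardless of the working directory or the machine it runs on.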
Review checklist
| Check | Before merge |
|---|---|
| Formatter/linter (styler, ruff, …) | Yes |
| Documented runtime | Yes |
| New deps explained in README | Yes |
Generative AI
- Verify numbers on toy cases.
- Never paste identifiable records into public prompts.
- Disclose in the PR comment if a large block came from an assistant tool.
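Verifying on toy cases can be as simple as a few assertions against values that are easy to compute by hand. In the sketch below, `incidence_per_1000` is a hypothetical stand-in for a block suggested by an assistant tool, and all inputs are synthetic numbers, never real records.

```python
def incidence_per_1000(cases, population):
    """Hypothetical helper standing in for a block suggested by an assistant."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000 * cases / population


def check_toy_cases():
    """Compare the function against values small enough to verify by hand."""
    # 5 cases among 1,000 people is 5 per 1,000 by definition.
    assert incidence_per_1000(5, 1000) == 5.0
    # 1 case among 4 people: 1000 / 4 = 250 per 1,000.
    assert incidence_per_1000(1, 4) == 250.0
    # Zero cases must give a zero rate, not an error.
    assert incidence_per_1000(0, 50) == 0.0
    return True
```

If any assertion fails, the generated block should not be merged until the discrepancy is understood.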