Contributing to ARTEMIS
┌─ Quick start ────────────────────────────────────────────────────────────────┐
│ │
│ # Install ARTEMIS │
│ devtools::install_github("OHDSI/ARTEMIS") │
│ │
│ # Activate local commit message guard (recommended): │
│ git config core.hooksPath .githooks │
│ │
│ # Install commitizen (needed for cz commit wizard + cz bump): │
│ pip install commitizen │
│ │
│ # Commit format enforced on every PR by CI: │
│ type(scope): short description │
│ e.g. fix(r-bridge): guard against empty alignment output │
│ feat(scoring): pass gap-open param to alignment │
│ │
│ # If CI blocks your PR → Actions tab → lint-commits job → fix │
│ # the offending commit(s) with git rebase -i, then force-push. │
│ │
└──────────────────────────────────────────────────────────────────────────────┘
Getting started
ARTEMIS is an R package. No git clone is required — devtools handles the download:
devtools::install_github("OHDSI/ARTEMIS")
If you are contributing code, fork the repository on GitHub and clone your fork. Work on a branch cut from develop, then open a PR targeting develop.
Branch naming
All branches must follow the canonical pattern:
<type>/<scope-or-ticket>/<short-description>
e.g. feat/GH-42/cython-param-pass
fix/GH-67/empty-alignment
chore/GH-xx/update-deps
ci/GH-xx/r-tests
release/1.5.0
hotfix/1.4.2
Valid types: feat fix chore docs style refactor test ci release hotfix
See vignette("branch-versioning") for the full Gitflow topology and merge direction.
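The pattern above can be checked locally before pushing. The following is a minimal sketch, not the regex used by the lint-branch job (this guide does not show it). The function name is_valid_branch, the characters allowed in each segment, and the restriction of release and hotfix branches to a bare version number are all assumptions.

```shell
#!/bin/sh
# Hypothetical approximation of the branch-name rule. The real regex used by
# the lint-branch CI job may differ.

# Work branches: <type>/<scope-or-ticket>/<short-description>
WORK_RE='^(feat|fix|chore|docs|style|refactor|test|ci)/[A-Za-z0-9._-]+/[a-z0-9][a-z0-9._-]*$'
# Release and hotfix branches (assumed form): <type>/<major>.<minor>.<patch>
VERSION_RE='^(release|hotfix)/[0-9]+\.[0-9]+\.[0-9]+$'

# Returns 0 when the branch name matches either pattern, non-zero otherwise.
is_valid_branch() {
  printf '%s\n' "$1" | grep -Eq -e "$WORK_RE" -e "$VERSION_RE"
}
```

For example, is_valid_branch "$(git rev-parse --abbrev-ref HEAD)" succeeds when the current branch matches the sketch.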
Commit format
ARTEMIS uses Conventional Commits:
type(scope): short description
Examples:
feat(cython): pass gap-open/extend params to C extension
fix(r-bridge): guard against empty alignment output
ci(docker): add arm64 image build workflow
chore(data): update regimen reference data 2025
perf(scoring): remove redundant max in TSW score matrix
See vignette("testing-strategy") — Conventional Commits section for the full type list, scope vocabulary, and breaking-change syntax.
Local commit validation
The .githooks/commit-msg hook gives immediate local feedback before a bad commit reaches CI.
Activation (one command):
git config core.hooksPath .githooks
What changes after activation:
git commit -m "fix stuff"
# ✘ Invalid commit message.
# Expected: type(scope): description
# Got: fix stuff
# See: vignette("testing-strategy") — Conventional Commits section
# → commit aborted
git commit -m "fix(r-bridge): guard against empty alignment output"
# ✔ commit proceeds normally
Hooks are optional locally; CI is the mandatory gate. No Python, no Node, no extra installs required for the hook itself.
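For orientation, a hook of this kind can be written in plain POSIX shell. The sketch below is not the shipped .githooks/commit-msg. The type list, the mandatory scope, the optional "!" marker, and the function name check_commit_subject are assumptions made for illustration.

```shell
#!/bin/sh
# Minimal sketch of a commit-msg check. The type list is an assumption; the
# authoritative list lives in vignette("testing-strategy").
COMMIT_RE='^(feat|fix|chore|docs|style|refactor|test|ci|perf)\([a-z0-9-]+\)!?: .+'

# Returns 0 for a well-formed subject line; otherwise prints a diagnostic
# to stderr and returns 1.
check_commit_subject() {
  if printf '%s\n' "$1" | grep -Eq "$COMMIT_RE"; then
    return 0
  fi
  printf '✘ Invalid commit message.\n' >&2
  printf '  Expected: type(scope): description\n' >&2
  printf '  Got:      %s\n' "$1" >&2
  return 1
}

# Git passes the path of the message file as the first argument to the hook.
# Without such an argument the script only defines the function.
if [ -n "${1:-}" ] && [ -f "$1" ]; then
  check_commit_subject "$(head -n 1 "$1")" || exit 1
fi
```

Only the first line of the message is inspected here; body and footer are left unchecked.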
CI validation
Every PR triggers .github/workflows/lint.yml, which runs two jobs:
| Job | What it checks |
|---|---|
| lint-commits | Every commit in the PR via cz check, plus the PR title |
| lint-branch | Branch name against the naming regex |
If CI blocks your PR:
- Go to the Actions tab on GitHub.
- Open the failing job (lint-commits or lint-branch).
- Read the ✘ line; it shows the exact offending commit or branch name.
- Fix commits with git rebase -i, then force-push:
git rebase -i origin/develop
# change "pick" to "reword" on bad commits, fix the messages
git push --force-with-lease
Cutting a release
Only the release manager needs this. Everyone else: stop here.
pip install commitizen # one-time setup
# On release/* branch:
cz bump # auto-bumps DESCRIPTION Version, updates CHANGELOG.md,
# creates a signed git tag (e.g. v1.5.0)
git push && git push --tags
cz bump reads commits since the last tag and determines the next version automatically (semver). See the Release workflow section of vignette("testing-strategy") for details.
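To make the version choice concrete, here is a rough sketch of a Conventional Commits to semver mapping. It assumes that a "!" before the colon bumps major, feat bumps minor, and fix, perf or refactor bump patch. The rules cz bump actually applies depend on the project's commitizen configuration, and the helper names bump_level and next_version are hypothetical.

```shell
#!/bin/sh
# Sketch only: inspects subject lines, so "BREAKING CHANGE:" footers in
# commit bodies are not detected here.

# Reads commit subjects on stdin (one per line) and prints the highest
# bump level they imply: major, minor, patch, or none.
bump_level() {
  awk '
    /^[a-z]+(\([^)]*\))?!:/             { major = 1 }
    /^feat(\([^)]*\))?:/                { minor = 1 }
    /^(fix|perf|refactor)(\([^)]*\))?:/ { patch = 1 }
    END {
      if (major) { print "major" }
      else if (minor) { print "minor" }
      else if (patch) { print "patch" }
      else { print "none" }
    }'
}

# Applies a bump level to a MAJOR.MINOR.PATCH version string.
next_version() {
  printf '%s\n' "$1" | awk -F. -v level="$2" '{
    if (level == "major") { printf "%d.0.0\n", $1 + 1 }
    else if (level == "minor") { printf "%d.%d.0\n", $1, $2 + 1 }
    else if (level == "patch") { printf "%d.%d.%d\n", $1, $2, $3 + 1 }
    else { print $0 }
  }'
}
```

Under these assumptions, a range containing one fix and one feat commit yields minor, and next_version 1.4.2 minor prints 1.5.0.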
Python / Cython internals
Python and Cython are an implementation detail of ARTEMIS. Contributors do not need to install Python, touch .py / .pyx files, or understand the bridge layer. The R test suite (test-100-bridge.R) exercises the full stack automatically.
Technical Debt Standard
When to open a tech-debt issue
Open one when you encounter any of the following:
- Hard-to-maintain code with unclear intent
- A temporary workaround that was never revisited
- Poor separation of concerns making future changes risky
- Legacy behaviour whose contract is undocumented
- Missing tests for logic that is correctness-critical
Required labels
Every tech-debt issue must have exactly three labels:
| Group | Pick one |
|---|---|
| type:tech-debt | always |
| area:* | see table below |
| priority:* | P1 / P2 / P3 |
Area labels:
| Label | Scope |
|---|---|
| area:data-records | Patient records, regimen reference data |
| area:scoring | Aligner algorithm, penalty params, TSW / Cython implementation |
| area:reports | Output reports, stats, writeOutputs |
| area:prealign | Pre-alignment: blacklisting (cleanByBlacklist, buildBlacklistRegex), stringDF_from_cdm, encode/decode |
| area:postalign | Post-alignment: processAlignments, lineOfTreatment, removeOverlaps, createDrugDF |
| area:r-bridge | R ↔ Python/Cython bridge and reticulate layer |
| area:ci | GitHub Actions workflows, hooks, CI tooling |
| area:docs | README, man pages, vignettes, CONTRIBUTING |
Priority rules:
| Label | When to use |
|---|---|
| priority:P1 | Blocks development or risks correctness; fix before next release |
| priority:P2 | Affects maintainability; schedule within current cycle |
| priority:P3 | Cleanup / low risk; backlog |
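The three-label rule is mechanical enough to check in a script. The helper below is a hypothetical sketch and not part of the repository; the name valid_debt_labels is an assumption, and the area list is copied from the table above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when the arguments are exactly three
# labels, one from each required group.
AREAS='data-records scoring reports prealign postalign r-bridge ci docs'

valid_debt_labels() {
  count=0
  has_type=0
  has_area=0
  has_priority=0
  for label in "$@"; do
    count=$((count + 1))
    case "$label" in
      type:tech-debt)
        has_type=$((has_type + 1)) ;;
      area:*)
        # Count the label only if it names a known area.
        for area in $AREAS; do
          if [ "$label" = "area:$area" ]; then
            has_area=$((has_area + 1))
          fi
        done ;;
      priority:P1|priority:P2|priority:P3)
        has_priority=$((has_priority + 1)) ;;
    esac
  done
  [ "$count" -eq 3 ] && [ "$has_type" -eq 1 ] &&
    [ "$has_area" -eq 1 ] && [ "$has_priority" -eq 1 ]
}
```

For example, valid_debt_labels type:tech-debt area:scoring priority:P2 succeeds, while omitting the priority label or adding a second area label fails.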
Issue template
Use the Technical Debt issue template (.github/ISSUE_TEMPLATE/tech-debt.yml). Required fields:
Location – file + function / line range
Problem – what is wrong and why
Impact – risk | maintainability | correctness | performance
Direction – suggested fix (optional but encouraged)
Related – linked issues / PRs
Compliant example title: debt: cleanText does not handle NULL drug_concept_id