# Best practice criteria
This page documents the best practice criteria we are aiming to adhere to across this work as part of our quality assurance.
We include the raw Markdown checklists that you can copy and adapt for your own projects, along with links to the corresponding GitHub issues for this project where progress against each checklist is tracked.
# STRESS-DES
STRESS-DES is a reporting checklist for discrete-event simulation studies. It helps ensure the model, data, assumptions, experimentation, and implementation are described clearly enough for others to replicate the work.
Replicable: New code based on described methods produces consistent results. Gives confidence in results, their validity and reliability.
**Source:** Monks, T., Currie, C. S. M., Onggo, B. S., Robinson, S., Kunc, M., & Taylor, S. J. E. (2019). Strengthening the reporting of empirical simulation studies: Introducing the STRESS guidelines. Journal of Simulation, 13(1), 55–67. https://doi.org/10.1080/17477778.2018.1442155
## Objectives
* [ ] **1.1 Purpose of the model**<br>Explain the background and objectives for the model
* [ ] **1.2 Model outputs**<br>Define all quantitative performance measures that are reported, using equations where necessary. Specify how and when they are calculated during the model run along with how any measures of error such as confidence intervals are calculated.
* [ ] **1.3 Experimentation aims**<br>If the model has been used for experimentation, state the objectives that it was used to investigate.<br>(A) Scenario based analysis – Provide a name and description for each scenario, providing a rationale for the choice of scenarios and ensure that item 2.3 (below) is completed.<br>(B) Design of experiments – Provide details of the overall design of the experiments with reference to performance measures and their parameters (provide further details in data below).<br>(C) Simulation Optimisation – (if appropriate) Provide full details of what is to be optimised, the parameters that were included and the algorithm(s) that were used. Where possible provide a citation of the algorithm(s).
## Logic
* [ ] **2.1 Base model overview diagram**<br>Describe the base model using appropriate diagrams and description. This could include one or more process flow, activity cycle or equivalent diagrams sufficient to describe the model to readers. Avoid complicated diagrams in the main text. The goal is to describe the breadth and depth of the model with respect to the system being studied.
* [ ] **2.2 Base model logic**<br>Give details of the base model logic. Give additional model logic details sufficient to communicate to the reader how the model works.
* [ ] **2.3 Scenario logic**<br>Give details of the logical difference between the base case model and scenarios (if any). This could be incorporated as text or where differences are substantial could be incorporated in the same manner as 2.2.
* [ ] **2.4 Algorithms**<br>Provide further detail on any algorithms in the model that (for example) mimic complex or manual processes in the real world (i.e. scheduling of arrivals/ appointments/ operations/ maintenance, operation of a conveyor system, machine breakdowns, etc.). Sufficient detail should be included (or referred to in other published work) for the algorithms to be reproducible. Pseudo-code may be used to describe an algorithm.
* [ ] **2.5.1 Components - entities**<br>Give details of all entities within the simulation including a description of their role in the model and a description of all their attributes.
* [ ] **2.5.2 Components - activities**<br>Describe the activities that entities engage in within the model. Provide details of entity routing into and out of the activity.
* [ ] **2.5.3 Components - resources**<br>List all the resources included within the model and which activities make use of them.
* [ ] **2.5.4 Components - queues**<br>Give details of the assumed queuing discipline used in the model (e.g. First in First Out, Last in First Out, prioritisation, etc.). Where one or more queues have a different discipline from the rest, provide a list of queues, indicating the queuing discipline used for each. If reneging, balking or jockeying occur, etc., provide details of the rules. Detail any delays or capacity constraints on the queues.
* [ ] **2.5.5 Components - entry/exit points**<br>Give details of the model boundaries i.e. all arrival and exit points of entities. Detail the arrival mechanism (e.g. ‘thinning’ to mimic a non-homogeneous Poisson process or balking).
## Data
* [ ] **3.1 Data sources**<br>List and detail all data sources. Sources may include:<br>• Interviews with stakeholders,<br>• Samples of routinely collected data,<br>• Prospectively collected samples for the purpose of the simulation study,<br>• Public domain data published in either academic or organisational literature. Provide, where possible, the link and DOI to the data or reference to published literature.<br>All data source descriptions should include details of the sample size, sample date ranges and use within the study.
* [ ] **3.2 Pre-processing**<br>Provide details of any data manipulation that has taken place before its use in the simulation, e.g. interpolation to account for missing data or the removal of outliers.
* [ ] **3.3 Input parameters**<br>List all input variables in the model. Provide a description of their use and include parameter values. For stochastic inputs provide details of any continuous, discrete or empirical distributions used along with all associated parameters. Give details of all time dependent parameters and correlation.<br>Clearly state:<br>• Base case data<br>• Data used in experimentation, where different from the base case.<br>• Where optimisation or design of experiments has been used, state the range of values that parameters can take.<br>• Where theoretical distributions are used, state how these were selected and prioritised above other candidate distributions.
* [ ] **3.4 Assumptions**<br>Where data or knowledge of the real system is unavailable what assumptions are included in the model? This might include parameter values, distributions or routing logic within the model.
## Experimentation
* [ ] **4.1 Initialisation**<br>Report if the system modelled is terminating or non-terminating. State if a warm-up period has been used, its length and the analysis method used to select it. For terminating systems state the stopping condition.<br>State what if any initial model conditions have been included, e.g., pre-loaded queues and activities. Report whether initialisation of these variables is deterministic or stochastic.
* [ ] **4.2 Run length**<br>Detail the run length of the simulation model and time units.
* [ ] **4.3 Estimation approach**<br>State the method used to account for the stochasticity: For example, two common methods are multiple replications or batch means. Where multiple replications have been used, state the number of replications and for batch means, indicate the batch length and whether the batch means procedure is standard, spaced or overlapping. For both procedures provide a justification for the methods used and the number of replications/size of batches.
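The batch means procedure mentioned in item 4.3 can be sketched as follows. This is a minimal illustration, assuming the observations from a single long run are held in a Python list; it covers only the standard (contiguous, non-overlapping) variant:

```python
def batch_means(observations, n_batches=10):
    """Split one long run's time-ordered observations into contiguous
    batches and return the mean of each batch. The batch means are then
    treated as approximately independent replicates for analysis."""
    size = len(observations) // n_batches
    return [
        sum(observations[i * size:(i + 1) * size]) / size
        for i in range(n_batches)
    ]
```

The batch size (and therefore `n_batches`) still needs the justification that item 4.3 asks for.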
## Implementation
* [ ] **5.1 Software or programming language**<br>State the operating system and version and build number.<br>State the name, version and build number of commercial or open source DES software that the model is implemented in.<br>State the name and version of general-purpose programming languages used (e.g. Python 3.5).<br>Where frameworks and libraries have been used provide all details including version numbers.
* [ ] **5.2 Random sampling**<br>State the algorithm used to generate random samples in the software/programming language used e.g. Mersenne Twister.<br>If common random numbers are used, state how seeds (or random number streams) are distributed among sampling processes.
* [ ] **5.3 Model execution**<br>State the event processing mechanism used e.g. three phase, event, activity, process interaction.<br>*Note that in some commercial software the event processing mechanism may not be published. In these cases authors should adhere to item 5.1 software recommendations.*<br>State all priority rules included if entities/activities compete for resources.<br>If the model is parallel, distributed and/or uses grid or cloud computing, etc., state and preferably reference the technology used. For parallel and distributed simulations state the time management algorithms used. If the HLA is used then state the version of the standard, which run-time infrastructure (and version), and any supporting documents (FOMs, etc.).
* [ ] **5.4 System specification**<br>State the model run time and specification of hardware used. This is particularly important for large scale models that require substantial computing power. For parallel, distributed and/or grid or cloud computing implementations, state the details of all systems used (processors, network, etc.).
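For the random sampling item (5.2), one common way to distribute seeds among sampling processes in Python is NumPy's `SeedSequence`, which spawns statistically independent child streams from a single base seed. The stream names and rates below are illustrative assumptions:

```python
import numpy as np

def make_streams(base_seed, n_streams):
    """Spawn independent random number streams from one base seed,
    e.g. one stream per sampling process (arrivals, service times),
    so processes remain decoupled."""
    children = np.random.SeedSequence(base_seed).spawn(n_streams)
    return [np.random.default_rng(child) for child in children]

arrival_rng, service_rng = make_streams(base_seed=42, n_streams=2)
iat = arrival_rng.exponential(scale=5.0)       # illustrative inter-arrival draw
service = service_rng.exponential(scale=12.0)  # illustrative service time draw
```

Because each process has its own stream, scenarios can share common random numbers: changing the service-time logic does not disturb the arrival stream.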
## Code access
* [ ] **6.1 Computer model sharing statement**<br>Describe how someone could obtain the model described in the paper, the simulation software and any other associated software (or hardware) needed to reproduce the results. Provide, where possible, links and DOIs to these.
# STARS reproducibility recommendations
These recommendations focus on making simulations reproducible, derived from computational reproducibility assessments of published healthcare DES models.
Reproducible: Running code regenerates the published results. Verifies code is working as expected and increases trust.
**Source:** Heather, A., Monks, T., Harper, A., Mustafee, N., & Mayne, A. (2025). On the reproducibility of discrete-event simulation studies in health research: an empirical study using open models. Journal of Simulation, 1–25. https://doi.org/10.1080/17477778.2025.2552177
## Recommendations to support reproducibility
Set-up:
* [ ] ⭐ Share code with an open licence
* [ ] Link publication to a specific version of the code
* [ ] List dependencies and versions
Running the model:
* [ ] ⭐ Provide code for all scenarios and sensitivity analyses
* [ ] ⭐ Ensure model parameters are correct
* [ ] Control randomness
Outputs:
* [ ] ⭐ Include code to generate the tables, figures and other reported results
* [ ] ⭐ Include code to calculate all required model outputs
## Recommendations to support troubleshooting and reuse
Design:
* [ ] Separate model code from applications
* [ ] Avoid hard-coded parameters
* [ ] Minimise code duplication
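A minimal Python sketch of the first two design items: parameters live in one container that experiment scripts override, rather than being hard-coded inside the model. All names and default values here are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Params:
    """One home for model parameters instead of magic numbers
    scattered through the code. Values are placeholders only."""
    n_doctors: int = 3
    mean_iat: float = 5.0        # minutes between arrivals
    mean_consult: float = 12.0   # minutes of consultation
    run_length: float = 600.0    # minutes simulated

base = Params()
scenario = Params(n_doctors=4)   # new experiment without editing model code
```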
Clarity:
* [ ] Comment sufficiently
* [ ] Ensure clarity and consistency in the model results tables
* [ ] Include run instructions
* [ ] State run time and machine specifications
Functionality:
* [ ] Address large file sizes
* [ ] Avoid excessive output files
* [ ] Save outputs to a file
* [ ] Optimise model run time
# STARS reuse framework
This framework focuses on making DES models more reusable.
Reusable: Code can be adapted and used in new contexts. Saves time and increases impact.
**Source:** Monks, T., Harper, A., & Mustafee, N. (2025). Towards sharing tools and artefacts for reusable simulations in healthcare. Journal of Simulation, 19(6), 619–638. https://doi.org/10.1080/17477778.2024.2347882
## Essential components
* [ ] Open licence
* [ ] Dependency management
* [ ] FOSS model (free and open source software)
* [ ] Minimum documentation - minimal instructions (e.g. in README) that overview (a) what model does, (b) how to install and run model to obtain results, and (c) how to vary parameters to run new experiments
* [ ] ORCID
* [ ] Citation information
* [ ] Remote code repository
* [ ] Open science archive
## Optional components
* [ ] Enhanced documentation - Open and high quality documentation on how the model is implemented and works (e.g. via notebooks and markdown files, brought together using software like Quarto and Jupyter Book). Suggested content includes:
* Plain English summary of project and model
* Clarifying licence
* Citation instructions
* Contribution instructions
* Model installation instructions
* Structured code walk through of model
* Documentation of modelling cycle using TRACE
* Annotated simulation reporting guidelines
* Clear description of model validation including its intended purpose
* [ ] Documentation hosting
* [ ] Online coding environment
* [ ] Model interface
* [ ] Web app hosting
# NHS Levels of RAP
This framework provides a staged path for NHS teams to implement RAP (reproducible analytical pipelines) in their analytical work.
The following framework is from the NHS RAP Community of Practice: **[NHS RAP Levels of RAP Framework](https://nhsdigital.github.io/rap-community-of-practice/introduction_to_RAP/levels_of_RAP/)**.
It is © 2024 Crown Copyright (NHS England), shared under the terms of the [Open Government 3.0 licence](https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/).
The specific version of the framework copied below is that from commit [2549256](https://github.com/NHSDigital/rap-community-of-practice/commit/2549256498886d6d7ea4cdb736e2a2864c8bb461) (9th September 2024).
## 🥉 Baseline
RAP fundamentals offering resilience against future change.
* [ ] Data produced by code in an open-source language (e.g., Python, R).
* [ ] Code is version controlled (see [Git basics](https://nhsdigital.github.io/rap-community-of-practice/training_resources/git/introduction-to-git/) and [using Git collaboratively](https://nhsdigital.github.io/rap-community-of-practice/training_resources/git/using-git-collaboratively/) guides).
* [ ] Repository includes a README.md file (or equivalent) that clearly details steps a user must follow to reproduce the code (use [NHS Open Source Policy section on Readmes](https://github.com/nhsx/open-source-policy/blob/main/open-source-policy.md#b-readmes) as a guide).
* [ ] Code has been [peer reviewed](https://nhsdigital.github.io/rap-community-of-practice/implementing_RAP/workflow/code-review/).
* [ ] Code is [published in the open](https://nhsdigital.github.io/rap-community-of-practice/implementing_RAP/publishing_code/how-to-publish-your-code-in-the-open/) and linked to & from accompanying publication (if relevant).
## 🥈 Silver
Implementing best practice by following good analytical and software engineering standards.
Meeting all of the above requirements, plus:
* [ ] Outputs are produced by code with minimal manual intervention.
* [ ] Code is well-documented including user guidance, explanation of code structure & methodology and [docstrings](https://nhsdigital.github.io/rap-community-of-practice/training_resources/python/python-functions/#documentation) for functions.
* [ ] Code is well-organised following [standard directory format](https://nhsdigital.github.io/rap-community-of-practice/training_resources/python/project-structure-and-packaging/).
* [ ] [Reusable functions](https://nhsdigital.github.io/rap-community-of-practice/training_resources/python/python-functions/) and/or classes are used where appropriate.
* [ ] Code adheres to agreed coding standards (e.g. PEP8, [style guide for Pyspark](https://nhsdigital.github.io/rap-community-of-practice/training_resources/pyspark/pyspark-style-guide/)).
* [ ] Pipeline includes a testing framework ([unit tests](https://nhsdigital.github.io/rap-community-of-practice/training_resources/python/unit-testing/), [back tests](https://nhsdigital.github.io/rap-community-of-practice/training_resources/python/backtesting/)).
* [ ] Repository includes dependency information (e.g. [requirements.txt](https://pip.pypa.io/en/stable/user_guide/#requirements-files), [PipFile](https://github.com/pypa/pipfile/blob/main/README.rst), [environment.yml](https://nhsdigital.github.io/rap-community-of-practice/training_resources/python/virtual-environments/conda/)).
* [ ] [Logs](https://nhsdigital.github.io/rap-community-of-practice/training_resources/python/logging-and-error-handling/) are automatically recorded by the pipeline to ensure outputs are as expected.
* [ ] Data is handled and output in a [Tidy data format](https://medium.com/@kimrodrikwa/untidy-data-a90b6e3ebe4c).
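As a sketch of the logging item above, Python's standard `logging` module can record pipeline checks automatically. The row-count expectation here is an invented example of checking that "outputs are as expected":

```python
import logging

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("pipeline")

def check_row_count(rows, expected_min=1):
    """Log whether a pipeline output meets a simple expectation."""
    if len(rows) >= expected_min:
        logger.info("Output has %d rows (>= %d expected)", len(rows), expected_min)
        return True
    logger.warning("Output has only %d rows (< %d expected)", len(rows), expected_min)
    return False
```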
## 🥇 Gold
Analysis as a product to further elevate your analytical work and enhance its reusability to the public.
Meeting all of the above requirements, plus:
* [ ] Code is fully [packaged](https://packaging.python.org/en/latest/).
* [ ] Repository automatically runs tests etc. via CI/CD or a different integration/deployment tool e.g. [GitHub Actions](https://docs.github.com/en/actions).
* [ ] Process runs based on event-based triggers (e.g., new data in database) or on a schedule.
* [ ] Changes to the RAP are clearly signposted. E.g. a changelog in the package, releases etc. (See gov.uk info on [Semantic Versioning](https://github.com/alphagov/govuk-frontend/blob/main/docs/contributing/versioning.md)).
# DES RAP Book verification and validation checklist
This checklist translates verification and validation guidance from the simulation literature into concrete, practical steps to build confidence in DES models.
**Source:** Heather, A., Monks, T., Mustafee, N., Harper, A., Alidoost, F., Challen, R., & Slater, T. (2025). DES RAP Book: Reproducible Discrete-Event Simulation in Python and R. https://github.com/pythonhealthdatascience/des_rap_book. https://doi.org/10.5281/zenodo.17094155. Checklist page: https://pythonhealthdatascience.github.io/des_rap_book/pages/guide/verification_validation/verification_validation.html
## Verification
Desk checking
* [ ] Systematically check code.
* [ ] Keep documentation complete and up-to-date.
* [ ] Maintain an environment with all required packages.
* [ ] Lint code.
* [ ] Get code review.
Debugging
* [ ] Write tests - they'll help with spotting bugs.
* [ ] During model development, monitor the model using logs - they'll help with spotting bugs.
* [ ] Use GitHub issues to record bugs as they arise, so they aren't forgotten and are recorded for future reference.
Assertion checking
* [ ] Add checks in the model which cause errors if something doesn't look right.
* [ ] Write tests which check that assertions hold true.
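In Python, assertion checks of this kind might look like the following sketch. The two quantities checked (non-negative waits, utilisation between 0 and 1) are common examples rather than an exhaustive list:

```python
def record_wait(entity_id, arrival_time, service_start):
    """Compute a wait, failing fast if it is impossible."""
    wait = service_start - arrival_time
    assert wait >= 0, f"Negative wait for entity {entity_id}: {wait}"
    return wait

def utilisation(busy_time, n_resources, run_length):
    """Compute resource utilisation, failing fast if it is out of range."""
    util = busy_time / (n_resources * run_length)
    assert 0.0 <= util <= 1.0, f"Utilisation out of range: {util}"
    return util
```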
Special input testing
* [ ] If there are input variables with explicit limits, design boundary value tests to check the behaviour at, just inside, and just outside each boundary.
* [ ] Write stress tests which simulate worst-case load and ensure model is robust under heavy demand.
* [ ] Write tests with little or no activity/waits/service.
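A boundary value test for an input with an explicit limit might look like this sketch, where the lower limit of one server is a made-up example:

```python
def set_n_servers(n):
    """Illustrative model input with an explicit lower limit of 1."""
    if n < 1:
        raise ValueError("need at least one server")
    return n

def test_boundary_values():
    assert set_n_servers(1) == 1       # at the boundary
    assert set_n_servers(2) == 2       # just inside
    try:                               # just outside
        set_n_servers(0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for 0 servers")

test_boundary_values()
```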
Bottom-up testing
* [ ] Write unit tests for each individual component of the model.
* [ ] Once individual parts work correctly, combine them and test how they interact - this can be via integration testing or functional testing.
Regression testing
* [ ] Write tests early.
* [ ] Run tests regularly (locally or automatically via. GitHub actions).
Mathematical proof of correctness
* [ ] For parts of the model where theoretical results exist (like an M/M/s queue), compare simulation outputs with results from mathematical formulas.
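As a concrete instance of this check, an M/M/1 queue simulated with the Lindley recursion can be compared against the analytical mean queueing time, Wq = λ/(μ(μ - λ)). The arrival and service rates below are arbitrary:

```python
import random

def mm1_mean_wait(lam, mu, n_customers, seed):
    """Estimate the mean queueing wait in an M/M/1 queue using the
    Lindley recursion: W[n+1] = max(0, W[n] + S[n] - A[n+1])."""
    rng = random.Random(seed)
    wait, total = 0.0, 0.0
    for _ in range(n_customers):
        total += wait
        service = rng.expovariate(mu)        # S[n]
        interarrival = rng.expovariate(lam)  # A[n+1]
        wait = max(0.0, wait + service - interarrival)
    return total / n_customers

lam, mu = 0.5, 1.0
theory = lam / (mu * (mu - lam))             # analytical Wq = 1.0 for these rates
estimate = mm1_mean_wait(lam, mu, n_customers=200_000, seed=1)
```

A formal check would also account for initialisation bias and sampling error, rather than eyeballing the difference between `estimate` and `theory`.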
## Validation
Conceptual model validation
* [ ] Document and justify all modelling assumptions.
* [ ] Review the conceptual model with people familiar with the real system to assess completeness and accuracy.
Input data validation
* [ ] Check the datasets used - screen for outliers, determine if they are correct, and if the reason for them occurring should be incorporated into the simulation.
* [ ] Ensure you have performed appropriate input modelling steps when choosing your distributions.
Graphical comparison
* [ ] Create time-series plots and distributions of key results (e.g., daily patient arrivals, resource utilisation, waiting times) for both the model and the actual system, and compare the graphs to assess whether patterns and trends are similar.
Statistical comparison
* [ ] Collect real system data on key performance measures (e.g., wait times, lengths of stay, throughput) and compare with model outputs statistically using appropriate tests.
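One sketch of such a comparison, assuming SciPy is available; the wait-time samples are invented for illustration:

```python
from scipy import stats

# Hypothetical data: observed waits from the real system and matched
# waits from the simulation (values invented for illustration).
real_waits  = [12.1, 8.4, 15.0, 9.9, 21.3, 6.7, 11.2, 14.8, 10.5, 18.0]
model_waits = [11.5, 9.1, 16.2, 8.8, 19.9, 7.3, 12.0, 13.9, 10.1, 17.4]

# Two-sample Kolmogorov-Smirnov test: a small p-value would suggest the
# model's wait-time distribution differs from the real system's.
ks_stat, p_value = stats.ks_2samp(real_waits, model_waits)
```

With samples this small the test has little power, and raw simulation outputs are often autocorrelated, so in practice you would compare replication-level summaries or larger independent samples.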
Turing test
* [ ] Collect matching sets of model output and real system data, remove identifying labels, and present them to a panel of experts. Record whether the experts can distinguish simulation outputs from real data. Use their feedback on distinguishing features to further improve the simulation.
Predictive validation
* [ ] Use historical arrival data, staffing schedules, treatment times, or other inputs from a specific time period to drive your simulation. Compare the simulation's predictions for that period (e.g., waiting times, bed occupancy) against the real outcomes for the same period.
* [ ] Consider varying the periods you validate on (year-by-year, season-by-season, or around particular policy changes or events) to detect strengths or weaknesses in the model across different scenarios.
* [ ] Use graphical comparisons (e.g., time series plots) or statistical measures (e.g., goodness-of-fit, mean errors, confidence intervals) to assess how closely the model matches reality - see the graphical and statistical comparison items above.
Animation visualisation
* [ ] Create an animation to help with validation (as well as communication and reuse).
Comparison testing
* [ ] If you have multiple models of the same system, compare them!
Face validation
* [ ] Present key simulation outputs and model behaviour to people such as: project team members; intended users of the model (e.g., healthcare analysts, managers); people familiar with the real system (e.g., clinicians, frontline staff, patient representatives). Ask for their subjective feedback on whether the model and results "look right". Discuss specific areas, such as whether performance measures (e.g., patient flow, wait times) match expectations under similar conditions.
Experimentation validation
* [ ] Use a warm-up period.
* [ ] Use statistical methods to determine sufficient run length and number of replications.
* [ ] Perform sensitivity analysis to test how changes in input parameters affect outputs.
Cross validation
* [ ] Search for similar simulation studies and compare the key assumptions, methods and results. Discuss discrepancies and explain reasons for different findings or approaches. Use insights from other studies to improve or validate your own model.
# HSMA model readiness checklist
This staged checklist provides guidance for deciding whether a DES model is ready enough to support real-world decision-making.
**Source:** Model Readiness Checklist. Sammi Rosser & Amy Heather. HSMA - the little book of DES. https://des.hsma.co.uk/model_checklist.html
## Bronze
### Visualisations
*Creating simple visualisations of a few key areas is one of the best ways to spot unexpected behaviour and quickly validate key parts of your model logic. They allow you to quickly verify whether something looks unusually high or low, or if patterns are intuitively 'wrong'.*
Visualise the following:
- [ ] Arrivals over time
- *This could be done as a dot/scatter plot with time on the x axis and each dot representing the arrival of an individual, with separate runs represented on the y axis, or as a heatmap*
- *This helps to see if there are any unexpectedly large gaps between arrivals and if the general pace of arrivals over different time periods matches your understanding of the system*
  - *You could also do this recurrently at a relevant timescale if there are recurrent patterns of arrivals - e.g. if you are using time-dependent arrival generators to reflect patterns within a day, or across days/weeks/months. You could take an existing dot plot, filter it to a single run, and make the y axis instead reflect the day of week, for example.*
- [ ] Resource use over time
- *This can be done overall as a line plot with simulation or clock time on the x axis and number or percentage of resources in use over time*
  - *You could also do this recurrently at a relevant timescale if there are recurrent patterns of resource use - for example, if resources are unavailable in the evenings or at weekends.*
- [ ] Distributions of generated activity times **AND** generated inter-arrival times (one plot per activity or arrival generator).
- *This could be done with a histogram, box plot, swarm plot or violin plot*
- *This helps to check that the distribution of generated times roughly matches the real-world pattern, as well as checking for any implausibly long or short times*
- [ ] Queue lengths over time.
- *This can be done with a line plot with simulation or clock time on the x axis and queue length on the y axis*
- *This is useful for checking whether queues do build up, and if so, if they are implausibly large or small in comparison to the real system*
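If you record an event log, the series for such a plot can be derived directly from it. The event names below are placeholders for whatever your logging scheme uses:

```python
def queue_length_series(events):
    """Turn ("join_queue", time) / ("leave_queue", time) events into a
    list of (time, queue_length) points for a step plot."""
    series, length = [], 0
    for kind, t in sorted(events, key=lambda event: event[1]):
        length += 1 if kind == "join_queue" else -1
        series.append((t, length))
    return series

events = [("join_queue", 0.0), ("join_queue", 1.5), ("leave_queue", 2.0),
          ("join_queue", 2.5), ("leave_queue", 4.0)]
series = queue_length_series(events)  # ready for a step line plot
```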
### Core Logic Checks
*When you are doing these checks, you are really just looking for something being 'off'.*
- [ ] Watch the journey of several individuals (via logs or animations).
- *Do the journeys look 'right'? Do entities follow the paths you'd expect?*
- [ ] Review the code to check that time units have been used consistently throughout the code (e.g. time inputs don't swap from minutes to hours at any point).
*The following items can be robustly checked using [event logs](https://des.hsma.co.uk/event_logging.html).*
*However, you may be able to do some more basic checks with [console logs](https://des.hsma.co.uk/basic_debugging_tactics.html#using-simple-print-statements).*
- [ ] Check that all described stages demonstrate at least some activity during a sufficiently long, representative run.
- *For example, in a model where patients arrive, all patients are triaged, then **some** patients are advised by a nurse while others are treated by a doctor, before all patients are discharged, you would want to confirm that entities are reaching each of these stages (e.g. through [console logs](https://des.hsma.co.uk/basic_debugging_tactics.html#using-simple-print-statements)), or that resources at each stage show at least some utilisation (e.g. via the plot you generate), helping you identify if there is an issue with logic that decides which pathway entities follow*
- [ ] Manually check that the number of entities entering the model equals the number leaving the model plus the number still in the model.
  - *This ensures no entities are "lost" in queues or pathways that have no exit.*
- [ ] Manually check that entities are never in two places at once.
- [ ] Manually check that resources are never simultaneously in use by two entities at once (or whatever variation is appropriate for your model).
- [ ] Manually check that resources are not used at times where they should be obstructed/unavailable (e.g. during breaks, evenings, weekends).
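The entity-conservation check a few items above can be automated with a few lines at the end of a run; the counter names are illustrative:

```python
def check_entity_conservation(n_arrived, n_departed, n_in_system):
    """Entities in = entities out + entities still in the model;
    anything else means entities are being lost or duplicated."""
    balance = n_arrived - (n_departed + n_in_system)
    assert balance == 0, f"Entity imbalance: {balance}"

check_entity_conservation(n_arrived=120, n_departed=114, n_in_system=6)
```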
### Robustness for decision-making
- [ ] Number of runs set to a **minimum of 30** (ideally more) for generating any metrics or other outputs that will be used for decision-making.
- [ ] [Warm up length visually checked](https://pythonhealthdatascience.github.io/des_rap_book/pages/guide/output_analysis/length_warmup.html) to be appropriate.
### Reproducibility
- [ ] Add a random seed to the model, then test it by running the model twice with the same seed, and visually check outputs to confirm that the results do not change.
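This check can itself be scripted. The `run_model` below is a stand-in for a real model, returning a summary statistic driven entirely by one seeded generator:

```python
import random

def run_model(seed, n_samples=1000):
    """Stand-in for a real simulation run: all randomness flows from
    one seeded generator, so results are a function of the seed."""
    rng = random.Random(seed)
    return sum(rng.expovariate(1.0) for _ in range(n_samples)) / n_samples

assert run_model(seed=42) == run_model(seed=42)   # same seed, same result
assert run_model(seed=42) != run_model(seed=43)   # new seed, new result
```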
### Key checks against the real system
- [ ] Number of entities arriving in simulation verified against historical patterns (averages sensible, distributions sensible).
- [ ] If relevant, this is also checked for entity subtypes/entities with different attributes where it is important that these patterns match the real system.
- [ ] Generated activity times visually verified (averages sensible, distribution sensible).
- [ ] If relevant, this is also checked for entity subtypes/entities with different attributes where it is important that these patterns match the real system.
- [ ] Visual check that there are no implausible activity lengths.
- [ ] If necessary, apply a cap to generated activity times to resolve this.
- [ ] Visual check that there are no implausible inter-arrival times for entities.
- [ ] If necessary, apply a cap to generated inter-arrival times to resolve this.
- [ ] Visual check of KPI outputs against the historical baseline - do these look sensible and similar to historical data?
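Applying a cap to generated times, as the items above suggest, can be as simple as the sketch below (the exponential distribution, mean and cap are illustrative). Note that truncation biases the mean slightly downwards, which is worth recording as an assumption:

```python
import random

def sample_activity_time(rng, mean, cap):
    """Sample an activity time, capping implausibly long draws."""
    return min(rng.expovariate(1.0 / mean), cap)

rng = random.Random(7)
times = [sample_activity_time(rng, mean=10.0, cap=60.0) for _ in range(1000)]
```

An alternative is to resample until the draw falls below the cap, which preserves the distribution's shape below the cap; either way, document the choice.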
### Process and Stakeholder Checks
- [ ] Approve the simulation process map with stakeholders.
- [ ] Check the scenario outputs with stakeholders - do they feel reasonable?
- Note that stakeholder responses to outputs shouldn't necessarily be taken as a definite right/wrong judgment on the model - but they may help to sense check or indicate areas for more attention.
### Documentation
- [ ] Write a readme that explains how to run the model, how to change parameters in the model, and gives a brief overview of the system being modelled.
- [ ] Include sufficient comments in your work to help people understand non-obvious elements of the code.
- [ ] Clearly document
- [ ] data sources
- [ ] assumptions
- [ ] inputs
- [ ] decisions - including any changes to the analytical plan or decisions made during analysis
- [ ] [Document the versions of packages you have used](https://des.hsma.co.uk/stars.html#dependency-management), ideally using a requirements.txt or environment.yml file.
## Silver
### Documentation
- [ ] Add [docstrings](https://hsma-programme.github.io/h6_march_2025_forum_presentation/#/making-your-code-super-readable-with-docstrings).
- [ ] Complete a formal model reporting checklist (e.g. [STRESS-DES](https://des.hsma.co.uk/stress_des.html)).
### Code Review
- [ ] Have a code review undertaken by someone else (see [the DES RAP book: peer review](https://pythonhealthdatascience.github.io/des_rap_book/pages/guide/sharing/peer_review.html)).
### Automated Testing
Define [formal automated tests](https://des.hsma.co.uk/tests.html):
- [ ] Model running successfully.
- [ ] Varying results are obtained when using different seeds and the same parameters.
- [ ] Identical results are obtained when the same seed and parameters are used.
- [ ] Number of entities entering the model equals the number leaving the model plus the number still in the model. This ensures no entities are "lost" in queues or pathways that have no exit.
- [ ] Utilisation never exceeds capacity (i.e. resources are never in use by multiple entities, unless allowed in your model).
- [ ] All stages of the model show some activity (i.e. there are no 'orphaned' steps).
- [ ] Simple 'expected behaviour' when parameters are varied:
- [ ] Longer activity time = worse performance.
- [ ] More arrivals = longer queues.
- [ ] Write tests for behaviour under extreme conditions:
- [ ] Heavy demand.
- [ ] Little to no demand.
### Reusability
- [ ] Add an [Open Licence](https://des.hsma.co.uk/stars.html#open-licence) to your repository
### Version Control
- [ ] Use version control for code (Git)
- [ ] Make the model available on a remote code hosting service (e.g. GitHub, BitBucket)
### Model Robustness
- [ ] Undertake sensitivity analysis (i.e. check how much changes in model input variables affect output performance measures, and consider whether results appear sensible/expected)
## Gold
### Documentation
- [ ] [Create a documentation site and host this](https://des.hsma.co.uk/stars.html#documentation-hosting).
- Consider using a framework like [mkdocs-material](https://squidfunk.github.io/mkdocs-material/) to automatically display the docstrings of your functions and classes in an easy-to-access manner
- [ ] [Set up and maintain a changelog](https://keepachangelog.com/en/1.1.0/).
- [ ] Use [GitHub Issues](https://hsma-programme.github.io/h6_march_2025_forum_presentation/#/github-issues) to track bugs and highlight remaining tasks.
- [ ] [Document your quality assurance process.](https://pythonhealthdatascience.github.io/des_rap_book/pages/guide/verification_validation/quality_assurance.html)
### Reusability
- [ ] [Research artifact metadata (ORCID)](https://des.hsma.co.uk/stars.html#open-researcher-and-contributor-identifier-orcid)
- [ ] [Open Science Archive](https://des.hsma.co.uk/stars.html#open-science-archive)
- [ ] [Online Coding Environment](https://des.hsma.co.uk/stars.html#online-coding-environment)
### Model Efficiency
- [ ] [Parallelisation](https://des.hsma.co.uk/running_parallel_cpus.html) implemented.
### Model Communication and Validation
- [ ] [Create a web app interface for your model](https://des.hsma.co.uk/stars.html#model-interface)
- [ ] [Host the web app interface for your model](https://des.hsma.co.uk/stars.html#web-app-hosting)
- [ ] [Animated model output](https://hsma-tools.github.io/vidigi/vidigi_docs/adding_vidigi_to_a_simple_simpy_model_hsma_structure.html) created (if appropriate).
- *Animations also have a role to play in model validation - inspecting the animation, including with non-technical stakeholders, can help identify subtle bugs.*
### Best Practice around Variability and Model Setup
- [ ] Formal automated method implemented for determining warm-up period.
- [ ] [Formal method used for determining appropriate replication count](https://pythonhealthdatascience.github.io/des_rap_book/pages/guide/output_analysis/n_reps.html).
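The replication criterion is often met with the "confidence interval method": keep adding replications until the confidence interval around the mean output is acceptably narrow. A simplified sketch, where `one_rep` is a hypothetical stand-in for your model and a normal z value is used for brevity (a t quantile is more usual for small numbers of replications):

```python
import random
import statistics


def one_rep(seed):
    """One replication returning a performance measure (hypothetical model)."""
    rng = random.Random(seed)
    return statistics.fmean(rng.expovariate(0.2) for _ in range(500))


def reps_for_precision(target=0.05, min_reps=5, max_reps=1_000, z=1.96):
    """Add replications until the ~95% CI half-width falls within `target`
    as a fraction of the running mean (the 'confidence interval method')."""
    results = [one_rep(seed) for seed in range(min_reps)]
    while True:
        mean = statistics.fmean(results)
        half_width = z * statistics.stdev(results) / len(results) ** 0.5
        if half_width / mean < target or len(results) >= max_reps:
            return len(results), mean, half_width
        results.append(one_rep(len(results)))  # next unused seed


n_reps, mean, half_width = reps_for_precision()
print(f"{n_reps} replications give {mean:.2f} +/- {half_width:.2f}")
```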
### Automated Testing
- [ ] Define formal automated tests for:
  - [ ] Comparison of base case outputs to historical data for key metrics.
  - [ ] Statistical testing of generated distributions against real-world data (e.g. Kolmogorov–Smirnov test).
PyOpenSci package checks
PyOpenSci is a volunteer-led organisation that conducts peer review of scientific Python packages. Their review template is a curated checklist of best-practice criteria covering packaging, testing, documentation, and sustainability, which they apply when assessing packages.
**Source:** https://www.pyopensci.org/software-peer-review/appendices/templates.html
#### Documentation
The package includes all the following forms of documentation:
- [ ] **A statement of need** clearly stating problems the software is designed to solve and its target audience in the README file.
- [ ] **Installation instructions:** for the development version of the package and any non-standard dependencies in README.
- [ ] **Short quickstart tutorials** demonstrating significant functionality that successfully runs locally.
- [ ] **Function Documentation:** for all user-facing functions.
- [ ] **Examples** for all user-facing functions.
- [ ] **Community guidelines** including contribution guidelines in the README or CONTRIBUTING.
- [ ] **Metadata** including author(s), author e-mail(s), a URL, and any other relevant metadata, for example, in a `pyproject.toml` file or elsewhere.
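An example of such metadata in `pyproject.toml` is sketched below; every name, e-mail address, and URL is a placeholder to be replaced with your own details:

```toml
# pyproject.toml -- illustrative metadata only; all values are placeholders
[project]
name = "my-des-model"
version = "0.1.0"
description = "Discrete-event simulation model of a hypothetical clinic"
authors = [
  { name = "Jane Researcher", email = "jane.researcher@example.org" },
]
requires-python = ">=3.10"

[project.urls]
Homepage = "https://github.com/example/my-des-model"
Documentation = "https://example.github.io/my-des-model/"
```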
#### Readme file requirements
The package meets the readme requirements below:
- [ ] Package has a README.md file in the root directory.
The README should include, from top to bottom:
- [ ] The package name
- [ ] Badges for:
  - [ ] Continuous integration and test coverage,
  - [ ] Docs building (if you have a documentation website),
  - [ ] Python versions supported,
  - [ ] Current package version (on PyPI / Conda).
*NOTE: If the README has many more badges, you might want to consider using a table for badges: [see this example](https://github.com/ropensci/drake). Such a table should be wider than high. A badge for pyOpenSci peer review will be provided when the package is accepted.*
- [ ] Short description of package goals.
- [ ] Package installation instructions
- [ ] Any additional setup required to use the package (authentication tokens, etc.)
- [ ] Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
- [ ] Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
- [ ] Link to your documentation website.
- [ ] If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
- [ ] Citation information
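Put together, a README skeleton that follows the ordering above might look like the sketch below; all names, badge URLs, and links are placeholders:

````markdown
# my-des-model

[![Tests](https://github.com/example/my-des-model/actions/workflows/ci.yaml/badge.svg)](https://github.com/example/my-des-model/actions)
![Python](https://img.shields.io/badge/python-3.10%2B-blue)

A discrete-event simulation model of a hypothetical clinic. (One-paragraph
statement of need and target audience goes here.)

## Installation

```bash
pip install my-des-model
```

## Usage

(Brief demonstration, or links to vignettes/tutorials.)

## Documentation

See the [documentation website](https://example.github.io/my-des-model/).

## Citation

(Preferred citation, e.g. a CITATION.cff file or DOI badge.)
````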
#### Usability
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
The package structure should follow the general community best practices. In general, please consider whether:
- [ ] Package documentation is clear and easy to find and use.
- [ ] The need for the package is clear
- [ ] All functions have documentation and associated examples for use
- [ ] The package is easy to install
#### Functionality
- [ ] **Installation:** Installation succeeds as documented.
- [ ] **Functionality:** Any functional claims of the software have been confirmed.
- [ ] **Performance:** Any performance claims of the software have been confirmed.
- [ ] **Automated tests:**
- [ ] All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
- [ ] Tests cover essential functions of the package and a reasonable range of inputs and conditions.
- [ ] **Continuous Integration:** Has continuous integration set up (we suggest using GitHub Actions, but any CI platform is acceptable for review)
- [ ] **Packaging guidelines**: The package conforms to the pyOpenSci [packaging guidelines](https://www.pyopensci.org/python-package-guide).
A few notable highlights to look at:
- [ ] Package supports modern versions of Python and not [End of life versions](https://endoflife.date/python).
- [ ] Code format is standard throughout the package and follows PEP 8 guidelines (CI tests for linting pass)
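The CI and linting criteria above could be met with a minimal GitHub Actions workflow along these lines; the specific tools (flake8, pytest) and the `[dev]` extra are assumptions, not pyOpenSci requirements:

```yaml
# .github/workflows/ci.yaml -- minimal sketch; tool choices are illustrative
name: CI
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.11", "3.12"]  # supported, non-EOL versions
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - run: pip install ".[dev]"
      - run: flake8 .      # PEP 8 linting
      - run: pytest --cov  # tests with coverage
```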