diff --git a/docs/images/Cropping_process.width-800.png b/docs/images/Cropping_process.width-800.png new file mode 100644 index 00000000..b967d2e2 Binary files /dev/null and b/docs/images/Cropping_process.width-800.png differ diff --git a/docs/images/Example_using_fake_data.width-1534.png b/docs/images/Example_using_fake_data.width-1534.png new file mode 100644 index 00000000..3b7d8375 Binary files /dev/null and b/docs/images/Example_using_fake_data.width-1534.png differ diff --git a/docs/images/Flow_of_transfer.width-800.png b/docs/images/Flow_of_transfer.width-800.png new file mode 100644 index 00000000..a30f25ac Binary files /dev/null and b/docs/images/Flow_of_transfer.width-800.png differ diff --git a/docs/images/Genetic_algorithm.width-1534.png b/docs/images/Genetic_algorithm.width-1534.png new file mode 100644 index 00000000..65d2ed61 Binary files /dev/null and b/docs/images/Genetic_algorithm.width-1534.png differ diff --git a/docs/images/Parkinsons_synthetic_brain_slices.width-800.png b/docs/images/Parkinsons_synthetic_brain_slices.width-800.png new file mode 100644 index 00000000..9dfc47a6 Binary files /dev/null and b/docs/images/Parkinsons_synthetic_brain_slices.width-800.png differ diff --git a/docs/images/Recruitment_graph.width-800.png b/docs/images/Recruitment_graph.width-800.png new file mode 100644 index 00000000..71ed3c56 Binary files /dev/null and b/docs/images/Recruitment_graph.width-800.png differ diff --git a/docs/images/User_interface.width-1534.png b/docs/images/User_interface.width-1534.png new file mode 100644 index 00000000..feda3252 Binary files /dev/null and b/docs/images/User_interface.width-1534.png differ diff --git a/docs/images/ai-dictionary.png b/docs/images/ai-dictionary.png new file mode 100644 index 00000000..ec144b4d Binary files /dev/null and b/docs/images/ai-dictionary.png differ diff --git a/docs/images/ambulance-delay-predictor.png b/docs/images/ambulance-delay-predictor.png new file mode 100644 index 00000000..6c319d1d Binary files /dev/null and b/docs/images/ambulance-delay-predictor.png differ diff --git a/docs/images/bed-allocation.png b/docs/images/bed-allocation.png new file mode 100644 index 00000000..f120dfc9 Binary files /dev/null and b/docs/images/bed-allocation.png differ diff --git a/docs/images/ct-alignment.png b/docs/images/ct-alignment.png new file mode 100644 index 00000000..13f27de0 Binary files /dev/null and b/docs/images/ct-alignment.png differ diff --git a/docs/images/data-lens-casestudy.png b/docs/images/data-lens-casestudy.png new file mode 100644 index 00000000..ad1c1e92 Binary files /dev/null and b/docs/images/data-lens-casestudy.png differ diff --git a/docs/images/data-lens.png b/docs/images/data-lens.png new file mode 100644 index 00000000..ad1c1e92 Binary files /dev/null and b/docs/images/data-lens.png differ diff --git a/docs/images/long-stay-baseline/clf-feature-importance.png b/docs/images/long-stay-baseline/clf-feature-importance.png new file mode 100644 index 00000000..24896b2f Binary files /dev/null and b/docs/images/long-stay-baseline/clf-feature-importance.png differ diff --git a/docs/images/long-stay-baseline/clf-predicted-actuals-final-model-test.png b/docs/images/long-stay-baseline/clf-predicted-actuals-final-model-test.png new file mode 100644 index 00000000..29f93eb3 Binary files /dev/null and b/docs/images/long-stay-baseline/clf-predicted-actuals-final-model-test.png differ diff --git a/docs/images/long-stay-baseline/clf-predicted-actuals-training.png 
b/docs/images/long-stay-baseline/clf-predicted-actuals-training.png new file mode 100644 index 00000000..8de05e6d Binary files /dev/null and b/docs/images/long-stay-baseline/clf-predicted-actuals-training.png differ diff --git a/docs/images/long-stay-baseline/clf-predicted-actuals-validation.png b/docs/images/long-stay-baseline/clf-predicted-actuals-validation.png new file mode 100644 index 00000000..93fdc9ef Binary files /dev/null and b/docs/images/long-stay-baseline/clf-predicted-actuals-validation.png differ diff --git a/docs/images/long-stay-baseline/correlation.png b/docs/images/long-stay-baseline/correlation.png new file mode 100644 index 00000000..8c954ba2 Binary files /dev/null and b/docs/images/long-stay-baseline/correlation.png differ diff --git a/docs/images/long-stay-baseline/los-boxplot.png b/docs/images/long-stay-baseline/los-boxplot.png new file mode 100644 index 00000000..56fa2d13 Binary files /dev/null and b/docs/images/long-stay-baseline/los-boxplot.png differ diff --git a/docs/images/long-stay-baseline/los-density.png b/docs/images/long-stay-baseline/los-density.png new file mode 100644 index 00000000..ca38a9c6 Binary files /dev/null and b/docs/images/long-stay-baseline/los-density.png differ diff --git a/docs/images/long-stay-baseline/los-dist-ethnicity.png b/docs/images/long-stay-baseline/los-dist-ethnicity.png new file mode 100644 index 00000000..6ad94a9b Binary files /dev/null and b/docs/images/long-stay-baseline/los-dist-ethnicity.png differ diff --git a/docs/images/long-stay-baseline/los-dist-imd.png b/docs/images/long-stay-baseline/los-dist-imd.png new file mode 100644 index 00000000..4a9ba8bb Binary files /dev/null and b/docs/images/long-stay-baseline/los-dist-imd.png differ diff --git a/docs/images/long-stay-baseline/los-dist-sex.png b/docs/images/long-stay-baseline/los-dist-sex.png new file mode 100644 index 00000000..2e328c90 Binary files /dev/null and b/docs/images/long-stay-baseline/los-dist-sex.png differ diff --git a/docs/images/long-stay-baseline/los-mean-los-ethnicity.png b/docs/images/long-stay-baseline/los-mean-los-ethnicity.png new file mode 100644 index 00000000..e9693dd6 Binary files /dev/null and b/docs/images/long-stay-baseline/los-mean-los-ethnicity.png differ diff --git a/docs/images/long-stay-baseline/los-mean-los-imd.png b/docs/images/long-stay-baseline/los-mean-los-imd.png new file mode 100644 index 00000000..da1d12e5 Binary files /dev/null and b/docs/images/long-stay-baseline/los-mean-los-imd.png differ diff --git a/docs/images/long-stay-baseline/los-mean-los-sex.png b/docs/images/long-stay-baseline/los-mean-los-sex.png new file mode 100644 index 00000000..93bb174c Binary files /dev/null and b/docs/images/long-stay-baseline/los-mean-los-sex.png differ diff --git a/docs/images/long-stay-baseline/los-rel-error-ethnicity.png b/docs/images/long-stay-baseline/los-rel-error-ethnicity.png new file mode 100644 index 00000000..9bc7e194 Binary files /dev/null and b/docs/images/long-stay-baseline/los-rel-error-ethnicity.png differ diff --git a/docs/images/long-stay-baseline/los-rel-error-imd.png b/docs/images/long-stay-baseline/los-rel-error-imd.png new file mode 100644 index 00000000..0d319dcb Binary files /dev/null and b/docs/images/long-stay-baseline/los-rel-error-imd.png differ diff --git a/docs/images/long-stay-baseline/los-rel-error-sex.png b/docs/images/long-stay-baseline/los-rel-error-sex.png new file mode 100644 index 00000000..65b43d50 Binary files /dev/null and b/docs/images/long-stay-baseline/los-rel-error-sex.png differ diff --git 
a/docs/images/long-stay-baseline/ml-approach.png b/docs/images/long-stay-baseline/ml-approach.png new file mode 100644 index 00000000..42432e9e Binary files /dev/null and b/docs/images/long-stay-baseline/ml-approach.png differ diff --git a/docs/images/long-stay-baseline/model-comparison.png b/docs/images/long-stay-baseline/model-comparison.png new file mode 100644 index 00000000..7d7a7de8 Binary files /dev/null and b/docs/images/long-stay-baseline/model-comparison.png differ diff --git a/docs/images/long-stay-baseline/regression-feature-importance.png b/docs/images/long-stay-baseline/regression-feature-importance.png new file mode 100644 index 00000000..66f14e10 Binary files /dev/null and b/docs/images/long-stay-baseline/regression-feature-importance.png differ diff --git a/docs/images/long-stay-baseline/regression-predicted-actuals-final-model-test.jpeg b/docs/images/long-stay-baseline/regression-predicted-actuals-final-model-test.jpeg new file mode 100644 index 00000000..9667a3d0 Binary files /dev/null and b/docs/images/long-stay-baseline/regression-predicted-actuals-final-model-test.jpeg differ diff --git a/docs/images/long-stay-baseline/regression-predicted-actuals-training.png b/docs/images/long-stay-baseline/regression-predicted-actuals-training.png new file mode 100644 index 00000000..ef1a1304 Binary files /dev/null and b/docs/images/long-stay-baseline/regression-predicted-actuals-training.png differ diff --git a/docs/images/long-stay-baseline/regression-predicted-actuals-validation.jpeg b/docs/images/long-stay-baseline/regression-predicted-actuals-validation.jpeg new file mode 100644 index 00000000..1898c022 Binary files /dev/null and b/docs/images/long-stay-baseline/regression-predicted-actuals-validation.jpeg differ diff --git a/docs/images/long-stay-baseline/residuals.jpeg b/docs/images/long-stay-baseline/residuals.jpeg new file mode 100644 index 00000000..e27f7041 Binary files /dev/null and b/docs/images/long-stay-baseline/residuals.jpeg differ diff --git a/docs/images/long-stay-baseline/sparsity-clean.png b/docs/images/long-stay-baseline/sparsity-clean.png new file mode 100644 index 00000000..07cd060e Binary files /dev/null and b/docs/images/long-stay-baseline/sparsity-clean.png differ diff --git a/docs/images/long-stay-baseline/sparsity-major.png b/docs/images/long-stay-baseline/sparsity-major.png new file mode 100644 index 00000000..79b844e1 Binary files /dev/null and b/docs/images/long-stay-baseline/sparsity-major.png differ diff --git a/docs/images/long-stay-baseline/sparsity.png b/docs/images/long-stay-baseline/sparsity.png new file mode 100644 index 00000000..6593c817 Binary files /dev/null and b/docs/images/long-stay-baseline/sparsity.png differ diff --git a/docs/images/long-stay-baseline/split-age.png b/docs/images/long-stay-baseline/split-age.png new file mode 100644 index 00000000..e86eddb9 Binary files /dev/null and b/docs/images/long-stay-baseline/split-age.png differ diff --git a/docs/images/long-stay-baseline/split-los.png b/docs/images/long-stay-baseline/split-los.png new file mode 100644 index 00000000..5bfc97f6 Binary files /dev/null and b/docs/images/long-stay-baseline/split-los.png differ diff --git a/docs/images/long-stay.png b/docs/images/long-stay.png new file mode 100644 index 00000000..cad52188 Binary files /dev/null and b/docs/images/long-stay.png differ diff --git a/docs/images/nursing-placement-optimisation/ga.png b/docs/images/nursing-placement-optimisation/ga.png new file mode 100644 index 00000000..f136f29e Binary files /dev/null and 
b/docs/images/nursing-placement-optimisation/ga.png differ diff --git a/docs/images/nursing-placement-optimisation/sch.png b/docs/images/nursing-placement-optimisation/sch.png new file mode 100644 index 00000000..d9db1d46 Binary files /dev/null and b/docs/images/nursing-placement-optimisation/sch.png differ diff --git a/docs/images/nursing-placement-optimisation/ui-after-running.png b/docs/images/nursing-placement-optimisation/ui-after-running.png new file mode 100644 index 00000000..43806008 Binary files /dev/null and b/docs/images/nursing-placement-optimisation/ui-after-running.png differ diff --git a/docs/images/nursing-placement-optimisation/ui-before-running.png b/docs/images/nursing-placement-optimisation/ui-before-running.png new file mode 100644 index 00000000..d7233a21 Binary files /dev/null and b/docs/images/nursing-placement-optimisation/ui-before-running.png differ diff --git a/docs/images/nursing-placement-optimisation/ui-during-running.png b/docs/images/nursing-placement-optimisation/ui-during-running.png new file mode 100644 index 00000000..12732969 Binary files /dev/null and b/docs/images/nursing-placement-optimisation/ui-during-running.png differ diff --git a/docs/images/parkinsons-detection.png b/docs/images/parkinsons-detection.png new file mode 100644 index 00000000..e6979b2e Binary files /dev/null and b/docs/images/parkinsons-detection.png differ diff --git a/docs/images/renal-health-prediction.png b/docs/images/renal-health-prediction.png new file mode 100644 index 00000000..836b669b Binary files /dev/null and b/docs/images/renal-health-prediction.png differ diff --git a/docs/images/skunkworks-project-flow.svg b/docs/images/skunkworks-project-flow.svg new file mode 100644 index 00000000..951377f0 --- /dev/null +++ b/docs/images/skunkworks-project-flow.svg @@ -0,0 +1,4 @@ + + + +
[skunkworks-project-flow.svg: embedded draw.io diagram text omitted. The diagram maps the AI Skunkworks project flow: Inbound enquiry → Initial assessment (does this fit with national/Directorate priorities; review existing solutions and professional body outreach; Accept? If no: not suitable for the AI Skunkworks programme, refer to an alternate programme if appropriate) → Pre-mobilisation (problem statement analysis: initial conceptual data discovery, question analysis of what outcomes are required to solve the problem; Product Owner SLT buy-in: ethics considerations, Information Governance (IG) considerations; project planning: literature review and market review, plan timeline, determine requirements (DPA, DPIA, DSPT), NHSE SIRO approval (if required) OR supplier IG engagement, Trust to draft IG and seek own approvals with support from Skunkworks) → Pre-mobilisation work (tech: set up ML environment, create GitHub repo from template, test data transfer, evaluate SOTA vs standard options, full data transfer; business case (if relevant); pre-project impact survey; detailed conceptual data discovery) → Project work (sprint planning: design backlog, agree sprint plan and meetings with stakeholders, incorporate stakeholder feedback and iterate, forecast availability (including a/l and bank hols), deconflict with other projects, decide delivery route (in-house, individual contractor, supplier company), write up project brief; iterative delivery: review desired outcomes, in-depth data discovery, build/train/validate models) → Close down (fake data generators and integration test; GitHub repo upload and review; tech walkthrough with stakeholders; tech report; video walkthrough (if relevant); code review; documentation and guides; demo of how to use GitHub repo; Ready to publish? → Publish repo) → Share learning (post-project: blog post, social media, case study; show & tells: public/NHS, AI Lab/NHSE Transformation Directorate; stakeholder engagement: post-project impact survey, plan immediate post-POC engagement, roadmap to MVP and beyond, plan long term engagement; future development: establish product owner (Trust/ALB), establish business case, establish Medical Device qualification, handover assets to development team). Problem categories: system efficiency, clinical, business intelligence, other. Problem statement covers: user story/user need, what data we have, current situation, and impact of the problem on users of the health and care system.]
\ No newline at end of file diff --git a/docs/images/synthetic-data-pipeline.png b/docs/images/synthetic-data-pipeline.png new file mode 100644 index 00000000..2aae6be5 Binary files /dev/null and b/docs/images/synthetic-data-pipeline.png differ diff --git a/docs/images/team-amadeus.png b/docs/images/team-amadeus.png new file mode 100644 index 00000000..7b1aeeab Binary files /dev/null and b/docs/images/team-amadeus.png differ diff --git a/docs/images/team-giuseppe.jpeg b/docs/images/team-giuseppe.jpeg new file mode 100644 index 00000000..eda8ec8b Binary files /dev/null and b/docs/images/team-giuseppe.jpeg differ diff --git a/docs/images/team-jennifer.png b/docs/images/team-jennifer.png new file mode 100644 index 00000000..0fce5c68 Binary files /dev/null and b/docs/images/team-jennifer.png differ diff --git a/docs/images/team-matthew.png b/docs/images/team-matthew.png new file mode 100644 index 00000000..c6302fd8 Binary files /dev/null and b/docs/images/team-matthew.png differ diff --git a/docs/images/team-oludare.jpg b/docs/images/team-oludare.jpg new file mode 100644 index 00000000..80580e9a Binary files /dev/null and b/docs/images/team-oludare.jpg differ diff --git a/docs/our_work/adrenal-lesions.md b/docs/our_work/adrenal-lesions.md new file mode 100644 index 00000000..ba36b11c --- /dev/null +++ b/docs/our_work/adrenal-lesions.md @@ -0,0 +1,22 @@ +--- +title: 'Using deep learning to detect adrenal lesions in CT scans' +summary: 'This project explored whether applying AI and deep learning can augment the detection of adrenal incidentalomas in patients’ CT scans.' +category: 'Projects' +origin: 'Skunkworks' +tags: ['classification','lesion detection','vision AI'] +--- + +Many cases of adrenal lesions, known as adrenal incidentalomas, are discovered incidentally on CT scans performed for other medical conditions. These lesions can be malignant, so early detection is crucial: it allows patients to receive the correct treatment and the public health system to target resources efficiently. Traditionally, the detection of adrenal lesions on CT scans relies on manual analysis by radiologists, which can be time-consuming and unsystematic. + +The main aim of this study was to examine whether using AI can improve the detection of adrenal incidentalomas in CT scans. Previous studies have suggested that AI has the potential to distinguish different types of adrenal lesions. In this study, we specifically focused on detecting the presence of any type of adrenal lesion in CT scans. To demonstrate this proof of concept, we investigated the potential of applying deep learning techniques to predict the likelihood of an abdominal CT scan presenting as ‘normal’ or ‘abnormal’, the latter implying the presence of an adrenal lesion. +## Results + + +Output|Link +---|--- +Open Source Code & Documentation|[Github](https://github.com/nhsx/skunkworks-adrenal-lesions-detection) +Case Study|[Case Study](https://transform.england.nhs.uk/ai-lab/explore-all-resources/develop-ai/using-deep-learning-to-detect-adrenal-lesions-in-ct-scans/) +Technical report|[medRxiv](https://www.medrxiv.org/content/10.1101/2023.02.22.23286184v1) + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.)
+# \ No newline at end of file diff --git a/docs/our_work/ai-deep-dive.md b/docs/our_work/ai-deep-dive.md new file mode 100644 index 00000000..1fd2d70c --- /dev/null +++ b/docs/our_work/ai-deep-dive.md @@ -0,0 +1,149 @@ +--- +title: 'AI Deep Dive' +summary: 'The NHS AI Lab Skunkworks team have developed and delivered a series of workshops to improve confidence working with AI.' +category: 'Playbooks' +origin: 'Skunkworks' +tags: [] +--- + +### Motivation + +A series of practical workshops designed to increase confidence, trust and capability in implementing AI within the NHS and Social Care sector, based on the experience of the AI Lab Skunkworks team. + +### Audience + +Clinicians, technology teams, operations teams, and other stakeholders from organisations interested in utilising AI + +### Pre-requisites + +* I understand there is great potential for AI in Health and Care +* I want to increase my understanding about the practical application of AI in Health and Care +* I understand the variety and quantity of data in my organisation +* I'm willing to embrace being experimental and open to learning from experience + +### Attendees + +A maximum of 10 to 12 attendees + +### Your presenters + +Workshops are run by the NHS AI Lab Skunkworks team for one organisation (e.g. a Trust) at a time. + +### Format + +A series of weekly 75-minute workshops, delivered online through Google Meet or Microsoft Teams + +### By the end of the workshop series, learners will + +* Be confident in having more conversations about AI in Health and Care +* Embrace an experimental approach to AI in Health and Care +* Understand practical steps required for experimenting with AI in Health and Care +* Create a detailed plan for an AI project + +## Workshop 1: AI fundamentals + +### Aim + +Establish a baseline understanding of AI and what is possible + +### Key topics + +* Define AI, Machine Learning and Data Science +* Understand the two AI families (Narrow and General) +* What's possible with ML +* Ethics considerations +* The AI Life Cycle +* Examples of AI in Health and Care +* Examples of projects we’ve worked on + +### By the end of this workshop, learners will + +* Have a baseline understanding of AI & Machine Learning +* Be familiar with AI case studies in health and care +* Be excited about the potential for AI in their organisation + +## Workshop 2: Problem Discovery + +### Aim + +Develop skills to identify and communicate problems + +### Key topics + +* Problem identification +* Identifying stakeholders +* Understanding user needs +* Writing a user story +* Capturing the user journey + +### By the end of this workshop, learners will + +* Have clearly defined problems they are facing +* Have identified stakeholder and user needs +* Have documented the user journey + +## Workshop 3: Solution Discovery + +### Aim + +Identify solutions and potential AI technologies for a problem + +### Key topics + +* Solution identification +* Appropriate AI technologies +* Intended outcomes: Press Release + +### By the end of this workshop, learners will + +* Generate potential solutions for their problem +* Evaluate AI technologies as part of the solution +* Draft a “Press Release” for the future state + +## Workshop 4: Practicalities + +### Aim + +To understand the practical aspects of every AI project.
+ +### Key topics + +* Data Data Data: how much, where from +* Information Governance (IG) +* Regulatory frameworks +* Ethics approvals + +### By the end of this workshop, learners will + +* Identify the data needs of an AI project +* Understand how to work with Information Governance +* Understand the regulatory requirements for a project +* Understand ethical frameworks applicable to AI projects + +## Workshop 5: Launching your AI experiment + +### Aim + +To understand the next steps in launching your AI experiment + +### Key topics + +* Business and technical due diligence +* Build vs Buy? +* Team make-up and roles +* Partnering with Skunkworks, AI Award, AHSN +* Keeping up to date with developments in AI + +### By the end of this workshop, learners will + +* Understand the need for business and technical due diligence +* Understand the balance of build vs buy +* Have a robust understanding of what they need to launch their AI experiment +* Be connected to the wider AI community within the NHS and care sector + +## Book your sessions + +If you'd like to arrange an AI Deep Dive with your team, please [get in touch](mailto:england.aiskunkworks@nhs.net?subject=AI%20Deep%20Dive%20enquiry). + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) +# \ No newline at end of file diff --git a/docs/our_work/ai-dictionary.md b/docs/our_work/ai-dictionary.md new file mode 100644 index 00000000..992037e8 --- /dev/null +++ b/docs/our_work/ai-dictionary.md @@ -0,0 +1,27 @@ +--- +title: 'AI Dictionary' +summary: 'A simple dictionary of common AI terms with a health and care context.' +category: 'Projects' +origin: 'Skunkworks' +tags: ['ai', 'dictionary'] +--- + +[![AI Dictionary](../images/ai-dictionary.png)](https://nhsx.github.io/ai-dictionary) + +AI is full of acronyms, and a common understanding of technical terms is often lacking. + +We decided to create a simple, open source AI dictionary of terms with a health and care context to help level up those working in the field. + +## Results + +A front-end website, [https://nhsx.github.io/ai-dictionary](https://nhsx.github.io/ai-dictionary), written in HTML/CSS/JavaScript, with a JSON-schema-driven database of terms. + +Output|Link +---|--- +Open Source Code & Documentation|[Github](https://github.com/nhsx/ai-dictionary) +Case Study|N/A +Technical report|N/A +Algorithmic Impact Assessment|N/A + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) +# \ No newline at end of file diff --git a/docs/our_work/ambulance-delay-predictor.md b/docs/our_work/ambulance-delay-predictor.md new file mode 100644 index 00000000..c85c30cd --- /dev/null +++ b/docs/our_work/ambulance-delay-predictor.md @@ -0,0 +1,23 @@ +--- +title: 'Ambulance Handover Delay Predictor' +summary: 'Predict ambulance handover delays at hospital, with reasons, to allow ambulance services to influence hospitals'' behaviour and mitigate queues before they happen.'
+category: 'Projects' +origin: 'Skunkworks' +tags: ['ambulance','handover delay','predictor','random forest', 'decision tree', 'classification', 'time series'] +--- + +![Ambulance Handover Delay Predictor screenshot](../images/ambulance-delay-predictor.png) + +Ambulance Handover Delay Predictor was selected as a project in Q2 2022 following a successful pitch to the AI Skunkworks problem-sourcing programme. + +## Results + +A proof-of-concept demonstrator written in Python (machine learning model, Jupyter Notebooks). + +Output|Link +---|--- +Open Source Code & Documentation|[Github](https://github.com/nhsx/skunkworks-ambulance-queueing-prediction) +Technical report|[PDF](https://github.com/nhsx/skunkworks-ambulance-queueing-prediction/raw/main/docs/ambulance-queueing-prediction-technical-report.pdf) + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) +# \ No newline at end of file diff --git a/docs/our_work/bed-allocation.md b/docs/our_work/bed-allocation.md new file mode 100644 index 00000000..73fd84f1 --- /dev/null +++ b/docs/our_work/bed-allocation.md @@ -0,0 +1,24 @@ +--- +title: 'Bed allocation' +summary: 'Machine learning to effectively aid bed management in Kettering General Hospital.' +category: 'Projects' +origin: 'Skunkworks' +tags: ['bed management','bayesian forecasting','monte carlo tree search','greedy allocation'] +--- + +![Bed allocation screenshot](../images/bed-allocation.png) + +Bed allocation was identified as a suitable opportunity for the AI Skunkworks programme in May 2021. + +## Results + +A proof-of-concept demonstrator written in Python (backend, virtual hospital, models) and HTML/CSS/JavaScript (frontend). + +Output|Link +---|--- +Open Source Code & Documentation|[Github](https://github.com/nhsx/skunkworks-bed-allocation) +Case Study|[Case Study](https://www.nhsx.nhs.uk/ai-lab/explore-all-resources/develop-ai/improving-hospital-bed-allocation-using-ai/) +Technical report|[PDF](https://github.com/nhsx/skunkworks-bed-allocation/blob/main/docs/NHS_AI_Lab_Skunkworks_Bed_Allocation_Technical_Report.pdf) + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) +# \ No newline at end of file diff --git a/docs/our_work/casestudy-adrenal.md b/docs/our_work/casestudy-adrenal.md new file mode 100644 index 00000000..9bef1e23 --- /dev/null +++ b/docs/our_work/casestudy-adrenal.md @@ -0,0 +1,73 @@ +--- +title: 'Using deep learning to detect adrenal lesions in CT scans' +summary: 'Augmenting the detection of adrenal incidentalomas in patients’ CT scans.' +category: 'CaseStudies' +origin: 'Skunkworks' +tags: ['vision AI','classification','deep learning', 'pathology', 'neural networks'] +--- + +## Info +This is a backup of the case study published [here](https://transform.england.nhs.uk/ai-lab/explore-all-resources/develop-ai/using-deep-learning-to-detect-adrenal-lesions-in-ct-scans/) on the NHS England Transformation Directorate website. + +## Case Study +Many cases of adrenal lesions, known as adrenal incidentalomas, are discovered incidentally on CT scans performed for other medical conditions. These lesions can be malignant, so early detection is crucial: it allows patients to receive the correct treatment and the public health system to target resources efficiently.
Traditionally, the detection of adrenal lesions on CT scans relies on manual analysis by radiologists, which can be time-consuming and unsystematic. + + +**The challenge** +Can applying AI and deep learning augment the detection of adrenal incidentalomas in patients’ CT scans? + + +### Overview +Autopsy studies reveal that as many as 6% of all natural deaths display a previously undiagnosed adrenal lesion. Such lesions are also found incidentally (and are therefore referred to as adrenal incidentalomas) in approximately 1% of chest or abdominal CT scans. These lesions affect approximately 50,000 patients annually in the United Kingdom, with significant impact on patient health, including 10% to 15% of cases of excess hormone production, or 1% to 5% of cases of cancer. + +It is a significant challenge for the health care system to promptly and consistently reassure the majority of patients, who have no abnormalities, whilst effectively focusing on those with hormone excess or cancers. Issues include over-reporting (false positives), causing patient anxiety and unnecessary investigations (wasting resources of the health care system), and under-reporting (missed cases), with potentially fatal outcomes. This has major impacts on patient well-being and clinical outcomes, as well as cost-effectiveness. + +The main aim of this study was to examine whether using Artificial Intelligence (AI) can improve the detection of adrenal incidentalomas in CT scans. Previous studies have suggested that AI has the potential to distinguish different types of adrenal lesions. In this study, we specifically focused on detecting the presence of any type of adrenal lesion in CT scans. To demonstrate this proof of concept, we investigated the potential of applying deep learning techniques to predict the likelihood of an abdominal CT scan presenting as ‘normal’ or ‘abnormal’, the latter implying the presence of an adrenal lesion. + +### What we did +Using the data provided by University Hospitals of North Midlands NHS Trust, we developed a 2.5D deep learning model to detect adrenal lesions in patients’ CT scans (binary classification of normal and abnormal adrenal glands). The entire dataset is completely anonymised and does not contain any personal or identifiable information of patients. The only clinical information taken was the binary adrenal lesion label (‘normal’ or ‘abnormal’) for the pseudo-labelled patients and their CT scans. + +#### 2.5D images +A 2.5D image is a type of image that lies between a typical 2D and 3D image. It can retain some level of 3D features and can potentially be processed as a 2D image by deep learning models. A greyscale 2D image has a size of x × y, where x and y are the length and width of the image. A greyscale 3D image (e.g., a CT scan), with a size of x × y × n, can be considered a stack of n greyscale 2D images. In other words, a CT scan is a 3D image consisting of multiple 2D images layered on top of each other. The size of a 2.5D image is x × y × 3, and it represents a stack of 3 greyscale 2D images. + +Typically, an extra dimension of pixel information is required to record and display 2D colour images in electronic systems, such as the three RGB (red, green, and blue) colour channels. This increases the size of a 2D image to x × y × 3, where the 3 represents the three RGB channels.
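To make the 2.5D construction concrete, here is a minimal illustrative sketch (our own, not taken from the project repository), assuming the CT volume is held as a NumPy array of shape (x, y, n). Whether the project used overlapping or disjoint slice triplets is not stated, so overlapping windows are an assumption:

```python
import numpy as np

def volume_to_25d(volume: np.ndarray) -> list[np.ndarray]:
    """Split a greyscale CT volume of shape (x, y, n) into 2.5D images
    of shape (x, y, 3): each image stacks three neighbouring axial
    slices into the channel axis, exactly where an RGB image would
    carry its three colour channels."""
    x, y, n = volume.shape
    return [
        np.stack([volume[:, :, i], volume[:, :, i + 1], volume[:, :, i + 2]], axis=-1)
        for i in range(n - 2)  # overlapping windows: an assumption, see above
    ]

# A 512 x 512 x 40 scan yields 38 three-channel images, each directly
# consumable by a 2D CNN such as ResNet or EfficientNet.
scan = np.random.rand(512, 512, 40).astype(np.float32)
images = volume_to_25d(scan)
print(len(images), images[0].shape)  # 38 (512, 512, 3)
```

Each resulting x × y × 3 array has the same shape as an RGB image, which is what lets standard 2D architectures consume it unchanged.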
Many commonly used families of 2D deep learning algorithms (e.g., VGG, ResNet, and EfficientNet) have taken colour images into account and have the ability to process images with the extra three channels. Taking advantage of the fact that 2.5D images have the same pixel dimensions as 2D colour images, converting our 3-dimensional CT scan data to 2.5D images allows us to apply 2D deep learning models to our images. + +#### Why use a 2.5D model +Due to the intrinsic nature of CT scans (e.g., a high operating cost, limited number of available CT scanners, and patients’ exposure to radiation), acquiring a sufficient number of CT scans to train a 3D deep learning model is challenging. In many cases, the performance of 3D deep learning models is limited by a small, non-diverse dataset. Training, validating, and testing the model with a small dataset can lead to many disadvantages: for example, a high risk of overfitting to the training-validation set (poor predictive ability on an unseen test set), and performance estimates limited by small-number statistics (an unrepresentative test set can make test accuracy much lower or higher than the underlying model performance). + +To overcome some of the disadvantages of training a 3D deep learning model, we took a 2.5D deep learning model approach in this case study. Training the model using 2.5D images enables our deep learning model to still learn from the 3D features of the CT scans, while increasing the number of training and testing data points in this study. Moreover, we can apply 2D deep learning models to the set of 2.5D images, which allows us to apply transfer learning to train our own model further based on the knowledge learned by other deep learning applications (e.g., ImageNet, and the NHS AI Lab’s National COVID-19 Chest Imaging Database). + +![Adrenal flow of transfer](../images/Flow_of_transfer.width-800.png) + +#### Classification of 3D CT scans +To perform the binary classification on the overall CT scans (instead of a single 2.5D image), the classification results from each individual 2.5D image that makes up a CT scan are considered. + +To connect the classification predictions on the 2.5D images to the CT scan, we introduce an operating value for our model to provide the final classification. A CT scan is classified as normal if the number of its 2.5D images classified as abnormal is lower than the threshold operating value. For example, if the operating value is defined to be X, a CT scan will be considered abnormal if X or more of its 2.5D images are classified as abnormal by our model. + +#### Processing the CT scans to focus on the adrenal glands + +To prepare the CT scans for this case study (with the region of interest focused on the adrenal glands), we also developed a manual 3D cropping tool for CT scans. The cropping applied to all three dimensions, comprising a 1D cropping to select the appropriate axial slices and a 2D cropping on each axial slice. The final cropped 3D image covered the whole adrenal gland on both sides with some extra margin on each side.
+ +![Adrenal cropping](../images/Cropping_process.width-800.png) + +### Outcomes and lessons learned + +The resulting code, [released as open source on our Github](https://github.com/nhsx/skunkworks-adrenal-lesions-detection) (available to anyone to re-use), enables users to: + +- Process CT scans to focus on the region of interest (e.g., adrenal glands) +- Transform 3D CT scans to sets of 2.5D images +- Train a deep learning model with the 2.5D images for adrenal lesion detection (classification: normal vs. abnormal) +- Evaluate the trained deep learning model on an independent test set. + +This proof-of-concept model demonstrates the potential of applying such deep learning techniques to the detection of adrenal lesions on CT scans. It also shows an opportunity to detect adrenal incidentalomas using deep learning. + +> An AI solution will allow for lesions to be detected more systematically and flagged for the reporting radiologist. In addition to enhanced patient safety, through minimising missed cases and variability in reporting, this is likely to be a cost-effective solution, saving clinician time. +– Professor Fahmy Hanna, Professor of Endocrinology and Metabolism, Keele Medical School and University Hospitals of North Midlands NHS Trust + + +### Who was involved? + +This project was a collaboration between the NHS AI Lab Skunkworks, within the Transformation Directorate at NHS England and NHS Improvement, and University Hospitals of North Midlands NHS Trust. + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) +# \ No newline at end of file diff --git a/docs/our_work/casestudy-ai-deep-dive.md b/docs/our_work/casestudy-ai-deep-dive.md new file mode 100644 index 00000000..d76a4833 --- /dev/null +++ b/docs/our_work/casestudy-ai-deep-dive.md @@ -0,0 +1,94 @@ +--- +title: 'Sharing AI skills and experience through deep dive workshops with University Hospital Southampton' +summary: 'What we learned running our AI Deep Dives at University Hospital Southampton' +category: 'CaseStudies' +origin: 'Skunkworks' +tags: ['ai','education','workshops'] +--- + +## Info +This is a backup of the case study published [here](https://transform.england.nhs.uk/ai-lab/explore-all-resources/understand-ai/sharing-ai-skills-and-experience-through-deep-dive-workshops/) on the NHS England Transformation Directorate website. + +## Case Study + + +### Case Study Overview +The NHS AI Lab Skunkworks team provides public sector health and social care organisations with artificial intelligence (AI) support and technical expertise. The team of data scientists and AI project specialists has been helping others explore AI solutions for a range of problems, [from supporting hospital bed allocation to detecting CT scan anomalies](https://transform.england.nhs.uk/ai-lab/ai-lab-programmes/skunkworks/ai-skunkworks-projects/). + +It became clear from engaging with organisations in these projects that the NHS AI Lab has an important role to play in increasing the trust and confidence of healthcare staff with AI tools, both in their creation and in their everyday use. By exploring the possibilities for AI together with the organisations who will use them, the AI Skunkworks team aims to bring some clarity to the potential of AI and diminish some of the hype.
+ +### The challenge +Despite increasing interest in the use of AI technologies within the NHS, it is difficult for busy teams to develop the skills and experience necessary to start new experimentation with AI or manage a successful AI project. Even when a potential use for AI is identified, ideas are often thwarted by the complexity of using AI technologies, lack of suitable data, concerns about patient data security or the burden of achieving regulatory approvals of AI as a medical device. + +The Digital team at University Hospital Southampton (UHS) approached us with a request to explore the ethical and safety considerations of applying AI in their work. It is especially critical with AI in health and care that the people affected by its use are confident that the tools are robust and that any decision support is fair for all patients. + +UHS Digital has been exploring different applications of AI in healthcare for some time. Previously, UHS had applied to the AI Lab Skunkworks Team to see if AI could assist in prioritising patients for endoscopy procedures. With a clear interest in the topic, UHS wanted to learn more about key AI terms and fundamentals. They also wanted to increase confidence in the organisation with regard to identifying the kind of problems that AI could support, and confirm the practicalities and considerations when launching and running an AI experiment. + +The Skunkworks programme aims to test the development of AI technologies and, when appropriate, their adoption into all areas of health and care. In addition to supporting the development of proof-of-concept AI solutions, and providing [open source code on Github](https://nhsx.github.io/skunkworks) for others to re-purpose, we are also providing teams like UHS with a series of AI deep dive workshops. + +> This was a great opportunity to get a frontline NHS IT team thinking about applied AI inside the system. It has certainly served to inspire the team to try new things. +– Matt Stammers, Clinical Lead, University Hospital Southampton Data Science + +### AI deep dive workshops +The workshops provide organisations with the relevant knowledge and tools to understand how to safely launch an AI experiment in healthcare. We provide guidance on identifying a potential real application of AI and use this idea to create a problem statement and identify AI solutions. We also consider the practicalities of running the experiment and the ethical and information governance considerations that are so vital for producing safe and effective technologies. + +No previous experience or knowledge of AI is necessary, as the series of workshops provides an introduction to the key terms, types and applications of AI in healthcare. Hence, this workshop series is open to anyone who is interested in how AI can support their organisation, including clinicians, technology teams, operations teams and senior stakeholders. + +The 5-part series of virtual and interactive workshops covers: + +- an introduction to AI and healthcare case studies +- how to identify potential applications of AI and write up a patient- and user-focused problem statement +- practicalities when starting and running an AI experiment, including who needs to be involved (for example, ethics, information governance and medical device regulation) +- agile ways of working to ensure the problem and the solution are always patient and user focused +- innovation methodologies, for example the Amazon ‘working backward’ press release.
+ +### What we did +Having piloted the deep dive process with colleagues across the NHS Transformation Directorate, we arranged calls with the Southampton team to understand their needs. + +We began by establishing a workshop group of up to 12 participants who would reflect the likely members of staff to be involved in running a data-driven digital transformation initiative. UHS provided a diverse group of participants from teams including electronic patient record (EPR), business intelligence, database/IT, APEX development, clinicians and research data science. + +In particular, the group wanted support with: + +- being more confident in discussions about AI in healthcare +- embracing the idea of experimentation with AI in healthcare +- understanding the practical steps required to start experimenting +- creating a detailed plan for an AI project. + +We set up weekly workshops, delivered online over a period of 5 weeks. The workshops looked to identify one problem that was worked on through the series. + +The running order for this weekly series was: + +- Workshop 1: AI fundamentals - establish a baseline understanding of AI and the art of the possible. +- Workshop 2: Problem Discovery - develop skills to identify and communicate problems. +- Workshop 3: Solution Discovery - identify solutions and potential AI technologies to solve problems. +- Workshop 4: Practicalities - understand the practical aspects of AI projects. +- Workshop 5: Launching your AI experiment - understand the next steps in launching an AI project. + +You can read more about the [deep dive workshop agenda on our Github website](https://nhsx.github.io/skunkworks/ai-deep-dive). + +We involved the group in interactive elements using tools such as Mentimeter and Google Jamboard, allowing the group to collaborate, share ideas and aid discussion. + +We included a number of innovation approaches such as the Amazon ‘working backward’ press release product development approach, which helps to imagine what the desired end result will look like. We also introduced the [lean canvas method](https://leanstack.com/lean-canvas) to clearly capture what the problem and potential solution could be, including identifying alternative solutions that may already exist. + + +### Outcomes and lessons learned +The workshops provided valuable insights to the NHS AI Lab Skunkworks team about the importance of group engagement. Having a diverse AI project team that includes people from technical, governance and frontline backgrounds is important to ensure you fully understand the problem you’re trying to solve. + +The experience also demonstrated the value of discussing topics such as “build or buy?” The team at UHS were keen to invest wisely in any AI developments and to learn how to find out about existing tools. With so many AI applications already in existence, there may be tools you can use “off the shelf” or valuable lessons to learn from previous investigations into similar issues. + +The workshops gave us a good opportunity to stress the importance of using AI safely and ethically. The data you use and the testing and governance processes you apply must all result in AI that benefits all patients. + +As a result of the deep dive sessions: + +- 67% of participants felt more confident in their baseline understanding of AI and Machine Learning. +- 71% of participants felt more confident in identifying potential solutions. +- 60% of participants felt more confident in identifying the data needs of an AI project.
+- 75% of participants felt more confident conducting business and technical due diligence. + +The team at University Hospital Southampton also reported: + +- a need for additional support when identifying and launching AI experiments +- the importance of diverse groups representing different roles and teams, to help the group explore the problem from different perspectives. + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) +# \ No newline at end of file diff --git a/docs/our_work/casestudy-bed-allocation.md b/docs/our_work/casestudy-bed-allocation.md new file mode 100644 index 00000000..b666cd53 --- /dev/null +++ b/docs/our_work/casestudy-bed-allocation.md @@ -0,0 +1,103 @@ +--- +title: 'Improving hospital bed allocation using AI' +summary: 'An investigation of AI techniques that could be used to generate options for moving patients in a way that supports the human team to make the best decisions.' +category: 'CaseStudies' +origin: 'Skunkworks' +tags: ['bed management','bayesian forecasting','monte carlo tree search','greedy allocation'] +--- + +## Info +This is a backup of the case study published [here](https://transform.england.nhs.uk/ai-lab/explore-all-resources/develop-ai/improving-hospital-bed-allocation-using-ai/) on the NHS England Transformation Directorate website. + +## Case Study +Kettering General Hospital approached the NHS AI Lab Skunkworks team with a request for support exploring artificial intelligence (AI) to improve bed management. Their vision was to use AI to achieve the “right patient, in the right bed, receiving the right care, at the right time.” + +This 14-week project investigated the AI techniques that could be used to generate options for moving patients in a way that supports the human team to make the best decisions. The project aimed to provide a proof of concept tool that uses historic data to predict demand and make bed allocation suggestions to the bed management team - providing the open source code for further experimentation at the end of the project. + +### Overview +Admitting patients into hospital is like a game of Tetris, or chess, where the allocation of each patient and bed can have a huge knock-on effect on the smooth running of admissions and the welfare of patients. This scheduling of beds is managed by a human team who rely on individual expertise to deliver a system not unlike air traffic control, calculating the best arrangement with a continually changing set of demands and numbers of patients. + +The main challenges for managing hospital admissions were reported to be: + +- Demand and capacity are complex. Not all patients are the same. Not all beds are the same. +- Staff are overwhelmed with options. Managing hundreds of beds and people presents too many choices. +- The expertise of staff, and therefore their needs, varies. + +The NHS AI Lab Skunkworks funded and supported an AI investigation with the team at Kettering General Hospital, alongside Faculty, an AI specialist supplier provided through the Home Office’s [Accelerated Capability Environment](https://transform.england.nhs.uk/ai-lab/explore-all-resources/develop-ai/improving-hospital-bed-allocation-using-ai/#ace) (ACE). + +The work looked at whether AI could support better, faster decision-making, using a tool that would predict patient flow and provide bed allocation options for a human team to consider.
The potential benefits include: + +- high-quality, consistent bed allocation decisions +- improved patient experience +- improved workforce efficiency and staff satisfaction +- a reduction in the average number of patient moves per admission (and after-hours moves) +- reductions in inpatient length of stay +- improved problem solving capability within the team. + + +### What we did +We ran a discovery phase in which the project team sought to understand the problem, the existing process and the constraints. We talked to others who are trying similar projects. The team also researched existing attempts to use AI for demand management and scheduling. + +#### Creating a virtual hospital +Following a robust information governance process, the team had access to 5 years’ worth of historic pseudonymised data from the patient admission system (PAS) and 1 to 2 years of patient flow data. Pseudonymisation is a technique that separates data from direct identifiers (for example name, surname, NHS number) and replaces them with a pseudonym (for example, a reference number), so that identifying an individual from that data is not possible without additional information. + +Having assessed the data quality and analysed pre- and post-COVID changes, they engineered training and test sets for modelling. This provided a virtual hospital environment with which to explore the use of AI. + +#### Choosing a technical approach +For the forecasting component of the project, the team used a Bayesian modelling approach which used historical data to predict how many patients with specific characteristics would present at the hospital over time. + +The team then compared three approaches to allocating a bed: greedy allocation, Monte Carlo Tree Search (MCTS) and reinforcement learning. + +Greedy allocation assigns each patient the best available bed at the point of admission, making it a fast and less resource-intensive method (a minimal sketch appears after the outcomes list below). + +The MCTS model operates by considering future events, such as the number and nature of patients who are likely to arrive for admission within the next couple of hours, along with the constraints of available beds, and then uses that information to allocate the best bed to a patient at the point of admission. This makes the model resource-intensive, requiring significant computing power to operate. + +Finally, a reinforcement learning approach was considered, which uses “agents” to maximise a reward over time, but this was not developed within the constraints of this 14-week project. + +#### Building a user interface +In order to provide staff with a usable and understandable front end, the team developed and tested a web-based user interface (UI) and integrated the allocation models. + +The team implemented the greedy allocation method in the user interface as it was the least resource-intensive approach and able to provide an explainable allocation suggestion. + +The resulting proof of concept was then tested and reviewed by Kettering General Hospital. + +### Outcomes and lessons learned +The result is a proof of concept, created in 14 weeks, with a user interface that provides staff with the following: + +- The ability to visualise a virtual hospital, showing current occupancy rates and forecasted demand for beds. +- A demonstration of what a fully developed allocation model could provide, making suggestions to the user along with an explanation. +- The ability to test the model on a wide range of patients with different attributes and associated constraints and validate the performance.
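To make the greedy strategy described under "Choosing a technical approach" concrete, here is a minimal, hypothetical sketch: the `Bed` structure and the one-point-per-need scoring function are illustrative assumptions, not the project's actual model:

```python
from dataclasses import dataclass

@dataclass
class Bed:
    ward: str
    is_free: bool
    attributes: set[str]  # hypothetical, e.g. {"side-room", "oxygen"}

def score(bed: Bed, patient_needs: set[str]) -> int:
    """Hypothetical suitability score: one point per patient need met."""
    return len(bed.attributes & patient_needs)

def greedy_allocate(beds: list[Bed], patient_needs: set[str]) -> Bed | None:
    """Greedy allocation: pick the single best free bed at the point of
    admission, ignoring patients who may arrive later (unlike MCTS,
    which also searches over likely future admissions)."""
    free = [b for b in beds if b.is_free]
    if not free:
        return None
    best = max(free, key=lambda b: score(b, patient_needs))
    best.is_free = False  # admit the patient to that bed
    return best

beds = [Bed("A", True, {"oxygen"}), Bed("B", True, {"side-room", "oxygen"})]
print(greedy_allocate(beds, {"side-room", "oxygen"}).ward)  # -> "B"
```

The defining property is that the choice is made once, using only the current state; that is what makes it fast, and what MCTS improves on by also considering likely future arrivals.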
+ +> This tool will help the likes of myself and others by supporting decision making. Support is the key word here, machine learning will support us to make these difficult bed allocation and patient decisions. +– Digital Director, Kettering General Hospital NHS Foundation Trust + +> I regularly hear that a bed is a bed and I know it’s not ... But when you have those front door pressures, you can’t get ambulances offloaded and I have beds in the wrong place - this is the time I need the real support, real time data, an automatic risk assessment that is generated for each patient. +– Member of bed management staff, Kettering General Hospital + +There have been significant challenges with this project. + +#### Data quality +Attempting to get a total view of the trust’s capacity and demand is complicated. In this example with Kettering General Hospital, there is no centralised patient flow information. Admission data would be needed for all specialties across the trust for the allocation algorithm to produce the best results. + +#### Complexity of patient needs +The unique nature of patients’ needs means taking into consideration a large number of complex combinations in order to achieve the best allocation decision. + +#### Adapting quickly to change +In a real-world setting, the technology would need to be easily reconfigured by staff with new information about increased beds, changed ward layouts or flu admission peaks. There is currently limited ability to see the impact of changes like these. + +### What next? +Kettering General Hospital will be working with Faculty to bid for further funding to develop and operationalise the bed allocation system. This will aim to build connections to patient data in real time, refine the algorithm, and establish how the allocation tool can be integrated into site management practices. + +### Who was involved? +This project is a collaboration between NHSX, [Kettering General Hospital NHS Trust](https://www.kgh.nhs.uk/), [Faculty](https://faculty.ai/) and the [Home Office’s Accelerated Capability Environment](https://www.gov.uk/government/groups/accelerated-capability-environment-ace) (ACE). The AI Lab Skunkworks exists within the NHS AI Lab to support the health and care community to rapidly progress ideas from the conceptual stage to a proof of concept. + +The NHS AI Lab is working with the Home Office programme: Accelerated Capability Environment (ACE) to develop some of its skunkworks projects, providing access to a large pool of talented and experienced suppliers who pitch their own vision for the project. + +Accelerated Capability Environment (ACE) is part of the Homeland Security Group within the Home Office. It provides access to more than 250 organisations from across industry, academia and the third sector who collaborate to bring the right blend of capabilities to a given challenge. Most of these are small and medium-sized enterprises (SMEs) offering cutting-edge specialist expertise. + +ACE is designed to bring innovation at pace, accelerating the process from defining a problem to developing a solution and delivering practical impact to just 10 to 12 weeks. + +Faculty is an applied AI company that helps build and accelerate an organisation's AI capability. They offer a range of software and services solutions. Faculty works with a number of high-profile brands globally as well as government departments and agencies.
+ +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) +# \ No newline at end of file diff --git a/docs/our_work/casestudy-ct-alignment.md b/docs/our_work/casestudy-ct-alignment.md new file mode 100644 index 00000000..cd714576 --- /dev/null +++ b/docs/our_work/casestudy-ct-alignment.md @@ -0,0 +1,93 @@ +--- +title: 'Using AI to identify tissue growth from CT scans' +summary: 'A range of classical and machine learning computer vision techniques to align and detect lesions in anonymised CT scans over time from George Eliot Hospital NHS Trust.' +category: 'CaseStudies' +origin: 'Skunkworks' +tags: ['ct','computer vision','image registration', 'lesion detection'] +--- + +## Info +This is a backup of the case study published [here](https://transform.england.nhs.uk/ai-lab/explore-all-resources/develop-ai/using-ai-to-identify-tissue-growth-from-ct-scans/) on the NHS England Transformation Directorate website. + +## Case Study +George Eliot Hospital approached the NHS AI Lab Skunkworks team with an idea to use AI to speed up the analysis of computerised tomography (CT) scans. + +CT scans are used in the detection and understanding of disease. Radiologists currently manually compare two CT scans, taken at different dates, to see whether a patient’s disease has improved, deteriorated or remained unchanged. + +This 12-week project investigated the AI techniques that could be applied to the problem and sought to provide a proof of concept (a feasibility study) of whether it was possible to identify organs and growths, report on any changes and highlight areas of concern to the radiologist. + +### Overview +There are a number of challenges for radiologists reviewing cancer patients by comparing CT scans: + +- Review is time-consuming. It typically takes 30 to 40 minutes to assess scans for each patient. +- They must check if there has been growth through multiple dimensions, but can only check one dimension at a time. Growth changes that look minimal in one dimension may be significant if viewed in 3D. +- The manual alignment of images is not precise because of variations in the position of the patient’s body between scans. +- It is not easy to see small or developing growths, increasing the possibility of missed detection. In the abdomen, for example, radiologists are reportedly making differing interpretations in up to 37% of cases (Siewert, 2008). + +The NHS AI Lab Skunkworks funded and supported an AI investigation with the team at George Eliot Hospital, alongside Roke, an AI specialist supplier provided through the Home Office’s [Accelerated Capability Environment (ACE)](https://transform.england.nhs.uk/ai-lab/explore-all-resources/develop-ai/using-ai-to-identify-tissue-growth-from-ct-scans/#who). + +The work looked at whether AI could be used to identify features in a CT scan and automatically align images to provide radiologists with a quick, trustworthy support tool that would improve early detection and diagnosis of growths and improve patient outcomes.
The team aimed to: + +- provide fast, automatic overlay of scans to enable tissue growth comparison in 2D and 3D +- successfully align scans despite changes in body shape (caused by breathing during scanning, or weight gain and loss between scans) +- identify different parts of the body (bone, organs and tissue growth) +- automatically measure tissue growth, in 2D and 3D +- detect anomalies (new growths or changes not present in previous scans) + +### What we did +#### Tissue sectioning +Before detecting anomalies, the tool needs to separate the various types of tissue present in a scan (bone, fat, organs). The team first tried the recently released Facebook self-supervised learning method “DINO”, but found that this method was less accurate on CT scans than other approaches and required large amounts of memory to process. Instead, the team proceeded with a texton approach. Textons are micro-structures in images, or ‘groups’ of pixels, that can be recognised visually before the whole image is. + +The process involved applying machine learning (where an algorithm gradually improves its accuracy through repeated exposure to new data like scans and images) to the use of textons. This method was able to identify and differentiate between different tissue types accurately and rapidly. + +#### Anomaly detection +Much like the process humans use to distinguish between objects, classify them and sort them by size, computer vision takes an image input and gives output in the form of information, for example on size or colour. To detect anomalies, the team tried two methods: ellipsoid detection and infill prediction. + +Ellipsoid detection aims to isolate rounded 3D volumes of tissue, which were expected to correlate with lesions more strongly than other geometric shapes, and to identify those which differ from their surroundings. Infill prediction aims to learn what the parts of an image should look like in order to recognise anomalies that shouldn’t be there. A combination of the two methods was found to be the most accurate at detecting lesions, while manually ignoring anomalies that aren’t of concern, for example pockets of air inside the body. + +#### Scan alignment +As well as automatically detecting lesions, the project was designed to keep a ‘human in the loop’ and act as a support for radiologists conducting examinations. To do this, three methods were tested for aligning two scans from the same patient, usually taken months apart, in 2D and 3D: keypoint alignment, phase correlation and coherent point drift (a minimal phase correlation example appears at the end of this section). Phase correlation was found to be the most robust but did not deal with body shape changes, while coherent point drift was most effective for aligning scans while taking into account any body shape changes. + +The tool allows the user to choose which of the three adjustment options provides the best alignment for each case. + + +### Outcomes and lessons learned +The hoped-for benefits of the project were partially realised: + +#### Scan alignment +The team achieved both rigid and non-rigid 3D alignment that was an improvement on manual alignment, though not perfect. The methods used to deal with difficulties caused by patients inhaling or exhaling during scanning were observed to work in most cases. + +#### Overlay +Precise overlay of scans was achieved in the tool created both for 3D and 2D images, including when zooming, rotating or panning the image.
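+As an illustration of the phase correlation technique used for scan alignment above, the following sketch registers two 2D slices using scikit-image's off-the-shelf routine. It is a toy example on synthetic data, not the project's implementation, which also handled 3D volumes and non-rigid body shape changes.
+
+```python
+import numpy as np
+from scipy.ndimage import shift as nd_shift
+from skimage.registration import phase_cross_correlation
+
+# Synthetic "baseline" slice: a bright blob on a noisy dark background.
+rng = np.random.default_rng(0)
+baseline = rng.normal(0, 0.05, (256, 256))
+baseline[100:140, 80:120] += 1.0
+
+# "Follow-up" slice: the same anatomy translated, e.g. the patient lying differently.
+followup = nd_shift(baseline, shift=(12, -7))
+
+# Phase correlation estimates the translation between the two images.
+estimated_shift, error, _ = phase_cross_correlation(baseline, followup)
+print(f"estimated shift: {estimated_shift}")   # ~ [-12.  7.], the inverse of the applied shift
+
+# Undo the estimated shift so the scans can be overlaid for comparison.
+aligned = nd_shift(followup, shift=estimated_shift)
+print(f"residual error metric: {error:.4f}")
+```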
+ +#### New tissue growth detection +Anomaly sizes were measured in 3D, but more work is needed to establish robust correspondence between existing lesions, and to measure change in 3D, to successfully aid the identification of new growths. + +#### Time saving +Further development and testing are required to establish a reduction in radiologists’ time, but the tool provides a process that is less manual for radiologists. + +George Eliot Hospital provided anonymised CT scans from 100 patients with tissue growths, and a number of marked-up scans to be used as “ground truth” data. In order to progress this work further, greater numbers of scans are needed to provide quantitative metrics of success. + +Integration of novel image processing techniques with existing scanner software also needs to be explored to minimise friction in the workflow for radiologists. + +Further exploration is also needed to identify and compare techniques and approaches that have already been tried by the medical imaging community. + + +### What next? +George Eliot Hospital and Roke have submitted a joint application for further funding through the AI Award to further test the tool and establish a rigorous evaluation ahead of any regulatory work that will be required to use the software in a clinical workflow. + +The team are testing these techniques against the latest techniques demonstrated as part of the Medical Image Computing and Computer Assisted Intervention ([MICCAI](http://www.miccai.org/)) conferences. + +### Who was involved? +This project is a collaboration between NHSX, [George Eliot Hospital](http://www.geh.nhs.uk/), and [Roke](https://www.roke.co.uk/) who were selected through the Home Office’s [Accelerated Capability Environment](https://www.gov.uk/government/groups/accelerated-capability-environment-ace) (ACE). The AI Lab Skunkworks exists within the NHS AI Lab to support the health and care community to rapidly progress ideas from the conceptual stage to a proof of concept. + +The NHS AI Lab is working with the Home Office programme: Accelerated Capability Environment (ACE) to develop some of its skunkworks projects, providing access to a large pool of talented and experienced suppliers who pitch their own vision for the project. + +Accelerated Capability Environment (ACE) is part of the Homeland Security Group within the Home Office. It provides access to more than 250 organisations from across industry, academia and the third sector who collaborate to bring the right blend of capabilities to a given challenge. Most of these are small and medium-sized enterprises (SMEs) offering cutting-edge specialist expertise. + +ACE is designed to bring innovation at pace, accelerating the process from defining a problem to developing a solution and delivering practical impact to just 10 to 12 weeks. + +Roke is a long-established science and engineering organisation, providing AI and machine learning expertise and lending technical capabilities to projects with NHS AI Lab Skunkworks. + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.)
+# \ No newline at end of file diff --git a/docs/our_work/casestudy-data-lens.md b/docs/our_work/casestudy-data-lens.md new file mode 100644 index 00000000..2cc9442e --- /dev/null +++ b/docs/our_work/casestudy-data-lens.md @@ -0,0 +1,74 @@ +--- +title: 'Data Lens: a fast-access data search in multiple languages' +summary: 'Data Lens brings together information about multiple databases, providing a fast-access search in multiple languages.' +category: 'CaseStudies' +origin: 'Skunkworks' +tags: ['natural language processing','semantic search','scraping'] +--- + +## Info +This is a backup of the case study published [here](https://transform.england.nhs.uk/ai-lab/explore-all-resources/develop-ai/data-lens-a-fast-access-data-search-in-multiple-languages/) on the NHS England Transformation Directorate website. + +## Case Study +A pilot project for the NHS AI Lab Skunkworks team, Data Lens brings together information about multiple databases, providing a fast-access search in multiple languages. + +### Overview +As the successful candidate from a Skunkworks problem-sourcing event, Data Lens was first picked as a pilot project for the NHS AI (Artificial Intelligence) Lab Skunkworks team in September 2020. + +The pitch outlined a common data problem for analysts and researchers across the UK: large volumes of data held on numerous incompatible databases in different organisations. The team wanted to be able to quickly source relevant information with one search engine. + +### How Data Lens works +Following a 12-week development phase, a first-stage prototype of the Data Lens has been completed. Using Natural Language Processing (NLP) and other AI technologies, the Data Lens project is creating a universal search engine for health and social care data catalogues and metadata. + +The Data Lens joins up data catalogues from NHS Digital, the Health Innovation Gateway, MDXCube, NHS Data Catalogue, PHE Fingertips and the Office for National Statistics. + + +By providing user-friendly access to previously time-consuming separate data catalogues, Data Lens aims to: + +- present information about data from across the sector with one search +- give preview information and direct users to an original location (avoiding the need for another database) +- provide multilingual support and a user-focused approach +- reduce workload and improve the quality of information available +- build up a picture of what data is collected and how it flows through the health and social care system. + +The search tool not only increases data access and collaboration; it also learns to improve the results it provides by tracking what people search for, whether they click through, and which dataset they use, so that its results become more relevant over time. + +Using Natural Language Processing (NLP), the engine is able to suggest relevant results that go beyond the scope of the search terms it is given. With the use of browser translation, it also supports searches and results in all 71 languages supported by Amazon Web Services (AWS), increasing the usability and inclusivity of the product. + +The prototype is getting support through the NHS digital service development pipelines - part of the journey towards achieving a fully fledged AI product that makes a real difference to the delivery of health and social care. + +### Why Data Lens is needed +The health and social care sector has huge amounts of data, spread across many NHS organisations and even more databases.
When searching for information, it can often be difficult to find out whether the data exists, and where it is held. Searching and cross-referencing multiple data catalogues can also be extremely time-consuming. + +This project furthers the ‘Joining-Up Care Agenda’ by enabling cross-organisational views of data. + +Using artificial intelligence to power this search engine reduces the time required to make the most of existing data sets, and answers the call from the Secretary of State for Health and Social Care to turbo-charge data responsiveness and ease the burden of data collection across the health and care system. + +### Open source Data Lens code +Code and documentation from the development of this project is available for developers and AI enthusiasts. By making code freely available it is hoped that new search engines can be developed and that more organisations will engage with using AI. + +With thanks to colleagues at NHSX Analytics Unit, Accelerated Capability Environment (ACE) and Naimuri - the ACE community member selected via competition - we undertook the following open approach over 12 weeks and 6 sprints: + +1. NHSX Analytics Unit colleagues identified a number of openly licensed data sets from the healthcare world - forming the core of the proof-of-concept. +2. Partners at Naimuri built the platform on the community edition of ElasticSearch, a popular system for full-text search engines. +3. The team employed various Natural Language Processing (NLP) techniques in order to go further than providing simple keyword searches. +4. The AI was trained to better understand semantic similarity in searches, i.e. to suggest results for “smoking” alongside “cancer”, using vector analysis and cosine similarity (a minimal sketch of this technique appears at the end of this case study). +5. User feedback was built into the AI training, so the more users who give “thumbs up” (or down) to suggested results, the better the results will become over time. +6. Fuzzy matching was implemented to help with typos and misspellings. +7. A recommendation engine was developed to suggest related, but not searched-for, data sets. +8. Finally, around nine published NHS acronym and jargon busters were used to help unpick things like “A&E” and “IP”. + +The team brought in metadata from NHS Digital, NHS England and Improvement, Public Health England, Office for National Statistics and Health Data Research UK in order to prove that Data Lens could onboard from different organisations in different ways: APIs, scraping, even manual metadata files. + +> Working with the AI Lab Skunkworks on this project was Agile in the truest sense of the word. We pitched an idea, had funding approved and were up and running in a very short amount of time. I sincerely hope it can be taken forward into production to help its users get value from the wealth of data and information that is produced by the Health and Social Care sector. +– Paul Ross, Data Engineer, NHSX Analytics Unit. + +> This project has shown the value of better data access using intelligent, domain specific search. The approach of creating a proof of concept and the freedom it's given us to apply advanced technology has really added value. +– Kieran Moran, Naimuri. + +> At ACE our overriding mission is to keep the public safe, so we welcomed the opportunity to work with NHSX, and help them tackle the challenges they and the wider healthcare sector face. +– Simon Christoforato, CEO of ACE’s Vivace supplier community.
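+To illustrate the cosine-similarity ranking referenced in step 4 above, here is a minimal, self-contained sketch using TF-IDF vectors. It is illustrative only: the real Data Lens was built on ElasticSearch with richer NLP models, and TF-IDF captures keyword overlap rather than true semantic similarity (surfacing “cancer” for “smoking” needs dense embeddings), but the cosine-similarity ranking step is the same. The catalogue entries below are invented.
+
+```python
+from sklearn.feature_extraction.text import TfidfVectorizer
+from sklearn.metrics.pairwise import cosine_similarity
+
+# Hypothetical catalogue entries (titles and descriptions flattened to strings).
+catalogue = [
+    "Smoking prevalence among adults in England",
+    "Cancer registrations and survival statistics",
+    "A&E attendances and emergency admissions",
+    "Inpatient elective waiting times by specialty",
+]
+
+vectoriser = TfidfVectorizer(stop_words="english")
+doc_vectors = vectoriser.fit_transform(catalogue)
+
+def search(query: str, top_k: int = 2) -> list[tuple[float, str]]:
+    """Rank catalogue entries by cosine similarity to the query."""
+    query_vector = vectoriser.transform([query])
+    scores = cosine_similarity(query_vector, doc_vectors).ravel()
+    ranked = sorted(zip(scores, catalogue), reverse=True)
+    return ranked[:top_k]
+
+for score, title in search("smoking"):
+    print(f"{score:.2f}  {title}")
+```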
+ + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) +# \ No newline at end of file diff --git a/docs/our_work/casestudy-long-stay.md b/docs/our_work/casestudy-long-stay.md new file mode 100644 index 00000000..546ef4f3 --- /dev/null +++ b/docs/our_work/casestudy-long-stay.md @@ -0,0 +1,95 @@ +--- +title: 'Using machine learning to identify patients at risk of long term hospital stays' +summary: 'Machine learning using historical data from Gloucestershire Hospitals NHS Foundation Trust to predict how long a patient will stay in hospital upon admission.' +category: 'CaseStudies' +origin: 'Skunkworks' +tags: ['length of stay','classification','deep learning', 'neural networks'] +--- + +## Info +This is a backup of the case study published [here](https://transform.england.nhs.uk/ai-lab/explore-all-resources/develop-ai/using-machine-learning-to-identify-patients-at-risk-of-long-term-hospital-stays/) on the NHS England Transformation Directorate website. + +## Case Study +Studies show that many patients who stay in hospital for extended periods experience negative outcomes. According to the Journal of Gerontology, “Ten days of bed rest in hospital leads to the equivalent of 10 years ageing in the muscles of people over 80” (Kortebein et al 2004). Long stays also create problems for busy hospitals because beds stay occupied for longer and require extra resources to manage. In this context, a long stay is regarded as any stay longer than 21 days. + +These longer lengths of stay are often avoidable: one study showed that 60% of immobile older patients had no medical reason that required bed rest (Graf 2006, American Journal of Nursing). + +The Business Intelligence team at Gloucestershire Hospitals NHS Foundation Trust (GHFT), supported by GHFT's CIO and senior clinical leaders, developed an idea to use AI to address the issue of “long stayers” and was awarded a rapid feasibility project as part of the Skunkworks’ third round of competition at the NHS AI Lab. + +> Long stayers at the Gloucestershire Hospitals Trust occupy an average of 278 beds per day, which is around 4% of all admissions but accounts for 34% of bed use. The ability to identify and intervene early could make a real difference to these patients. + +### Overview +Sarah Hammond, Gloucestershire Hospitals’ Associate CIO, and Joe Green, Deputy Head of Business Intelligence, report that more than 30% of bed days in all of the Gloucestershire Trust’s acute hospitals are used by long stayers. It has been observed that long stayers have an 11% mortality rate during their hospital stay (compared to 5% of all admissions). Of these long stayers, 23% became unwell after being deemed medically fit for discharge (compared to 1% overall). + +Long stayers at the Gloucestershire Hospitals Trust occupy an average of 278 beds per day, which is around 4% of all admissions but accounts for 34% of bed use. The ability to identify and intervene early could make a real difference to these patients. + +After the completion of a rigorous Data Protection Impact Assessment (DPIA) and resulting Data Protection Agreement (DPA), the “longstayers” project had access to a good volume of data from over 7 years’ worth of admissions information.
By analysing this data using machine learning methods (computer algorithms that learn from data), the team hoped to understand whether it is possible to identify the patients most at risk of becoming “longstayers”. + +The challenge was to discover whether AI can provide a useful prediction solution. The follow-up question is whether that solution could be effectively and safely put into production and the results shared for wider use. + +The project sought to: + +- predict which patients are most likely to become "long stayers" (stay in hospital for more than 21 days) +- provide hospital staff with a prototype tool that provides a visible long stay risk score on every electronic patient record as soon as a new patient arrives at the hospital +- maximise the learning opportunities from the project by working in the open and making the resulting source code available for continued experimentation by other researchers and developers +- contribute to potential better outcomes for people staying in hospitals around the country. + + +### What we did +Over a 12-week period, the project investigated an experimental approach to using AI to better understand the admissions data at Gloucestershire Hospitals. This rapid innovation was intended to explore a proof of concept that could help predict which patients will become long stayers. + +Data from more than 1 million admissions, stored in a secure SQL warehouse, was joined to an electronic patient record system to link key patient demographic indicators to each admission. The data available included age, sex, deprivation level, geography, and admission history such as a patient’s presenting complaints, their socio-economic status, emergency department investigations and care home admissions. + +This provided a rich and ready dataset for the project using data that is owned and securely held by the trust. This therefore allowed the data to be used for research purposes when anonymised appropriately. + +The team: + +- worked with clinicians to understand the problem and possible uses for the tool +- developed an historical dataset for the project using patient admissions information +- opted to progress a machine learning Generative Adversarial Network (GAN) solution using an innovative AI algorithm +- explained which factors contributed to the likelihood of a patient becoming a long-stayer (by building a parametric risk model) +- tested how well the model worked against the expectations from the initial data analysis. + +### Why a GAN solution was chosen +GANs, first designed by [Ian Goodfellow](https://arxiv.org/pdf/1406.2661.pdf), involve simultaneously training two models: a generative model that captures the data distribution, and a discriminative model that estimates the probability that a sample came from the training data. + +A GAN was chosen because using one to predict length of stay from a structured dataset offered a novel and experimental approach to a problem that might more commonly be solved with models such as gradient-boosted trees. + +There are many types of GANs; for this project, a deep convolutional generative adversarial network (DC-GAN) was selected to predict length of stay. This model architecture (which broadly followed the [approach outlined here](https://arxiv.org/abs/1511.06434)) allowed for flexibility with applications over a wide range of data processing problems and the ability to apply a convolutional methodology to mixed data.
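+As a rough illustration of the kind of convolutional discriminator a DC-GAN pairs with its generator, here is a minimal PyTorch sketch adapted to a 1D “row” of tabular features. The layer sizes, and the idea of treating an admission record as a one-channel 1D signal, are assumptions made for illustration; they are not the project's architecture.
+
+```python
+import torch
+import torch.nn as nn
+
+class Discriminator(nn.Module):
+    """Scores how likely a feature vector is to come from the real admissions data."""
+
+    def __init__(self, n_features: int = 32):
+        super().__init__()
+        self.net = nn.Sequential(
+            # Treat the admission record as a 1-channel signal of length n_features.
+            nn.Conv1d(1, 16, kernel_size=4, stride=2, padding=1),   # -> (16, n/2)
+            nn.LeakyReLU(0.2),
+            nn.Conv1d(16, 32, kernel_size=4, stride=2, padding=1),  # -> (32, n/4)
+            nn.BatchNorm1d(32),
+            nn.LeakyReLU(0.2),
+            nn.Flatten(),
+            nn.Linear(32 * (n_features // 4), 1),
+            nn.Sigmoid(),  # probability that the sample is real
+        )
+
+    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        return self.net(x)
+
+batch = torch.randn(8, 1, 32)          # 8 synthetic admission records, 32 features each
+print(Discriminator(32)(batch).shape)  # torch.Size([8, 1])
+```
+
+As described below, it was this discriminator half of the pairing that, combined with learned heuristics, proved most useful in the project.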
+ +Through training it was clear that the discriminator portion of the DC-GAN was most effective and that, combined with heuristics that were learned during an initial data-analysis phase, a new hybrid model could be used. Once the DC-GAN model had produced the length of stay estimate, this was fed into a risk model that used a cumulative distribution function (CDF) to produce a risk score for each patient’s likelihood of becoming a long-term hospital stayer. The final output was a risk score for that patient, from 1 to 5, with 5 being the highest risk. + +### The outcomes and lessons learned +The project succeeded in achieving: + +- a “long stay risk” model that successfully detects two-thirds of long stayers and stratifies the risk in a useful way +- identification of the factors that were predictive of long-stayers (specific to Gloucestershire) +- a model performance accuracy within 1% at all stages when the model was tested on unseen data before, during and after the 2020 COVID waves. + +This proof-of-concept risk identifier will enable hospital staff to look closely at whether the patients identified as having a high probability of a long stay could benefit from earlier interventions and changes to their care pathway. + +The team at Gloucestershire Hospitals is keen to heed the lessons learned by working together, and to take the next steps to understand what technical, compliance and logistical requirements are necessary to adopt the model. + +> Our aim was to develop a proof of concept for a “long stay risk” score algorithm. Would it be possible to predict a patient’s length of stay the minute they arrive at the front door? The initial “long stay risk” model successfully detects two-thirds of long stayers at time of arrival, or very soon after. +– Joe Green, Deputy Head of Business Intelligence, Gloucestershire Hospitals Trust + +The results have been very positive. It is hoped the information will allow trust staff to carefully tailor a patient’s care pathway accordingly. Based on a well-established evidence base showing the negative impacts of unnecessary long stays, the AI tool has the potential to lead to a decrease in the length of hospital stays overall, with corresponding reductions in patient deterioration and mortality during admission, and to reduced readmission rates. + +### What next? +Gloucestershire Hospitals’ Business Intelligence team reports: “We will soon begin taking the model output tables, which run every 15 minutes, into our electronic patient record system to test and evaluate with clinicians.” + +The Skunkworks team will continue to support Gloucestershire Hospitals with how to further test and evaluate the model, and adopt it safely into wider hospital use. The team is also preparing to support other organisations considering a similar approach. + +### Who was involved? +This project is a collaboration between NHSX, [Gloucestershire Hospitals NHS Foundation Trust](https://www.gloshospitals.nhs.uk/), [Polygeist](https://polygei.st/) and the Home Office’s [Accelerated Capability Environment](https://www.gov.uk/government/groups/accelerated-capability-environment-ace) (ACE). The AI Lab Skunkworks exists within the NHS AI Lab to support the health and care community to rapidly progress ideas from the conceptual stage to a proof of concept.
+ +The NHS AI Lab is working with the Home Office programme: Accelerated Capability Environment (ACE) to develop some of its skunkworks projects, providing access to a large pool of talented and experienced suppliers who pitch their own vision for the project. + +Accelerated Capability Environment (ACE) is part of the Homeland Security Group within the Home Office. It provides access to more than 250 organisations from across industry, academia and the third sector who collaborate to bring the right blend of capabilities to a given challenge. Most of these are small and medium-sized enterprises (SMEs) offering cutting-edge specialist expertise. + +ACE is designed to bring innovation at pace, accelerating the process from defining a problem to developing a solution and delivering practical impact to just 10 to 12 weeks. + +Polygeist, a software company specialising in state-scale analytics, builds world-leading AI technology for defence, national security, law enforcement, and healthcare customers. The team for this project produced a live system generating insights, from a standing start, in 12 weeks. + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) +# \ No newline at end of file diff --git a/docs/our_work/casestudy-nhs-resolution.md b/docs/our_work/casestudy-nhs-resolution.md new file mode 100644 index 00000000..4987d8c5 --- /dev/null +++ b/docs/our_work/casestudy-nhs-resolution.md @@ -0,0 +1,92 @@ +--- +title: 'Using AI to support NHS Resolution with negligence claims prediction' +summary: 'Exploring how AI might help the NHS to understand and identify risk, preventing harm and saving valuable resources.' +category: 'CaseStudies' +origin: 'Skunkworks' +tags: ['classification'] +--- + +## Info +This is a backup of the case study published [here](https://transform.england.nhs.uk/ai-lab/explore-all-resources/understand-ai/using-ai-to-support-nhs-resolution-with-negligence-claims-prediction/) on the NHS England Transformation Directorate website. + +## Case Study +NHS trusts undertake an enormous number of activities each year (around 240 million in the year 2018 to 2019 according to the [Kings Fund](https://www.kingsfund.org.uk/projects/nhs-in-a-nutshell/NHS-activity)). The vast majority of people receive safe care; however, [NHS Resolution](https://resolution.nhs.uk/) received over 15,000 claims for compensation last year on behalf of the NHS in England, including hospital trusts and GPs. Although many (almost half the claims in 2018 to 2019) get settled without damages, NHS Resolution figures show that claims can cost the health and care system up to £2.6 billion a year. + +In early 2021, the NHS AI Lab Skunkworks team started a rapid feasibility study with NHS Resolution to investigate whether it is possible to use machine learning to predict the number of claims a trust is likely to receive and learn what drives them in order to improve safety for patients. + + + + + +### Overview +NHS Resolution provides expertise to the NHS on resolving concerns and disputes fairly, and shares learning to enable organisation-wide improvement and preserve valuable resources for patient care. + +The organisation holds a wealth of historic data around claims, giving insight and valuable data around the causes and impacts of harm.
+ +Over a 10-week period, the project sought to: + +- improve patient safety by reducing the time lag between incidents and the detection of claims, giving the opportunity for prevention +- create a risk profile for individual trusts by analysing a variety of NHS data +- better understand future costs to the NHS by predicting the volume of incidents that lead to a claim +- help reduce new claims arising by understanding the factors that drive the likelihood of a claim. + +The NHS AI Lab Skunkworks team worked with NHS Resolution and Accelerated Capability Environment (ACE) suppliers to provide a rapid delivery plan to: + +- develop a machine learning model that could predict claims +- produce a code pipeline that could prepare input data, then train and run the chosen model. + +### How the machine learning was developed +The project aimed to prove the value of machine learning in determining insights from the available data. Automated machine learning was used to run repeated processes on the available data in order to select the AI models that uncovered the most relevant information. + +NHS Resolution provided multiple sources of data from the healthcare system spanning two years. This included claims, incidents, workforce, maternity, staff survey and publicly available population data sources. + +The team applied automated machine learning methods to rapidly test multiple approaches. This testing: + +- looked for correlations, or similarities, between the datasets and investigated effects associated with time lag +- used different modelling approaches including an XGBoost model and a Bayesian Hierarchical Model (BHM) to reach an end result that could both provide accurate predictions and actionable explanations +- identified the “decision tree” model as the best performing and most cost-effective to train because of its ability to cope with varied data quality (varied data types and missing values) +- included a focus on “model explainability” - the need to understand how the model is making predictions so that we can explain what drives negligence claims. + +#### Constraints + +There were a number of constraints that impacted the project including a lack of completeness and consistency across datasets. The data understandably showed a strong association between the number of claims and the size of the trust population, and it was necessary to eliminate the “size effect” so as not to hide other effects present in the data (one common way of doing this is sketched at the end of this section). + +It was also not possible to try to predict the rate of claims per specialty (for instance the rate of claims specific to maternity), only by trust, because the data was not complete or granular enough to do so. + +#### Data security +The data used for this testing process was pseudonymised by replacing any personally identifiable information with artificial data (pseudonyms). It was responsibly managed in accordance with the General Data Protection Regulation (GDPR). + +The project made use of ACE’s PodDev environment for the data. Using this bespoke environment meant that sensitive clinical data could be included securely and destroyed at the end of the project in a way that met the requirements of the NHS Resolution data controller. + +### Impacts and outcomes +The completion of the “proof of concept” feasibility study led to some significant successes. Although claims prediction was not perfected, the results significantly outperformed baseline models when predicting rates of claims, by trust and by month.
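+The project itself used automated machine learning over approaches such as XGBoost, Bayesian hierarchical models and decision trees. As a simplified illustration of how a “size effect” can be eliminated, the sketch below fits a Poisson model with a log-exposure offset, so that the coefficients explain claim rates rather than raw counts. This is a common statistical device, not the project's method, and all data here is synthetic.
+
+```python
+import numpy as np
+import statsmodels.api as sm
+
+rng = np.random.default_rng(42)
+n_trusts = 200
+
+# Synthetic trust-level data: bigger trusts see more activity, hence more claims.
+activity = rng.integers(50_000, 500_000, n_trusts)   # annual activity (exposure)
+incident_rate = rng.uniform(0.5, 2.0, n_trusts)      # an invented explanatory feature
+true_claim_rate = 1e-5 * incident_rate               # claims per unit of activity
+claims = rng.poisson(activity * true_claim_rate)
+
+# Model claims ~ Poisson with offset log(activity): the coefficients now act on
+# the per-activity rate, so trust size no longer dominates every other effect.
+X = sm.add_constant(incident_rate)
+model = sm.GLM(claims, X, family=sm.families.Poisson(), offset=np.log(activity))
+result = model.fit()
+print(result.summary().tables[1])
+```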
+ +Whilst useful for hospital trusts, predicting the rate of claims will not, on its own, improve patient safety without further development of this project. + +The project has provided an informative view of the data landscape across the wider health service, and its relevance to understanding medical negligence claims. + +As a result of the work, it is possible to gain the following insights into what drives rates of claims, from wider health service data: + +- The datasets for incidents, specialty, referral-to-treatment and hospital activities have the largest potential impact on the predicted rate of claims. +- Which specialties are present in a trust tends to correlate with the predicted rate of claims. +- The number of incidents, and specifically the percentage of incidents resulting in severe harm or death of the patient, also has a correlation with the predicted rate of claims. +- Longer waiting times also appear to correlate with the predicted rate of claims, although this varies with specialty. + +The results are an indication that prediction of claims is possible but that large volumes of quality data are required to take the challenge further. + +> NHSX Skunkworks was a perfect way for us to kickstart our AI journey. We were able to work with experts who guided us through the process and help us every step of the way. We’ve learned so much and are keen to take this forward for further investigation. +– Niamh McKenna, Chief Information Officer, NHS Resolution + +> This project highlighted not just that the data could broadly be used for better-than-baseline forecasts, but it also suggested a number of actionable ideas on how to improve data quality, something we really care about because without good data there cannot be good AI. +– Giuseppe Sollazzo, Head of Skunkworks, NHS AI Lab, NHSX + +### What next? +The resulting code base and methodology are a starting point for further machine learning exercises. The proof of concept has demonstrated how machine learning could increase NHS Resolution’s understanding of medical negligence claims. + +In order to demonstrate a reduction in patient harm, more work will need to be done to select and engineer new features. + +Ideally, improving the consistency of the claim reporting methodology across trusts would significantly improve the predictive power of the negligence claims data. + +Future improvements to both the data and the model would provide a more accurate forecasting model and more insightful explanations. + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.)
+# \ No newline at end of file diff --git a/docs/our_work/casestudy-nursing-placement-optimisation.md b/docs/our_work/casestudy-nursing-placement-optimisation.md new file mode 100644 index 00000000..e2dbac58 --- /dev/null +++ b/docs/our_work/casestudy-nursing-placement-optimisation.md @@ -0,0 +1,139 @@ +--- +title: 'Using AI to find optimal placement schedules for nursing students' +summary: 'Assessing the development of a genetic algorithm and tool that automatically generates student nurse placement schedules' +category: 'CaseStudies' +origin: 'Skunkworks' +tags: ['genetic algorithm', 'optimisation', 'nursing'] +--- + +## Info +This is a backup of the case study published [here](https://transform.england.nhs.uk/ai-lab/explore-all-resources/develop-ai/using-ai-to-find-optimal-placement-schedules-for-nursing-students/) on the NHS England Transformation Directorate website. + +## Case Study +Students training to become nurses undertake placements at teaching hospitals as part of their studies. These placements are designed to help students experience and learn practical skills they can apply in their roles once they are qualified. In general, these placement schedules are produced by hand, and what may seem a relatively simple task quickly becomes complex once you consider the different constraints involved. + + + + +**The challenge** +Can a tool or approach be developed which automatically generates student nurse placement schedules that adhere to the requirements and constraints of the different stakeholders, while additionally providing a more diverse range of placements for the students? + + +### Overview +Any placement allocation process involves three stakeholders - the Trust, which coordinates and hosts the students on placement; the university, where the students undertake classroom study and which awards the final degree; and the students themselves. + +There is a range of contexts which require consideration. For example, the universities will have set dates when placements should take place, and these are usually particular to each university. Some universities will require students to visit specific types of wards over the course of their placements, while others may be more flexible. + +Similarly, the Trust has requirements that must be met. Each ward at a hospital has a maximum student hosting capacity, and this can vary depending on the student’s current year of study. Wards must also keep to internal Education Audit Standards to ensure they can provide necessary support to students, and this must be checked regularly. + +When these requirements are combined with the fact that Trusts can have relationships with multiple universities, placing students at tens of wards at a time, it is clear such complex schedules could be time-consuming for a placement coordinator to produce. It was calculated that the generation of placements for a single year group of students from one university can take each Trust anywhere between five and ten hours. Given multiple universities and multiple intakes, the time spent scheduling placements can easily stretch into hundreds of hours for each Trust, each year. + +Presently, the time limitations and the lack of quantification inherent in the manual process mean it is not feasible for placements to cover a range of disciplines and specialities, yet this is something universities would like to support to enhance the skills development of nursing students on placement.
+ +### What we did +We worked with Imperial College Healthcare NHS Trust, in conjunction with North West London CCGs, to undertake this project to develop an AI-driven solution to the problem of placing students. Imperial hosts students from seven universities, placing them across three hospitals, totalling more than 80 wards and placement settings. Imperial College Healthcare NHS Trust is one of the largest acute trusts in the UK ([according to the Kings Fund](https://www.kingsfund.org.uk/blog/2020/09/biggest-hospital-england)), so would provide substantial evidence as to whether this solution was viable. The data used was anonymised as this task could be undertaken without needing to provide any details on a personal level. Instead, information about the hospitals was taken, and randomly generated student profiles were created containing examples of the information the Trust would have about each student. + +The task posed here is one of optimisation, for which there are many different approaches. The method selected was a Genetic Algorithm, where the best version of something is found through a process which is like the evolutionary process seen in nature. The algorithm runs as follows: + +- Create a population of objects (in this case, it was a population of potential schedules for all students) + +- Apply ‘mutations’ and produce ‘offspring’ from this population of objects + - Mutations were produced by randomly changing the allocated ward for a random placement + - Offspring were produced by combining schedules, e.g. taking the front half of one schedule and the back half of another and sticking them together to produce a hybrid + +- Put the ‘mutated’ and ‘offspring’ objects back into the population, and score the population + + - The scoring part is key, as this is what dictates what a ‘good’ schedule looks like. This is where you define what absolutely cannot be in a schedule, and what you’d like a good schedule to have + +- Repeat the process hundreds of times until you have found a schedule which meets all your needs. + +![Genetic Algorithm Illustration](../images/Genetic_algorithm.width-1534.png) +> A diagram showing how the genetic algorithm works. + +Genetic Algorithms will not produce a perfect, optimal solution, because the problem space is explored randomly. This means, while all your requirements will be met, the most varied selection of placements might not be found because it would take too long for the Genetic Algorithm to find that solution. However, a more varied selection of placements will be allocated than at the beginning, or than under the current manual process. + +The chosen approach was to use such a Genetic Algorithm, run multiple times to produce multiple different schedule options, which are then presented back to the placement coordinator for them to cast their expert eye over and choose the best option. Alongside the schedules themselves, the scores which the Genetic Algorithm uses can be presented so that different schedules can be compared to each other. Further, this score can be broken down into components, so the exact strengths and weaknesses of each schedule can be seen, allowing the placement coordinator to evaluate which suits their Trust and its students. + +#### Deciding how to score schedules +The scoring of schedules is arguably the most important part of a Genetic Algorithm because it defines what a ‘good’ schedule and ‘bad’ schedule might look like.
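+Before looking at the scoring components in detail, here is a minimal sketch of the evolutionary loop described above, in plain Python. The schedule encoding (one ward index per placement slot) and the toy scoring function are illustrative assumptions, not the open-source tool's implementation, which scores against the must-have and nice-to-have components discussed next.
+
+```python
+import random
+
+N_WARDS = 10          # hypothetical number of wards
+N_PLACEMENTS = 20     # placement slots to fill across all students
+
+def score(schedule: list[int]) -> float:
+    """Toy fitness: reward using many unique wards (a 'nice-to-have')."""
+    return len(set(schedule))
+
+def mutate(schedule: list[int]) -> list[int]:
+    """Randomly change the allocated ward for one random placement."""
+    child = schedule.copy()
+    child[random.randrange(N_PLACEMENTS)] = random.randrange(N_WARDS)
+    return child
+
+def crossover(a: list[int], b: list[int]) -> list[int]:
+    """Take the front half of one schedule and the back half of another."""
+    cut = N_PLACEMENTS // 2
+    return a[:cut] + b[cut:]
+
+# Create an initial population of random schedules.
+population = [[random.randrange(N_WARDS) for _ in range(N_PLACEMENTS)]
+              for _ in range(50)]
+
+for generation in range(200):
+    # Apply mutations and produce offspring, then keep the fittest schedules.
+    offspring = [mutate(random.choice(population)) for _ in range(25)]
+    offspring += [crossover(*random.sample(population, 2)) for _ in range(25)]
+    population = sorted(population + offspring, key=score, reverse=True)[:50]
+
+print("best score:", score(population[0]))
+```
+
+Running the loop several times with different random seeds yields the multiple candidate schedules that the tool presents back to the placement coordinator.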
The scoring components can be divided into two categories: ‘must-haves’ and ‘nice-to-haves’. + +Must-have scoring components are things a schedule must adhere to; otherwise it could not be used. In the current process, the placement coordinator would ensure any schedule produced meets these criteria: + +- Each ward and placement setting does not have more students allocated than their capacity allows. This includes both overall and for each year group. Year groups have different capacities because each year requires a different level of support and management. + +- Any ward a student is placed on is aligned to the student’s COVID risk level. Some wards will have a higher risk of COVID exposure than others, and some students will be at a higher risk of COVID than others. + +- Students must only be on one placement at a time. While this is an obvious statement to a human, this must be included in the scoring of the schedules, so the algorithm understands this is not acceptable. + +- Students must have a specific ward or placement setting allocated for each placement. Similar to the previous point, this feels like an obvious criterion, but must be enforced to make sure the final schedule produced is useful. + +Meanwhile, nice-to-have scoring components focus more on trying to ensure diversity of placements. These are the things the Genetic Algorithm tries to maximise (or increase as much as possible): + +- Number of unique wards. This helps to signal to the algorithm that, as much as possible, students should be placed at a ward where they have not been before. + +- Number of unique specialities. To build upon unique wards, this aims to send students to gain experience in a speciality in which they have not previously been placed. + +- Placement Capacity Utilisation percentage. This helps to address cases where wards which can host lots of students are chosen more commonly, because it is known they have lots of capacity. By including this component, the algorithm tries to ensure wards and placement settings are as full as possible across the board, spreading the placement load across the Trust. + +- Focus on specific specialisms. The tool has four scoring components which promote inclusion of specialities - Medical, Surgical, Critical Care and Community wards. This helps to ensure placements include specific topics or types of wards. As this is not always necessary, these can be turned on and off, and are turned off by default in the open-source code. + +#### The user interface +A simple user interface was produced using Streamlit, a framework for building interactive browser-based applications, allowing the user to adjust elements of the placement optimisation tool, and to view progress as the tool produces schedules. + +Once all schedules are produced, a comparison table is both displayed and saved, summarising the various scoring components and helping the user begin to understand which schedule of those produced might be the best. + +![User interface](../images/User_interface.width-1534.png) +> The User Interface that is seen upon launching the tool. + + +#### Automating the report +The final benefit of the tool is that insight into the schedules can be easily obtained as the schedules are generated electronically, meaning they can be easily reformatted to provide a different view of the information.
As an example, it is straightforward to change a schedule from a format where the placements on each ward are shown, to a format where the placements for each student are shown. + +This simplicity extends to being able to calculate hours on placement each week, which is currently a mandatory reporting ask for Trusts. We hear that a substantial amount of time is spent by placement coordinators manually analysing placement schedules to produce reporting around the hours of placement time the Trust has supported each week, from each university. + +As part of the tool, the final schedule produced is reported in several different ways: + +- From the student’s perspective, showing what placement locations they have been allocated for each week of their studies +- From the ward’s perspective, showing what students they have on placement with them and when +- From a ward capacity utilisation perspective, showing the placement coordinator where there is spare capacity if a placement needs to be manually reallocated +- From a placement hours perspective, providing various summaries of hours across wards, university cohorts and both weekly and quarterly summaries for the mandatory reporting required of the Trust. + + +![Example schedule](../images/Example_using_fake_data.width-1534.png) +> An example schedule produced using fake data. + + +### Outcomes and lessons learned +The resulting code, released as open source on our GitHub ([available to anyone to re-use](https://github.com/nhsx/skunkworks-nursing-placement-schedule-optimisation)), enables users to: + +- Load information about wards, placements, and students into the tool + +- Run the genetic algorithm to produce up to 10 schedules + +- See score comparisons across produced schedules, allowing quantitative comparison + +- Automate mandatory reporting + +- Where relevant technical expertise is available, extend and update the scoring metrics to reflect exactly what the user would be looking for. + +The tool is estimated to save hundreds of hours spent constructing and analysing schedules for nursing students. This time can be spent much more effectively elsewhere, thereby freeing up placement coordinators to utilise their expertise across the Trust. Additionally, the tool produces improved schedules by consistently taking into account the wards and specialities that a student has already undertaken a placement within. + +It should be noted this tool aims to support placement coordinators, as the process of producing placement schedules isn’t just fitting together all the pieces. Bespoke requests will come in from students, requiring tweaking of placement schedules to accommodate a wide range of circumstances. This tool aims to provide a high-quality baseline from which placement coordinators can construct schedules which meet every requirement, from all over the Trust. + +> We can't fix the nursing shortage without training more nurses and for that, we need to have (the right) clinical placements available. This tool will not only help us to allocate placements more efficiently and effectively, but it will also free up valuable time for the practice learning facilitators to focus on teaching and professional development for students. Ultimately more students will be able to get the placements and training they need. +– Hai Lin Leung, Programme Manager, North West London CCG + +Finally, a successful element of this project has been the knowledge sharing and mutual upskilling between the two parties.
AI Lab Skunkworks helped Imperial College Healthcare Trust identify and implement a novel AI-led approach which provided a solution they may otherwise not have been able to explore. Meanwhile, Imperial College Healthcare Trust shared their expertise in how the placement allocation process works, particularly what does and does not work well. This was useful underlying information the AI Lab Skunkworks team hopes to share more broadly across the range of projects we work on. + +### What’s next? +The NHS AI Lab Skunkworks team has released the code from the project on GitHub to demonstrate how the tool might work, using generated fake data. + +A pilot is planned to evaluate how well the tool works in practice, working with Imperial College Healthcare Trust during a future student intake. Following this pilot, and an evaluation exercise, improvements will be identified and implemented in a subsequent phase of work. + +### Who was involved? +This project was a collaboration between the NHS AI Lab Skunkworks, within the Transformation Directorate at NHS England, and Imperial College Healthcare NHS Trust. + +NHS AI Lab Skunkworks is a team of data scientists, engineers and project leaders who support the health and social care community to rapidly progress ideas from the conceptual stage to a proof of concept. + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) +# \ No newline at end of file diff --git a/docs/our_work/casestudy-opensafely.md b/docs/our_work/casestudy-opensafely.md new file mode 100644 index 00000000..89eb6561 --- /dev/null +++ b/docs/our_work/casestudy-opensafely.md @@ -0,0 +1,159 @@ +--- +title: 'Working with a Trusted Research Environment (TRE)' +summary: 'An exploration of OpenSAFELY' +category: 'CaseStudies' +origin: 'Skunkworks' +tags: ['opensafely','TRE'] +--- + +## Info +This is a backup of the case study published [here]() on the NHS England Transformation Directorate website. + +## Case Study +Trusted research environments (TREs) are secure data environments that allow analysts and researchers to undertake in-depth analysis on rich, joined-up datasets without seeing any identifiable information. Data is held within a secure server and does not leave that server. Following a recommendation from the [Goldacre review](https://www.gov.uk/government/publications/better-broader-safer-using-health-data-for-research-and-analysis), TREs form a key part of the [Data saves lives: reshaping health and social care with data policy paper](https://www.gov.uk/government/publications/data-saves-lives-reshaping-health-and-social-care-with-data/data-saves-lives-reshaping-health-and-social-care-with-data#empowering-researchers-with-the-data-they-need-to-develop-life-changing-treatments-diagnostics-models-of-care-and-insights). + + +**The challenge** +Using TREs for analytical projects is still a new concept to many analysts. The NHS AI Lab Skunkworks team and NHS England’s Health Inequalities and Evaluation Analytics Team embarked on a partnership project to gain experience of using a TRE to answer key questions for a defined project. They used a TRE called [OpenSAFELY](https://www.opensafely.org/), an open-source software platform for analysis of electronic health care record data for COVID-19 related research.
+ + +OpenSAFELY gives trusted researchers restricted levels of access to the server to run analysis on real data and obtain aggregate results, without having sight of the patient-level data. Aggregate results are checked to ensure there are no disclosure risks before being released from the server. This highly secure way of working enables researchers to have access to large and sensitive datasets in a safe manner. + + + +This case study outlines the team’s experiences of working through the OpenSAFELY platform to understand the impact of the NHS @Home programme. + + + +### The NHS @Home programme + +During the COVID-19 pandemic there was a necessity to safely deliver high-quality care into patients’ homes where appropriate. As part of its response to the pandemic, NHS England brought forward initiatives to help patients better self-manage their health and care at home under the [NHS @Home programme](https://www.england.nhs.uk/nhs-at-home/). NHS @Home is a nationally led programme of work providing better connected, more [personalised care](https://www.england.nhs.uk/personalisedcare/) in people’s homes, including care homes. + +Now, two years on from the start of the pandemic, the NHS @Home team wanted to understand the reach and coverage of their initiatives to inform future planning. In particular, the team wanted to understand the reach of the following initiatives: + +- [Blood pressure monitoring](https://www.england.nhs.uk/ourwork/clinical-policy/cvd/home-blood-pressure-monitoring/) +- [Covid Oximetry](https://www.england.nhs.uk/nhs-at-home/covid-oximetry-at-home/) +- Proactive care + + +Obtaining this information can be difficult, because access to patient-level data within primary care electronic health records is controlled by strict information governance rules due to the importance of patient privacy. This limits the scope of analysis which can be carried out on primary care data. The two main software systems used in primary care in England are operated by the companies [TPP](https://tpp-uk.com/) and [EMIS](https://www.emishealth.com/). + +Therefore, to answer the study questions, the NHS AI Lab Skunkworks team and NHS England’s Health Inequalities and Evaluation Analytics Team embarked on a partnership project exploring the uptake of NHS @Home interventions during the COVID-19 pandemic, using the OpenSAFELY platform. OpenSAFELY is a TRE which provides access to patient records in the TPP system, and for certain studies also provides access to patient records in the EMIS system. + +### Project aims + +This project had two main aims: + +- To understand the reach and coverage of the NHS @Home programme during the pandemic, specifically looking at blood pressure monitoring, pulse oximetry and proactive care interventions. + +- To understand how to approach an analysis project using OpenSAFELY, including the amount of time and resource required, and whether this platform would be useful for future analyses. + + +Part of the initial project aims was to look at whether coding of uptake of the interventions of interest varied across TPP and EMIS. Unfortunately, OpenSAFELY were unable to provide access to the EMIS system at the time, so the scope was narrowed to TPP only, which still represents a significant dataset of approximately [24 million patient records](https://wellcomeopenresearch.org/articles/7-191/v1).
+ +This case study will focus on the second bullet point, the experience of using OpenSAFELY, as the code for the analysis will be published separately and there is ongoing review and further development of the analysis. + +### What we did + +There are several stages to completing an OpenSAFELY project. These include:  + +- Setting up the project +- The [co-piloting](https://www.bennett.ox.ac.uk/blog/2021/08/opensafely-co-pilot-programme-assisting-users-on-their-opensafely-journey/) period and project development  +- Project wrap-up  + + +#### Setting up the project  + +The initial step was to complete the study access request and receive approval for the project through OpenSAFELY. This included:  + +- Naming the study sponsor, study lead and team who would need to access the server and work on the analysis  +- Stating the purpose and scope of the study and the desired outputs  +- Stating which datasets need to be accessed   +- Providing additional details, including the relevant experience of the researchers and whether the study is research, a service evaluation, or an audit  + + +Once this was granted, the next step for everyone working on the analysis was to successfully apply to become an [accredited researcher](https://www.ons.gov.uk/aboutus/whatwedo/statistics/requestingstatistics/secureresearchservice/becomeanaccreditedresearcher), recognised by the [Office for National Statistics](https://www.ons.gov.uk/), and to start setting up the technical requirements for OpenSAFELY in parallel.  + +The application process to become an accredited researcher will be detailed by OpenSAFELY at the start of the project; [an outline of the process can be found here](https://uksa.statisticsauthority.gov.uk/wp-content/uploads/2019/07/DEA_Accredited_Researcher_Application_Guidance_v1.0.pdf). After filling in the application form, applicants must attend a virtual training session, which walks through best practices for data publication and data privacy. These sessions must be booked in advance and, due to availability, the whole accreditation process could take a few weeks to complete. Hence, it is worth filling out the application form and booking the training as soon as possible before the start of the co-pilot period.  + +For the technical setup, there are two main options for setting up the OpenSAFELY environment:   + +- The first option is to use an online development environment. The online environment recommended by OpenSAFELY is Gitpod, which provides 50 free hours of monthly usage and negates the need for any software other than that required for accessing the TPP server. The Gitpod environment is easy to set up and integrates well with GitHub.  +- If higher levels of monthly usage are envisaged, which may particularly be the case during the more time intensive co-piloting period, it may be preferable to have the required software installed locally. [The required software is](https://docs.opensafely.org/install-intro/):  +    - Git +    - Docker +    - A coding language and IDE (integrated development environment, the software which provides the environment for programming - for this project Python and Visual Studio Code were used)  + +For this project the decision was taken to download the required software locally. This software is not part of the standard setup for NHS England laptops, and higher levels of permissions (e.g. a developer account) are required to install it.
Obtaining the necessary permissions takes time, so to allow swifter progress the project was instead undertaken on cloud-based virtual machines with a data science setup provided by the NHS Transformation Directorate Analytics Unit. This provided a fast, effective and secure method for working on the code locally.  + +Code for all OpenSAFELY projects (regardless of the setup) is stored on GitHub, a website designed for storing and collaborating on code. Instructions for setting up a GitHub account and preparing the necessary computing environment for completion of the project can be found in the OpenSAFELY [Getting Started Guide](https://docs.opensafely.org/getting-started/).  + +Another key component of the technical setup (regardless of whether Gitpod or a local setup is used) is installing the software required to access the TPP server. This is needed to view the results when the analysis code has been run on real data, and is done by connecting to the TPP VPN. Unfortunately, due to the specific settings for the VPN, it was not possible to connect to the VPN via a cloud-based virtual machine. Connecting to the VPN on NHS England laptops was also not possible at the time. To work around this problem, the OpenSAFELY co-pilots accessed the results and completed all the necessary checks prior to releasing them from the TPP server.  + +#### The [co-piloting](https://www.bennett.ox.ac.uk/blog/2021/08/opensafely-co-pilot-programme-assisting-users-on-their-opensafely-journey/) and project development period  + +For the first four weeks of each project OpenSAFELY operate a ‘co-piloting’ period, where an OpenSAFELY ‘co-pilot’ provides an enhanced level of support, with regular meetings, to help researchers understand and implement OpenSAFELY ways of working. The same co-pilot also continues to provide support beyond the co-piloting period. This was found to be a hugely beneficial system; however, ongoing technical difficulties slowed progress during the co-piloting period, as obtaining the correct setup for completing the project, and installing the necessary software, proved challenging.  + +The project team also met for regular review meetings with a small group of collaborators and stakeholders in NHS England in an agile way of working. This group included members with relevant expertise, including analysts, subject experts and data scientists, who contributed knowledge and experience, provided steers for each stage of the analysis and gave feedback on work already undertaken.  + +In addition to gaining an understanding of the NHS @Home interventions and working with OpenSAFELY, the collaborative nature of the project also benefited the team members involved, who learned from each other’s skills and areas of expertise. Senior team members supported the development of junior team members’ skills, in particular:  + +- Coding best practice, such as the benefits of dynamic coding, the use of functions to prevent duplication and reduce the code maintenance burden, and commenting code.  + +- GitHub workflows, including the benefits of raising small pull requests to make code reviews simpler, and the importance of maintaining a good repository structure.  + + +A key part of the process is writing the code for extracting the relevant study cohort and variables for analysis. This code is known as the study definition.
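To make this concrete, the sketch below shows roughly what a study definition might look like using OpenSAFELY's `cohortextractor` library. The population, variables and codelist path here are illustrative assumptions rather than the definitions used in this project. Note the `return_expectations` arguments: they describe what the generated fake data should look like, which matters later when testing code locally.

```python
from cohortextractor import StudyDefinition, codelist_from_csv, patients

# Hypothetical codelist - the file path and column name are placeholders.
bp_monitoring_codes = codelist_from_csv(
    "codelists/example-bp-monitoring.csv", system="snomed", column="code"
)

study = StudyDefinition(
    index_date="2021-01-01",
    # Defaults for the fake data generated when running locally.
    default_expectations={
        "date": {"earliest": "2019-04-01", "latest": "2022-06-30"},
        "rate": "uniform",
        "incidence": 0.5,
    },
    # The study cohort: patients registered with a practice on the index date.
    population=patients.registered_as_of("index_date"),
    age=patients.age_as_of(
        "index_date",
        return_expectations={"rate": "universal", "int": {"distribution": "population_ages"}},
    ),
    sex=patients.sex(
        return_expectations={
            "rate": "universal",
            "category": {"ratios": {"M": 0.49, "F": 0.51}},
        }
    ),
    # Whether the intervention of interest was coded in the week after the index date.
    bp_monitoring=patients.with_these_clinical_events(
        bp_monitoring_codes,
        between=["index_date", "index_date + 7 days"],
        returning="binary_flag",
        return_expectations={"incidence": 0.1},
    ),
)
```

Running the project locally then generates fake data from these expectations, so the extraction logic can be exercised before any code is run against real records.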
In addition to allowing time for resolving technology and setup issues, it is also important to be aware that creating the code, particularly for the study definition, is an iterative process. It is beneficial to clearly define the study population and the variables of interest at the start of the project with the co-pilot, as it may take several iterations to correctly code the method for extracting the study cohort. Care should also be taken to ensure that key demographics align with NHS standards and that definitions within the study definition are applied correctly.   + +Examples of issues which require careful consideration and clarification include: the correct manner for extracting demographic variables, the appropriate inclusion criteria (i.e. what data needs to be present for a patient to be included in the study), and issues relating to redaction of small numbers and rounding principles. Redaction and rounding should be applied within the code. Co-pilots may be able to assist with functions which can apply redaction and rounding automatically, but careful consideration must be given to exactly what redaction and rounding is required in each scenario; a minimal sketch of such a function is shown at the end of this section.   + +Many variables are extracted based on a codelist (a collection of clinical codes that classifies patients as having certain conditions or demographic properties – code systems include SNOMED and CTV3). OpenSAFELY provide a web-based tool, [OpenCodelists](https://www.opencodelists.org/), for creating and managing codelists. The website can be searched to check whether an appropriate codelist already exists, or used to create a new one. The project team is responsible for the codelists used, and it is important to ensure that they are appropriate for the purpose. Clinical or other expert input may be required to ensure this.   + +Currently, for studies looking at multiple time periods, [separate cohorts are extracted for each of the relevant time periods](https://docs.opensafely.org/measures/#extract-the-data) (e.g. one week or one month). For this project separate cohorts were extracted for each week over the time period from April 2019 to June 2022. This can result in the project taking significant time to run on the server. One way to reduce this workload is to put any variables which do not change significantly over time into a ‘static’ study definition and run this cohort extraction only once. OpenSAFELY provide a function to then join the weekly or monthly datasets with the static dataset.  + +Whilst working with data at arm’s length has significant benefits for security and information governance, it does bring additional challenges. OpenSAFELY provide the means to create fake data, which can be used to test the code prior to running on real data; however, the return expectations for the fake data (what it is expected to look like) must be defined by the researcher, and this can lead to a mismatch between the format of the fake data and the real data. Consequently, even if the project runs locally, there may be problems when it is run on the server. Co-pilots and OpenSAFELY tech support are very willing to help, and can be contacted quickly via OpenSAFELY’s dedicated Slack channel (Slack is a messaging app for business, which can also be accessed via a web browser); however, troubleshooting such issues can still be difficult.
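As an illustration of the kind of automatic redaction and rounding function referred to above, here is a minimal pandas sketch. The threshold and rounding base are hypothetical placeholders; the values actually required for a given output should be agreed with the co-pilot.

```python
import pandas as pd

REDACTION_THRESHOLD = 7  # hypothetical: suppress counts at or below this value
ROUND_BASE = 5           # hypothetical: round surviving counts to the nearest 5

def redact_and_round(counts: pd.Series) -> pd.Series:
    """Suppress small counts, then round the remainder, before requesting release."""
    redacted = counts.where(counts > REDACTION_THRESHOLD)  # small counts become NaN
    return (redacted / ROUND_BASE).round() * ROUND_BASE
```

For example, `redact_and_round(pd.Series([3, 12, 26]))` returns `NaN, 10, 25`: the small count is suppressed entirely and the remaining counts are rounded.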
Depending on the size of the dataset being analysed, there is also a risk of exceeding the memory limit on the server. To mitigate this, OpenSAFELY provide suggestions of ways to [improve performance and minimise memory usage](https://docs.opensafely.org/memory-efficient-working/#remove-or-slim-down-large-objects-from-memory), such as considering the datatype for each column of a dataframe.  + +It is beneficial for the co-pilot to check the study definition prior to running on the server. Time should also be allowed for running on the server (due to the potential for technical difficulties) and for obtaining release of results (which requires a full review by OpenSAFELY to ensure there are no disclosure issues).  + +#### Project wrap-up  + +Once all the required results have been obtained, the next steps are to analyse the results and share them internally for further validation and analysis. In line with OpenSAFELY’s transparency principles, the GitHub repository containing the project code will be made public.  + +#### Lessons learned  + +- Be prepared: it’s helpful to have the software in place and to map out study aims and requirements prior to beginning the co-piloting period. Safe researcher accreditation should be obtained as early as possible, as access to the server is dependent on completion of this step.  + + +- Use of the co-piloting period: OpenSAFELY recommend 50-60% of the researchers’ working time be spent on the project during the co-piloting period. This ensures maximum benefit is obtained from the enhanced level of co-pilot support. Co-pilots can help identify appropriate codelists, ensure correct understanding of OpenSAFELY functions and assist with issues such as redaction and rounding scripts.  + +- Technical skills: use of the OpenSAFELY platform requires a fair amount of technical skill, including writing code, using the required software and GitHub, and understanding Git ways of working. It is beneficial to have at least one member of the team who already has some level of technical skill in these areas; if not, some basic training before the start of the project may be required.  + +- Time investment: undertaking projects with OpenSAFELY is time intensive due to the time needed to learn OpenSAFELY ways of working; this should be factored into the planning for a project.  + +- Limitations of the dataset: clinical coding of healthcare interventions is not always consistent, which may limit the usefulness of the data obtained.  + +- Importance of the study definition: correct extraction of the cohort and variables within the study definition requires careful planning. Thorough checking of codelists and study definition code, along with good communication and liaison with the co-pilot, can help with this.  + + +- Benefits of collaboration: collaborating across teams is beneficial for exchanging knowledge and expertise and ensuring the project team has a good skills mix.   + +- Slack channel: making good use of the Slack channel can help to progress the project, as OpenSAFELY support, including tech support, are very willing to help and generally respond quickly.  + +- Server time: consider ways to improve performance and reduce memory usage. Allow time for server downtime and keep in mind dates and times for planned maintenance (speak to your co-pilot or check on Slack).  + +- Checking codelists and definitions: dedicate time to ensure codelists meet the requirements of the project and that definitions of variables within the study definition are correct. Ensure key demographics align with NHS standards.  
+ +- Good coding practice: practices such as using functions to prevent duplication of code reduce the likelihood of errors, reduce the code maintenance burden, make code reviews easier and improve the readability of the code.  + + +### Summary  + +The level of access to a vast amount of primary care data makes this platform worth using for future studies. However, the significant time investment required and the need for a certain level of technical skill within the research team should be factored into the decision to undertake a project through OpenSAFELY. Good planning and preparation, such as having the correct software in place prior to starting the study, are essential to ensure smooth running of the project. The co-piloting scheme is extremely helpful, and it is advisable to make best use of the initial co-piloting period to become familiar with OpenSAFELY ways of working. + + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) +# \ No newline at end of file diff --git a/docs/our_work/casestudy-parkinsons-detection.md b/docs/our_work/casestudy-parkinsons-detection.md new file mode 100644 index 00000000..ac5402ad --- /dev/null +++ b/docs/our_work/casestudy-parkinsons-detection.md @@ -0,0 +1,77 @@ +--- +title: 'Identifying and quantifying Parkinson’s Disease using AI on brain slices' +summary: 'This project developed an approach to enhance the identification of biomarkers which are indicative of Parkinson’s Disease, and explored whether automated identification of Parkinson’s Disease in these slices is possible' +category: 'CaseStudies' +origin: 'Skunkworks' +tags: ['synthetic staining','classification','deep learning', 'pathology', 'neural networks'] +--- + +## Info +This is a backup of the case study published [here](https://transform.england.nhs.uk/ai-lab/explore-all-resources/develop-ai/identifying-and-quantifying-parkinsons-disease-using-ai-on-brain-slices/) on the NHS England Transformation Directorate website. + +## Case Study +Identification of Parkinson’s Disease is carried out by neuropathologists who analyse post-mortem brain slices. This process is highly time intensive and relies on neuropathologists who are highly trained in their field. Introducing automation to this process has the potential to increase the speed at which Parkinson’s Disease can be diagnosed in a brain, as well as freeing up neuropathologists who are otherwise required to spend hours looking at the brain slices themselves. + +**The challenge** +Develop an approach to enhance the identification of biomarkers which are indicative of Parkinson’s Disease, and explore whether automated identification of Parkinson’s Disease in these slices is possible. + + +### Overview +Parkinson’s UK Brain Bank, at Imperial College London, receives brains donated by people who have taken the decision to donate their brain prior to their passing. The Parkinson’s UK Brain Bank is the world's only brain bank solely dedicated to Parkinson's research. Accurate identification of Parkinson’s Disease in post-mortem brain tissue is critical to ensure that the brain is as useful as possible in research studies. These research studies aim to help us understand what causes Parkinson’s Disease and how drugs to treat it can be developed.
+ +Parkinson’s Disease can be diagnosed in a slice of the brain by looking for the presence of a type of protein, which is ‘stained’ (its colour is changed to a more contrasting one) using a chemical. This protein is important, as Lewy Bodies are made up of these proteins, and the presence of Lewy Bodies can be indicative of Parkinson’s Disease. + +Once the staining is undertaken, a digital image of the brain slice can be taken. A number of these images are produced for different parts of the brain, and it is from these images that the neuropathologist produces their diagnosis. + +A significant benefit of working with the Parkinson’s UK Brain Bank was having access to images of a large number of brains, which have been consistently processed and recorded as images. This gave the development of the AI tooling a strong starting point. + +In this project, a proof-of-concept solution was developed which had two key steps. First, brain slices were synthetically stained (i.e. staining was performed on the digital images using algorithms) to highlight the proteins of interest in bright, contrasting colours. This aimed to provide a view for neuropathologists that was less intensive to analyse. + +Following this step, an automated classifier was developed which used the synthetically stained brain slices to produce a judgement as to whether or not the image contained evidence of Parkinson’s Disease. This classifier was able to do this with performance exceeding that of experts manually reviewing the images. + +### What we did +Over a 12-week period, the project used the Parkinson’s UK Brain Bank images to explore the viability of using AI to identify Parkinson’s Disease. This rapid innovation was intended to explore a proof of concept that could help predict where Parkinson’s Disease was present in brain slice images. + +Overall, about 400 brains were used for this project, split between those with and without Parkinson’s Disease present. The images from these brains were very detailed as they were taken using microscopes capable of magnifying at 200x what an eye can normally see. This also meant that the images were very large files, so they had to be processed using powerful computer hardware. This ensured that analysis could happen within a reasonable timeframe. + +The project produced a solution which can be split into two key steps. The first of those was synthetic staining, where the existing images had an algorithm applied to them in order to highlight alpha-synuclein (also known as α-syn or αS) proteins more clearly. These are the proteins which are indicative of Parkinson’s Disease being present. The second step was to automatically classify images of brain slices according to whether they contain evidence of Parkinson’s Disease. + +### Synthetically staining the brain slices +Synthetic staining of the brain slice image refers to taking the original images of brain slices and applying an algorithm to highlight regions of interest. The benefit of this is to help neuropathologists spot relevant material more quickly. This could speed up the overall process of diagnosing Parkinson’s Disease in an image of a brain slice. + +The chosen algorithm was a pre-trained type of neural network which had been specifically designed to understand colour, texture and spatial elements of an image. The neural network is provided with hints as to the colours of certain elements in an image.
Using this information, and the algorithm’s own understanding of colour, texture and space within images, the algorithm attempts to colour the entire image. The results produced were successful in highlighting α-syn proteins very clearly. Additionally, in brain slices without any α-syn proteins, the synthetic staining tended not to incorrectly stain the image. More details on the approach used can be found in the technical report ([here](https://www.biorxiv.org/content/10.1101/2022.08.30.505459v1)). + +![Synthetic staining of brain slices](../images/Parkinsons_synthetic_brain_slices.width-800.png) +> **Figure 1**: An example of the synthetic staining process. a) the original slide, containing the α-syn proteins stained in a brownish colour b) a processed version of the original slide, filtered for the brownish colour c) the synthetically stained image after the algorithm has been applied to it. The α-syn proteins are now highlighted in a greenish colour. + +### Identifying presence of Parkinson’s Disease +With the ability to synthetically stain images, the next stage of the project could be attempted. This step aimed to classify images of brain slices according to whether or not evidence of Parkinson’s Disease was present in the image. For this, a particular type of neural network was chosen which had previously been demonstrated to work quickly on large datasets. This was important because, even though powerful computers were used, the selection of an inappropriate algorithm might have meant that no useful results could be obtained within the 12-week timeframe. + +The neural network model takes in the synthetically stained images and gives a classification for whether Parkinson’s Disease is present in the image. The results seen using this method were excellent and could match (and exceed) the performance of experts in some aspects. + +#### Outcomes and lessons learned +In summary, the code from this project, released as open source on our GitHub (available to anyone to re-use, [link here](https://github.com/nhsx/skunkworks-parkinsons-detection)), demonstrates the viability of automating the identification of Parkinson’s Disease in post-mortem brain slices. This is achieved by synthetically staining the images, before applying a neural network to predict whether or not the disease is present. + +The tool achieves cutting-edge performance, and demonstrates ability which exceeds that of expert raters in some aspects. Despite this being only a proof of concept, the results were promising, which is exciting for the future development of this work. + +A key lesson learned was the importance of good data. This is particularly true in projects involving images, where quality and the way images are captured can vary significantly. A key driver of success in this project was having a large number of consistent, high-quality images to work with. + +### What’s next? +The NHS AI Lab Skunkworks team has released the code from the project on [GitHub](https://github.com/nhsx/skunkworks-parkinsons-detection) to allow anyone to try the methodology out using fake data which has a similar appearance to the images of brain slices. + +A second phase is currently being planned to develop this work further, continuing to work in conjunction with Polygeist, Parkinson’s UK and the Parkinson’s UK Brain Bank at Imperial College London. + +### Who was involved? 
+This project was a collaboration between the NHS AI Lab Skunkworks, within the Transformation Directorate at NHS England and NHS Improvement, Parkinson’s UK, the Parkinson’s UK Brain Bank at Imperial College London, [Polygeist](https://polygei.st/) and the Home Office’s [Accelerated Capability Environment](https://www.gov.uk/government/groups/accelerated-capability-environment-ace) (ACE). + +NHS AI Lab Skunkworks is a team of data scientists, engineers and project leaders who support the health and social care community to rapidly progress ideas from the conceptual stage to a proof of concept. + +Accelerated Capability Environment (ACE) is part of the Homeland Security Group within the Home Office. It provides access to more than 250 organisations from across industry, academia and the third sector who collaborate to bring the right blend of capabilities to a given challenge. Most of these are small and medium-sized enterprises (SMEs) offering cutting-edge specialist expertise. + +ACE is designed to bring innovation at pace, accelerating the process from defining a problem to developing a solution and delivering practical impact to just 10 to 12 weeks. + +Polygeist, a software company specialising in state-scale analytics, builds world-leading AI technology for defence, national security, law enforcement, and healthcare customers. The team for this project was able to produce a live system, producing insights, from a standing start, in 12 weeks. + + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) +# \ No newline at end of file diff --git a/docs/our_work/casestudy-recruitment-shortlisting.md b/docs/our_work/casestudy-recruitment-shortlisting.md new file mode 100644 index 00000000..cd38f870 --- /dev/null +++ b/docs/our_work/casestudy-recruitment-shortlisting.md @@ -0,0 +1,39 @@ +--- +title: 'Examining whether recruitment data can, and should, be used to train AI models for shortlisting interview candidates' +summary: 'Identify where bias has potential to occur when using machine learning for shortlisting interview candidates and mitigate it' +category: 'CaseStudies' +origin: 'Skunkworks' +tags: ['nlp', 'neural networks'] +--- + +## Info +This is a backup of the case study published [here](https://transform.england.nhs.uk/ai-lab/explore-all-resources/develop-ai/examining-whether-recruitment-data-can-and-should-be-used-to-train-ai-models-for-shortlisting-interview-candidates/) on the NHS England Transformation Directorate website. + +## Case Study +Recruitment is a complex process where many factors need to be considered and understood so that the right candidates can be shortlisted for an interview. It is also a time-consuming process. However, the information provided in a job application has the potential to lead to bias if not handled correctly. This can happen with human shortlisters, but extra care must be taken if a machine (e.g. AI) is involved that is entirely unaware of what is and is not sensitive. + + +**The challenge** +Can we identify where bias has the potential to occur when using machine learning for shortlisting interview candidates as part of the NHS England recruitment process? How does this manifest, and how can it be mitigated? + + +### Overview +Recruitment in the NHS involves identifying the best candidates for a wide array of jobs, where the skills required typically vary significantly between roles.
The process is very time consuming, with applications, CVs and other supporting documents reviewed to identify the best candidates. These candidates are then ‘shortlisted’ for an interview, which is the next step in the process. This shortlisting is often done by highly experienced hiring managers, who have a nuanced understanding of what makes a good candidate. However, the fact that humans are involved in this review process means decisions taken at this stage can vary, from person to person as well as between hiring rounds. + +One way to gain some consistency might be to leverage artificial intelligence (AI), specifically machine learning, to undertake this process. The purpose of this project was to review the feasibility and consistency of this in the context of the NHS. + +**What do we mean by bias?** +There are many ways to define bias depending on whether it is in relation to the data, design, outcomes or implementation of AI. In this piece of work, bias was investigated in the representativeness of the dataset and in the results of the predictive model, using the distribution and balance of shortlisted candidates across groups of protected characteristics among those who applied. + +When talking about bias in the predictive model, the model was determined to have shown ‘bias’ if the errors made in prediction were larger than an accepted error rate (which is defined by the person carrying out the work). + +Bias can also be identified by looking at the integrity of the source data (considering factors such as the way it was collected) or the sufficiency (see [here](https://en.wikipedia.org/wiki/Fairness_(machine_learning)#:~:text=of%20a%20model.%22-,Sufficiency,-%5Bedit%5D)) of the data. + +![Recruitment graph](../images/Recruitment_graph.width-800.png) +> **Figure 1**: Recruitment data graph. + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) +# \ No newline at end of file diff --git a/docs/our_work/casestudy-synthetic-data-pipeline.md b/docs/our_work/casestudy-synthetic-data-pipeline.md new file mode 100644 index 00000000..8e6dc1cf --- /dev/null +++ b/docs/our_work/casestudy-synthetic-data-pipeline.md @@ -0,0 +1,100 @@ +--- +title: 'Exploring how to create mock patient data (synthetic data) from real patient data' +summary: 'The generation of safe and effective synthetic data to be used in technologies that improve health and social care.'
+category: 'CaseStudies' +origin: 'Skunkworks' +tags: ['synthetic data','variational autoencoder','privacy','quality','utility','kedro'] +--- + +## Info +This is a backup of the case study published [here](https://transform.england.nhs.uk/ai-lab/explore-all-resources/develop-ai/exploring-how-to-create-mock-patient-data-synthetic-data-from-real-patient-data/) on the NHS England Transformation Directorate website. + +## Case Study +The NHS AI Lab Skunkworks team has been releasing open-source code from their artificial intelligence (AI) projects since 2021. One of the challenges faced with releasing code is that without suitable test data it is not possible to properly demonstrate AI tools, preventing users without data access from being able to see the tool in action. + +One avenue for enabling this is to provide “synthetic data”, where new “fake” data is generated from real data using a specifically designed model, in a way that maintains several characteristics of the original data. In particular, synthetic data aims to achieve: + +- Utility - the synthetic data must be fit for its defined use. +- Quality - it must be a sufficient representation of the real data. +- Privacy - it mustn’t ‘leak’ or expose any sensitive information from the real data. + +**The challenge** +This project aimed to provide others with a simple, re-usable way of generating safe and effective synthetic data to be used in technologies that improve health and social care. + +### The challenge +Using real patient data for research and development carries with it safety and privacy concerns about the anonymity of the people behind the information. Various anonymisation techniques can be used to turn data into a form that does not directly identify individuals and where re-identification is not likely to take place. However, it is very difficult to entirely remove the chance of re-identification. Wide release of anonymised data will always carry some risks. Synthetic data aims to remove the need for such concerns because there is no “real patient” connected with the data. + +There are many ways to generate synthetic data. One common challenge with synthetic data approaches is that they are usually configured specifically for a dataset. This is a problem because it means a significant amount of work is needed to update them for use with a different data source. + +Additionally, once data has been produced, it can be difficult to know whether it is actually useful. + +In a partnership project, the NHS Transformation Directorate’s Analytics Unit and the NHS AI Lab Skunkworks team sought to further improve an existing synthetic data generation model (called [SynthVAE](https://nhsx.github.io/nhsx-internship-projects/synthetic-data-exploration-vae/)) and develop a framework for generating synthetic data that could be shared for others to re-use. + +The teams explored how SynthVAE could be used to generate synthetic data, how that data would be evaluated and how the whole process could be documented for others to re-use. + +If you would like greater technical detail about this project, please read the version on the [Skunkworks Github website](https://nhsx.github.io/skunkworks/synthetic-data-pipeline).
+ +They sought to: + +- increase the range of synthetic data types that SynthVAE can generate +- create a standard series of checks that can be carried out on the data produced, so that people can better understand its characteristics +- implement a structure to allow users to run the full functionality with a single piece of code. + +To be able to increase SynthVAE’s range of capabilities, the teams needed an input dataset containing a number of different data types in order to broaden the range of the data produced. + +The teams chose to work from a starting dataset that was already in the public domain. This meant people wishing to use the code after release could access and use the same dataset with which the project was developed. MIMIC-III was selected because the size and variety of its data would enable them to produce an input file that would closely match the broad range of typical hospital data. + +From the raw [MIMIC-III](https://physionet.org/content/mimiciii/1.4/) files, they produced a single dataset containing treatment provided to a hypothetical set of patients. It looked similar to datasets that might be encountered in a real hospital setting, helping to keep this project as relevant as possible to anyone wishing to explore the use of synthetic data for health and care. + + +#### 1. Adapting SynthVAE +SynthVAE was originally written primarily to generate synthetic data from both continuous data (data with an infinite number of values) and categorical data (data that can be divided into groups). The inclusion of other data types (like dates) in the new input dataset meant SynthVAE needed to be adapted to take the new set of variables. + +#### 2. Producing synthetic data +Having sourced suitable data and created a useful input file, it was possible to use the input file to train a SynthVAE model that could generate synthetic data. The model was used to generate a synthetic dataset containing several million entries, a substantial increase on volumes previously produced using SynthVAE. + +This wasn’t without challenges, as SynthVAE hadn’t been substantially tested using dates or large volumes of data. However, SynthVAE was successfully adapted to produce a synthetic version of the input data from MIMIC-III. + +#### 3. Creating a checking process +In order to evaluate the privacy, quality and utility of the synthetic data produced, a set of checks was needed. There is currently no industry standard, so the teams chose a range of evaluation approaches designed to provide the broadest possible assessment of the data. + +The process aimed to check whether the synthetic data was a good substitute for the real data, without causing a change in performance (also known as the utility). The additional checks that were added aimed to make the evaluation of utility more robust, for example by checking there are no identical records in the synthetic and real datasets (a minimal sketch of this check is shown below), but also to provide visual aids to allow the user to see what differences are present in the data. + +These checks were combined and their results collected in a web-based report, to allow results to be packaged and shared with any data produced. + +#### 4. Creating a pipeline +Finally, the teams pulled these steps into a single workflow process for others to follow. + +The input data generation, SynthVAE training, synthetic data production and output checking processes were chained together, creating a single flow to train a model, produce synthetic data and then evaluate the final output.
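As a concrete illustration of the identical-records check mentioned in step 3, here is a minimal pandas sketch; the function name, and the assumption that both datasets share the same columns, are illustrative rather than taken from the project code.

```python
import pandas as pd

def count_identical_records(real: pd.DataFrame, synthetic: pd.DataFrame) -> int:
    """Count synthetic rows that exactly replicate a real record.

    Any match is a potential privacy leak, so the expected result is zero.
    """
    matches = synthetic.merge(real.drop_duplicates(), on=list(real.columns), how="inner")
    return len(matches)
```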
+ +To make the end-to-end process as user-friendly as possible, a pipelining library called [QuantumBlack’s Kedro](https://medium.com/quantumblack/introducing-kedro-the-open-source-library-for-production-ready-machine-learning-code-d1c6d26ce2cf) was employed. This allowed each step in the workflow to be linked to the next, meaning users can run all parts of the process with a single command. It also gives users the ability to control the definitions within the pipeline and change them according to their needs. + +### Outcomes and lessons learned +The resulting code enables users to see how: + +- an input dataset can be constructed from an open-source dataset, MIMIC-III +- SynthVAE can be adapted to be trained on a new input dataset with mixed data types +- SynthVAE can be used to produce synthetic data +- synthetic data can be evaluated to assess its privacy, quality and utility +- a pipeline can be used to tie together steps in a process for a simpler user experience. + +By using the set of evaluation techniques, concerns around the quality of the synthetic data can be directly addressed and measured using the variety of metrics produced as part of the report. + +The approach outlined here is not intended to demonstrate a perfectly performing synthetic data generation model, but instead to outline a pipeline that enables the generation and evaluation of synthetic data. Issues like overfitting to the training data and the potential for bias will be highlighted by the evaluation metrics, but will not be remedied. + +It’s important to emphasise that concerns around re-identification are reduced by using synthetic data, but not completely removed. Looking at privacy metrics for the synthetic dataset will help the user to understand how well privacy has been preserved, but re-identification may still be possible. + +### What’s next? +The Analytics Unit is continuing to develop and improve SynthVAE, with a focus on improving the model’s ability to produce high quality synthetic data. + +To better understand the privacy of any patient data used to train a synthetic data generating model, the Analytics Unit has undertaken a project exploring the use of ‘adversarial attacks’ to prove what information about the original training data might be gained from a model alone. The project focussed on a particular type of adversarial attack, called a ‘membership attack’. It explored how different levels of information would influence what the attacker could learn about the underlying dataset, and therefore the implications for any individuals whose information was used to train a model. + +### Who was involved? +This project was a collaboration between the NHS AI Lab Skunkworks and the Analytics Unit within the Transformation Directorate at NHS England and Improvement. + +The NHS AI Lab Skunkworks is a team of data scientists, engineers and project leaders who support the health and social care community to rapidly progress ideas from the conceptual stage to a proof of concept. + +The Analytics Unit consists of a team of analysts, economists, data scientists and data engineers who provide leadership to other analysts working in the system and raise data analysis up the health and care system agenda. + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) 
+# \ No newline at end of file diff --git a/docs/our_work/clinical-coding.md b/docs/our_work/clinical-coding.md new file mode 100644 index 00000000..a6d17b2e --- /dev/null +++ b/docs/our_work/clinical-coding.md @@ -0,0 +1,20 @@ +--- +title: 'Clinical coding automation with the Royal Free and Kettering General' +summary: 'Data scientists in the AI Lab Skunkworks team and the NHS Transformation Directorate Analytics unit supported this project to investigate whether the process of clinical coding (applying standard code words to health records) can be supported by artificial intelligence.' +category: 'Projects' +origin: 'Skunkworks' +tags: ['nlp','neural networks'] +--- + +When you visit your doctor or attend hospital, a lot of information is collected about you on computers, including your symptoms, tests, investigations, diagnosis, and treatments. Across the NHS, this represents a huge amount of information that could be used to help us learn how to tailor treatments more accurately for individual patients and to offer them better and safer healthcare. The challenge is that most of the information held within these records is in written form that is difficult to use. + +The process of reading health records and applying standardised codes based on particular words, conditions or treatments is called "clinical coding". Clinical coding is time-consuming, expensive and carries the risk of mistakes. + +We provided data science capability to a joint project with the Royal Free Hospital and Kettering General Hospital. This project aimed to understand which open source models are best placed to support clinical coders by automating part of the clinical coding process, using natural language processing (NLP) to teach computers to ‘read’ electronic health records. The aim was for the technology to summarise and suggest the standardised codes that would then be checked by clinical coders. + +NLP is a branch of AI used to interpret unstructured text data, such as free-text notes. + +The project was abandoned due to data access challenges. + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) +# \ No newline at end of file diff --git a/docs/our_work/ct-alignment.md b/docs/our_work/ct-alignment.md new file mode 100644 index 00000000..390575e6 --- /dev/null +++ b/docs/our_work/ct-alignment.md @@ -0,0 +1,25 @@ +--- +title: 'CT Alignment and Lesion Detection' +summary: 'A range of classical and machine learning computer vision techniques to align and detect lesions in anonymised CT scans over time from George Eliot Hospital NHS Trust.' +category: 'Projects' +origin: 'Skunkworks' +tags: ['ct','computer vision','image registration','lesion detection'] +--- + +![CT Alignment and Lesion Detection screenshot](images/ct-alignment.png) + +As the successful candidate from the AI Skunkworks problem-sourcing programme, CT Alignment and Lesion Detection was first picked as a pilot project for the AI Skunkworks team in April 2021. + +## Results + +A proof-of-concept demonstrator written in Python (user interface, classical computer vision models, notebooks with machine learning models).
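For illustration only (this sketch is not taken from the project repository), a classical feature-based registration step for aligning two CT slices might look like the following, using ORB keypoints and a RANSAC-fitted homography:

```python
import cv2
import numpy as np

def align_slices(moving: np.ndarray, fixed: np.ndarray) -> np.ndarray:
    """Warp one 8-bit greyscale CT slice onto another using matched ORB features."""
    orb = cv2.ORB_create(1000)
    kp1, des1 = orb.detectAndCompute(moving, None)
    kp2, des2 = orb.detectAndCompute(fixed, None)

    # Match binary descriptors and keep the strongest correspondences.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:100]

    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # Robustly estimate the transform, ignoring outlier matches.
    homography, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    height, width = fixed.shape
    return cv2.warpPerspective(moving, homography, (width, height))
```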
+ +Output|Link +---|--- +Open Source Code & Documentation|[Github](https://github.com/nhsx/skunkworks-ct-alignment-lesion-detection) +Case Study|[Case Study](https://www.nhsx.nhs.uk/ai-lab/explore-all-resources/develop-ai/using-ai-to-identify-tissue-growth-from-ct-scans/) +Technical report|[PDF](https://github.com/nhsx/skunkworks-ct-alignment-lesion-detection/blob/main/docs/NHS_AI_Lab_Skunkworks_CT_Alignment_Lesion_Detection_Technical_Report.pdf) +Video walkthrough|[YouTube](http://www.youtube.com/watch?v=QygOnGLcszk) + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) +# \ No newline at end of file diff --git a/docs/our_work/data-lens.md b/docs/our_work/data-lens.md new file mode 100644 index 00000000..91be2a7e --- /dev/null +++ b/docs/our_work/data-lens.md @@ -0,0 +1,25 @@ +--- +title: 'Data Lens' +summary: 'Data Lens brings together information about multiple databases, providing a fast-access search in multiple languages.' +category: 'Projects' +origin: 'Skunkworks' +tags: ['natural language processing', 'semantic search', 'scraping'] +--- + +![Data Lens screenshot](../images/data-lens.png) + +As the successful candidate from a Dragons’ Den-style project pitch, Data Lens was first picked as a pilot project for the NHS AI (Artificial Intelligence) Lab Skunkworks team in September 2020. + +The pitch outlined a common data problem for analysts and researchers across the UK: large volumes of data held on numerous incompatible databases in different organisations. The team wanted to be able to quickly source relevant information with one search engine. + +## Results + +A prototype website written in HTML/CSS/JavaScript (frontend), JavaScript (backend) and Python (scrapers, search), implementing Elasticsearch and natural language search across a number of NHS databases. + +Output|Link +---|--- +Open Source Code & Documentation|[Github](https://github.com/nhsx/skunkworks-data-lens) +Case Study|[Case Study](https://www.nhsx.nhs.uk/ai-lab/explore-all-resources/develop-ai/data-lens-a-fast-access-data-search-in-multiple-languages/) + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) +# \ No newline at end of file diff --git a/docs/our_work/home.md b/docs/our_work/home.md new file mode 100644 index 00000000..82a846da --- /dev/null +++ b/docs/our_work/home.md @@ -0,0 +1,55 @@ +--- +title: 'NHS AI Lab Skunkworks' +summary: 'The NHS AI Lab Skunkworks team demonstrates the potential for AI in health and social care through practical experience' +category: 'Overview' +origin: 'Skunkworks' +tags: ['classification','lesion detection','vision AI'] +--- + +!!! info + Welcome to the technical website of the NHS AI Lab Skunkworks team. For our general public-facing website, please visit the [AI Skunkworks programme](https://www.nhsx.nhs.uk/ai-lab/ai-lab-programmes/skunkworks/) + +The [AI Skunkworks programme](https://www.nhsx.nhs.uk/ai-lab/ai-lab-programmes/skunkworks/) is part of the [NHS AI Lab](https://www.nhsx.nhs.uk/ai-lab/). It finds new ways to use AI for driving forward the early adoption of technology to support health, in both clinical and business contexts.
The team provides free short-term expertise and resources to public sector health and social care organisations to support AI projects and develop capability. + +The programme was closed to new projects at the end of 2022. This website captures the findings, source code, and lessons of its activities. + +## How we work + +The NHS AI Lab Skunkworks team was built around the idea of short-term, rapid projects, aiming to investigate the use of AI for improving efficiency and accuracy in health and care. + +The programme aimed to facilitate a robust conversation around uses of AI in health and care, encouraging the community of healthcare AI practitioners to share and discuss their experiences, documenting the findings and releasing any open source code produced. + +The Skunkworks' vision is that organisations in the health and care system will be able, through practical experience, to understand, build, buy, deploy, support, and challenge AI solutions. In order to achieve this, we have mostly run projects in three ways, all centred around the idea of **co-production**: + +### 1. Internal + +Utilising our internal team of data scientists and data / technology leads, we are able to run small agile projects that may involve one-to-one coaching in the use of Python and machine learning frameworks, through to data discoveries to help assess whether the data your organisation possesses is suitable for an AI approach. + + +### 2. Short term resource + +We are able to sponsor individual contractors provided through the [Public Sector Resourcing framework](https://www.publicsectorresourcing.co.uk/) to assist in the conception or implementation of AI solutions for your organisation. + +### 3. With a supplier + +Partnering with the Home Office's [Accelerated Capability Environment (ACE)](https://www.gov.uk/government/groups/accelerated-capability-environment-ace), we are able to sponsor 12-week agile projects using a pool of cutting-edge AI suppliers. + +Health and care organisations can submit their proposed AI problem, and we will work with ACE to select and manage a supplier to deliver an AI proof of concept. + +These projects culminate with: + +* Working code, published under an Open Source license on Github +* A technical report, detailing methodology and findings + +**Co-production** means that regardless of the project and the presence or absence of external contractors and partners, we would always work in a collaborative way, by bringing everyone involved around the table: data scientists, data engineers, experts in ML techniques, ethicists, regulation advisers, clinical and non-clinical NHS colleagues, and so on. The in-house AI Skunkworks team would always provide the steering and technical scrutiny to the project. + +### Capability building + +We also provide ad-hoc support, advice and education through initiatives such as our AI Deep Dive workshops, bringing an organisation through the opportunities and challenges of using AI, with an open mind about its implications and issues. + +## Get in touch + +As of 2023, you can still reach us via email at [england.aiskunkworks@nhs.net](mailto:england.aiskunkworks@nhs.net) + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) 
+# diff --git a/docs/our_work/long-stay-baseline.md b/docs/our_work/long-stay-baseline.md new file mode 100644 index 00000000..c8e3c2b8 --- /dev/null +++ b/docs/our_work/long-stay-baseline.md @@ -0,0 +1,565 @@ +--- +title: 'Long Stayer Risk Stratification Baseline Models' +summary: 'Baseline machine learning models using historical data from Gloucestershire Hospitals NHS Foundation Trust to predict how long a patient will stay in hospital upon admission.' +category: 'Projects' +origin: 'Skunkworks' +tags: ['LoS','length of stay','baseline','risk model', 'regression', 'classification'] +--- + +Long Stayer risk stratification baseline models was selected as a project to run in tandem with the [Long Stayer Risk Stratification](long-stay) project, and started in March 2022. + +Baseline models provide a mechanism to generate baseline metrics to assess the performance of more complex models, and establish the effectiveness of simple approaches. + +!!! Warning "Intended Audience" + This report has been written for analysts and data scientists at NHS Trusts/ALBs + +A series of Jupyter Notebooks used to generate this report are available on [Github](https://github.com/nhsx/skunkworks-long-stayer-risk-stratification-baseline/tree/main/notebooks). + +## Table of contents + +* 1. [Background](#background) +* 2. [Approach](#approach) +* 3. [Data ingest and processing](#dataingestandprocessing) +* 4. [Feature engineering](#featureengineering) +* 5. [Statistical analysis](#statisticalanalysis) +* 6. [Modelling](#modelling) + * 6.1 [Regression models](#regressionmodels) + * 6.2 [Demographic analysis](#demographicanalysis) + * 6.3 [Classification models](#classificationmodels) + * 6.4 [Model comparison](#modelcomparison) +* 7. [Conclusions](#conclusions) +* 8. [Future work](#futurework) + +## 1. Background + +Hospital long stayers, those with a length of stay (LoS) of 21 days or longer, have significantly worse medical and social outcomes than other patients. Long stayers are often medically optimised (fit for discharge) many days before their actual discharge. Moreover, there is a complex mixture of medical, cultural and socioeconomic factors which contributes to the causes of unnecessary long stays. + +This project aims to complement [previous work](long-stay) by generating simple baseline regression and classification models that could be replicated at other hospital trusts, and is divided into two phases: + +1. Series of Jupyter Notebooks containing baseline model code +2. [Reproducible Analytical Pipeline](https://github.com/NHSDigital/rap-community-of-practice) including data pipelines + +Currently, this project has completed **Phase 1**. + +## 2. Approach + +The aim of this project is to perform the simplest possible feature engineering and modelling to arrive at a reasonable baseline model, for more advanced feature engineering and modelling to be compared against. + +The approach involved: + +1. Defining the population and data cleaning +2. Feature engineering, focussing on basic numerical and categorical features +3. Simple baseline models implemented using commonly available packages including [scikit-learn 1.1.1](https://scikit-learn.org/), [CatBoost 1.0.6](https://catboost.ai) and [XGBoost 1.3.3](https://xgboost.readthedocs.io/en/stable/) +4. Analysis of model performance by demographic +5. A comparison of regression-based and classification-based risk stratification models +6. A set of extensions for future work + +## 3. Data ingest and processing
+ +Gloucestershire Hospitals NHS Foundation Trust (GHNHSFT) performed a SQL export from their EPR system containing ~770,000 records across 99 columns, with significant sparsity across several columns and a section of rows, as visualised by light coloured blocks (null values) in the below image: + +![Image of data sparsity](../images/long-stay-baseline/sparsity.png) + +> Figure 1. Plot of data sparsity of raw data. Null values are light coloured blocks. (_Note that not all columns are labelled_) + +The population for this study was defined as non-elective, major cases as recorded in the `IS_MAJOR` and `elective_or_non_elective` fields. + +Filtering the dataset to this definition resulted in a reduction from ~770,000 rows to ~170,000 rows (78% reduction): + +![Image of data sparsity - major cases](../images/long-stay-baseline/sparsity-major.png) + +> Figure 2. Plot of data sparsity of "major" cases. Null values are light coloured blocks. (_Note that not all columns are labelled_) + +Data was processed by: + +1. Converting datetime columns into the correct data type +2. Ordering records by `START_DATE_TIME_HOSPITAL_PROVIDER_SPELL` +3. Removing fields not available at admission +4. Removing empty and redundant (e.g. `LENGTH_OF_STAY_IN_MINUTES` duplicates `LENGTH_OF_STAY`) columns +5. Removing duplicate rows +6. Removing local identifiers +7. Imputing `stroke_ward_stay` as `N` if not specified +8. Binary encoding `stroke_ward_stay`, `IS_care_home_on_admission`, `IS_care_home_on_discharge` and `IS_illness_not_injury` + +This resulted in ~170,000 rows across ~50 columns as visualised in the below image: + +![Image of data sparsity of "clean" data](../images/long-stay-baseline/sparsity-clean.png) + +> Figure 3. Plot of data sparsity of clean data. Null values are light coloured blocks. (_Note that not all columns are labelled_) + +The resulting data dictionary is available [here](https://github.com/nhsx/skunkworks-long-stayer-risk-stratification-baseline/blob/main/docs/data-dictionary.csv). + +Additionally, Length of Stay was capped at 30 days, due to a long tail of long stayers over ~15 days and the definition of long stayer being over 21 days. The effect of capping can be visualised by comparing box plots of the distribution of length of stay on the raw data (left image, y scale up to 300 days) and the capped data (right image, y scale up to 30 days): + +

Additionally, Length of Stay was capped at 30 days, due to a long tail of long stayers beyond ~15 days and the definition of a long stayer being over 21 days. The effect of capping can be visualised by comparing box plots of the distribution of length of stay on the raw data (left image, y scale up to 300 days) and the capped data (right image, y scale up to 30 days):

![Boxplot of length of stay](../images/long-stay-baseline/los-boxplot.png)
> Figure 4. Plot of the distribution of long stayers in the raw (left) and capped (right) data. Note the different y scales.

The resulting distribution of length of stay shows a roughly bimodal distribution caused by the capping - the majority of stays are short (<5 days), with a long tail and a population of long stayers:

![Density plot of length of stay](../images/long-stay-baseline/los-density.png)

> Figure 5. Plot of density of length of stay for the capped data.

## 4. Feature engineering

After discussion with GHNHSFT, the following decisions were made in feature selection:

1. Select the following features, which are available on admission, for inclusion in the model (the list also carries the target variable `LENGTH_OF_STAY`):
```python
    "ae_arrival_mode",
    "AGE_ON_ADMISSION",
    "EL CountLast12m",
    "EMCountLast12m",
    "IS_illness_not_injury",
    "IS_cancer",
    "IS_care_home_on_admission",
    "IS_chronic_kidney_disease",
    "IS_COPD",
    "IS_coronary_heart_disease",
    "IS_dementia",
    "IS_diabetes",
    "IS_frailty_proxy",
    "IS_hypertension",
    "IS_mental_health",
    "MAIN_SPECIALTY_CODE_AT_ADMISSION_DESCRIPTION",
    "OP First CountLast12m",
    "OP FU CountLast12m",
    "SOURCE_OF_ADMISSION_HOSPITAL_PROVIDER_SPELL_DESCRIPTION",
    "stroke_ward_stay",
    "LENGTH_OF_STAY",
    "arrival_day_of_week",
    "arrival_month_name"
```
2. Exclude the following features, but retain them for later analysis of model fairness:
```python
    "ETHNIC_CATEGORY_CODE_DESCRIPTION",
    "IMD county decile",
    "OAC Group Name",
    "OAC Subgroup Name",
    "OAC Supergroup Name",
    "PATIENT_GENDER_CURRENT_DESCRIPTION",
    "POST_CODE_AT_ADMISSION_DATE_DISTRICT",
    "Rural urban classification"
```
3. Generate `arrival_day_of_week` and `arrival_month_name`, recalculated from `START_DATE_TIME_HOSPITAL_PROVIDER_SPELL`

This resulted in a dataset of **~170,000 rows across 30 columns**.

One-hot encoding was performed for categorical variables, but non-one-hot encoded features were also kept for models like CatBoost which [manage categorical variables themselves](https://catboost.ai/en/docs/features/categorical-features).

## 5. Statistical analysis

In order to select appropriate modelling approaches, some basic statistical analysis was conducted to understand the normality and inter-correlation of the selected features.

#### Correlation analysis

Correlation analysis confirmed the presence of significant **collinearity** between features.

The top 20 one-hot encoded features correlated with `LENGTH_OF_STAY`, ranked by absolute correlation, were:

![Plot of correlations with LENGTH_OF_STAY](../images/long-stay-baseline/correlation.png)

> Figure 6. Plot of the top 20 features correlated with `LENGTH_OF_STAY`. Blue columns are positively correlated (i.e. increase length of stay) and red columns are negatively correlated (i.e. reduce length of stay).

These indicate that age and age-related illness, as well as arrival mode, are strong factors in determining length of stay.

#### Variance inflation factors

Variance inflation factors (VIF) confirmed the presence of multicollinearity between a number of features (VIF > 10).

#### Homoscedasticity

A basic ordinary least squares (OLS) regression model was fitted to the full feature set, and the residuals calculated.

The residuals failed the Shapiro-Wilk, Kolmogorov-Smirnov and Anderson-Darling tests for normality, as well as visual inspection:

![Plot of residuals for OLS model of length of stay](../images/long-stay-baseline/residuals.jpeg)

> Figure 7. Plot of residuals (errors) in an OLS model of length of stay for all data.

OLS methods were therefore excluded from modelling.
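The VIF and residual-normality checks above can be reproduced along these lines. This is a sketch assuming a one-hot encoded feature matrix `X` (a pandas dataframe) and target `y`, not the project's exact notebook code:

```python
import statsmodels.api as sm
from scipy.stats import shapiro
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Variance inflation factor per feature; VIF > 10 suggests multicollinearity
vifs = {col: variance_inflation_factor(X.values, i)
        for i, col in enumerate(X.columns)}

# Fit a basic OLS model and test its residuals for normality
ols = sm.OLS(y, sm.add_constant(X)).fit()
stat, p_value = shapiro(ols.resid)
print(f"Shapiro-Wilk p-value: {p_value:.3g}")  # small p => residuals are not normal
```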
## 6. Modelling

The machine learning modelling approach was as follows:

1. Split the data into training (70%), validation (15%) and test (15%) sets
2. Check the data splits do not introduce selection bias for length of stay, age, sex, or ethnicity
3. Train baseline models with default parameters on the training set
4. Evaluate baseline models on the validation set
5. Select the best performing model
6. Tune the best performing model using cross-validation on the training and validation sets
7. Report the final performance of the model using the test set

![Summary of machine learning approach](../images/long-stay-baseline/ml-approach.png)

> Figure 8. Summary of the machine learning approach used in this project.

Training, validation and test splits were representative of the population and did not introduce selection bias:

**Length of stay**

![Distribution of length of stay by data split](../images/long-stay-baseline/split-los.png)

> Figure 9. Distribution of length of stay by data split.

**Age**

![Distribution of age by data split](../images/long-stay-baseline/split-age.png)

> Figure 10. Distribution of age by data split.

**Sex**

Proportion of `male`, `female` patients in each split:

```
train: [0.53, 0.47]
validate: [0.51, 0.49]
test: [0.53, 0.47]
```

**Ethnicity**

Proportions of each ethnicity for each split:

```
train: [0.87, 0.05, 0.02, 0.02, 0.01, 0.01, 0.01, 0.0, ...]
validate: [0.88, 0.05, 0.03, 0.02, 0.01, 0.01, 0.01, 0.0, ...]
test: [0.87, 0.05, 0.02, 0.02, 0.01, 0.01, 0.0, 0.0, ...]
```

### 6.1 Regression models

A range of baseline regression models was selected:

Model|Rationale
---|---
[Mean](https://scikit-learn.org/stable/modules/generated/sklearn.dummy.DummyRegressor.html)|The simplest baseline; uses the mean length of stay as the prediction in all cases
[ElasticNet](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.ElasticNet.html)|A regularised implementation of linear regression that can be used for multicollinear datasets such as this one
[DecisionTreeRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html)|A simple, single-tree regressor that is highly explainable
[RandomForestRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html)|An ensemble of decision trees with potentially better performance than a single tree
[XGBRegressor](https://xgboost.readthedocs.io/en/stable/python/python_api.html#xgboost.XGBRegressor)|A boosted tree technique that can improve on ensemble techniques such as RandomForest
[CatBoostRegressor](https://catboost.ai/en/docs/concepts/python-reference_catboostregressor)|A boosted tree technique designed specifically for datasets with high levels of categorical features, as in this dataset

Each model was trained with default parameters, and evaluated using root mean squared error (RMSE) on both the training set and the (unseen) validation set:

Model|Training RMSE (days)|Validation RMSE (days)
---|---|---
Mean|6.89|6.94
ElasticNet|6.55|6.60
DecisionTree|0.55|9.11
RandomForest|2.46|6.52
XGBoost|5.97|6.32
CatBoost|6.13|6.26

The best performing baseline model was CatBoost, with an RMSE of 6.26 days. Both the DecisionTree and RandomForest models overfit the training data, as seen in their low training RMSE coupled with much higher validation RMSE.

A single metric (e.g. RMSE) does not capture the behaviour of each model, so we visualise both the predicted vs actual plots and the corresponding relative errors, for the training set and the validation set:

**Training performance:**

![Plots of predicted vs actual and corresponding errors on the training dataset](../images/long-stay-baseline/regression-predicted-actuals-training.png)

> Figure 11.
Plots of predicted vs actual (left, red dashed line shows the ideal model) and corresponding relative errors (right, red solid line shows mean error with 95% limits of agreement in green dashed lines) on the training dataset.

The RandomForest model appears to fit the training data well, but when compared with its performance on the validation set below, we can see this is due to overfitting on the training set:

**Validation performance:**

![Plots of predicted vs actual and corresponding errors on the validation dataset](../images/long-stay-baseline/regression-predicted-actuals-validation.jpeg)

> Figure 12. Plots of predicted vs actual (left, red dashed line shows the ideal model) and corresponding relative errors (right, red solid line shows mean error with 95% limits of agreement in green dashed lines) on the validation dataset.

In all cases, the poor predictive power at higher lengths of stay is evident - there appears to be a linear increase in error caused by the models' inability to predict higher lengths of stay.

This is likely due to the bimodal nature of the underlying length of stay values - most stays are short, while there is a significant population of long stayers.

Further tuning of the CatBoost model using GridSearch and cross-validation led to the following results:

Parameter|Optimal value
---|---
`depth`|6
`l2_leaf_reg`|9
`learning_rate`|0.1

with

Model|Training RMSE (days)|Validation RMSE (days)|Test RMSE (days)|Test MAE (days)
---|---|---|---|---
CatBoost (tuned)|6.24|6.18|6.06|4.12

The test MAE of 4.12 days compares reasonably well to the previous work using a convolutional neural network, which achieved an MAE of 3.8 days.

However, a plot of predicted vs actual on the test dataset again shows the model's inability to capture long stayers:

![Plots of predicted vs actual and corresponding errors for the final model - test set](../images/long-stay-baseline/regression-predicted-actuals-final-model-test.jpeg)
> Figure 13. Plots of predicted vs actual (left, red dashed line shows the ideal model) and corresponding relative errors (right, red solid line shows mean error with 95% limits of agreement in green dashed lines) on the test dataset for the final model.

We can still explore the most important features behind the predictions by plotting the feature importances of the final model:

![Feature importances for the final model](../images/long-stay-baseline/regression-feature-importance.png)
> Figure 14. Feature importances for the final regression model.

These broadly align with the correlated features explored earlier - namely age, arrival mode and serious illness - but also include the number of previous visits, which can be considered a proxy for serious illness itself.

Because the final CatBoost model does not use one-hot encoded categorical data (CatBoost handles categorical features internally), we do not have further granularity on admission source and arrival mode to compare.

### 6.2 Demographic analysis

While the model is not performant enough to deploy into production, it is still important to understand whether or not the model incorporates bias into its predictions.

There are many kinds of bias in machine learning projects; here we are looking at representation bias:

> Does the model perform better or worse for specific categories of people across sex, ethnicity and other demographics?

The specific categories are:

```
"ETHNIC_CATEGORY_CODE_DESCRIPTION", "IMD county decile", "OAC Group Name", "OAC Subgroup Name", "OAC Supergroup Name", "PATIENT_GENDER_CURRENT_DESCRIPTION", "POST_CODE_AT_ADMISSION_DATE_DISTRICT", "Rural urban classification"
```
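The per-group analysis that follows (Figures 15-23) can be reproduced along these lines. This is a hedged sketch assuming a test dataframe `test_df` holding the demographic columns alongside `actual` and `predicted` length of stay (column names are illustrative, not the notebooks' exact code):

```python
# Mean prediction error per demographic group, relative to the overall
# mean error on the test set (the quantity plotted in Figures 21-23)
test_df["error"] = test_df["predicted"] - test_df["actual"]
overall_mean_error = test_df["error"].mean()

for group_col in ["ETHNIC_CATEGORY_CODE_DESCRIPTION",
                  "PATIENT_GENDER_CURRENT_DESCRIPTION",
                  "IMD county decile"]:
    summary = test_df.groupby(group_col).agg(
        n=("error", "size"),          # sample size per group, needed for caveats
        mean_error=("error", "mean"),
    )
    # Positive = overestimates relative to the mean error; negative = underestimates
    summary["relative_error"] = summary["mean_error"] - overall_mean_error
    print(summary.sort_values("relative_error"))
```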
Before looking at model performance, we need to understand how well represented each category is, to avoid drawing conclusions from categories with small sample sizes (note that for brevity we only share results for `ETHNIC_CATEGORY_CODE_DESCRIPTION`, `IMD county decile` and `PATIENT_GENDER_CURRENT_DESCRIPTION`):

![Underlying counts for ethnicity - all data](../images/long-stay-baseline/los-dist-ethnicity.png)
> Figure 15. Underlying counts for ethnicity - all data.

We can see that for `ETHNIC_CATEGORY_CODE_DESCRIPTION`, the overwhelming majority of patients report `British`. We should be careful what conclusions we draw about the smaller categories in further analysis, as their sample sizes are very small and likely not statistically representative.

![Underlying counts for sex - all data](../images/long-stay-baseline/los-dist-sex.png)
> Figure 16. Underlying counts for sex - all data.

Sex is broadly equal, with slightly more female than male patients in this dataset.

![Underlying counts for index of multiple deprivation - all data](../images/long-stay-baseline/los-dist-imd.png)
> Figure 17. Underlying counts for index of multiple deprivation - all data.

Index of Multiple Deprivation (IMD) deciles are skewed to the lower end, i.e. there are more deprived patients in this dataset than not.

Now we can look at the distribution of length of stay for the above categories:

![Underlying length of stay by ethnicity - all data](../images/long-stay-baseline/los-mean-los-ethnicity.png)
> Figure 18. Underlying length of stay by ethnicity - all data.

There is significant variation in length of stay between ethnic groups - for example, White and Black Caribbean patients have a mean length of stay of 2.6 days, versus 6.0 days for Irish patients. However, as discussed previously, the counts of these groups are 560 and 892 individuals respectively, so further statistical hypothesis tests (e.g. a two-sided Kolmogorov-Smirnov test) need to be conducted to understand whether the distributions are truly different.

![Underlying length of stay by sex - all data](../images/long-stay-baseline/los-mean-los-sex.png)
> Figure 19. Underlying length of stay by sex - all data.

Mean length of stay is almost identical across patient sex.

![Underlying length of stay by index of multiple deprivation - all data](../images/long-stay-baseline/los-mean-los-imd.png)
> Figure 20. Underlying length of stay by index of multiple deprivation - all data.

There are small variations in length of stay across IMD deciles, although more tests need to be conducted to understand whether these differences are statistically significant.

Because we are interested in whether the model performs differently by category, we plot the error of the test-set predictions relative to the overall (mean) error for all categories. This helps identify potential discrimination in model performance.

![Relative error in length of stay predictions for different ethnic groups - test data](../images/long-stay-baseline/los-rel-error-ethnicity.png)
> Figure 21. Relative error in length of stay predictions for different ethnic groups - test data.
The model appears to perform significantly worse for Caribbean patients (overestimating length of stay by 2.7 days compared to the mean error) and those of Any other mixed background (underestimating length of stay by 1.8 days compared to the mean error). Sample sizes are 719 and 536 patients respectively. As discussed, these small sample sizes need further investigation and/or additional data collection to establish the statistical significance of this performance difference.

![Relative error in length of stay predictions by sex - test data](../images/long-stay-baseline/los-rel-error-sex.png)
> Figure 22. Relative error in length of stay predictions by sex - test data.

Sex shows almost no deviation (0.002 days) from the average error.

![Relative error in length of stay predictions for different index of multiple deprivation deciles - test data](../images/long-stay-baseline/los-rel-error-imd.png)
> Figure 23. Relative error in length of stay predictions for different index of multiple deprivation deciles - test data.

The lowest IMD county decile (1) underestimates by 0.5 days relative to the mean error, which at under a day may not lead to any difference in treatment if this prediction is used in clinical practice (i.e. a predicted length of stay of 1.5 days is equivalent to one of 2.0 days - both would count as 2 whole days).

We also know that length of stay varies by group, so further plots of the ratio of MAE to length of stay are generated in the notebooks, but are not included here for brevity.

The final model did not adequately capture length of stay across the population. Some demographic group sample sizes were too small to draw conclusions from, but the process of exploring the underlying distribution of the target feature (length of stay), the counts (n) and model performance is important and should remain part of future work.

### 6.3 Classification models

In addition to predicting the length of stay in days, we are also interested in stratifying the risk of a patient becoming a long stayer. This can be inferred from their predicted length of stay (see [model comparison](#modelcomparison)), but we can also train a classification model to do this directly.

The agreed stratification of risk of long stay is defined as:

Risk Category|Day Range for Risk Category
-----|------
1 - Very low risk|0-6
2 - Low risk|7-10
3 - Normal risk|11-13
4 - Elevated risk|14-15
5 - High risk|>15

We keep the training features the same as in the regression models, and encode risk from the actual length of stay as the target feature (a sketch of this encoding follows below).

> Postscript: classification models based on increasing risk (1-5) are ordinal in nature, and an appropriate model should be used in which the classes are not treated as independent, as they are in the examples in this implementation.
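A minimal sketch of the risk encoding described above, assuming length of stay is held as whole days in a pandas series (illustrative code, not the notebooks' exact implementation):

```python
import numpy as np
import pandas as pd

def encode_risk(los: pd.Series) -> pd.Series:
    """Map length of stay (whole days) onto risk categories 1-5."""
    bins = [-np.inf, 6, 10, 13, 15, np.inf]  # 0-6, 7-10, 11-13, 14-15, >15 days
    return pd.cut(los, bins=bins, labels=[1, 2, 3, 4, 5]).astype(int)

# `train_df` is a hypothetical training dataframe with the target column
y_train_risk = encode_risk(train_df["LENGTH_OF_STAY"])
```

The same encoder can later be applied to the regression model's predicted length of stay, which is how the [model comparison](#modelcomparison) in section 6.4 derives equivalent risk categories.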
The classification equivalents of the regression models were selected:

Model|Regression version|Classification version
---|---|---
Dummy|[Mean](https://scikit-learn.org/stable/modules/generated/sklearn.dummy.DummyRegressor.html)|[Prior](https://scikit-learn.org/stable/modules/generated/sklearn.dummy.DummyClassifier.html)
ElasticNet|[ElasticNet](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.ElasticNet.html)|[LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html)
Decision Tree|[DecisionTreeRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html)|[DecisionTreeClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html)
Random Forest|[RandomForestRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html)|[RandomForestClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html)
XGBoost|[XGBRegressor](https://xgboost.readthedocs.io/en/stable/python/python_api.html#xgboost.XGBRegressor)|[XGBClassifier](https://xgboost.readthedocs.io/en/stable/python/python_api.html#xgboost.XGBClassifier)
CatBoost|[CatBoostRegressor](https://catboost.ai/en/docs/concepts/python-reference_catboostregressor)|[CatBoostClassifier](https://catboost.ai/en/docs/concepts/python-reference_catboostclassifier)

The training, validation and test regime was the same as for the regression models.

**Class imbalance**

In the training set, we observe the following class imbalance (percentages recalculated from the counts):

Risk score|Number of patients|% of total patients
---|---|---
1|89711|74.0
2|12634|10.4
3|5226|4.3
4|2613|2.2
5|10990|9.1

i.e. the overwhelming majority of patients are very low risk, while the highest risk group makes up only ~9% of the population.

Class and/or sample weights were calculated using the above training imbalances and passed into all models (a sketch follows below).
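A sketch of the class-weighting step using scikit-learn's balanced (inverse-frequency) heuristic, assuming the encoded risk target `y_train_risk` from the earlier sketch (illustrative, not the notebooks' exact code):

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight, compute_sample_weight

classes = np.unique(y_train_risk)
# Inverse-frequency class weights: rarer classes (e.g. elevated risk) weigh more
class_weights = dict(zip(classes, compute_class_weight(
    class_weight="balanced", classes=classes, y=y_train_risk)))

# Per-row sample weights, for model APIs that take sample_weight instead
sample_weights = compute_sample_weight(class_weight="balanced", y=y_train_risk)
```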
Models were trained using default parameters, and evaluated using the [weighted F1 score](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html), which represents the balance between precision and recall and accounts for class imbalance. F1 scores range from 0 to 1 (where 1 is "ideal" or maximum).

Model|Training weighted F1 score|Validation weighted F1 score
---|---|---
Prior|0.63|0.62
ElasticNet|0.54|0.53
DecisionTree|1.00|0.59
RandomForest|1.00|0.64
XGBoost|0.57|0.54
CatBoost|0.57|0.54

While the RandomForest model obtained the highest validation weighted F1 score (0.64), it also overfit the training data (weighted F1 score of 1.00).

A visual inspection of model performance, plotting both the total counts of risk categories in actual vs predicted cases and the proportion of actual risk in each predicted category, is shown below for the training and validation datasets:

**Training performance:**

![Plots of predicted vs actual risks on the training dataset](../images/long-stay-baseline/clf-predicted-actuals-training.png)
> Figure 24. Plots of predicted vs actual risks on the training dataset. Left image shows the count of actual and predicted risks for each category. Right image shows the proportion of actual risk that makes up each predicted risk category.

We can see that both the DecisionTree and RandomForest models severely overfit the training data.

We also see that none of the models are able to capture the nature of the highest risk categories, with every predicted risk category containing a large (>50%) proportion of the lowest risk level (level 1). This is despite weighting the models to account for class imbalance.

**Validation performance:**

![Plots of predicted vs actual risks on the validation dataset](../images/long-stay-baseline/clf-predicted-actuals-validation.png)
> Figure 25. Plots of predicted vs actual risks on the validation dataset. Left image shows the count of actual and predicted risks for each category. Right image shows the proportion of actual risk that makes up each predicted risk category.

The RandomForest model shows an anomaly in its predictions for risk category 4, which contains none of the highest risk category (5), unlike the other predicted categories. This is likely due to the overfitting observed in the previous plot.

CatBoost and XGBoost have similar levels of predictive power, as shown by the lower proportion of very low risk cases within their high risk predictions, although at ~50% these are still too high.

Both CatBoost and XGBoost overpredict the higher risk categories while underpredicting the lowest risk category. This will lead to both false positives, where very low risk cases are shown as high risk, and false negatives, where high risk cases are shown as lower risk.

CatBoost was selected as the final model due to the lack of a significant difference in performance from XGBoost, and for consistency with the final regression model. Further tuning of the CatBoost model using GridSearch (with a smaller parameter space than for regression, due to compute time) and cross-validation led to the following results:

Parameter|Optimal value
---|---
`depth`|10
`l2_leaf_reg`|1
`learning_rate`|0.1

with

Model|Training weighted F1 score|Validation weighted F1 score|Test weighted F1 score|Test balanced accuracy|Test AUC (OVR, weighted)
---|---|---|---|---|---
CatBoost (tuned)|0.61|0.75|0.60|0.27|0.70

Balanced accuracy was 0.27, a poor result for accurately predicting the correct class. The overall Area Under the receiver operating characteristic Curve (AUC), calculated as a weighted one-versus-rest metric, was 0.70.

![Plots of predicted vs actual for the final model - test set](../images/long-stay-baseline/clf-predicted-actuals-final-model-test.png)
> Figure 26. Plots of predicted vs actual risks on the test dataset for the final model. Left image shows the count of actual and predicted risks for each category. Right image shows the proportion of actual risk that makes up each predicted risk category.

The final model still assigns over 50% of the lowest risk class (the most populated class) to every predicted class, which would lead to a high number of false positives. It also fails to capture the highest risk class adequately, leading to a high number of false negatives.

Despite the poor performance, we can still explore the most important features behind the predictions by plotting the feature importances of the final model:

![Feature importances for the final classification model](../images/long-stay-baseline/clf-feature-importance.png)
> Figure 27. Feature importances for the final classification model.

In this case, `arrival_month_name` and `arrival_day_of_week` are the two most important features, which differs from the regression model and the correlation analysis. This may be why the false positive and false negative rates for the model are so high, and needs further exploration.
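For reference, the headline test metrics above can be computed with scikit-learn along these lines (a sketch assuming a fitted classifier `model` and held-out test data `X_test`, `y_test`; illustrative only):

```python
from sklearn.metrics import balanced_accuracy_score, f1_score, roc_auc_score

y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)

weighted_f1 = f1_score(y_test, y_pred, average="weighted")
bal_acc = balanced_accuracy_score(y_test, y_pred)
# Weighted one-versus-rest AUC for the multi-class risk categories
auc_ovr = roc_auc_score(y_test, y_proba, multi_class="ovr", average="weighted")
print(f"F1 (weighted): {weighted_f1:.2f}, "
      f"balanced accuracy: {bal_acc:.2f}, AUC (OVR, weighted): {auc_ovr:.2f}")
```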
Demographic analysis of the risk stratification model was not conducted, as the model performance did not justify exploring whether there was representation bias at this stage.

### 6.4 Model comparison

As a final modelling step, we can compare the regression and classification models by encoding the predicted length of stay from the regression model as a corresponding risk.

This comparison may help us understand whether a classification or regression approach is more suitable for this type of data.

![Comparison of both models](../images/long-stay-baseline/model-comparison.png)

> Figure 28. Comparison of both models. Left image shows the proportion of actual risk for each predicted risk category for the classification model. Right image shows the proportion of actual risk for each equivalent predicted risk category derived from the regression model.

Here we can see that the regression model, encoded as a risk stratification model, performs much better than the classification approach:

* The proportion of very low risk patients is much lower in the higher predicted risk categories - under 20% in the case of high risk. This means fewer false positives.
* The proportion of high risk patients is higher in the predicted higher risk categories. This means fewer false negatives.

If risk stratification is the key desired output, then further refining the regression model may be the better approach to improving the overall performance of the system.

## 7. Conclusions

A number of baseline machine learning models were trained on EPR data from GHNHSFT.

The best performing regression model achieved a Mean Absolute Error (MAE) of 4.1 days, compared to 3.8 days for previous work using a convolutional neural network.

Simpler baseline models benefit from enhanced explainability and require fewer compute resources for training. In this case, the most important features were related to age and serious illness.

The overall performance of the best regression model was still poor - despite an MAE of 4.1 days, the model failed to capture long stayers and requires further work before use.

The best performing classification model achieved a weighted F1 score of 0.6.

The overall performance of the best classification model was poor - the model failed to capture high risk patients and assigned a high proportion (>50%) of very low risk patients to higher risk groups.

Using the regression model to calculate equivalent risk scores led to a better risk stratification model, where only ~20% of very low risk patients were assigned to the high risk group.

Demographic analysis showed that the model performed differently for different ethnicities and indices of multiple deprivation, but both model performance and sample sizes need to improve before drawing any meaning from these initial findings.

There is opportunity for much future work, which should be balanced against the utility of these predictions in the clinical context.

## 8. Future work

### Modelling improvements

1. Feature engineering of free text fields. Early on, we decided to focus on simple numerical and categorical features for this project. A huge amount of rich data is present in fields such as `presenting_complaint` and `reason_for_admission`.
2. Including features available after admission. Fields such as `all_diagnoses` and `all_treatments` provide clinically important information, and may improve the performance of the predictions.
3. Focussing on a smaller number of features.
Once the most important features are identified, a model using e.g. the top 10 features could be trained and tested.
4. Building two models - one for short stays and one for long stays. This may help capture the bimodal nature of the underlying dataset.
5. Including `MINOR` cases. This project focussed on `MAJOR`, `non-elective` cases. Over 70% of the original data belonged to minor cases, and in combination with the above, including this data could lead to an improvement in model performance.
6. Treating length of stay as a discrete variable and applying Poisson-appropriate approaches to modelling.
7. Exploring Generalised Additive Models using e.g. [pyGAM](https://pygam.readthedocs.io/en/latest/index.html).
8. Exploring Bayesian approaches.
9. Exploring the addition of latent variable(s).

### Demographic analysis improvements

1. Statistical testing of fairness. Once model performance reaches a sufficient level, further statistical tests of model performance across demographics should be conducted, e.g. using a two-sided Kolmogorov-Smirnov test.
2. Combining smaller groups. For example, grouping ethnicities into `British` and `Non-British` would allow statistical comparisons to be made between the majority group and other groups.

### Technical improvements

1. Moving from Notebooks to Python scripts. Jupyter Notebooks are an excellent exploratory tool, but do not work well with version control or automated testing.
2. Implementing a [Reproducible Analytical Pipeline](https://github.com/NHSDigital/rap-community-of-practice). This will allow reuse of the approaches here and improve overall code quality.
3. Abstracting visualisation code into functions. This will improve the readability of the code.

## Acknowledgments

1. Joe Green, GHNHSFT, for presenting the challenge to Skunkworks and supporting problem definition/data selection
2. Tom Lane, GHNHSFT, for support in the final stages
3. Brad Pearce and Peter Coetzee, Polygeist, for the original CNN-based model
4. Jennifer Hall, Matthew Cooper and Sanson Poon, NHS AI Lab Skunkworks, for guidance, code and report review
5. Chris Mainey, NHSE, for suggestions of additional modelling improvements

Output|Link
---|---
Open Source Code & Documentation|[Github](https://github.com/nhsx/skunkworks-long-stayer-risk-stratification-baseline)

[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.)
# \ No newline at end of file

diff --git a/docs/our_work/long-stay.md b/docs/our_work/long-stay.md new file mode 100644 index 00000000..f64019ab --- /dev/null +++ b/docs/our_work/long-stay.md @@ -0,0 +1,26 @@

---
title: 'Long Stayer Risk Stratification'
summary: 'Machine learning using historical data from Gloucestershire Hospitals NHS Foundation Trust to predict how long a patient will stay in hospital upon admission.'
category: 'Projects'
origin: 'Skunkworks'
tags: ['LoS','length of stay','neural network','risk model']
---

![Long Stayer Risk Stratification screenshot](../images/long-stay.png)

As the successful candidate from the AI Skunkworks problem-sourcing programme, Long Stayer Risk Stratification was first picked as a pilot project for the AI Skunkworks team in April 2021.

## Results

A proof-of-concept demonstrator written in Python (machine learning model and backend) and HTML/CSS/JavaScript (frontend).
Output|Link
---|---
Open Source Code & Documentation|[Github](https://github.com/nhsx/skunkworks-long-stayer-risk-stratification)
Case Study|[Case Study](https://www.nhsx.nhs.uk/ai-lab/explore-all-resources/develop-ai/using-machine-learning-to-identify-patients-at-risk-of-long-term-hospital-stays/)
Technical report|On request: ai-skunkworks@nhsx.nhs.uk

[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.)
# \ No newline at end of file

diff --git a/docs/our_work/nhs-resolution.md b/docs/our_work/nhs-resolution.md new file mode 100644 index 00000000..3d0d0aff --- /dev/null +++ b/docs/our_work/nhs-resolution.md @@ -0,0 +1,20 @@

---
title: 'Predicting negligence claims with NHS Resolution'
summary: 'This project investigated whether it is possible to use machine learning (AI) to predict the number of claims a trust is likely to receive, and to learn what drives them in order to improve safety for patients.'
category: 'Projects'
origin: 'Skunkworks'
tags: ['classification','prediction']
---

NHS Resolution provides expertise to the NHS on resolving concerns and disputes. The organisation holds a wealth of historic data around claims, giving insight and valuable data around the causes and impacts of harm.

The NHS Resolution team wanted to understand whether AI methods could be applied to their data to better understand and identify risk, preventing harm and saving valuable resources.

We aimed to prove the value of machine learning in determining insights from the available data. Automated machine learning was used to run repeated processes on the available data in order to select the AI models that uncovered the most relevant information.

Output|Link
---|---
Case Study|[Case Study](https://transform.england.nhs.uk/ai-lab/explore-all-resources/understand-ai/using-ai-to-support-nhs-resolution-with-negligence-claims-prediction/)

[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.)
# \ No newline at end of file

diff --git a/docs/our_work/nursing-placement-optimisation.md b/docs/our_work/nursing-placement-optimisation.md new file mode 100644 index 00000000..89d0d396 --- /dev/null +++ b/docs/our_work/nursing-placement-optimisation.md @@ -0,0 +1,33 @@

---
title: 'Nursing Placement Schedule Optimisation Tool'
summary: 'Optimisation problem developed with Imperial College Healthcare Trust to produce optimised schedules for student nurses going on placement within the trust.'
category: 'Projects'
origin: 'Skunkworks'
tags: ['optimisation','genetic algorithm','nursing']
---

This project is an example of the AI Skunkworks team offering capability resources to produce proof-of-concepts which could be applicable to the NHS at large. The project ran from January 2022 to May 2022.
![Tool User Interface prior to running](../images/nursing-placement-optimisation/ui-before-running.png)

User Interface seen upon launching the tool

![Tool User Interface during running](../images/nursing-placement-optimisation/ui-during-running.png)

User Interface while the tool is running

![Tool User Interface after running](../images/nursing-placement-optimisation/ui-after-running.png)

User Interface after the tool has run

## Results

A proof-of-concept demonstrator written in Python (machine learning model, user interface and backend).

Output|Link
---|---
Open Source Code & Documentation|[Github](https://github.com/nhsx/skunkworks-nursing-placement-schedule-optimisation)
Case Study|[Case Study](https://transform.england.nhs.uk/ai-lab/explore-all-resources/develop-ai/using-ai-to-find-optimal-placement-schedules-for-nursing-students/)

[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.)
# \ No newline at end of file

diff --git a/docs/our_work/nwas.md b/docs/our_work/nwas.md new file mode 100644 index 00000000..01e25bf5 --- /dev/null +++ b/docs/our_work/nwas.md @@ -0,0 +1,14 @@

---
title: 'NWAS – Ambulance data exploration'
summary: 'Data exploration of ambulance service data'
category: 'Projects'
origin: 'Skunkworks'
tags: ['eda']
---

The aim of this proof-of-concept project was to develop a machine learning model that could predict the triage outcome of emergency calls based on the information provided by the caller. The model was trained on a large dataset of emergency call data and triage outcomes to identify patterns and relationships between the information provided and the resulting triage classification.

The models developed used two different AI approaches: a gradient-boosted decision tree model for the numerical and categorical data, and an NLP model to handle the free-text data.

[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.)
# \ No newline at end of file

diff --git a/docs/our_work/open-safely.md b/docs/our_work/open-safely.md new file mode 100644 index 00000000..49396488 --- /dev/null +++ b/docs/our_work/open-safely.md @@ -0,0 +1,22 @@

---
title: 'Working with a Trusted Research Environment'
summary: 'An exploration of OpenSAFELY'
category: 'Projects'
origin: 'Skunkworks'
tags: ['eda']
---

OpenSAFELY gives trusted researchers restricted levels of access to the server to run analysis on real data and obtain aggregate results, without having sight of the patient-level data. Aggregate results are checked to ensure there are no disclosure risks before being released from the server. This highly secure way of working enables researchers to have access to large and sensitive datasets in a safe manner.

This project had two main aims:

- To understand the reach and coverage of the NHS @Home programme during the pandemic, specifically looking at blood pressure monitoring, pulse oximetry and proactive care interventions.

- To understand how to approach an analysis project using OpenSAFELY, including the amount of time and resource required, and whether this platform would be useful for future analyses.
Output|Link
---|---
Case Study|TBD

[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.)
# \ No newline at end of file

diff --git a/docs/our_work/parkinsons-detection.md b/docs/our_work/parkinsons-detection.md new file mode 100644 index 00000000..97429e26 --- /dev/null +++ b/docs/our_work/parkinsons-detection.md @@ -0,0 +1,24 @@

---
title: 'Parkinson''s Disease Pathology Prediction'
summary: 'Automatic segmentation and detection of Parkinson’s disease pathology using synthetic staining and deep neural networks'
category: 'Projects'
origin: 'Skunkworks'
tags: ['parkinson\'s disease','synthetic staining','classification','deep learning', 'pathology', 'neural networks']
---

![Parkinson's prediction diagram](../images/parkinsons-detection.png)

"Parkinson's Disease Pathology Prediction" was selected as a project in 2022 following a successful pitch to the AI Skunkworks problem-sourcing programme.

## Results

A proof-of-concept demonstrator written in Python (machine learning models, command line interface (CLI), Jupyter Notebooks).

Output|Link
---|---
Open Source Code & Documentation|[Github](https://github.com/nhsx/skunkworks-parkinsons-detection/)
Technical report|[biorxiv.org](https://www.biorxiv.org/content/10.1101/2022.08.30.505459v1)
Case Study|[Case Study](https://transform.england.nhs.uk/ai-lab/explore-all-resources/develop-ai/identifying-and-quantifying-parkinsons-disease-using-ai-on-brain-slices/)

[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.)
# \ No newline at end of file

diff --git a/docs/our_work/project-flow.md b/docs/our_work/project-flow.md new file mode 100644 index 00000000..3b2610c9 --- /dev/null +++ b/docs/our_work/project-flow.md @@ -0,0 +1,14 @@

---
title: 'AI Prototype Project Flow'
summary: 'How to run a successful AI proof of concept prototype project'
category: 'Playbooks'
origin: 'Skunkworks'
tags: []
---

An iterated, tried and tested approach to running AI proof of concept prototype projects.

![Visual overview of skunkworks project flow](../images/skunkworks-project-flow.svg)

[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.)
# \ No newline at end of file

diff --git a/docs/our_work/renal-health-prediction.md b/docs/our_work/renal-health-prediction.md new file mode 100644 index 00000000..602c4858 --- /dev/null +++ b/docs/our_work/renal-health-prediction.md @@ -0,0 +1,26 @@

---
title: 'Renal Health Prediction'
summary: 'Identifying acute kidney injury (AKI) patients at higher risk of requiring ITU, needing renal support (dialysis), or likely to have a higher potential for mortality.'
category: 'Projects'
origin: 'Skunkworks'
tags: ['aki','renal health','rnn','deep learning', 'time series', 'neural networks']
---

![Renal Health Prediction diagram](../images/renal-health-prediction.png)

Renal Health Prediction was selected as a project in Spring 2022 following a successful pitch to the AI Skunkworks problem-sourcing programme.

## Results

A proof-of-concept demonstrator written in Python (machine learning models, command line interface (CLI), Jupyter Notebooks).
Output|Link
---|---
Open Source Code & Documentation|[Github](https://github.com/nhsx/skunkworks-renal-health-prediction/)
Technical report|[PDF](https://github.com/nhsx/skunkworks-renal-health-prediction/raw/main/docs/renal-health-prediction-technical-report.pdf)
Pre-print (MedRxiv)|[PDF](https://www.medrxiv.org/content/10.1101/2023.02.22.23286184v1)

[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.)
# \ No newline at end of file

diff --git a/docs/our_work/synthetic-data-pipeline.md b/docs/our_work/synthetic-data-pipeline.md new file mode 100644 index 00000000..baaa1692 --- /dev/null +++ b/docs/our_work/synthetic-data-pipeline.md @@ -0,0 +1,120 @@

---
title: 'Synthetic Data Generation Pipeline'
summary: 'Exploring how to create mock patient data from real patient data.'
category: 'Playbook'
origin: 'Skunkworks'
tags: ['synthetic data','variational autoencoder','privacy','quality','utility','kedro']
---

![Kedro Pipeline Structure](../images/synthetic-data-pipeline.png)

The NHS AI Lab Skunkworks team has been releasing [open-source code](https://www.gov.uk/guidance/be-open-and-use-open-source) from their artificial intelligence (AI) projects since 2021. One of the challenges with releasing code is that, without suitable test data, it is not possible to properly demonstrate AI tools, preventing users without data access from seeing the tool in action.

One avenue for enabling this is to provide “synthetic data”, where new “fake” data is generated from real data using a specifically designed model, in a way that maintains several characteristics of the original data:

- **Utility** - is the synthetic data fit for its defined use?
- **Quality** - is the synthetic data a sufficient representation of the real data?
- **Privacy** - does the synthetic data ‘leak’ or expose any sensitive information from the real data?

### The challenge…
> This project aimed to provide others with a simple, re-usable way of generating safe and effective synthetic data to be used in technologies that improve health and social care.

Using real patient data for research and development carries with it safety and privacy concerns about the anonymity of the people behind the information. Various anonymisation techniques can be used to turn data into a form that does not directly identify individuals and where re-identification is not likely to take place. However, it is very difficult to entirely remove the chance of re-identification, so wide release of anonymised data will always carry some risks. Synthetic data removes the need for such concerns because there is no “real patient” connected with the data, so re-identification is not possible.

Using a synthetic data generation model called [SynthVAE](https://nhsx.github.io/nhsx-internship-projects/synthetic-data-exploration-vae/), produced by the NHS Transformation Directorate’s Analytics Unit, the Skunkworks team embarked on a joint project to produce a framework for generating synthetic data. The teams explored how SynthVAE could be used to generate synthetic data, how that data would be evaluated, and how the whole process could be documented for others to re-use.

### AI Dictionary
> We have tried to explain the technical terms used in this case study.
> If you would like to explore the language of AI and data science, please [visit our AI Dictionary](https://nhsx.github.io/ai-dictionary).

## Overview

There are many ways to generate synthetic data, with SynthVAE being just one approach. One common challenge with synthetic data approaches is that they are usually configured specifically for a single dataset, meaning a significant amount of work is needed to adapt them to a different data source. Additionally, once an approach has successfully produced data, it can be difficult to know whether what was generated is actually useful. Using approaches like SynthVAE currently requires rework of the source code each time a new dataset is used, and there is no standard set of checks that can be applied to every dataset.

The work carried out jointly by NHS AI Lab Skunkworks and the Analytics Unit sought to:

- Increase the range of synthetic data types that SynthVAE can generate (such as categorical data and dates). SynthVAE originally used the [SUPPORT dataset from PyCox](https://github.com/havakv/pycox), so finding a dataset with a wider range of data types would be helpful.
- Create a standard series of checks that can be carried out on the data produced, so that a user can better understand the characteristics of the synthetic data.
- Implement a structure that allows users to run the full functionality with a single piece of code.

## What we did

The two teams worked together to:

- identify a suitable open source dataset for the project
- use this data source to generate an “input dataset” that looks like real patient data
- adapt an existing synthetic data generator model (SynthVAE) and use it to produce synthetic patient data from the input dataset
- outline the checks that would need to be applied to the synthetic data to confirm its quality and suitability
- pull these steps into a single user-friendly workflow for anyone to use.

### Creating an input dataset

To further develop the capabilities of SynthVAE, an input dataset containing a number of different data types was required, in order to broaden the range of the data produced. The teams chose one already in the public domain, meaning people wishing to use the code after release could access the same dataset with which the project was developed. [MIMIC-III](https://physionet.org/content/mimiciii/1.4/) was selected because the size and variety of its data would enable us to produce an input file closely matching the broad range of typical hospital data.

We processed the raw MIMIC-III files to produce a single dataset describing treatment provided to a hypothetical set of patients. The resulting input file contained columns with numbers, categories and dates, as well as multiple entries for some patients. It looked similar to datasets that might be encountered in a real hospital setting, helping to keep this project as relevant as possible to potential stakeholders such as NHS data analysts and data scientists, as well as research teams within trusts who are interested in exploring the use of synthetic data.

### Adapting SynthVAE

SynthVAE, the Analytics Unit’s “Variational Autoencoder for generating Synthetic Data”, uses an autoencoder architecture to train a model to compress the input data into a smaller number of variables, before attempting to reconstruct the original input information.
Once the model is trained, the statistical distributions within the model are sampled and output data constructed from these samples. Due to the training process, the model attempts to reconstruct output data that looks like the original training data.

SynthVAE was originally written primarily to generate synthetic data from continuous data (data with an infinite number of possible values) and categorical data (data that can be divided into groups). The inclusion of dates in the new input dataset meant SynthVAE needed to be adapted to take the new set of variables.

Once this was done, it was possible to use the input file to train a SynthVAE model, and then use that model to generate synthetic data. The model was used to generate a synthetic dataset containing several million entries, a substantial increase on volumes previously produced using SynthVAE.

This wasn’t without challenges, as SynthVAE hadn’t been substantially tested with dates or large volumes of data. However, through close collaboration with the Analytics Unit, SynthVAE was successfully adapted to produce a synthetic version of the input data from MIMIC-III.

### Creating a checking process

In order to evaluate the privacy, quality and utility of the synthetic data produced, a set of checks was needed. There is currently no industry standard, so we chose to use an evaluation capability from [Synthetic Data Vault](https://sdv.dev/) (SDV), alongside other approaches that provide a broader range of assessments of the data. SDV’s evaluation capability provides a wide range of metrics which are already implemented, giving a starting point for building a more complete evaluation approach. SDV’s evaluation uses metrics to check whether your synthetic data would be a good substitute for the real data, without causing a change in performance (also known as the utility). The additional checks aimed to make the evaluation of utility more robust, for example by checking there are no identical records in the synthetic and real datasets, but also to provide visual aids that allow the user to see what differences are present in the data.

The checks included:

- **Collision analysis** - checking that no two records are exactly the same in the input and synthetic datasets
- **Correlation analysis** - comparing the relationships within the two datasets to see whether patterns have been accurately preserved in the synthetic dataset
- **Evaluating the Gower distance** - looking at the closeness of similarity between the input and synthetic datasets to make sure they are not too similar
- **Comparing each dataset using Principal Component Analysis** - reducing the dataset to its principal components whilst keeping as much information as possible helps us see how similar the input and synthetic datasets are, and helps us understand whether the synthetic dataset is useful
- **Propensity testing** - checking whether a model can differentiate between our real and synthetic data. We combined the real and synthetic data, fitted a logistic regression model to the combined dataset, and used the fitted model to see how well it differentiated between real and synthetic data by looking at its ability to predict how likely each row was to be real or synthetic.
- **Comparison of the Voas-Williamson statistic** - a global goodness of fit metric that compares the variation over degrees of freedom in the synthetic and ground truth data
- **Comparison of statistical distributions of the features** - to get a high-level view of the similarity of the two datasets, the categorical and numerical columns were compared visually. For a more in-depth overview of both the real and synthetic datasets we used pandas-profiling to generate reports for each. Pandas profiling is a way of quickly exploring data using just a few lines of code instead of inspecting every variable by hand.
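A hedged sketch of the propensity test described above, assuming the real and synthetic tables are numeric-encoded pandas dataframes `real_df` and `synth_df` (illustrative code, not the released pipeline):

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Label real rows 0 and synthetic rows 1, then ask a classifier to tell them apart
combined = pd.concat([real_df.assign(is_synthetic=0),
                      synth_df.assign(is_synthetic=1)], ignore_index=True)
X = combined.drop(columns=["is_synthetic"])
y = combined["is_synthetic"]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Propensity scores near 0.5 mean the model cannot separate real from
# synthetic - the desired outcome for good synthetic data
propensity = clf.predict_proba(X_test)[:, 1]
print(f"Mean propensity score: {propensity.mean():.3f}")
```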
These checks were combined and their results collected in a web-based report, allowing results to be packaged and shared with any data produced.

### Creating a pipeline

To make the end-to-end process as user-friendly as possible, [QuantumBlack’s Kedro](https://medium.com/quantumblack/introducing-kedro-the-open-source-library-for-production-ready-machine-learning-code-d1c6d26ce2cf) was employed. This is a pipelining library that allows functionality to be chained together, letting a user run a full set of scripts with a single command. It also allows a user to define all their parameters, features and settings in a configuration file, making it easier to know what is defined in the pipeline and how to change it according to the needs of each user.

The input data generation, SynthVAE training, synthetic data production and output checking processes were chained together, creating a single flow to train a model, produce synthetic data and then evaluate the final output.

## Outcomes and lessons learned

The resulting code, to be released as open source (available for anyone to re-use), enables users to see how:

- an input dataset can be constructed from an open-source dataset, MIMIC-III
- SynthVAE can be adapted to be trained on a new input dataset with mixed data types
- SynthVAE can be used to produce synthetic data
- synthetic data can be evaluated to assess its privacy, quality and utility
- a pipeline can be used to tie together steps in a process for a simpler user experience.

By using the set of evaluation techniques, concerns around the quality of the synthetic data can be directly addressed and measured using the variety of metrics produced as part of the report. The approach outlined here is not intended to demonstrate a perfectly performing synthetic data generation model, but instead to outline a pipeline that enables the generation and evaluation of synthetic data. Issues like overfitting to the training data and the potential for bias will be highlighted by the evaluation metrics, but will not be remedied.

Concerns around re-identification are reduced by using synthetic data; however, they are not absolutely removed. To better understand the privacy of any patient data used to train a synthetic data generating model, the Analytics Unit have undertaken a project exploring the use of ‘adversarial attacks’ to establish what information about the training data can be ascertained from a model alone. The project focussed on a particular type of adversarial attack, a ‘membership attack’, and explored how different levels of information would influence what an attacker could learn about the underlying dataset, and therefore the implications for any individuals whose information was used to train a model.

## What next?

AI Lab Skunkworks will be releasing the code from the project on our Github site to demonstrate how SynthVAE can be used in a practical, end-to-end configuration.
+ +The Analytics Unit is continuing to develop and improve SynthVAE, with a focus on improving the model’s ability to produce high quality synthetic data. + +### Who was involved? + +This project was a collaboration between the NHS AI Lab Skunkworks and the Analytics Unit within the Transformation Directorate at NHS England and Improvement. + +The NHS AI Lab Skunkworks is a team of data scientists, engineers and project leaders who support the health and social care community to rapidly progress ideas from the conceptual stage to a proof of concept. + +The Analytics Unit consists of a team of analysts, economists, data scientists and data engineers who provide leadership to other analysts who are working in the system and raise data analysis up the health and care system agenda. + +Output|Link +---|--- +Open Source Code & Documentation|[Github](https://github.com/nhsx/skunkworks-synthetic-data) +Case Study|[Case Study](https://www.nhsx.nhs.uk/ai-lab/explore-all-resources/develop-ai/exploring-how-to-create-mock-patient-data-synthetic-data-from-real-patient-data/) + +[comment]: <> (The below header stops the title from being rendered (as mkdocs adds it to the page from the "title" attribute) - this way we can add it in the main.html, along with the summary.) + +# \ No newline at end of file diff --git a/docs/our_work/template-project.md b/docs/our_work/template-project.md new file mode 100644 index 00000000..f1d1365d --- /dev/null +++ b/docs/our_work/template-project.md @@ -0,0 +1,21 @@ +--- +title: 'TITLE GOES HERE' +summary: 'ONE LINE SUMMARY GOES HERE' +category: 'Projects' +--- + + + + +Text and images go here in Markdown or HTML + +## Results + +e.g. A proof-of-concept demonstrator written in Python (machine learning models, command line interface (CLI), Jupyter Notebooks). + +Output|Link +---|--- +Open Source Code & Documentation|[Github](#add URL) +Case Study| +Technical report|[e.g. biorxiv.org]() +Algorithmic Impact Assessment|e.g. N/A diff --git a/overrides/main.html b/overrides/main.html index f223d92c..0ef16551 100644 --- a/overrides/main.html +++ b/overrides/main.html @@ -7,5 +7,13 @@ {% endif %} + +{% if page and page.meta and page.meta.summary %} +

{{ page.meta.title }}

+
{{ page.meta.summary }}
+ {{ page.meta.tags }} + +{% endif %} + {{ super() }} {% endblock content %} \ No newline at end of file