-
Notifications
You must be signed in to change notification settings - Fork 26
PPD data model
The MDR is the legacy database on which dispenser payments is based. It most notably is mapped to BNF code only, which makes it hard to make use of the rich metadata available in the publicly-available dm+d NHS database. It also has a restriction of 40 characters on the BNF Name field.
dm+d is itself generated from CDR, which is the next-generation payments database that's been in development for around a decade.
We can map from PPD data (which is supplied in MDR format) to dm+d via a mapping which is updated by NHSBSA every quarter. As dm+d is updated monthly, this means it's possible for some actively prescribed dm+d products to have no corresponding BNF Code mapped; however, given the lag of 8+ weeks between a dispensing event and the release of the monthly PPD data, this is unlikely ever to be a problem.
- Broken bulk: not reflected in quantity number
- Doesn't reflect broken bulk, or other costs based on prescribing which cannot be directly attributed to practices. These costs include VAT, nationally unidentified prescribing, broken bulk, schedule adjustments, net cross boundary costs and out of pocket expenses and payments for containers for national unidentified prescribing (source)
dm+d is the NHS Dictionary of Medicines & Devices and is documented here.
There is a dm+d browser here but its search is pretty slow, and it'd be nice to build our own.
Data is published weekly by NHS Digital from TRUD, but we currently download it as part of the monthly import. There is an issue about changing to a weekly download. The code for the task that fetches the data is here, and the code for the task that imports the data is here.
VMPs, AMPs, VMPPs, and AMPPs are the main objects in the dm+d data model.
This is quoted directly from the documentation above:
-
The Virtual Medicinal Product (VMP) describes the generic title for a product including the form and strength, for example Atenolol 100mg tablets.
-
The Virtual Medicinal Product Pack (VMPP) describes the generic title for a generic or proprietary product pack which is known to have been available. The description includes the pack size, for example Atenolol 100mg tablets 28 tablet.
-
The Actual Medicinal Product (AMP) describes an actual product which is known to have been available linked to the name of a particular supplier, for example Tenormin 100mg tablets (AstraZeneca UK Ltd).
-
The Actual Medicinal Product Pack (AMPP) describes an actual product which is known to have been available linked to both the name of a particular supplier and information on the pack size of the product, for example Tenormin 100mg tablets (AstraZeneca UK Ltd) 28 tablet 2 x 14 tablets.
There are foreign key relationships between these four types of object. (These relationships are implicit in our current implementation, but are observed to be enforced in the source.)
The relationships are as follows:
- A VMP has many AMPs / an AMP belongs to a VMP
- A VMP has many VMPPs / a VMPP belongs to a VMP
- An AMP has many AMPPs / an AMPP belongs to an AMP
- A VMPP has many AMPPs / an AMPP belongs to a VMPP
You can think of the relationship between the four types of object as forming a diamond.
VMP<------+ AMP
^ ^
| |
| |
| |
+ +
VMPP<-----+ AMPP
We store these four types of object in four tables, with the following (simplified) schemas:
field | description |
---|---|
vpid | Primary key |
nm | Name |
field | description |
---|---|
vppid | Primary key |
vpid | Foreign key to VMP |
nm | Name |
field | description |
---|---|
apid | Primary key |
vpid | Foreign key to VMP |
nm | Name |
field | description |
---|---|
appid | Primary key |
apid | Foreign key to AMP |
vppid | Foreign key to VMPP |
nm | Name |
That is, an AMP's VMP exists, a VMPP's VMP exists, and an AMPP's VMPP and AMP both exist.
Eg: SELECT COUNT(*) FROM dmd_amp LEFT OUTER JOIN dmd_vmp ON dmd_amp.vpid = dmd_vmp.vpid WHERE dmd_vmp.vpid IS NULL
.
That is, whichever way you traverse the diamond from AMPP to VMP, you get the same result. In other words, an AMPP's AMP's VMP is the same as that AMPP's VMPP's VMP.
SELECT COUNT(dmd_ampp.*)
FROM dmd_ampp
INNER JOIN dmd_amp ON dmd_ampp.apid = dmd_amp.apid
INNER JOIN dmd_vmpp ON dmd_ampp.vppid = dmd_vmpp.vppid
WHERE dmd_amp.vpid != dmd_vmpp.vpid;
That is, every VMP has at least one VMPP, AMP, and AMPP.
WITH
vmpp_counts AS (
SELECT
dmd_vmp.vpid,
COUNT(dmd_vmpp.*) AS vmpp_count
FROM dmd_vmp
INNER JOIN dmd_vmpp ON dmd_vmp.vpid = dmd_vmpp.vpid
GROUP BY dmd_vmp.vpid
),
amp_counts AS (
SELECT
dmd_vmp.vpid,
COUNT(dmd_amp.*) AS amp_count
FROM dmd_vmp
INNER JOIN dmd_amp ON dmd_vmp.vpid = dmd_amp.vpid
GROUP BY dmd_vmp.vpid
),
ampp_counts AS (
SELECT
dmd_vmp.vpid,
COUNT(dmd_ampp.*) AS ampp_count
FROM dmd_vmp
INNER JOIN dmd_vmpp ON dmd_vmp.vpid = dmd_vmpp.vpid
INNER JOIN dmd_ampp ON dmd_vmpp.vppid = dmd_ampp.vppid
GROUP BY dmd_vmp.vpid
)
SELECT
vmpp_counts.vpid,
COALESCE(vmpp_count, 0) AS vmpp_count,
COALESCE(amp_count, 0) AS amp_count,
COALESCE(ampp_count, 0) AS ampp_count
FROM vmpp_counts
FULL OUTER JOIN amp_counts ON vmpp_counts.vpid = amp_counts.vpid
FULL OUTER JOIN ampp_counts ON vmpp_counts.vpid = ampp_counts.vpid
WHERE vmpp_count = 0 OR amp_count = 0 OR ampp_count = 0
I have found it helpful to think of the structure of the data by thinking just about the objects belonging to a single VMP. These objects can be arranged in a 2-dimensional table, with VMPPs as the column headings, AMPs as the row headings, and AMPPs in the table cells.
For example, here's the table for VMP Diclofenac 2.32% gel:
Diclofenac 2.32% gel 100 gram | Diclofenac 2.32% gel 30 gram | Diclofenac 2.32% gel 50 gram | |
---|---|---|---|
Diclofenac 2.32% gel (DE Pharmaceuticals) | Diclofenac 2.32% gel (DE Pharmaceuticals) 30 gram | Diclofenac 2.32% gel (DE Pharmaceuticals) 50 gram | |
Diclofenac 2.32% gel (Colorama Pharmaceuticals Ltd) | Diclofenac 2.32% gel (Colorama Pharmaceuticals Ltd) 30 gram | Diclofenac 2.32% gel (Colorama Pharmaceuticals Ltd) 50 gram | |
Voltarol 12 Hour Emulgel P 2.32% gel (GlaxoSmithKline Consumer Healthcare) | Voltarol 12 Hour Emulgel P 2.32% gel (GlaxoSmithKline Consumer Healthcare) 100 gram | Voltarol 12 Hour Emulgel P 2.32% gel (GlaxoSmithKline Consumer Healthcare) 30 gram | Voltarol 12 Hour Emulgel P 2.32% gel (GlaxoSmithKline Consumer Healthcare) 50 gram |
We see that there are 3 VMPPs (for 100g, 30g, and 50g pack sizes), and 3 AMPs (for 3 different manufacturers). There are 7 AMPPs: each manufacturer produces 30g and 50g packs, but only GSK produce a 100g pack.