648 move pg conversion to imputation. #183

AnneONS · 2024-01-11T18:43:11Z

Pull Request submission

Note: the code runs, and tests work, but I haven't checked the outputs

PG conversion now happens at the beginning of the imputation module. The PG column, 201, is initially numeric and contains many nulls.

This means that in construction, column 201 can remain numeric, as it is in SPP.

Step 1 of TMI is no longer needed.

The approach to PG conversion is now:

- Fill all nulls in column 201 with numeric PG values from the SIC mapper
- Copy this column to a new column, "pg_numeric"
- Convert column 201 to alphanumeric

The pg_numeric column will now have nulls filled for apportionment, and for the tau and sas outputs.
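
The three steps listed above can be sketched roughly as follows. This is an illustration only, not the code in this PR: it uses plain dictionaries instead of the pipeline's own data structures, the SIC column name "rusic" is a hypothetical placeholder, and the mapper contents are made-up values. The exception for unmapped codes is likewise an assumption about how a failed lookup might be handled.

```python
from typing import Dict, List, Optional

# Made-up mapper contents, for illustration only (not real SIC or PG codes).
SIC_TO_PG_NUM: Dict[str, int] = {"01110": 1, "26110": 24}
PG_NUM_TO_ALPHA: Dict[int, str] = {1: "A", 24: "L"}


def convert_pg(rows: List[dict], sic_col: str = "rusic") -> List[dict]:
    """Fill null PGs from SIC, keep a numeric copy, then convert 201 to alpha.

    `sic_col` is a hypothetical column name; the real one is not shown in the PR.
    """
    converted = []
    for row in rows:
        new_row = dict(row)  # work on a copy so the input rows are untouched
        pg_num: Optional[int] = new_row.get("201")

        # Step 1: fill a null in column 201 with the numeric PG from the SIC mapper.
        if pg_num is None:
            sic = new_row.get(sic_col)
            if sic not in SIC_TO_PG_NUM:
                raise ValueError(f"No numeric PG mapping for SIC {sic!r}")
            pg_num = SIC_TO_PG_NUM[sic]

        # Step 2: copy the filled numeric value to "pg_numeric".
        new_row["pg_numeric"] = pg_num

        # Step 3: convert column 201 to its alphanumeric form.
        if pg_num not in PG_NUM_TO_ALPHA:
            raise ValueError(f"No alpha mapping for numeric PG {pg_num!r}")
        new_row["201"] = PG_NUM_TO_ALPHA[pg_num]

        converted.append(new_row)
    return converted


rows = [
    {"rusic": "01110", "201": None},  # null PG, filled from the SIC mapper
    {"rusic": "26110", "201": 24},    # PG already present
]
result = convert_pg(rows)
```

Keeping the numeric copy before the alpha conversion is what lets later stages read a fully filled "pg_numeric" column while column 201 carries the alphanumeric code.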

In outputs, the pg conversion is done for the NI data.

Closes or fixes

Detail the ticket(s) you are closing with this PR
Closes #

Code

Documentation

Any new code includes all the following forms of documentation:

Function Documentation: Docstrings within the function(s)/methods have been created
- Includes Args and returns for all major functions
- The docstring details data types
Updated Documentation: User and/or developer working doc has been updated

Data

All data needed to run this script is available in Dev/Test
All data is excluded from this pull request
Secrets checker pre-commit passes

Testing

Unit tests: Unit tests have been created and are passing or a new ticket to create tests has been created

Peer Review Section

All requirements install from (updated) environment.yaml
Documentation has been created and is clear - check the working document
Docstrings (Google format) have been created and accurately describe the function's functionality
Unit tests pass, or if not present a new ticket to create tests has been created
Code runs: The code runs on reviewer's machine and/or CDSW

Final approval (post-review)

The author has responded to my review and made changes to my satisfaction.

I recommend merging this request.

Review comments

Insert detailed comments here!

These might include, but not exclusively:

bugs that need fixing (does it work as expected? and does it work with other code
that it is likely to interact with?)
alternative methods (could it be written more efficiently or with more clarity?)
documentation improvements (does the documentation reflect how the code actually works?)
additional tests that should be implemented (do the tests effectively assure that it
works correctly?)
code style improvements (could the code be written more clearly?)
Do the changes represent a change in functionality so the version number should increase? Start a discussion if so.
As a reviewer, you can generate the same outputs from running the code

Your suggestions should be tailored to the code that you are reviewing.
Be critical and clear, but not mean. Ask questions and set actions.

github-actions · 2024-01-11T18:43:45Z

Detailed Coverage Report

File	Stmts	Miss	Cover	Missing
src
__init__.py	0	0	100%
src/aggregation
__init__.py	0	0	100%
src/construction
__init__.py	0	0	100%
construction.py	48	48	0%	2–4, 6–7, 9, 12, 39–42, 44–46, 49–51, 54–63, 66, 68, 71, 74, 77–78, 81–83, 86–88, 91–92, 95, 103, 111–112, 114, 118, 120
old_construction.py	113	113	0%	3–7, 10, 13, 41–45, 48–50, 53–55, 60, 63, 65, 68, 74–75, 88–89, 92, 101, 103–104, 107, 117, 120, 124, 132, 135–136, 138, 140–141, 144, 149, 156, 164, 167, 171–174, 176–177, 180, 183–187, 192, 199–202, 204, 207, 210–213, 216–217, 220–222, 224–229, 233–239, 244–245, 248–249, 252, 254–255, 257, 259, 263–264, 267, 270, 275, 278–280, 282, 286, 289, 292, 295–297, 301, 304, 306
src/estimation
__init__.py	0	0	100%
apply_weights.py	16	16	0%	2–5, 7, 10, 24–29, 32–34, 36
calculate_weights.py	37	37	0%	1–3, 6, 9, 23–25, 28, 30, 33, 64–66, 69, 72, 75–78, 80–81, 85, 96, 99, 102, 105, 108, 111, 114–115, 121–122, 124, 127, 139–140
cellno_mapper.py	7	0	100%
src/imputation
MoR.py	90	90	0%	2–4, 6–7, 13–14, 17, 35, 38, 40–41, 43, 47–48, 54, 63–66, 70–71, 73–75, 77, 79, 82, 84, 87, 112, 117–120, 122, 125, 131–136, 138, 141, 147–149, 152, 161, 165, 171, 178, 188–190, 194, 197, 206–207, 210, 218, 221, 229, 231, 233, 235–236, 243–244, 249–250, 253, 264, 270, 273, 276, 278–279, 281, 284, 287–289, 291, 294, 296, 300, 303–304
__init__.py	0	0	100%
apportionment.py	36	10	72%	124, 126, 141, 143, 145, 157–159, 162, 164
expansion_imputation.py	39	32	17%	21, 25–26, 28–29, 32, 35, 38–40, 44, 47, 49, 52–53, 55, 58, 61, 81–83, 87, 90, 93, 96, 102, 107, 110, 112, 118, 121, 125
imputation_helpers.py	77	39	49%	25–27, 29–30, 32, 34, 141–142, 144, 178, 180, 182–183, 185, 189, 195, 198, 202, 205–206, 208, 234, 237, 240, 246–248, 251–252, 254–255, 258–259, 262–264, 270, 272
impute_civ_def.py	94	48	48%	134–136, 166, 168–169, 172–173, 176–178, 183–184, 186, 189, 191–194, 196, 198, 203–204, 206, 209, 211–213, 216, 218–219, 221–222, 224–225, 238, 241–245, 247–248, 251–252, 254–255, 257
manual_imputation.py	19	19	0%	1–2, 4, 6, 9–10, 28–29, 33, 35, 37, 44, 59, 61–62, 64–65, 67–68
pg_conversion.py	36	9	75%	51, 54, 57, 121, 124, 127, 156, 159, 161
sf_expansion.py	65	65	0%	2–4, 6–7, 9, 11–12, 15, 24, 26, 29, 32–33, 36–37, 40, 43–44, 47, 49, 56–57, 61, 64, 69–70, 73, 75, 78–79, 81, 83, 86, 90, 92, 96, 104, 107, 111, 113–114, 116, 118, 121, 124, 133, 136, 139, 148, 151, 153, 159, 165, 168–169, 174, 178–179, 183, 186, 194, 196, 200, 202
short_to_long.py	21	21	0%	1, 3–4, 7, 20, 23, 25, 32–33, 35, 37–38, 40–41, 43–45, 47, 49, 53, 55
tmi_imputation.py	194	155	20%	43, 46, 48, 52, 54, 105, 119, 121–122, 124–128, 130–131, 134, 140, 143–144, 147, 150, 152, 157, 162, 167, 180, 183–184, 186, 189–190, 211–213, 215, 217, 220, 223–225, 227–228, 231, 234, 241–244, 246, 270, 273, 275, 278, 282, 286, 307–308, 311, 314–315, 318, 321–322, 324–325, 327, 329, 332, 334, 336, 339, 342–343, 345, 347–349, 351, 358, 360, 367, 369–370, 372–373, 376, 378, 380, 382, 385, 388–389, 392, 394, 396, 401, 407, 413, 419, 423, 442–443, 445, 447–448, 450–451, 453–454, 457, 459–460, 463, 466, 468–469, 485, 487, 490–492, 494, 496, 498–499, 502, 504–505, 508, 510, 512–513, 531, 535, 538–539, 542–543, 546, 549, 551, 554, 556, 561, 563, 570–571, 574, 577–578, 580, 583, 585–586
src/northern_ireland
__init__.py	0	0	100%
ni_headcount_fte.py	29	29	0%	2–4, 6, 8, 11, 25, 27, 29, 31, 33, 35, 37, 40, 55, 57–59, 61, 63–64, 66–67, 69, 72, 81, 83–84, 86
ni_staging.py	36	36	0%	3–6, 8, 10, 13, 21–22, 25, 27–29, 31, 34, 40, 44, 46, 49, 80–81, 84, 89, 93, 96, 99–107, 109, 111
src/outlier_detection
__init__.py	0	0	100%
auto_outliers.py	83	39	53%	23–24, 26–27, 31–36, 38, 40, 43, 70, 74, 116, 190, 192, 194–195, 197, 201, 234, 237, 240, 242, 245, 249, 252, 255, 257–258, 260, 262, 272–275, 277
manual_outliers.py	16	0	100%
src/outputs
__init__.py	0	0	100%
export_files.py	101	101	0%	5–11, 13–14, 18, 24, 42, 44, 51, 56, 63, 68, 71, 98, 104, 107, 114, 116, 119, 124–125, 128–132, 135–136, 139, 151–153, 155, 158, 169, 171–172, 174, 177, 195, 198, 201–202, 209, 213–214, 217–219, 222, 224, 226–235, 239, 241–250, 255–256, 258, 261–263, 266, 269, 272, 286, 289–290, 298, 301, 303, 305, 315, 317–319, 328, 330, 333–334
form_output_prep.py	20	20	0%	1–3, 6, 32–33, 36–37, 39, 41, 44–48, 51, 58, 60, 64, 66
gb_sas.py	26	26	0%	2–5, 7–9, 11, 14, 34–36, 39, 44, 47, 50, 53, 56, 59, 64, 67–69, 72–74
intram_by_civil_defence.py	26	26	0%	2–6, 8, 11, 14, 34–36, 38–39, 42–44, 46, 49, 54, 57, 60–62, 65–67
intram_by_itl1.py	38	38	0%	2–5, 7–8, 10, 13, 36–38, 41, 44–47, 50–52, 55–56, 59–62, 65–66, 69, 72, 75, 78–82, 87–89
intram_by_pg.py	27	27	0%	2–5, 7, 10, 13, 31–33, 36–38, 40, 43–45, 48, 51, 54, 57–60, 65–67
intram_by_sic.py	36	36	0%	2–5, 7, 10, 13, 32–34, 36, 39–40, 43–45, 47, 50–52, 57, 73–77, 79, 82, 85, 88, 91, 94–95, 98–100
long_form.py	22	22	0%	2–5, 7–9, 11, 14, 33–35, 38, 41, 44, 47, 50–52, 54–56
manifest_output.py	78	78	0%	1–4, 8, 11–12, 15, 33, 48–51, 54–55, 59–60, 65–66, 68, 71–75, 78–84, 86, 104–105, 112, 114–115, 122, 125, 127, 129, 131, 135, 145, 150–151, 157, 160–161, 163–164, 172–175, 182, 189, 191, 196, 198–200, 202–203, 205–206, 208–211, 213, 216, 218, 224–225, 228–229
map_output_cols.py	58	50	13%	21–22, 24, 27–28, 30–31, 34, 52–53, 56, 59, 62, 65, 67, 69, 71–72, 89, 99, 102, 107, 110–111, 114, 116, 131–132, 134, 137, 139, 158–159, 162, 165, 168, 171, 173, 175, 177, 179–180, 199, 201, 203–204, 207–208, 211–212
ni_sas.py	24	24	0%	2–9, 11, 14, 36–38, 41, 43, 46, 49, 54, 57–59, 62–64
outputs_helpers.py	23	10	56%	46, 53–55, 79–81, 84, 87, 89
short_form.py	40	19	52%	78, 85, 87, 110–112, 115, 118, 121, 124, 127, 130, 133, 136–138, 140–142
status_filtered.py	12	6	50%	29–31, 33–35
tau.py	28	28	0%	2–8, 10, 13, 33–35, 38, 41, 46, 49, 52, 55, 58, 61, 64, 69, 72–74, 77–79
total_fte.py	14	14	0%	2–5, 8, 11, 24–25, 27–28, 34, 39–41
src/site_apportionment
__init__.py	0	0	100%
site_apportionment.py	69	69	0%	1–4, 6, 8, 11–19, 22–23, 26, 39, 42–49, 52, 71–73, 76, 79, 84, 87, 90, 92, 95, 122, 125, 128, 132, 135, 138, 141–142, 145–146, 149–151, 154, 157, 160–161, 164, 167, 170–173, 178, 181, 184–185, 188, 191, 194, 198
src/staging
__init__.py	0	0	100%
history_loader.py	32	2	93%	42, 54
spp_parser.py	14	0	100%
spp_snapshot_processing.py	34	0	100%
staging_helpers.py	144	110	23%	51–53, 55–57, 59, 69, 72, 97–103, 120, 123, 132–134, 137, 140–143, 145, 147, 166–167, 169–170, 172, 211, 214, 217, 220, 223, 226–227, 230, 233, 235, 237, 240, 243, 258–260, 262–264, 266–268, 272–273, 277, 279, 282, 284, 306–307, 311, 316–318, 340, 343, 345, 348, 353–355, 358–359, 362, 364, 369, 375, 398–399, 402, 407–408, 411, 414, 418, 425, 456–459, 461–462, 465–466, 502, 505–508, 511, 514, 517–520, 523, 525
validation.py	223	66	70%	17–18, 73, 206, 208, 308–309, 336, 339, 386–387, 398, 422, 436–437, 443, 447, 455–456, 462–463, 502–503, 510, 512, 515, 517, 538, 540–541, 544–545, 548–550, 552, 554, 557–558, 560, 562–563, 634, 636–637, 640–641, 644–645, 648, 651–652, 655, 657, 659–660, 673, 675–681, 684, 687
src/utils
__init__.py	0	0	100%
helpers.py	17	5	70%	14–15, 19–20, 22
local_file_mods.py	105	44	58%	33–38, 83, 133–135, 183–184, 195–199, 210–211, 222, 233, 244–245, 247, 258–259, 270–271, 280, 288, 299, 301–302, 304–305, 309–311, 315, 329–330, 333–334, 336
TOTAL	2267	1627	28%

Summary of tests

Tests	Skipped	Failures	Errors	Time
52	0 💤	0 ❌	0 🔥	1.249s ⏱️

coatet

All the code looks good on a read-through, but I'm getting some errors when running it. I can't seem to highlight specific lines in some files, so I'm putting them here.

Line 11 of staging_main:

from src.staging import pg_conversion as pg

This fails since it's been moved to imputation.

Line 289 of staging_main:

# Map PG from SIC/PG numbers to column '201'.
full_responses = pg.run_pg_conversion(full_responses, pg_num_alpha, sic_pg_alpha_mapper, target_col="201")

This fails with "got an unexpected keyword argument 'target_col'", since it looks like the argument should be pg_column instead of target_col.
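
For reference, the keyword mismatch can be reproduced in isolation. The function below is a hypothetical stand-in that only mimics the shape of the signature described above (a pg_column keyword rather than target_col); the real run_pg_conversion body and its full argument list are not shown in this thread.

```python
def run_pg_conversion(df, pg_num_alpha, sic_pg_alpha_mapper, pg_column="201"):
    """Hypothetical stand-in: only the keyword name matters for this sketch."""
    return df


def call_with_keyword(**kwargs):
    """Call the stand-in and return the TypeError message, or None on success."""
    try:
        run_pg_conversion({}, {}, {}, **kwargs)
    except TypeError as err:
        return str(err)
    return None


# The old call site passes target_col, which the stand-in does not accept.
old_call_error = call_with_keyword(target_col="201")
# Renaming the keyword to pg_column lets the call go through.
new_call_error = call_with_keyword(pg_column="201")
```

Here old_call_error holds the "unexpected keyword argument" message and new_call_error is None, which matches the fix suggested above of renaming the keyword at the call site.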

When I fixed these, I got it to run as far as line 46 in pg_conversion, but that raises what looks like an error in the columns. I'm not sure how to fix that.

AnneONS added 2 commits January 11, 2024 17:47

update tests

7fdcfa0

move pg_conversion to imputation

53bb094

coatet requested changes Jan 12, 2024

src/imputation/pg_conversion.py: four review threads, all resolved (one marked outdated)

AnneONS added 5 commits January 15, 2024 10:09

648 minor changes

8b0176a

add exception if mapper not working

eb637e2

remove duplicate line from config

58e7e57

remove unnecessary pg conversion from NI sas

86a91e3

Merge remote-tracking branch 'origin/develop' into 648_move_pg_numeric

76d0f9d

coatet approved these changes Jan 15, 2024

coatet merged commit 829f3e9 into develop Jan 15, 2024
1 check passed
