Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix custom line item parse errors in statement of activities #83

Open
kwheelan opened this issue Jul 22, 2024 · 17 comments
Open

Fix custom line item parse errors in statement of activities #83

kwheelan opened this issue Jul 22, 2024 · 17 comments
Assignees
Labels
bug high priority python Task to implement ACFR parsing on the backend xbrl

Comments

@kwheelan
Copy link
Collaborator

kwheelan commented Jul 22, 2024

Overview

The custom line items should parse into well-formed inline XBRL, which should then show correctly in the interactive viewer. However, there are some issues with the custom line items (and some issues that also occur with non-custom items).

Well-formed iXBRL for a cell in the Statement of Activities table will look something like this:


<td id="StatementofActivities_C91" class="row">
   <ix:nonFraction contextRef="D20220630_GovernmentalActivities" name="acfr:ExpensesCustom" unitRef="USD" id="StatementofActivities_C91" decimals="0" format="ixt:num-dot-decimal" >205,445</ix:nonFraction>
</td>

And show the dimensions, concept (XBRL tag), and properties in the viewer:
Screenshot 2024-08-02 at 10 53 48 PM

Problem 1: no concept (tag)

Some of the line items don't show a concept:
Screenshot 2024-08-02 at 10 54 29 PM

This is because the concept (ixbrl tag) should be stored in the name attribute, and something is going wrong in the conversion, putting nan in there:

<td id="StatementofActivities_C90" class="row">
   $ <ix:nonFraction contextRef="Ddate_parse_error_GovernmentalActivities" name="nan" unitRef="USD" id="StatementofActivities_C90" decimals="0" format="ixt:num-dot-decimal" >239,296</ix:nonFraction>
 </td>

There's also a date parse error, but I'll separate that into a different issue (#85).

Problem 2: not recognizing the ixbrl tag at all

Screenshot 2024-08-02 at 10 54 43 PM
<td id="StatementofActivities_D91" class="row">
  <ix:nonFraction contextRef="Ddate_parse_error_ProgramRevenuesFromChargesForServices_GovernmentalActivities" name=" acfr:RevenueForOtherProgramsCustom" unitRef="USD" id="StatementofActivities_D91" decimals="0" format="ixt:num-dot-decimal" >22,761</ix:nonFraction>
 </td>

I think this issue is because there is a space before the name attribute. It should be name="acfr:RevenueForOtherProgramsCustom" instead of name=" acfr:RevenueForOtherProgramsCustom".

Solutions

To fix the second type of issue I think there are 2 options:

1. Excel fix

Edit the formulas in the Excel sheet to remove the spaces between the XBRL tags when the user identifies a custom tag. Right now, the custom tags aren't consistent with the normal ones:
Screenshot 2024-08-02 at 11 14 36 PM

The Excel formula is:

=IF(B10="", "Choose from drop-down -->", IF(COUNTIF('Lookup GovWide Stmt Activities'!$B:$B, B10) = 0, "acfr:ExpensesCustom, acfr:RevenueForOtherProgramsCustom, acfr:NetExpenseRevenueCustom", _xlfn.CONCAT(_xlfn.XLOOKUP(B10, 'Lookup GovWide Stmt Activities'!$B:$B, 'Lookup GovWide Stmt Activities'!$D:$D), ",", _xlfn.XLOOKUP(B10, 'Lookup GovWide Stmt Activities'!$B:$B, 'Lookup GovWide Stmt Activities'!$C:$C), ",", _xlfn.XLOOKUP(B10, 'Lookup GovWide Stmt Activities'!$B:$B, 'Lookup GovWide Stmt Activities'!$E:$E))))

We can fix the issue by editing "acfr:ExpensesCustom, acfr:RevenueForOtherProgramsCustom, acfr:NetExpenseRevenueCustom" to "acfr:ExpensesCustom,acfr:RevenueForOtherProgramsCustom,acfr:NetExpenseRevenueCustom"

2. Python fix

Editing all the Excel formulas might be a bit tedious, so instead you could edit

xbrl_list = str(row["xbrl_element"]).strip().split(',')
to handle cases where the delimiter is ',' and also where it is , .

Fixing the no-concept issue

I have less guidance on the no concept issue. The nan in the name attribute is likely coming from incorrect data parsing when creating Cell objects in the process_cells() method for the StatementofActivities class.

def process_cells(self):
"""Create a list of Cell objects to represent Excel data"""
for col_name in self._df['header'].unique():
# Grab all cells in the given column
rows = self._df[self._df['header'] == col_name]
# initialize section
section = ""
for _, row in rows.iterrows():
# for each column, create corresponding context
if col_name == "expenses":
# avoid assuming that this is a typed dim for FundIdentifierAxis
dims = []
else:
dims = [Dimension(col_name)]
# if the current section is a dimension, record it unless the column contradicts it
if section in axis_dict and col_name not in DIMENSIONS["TypeOfGovernmentUnit"]:
dims.append(Dimension(section))
# create new context for the column and section
context = self.add_context(col_name, dims)
# process xbrl tags according to the relevant column groups
# (see color coding on Excel sheet)
xbrl_list = str(row["xbrl_element"]).strip().split(',')
if len(xbrl_list) == 1 or col_name == "expenses":
# first tag in list from relevant cell in column A in the excel sheet
xbrl_tag = xbrl_list[0]
elif col_name in DIMENSIONS["TypeOfProgramRevenues"]:
# second tag in list from relevant cell in column A (revenue)
xbrl_tag = xbrl_list[1]
else:
# third tag in list from relevant cell in column A (net revenue/deficit)
xbrl_tag = xbrl_list[2]
# create cell data
if xbrl_tag != "Choose from drop-down -->":
cell = Cell(id = row["id"],
xbrl_tag = xbrl_tag,
row_name = str(row["nan"]),
col_name = str(row["header"]),
value = row["value"],
context = context)
self._data.append(cell)
# reassign the relevant section name if appropriate
if cell.tr_class() in ["section_header", "subsection_header"]:
section = clean(cell.row_name())

If you can't get anywhere troubleshooting this, I'm happy to take a closer look.

@kwheelan kwheelan added python Task to implement ACFR parsing on the backend xbrl labels Jul 22, 2024
@lucakato
Copy link
Collaborator

@kwheelan Hi. Could you explain what this feature should do? what examples are there that I can use to test and work on this?

@kwheelan
Copy link
Collaborator Author

@lucakato Sure, I will add more info. In the meantime, could you finish resolving #82 (I've created pull request #84 to close that issue and #81)? This issue would be good to tackle after merging those changes

@lucakato
Copy link
Collaborator

@kwheelan will do

@kwheelan kwheelan changed the title Add capacity to parse new custom line items Fix custom line item parse errors in statement of activities Aug 3, 2024
@kwheelan
Copy link
Collaborator Author

kwheelan commented Aug 3, 2024

@lucakato Just added detail in the description. Please create a new branch, and when you're ready for me to take a look, create a pull request for this issue's branch into dev and add me as a reviewer to the PR. Thanks

@lucakato
Copy link
Collaborator

lucakato commented Aug 6, 2024

@kwheelan do you know which files you tested for the no concept tag issue? when I first tried it just with Ogemaw excel file the concept showed for the same number as yours. I still modified the python code for handling white space and would like to test it. Or did you change on the Excel side?
branch is '83-custom line'
image

@kwheelan
Copy link
Collaborator Author

kwheelan commented Aug 6, 2024

@lucakato I used the Ogemaw file you sent me a few weeks ago (updated_ACFR_Ogemaw_template (2).xlsx), but maybe Sarrah uploaded a more recent template since? Which one did you test here?

@lucakato
Copy link
Collaborator

lucakato commented Aug 6, 2024

@kwheelan I used the same ogemaw file, with the random number that was causing issues removed.
updated_ACFR_Ogemaw_template.xlsx

@kwheelan
Copy link
Collaborator Author

kwheelan commented Aug 6, 2024

@lucakato Weird. Using your version of the file I can see the concept for the first row but the same issue persists lower in the table:

Screenshot 2024-08-06 at 1 34 04 PM

@lucakato
Copy link
Collaborator

lucakato commented Aug 6, 2024

@kwheelan ok I will look into it. Inspecting the html file it looks like this for that part:
<ix:nonFraction contextRef="Ddate_parse_error_GovernmentalActivities" name="acfr:RevenueUsedForGeneralGovernmentServicesAdministration" unitRef="USD" id="StatementofActivities_C95" decimals="0" format="ixt:num-dot-decimal" >1,974,877</ix:nonFraction>

which seems correct(?). I also see it in the Lookup table. Do you think it's some other issue?
image

@kwheelan
Copy link
Collaborator Author

kwheelan commented Aug 16, 2024

@lucakato

Sorry I missed this comment earlier. Seems like there are 2 issues. The first is that this tag should actually be acfr:ExpensesForGeneralGovernmentServicesAdministration not revenues. The next two cells should be revenue, but are expenses. This is an issue throughout the table, and probably is caused by a bug in the process_cells() method for the StatementOfActivities class:

def process_cells(self):

I think the second issue is that General Government Administration doesn't seem to be in the ACFR taxonomy (found on YETI.) This would explain why the concepts aren't showing in the viewer.

I'm going on vacation for the next week, but hopefully this gives you enough info to keep working on it. Marc may also be able to help confirm if the General Government Administration tags are in the taxonomy.

@lucakato
Copy link
Collaborator

@kwheelan thanks for the comment I didn't realize it should have been expenses! I'll look into it.

@lucakato
Copy link
Collaborator

lucakato commented Aug 22, 2024

@kwheelan @nfitz1
I think I figured out where the issue is happening and I want to check if I'm correct. In the Ogemaw template (attached) that I received, if you go to Statement of Activities tab, it has acfr:RevenueUsedForGeneralGovernmentServicesAdministration as a tag. Should this be included or removed from the spreadsheet?

What happens with the code is it sees the column 'Expenses' and then it runs this line in statement_of_activites.py and assigns the first tag for the xbrl_tag.

                if len(xbrl_list) == 1 or col_name == "expenses":
                    # first tag in list from relevant cell in column A in the excel sheet
                    xbrl_tag = xbrl_list[0]
image

updated_ACFR_Ogemaw_template.xlsx

@kwheelan
Copy link
Collaborator Author

@lucakato You're right; the order is wrong on all the non-custom XBRL tags in that column. We do want to include acfr:RevenueUsedForGeneralGovernmentServicesAdministration, but the order is wrong. The list should be expense tag, revenue tag, net expense (revenue) tag. The same issue happens with the Clayton example; I've just never noticed the swapped tags before.

So we just need to change the Excel function in column A for the Statement of Activities for rows 9+ to:

=IF(B14="", "Choose from drop-down -->", IF(COUNTIF('Lookup GovWide Stmt Activities'!$B:$B, B14) = 0, "acfr:ExpensesCustom, acfr:RevenueForOtherProgramsCustom, acfr:NetExpenseRevenueCustom", _xlfn.CONCAT(_xlfn.XLOOKUP(B14, 'Lookup GovWide Stmt Activities'!$B:$B, 'Lookup GovWide Stmt Activities'!$C:$C), ",", _xlfn.XLOOKUP(B14, 'Lookup GovWide Stmt Activities'!$B:$B, 'Lookup GovWide Stmt Activities'!$D:$D), ",", _xlfn.XLOOKUP(B14, 'Lookup GovWide Stmt Activities'!$B:$B, 'Lookup GovWide Stmt Activities'!$E:$E))))

(Basically just switch 'Lookup GovWide Stmt Activities'!$C:$C and 'Lookup GovWide Stmt Activities'!$D:$D to swap the position of the revenue and expense tags.)

I'm skeptical that this will fix all the parsing errors, but it should be a good start. Let me know how it works

@lucakato
Copy link
Collaborator

@kwheelan
image
Do you mean like this? Attached is my updated Ogemaw file:
Tried it but I still don't see the concept:
image
updated_ACFR_Ogemaw_template.xlsx

@kwheelan
Copy link
Collaborator Author

@lucakato Yes the order looks right in that cell (assuming the part of the cell that's not visible is the full expenses tag).

But I think this only fixes part of the issue. The second issue is that General Government Administration doesn't seem to be in the ACFR taxonomy (found on YETI). This would explain why the concepts aren't showing in the viewer.

The concepts (RevenueUsedForGeneralGovernmentServicesAdministration and ExpensesForGeneralGovernmentServicesAdministration) appear in the Excel version of the taxonomy under 200000 - Statement - Activities - Expenses and Revenues for Programs but not in YETI. Looking at the taxonomy in YETI, maybe the right tag is acfr:ExpensesForGeneralGovernmentServices? This is a question for @joffemd

We'll want to update the Statement of Activities lookup table in Excel to match the tag we decide on here.

@lucakato
Copy link
Collaborator

@joffemd Here are the templates I use for your reference. Thank you.
Clayton.Excel.xlsx
updated_ACFR_Ogemaw_template.xlsx

@joffemd
Copy link
Contributor

joffemd commented Aug 30, 2024

@lucakato I will send this to my contact in India and ask him to create templates for two other cities so that we can do further testing...

Also, sorry, after reading your email and this thread, I see that you are seeking an input from me. I may not be able to answer ahead of vacation and apologize for the inconvenience.

For now, please use acfr:ExpensesForGeneralGovernmentServices as Katrina recommends.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug high priority python Task to implement ACFR parsing on the backend xbrl
Projects
None yet
Development

No branches or pull requests

3 participants