Describe the bug
After integrating the 2023 923 monthly data into the EIA tables, I noticed that the row count for the eia_monthly-gen_eia923 table went down instead of up. I was able to track this bug back to the associate_generator_tables() function within the allocate_gen_fuel_by_generator_energy_source() function. This is where the plants are first dropped, but I couldn't figure out why.
I think it's related to the fact that the boiler_generator_assn table doesn't have 2023 data. Data gets truncated by the gen_fuel_by_gen_esc() function and by select_input_data(), which limit the collected data to 2022 because there is no monthly 860 table for the boiler-generator association (bga). This limitation to 2022 seems to break the way the 2023 data is passed to the other tables, which might explain how some of the data is eventually dropped.
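One way to check this hypothesis is to compare row counts by report year between the two builds and see which years lose rows. The sketch below is only an illustration: the helper name row_count_change_by_year is hypothetical, and it assumes you have already extracted a report year for each row of the two gen_eia923 outputs. The toy values are made up.

```python
from collections import Counter


def row_count_change_by_year(years_before, years_after):
    """Return {year: rows_after - rows_before} for years whose count changed.

    Both arguments are iterables holding one report year per row.
    (Hypothetical helper, not part of PUDL.)
    """
    before = Counter(years_before)
    after = Counter(years_after)
    all_years = sorted(set(before) | set(after))
    # Counter returns 0 for years missing from one side.
    return {y: after[y] - before[y] for y in all_years if after[y] != before[y]}


# Toy stand-ins for the report years of each build's rows (illustrative only).
years_dev = [2021, 2022, 2022]
years_pr = [2021, 2022, 2023]

change = row_count_change_by_year(years_dev, years_pr)
print(change)
```

If the years that lose rows are 2022 and earlier, the 2022 cutoff by itself would not account for the drop, and the cause would have to be somewhere else in the allocation step.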
It's only 0.001% of the overall table, so it didn't feel like a priority to address as part of the 923 monthly integration. However, if you want information on the specific plants getting dropped, then it's an issue.
The following plant_id_eia values are missing from the gen_eia923 output table once the 923 monthly data has been integrated:
[64927, 61852, 64551, 64544, 63233, 65793, 62248]
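A list like this can be produced with a set difference over the plant IDs of the two outputs. This is a minimal sketch rather than the code used for this report: missing_ids is a hypothetical helper and the toy IDs are invented. With DataFrames, the plant_id_eia column of each output could be passed in directly.

```python
def missing_ids(ids_before, ids_after):
    """Return the sorted IDs present in ids_before but absent from ids_after.

    Both arguments are iterables of plant IDs, one entry per row is fine.
    (Hypothetical helper, not part of PUDL.)
    """
    return sorted(set(ids_before) - set(ids_after))


# Toy stand-ins for the plant_id_eia column of each build (illustrative only).
plant_ids_dev = [1, 2, 3, 3]
plant_ids_pr = [1, 3, 3]

dropped = missing_ids(plant_ids_dev, plant_ids_pr)
print(dropped)
```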
Bug Severity
How badly is this bug affecting you?
Low: The bug isn't causing me problems, but something's still wrong here.
To Reproduce
I built one version of the PUDL db with PR #2936 and one with dev (pre #2936 integration).
I then built the gen_eia923 output table from each build and compared the difference:
The output cat_df should have 64 rows.
Expected behavior
I would expect that these records would not get dropped from the gen_eia923 output table.