Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: BOM Update Tool #31072

Merged
merged 22 commits into from
Jun 9, 2022
Merged

perf: BOM Update Tool #31072

merged 22 commits into from
Jun 9, 2022

Conversation

marination
Copy link
Collaborator

@marination marination commented May 19, 2022

First Draft
get_boms_in_bottom_up_order
  • Create child-parent map once and fetch value from child key to get parents
  • Get parents recursively for a leaf node (get all ancestors)
  • Approx. 44 secs for 4lakh 70k BOMS (used to go on for over 5-10 mins earlier, maybe more idk)
update_cost
  • Remove double RM cost calculation
  • Update child items in db as well (optionally)
  • Don't regenerate exploded items for each BOM while updating cost (only in update cost job/bom function)🤦🏻‍♀️
  • Approx 5 minutes for 20k BOMs. [WIP]

There were correctness issues that could all be solved with a newer AlGoRiThM, that took preference. So first correctness >> then perf

New BOM Cost Updation Flow

Dependants = Immediate Parents, Dependencies = Children (parent-child is not accurate really)

Consider the following BOMs and their dependencies:

BOM-structure drawio

  • While updating all BOMs, we update the Raw Material BOMs or leaf nodes first and work our way up
  • An important thing here: We must only update a BOM after all its Dependencies are resolved/updated
  • That means: α (alpha) cannot be updated until A,B,1,2,3,4 are updated
  • For this we update BOMs one level at a time, and within a level we process BOMs parallelly as they are independent.
  • BOMs within each level are split into batches of 20k (takes 5 mins) and queued for updation
  • Once a level is done a Cron Job will prep and start processing the next level

Working:

  • Given the above example, on submitting the BOM Update Log, Raw Material BOMs will be processed first (batch size = 20k). The status will be In Progress

  • These Jobs will keep updating their row in the table in the Log and keep checking if all other jobs of this level are done (via bom_batches table)

  • The Current Level is also maintained in the Progress Section, which will increment as we progress up the levels

  • Once a level is complete and the cron (every 5 mins) determines a level is complete (all rows of current level have status Pending -> Completed), it will prepare next level parent boms (dependants) and kickstart the next level

    Screenshot 2022-05-24 at 2 17 49 PM
  • How do we determine "Parent BOMs" ? : By checking immediate parents with no Dependencies. Here 1 has immediate parents A and α (alpha). A can be considered since it has no unresolved/non-updated Dependencies, but α (alpha) has A and B as unresolved dependencies. So we proceed with A as parent of 1 in the next level.

  • A Cron Job will run every 5 minutes and check for In Progress jobs and start processing the next level if all jobs of current level are completed

  • The Log will be set as Complete when there are no more Dependants/ Parent BOMs to resolve/update


Important change/fix to test

  • Rate updation as per Raw Material Cost as per used to not work, which now does
  • Rates are reset rightfully on save and fetched as per Valuation/Last Purchase/Price List

To Test:

  • Create BOM with 3 levels (RM BOM, Subassembly BOM that uses RM BOM, FG BOM that uses Subassemble BOM)
  • Set RM cost as per Valuation Rate > Change valuation rate of Item in RM BOM > Update all costs via tool > check
  • Set RM cost as per Valuation Rate with different BOM currency (USD)> Change valuation rate of Item in RM BOM > Update all costs via tool > check
  • Set RM cost as per Price List > Change Price List rate of Item in RM BOM > Update all costs via tool > check
  • Set RM cost as per Price List in different Price List currency > Change Price List rate of Item in RM BOM > Update all costs via tool > check

Todo:

  • Clearing BOM Update Logs periodically (or atleast the table in it)

- Create child-parent map once and fetch value from child key to get parents
- Get parents recursively for a leaf node (get all ancestors)
- Approx. 44 secs for 4lakh 70k boms
@github-actions github-actions bot added the needs-tests This PR needs automated unit-tests. label May 19, 2022
@codecov
Copy link

codecov bot commented May 19, 2022

Codecov Report

Merging #31072 (3fa0a46) into develop (5c69378) will decrease coverage by 0.04%.
The diff coverage is 86.23%.

@@             Coverage Diff             @@
##           develop   #31072      +/-   ##
===========================================
- Coverage    63.50%   63.46%   -0.05%     
===========================================
  Files          986      988       +2     
  Lines        67457    67574     +117     
===========================================
+ Hits         42838    42885      +47     
- Misses       24619    24689      +70     
Impacted Files Coverage Δ
...t/accounts/report/sales_register/sales_register.py 73.68% <ø> (-8.19%) ⬇️
erpnext/e_commerce/redisearch_utils.py 36.42% <ø> (ø)
erpnext/hooks.py 100.00% <ø> (ø)
...cturing/doctype/bom_update_tool/bom_update_tool.py 79.31% <0.00%> (-1.94%) ⬇️
erpnext/loan_management/doctype/loan/loan.py 67.78% <44.44%> (ø)
...ext/stock/doctype/stock_entry/stock_entry_utils.py 93.33% <66.66%> (ø)
...facturing/doctype/bom_update_log/bom_update_log.py 89.89% <87.67%> (+5.37%) ⬆️
...uring/doctype/bom_update_log/bom_updation_utils.py 88.11% <88.11%> (ø)
erpnext/manufacturing/doctype/bom/bom.py 87.63% <93.47%> (-0.14%) ⬇️
...ounts/doctype/purchase_invoice/purchase_invoice.py 83.48% <100.00%> (ø)
... and 36 more

- Move `get_boms_in_bottom_up_order` in bom update tool’s file
- Remove repeated rm cost update from `update_cost`. `calculate_cost` handles RM cost update
- db_update children in `calculate_cost` optionally
- Don’t call `update_exploded_items` and regenerate exploded items in `update_cost`. They will stay the same (except cost)
- Doc is only used to iterate over items(which wont change) and change rate/amount of rows
- These changes are inserted in db via `db_update`, so no harm
- Tested locally: refetching cached doc after db update, reflects fresh data.
@marination marination added the CI-failing Unit tests or patch tests are failing. label May 19, 2022
@marination marination force-pushed the perf-bom-update-tool branch 4 times, most recently from 1c84087 to ef36f6f Compare May 19, 2022 21:26
- Use the right price list and currency to avoid rate conversion (1000/62.9), since rates are reset correctly now
- Use RM rate based on Price List in BOM. Non stock item has no valuation
@marination marination removed the CI-failing Unit tests or patch tests are failing. label May 19, 2022
@marination marination linked an issue May 23, 2022 that may be closed by this pull request
3 tasks
@marination
Copy link
Collaborator Author

Update cost tests are breaking. Will fix while writing tests

marination and others added 2 commits May 25, 2022 11:21
- Process BOMs level wise and Pause after level is complete
- Cron job will resume Paused jobs, which will again process the new level and pause at the end
- This will go on until all BOMs are updated
- Added Progress section with fields to track updated BOMs in Log
- Cleanup: Add BOM Updation utils file to contain helper functions/sub-functions
- Cleanup: BOM Update Log file will only contain functions that are in direct context of the Log

Co-authored-by: Gavin D'souza <gavin18d@gmail.com>
@marination marination force-pushed the perf-bom-update-tool branch 2 times, most recently from 565d2dc to c6e9d6c Compare May 27, 2022 05:24
- Generate RM-Rate map from Items table (will include subassembly items with rate)
- Function to reset exploded item rate from above map
- `db_update` exploded item rate only if rate is changed
- Via Update Cost, only update exploded items rate, do not regenerate table again
- Exploded Items are regenerated on Save and Replace BOM job
- `calculate_exploded_cost` is run only via non doc events (Update Cost button, Update BOMs Cost Job)
- Separate getting dependants and checking if they are valid (loop within loop led to redundant processing that slowed down function)
- Adding to above, the same dependant(parent) was repeatedly processed as many children shared it. Expensive.
- Use a parent-child map similar to child-parent map to check if all children are resolved
- `map.get()` reduced time: 10 mins -> 0.9s~1 second (as compared to `get_cached_doc` or query)
- Total time: 17 seconds to process 6599 leaf boms and 4.2L parent boms
- Previous Total time: >10 mins (I terminated it due to not wanting to waste time XD)
- If `Update Cost` job is ongoing, then block creation of new ones since all BOMs are updated
- `db_update` in `calculate_rm_cost` only if changed values to reduce redundant row updates
- Misc: Use variable for batch size
- This was done due to stale reads while the background jobs tried updating status of the log
- Added a table where all bom jobs within log will be tracked with what level they are processing
- Cron job will check if table jobs are all processed every 5 mins
- If yes, it will prepare parents and call `process_boms_cost_level_wise` to start next level
- If pending jobs, do nothing
- Current BOM Level is being tracked that helps adding rows to the table
- Individual bom cost jobs (that are queued) will process and update boms > will update BOM Update Batch table row with list of updated BOMs
@marination
Copy link
Collaborator Author

Was an issue with concurrent jobs and incorrect locking sequence causing lost connections to db server.
Fixed and works as expected.
4 jobs parallelly take ~ 12-14 minutes (4*20k = 80,000 BOMs)

- Utility to update cost in all BOMs without cron jobs or background jobs (run immediately)
- Re-use util wherever all bom costs are to be updated
- Skip explicit commits if in test
- Specify company in test records (dirty data sometimes, company wh mismatch)
- Skip background jobs queueing if in test
@marination marination removed the needs-tests This PR needs automated unit-tests. label Jun 7, 2022
@marination marination marked this pull request as ready for review June 7, 2022 11:13
@marination marination force-pushed the perf-bom-update-tool branch 9 times, most recently from 9543c9a to 6dd3263 Compare June 8, 2022 05:18
- Use base_rate for assertions as rate is subject to change due to conversion factor (USD)
- Use qb instead of db.sql
- Don't use `args` as argument for function
- Cleaner variable names
@ankush ankush self-assigned this Jun 8, 2022
@ankush
Copy link
Member

ankush commented Jun 9, 2022

Tested with deeply nested BOM with different costs for each RM and some ops cost.

RM cost

  • Started with everything valued @ 0
  • Update all bottom-most item (RM) to some valuation rate and did BOM update (scheduled job trigger manually)

Exploded item cost == individual subassy costs.
Screenshot 2022-06-09 at 2 05 45 PM


Ops cost

  • Used same BOM, revised and replaced a revision with ops.
  • Ran update tool

Screenshot 2022-06-09 at 2 20 14 PM

Increased levels (by moving this full subassy on another FG)

Screenshot 2022-06-09 at 2 29 02 PM

Partitioning / update logic LGTM 👍


unrelated feature request (I added these columns in BOM explorer (v. hacky implementation) but this might help in actually exploring costs 😬 )

…ta (update cost)

- Remove `auto_commit_on_many_writes` in `update_cost_in_level()` as commits happen every N BOMs
- Auto commit every 50 BOMs
- test: Remove hacky `frappe.flags.in_test` returns
- test: Enqueue `now` if in tests (for update cost and replace bom)
- Replace BOM: Copy bom object to `_doc_before_save` so that version.py finds a difference between the two
- Replace BOM: Add reference to version
- Update Cost: Unset `processed_boms` if Log is completed (useless after completion)
- test: `update_cost_in_all_boms_in_test` works close to actual prod implementation (only call Cron job manually)
- Test: use `enqueue_replace_bom`  so that test works closest to production behaviour

Co-authored-by: Ankush Menat <ankushmenat@gmail.com>
@ankush ankush merged commit 16c8b74 into frappe:develop Jun 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

Performance of BOM Update Tool and Correctness
3 participants