browsertrix/backend/btrixcloud/migrations
Tessa Walsh df4c4e6c5a
Optimize workflow statistics updates (#892)
* optimizations:
- rename update_crawl_config_stats to stats_recompute_all, only used in migration to fetch all crawls
and do a full recompute of all file sizes
- add stats_recompute_last to fetch only the last crawl by size, increment total size by a specified amount, and increment/decrement the number of crawls
- Update migration 0007 to use stats_recompute_all
- Add isCrawlRunning, lastCrawlStopping, and lastRun to
stats_recompute_last
- Increment crawlSuccessfulCount in stats_recompute_last
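
The difference between a full recompute and an incremental update can be sketched roughly as follows. This is a minimal in-memory illustration, not the code from this commit: the `WorkflowStats` class, the function names `recompute_all_sketch` and `recompute_last_sketch`, and their parameters are all hypothetical, and the real stats_recompute_all / stats_recompute_last operate on the database rather than on Python lists.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkflowStats:
    """Hypothetical container for per-workflow counters."""
    crawl_count: int = 0
    total_size: int = 0
    crawl_successful_count: int = 0


def recompute_all_sketch(crawl_file_sizes: List[List[int]]) -> WorkflowStats:
    """Full recompute: visit every crawl and sum all of its file sizes."""
    stats = WorkflowStats()
    for sizes in crawl_file_sizes:
        stats.crawl_count += 1
        stats.total_size += sum(sizes)
    return stats


def recompute_last_sketch(
    stats: WorkflowStats,
    size_delta: int,
    inc_crawls: int = 0,
    successful: bool = False,
) -> WorkflowStats:
    """Incremental update: adjust existing counters by a delta only."""
    stats.total_size += size_delta
    stats.crawl_count += inc_crawls  # may be negative, e.g. on crawl deletion
    if successful:
        stats.crawl_successful_count += 1
    return stats


# Both paths should agree on the totals for the same crawls.
full = recompute_all_sketch([[10, 20], [5]])
incremental = WorkflowStats()
recompute_last_sketch(incremental, 30, inc_crawls=1, successful=True)
recompute_last_sketch(incremental, 5, inc_crawls=1)
```

The incremental path touches only the counters, so its cost does not grow with the number of crawls in a workflow, which is presumably why the full recompute is reserved for the migration.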

* operator/crawls:
- operator: keep track of filesAddedSize in redis as well
- rename update_crawl to update_crawl_state_if_changed() and only update
if state is different, otherwise return false
- ensure mark_finished() operations only occur if crawl state has changed
- don't clear 'stopping' flag, can track if crawl was stopped
- state always starts with "starting", don't reset to starting
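
The "update only if the state changed" pattern described above might look something like this. It is a simplified sketch under stated assumptions: the crawl is modelled as a plain dict, `update_state_if_changed_sketch` and `mark_finished_sketch` are hypothetical names, and the list passed as `finished_log` stands in for whatever side effects the real mark_finished() performs.

```python
from typing import Dict, List


def update_state_if_changed_sketch(crawl: Dict[str, str], new_state: str) -> bool:
    """Set the state only when it differs; report whether anything changed."""
    if crawl.get("state") == new_state:
        return False
    crawl["state"] = new_state
    return True


def mark_finished_sketch(
    crawl: Dict[str, str], state: str, finished_log: List[str]
) -> bool:
    """Run finish-time side effects once, guarded by the state change."""
    if not update_state_if_changed_sketch(crawl, state):
        # State was already set, so the side effects have already run.
        return False
    finished_log.append(crawl["id"])  # stand-in for stats updates, cleanup, etc.
    return True


crawl = {"id": "c1", "state": "starting"}
log: List[str] = []
first = mark_finished_sketch(crawl, "complete", log)
second = mark_finished_sketch(crawl, "complete", log)  # repeated call is a no-op
```

Returning a boolean from the state update lets a caller that is invoked repeatedly for the same crawl avoid applying finish-time updates more than once.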

* tests:
- Add test for incremental workflow stats updating
- don't clear stopping==true, indicates crawl was manually stopped

---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2023-05-26 22:57:08 -07:00
__init__.py Refactor to use new operator on backend (#789) 2023-04-24 18:30:52 -07:00
migration_0001_archives_to_orgs.py Refactor to use new operator on backend (#789) 2023-04-24 18:30:52 -07:00
migration_0002_crawlconfig_crawlstats.py CrawlConfig migration and crawl stats query optimization (#633) 2023-02-24 18:01:15 -08:00
migration_0003_mutable_crawl_configs.py Fix migration to avoid jobType KeyError (#727) 2023-03-27 13:52:05 -07:00
migration_0004_config_seeds.py Refactor to use new operator on backend (#789) 2023-04-24 18:30:52 -07:00
migration_0005_operator_scheduled_jobs.py backend: fixes to 0005 migration: (#843) 2023-05-10 12:00:41 +02:00
migration_0006_precompute_crawl_stats.py Optimize workflow statistics updates (#892) 2023-05-26 22:57:08 -07:00
migration_0007_colls_and_config_update.py Optimize workflow statistics updates (#892) 2023-05-26 22:57:08 -07:00