browsertrix/backend/test
Tessa Walsh df4c4e6c5a
Optimize workflow statistics updates (#892)
* optimizations:
- rename update_crawl_config_stats to stats_recompute_all, only used in migration to fetch all crawls
and do a full recompute of all file sizes
- add stats_recompute_last to only get last crawl by size, increment total size by specified amount, and incr/decr number of crawls
- Update migration 0007 to use stats_recompute_all
- Add isCrawlRunning, lastCrawlStopping, and lastRun to
stats_recompute_last
- Increment crawlSuccessfulCount in stats_recompute_last
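
A minimal in-memory sketch of the incremental update described above, not the actual implementation. It assumes the workflow is a plain dict. The `totalSize` and `crawlCount` field names, the `successful` and `finished` keys on the crawl, and the function name and signature are all hypothetical; only `isCrawlRunning`, `lastCrawlStopping`, `lastRun`, and `crawlSuccessfulCount` appear in this commit message.

```python
def stats_recompute_last_sketch(workflow, last_crawl, size_delta, crawl_delta):
    """Hypothetical analogue of an incremental workflow stats update.

    Instead of re-reading every crawl, adjust the running totals by the
    given deltas and refresh the "last crawl" fields from one crawl only.
    """
    # Increment (or decrement) totals by the specified amounts.
    workflow["totalSize"] = workflow.get("totalSize", 0) + size_delta
    workflow["crawlCount"] = workflow.get("crawlCount", 0) + crawl_delta

    if last_crawl is not None:
        finished = last_crawl.get("finished")  # assumed: None while running
        workflow["lastRun"] = finished
        workflow["isCrawlRunning"] = finished is None
        workflow["lastCrawlStopping"] = bool(last_crawl.get("stopping"))
        # Count a success only when a crawl was added and it succeeded.
        if crawl_delta > 0 and last_crawl.get("successful"):
            workflow["crawlSuccessfulCount"] = (
                workflow.get("crawlSuccessfulCount", 0) + 1
            )
    return workflow
```

A full recompute (the role the message assigns to stats_recompute_all) would instead iterate over every crawl, which is why it is reserved for the migration.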

* operator/crawls:
- operator: keep track of filesAddedSize in redis as well
- rename update_crawl to update_crawl_state_if_changed() and only update
if state is different, otherwise return false
- ensure mark_finished() operations only occur if crawl state has changed
- don't clear 'stopping' flag, can track if crawl was stopped
- state always starts with "starting", don't reset to starting
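
A minimal sketch of the "update only if changed" pattern described above, assuming a crawl is a plain dict with a `state` key. The function names, signatures, and the `on_finished` callback are hypothetical stand-ins, not the operator's real code.

```python
def update_state_if_changed_sketch(crawl, new_state):
    """Set the crawl state only when it differs; report whether it changed."""
    if crawl.get("state") == new_state:
        return False
    crawl["state"] = new_state
    return True


def mark_finished_sketch(crawl, new_state, on_finished):
    """Run finish-time side effects only on an actual state transition."""
    if not update_state_if_changed_sketch(crawl, new_state):
        # State was already set, so skip the side effects entirely.
        return False
    on_finished(crawl)
    return True
```

Gating the side effects on the return value keeps repeated calls with the same state from running the finish-time work (such as stats updates) more than once.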

* tests:
- Add test for incremental workflow stats updating
- don't clear stopping==true, indicates crawl was manually stopped

---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2023-05-26 22:57:08 -07:00
..
__init__.py
conftest.py Rework collections to track collections in Crawl (#878) 2023-05-25 15:41:50 -04:00
test_collections.py Rework collections to track collections in Crawl (#878) 2023-05-25 15:41:50 -04:00
test_crawl_config_search_values.py Update collections backend API (#759) 2023-04-14 12:17:18 -04:00
test_crawl_config_tags.py Update collections backend API (#759) 2023-04-14 12:17:18 -04:00
test_crawlconfigs.py Optimize workflow statistics updates (#892) 2023-05-26 22:57:08 -07:00
test_filter_sort_results.py Rework collections to track collections in Crawl (#878) 2023-05-25 15:41:50 -04:00
test_invites.py Paginate API list endpoints (#659) 2023-03-06 14:41:25 -05:00
test_login.py
test_org.py Paginate API list endpoints (#659) 2023-03-06 14:41:25 -05:00
test_permissions.py Paginate API list endpoints (#659) 2023-03-06 14:41:25 -05:00
test_run_crawl.py Rework collections to track collections in Crawl (#878) 2023-05-25 15:41:50 -04:00
test_settings.py tests: fixes for crawl cancel + crawl stopped (#864) 2023-05-22 20:17:29 -07:00
test_stop_cancel_crawl.py Optimize workflow statistics updates (#892) 2023-05-26 22:57:08 -07:00
test_users.py
test_workflow_auto_add_to_collection.py Rework collections to track collections in Crawl (#878) 2023-05-25 15:41:50 -04:00