browsertrix/backend
Tessa Walsh a85f9496b0
Include number of Identical Files in QA stats and meter (#1848)
This PR adds Identical Files to the QA Page Match Analysis meter bars.
To do this, the backend calculates the number of non-HTML pages once and
includes it under the key `Files` in each of the `screenshotMatch` and
`textMatch` QA stats return arrays.

The backend also subtracts the file count from "No Data" so that
files are not counted twice.
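
The adjustment described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this PR: the helper name `add_file_count`, the bucket shape (`lowerBoundary` / `count` dicts), and the exact label strings are assumptions made for the example.

```python
from typing import Dict, List, Union

Bucket = Dict[str, Union[str, int]]

# Assumed labels; the real backend may spell these differently.
NO_DATA_LABEL = "No Data"
FILES_LABEL = "Files"


def add_file_count(buckets: List[Bucket], file_count: int) -> List[Bucket]:
    """Return a copy of `buckets` with a Files entry appended.

    The file count is subtracted from the "No Data" bucket so that
    non-HTML pages are not counted in both places.
    """
    adjusted: List[Bucket] = []
    for bucket in buckets:
        entry = dict(bucket)  # copy so the caller's data is not mutated
        if entry["lowerBoundary"] == NO_DATA_LABEL:
            # Clamp at zero in case the counts are out of sync.
            entry["count"] = max(int(entry["count"]) - file_count, 0)
        adjusted.append(entry)
    adjusted.append({"lowerBoundary": FILES_LABEL, "count": file_count})
    return adjusted


# Hypothetical stats for one QA run: 3 of the 5 "No Data" pages are files.
screenshot_match = [
    {"lowerBoundary": "0.0", "count": 2},
    {"lowerBoundary": NO_DATA_LABEL, "count": 5},
]
file_count = 3  # computed once, then reused for both stats arrays
screenshot_with_files = add_file_count(screenshot_match, file_count)
```

Because the file count is computed once, the same value can be passed for both `screenshotMatch` and `textMatch`, which keeps the two arrays consistent with each other.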

---------

Co-authored-by: emma <hi@emma.cafe>
2024-06-06 13:15:19 -04:00
btrixcloud Include number of Identical Files in QA stats and meter (#1848) 2024-06-06 13:15:19 -04:00
test Include number of Identical Files in QA stats and meter (#1848) 2024-06-06 13:15:19 -04:00
test_nightly Give test_crawl_timeout 10 mins to finish (#1627) 2024-03-26 18:33:30 -07:00
.pylintrc
Dockerfile Backend mem usage fix - use fixed MOTOR_MAX_WORKERS + switch to gunicorn (#1468) 2024-01-16 15:32:42 -08:00
mypy.ini Support multiple crawler versions (#1420) 2024-01-16 15:32:12 -08:00
requirements.txt Add endpoints to read pages from older crawl WACZs into database (#1562) 2024-03-19 14:14:21 -07:00
test-requirements.txt Add slugs to org backend (#1250) 2023-10-10 18:30:09 -07:00