browsertrix/backend/btrixcloud
Ilya Kreymer 4f676e4e82
QA Runs Initial Backend Implementation (#1586)
Supports running QA Runs via the QA API!

Builds on top of the `issue-1498-crawl-qa-backend-support` branch, fixes
#1498

Also requires the latest Browsertrix Crawler 1.1.0+ (from
webrecorder/browsertrix-crawler#469 branch)

Notable changes:
- QARun objects contain info about QA runs, which are crawls
performed on data loaded from existing crawls.

- Various crawl db operations can be performed on either the crawl or the
`qa.` object, and core crawl fields have been moved to `CoreCrawlable`.

- While running, `QARun` data is stored in a single `qa` object, while
finished QA runs are added to the `qaFinished` dictionary on the Crawl. The
QA list API returns data from the finished list, sorted by most recent
first.

- Includes additional type fixes / type safety, especially around
BaseCrawl / Crawl / UploadedCrawl functionality, and adds specific
get_upload(), get_basecrawl(), and get_crawl() getters for internal use,
plus get_crawl_out() for the API.

- Supports filtering and sorting pages via `qaFilterBy` (`screenshotMatch`, `textMatch`)
along with `gt`, `lt`, `gte`, `lte` params to return pages based on QA results.
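
As a rough illustration of the page filtering described above, the sketch below shows how `qaFilterBy` and the `gt` / `lt` / `gte` / `lte` params could be turned into a MongoDB-style range query. This is not the code from this commit: the function name `build_qa_page_filter`, the `qa_run_id` argument, and the `qa.<run id>.<field>` document path are all assumptions made for the example.

```python
from typing import Any, Dict, Optional

# QA result fields named in the commit message as valid qaFilterBy values
ALLOWED_QA_FIELDS = {"screenshotMatch", "textMatch"}


def build_qa_page_filter(
    qa_run_id: str,
    qa_filter_by: Optional[str] = None,
    gt: Optional[float] = None,
    gte: Optional[float] = None,
    lt: Optional[float] = None,
    lte: Optional[float] = None,
) -> Dict[str, Any]:
    """Build a MongoDB-style query dict for pages based on QA results.

    Hypothetical helper: the "qa.<run id>.<field>" path is an assumed
    document layout, not necessarily the one used by the backend.
    """
    query: Dict[str, Any] = {}
    if qa_filter_by is None:
        # No QA filter requested, so no extra constraints are added
        return query

    if qa_filter_by not in ALLOWED_QA_FIELDS:
        raise ValueError(f"unsupported qaFilterBy value: {qa_filter_by}")

    # Collect only the comparison bounds that were actually supplied
    bounds: Dict[str, float] = {}
    for op, value in (("$gt", gt), ("$gte", gte), ("$lt", lt), ("$lte", lte)):
        if value is not None:
            bounds[op] = value

    if bounds:
        query[f"qa.{qa_run_id}.{qa_filter_by}"] = bounds
    return query
```

For example, `build_qa_page_filter("run1", "screenshotMatch", gte=0.5, lt=0.9)` would produce a single range condition on the assumed `qa.run1.screenshotMatch` field, which could then be combined with the usual crawl and org constraints before querying pages.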

---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2024-03-20 22:42:16 -07:00
migrations QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00
operator QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00
__init__.py
auth.py Additional Type Hints / Type Fix Pass (#1320) 2023-10-30 12:59:24 -04:00
background_jobs.py QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00
basecrawls.py QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00
colls.py QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00
crawlconfigs.py QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00
crawlmanager.py QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00
crawls.py QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00
db.py QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00
emailsender.py Email Templates (#1375) 2023-11-15 15:22:12 -08:00
invites.py Email Templates (#1375) 2023-11-15 15:22:12 -08:00
k8sapi.py QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00
main_op.py QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00
main.py QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00
models.py QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00
orgs.py QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00
pages.py QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00
pagination.py Format backend with Black 24 (#1507) 2024-02-07 11:35:34 -08:00
profiles.py Support multiple crawler versions (#1420) 2024-01-16 15:32:12 -08:00
storages.py QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00
uploads.py QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00
users.py Add API endpoints for crawl statistics (#1461) 2024-01-10 13:30:47 -08:00
utils.py Add endpoints to read pages from older crawl WACZs into database (#1562) 2024-03-19 14:14:21 -07:00
version.py version: bump to 1.10.0-beta.0 2024-02-20 00:22:29 -08:00
webhooks.py QA Runs Initial Backend Implementation (#1586) 2024-03-20 22:42:16 -07:00