browsertrix/frontend
Ilya Kreymer 00eb62214d
Uploads API: BaseCrawl refactor + Initial support for /uploads endpoint (#937)
* basecrawl refactor: make crawls db more generic, supporting different types of 'base crawls': crawls, uploads, manual archives
- move shared functionality to basecrawl.py
- create a base BaseCrawl object, which contains start / finish time, metadata and files array
- create BaseCrawlOps, base class for CrawlOps, which supports base crawl deletion, querying and collection add/remove

* uploads api: (part of #929)
- new UploadedCrawl object which extends BaseCrawl, has name and description
- support multipart form data upload to /uploads/formdata
- support streaming upload of a single file via /uploads/stream, using botocore multipart upload to upload to s3-endpoint in parts
- require 'filename' param to set upload filename for streaming uploads (otherwise use form data names)
- sanitize filename, place uploads in /uploads/<uuid>/<sanitized-filename>-<random>.wacz
- uploads have internal id 'upload-<uuid>'
- create UploadedCrawl object with CrawlFiles pointing to the newly uploaded files, set state to 'complete'
- handle upload failures, abort multipart upload
- ensure uploads added within org bucket path
- return id / added when adding new UploadedCrawl
- support listing, deleting, and patch /uploads
- support upload details via /replay.json for replay
- add support for 'replaceId=<id>', which would remove all previous files in upload after new upload succeeds. if replaceId doesn't exist, create new upload. (only for stream endpoint so far).
- support patching upload metadata: notes, tags and name on uploads (UpdateUpload extends UpdateCrawl and adds 'name')
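
The naming rules above (sanitized filename, 'upload-<uuid>' id, uploads/<uuid>/<sanitized-filename>-<random>.wacz) could be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `sanitize_filename` and `build_upload_target` are hypothetical names, the allowed character set and the 8-hex-digit random suffix are assumptions, and the path is returned relative to the org bucket path.

```python
import re
import secrets
import uuid
from typing import Tuple


def sanitize_filename(name: str) -> str:
    """Drop any directory part and replace unsafe characters (assumed rule)."""
    # Keep only the last path component so "../" cannot escape the upload dir.
    base = name.replace("\\", "/").rsplit("/", 1)[-1]
    # Collapse every run of characters outside [A-Za-z0-9._-] into one dash.
    base = re.sub(r"[^A-Za-z0-9._-]+", "-", base).strip("-.")
    return base or "upload"


def build_upload_target(filename: str) -> Tuple[str, str]:
    """Return (internal id, storage path) for a new upload (hypothetical helper)."""
    upload_uuid = uuid.uuid4()
    crawl_id = f"upload-{upload_uuid}"

    stem = sanitize_filename(filename)
    # Remove an existing .wacz extension; it is re-added after the suffix.
    if stem.lower().endswith(".wacz"):
        stem = stem[: -len(".wacz")]
    stem = stem.rstrip("-.") or "upload"

    # Random suffix (assumed: 4 random bytes, hex-encoded) to avoid collisions.
    suffix = secrets.token_hex(4)
    path = f"uploads/{upload_uuid}/{stem}-{suffix}.wacz"
    return crawl_id, path
```

Calling `build_upload_target("../../My File (1).wacz")` would give an id of the form `upload-<uuid>` and a path of the form `uploads/<uuid>/My-File-1-<random>.wacz`, with the directory traversal removed.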

* base crawls api: Add /all-crawls list and delete endpoints for all crawl types (without resources)
- support all-crawls/<id>/replay.json with resources
- Use ListCrawlOut model for /all-crawls list endpoint
- Extend BaseCrawlOut from ListCrawlOut, add type
- use 'type: crawl' for crawls and 'type: upload' for uploads
- migration: ensure all previous crawl objects / missing type are set to 'type: crawl'
- indexes: add db indices on the 'type' field alone and on 'type' combined with oid, cid, finished, state
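
The 'type' backfill described in the migration bullet amounts to the logic below, shown on plain dicts so it runs without a database. It is only a sketch: `backfill_crawl_type` is a hypothetical name, and the actual migration presumably issues the equivalent update against the crawls collection.

```python
from typing import Any, Dict, List


def backfill_crawl_type(docs: List[Dict[str, Any]]) -> int:
    """Set type='crawl' on documents with no type; return how many changed."""
    updated = 0
    for doc in docs:
        # Documents created before this change have no 'type' field at all;
        # an explicit None is treated the same way (assumption).
        if doc.get("type") is None:
            doc["type"] = "crawl"
            updated += 1
    return updated
```

Documents that already carry 'type: upload' are left alone, and running the backfill a second time changes nothing.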

* tests: add test for multipart and streaming upload, listing uploads, deleting upload
- add sample WACZ for upload testing: 'example.wacz' and 'example-2.wacz'

* collections: support adding and removing both crawls and uploads via base crawl
- include collection_ids in /all-crawls list
- collections replay.json can include both crawls and uploads

bump version to 1.6.0-beta.2
---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-07-07 09:13:26 -07:00
..
.husky Run frontend formatter on pre-commit hook (#461) 2023-01-12 14:04:15 -08:00
.vscode Add all localization files to source control (#502) 2023-01-18 14:49:38 -08:00
assets
config CI: Add Playwright UI e2e tests + CI (#614) 2023-03-22 16:23:22 -07:00
scripts CI: Add Playwright UI e2e tests + CI (#614) 2023-03-22 16:23:22 -07:00
src Fix links in watch crawl after workflow crawl completes (#943) 2023-07-06 15:04:26 -07:00
tests Add README.md related to run playwright tests locally (#722) 2023-03-28 16:08:28 -07:00
xliff text: rename workflowuration -> workflow (#741) 2023-04-04 08:48:06 -07:00
.dockerignore
.editorconfig chore: add editorconfig in frontend 2023-02-24 13:04:11 -08:00
.gitignore CI: Add Playwright UI e2e tests + CI (#614) 2023-03-22 16:23:22 -07:00
.prettierignore Add all localization files to source control (#502) 2023-01-18 14:49:38 -08:00
00-browsertrix-nginx-init.sh
Dockerfile frontend: fix RWP_BASE_URL not being set correctly for nginx image 2023-06-13 00:04:46 -07:00
frontend.conf.template Uploads API: BaseCrawl refactor + Initial support for /uploads endpoint (#937) 2023-07-07 09:13:26 -07:00
index.d.ts Frontend collections beta UI (#886) 2023-06-06 17:52:01 -07:00
lit-localize.json
minio.conf
package.json Uploads API: BaseCrawl refactor + Initial support for /uploads endpoint (#937) 2023-07-07 09:13:26 -07:00
playwright.config.ts CI: Add Playwright UI e2e tests + CI (#614) 2023-03-22 16:23:22 -07:00
postcss.config.js
sample.env.local
tailwind.config.js
tsconfig.json
web-test-runner.config.mjs misc frontend build fixes: playwright version + chunking (#740) 2023-04-03 21:27:44 -07:00
webpack.config.js Frontend collections beta UI (#886) 2023-06-06 17:52:01 -07:00
webpack.dev.js CI: Add Playwright UI e2e tests + CI (#614) 2023-03-22 16:23:22 -07:00
webpack.prod.js
yarn.lock Frontend collections beta UI (#886) 2023-06-06 17:52:01 -07:00