Commit Graph

70 Commits

Author SHA1 Message Date
Tessa Walsh
103d91556f
Remove non-org-scoped invites from backend (#585)
* Remove non-org-scoped invites
- remove POST /users/invite and related tests
- remove GET /users/invite-delete/{token}
2023-02-08 18:56:28 -08:00
Tessa Walsh
b642c53c59
Make crawlconfig name optional (#588) 2023-02-08 18:38:15 -08:00
Tessa Walsh
ce8f426978
Add notes to crawl and crawl updates (#587) 2023-02-08 18:36:22 -08:00
Ilya Kreymer
40fb04b385
backend: /orgs/<id>/remove: return 404 if org user doesn't exist, fix… (#561)
* backend: /orgs/<id>/remove: return 404 if org user doesn't exist, fixes issue in #535

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-02-08 16:22:36 -05:00
Tessa Walsh
a7a18b9db0
Add org-specific delete invite endpoint (#575)
Adds POST /orgs/{oid}/invites/delete, which expects the invited
email address in the POST body.

This endpoint will also delete duplicate invites with the same
email/oid combination if env var ALLOW_DUPE_INVITES allows dupes.
2023-02-08 16:10:09 -05:00
Tessa Walsh
95155e6fbf
Invite token improvements (#564)
- URL decode email address in invites.invite_user
- Add tests for accepting invites
2023-02-07 20:40:28 -08:00
Tessa Walsh
6d424a1ae0
Serialize pending invites to return "id" not "_id" (#559) 2023-02-06 12:28:11 -05:00
Ilya Kreymer
67df783885 bump version to 1.2.1-beta.0 2023-02-05 12:27:45 -08:00
Ilya Kreymer
af7ba4c90a version: update to 1.2.0 2023-02-02 23:46:23 -08:00
Tessa Walsh
2e3b3cb228
Add API endpoint to update crawl tags (#545)
* Add API endpoint to update crawls (tags only for now)
* Allow setting tags to empty list in crawlconfig updates
2023-02-01 22:24:36 -05:00
Tessa Walsh
23022193fb
Reformat backend for black 23.1.0 (#548) 2023-02-01 20:01:09 -05:00
Tessa Walsh
58aafc4191
Make API updates for member updates (#541)
* Add API endpoint that lists pending invites for all orgs (superuser-only)
* Add API endpoint that lists pending invites for org
* Add user emails to /api/orgs/<oid> response
2023-02-01 16:44:00 -05:00
Ilya Kreymer
9048d46c6c backend: add extraHops to support #543 2023-02-01 13:21:26 -08:00
Tessa Walsh
7d25565ef4
Add org role to /users/me-with-orgs (#536)
* Add org role to /users/me-with-orgs
* Add SUPERADMIN role and return in /me-with-orgs for superusers
2023-01-31 16:27:13 -05:00
Tessa Walsh
6cb79b580a
Fix issue where users are added to default org as admin (#534)
Users should only be added as to the default org with Owner permissions
if they are not specifically being invited to another org. This commit
fixes the logic in the post-registration callback to make this the case.
2023-01-31 12:55:31 -08:00
Ilya Kreymer
6df31e13ab
backend: profile api: return additional data in profile /browser/<id> endpoint (#537)
supports #533 , switching to client side rendering from VNC websocket
2023-01-31 11:58:50 -08:00
Tessa Walsh
2e6bf7535d
Add support for tags to update_crawl_config API endpoint (#521)
* Add test for updating crawlconfigs
2023-01-30 21:46:54 -08:00
Tessa Walsh
231c37108c
Handle DuplicateKeyError on org rename requests (#514)
* Handle DuplicateKeyError on org rename requests
2023-01-25 17:46:35 -08:00
Tessa Walsh
9f0abd6a28
Only drop indexes if migrations are run (#515) 2023-01-25 17:46:10 -08:00
Tessa Walsh
0486d50fe9
Add new /users/me-with-orgs API endpoint (#510) 2023-01-24 10:23:30 -05:00
Tessa Walsh
31e7939cba
Add new API user management endpoints (#511)
- Remove user from org
- Delete user invite
2023-01-23 17:03:07 -08:00
Tessa Walsh
c0e2ec6155
Fix logic for creating pidfile parent dir (#512) 2023-01-23 17:02:25 -08:00
Ilya Kreymer
ccd87e0dff
Rename api / nginx settings -> backend / frontend, set pull policy job images (#504)
* rename config values
- api -> backend
- nginx -> frontend

* job pods:
- set job_pull_policy from api_pull_policy (same as backend image)
- default to Always, but can be overridden for local deployment (same as backend image)

typo fix: CRAWL_NAMESPACE -> CRAWLER_NAMESPACE (part of #491)
ansible: set default label to :latest instead of :dev for
2023-01-18 20:21:36 -08:00
Ilya Kreymer
1dfa494210
backend: add default behavior time to /api/settings (part of #321) (#499) 2023-01-18 14:52:15 -08:00
Tessa Walsh
0fa60ebc45
Rename archives/teams -> orgs in codebase + add db migration (#486)
* Rename archives to orgs and aid to oid on backend

* Rename archive to org and aid to oid in frontend

* Remove translation artifact

* Rename team -> organization

* Add database migrations and run once on startup

* This commit also applies the new by_one_worker decorator to other
asyncio tasks to prevent heavy tasks from being run in each worker.

* Run black, pylint, and husky via pre-commit

* Set db version and use in migrations

* Update and prepare database in single task

* Migrate k8s configmaps
2023-01-18 14:51:04 -08:00
Ilya Kreymer
d028b93412
backend: password related fixes: (#479)
- mongodb: support passwords with '@' by escaping mongo username and password
- superadmin: update superadmin email and password after initial creation if updated in helm values
2023-01-13 18:22:50 -08:00
Ilya Kreymer
bc67cc8443
backend: registration: (#472)
- if registration is enabled, newly registred users get added to the default org, instead of getting their own org/archive
2023-01-13 00:03:37 -08:00
Ilya Kreymer
827b643262
backend: add 'allow_dupe_invites' option to allow re-inviting users. if not set (default), duplicate invites will result in errors (#471) 2023-01-12 23:25:48 -08:00
Ilya Kreymer
4dbca8c421
email sending tweaks: (#470)
- support 'reply-to' email field in values, and in ansible-based values
- set 'subject' for different types of messages
2023-01-12 23:25:23 -08:00
Ilya Kreymer
a916322c30
ansible: digitalocean tweaks: (#469)
* ansible: digitalocean tweaks:
- add org_name to template
- better check for db existence
- simplify domain, fix default_org

chart:
- make job images pull IfNotPresent
2023-01-12 23:11:20 -08:00
Ilya Kreymer
2daa742585
Copy tags from crawlconfig to crawl (#467), fixes #466
- add tags to crawl object
- ensure tags are copied from crawlconfig to crawl when crawl is created (both manually and scheduled)
- tests: add test to ensure tags added to crawl, remove redundant wait replaced with fixtures
2023-01-12 17:46:19 -08:00
Tessa Walsh
49460bb070
Add default organization + invite to default org (#465), #455
- Add default switch to Archive (org) model
- Set default org name via values.yaml
- Add check to ensure only one org with default org name exists
- Stop creating new orgs for new users
- Add new API endpoints for creating and renaming orgs (part of #457)
- Make Archive.name unique via index
- Wait for db connection on init, log if waiting
- Make archive-less invites invite user to default org with Owner role
- Rename default org from chart value if changed
- Don't create new org for invited users
2023-01-12 16:44:18 -08:00
Ilya Kreymer
5efeaa58b1
API filters by user + crawl collection ids (#462)
backend: object filtering:
- add filtering crawls, crawlconfigs and profiles by userid= query arg, fixes #460
- add filtering crawls by crawlconfig via cid= query arg, fixes #400 
- tests: add test_filter_results test suite to test filtering crawls and crawlconfigs by user, also create user with 'crawler' permissions, run second crawl with that user.
2023-01-11 16:50:38 -08:00
Ilya Kreymer
7b5d82936d
backend: initial tags api support (addresses #365): (#434)
* backend: initial tags api support (addresses #365):
- add 'tags' field to crawlconfig (array of strings)
- allow querying crawlconfigs to specify multiple 'tag' query args, eg. tag=A&tag=B
- add /archives/<aid>/crawlconfigs/tags api to query by distinct tag, include index on aid + tag
tests: add tests for adding configs, querying by tags
tests: fix fixtures to retry login if initial attempts fails, use test seed of https://webrecorder.net instead of https://example.com/
2023-01-11 13:29:35 -08:00
Ilya Kreymer
56a6d7a5d8
Backend lint check (#451)
- apply lint + format fixes to backend
- add ci for lint + format fixes for backend
- use fixed version of pydantic
2023-01-10 16:17:06 -08:00
Ilya Kreymer
30bda8c75d
VNC-Based Profile Browser (#433)
* profile browser vnc support + fixes:
- switch profile browser rendering to use VNC
- frontend: add @novnc/novnc as dependency, create separate bundle novnc.js to load into vnc browser (to avoid loading from each container)
- frontend: update proxy paths to proxy websocket, index page to crawler
- frontend: allow browser profiles in all browsers, remove browser compatibility check
- frontend: update webpack dev config, apply prettier
- frontend: node version fix
- backend: get vncpassword, build new URL for proxying to crawler iframe
- backend: fix profile / crawl job pull policy from 'Always' -> 'Never', should use existing image for job
- backend: fix kill signal to use bash -c to work with latest backend image
- backend/chart: add 'profile_browser_timeout_seconds' to chart values to control how long profile browser to remain when idle (default to 60)
- backend: remove utils.py, now using secret.token_hex() for random suffix
Co-authored-by: sua yoo <sua@suayoo.com>
2023-01-10 14:42:42 -08:00
Tessa Walsh
d1b59c9bd0
Use archive_viewer_dep permissions to GET crawls (#443)
* Use archive_viewer_dep permissions to GET crawls

* Add is_viewer check to archive_dep

* Add API endpoint to add new user to archive directly (/archive/<id>/add-user)

* Add tests

* Refactor tests to use fixtures

* And remove login test that duplicates fixtures
2023-01-09 19:11:53 -08:00
Ilya Kreymer
dfca09fc9c
Add single crawl info api at /crawls/{crawl_id} (#418)
* backend: crawl info apis:
- add /crawls/{crawl_id} api endpoint which just lists the crawl info, without resolving the individual files
- move /crawls/{crawl_id}.json -> /crawls/{crawl_id}/replay.json for clarity that it's used for replay

* frontend: update api for new replay.json endpoint
2022-12-19 14:54:48 -08:00
sua yoo
28346e0a54
New create crawl config user workflow (#391) 2022-12-12 13:50:33 -08:00
Ilya Kreymer
61c63d0be9
Remove Code and Configs for Swarm/podman support (#407)
- remove swarm / podman support
- remove docker-compose.yml, btrixcloud.swarm package, and podman/swarm scripts from scripts/ dir-
- remove python-on-whales
- add error if not running in k8s
- remove python-on-whales
2022-12-08 18:19:58 -08:00
Ilya Kreymer
2d93cef966
CI: Add K3D CI test (#405)
- add testing with K3D cluster
- bump backend image to python 3.10-slim for newer python, smaller image.
- bump to 1.2.0-beta.0
2022-12-07 23:26:16 -08:00
Ilya Kreymer
0aa09be8c3
README + CHANGES + doc tweaks for 1.1.0 release (#402)
- update README + docs with deprecation of non-k8s deployment
- add CHANGES.md
- bump version to 1.1.0
2022-12-06 12:27:27 -08:00
Ilya Kreymer
829548af0f doc tweaks:
- fix typos in docs
- update prod deployment info
- update minikube info
- add info on how to run with local images
- bump version to 1.1.0-beta.3 for testing multiarch build
2022-12-05 18:14:19 -08:00
Ilya Kreymer
82ffc0dfbc
Local Deployment Work: Support running locally + test cluster on CI (#396)
* k8s local deployment work:
- make it easier to deploy w/o ingress by setting 'local_service_port' (suggested port 30870)
- if using local minio, ensure file endpoints set to /data/ and /data/ proxies correctly to local bucket
- if not using minio, ensure file endpoints point to correct access / endpoint url.
- setup should work with docker desktop, minikube, microk8s and k3s!
- nginx chart: bump nginx memory limit to 20Mi
- nginx image: 00-default-override-resolver-config -> 00-browsertrix-nginx-init for clarity
- nginx image: use default nginx.conf, pin to nginx 1.23.2
- mongo: readd readiness probe, bump connect wait timeout (needed for ci)
- config: set superadmin username to 'admin'
- config schema: set 'name' as required 
- add sample chart values overrides:
- chart values: local-config.yaml for running locally with 'local_service_port'
- chart values: add microk8s-hosted.yaml for configuring a hosted microk8s setup
- chart values: add microk8s-ci.yaml for ci tests
- ci: remove docker swarm tests
- ci: add microk8s integration tests: launching cluster, logging in, running a crawl of example.com, downloading/checking WACZ
- bump to 1.1.0-beta.2
2022-12-02 19:58:34 -08:00
Ilya Kreymer
aabb0b2a92
chart / deployment fixes to run on microk8s: (fixes #385) (#387)
- ingress: fix proxying /data to minio, use another ingress which proxies correct host to ensure presigned urls work
- presigning: determine if signing endpoint url (minio) or access endpoint (cloud bucket) based on if access endpoint is provided, set bool on storage object
- chart: fix indent on incorrect storageClassName configs
- ingress: make 'ingress_class' configurable (set to 'public' for microk8s, default to 'nginx')
- minio: use older minio image which supports legacy fs based setup (for now)
- nginx service: add 'nginx_service_use_node_port' config setting: if true, will use NodePort for frontend,
other will use default (ClusterIP) and only for the frontend / nginx
- chart: remove changing service type for other services
2022-11-30 09:21:58 -08:00
Ilya Kreymer
afe536e568 version: bump to 1.1.0-beta.1 2022-11-23 23:37:33 -08:00
Ilya Kreymer
d6386b7051
Release Build + Versioning (#373)
- Adds version to version.txt in root
- adds update-version.sh which updates version in frontend/package.json and backend/btrixcloud/version.py
- frontend: loads version from $VERSION env var, ../version.txt or package.json
- ci: on new github release, pushes webrecorder/browsertrix-backend and webrecorder/browsertrix-frontend images to Dockerhub with current version, as well as latest.
- version set to 1.1.0-beta.0
- closes #357
2022-11-18 17:15:25 -08:00
Ilya Kreymer
704838f562 backend: add 'lang' and 'blockAds' field to raw config, in prep for future work, and fixes #369 2022-11-18 17:09:31 -08:00
Ilya Kreymer
793611e5bb
add exclusion api, fixes #311 (#349)
* add exclusion api, fixes #311
add new apis: `POST crawls/{crawl_id}/exclusion?regex=...` and `DELETE crawls/{crawl_id}/exclusion?regex=...` which will:
- create new config with add 'regex' as exclusion (deleting or making inactive previous config) OR remove as exclusion.
- update crawl to point to new config
- update statefulset to point to new config, causing crawler pods to restart
- filter out urls matching 'regex' from both queue and seen list (currently a bit slow) (when adding only)
- return 400 if exclusion already existing when adding, or doesn't exist when removing
- api reads redis list in reverse to match how exclusion queue is used
2022-11-12 17:24:30 -08:00
Ilya Kreymer
d340bceb39 style pass: normalize docstring spacing 2022-10-19 21:47:34 -07:00