This PR adds a new checkbox to both page and seed crawl workflow types,
which will fail the crawl if behaviors detect the browser is not logged
in for supported sites.
Changes include:
- Backend support for the new crawler flag
- A new `failed_not_logged_in` crawl state
- Checkbox in the workflow editor and config details in the frontend (currently
in the Scope section - I think it makes sense to have this option up
front, but worth considering)
- User Guide documentation of new option
- A new nightly test for the new workflow option and
`failed_not_logged_in` state
---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: sua yoo <sua@webrecorder.org>
Resolves https://github.com/webrecorder/browsertrix/issues/2764
## Changes
Uses crawl first seed as starting URL instead of workflow first seed to
fix replay after saving workflow without running.
## Manual testing
1. Log in as crawler
2. Create a new workflow and crawl a single page, for example,
https://example.com/
3. Edit the workflow to change starting URL to https://example.org/
4. Save without running
5. Go to latest crawl tab, verify replay loads https://example.com/
Resolves #2646
Depends on #2710
## Changes
(Copied from #2689)
- Allows users to specify the URL list as a file.
- Allows uploading a text file of URLs.
- Allows entering more than 100 URLs into the URL list; these are automatically converted into an uploaded list.
---------
Co-authored-by: sua yoo <sua@suayoo.com>
- Fix race condition related to browser commit time
- The profile commit request waits for the browser to actually finish and
the profile to be saved. This can cause the request to time out, resulting
in a retry after the browser has already been closed.
- With these changes, the commit is now idempotent and returns
'waiting_for_browser' until the profile is actually committed.
- On the frontend, keep polling the commit endpoint with a timeout while 'waiting_for_browser' is returned; the profile is committed once the endpoint returns a profile id.
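
The polling behavior described above can be sketched roughly as follows. This is an illustration in Python rather than the frontend code itself; the `commit` callable, the response shapes (`{"detail": "waiting_for_browser"}` and `{"id": ...}`), and the interval and timeout defaults are all assumptions made for the example.

```python
import time
from typing import Callable, Optional


def wait_for_profile_commit(
    commit: Callable[[], dict],
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call `commit` until it returns a profile id or the timeout passes.

    `commit` stands in for a request to the (idempotent) commit endpoint.
    The response shapes used here are hypothetical.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        resp = commit()
        profile_id = resp.get("id")
        if profile_id:
            # The endpoint returned a profile id: the profile is committed.
            return profile_id
        if resp.get("detail") != "waiting_for_browser":
            # Anything else is unexpected in this sketch.
            raise RuntimeError(f"unexpected commit response: {resp!r}")
        sleep(interval_s)
    # Timed out while the browser was still finishing.
    return None
```

Because the commit is idempotent, repeating the call while the browser is still shutting down is safe, which is what makes a simple loop like this sufficient.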
---------
Co-authored-by: sua yoo <sua@suayoo.com>
Resolves #2718
## Changes
- Enables manual QA review for successfully finished crawls.
- Individual pages and full crawl can be reviewed without assistive QA running
- Show replay, screenshot and text without comparison if no assistive QA yet.
- Follow-up to #2736: remove '^' from custom prefix URLs via utility function to avoid accumulating '^'
- Show URL prefix list in settings for custom prefix scope.
- Update user guide with correct custom prefix field.
---------
Co-authored-by: sua yoo <sua@webrecorder.org>
- Automatically compute prefix from starting URL, if no other prefix is
set in custom prefix mode.
- Ensure each prefix is actually a prefix: add '^' to each custom prefix
URL, as include URL path is a regex
- rename 'Extra URL Prefixes' to just 'URL Prefixes' and adjust help
text to indicate that the prefix list is the list that is in scope
- fixes #2735, follow up to #2722
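
A minimal sketch of the prefix handling described above, written in Python for illustration. The function names are hypothetical, and the rule used for the default prefix (the starting URL up to its last `/`) is an assumption; the actual computation may differ.

```python
from typing import Iterable, List
from urllib.parse import urlsplit, urlunsplit


def strip_anchor(prefix: str) -> str:
    """Remove any leading '^' so the value is a plain URL again."""
    return prefix.lstrip("^")


def to_include_regex(prefix: str) -> str:
    """Anchor a prefix URL exactly once, so repeated saves don't stack '^'."""
    return "^" + strip_anchor(prefix)


def default_prefix(starting_url: str) -> str:
    """Assumed default: the starting URL truncated after its last '/'."""
    parts = urlsplit(starting_url)
    path = parts.path[: parts.path.rfind("/") + 1] or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def effective_prefixes(starting_url: str, custom: Iterable[str]) -> List[str]:
    """Use the custom prefixes if any are set, else one computed prefix."""
    prefixes = [p for p in custom if p.strip()] or [default_prefix(starting_url)]
    return [to_include_regex(p) for p in prefixes]
```

Note that this sketch only adds the `^` anchor; it does not escape other regex metacharacters in the URL.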
---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Co-authored-by: sua yoo <sua@webrecorder.org>
## Changes
- Deletes and rewrites arrays in URL search params in workflow list when
editing array filters (i.e. tags & profiles)
- Removes a missed `console.log`
- bump to 1.17.3
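
For the array filter change above, the delete-then-rewrite approach can be sketched like this. The example uses Python's `urllib.parse` as a stand-in for the frontend's URL search params handling; `set_array_param` is a hypothetical helper name.

```python
from typing import Iterable
from urllib.parse import parse_qsl, urlencode


def set_array_param(query: str, key: str, values: Iterable[str]) -> str:
    """Drop every existing `key` entry, then append one entry per value."""
    kept = [
        (k, v)
        for k, v in parse_qsl(query, keep_blank_values=True)
        if k != key
    ]
    kept.extend((key, v) for v in values)
    return urlencode(kept)
```

Deleting all existing entries first means stale values cannot linger when a filter is edited or cleared.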
cc @SuaYoo
---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Fixes #2721
This PR removes frontend logic that set the seed-level scopeType for
custom page prefix workflows to `prefix`, which was causing the scope to
balloon larger than what users intended for some workflows.
Resolves https://github.com/webrecorder/browsertrix/issues/2660
## Changes
- Enables filtering workflow list by tag
- Displays tags near workflow name in detail view
- Adds `<btrix-filter-chip>` component
- Migrates "schedule state", "only running", and "only mine" filters
- Adds basic documentation to Storybook
---------
Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
Connected to #2661
- Removes crawl workflows from being returned as part of the profile
response.
- Frontend: removes display of workflows in profile details.
- Adds 'inUse' flag to all profile responses to indicate profile is in
use by at least one workflow
- Adds 'profileid' as possible filter for workflows search in
preparation for filtering by profile id (#2708)
- Make 'profile_in_use' a proper error (returning 400) on profile
delete.
---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
No issue created, quick fix for edge case
## Changes
Adds ID to accept page toast messages so that "Please log in ..."
message closes once the invite is accepted or declined.
## Manual testing
1. Log in as org admin
2. Invite user (one you have access to) to an org
3. Log out and log in as invited user
4. Click invite link in inbox
5. Click accept quickly. Verify "Please log in ..." message is replaced
with "You've joined ..."
## Changes
Following bbd5fb81c4, since the banner is shown throughout the duration of
the trial, it should be made informational at the beginning of the trial so
that it's not as obtrusive.
Add docs about path / virtual 'access_addressing_style' that is
available for each storage option.
---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
- Rename 'Modifying Running Crawls' to 'Running Crawls'
- Add section about pausing/resuming crawls, and that paused crawls will eventually become stopped if not resumed.
- Add new crawl pausing, paused, resuming statuses and icons.
Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
- only set state to 'paused' if shouldPause is true and crawl is still
running (using FAILED_STATES list)
- treat failed/canceled crawl as inactive, don't show replay (using
RUNNING_STATES list)
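
The two checks above might look roughly like this. The contents of `FAILED_STATES` and `RUNNING_STATES` below are abbreviated, illustrative values, not the project's actual lists, and the function names are hypothetical.

```python
# Illustrative, abbreviated state lists (assumed for this sketch).
FAILED_STATES = ("canceled", "failed")
RUNNING_STATES = ("running", "pending-wait")


def next_state(current: str, should_pause: bool) -> str:
    """Move to 'paused' only when a pause was requested and the crawl
    has not already failed or been canceled."""
    if should_pause and current not in FAILED_STATES:
        return "paused"
    return current


def is_active(state: str) -> bool:
    """A crawl counts as active only while in a running state, so failed
    and canceled crawls are treated as inactive."""
    return state in RUNNING_STATES
```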
---------
Co-authored-by: sua yoo <sua@webrecorder.org>
- Adds `crawler_network_policy_additional_egress` setting, to add egress
rules to the existing crawler network policy. Useful for when you want
to allow-list a single IP without replacing the whole network policy.
- Adds docs about `crawler_network_policy_additional_egress` to the customization page.
- Resolves #2121
---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Resolves https://github.com/webrecorder/browsertrix/issues/2650
## Changes
Differentiates between `trialing` and `trialing_canceled` when displaying
messages:
- No changes to messages if `trialing_canceled`.
- If `trialing`, show messaging that subscription will automatically
continue.
---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
- Updates status icons & colors in several places in the app
- Moves "Action Menus" and updated "Status Indicators" design docs from
public docs to Storybook
- [Storybook] Adds `remark-gfm` to enable tables in MDX
- [Storybook] Adds a custom `ColorSwatch` block
- [Browsertrix Docs] Swaps out custom colors and fonts included with
docs for color variables from Hickory and Webrecorder CDN's hosted font
files, respectively
---------
Co-authored-by: sua yoo <sua@suayoo.com>