Enabled with `logging.fileMode`: true
- disables elasticsearch, kibana and ingress
- only enables fluentd to write logs in the node's volume
- lightweight logging into files (in JSON format and compressed in gzip)
- log file rotation (default: rotating files every 4 hours, retention 3 days)
* backend: fix for total crawl timelimit:
- time limit is computed for total job run time
- when limit is exceeded, job starts to stop crawls gracefully, equivalent to 'stop crawl' operation
- fix for #664
* rename crawl-timeout -> crawl_expire_time
* fix lint
* backend: make crawlconfigs mutable! (#656)
- crawlconfig PATCH /{id} can now receive a new JSON config to replace the old one (in addition to scale, schedule, tags)
- exclusions: add / remove APIs mutate the current crawlconfig, do not result in a new crawlconfig created
- exclusions: ensure crawl job 'config' is updated when exclusions are added/removed, unify add/remove exclusions on crawl
- k8s: crawlconfig json is updated along with scale
- k8s: stateful set is restarted by updating annotation, instead of changing template
- crawl object: now has 'config', as well as 'profileid', 'schedule', 'crawlTimeout', 'jobType' properties to ensure anything that is changeable is stored on the crawl
- crawlconfigcore: store share properties between crawl and crawlconfig in new crawlconfigcore (includes 'schedule', 'jobType', 'config', 'profileid', 'schedule', 'crawlTimeout', 'tags', 'oid')
- crawlconfig object: remove 'oldId', 'newId', disallow deactivating/deleting while crawl is running
- rename 'userid' -> 'createdBy'
- remove unused 'completions' field
- add missing return to fix /run response
- crawlout: ensure 'profileName' is resolved on CrawlOut from profileid
- crawlout: return 'name' instead of 'configName' for consistent response
- update: 'modified', 'modifiedBy' fields to set modification date and user modifying config
- update: ensure PROFILE_FILENAME is updated in configmap is profileid provided, clear if profileid==""
- update: return 'settings_changed' and 'metadata_changed' if either crawl settings or metadata changed
- tests: update tests to check settings_changed/metadata_changed return values
add revision tracking to crawlconfig:
- store each revision separate mongo db collection
- revisions accessible via /crawlconfigs/{cid}/revs
- store 'rev' int in crawlconfig and in crawljob
- only add revision history if crawl config changed
migration:
- update to db v3
- copy fields from crawlconfig -> crawl
- rename userid -> createdBy
- copy userid -> modifiedBy, created -> modified
- skip invalid crawls (missing config), make createdBy optional (just in case)
frontend: Update crawl config keys with new API (#681), update frontend to use new PATCH endpoint, load config from crawl object in details view
---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Co-authored-by: sua yoo <sua@webrecorder.org>
Co-authored-by: sua yoo <sua@suayoo.com>
* Paginate API list endpoints
fastapi-pagination is pinned to 0.9.3, the latest release that plays
nicely with pinned versions of fastapi and fastapi-users.
* Increase page size via overriden Params and Page classes
* update api resource list keys
---------
Co-authored-by: sua yoo <sua@suayoo.com>
- Adds icons to details nav items
- Adds replay glyph icon
- Hides "Replay" & "Files" pages if the crawl is running
- Updates border radius 3px → 4px
- Updates colour values, aligns with mockups
- Replaces `margin` from menu items with `gap` values
- Removes animation
Prettier made some spacing adjustments, I also moved some lines around so they're all in the same spot now. 😬
* modify the template file to highlight optional host that stores WAC
files
* numerically reorder the tcp ports - fix the 404's on the documentation
* add a configuration file - this allows automatic selection of inventory directory
* provide better examples on documentation
* chart crawl args cleanup:
- move configurable settings out of 'crawler_args'
- add 'crawler_session_size_limit_bytes' and 'crawler_session_time_limit_seconds' for --timeLimit and --sizeLimit option for crawler
- remove hard-coded 'timeout' to allow configuring via crawl config
- set liveness check port from existing config value
- add comments that requests hd must be at least double the size limit
- defaults: set crawler_requests_hd to 22GB, default crawl session size limit to 10GB
Accessibility improvement, better for screen readers to have h1 be the content of the page as opposed to the application / brand name. Rest of the nav bar to be dealt with at a later date.
### Main Pages + General
- Adds H1 page titles for all main pages
- Moves the New Crawl Config action into the title row from the search controls box
- Gives the Crawl Config search controls box the same style as the Crawls search controls box
- Adds +8px of padding to the search controls box to match mockups
- Search box: medium → small
- Title row control buttons: medium → small
### Details Pages
- h2 → h1 for crawl config and crawl detail pages
- h3 → h2 for crawl config and crawl detail pages
- Removes crawl title margin bottom at medium breakpoint on crawl details page
- Aligns crawl config details title row controls with end of flexbox on mobile to match crawl details controls in the same spot