Releases: openzim/zimit
1.6.3
Changed
- Adapt to new warc2zim code structure
- Using browsertrix-crawler 0.12.4
- Using warc2zim 1.5.5
Added
- New --build parameter (optional) to specify the directory holding Browsertrix files; if not set, the --output directory is used. zimit creates one subdir of this folder per invocation to isolate datasets; the subdir is kept only if --keep is set.
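The directory handling described for --build can be sketched as follows. This is a minimal illustration and not zimit's actual code: the function name `run_in_build_dir`, its parameters, and the `.tmp` prefix are assumptions made for the example. It only shows the stated behaviour: fall back to the output directory, create one fresh subdir per invocation, and remove it unless keeping was requested.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_build_dir(output, build=None, keep=False, work=None):
    """Run `work` inside a fresh per-invocation subdirectory.

    Hypothetical helper: the subdirectory is created under `build` when it
    is given, otherwise under `output`, and is deleted afterwards unless
    `keep` is true.
    """
    parent = Path(build or output)
    parent.mkdir(parents=True, exist_ok=True)
    # mkdtemp picks a unique name, so concurrent or repeated invocations
    # never share a working directory.
    run_dir = Path(tempfile.mkdtemp(dir=parent, prefix=".tmp"))
    try:
        if work is not None:
            work(run_dir)  # e.g. run the crawler and write its files here
    finally:
        if not keep:
            shutil.rmtree(run_dir, ignore_errors=True)
    return run_dir
```

Calling `run_in_build_dir("/output", build="/build", keep=True)` twice would leave two distinct subdirs under `/build`, while omitting `build` and `keep` creates and then removes a subdir under `/output`.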
Fixed
- --collection parameter was not working (#252)
1.6.2
1.6.1
1.6.0
Changed
- Scraper fails for all HTTP error codes returned when checking URL at startup (#223)
- User-Agent now has a default value (#228)
- Manipulation of spaces with UA suffix and adminEmail has been modified
- Same User-Agent is used for check_url (Python) and Browsertrix crawler (#227)
- Using browsertrix-crawler 0.12.0
1.5.3
1.5.2
1.5.1
1.5.0
1.4.1
1.4.0
Added
- --title to set ZIM title
- --description to set ZIM description
- New crawler options: --maxPageLimit, --delay, --diskUtilization
- --zim-lang param to set warc2zim's --lang (ISO-639-3)
Changed
- Using browsertrix-crawler 0.10.2
- Default and accepted values for --waitUntil from crawler's update
- Using warc2zim 1.5.2
- Disabled Chrome updates to prevent incidental inclusion of update data in WARC/ZIM (#172)
- --failOnFailedSeed used unconditionally
- --lang now passed to crawler (ISO-639-1)
Removed
- --newContext from crawler's update