Proposed changes for 1.0 (updated source repo) #19

Open
wants to merge 169 commits into
base: master
480c5f6
first cut at bagit 1.0 spec
johnscancella Nov 23, 2016
8b266fa
updated ed summers, and justin littman addresses
johnscancella Dec 6, 2016
988036b
Remove legacy comment about old DNS results for xml.resource.org
acdha Dec 9, 2016
5b1cc87
Trim trailing whitespace
acdha Dec 9, 2016
d74ef0e
Remove long-commented “other file metadata” section
acdha Dec 9, 2016
16258e7
Strengthen guidance on UTF-8 BOMs in tag files
acdha Dec 9, 2016
cd901f1
Recommend modern hash algorithms
acdha Dec 9, 2016
6673683
Update recommendations related to Windows filenaming
acdha Dec 9, 2016
1da8da7
Update interoperability disclaimer
acdha Dec 9, 2016
766324f
Discourage the use of manual bag creation
acdha Dec 9, 2016
f898aff
Add filename-related normalization discussion
acdha Dec 10, 2016
67c81c8
Update "Payload-Oxum" documentation
acdha Dec 12, 2016
0361e96
Consistent number of spaces after a period
acdha Dec 13, 2016
185eb6b
Upgrade must/must-not text to entity references
acdha Dec 13, 2016
a8dc5df
Strengthen wording: all manifests MUST list all files
acdha Dec 13, 2016
8edbef1
Better references syntax
acdha Dec 13, 2016
5e0f3c1
Update transfer recommendations
acdha Dec 13, 2016
c8ef7c6
Consistent indentation in Terminology
acdha Dec 14, 2016
900ce0c
Clarify the format for tag manifest algorithm names
acdha Dec 14, 2016
e87dda8
Update interoperability reference
acdha Jan 5, 2017
2f2ca4b
Update terminology for bag checksum algorithms
acdha Jan 5, 2017
b975a0e
Consistent indentation for terminology list
acdha Jan 5, 2017
37926c0
Terminology: simplify “tag file” definition
acdha Jan 5, 2017
5fbfdf6
Terminology: simplify “valid” definition
acdha Jan 5, 2017
8236209
Convert numbered list in Structure section to <list>
acdha Jan 5, 2017
94bcdaa
Remove Serialization section
acdha Jan 5, 2017
cd5d6fe
Cherry pick: first cut at changes from recommendations by Dave Crocker
johnscancella Jan 5, 2017
545e072
Use <figure><artwork type="abnf"> for ABNF diagrams
acdha Jan 5, 2017
35c3272
Minor <figure> consistency cleanup
acdha Jan 5, 2017
4407055
Terminology: whitespace cleanup
acdha Jan 5, 2017
b07d6c0
Convert tabs to spaces
acdha Jan 5, 2017
ce06c32
Terminology: update definition of “complete”
acdha Jan 5, 2017
757f68a
fixed some wording of sentences and removed section on Disk and netwo…
johnscancella Jan 6, 2017
e304db4
Remove stale references
acdha Jan 6, 2017
318fe5c
Remove trailing whitespace
acdha Jan 6, 2017
5815284
Remove stale inline TODO items
acdha Jan 6, 2017
2f6d466
Normalize XML declarations against current IETF template
acdha Jan 6, 2017
c6f0a08
Prose review for section 1
acdha Jan 6, 2017
b6413fd
Fix formatting for Section 2 (“Structure”)
acdha Jan 6, 2017
b35ceb5
Fix formatting for “Bag Declaration”
acdha Jan 6, 2017
45685bc
Copy-editing for “Payload Directory”
acdha Jan 6, 2017
272e6ce
Copy editing for “Payload Manifest”
acdha Jan 6, 2017
3bbebd5
Remove injunction against reusing payload manifest algorithms
acdha Jan 6, 2017
d9ceeee
Copy-editing for “Payload-Oxum”
acdha Jan 6, 2017
38b8b59
Simplify “Other Tag Files” text
acdha Jan 6, 2017
00eb4ab
Spelling
acdha Jan 6, 2017
8c9d621
Copy-editing for “Complete, Incomplete, and Valid bags”
acdha Jan 6, 2017
ef7f6df
Update version number in example bag
acdha Jan 6, 2017
a60e644
Update authors
acdha Jan 17, 2017
169a094
Clarify wording about directories in manifests
acdha Jan 23, 2017
3d3bb73
Remove duplicate wording
acdha Jan 23, 2017
0257f7b
Update “filesystem” reference in “complete” requirements
acdha Jan 23, 2017
e677092
Update “special directory characters” prose
acdha Jan 23, 2017
1f97d39
Note that bag-info.txt fields are intended for human consumption
acdha Jan 23, 2017
aaffd0a
Update Makefile and add a format target
acdha Jan 23, 2017
bc862cf
XML formatting
acdha Jan 23, 2017
56c069e
Remove `Bag-Size` from example bag-info.txt
acdha Feb 28, 2017
c4a350d
Fix Makefile targets
acdha Aug 15, 2017
5204986
Tidy source formatting in the references section
acdha Aug 15, 2017
fc58679
Add reference to conformance suite project
acdha Aug 15, 2017
9a2787c
Restore fetch.txt
acdha Aug 15, 2017
c2fe6ce
Add a reminder that fetch.txt entries must be listed in manifests
acdha Aug 15, 2017
1bd0a8f
Convert fetch.txt artwork to ABNF
acdha Aug 15, 2017
7e2b7e5
Remove dead comments
acdha Nov 2, 2017
708b8f8
Tidy format of manifest section
acdha Nov 2, 2017
66c24eb
Update payload manifest to prohibit duplicates
acdha Nov 2, 2017
87893e2
Clarify that “whitespace” means tabs / spaces
acdha Nov 3, 2017
06c02e6
Stricter bagit.txt wording
acdha Nov 6, 2017
c72f1e4
Tighten restrictions for bag-info.txt syntax
acdha Nov 6, 2017
b6060cf
fixed ABNF for bag-info.txt and added one for bagit.txt
johnscancella Nov 7, 2017
d2d5e1e
Include Git configuration used for pretty diffs
acdha Dec 12, 2017
c287f0b
Changes in preparation for 1.0
Mar 5, 2018
5d7f9db
Requested tweeks.
Mar 7, 2018
39af339
Merge pull request #1 from justinlittman/bagit1.0-changes
johnscancella Mar 12, 2018
36cd8fe
Clarify that tag manifests should also list every manifest
acdha Mar 29, 2018
ae9ead9
refs #3 - updated ABNF to better match the normative
johnscancella Apr 2, 2018
0b07d73
refs #3 - added ABNF reference and not about using core rules
johnscancella Apr 5, 2018
c3b62bb
Metadata elements are normatively OPTIONAL
zimeon Apr 5, 2018
f30661f
Be more specific than SHA-2 family
zimeon Apr 5, 2018
bc87214
Merge pull request #6 from zimeon/sha2_specifics
johnscancella Apr 10, 2018
18a94ef
Merge pull request #5 from zimeon/metadata_optional
johnscancella Apr 18, 2018
f4d5911
refs #2 - added language to specify that all files must be in all man…
johnscancella Apr 18, 2018
ac339b4
refs #7 - changed section about bags differing in normalization only …
johnscancella Apr 18, 2018
4a8e2a3
refs #2 - fixing grammer
johnscancella Apr 19, 2018
65f5e7e
Tidy ABNF (e.g. shorter line length)
stain Apr 20, 2018
6dfaf3e
Can't say "only cryptographic-quality"
stain Apr 20, 2018
851cf0d
Use sha512 as example, not sha1
stain Apr 20, 2018
72bcdfc
Payload is a set of named files
stain Apr 20, 2018
d561756
List all types of tag files
stain Apr 20, 2018
e0b411d
*manifest-sha512* in examples
stain Apr 20, 2018
853c870
MD5 has never been called "MD-5"
stain Apr 20, 2018
f1e34b1
Explain why example use md5
stain Apr 20, 2018
75add1b
Avoid confusing "sub-directory must not be changed"
stain Apr 20, 2018
893a1ab
Merge pull request #8 from stain/abnf-tidy
johnscancella Apr 20, 2018
7bc88cc
refs #2 - changed text to be more clear that the previous version is …
johnscancella Apr 20, 2018
4bf2689
updated date
johnscancella Apr 20, 2018
fa015eb
changed the bagit.txt example to make it more clear that you can choo…
johnscancella Apr 20, 2018
f5a72aa
Typo "ather" -> "rather"
stain Apr 23, 2018
b6bbf22
filenames, not file names
stain Apr 23, 2018
7e296f1
files may be organized in subdirectories
stain Apr 23, 2018
0bfdbb5
correct plural (many files, many sequences)
stain Apr 23, 2018
9e3417b
Avoid RFC "MAY" for "base directory"
stain Apr 23, 2018
5fa2515
Use "filepath" instead of "filename"
stain Apr 23, 2018
b86cd54
Avoid unknown Source-URL in example
stain Apr 23, 2018
92711ec
Attempt to clarify whitespace in bag-info.txt
stain Apr 23, 2018
84c0ae3
About multiplicity of metadata elements
stain Apr 23, 2018
71d1d8c
Merge pull request #13 from stain/payload-directory-structure
johnscancella Apr 24, 2018
9d5a3a2
Merge pull request #10 from stain/sha512-example
johnscancella Apr 24, 2018
a0dccff
Merge pull request #11 from stain/payload-is-files
johnscancella Apr 24, 2018
37b878f
Merge pull request #12 from stain/terminology-tagfile
johnscancella Apr 24, 2018
c6af1cc
clarified that the encoding is used by the remaining tag files
johnscancella Apr 24, 2018
7b5549e
Merge pull request #15 from stain/confusing-MAY-clause
johnscancella Apr 24, 2018
e23147e
Merge pull request #18 from stain/avoid-source-url
johnscancella Apr 24, 2018
749c158
Merge pull request #20 from stain/whitespace-ignored
johnscancella Apr 24, 2018
7a145eb
Merge pull request #9 from stain/cryptographic
johnscancella Apr 24, 2018
72a60fe
added Stian Soiland-Reyes to list of contributors
johnscancella Apr 24, 2018
e45f0f1
RECOMMENDED instead of SHOULD BagCount/-GroupID link
stain Apr 24, 2018
772a59b
Consistent "line feed (LF)" instead of "newline"
stain May 1, 2018
5bcb9c5
Typo fiilepath
stain May 1, 2018
1bbf1f5
Reference RFC2989 registry for ENCODING
stain May 1, 2018
d90f995
Merge pull request #16 from stain/filepath
johnscancella May 1, 2018
0e5abdc
Merge pull request #21 from stain/unique-metadata-elements
johnscancella May 1, 2018
952a087
Merge branch 'bagit1.0' into line-feed-not-newline
stain May 1, 2018
0a3a03b
Merge pull request #27 from stain/rfc2978
johnscancella May 9, 2018
0389daf
Merge pull request #26 from stain/line-feed-not-newline
johnscancella May 9, 2018
44be572
refs #28 - changed wording for pull request
johnscancella May 11, 2018
040fb7e
refs #17 - fixed merge conflicts for only percent encode CR, LF, and …
johnscancella May 11, 2018
aee2ff3
refs #22 - fixed merge conflict for tag files have optional newline a…
johnscancella May 11, 2018
7f021d4
refs #23 - fixing merge conflict on using registry for hash names
johnscancella May 11, 2018
96a7809
refs #19 - tag manifest should be complete and have the same list of …
johnscancella May 11, 2018
ffaa626
Tweaks.
May 14, 2018
d2e49eb
Merge pull request #29 from justinlittman/tweaks
johnscancella May 22, 2018
8cc9b11
Fixed keywords in non-normative section.
May 17, 2018
6445821
Merge branch 'bagit1.0' into shoulds
johnscancella May 22, 2018
0ac6b04
Merge pull request #30 from justinlittman/shoulds
johnscancella May 22, 2018
9606be2
updated date
johnscancella May 22, 2018
3b69c59
increment draft number
jkunze May 22, 2018
59ebead
tweak to author order
jkunze May 22, 2018
414de08
grouped unaffiliated authors together
jkunze May 22, 2018
87a19df
added '-' affiliation to fix formatting glitch
jkunze May 22, 2018
ea112c9
refs #31. Makes changes requested from ISE review.
May 23, 2018
78921ae
Merge pull request #32 from justinlittman/ise_tweaks
acdha May 24, 2018
36ec049
alphabetize contributors; fix typo
jkunze May 24, 2018
26be6b3
add two non-normative sentences to address ISE review comment about e…
jkunze May 24, 2018
5988dd7
added the missing noun ("rules") that the term "ABNF" qualifies
jkunze May 24, 2018
9286639
made artwork displays all end consistently
jkunze May 25, 2018
f165f2d
reduced redundant use of "optional"
jkunze May 25, 2018
a5b1efc
tightened "file" reference to "file name"
jkunze May 25, 2018
d8ad39f
added an extra summary section sentence
jkunze May 25, 2018
04e7d84
clarified metadata
jkunze May 25, 2018
e4f0c7d
moved summary paragraph from end of section to beginning of section
jkunze May 25, 2018
2613492
clarified security implications of older checksum algorithms
jkunze May 25, 2018
d8692c9
usage tweak
jkunze May 25, 2018
855c3f3
beefed up response to security case
jkunze May 25, 2018
f08da8f
changed a MUST to a standard &must;
jkunze May 25, 2018
cbc7d77
more consistent terminology (validation -> checksum)
jkunze May 29, 2018
5829d83
bumped draft number and current date
jkunze Jun 4, 2018
8d243f0
Update Justin's contact info
justinlittman Aug 23, 2018
9fd5e14
Merge pull request #34 from justinlittman/patch-1
acdha Aug 24, 2018
fd267b6
updated email address
jscancella Aug 28, 2018
3d07217
Added clarification about malicious attackers.
justinlittman Aug 29, 2018
0082b46
Changed reference to character set registry.
justinlittman Aug 30, 2018
692db8a
Merge pull request #35 from jscancella/contactInfoUpdate
acdha Sep 7, 2018
f465c60
Merge pull request #37 from justinlittman/patch-2
acdha Sep 7, 2018
27da533
Updates of wording.
justinlittman Sep 13, 2018
1e3c4b4
Merge pull request #36 from justinlittman/patch-1
acdha Sep 13, 2018
d597d91
Update bagit.xml
jkunze Sep 13, 2018
df1db48
update draft number and draft date
jkunze Sep 17, 2018
73a516b
Final XML from IETF
acdha Nov 14, 2018
42 changes: 14 additions & 28 deletions bagit.xml
@@ -287,8 +287,7 @@ The base directory can have any name.
|
+-- [optional tag directories]/
|
- +-- [optional tag files]
- </artwork>
+ +-- [optional tag files] </artwork>
Author

Was the intention of this to avoid an extra line in the rendered text output? I'm not a huge fan of the closing tag being on the end of the line like this, but I'm not sure it's worth changing everything to go the other way.

Owner

Exactly, and it definitely makes the XML ugly. It may be a defect in the xml2rfc tool. If you can find a way around it, go for it, but if push comes to shove, I think getting the rendered human-oriented document consistent and correct is more important than making the XML pretty.
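
The mechanical change discussed in this thread can be sketched as a small script. This is only an illustration, not tooling used by the authors: the function name `join_artwork_close` is hypothetical, and the sketch assumes each closing tag sits alone on its own line, optionally preceded by the end of a CDATA section.

```python
import re

# A line break, optional indentation, then a closing artwork tag
# (optionally preceded by the "]]>" that ends a CDATA section).
_CLOSE = re.compile(r"\r?\n[ \t]*((?:\]\]>)?</artwork>)")


def join_artwork_close(xml_text: str) -> str:
    """Move each closing </artwork> tag onto the end of the previous line."""
    return _CLOSE.sub(r" \1", xml_text)


if __name__ == "__main__":
    sample = "<artwork>\nBagIt-Version: M.N\n</artwork>\n"
    print(join_artwork_close(sample))
```

Whether the separating space is wanted is a judgment call; the hunks in this PR mostly include one.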

</figure>
<section title="Required Elements" anchor="sec-required-elements">
<section title="Bag Declaration: bagit.txt" anchor="sec-bag-decl">
@@ -298,8 +297,7 @@ The base directory can have any name.
<figure>
<artwork>
BagIt-Version: M.N
- Tag-File-Character-Encoding: ENCODING
- </artwork>
+ Tag-File-Character-Encoding: ENCODING </artwork>
<postamble>
<spanx style="emph">M.N</spanx> identifies the BagIt major (M) and minor (N) version numbers.
<spanx style="emph">ENCODING</spanx> identifies the character set encoding used by the remaining tag files.
@@ -364,8 +362,7 @@ Tag-File-Character-Encoding: ENCODING
<preamble>Example payload manifest filenames</preamble>
<artwork>
manifest-sha256.txt
- manifest-sha512.txt
- </artwork>
+ manifest-sha512.txt </artwork>
</figure>
<t>
Each line of a payload manifest file &must; be of the form:
@@ -436,8 +433,7 @@ placeholder file with a name such as ".keep".
<preamble>Example tag manifest filenames:</preamble>
<artwork>
tagmanifest-sha256.txt
- tagmanifest-sha512.txt
- </artwork>
+ tagmanifest-sha512.txt </artwork>
</figure>
<t>
A tag manifest file has the same form as the payload file manifest

Is the tag manifest also required to be complete and equivalent across hash algorithms?

Author

That was our intention, and it's a good point for clarification.

I've added LibraryOfCongress#19 to clarify that intention. I left it as "SHOULD list all tag files" unless you agree it should be MUST.
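
The "complete and equivalent across hash algorithms" question in this thread amounts to checking that every manifest names the same set of files. A minimal sketch, assuming each non-empty manifest line is a checksum, then whitespace, then a filepath; the helper names `listed_paths` and `manifests_agree` are hypothetical and not part of any BagIt library:

```python
def listed_paths(manifest_text: str) -> set:
    """Return the filepaths named in one manifest.

    Assumes each non-empty line is "<checksum><whitespace><filepath>";
    a line with only one field raises ValueError.
    """
    paths = set()
    for line in manifest_text.splitlines():
        if not line.strip():
            continue
        _checksum, path = line.split(None, 1)
        paths.add(path)
    return paths


def manifests_agree(manifests: dict) -> bool:
    """True when every manifest lists exactly the same set of filepaths.

    `manifests` maps an algorithm name (e.g. "sha256") to manifest text.
    """
    path_sets = [listed_paths(text) for text in manifests.values()]
    return all(s == path_sets[0] for s in path_sets[1:])
```

The sketch compares only the listed names. It does not verify checksums and does not compare against the files actually present in the bag.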

@@ -582,8 +578,7 @@ Bag-Group-Identifier: university_foo
Bag-Count: 1 of 15
Internal-Sender-Identifier: /storage/images/foo
Internal-Sender-Description: Uncompressed greyscale TIFFs created
- from microfilm and are...
- </artwork>
+ from microfilm and are... </artwork>
</figure>
</section>

@@ -786,8 +781,7 @@ myfirstbag/
|
| 27613-h/images/q172.txt
| (... OCR text ... )
- ....
- </artwork>
+ .... </artwork>
</figure>
</section>
<section title="Example bag using fetch.txt">
@@ -816,8 +810,7 @@ highsmith-tahoe/
|
| bag-info.txt
| (Internal-Sender-Description: Download link found at )
- | ( https://www.loc.gov/resource/highsm.23364/ )
- </artwork>
+ | ( https://www.loc.gov/resource/highsm.23364/ )</artwork>
</figure>
</section>
</section>
@@ -958,8 +951,7 @@ N 4e LATIN CAPITAL LETTER N
ú c3ba LATIN SMALL LETTER U WITH ACUTE
ñ c3b1 LATIN SMALL LETTER N WITH TILDE
e 65 LATIN SMALL LETTER E
- z 7a LATIN SMALL LETTER Z
- ]]></artwork>
+ z 7a LATIN SMALL LETTER Z ]]></artwork>
</figure>
<t>
Unicode normalization is relevant to BagIt implementors because different
@@ -1032,8 +1024,7 @@ z 7a LATIN SMALL LETTER Z

</preamble>
<artwork>
- &lt; &gt; : " / | ? *
- </artwork>
+ &lt; &gt; : " / | ? * </artwork>
</figure>
<figure>
<preamble>
@@ -1042,8 +1033,7 @@ z 7a LATIN SMALL LETTER Z
<artwork>
CON, PRN, AUX, NUL
COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9
- LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9
- </artwork>
+ LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9 </artwork>
</figure>
<t>
See <xref target="MSFNAM"/> for more information and possible alternatives.
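
The two lists in this hunk (reserved characters and reserved device names) lend themselves to a simple pre-flight check. A rough sketch, assuming the check runs on a single path component rather than a whole filepath, since "/" is itself in the character list. The name `windows_problems` is hypothetical, and treating a reserved name followed by an extension as a problem is an assumption that goes beyond the lists shown here:

```python
# Characters and device names copied from the two artwork blocks above.
RESERVED_CHARS = set('<>:"/|?*')
RESERVED_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{n}" for n in range(1, 10)}
    | {f"LPT{n}" for n in range(1, 10)}
)


def windows_problems(component: str) -> list:
    """Describe why one path component may be troublesome on Windows."""
    problems = []
    bad = sorted(set(component) & RESERVED_CHARS)
    if bad:
        problems.append("reserved characters: " + " ".join(bad))
    # Assumption: compare the part before the first "." case-insensitively.
    stem = component.split(".", 1)[0].upper()
    if stem in RESERVED_NAMES:
        problems.append("reserved device name: " + stem)
    return problems
```

A bag creator might call this on each component of every payload filepath and warn, rather than fail, since the spec text frames these as interoperability recommendations.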
@@ -1099,8 +1089,7 @@ definitions use the core rules (e.g. DIGIT, HEXDIG, etc) as defined in
bagit-txt = "BagIt-Version: " 1*DIGIT "." 1*DIGIT ending
"Tag-File-Character-Encoding: " encoding ending
encoding = 1*CHAR
- ending = CR / LF / CRLF
- ]]></artwork>
+ ending = CR / LF / CRLF ]]></artwork>
</figure>
</section>
<!-- /bag declaration -->
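
The `bagit-txt` rule above translates fairly directly into a regular expression. The following is a sketch of that translation, not a reference parser. It assumes the text has already been decoded to a string, it narrows `encoding` to ASCII characters other than CR and LF so that the line ending stays unambiguous, and `parse_bagit_txt` is a hypothetical name.

```python
import re

ENDING = r"(?:\r\n|\r|\n)"  # ending = CR / LF / CRLF
BAGIT_TXT = re.compile(
    r"BagIt-Version: (?P<major>[0-9]+)\.(?P<minor>[0-9]+)" + ENDING
    # encoding = 1*CHAR, narrowed here to exclude CR and LF (assumption).
    + r"Tag-File-Character-Encoding: (?P<encoding>(?:(?![\r\n])[\x01-\x7f])+)"
    + ENDING
)


def parse_bagit_txt(text: str) -> tuple:
    """Return (major, minor, encoding) or raise ValueError."""
    match = BAGIT_TXT.fullmatch(text)
    if match is None:
        raise ValueError("bagit.txt does not match the bagit-txt rule")
    return int(match["major"]), int(match["minor"]), match["encoding"]


if __name__ == "__main__":
    print(parse_bagit_txt(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n"
    ))
```

Because the rule as written requires an `ending` after the second line, this sketch rejects a file with no final line break.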
@@ -1119,8 +1108,7 @@ unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
sub-delims = "!" / "$" / "&" / DQUOTE / "'" / "(" / ")" /
"*" / "+" / "," / ";" / "=" / "/"
pct-encoded = "%0D" / "%0d" / "%0A" / "%0a" / "%25"
- ending = CR / LF / CRLF
- ]]></artwork>
+ ending = CR / LF / CRLF ]]></artwork>
</figure>
</section>
<!-- /payload manifest -->
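
The `pct-encoded` rule above covers only CR, LF, and the percent sign itself. A minimal sketch of an encoder and decoder for manifest filepaths under that reading; the function names are hypothetical:

```python
import re


def encode_filepath(path: str) -> str:
    """Percent-encode only "%", CR, and LF."""
    # Escape "%" first so the escapes added for CR and LF are not re-encoded.
    return path.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")


_DECODE = {"%0D": "\r", "%0A": "\n", "%25": "%"}
_ESCAPE = re.compile(r"%(?:0[DdAa]|25)")


def decode_filepath(encoded: str) -> str:
    """Reverse encode_filepath, accepting upper- or lower-case hex digits."""
    # A single left-to-right pass, so a decoded "%" is never rescanned.
    return _ESCAPE.sub(lambda m: _DECODE[m.group(0).upper()], encoded)
```

Decoding in one pass matters: chained `str.replace` calls could turn `%250A` into a line feed instead of the literal text `%0A`.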
@@ -1136,8 +1124,7 @@ continuation = WSP 1*non-reserved
non-reserved = VCHAR / WSP
; any valid character for the specific encoding
; except those that match "ending"
- ending = CR / LF / CRLF
- ]]></artwork>
+ ending = CR / LF / CRLF ]]></artwork>
</figure>
</section>
<!-- /bag-info.txt -->
@@ -1151,8 +1138,7 @@ url = <absolute-URI, see [RFC3986], Section 4.3>
length = 1*DIGIT / "-"
filepath = ("data/"
1*( unreserved / pct-encoded / sub-delims ))
- ending = CR / LF / CRLF
- ]]></artwork>
+ ending = CR / LF / CRLF ]]></artwork>
</figure>
</section> <!-- Fetch File -->
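
The `length` and `filepath` rules above suggest how a single fetch.txt line might be read. The rule for the line as a whole is not visible in this hunk, so the sketch below assumes three whitespace-separated fields in the order url, length, filepath. `parse_fetch_line` is a hypothetical name, and the URL in the usage example is a placeholder.

```python
def parse_fetch_line(line: str) -> tuple:
    """Split one fetch.txt line into (url, length, filepath).

    `length` is returned as an int, or None when the line gives "-".
    Raises ValueError when the line does not have the assumed shape.
    """
    url, length, filepath = line.rstrip("\r\n").split(None, 2)
    if length != "-" and not (length.isascii() and length.isdigit()):
        raise ValueError("length must be digits or '-'")
    if not filepath.startswith("data/"):
        raise ValueError("filepath must start with 'data/'")
    return url, (None if length == "-" else int(length)), filepath


if __name__ == "__main__":
    print(parse_fetch_line("https://example.org/a.tif 1024 data/a.tif\n"))
```

The sketch does not validate the URL against RFC 3986 and does not undo percent-encoding in the filepath.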
