diff --git a/CIP-0122/README.md b/CIP-0122/README.md new file mode 100644 index 0000000000..64a886ec7b --- /dev/null +++ b/CIP-0122/README.md @@ -0,0 +1,1537 @@ +--- +CIP: 122 +Title: Logical operations over BuiltinByteString +Category: Plutus +Status: Proposed +Authors: + - Koz Ross +Implementors: + - Koz Ross +Discussions: + - https://github.com/cardano-foundation/CIPs/pull/806 +Created: 2024-05-03 +License: Apache-2.0 +--- + +## Abstract + +We describe the semantics of a set of logical operations for Plutus +`BuiltinByteString`s. Specifically, we provide descriptions for: + +- Bitwise logical AND, OR, XOR and complement; +- Reading a bit value at a given index; +- Setting bits value at given indices; and +- Replicating a byte a given number of times. + +As part of this, we also describe the bit ordering within a `BuiltinByteString`, +and provide some laws these operations should obey. + +## Motivation: why is this CIP necessary? + +Bitwise operations, both over fixed-width and variable-width blocks of bits, +have a range of uses, including data structures (especially +[succinct][succinct-data-structures] ones) and cryptography. Currently, +operations on individual bits in Plutus Core are difficult, or outright +impossible, while also keeping within the tight constraints required onchain. +While it is possible to some degree to work with individual _bytes_ over +`BuiltinByteString`s, this isn't sufficient, or efficient, when bit +maniputations are required. + +To demonstrate where bitwise operations would allow onchain possibilities that +are currently either impractical or impossible, we give the following use cases. + +### Case 1: integer set + +An _integer set_ (also known as a bit set, bitmap, or bitvector) is a +[succinct][succinct-data-structures] data structure for representing a set of +numbers in a pre-defined range $[0, n)$ for some $n \in \mathbb{N}$. The +structure supports the following operations: + +* Construction given a fixed number of elements, as well as the bound $n$. +* Construction of the empty set (contains no elements) and the universe + (contains all elements). +* Set union, intersection, complement and difference (symmetric and asymmetric). +* Membership testing for a specific element. +* Inserting or removing elements. + +These structures have a range of uses. In addition to being used as sets of +bounded natural numbers, an integer set could also represent an array of Boolean +values. These have [a range of applications][bitvector-apps], mostly as +'backends' for other, more complex structures. Furthermore, by using some index +arithmetic, integer sets can also be used to represent +[binary matrices][binary-matrix] (in any number of +dimensions), which have an even wider range of uses: + +* Representations of graphs in [adjacency-matrix][adjacency-matrix] form +* [Checking the rules for a game of Go][go-binary-matrix] +* [FSM representation][finite-state-machine-4vl] +* Representation of an arbitrary binary relation between finite sets + +The succinctness of the integer set (and the other succinct data structures it +enables) is particularly valuable on-chain, due to the limited transaction size +and memory available. + +Typically, such a structure would be represented as a packed array of bytes +(similar to the Haskell `ByteString`). Essentially, given a bound $n$, the +packed array has a length in bytes large enough to contain at least $n$ bits, +with a bit at position $i$ corresponding to the value $i \in \mathbb{N}$. 
This +representation ensures the succinctness of the structure (at most 7 bits of +overhead are required if $n = 8k + 1$ for some $k \in \mathbb{N}$), and +also allows all the above operations to be implemented efficiently: + +* Construction given a fixed number of elements and the bound $n$ involves + allocating the packed array, then modifying some bits to be set. +* Construction of the empty set is a packed array where every byte is `0x00`, + while the universe is a packed array where every byte is `0xFF`. +* Set union is bitwise OR over both arguments. +* Set intersection is bitwise AND over both arguments. +* Set complement is bitwise complement over the entire packed array. +* Symmetric set difference is bitwise XOR over both arguments; asymmetric set + difference can be defined using a combination of bitwise complement and + bitwise OR. +* Membership testing is checking whether a bit is set. +* Inserting an element is setting the corresponding bit. +* Removing an element is clearing the corresponding bit. + +Given that this is a packed representation, these operations can be implemented +very efficiently by relying on the cache-friendly properties of packed array +traversals, as well as making use of optimized routines available in many +languages. Thus, this structure can be used to efficiently represent sets of +numbers in any bounded range (as ranges not starting from $0$ can be represented +by storing an offset), while also being minimal in space usage. + +Currently, such a structure cannot be easily implemented in Plutus Core while +preserving the properties described above. The two options using existing +primitives are either to use `[BuiltinInteger]`, or to mimic the above +operations over `BuiltinByteString`. The first of these is not space _or_ +time-efficient: each `BuiltinInteger` takes up multiple machine words of space, +and the list overheads introduced are linear in the number of items stored, +destroying succinctness; membership testing, insertion and removal require +either maintaining an ordered list or forcing linear scans for at least some +operations, which are inefficient over lists; and 'bulk' operations like union, +intersection and complement become very difficult and time-consuming. The second +is not much better: while we preserve succinctness, there is no easy way to +access individual bits, only bytes, which would require a division-remainder +loop for each such operation, with all the overheads this imposes; intersection, +union and symmetric difference would have to be simulated byte-by-byte, +requiring large lookup tables or complex conditional logic; and construction +would require immense amounts of copying and tricky byte construction logic. +While it is not outright impossible to make such a structure using current +primitives, it would be so impractical that it could never see real use. + +Furthermore, for sparse (or dense) integer sets (that is, where either most +elements in the range are absent or present respectively), a range of +[compression techniques][bitmap-index-compression] have been developed. All of +these rely on bitwise operations to achieve their goals, and can potentially +yield significant space savings in many cases. Given the limitations onchain +that we have to work within, having such techniques available to implementers +would be a huge potential advantage. 
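To make the intended use concrete, the following is a minimal off-chain sketch of such an integer set, written in ordinary Haskell over `Data.ByteString` and `Data.Bits` rather than the proposed primitives (which do not yet exist); the helper names (`emptySet`, `universe`, `member`, `insert`, `union`) are ours and purely illustrative, and bounds checking is omitted. On-chain, each helper would instead be a thin wrapper over `replicateByteString`, `readBit`, `writeBits` and `bitwiseLogicalOr` as specified below.

```haskell
-- Illustrative sketch only: an off-chain reference model of the integer set,
-- using Data.ByteString and Data.Bits in place of the proposed builtins.
import Data.Bits (setBit, testBit, (.|.))
import qualified Data.ByteString as BS

-- A set over [0, 8 * BS.length bs), one bit per element.
type IntSet = BS.ByteString

-- Empty set and universe: every byte 0x00 or 0xFF (cf. replicateByteString).
emptySet, universe :: Int -> IntSet
emptySet nBytes = BS.replicate nBytes 0x00
universe nBytes = BS.replicate nBytes 0xFF

-- Membership test (cf. readBit): bit i lives at position (i `mod` 8) within
-- the byte at index (length - 1 - i `div` 8), per the indexing scheme below.
member :: Int -> IntSet -> Bool
member i bs = testBit (BS.index bs (BS.length bs - 1 - i `div` 8)) (i `mod` 8)

-- Insertion (cf. writeBits with a True value).
insert :: Int -> IntSet -> IntSet
insert i bs = BS.pack (zipWith update [0 ..] (BS.unpack bs))
  where
    byteIx = BS.length bs - 1 - i `div` 8
    update j w = if j == byteIx then setBit w (i `mod` 8) else w

-- Union (cf. bitwiseLogicalOr); here both arguments have equal length.
union :: IntSet -> IntSet -> IntSet
union x y = BS.pack (BS.zipWith (.|.) x y)
```

For example, `member 3 (insert 3 (emptySet 4))` evaluates to `True`, while `member 3 (emptySet 4)` evaluates to `False`.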
### Case 2: hashing

[Hashing][hashing], that is, computing a fixed-length 'fingerprint' or 'digest' of a variable-length input (typically viewed as binary) is a common task required in a range of applications. Most notably, hashing is a key tool in cryptographic protocols and applications, either in its own right, or as part of a larger task. The value of such functionality is such that Plutus Core already contains primitives for certain hash functions, specifically two variants of [SHA256][sha256] and [BLAKE2b][blake2b]. At the same time, the choice of hash function is often determined by protocol or use case, and providing individual primitives for every possible hash function is not a scalable choice. It is much preferable to give users of Plutus (Core) the tools needed to implement such functionality themselves, allowing them to use whichever hash function(s) their applications require.

As an example, we consider the [Argon2][argon2] family of hash functions. Implementing any variant of this family requires the following operations:

1. Conversion of numbers to bytes
2. Bytestring concatenation
3. BLAKE2b hashing
4. Floor division
5. Indexing bytes in a bytestring
6. Logical XOR

Operations 1 to 5 are already provided by Plutus Core (with 1 being included [in CIP-121][conversion-cip]); however, without logical XOR, no function in the Argon2 family could be implemented. While, in theory, it could be simulated with the operations that already exist, much as with Case 1, this would be impractical at best, and outright impossible at worst, due to the severe limits imposed on-chain. This is particularly the case here, as all Argon2 variants call logical XOR in a loop, whose step count is defined by _multiple_ user-specified (or protocol-specified) parameters.

We observe that this requirement for logical XOR is not unique to the Argon2 family of hash functions. Indeed, logical XOR is widely used for [a variety of cryptographic applications][xor-crypto], as it is a low-cost mixing function that happens to be self-inverting, as well as preserving randomness (that is, a random bit XORed with a non-random bit will give a random bit).

## Specification

We describe the proposed operations in several stages. First, we specify a scheme for indexing individual bits (rather than whole bytes) in a `BuiltinByteString`. We then specify the semantics of each operation, as well as giving costing expectations and some examples. Lastly, we provide some laws that any implementation of these operations is expected to obey.

### Bit indexing scheme

We begin by observing that a `BuiltinByteString` is a packed array of bytes (that is, `BuiltinInteger`s in the range $[0, 255]$) according to the API provided by existing Plutus Core primitives. In particular, we have the ability to access individual bytes by index as a primitive operation. Thus, we can view a `BuiltinByteString` as an indexed collection of bytes; for any `BuiltinByteString` $b$ of length $n$, and any $i \in 0, 1, \ldots, n - 1$, we define $b\\{i\\}$ as the byte at index $i$ in $b$, as defined by the `builtinIndexByteString` primitive. In essence, for any `BuiltinByteString` of length `n`, we have _byte_ indexes as follows:

```
| Index | 0  | 1  | ... | n - 1    |
|-------|----|----| ... |----------|
| Byte  | w0 | w1 | ... | w(n - 1) |
```

To view a `BuiltinByteString` as an indexed collection of _bits_, we must first consider the bit ordering within a byte.
Suppose $i \in 0, 1, \ldots, 7$ is an +index into a byte $w$. We say that the bit at $i$ in $w$ is _set_ when + +$$ +\left \lfloor \frac{w}{2^{i}} \right \rfloor \mod 2 \equiv 1 +$$ + +Otherwise, the bit at $i$ in $w$ is _clear_. We define $w[i]$ to be $1$ when +the bit at $i$ in $w$ is set, and $0$ otherwise; this is the _value_ at index +$i$ in $w$. + +For example, consider the byte represented by the `BuiltinInteger` 42. By the +above scheme, we have the following: + +| Bit index | Set or clear? | +|-----------|---------------| +| $0$ | Clear | +| $1$ | Set | +| $2$ | Clear | +| $3$ | Set | +| $4$ | Clear | +| $5$ | Set | +| $6$ | Clear | +| $7$ | Clear | + +Put another way, we can view $w[i] = 1$ to mean that the $(i + 1)$ th least significant +digit in $w$'s binary representation is $1$, and likewise, $w[i] = 0$ would mean +that the $i$th least significant digit in $w$'s binary representation is $0$. +Continuing with the above example, $42$ is represented in binary as `00101010`; +we can see that the second-least-significant, fourth-least-significant, and +sixth-least-significant digits are `1`, and all the others are zero. This +description mirrors the way bytes are represented on machine architectures. + +We now extend the above scheme to `BuiltinByteString`s. Let $b$ be a +`BuiltinByteString` whose length is $n$, and let $i \in 0, 1, \ldots, 8 \cdot n - 1$. +For any $j \in 0, 1, \ldots, n - 1$, let $j^{\prime} = n - j - 1$. We say that the bit +at $i$ in $b$ is set if + +$$ +b\left\\{\left(\left\lfloor \frac{i}{8} \right\rfloor\right)^{\prime}\right\\}[i\mod 8] = 1 +$$ + +We define the bit at $i$ in $b$ being clear analogously. Similarly to bits in a +byte, we define $b[i]$ to be $1$ when the bit at $i$ in $b$ is set, and $0$ +otherwise; similarly to bytes, we term this the _value_ at index $i$ in $b$. + +As an example, consider the `BuiltinByteString` `[42, 57, 133]`: that is, the +`BuiltinByteString` $b$ such that $b\\{0\\} = 42$, $b\\{1\\} = 57$ and $b\\{2\\} += 133$. We observe that the range of 'valid' bit indexes $i$ into $b$ is in +$[0, 3 \cdot 8 - 1 = 23]$. Consider $i = 4$; by the definition above, this +corresponds to the _byte_ index 2, as $\left\lfloor\frac{4}{8}\right\rfloor = +0$, and $3 - 0 - 1 = 2$ (as $b$ has length $3$). Within the byte $133$, this +means we have $\left\lfloor\frac{133}{2^4}\right\rfloor \mod 2 \equiv 0$. Thus, +$b[4] = 0$. Consider instead the index $i = 19$; by the definition above, this +corresponds to the _byte_ index 0, as $\left\lfloor\frac{19}{8}\right\rfloor = +2$, and $3 - 2 - 1 = 0$. Within the byte $42$, this means we have +$\left\lfloor\frac{42}{2^3}\right\rfloor\mod 2 \equiv 1$. Thus, $b[19] = 1$. + +Put another way, our _byte_ indexes run 'the opposite way' to our _bit_ indexes. +Thus, for any `BuiltinByteString` of length $n$, we have _bit_ indexes relative +_byte_ indexes as follows: + +``` +| Byte index | 0 | 1 | ... | n - 1 | +|------------|--------------------------------|----| ... |-------------------------------| +| Byte | w0 | w1 | ... | w(n - 1) | +|------------|--------------------------------|----| ... |-------------------------------| +| Bit index | 8n - 1 | 8n - 2 | ... | 8n - 8 | ... | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | +``` + +### Operation semantics + +We describe precisely the operations we intend to implement, and their +semantics. 
These operations will have the following signatures:

* `bitwiseLogicalAnd :: BuiltinBool -> BuiltinByteString -> BuiltinByteString -> BuiltinByteString`
* `bitwiseLogicalOr :: BuiltinBool -> BuiltinByteString -> BuiltinByteString -> BuiltinByteString`
* `bitwiseLogicalXor :: BuiltinBool -> BuiltinByteString -> BuiltinByteString -> BuiltinByteString`
* `bitwiseLogicalComplement :: BuiltinByteString -> BuiltinByteString`
* `readBit :: BuiltinByteString -> BuiltinInteger -> BuiltinBool`
* `writeBits :: BuiltinByteString -> [(BuiltinInteger, BuiltinBool)] -> BuiltinByteString`
* `replicateByteString :: BuiltinInteger -> BuiltinInteger -> BuiltinByteString`

We assume the following costing, for both memory and execution time:

| Operation | Cost |
|-----------|------|
| `bitwiseLogicalAnd` | Linear in longest `BuiltinByteString` argument |
| `bitwiseLogicalOr` | Linear in longest `BuiltinByteString` argument |
| `bitwiseLogicalXor` | Linear in longest `BuiltinByteString` argument |
| `bitwiseLogicalComplement` | Linear in `BuiltinByteString` argument |
| `readBit` | Constant |
| `writeBits` | Additively linear in both arguments |
| `replicateByteString` | Linear in the _value_ of the first argument |

#### Padding versus truncation semantics

For the binary logical operations (that is, `bitwiseLogicalAnd`, `bitwiseLogicalOr` and `bitwiseLogicalXor`), we have two choices of semantics when handling `BuiltinByteString` arguments of different lengths. We can either produce a result whose length is the _minimum_ of the two arguments (which we call _truncation semantics_), or produce a result whose length is the _maximum_ of the two arguments (which we call _padding semantics_). As these can both be useful depending on context, we allow both, controlled by a `BuiltinBool` flag, on all three of these operations.

In cases where we have arguments of different lengths, in order to produce a result of the appropriate length, one of the arguments needs to be either padded or truncated. Let `short` and `long` refer to the `BuiltinByteString` argument of shorter length, and of longer length, respectively. The following table describes what happens to the arguments before the operation:

| Semantics | `short` | `long` |
|-----------|---------|--------|
| Padding | Pad at high _byte_ indexes | Unchanged |
| Truncation | Unchanged | Truncate high _byte_ indexes |

We pad with different bytes depending on operation: for `bitwiseLogicalAnd`, we pad with `0xFF`, while for `bitwiseLogicalOr` and `bitwiseLogicalXor` we pad with `0x00` instead. We refer to arguments so changed as _semantics-modified_ arguments.

For example, consider the `BuiltinByteString`s `x = [0x00, 0xF0, 0xFF]` and `y = [0xFF, 0xF0]`. The following table describes what the semantics-modified versions of these arguments would become for each operation and each semantics:

| Operation | Semantics | `x` | `y` |
|-----------|-----------|-----|-----|
| `bitwiseLogicalAnd` | Padding | `[0x00, 0xF0, 0xFF]` | `[0xFF, 0xF0, 0xFF]` |
| `bitwiseLogicalAnd` | Truncation | `[0x00, 0xF0]` | `[0xFF, 0xF0]` |
| `bitwiseLogicalOr` | Padding | `[0x00, 0xF0, 0xFF]` | `[0xFF, 0xF0, 0x00]` |
| `bitwiseLogicalOr` | Truncation | `[0x00, 0xF0]` | `[0xFF, 0xF0]` |
| `bitwiseLogicalXor` | Padding | `[0x00, 0xF0, 0xFF]` | `[0xFF, 0xF0, 0x00]` |
| `bitwiseLogicalXor` | Truncation | `[0x00, 0xF0]` | `[0xFF, 0xF0]` |

Based on the above, we observe that under padding semantics, the result of any of the listed operations would have a byte length of 3, while under truncation semantics, the result would have a byte length of 2 instead.

#### `bitwiseLogicalAnd`

`bitwiseLogicalAnd` takes three arguments; we name and describe them below.

1. Whether padding semantics should be used. If this argument is `False`, truncation semantics are used instead. This is the _padding semantics argument_, and has type `BuiltinBool`.
2. The first input `BuiltinByteString`. This is the _first data argument_.
3. The second input `BuiltinByteString`. This is the _second data argument_.

Let $b_1, b_2$ refer to the semantics-modified first data argument and semantics-modified second data argument respectively, and let $n$ be either of their lengths in bytes; see the [section on padding versus truncation semantics](#padding-versus-truncation-semantics) for the exact specification of this. Let the result of `bitwiseLogicalAnd`, given $b_1, b_2$ and some padding semantics argument, be $b_r$, also of length $n$ in bytes. We use $b_1\\{i\\}$ to refer to the byte at index $i$ in $b_1$ (and analogously for $b_2$, $b_r$); see the [section on the bit indexing scheme](#bit-indexing-scheme) for the exact specification of this.

For all $i \in 0, 1, \ldots, n - 1$, we have $b_r\\{i\\} = b_1\\{i\\} \text{ } \\& \text{ } b_2\\{i\\}$, where $\\&$ refers to a [bitwise AND][bitwise-and].

Some examples of the intended behaviour of `bitwiseLogicalAnd` follow. For brevity, we write `BuiltinByteString` literals as lists of hexadecimal values.

```
-- truncation semantics
bitwiseLogicalAnd False [] [0xFF] => []

bitwiseLogicalAnd False [0xFF] [] => []

bitwiseLogicalAnd False [0xFF] [0x00] => [0x00]

bitwiseLogicalAnd False [0x00] [0xFF] => [0x00]

bitwiseLogicalAnd False [0x4F, 0x00] [0xF4] => [0x44]

-- padding semantics
bitwiseLogicalAnd True [] [0xFF] => [0xFF]

bitwiseLogicalAnd True [0xFF] [] => [0xFF]

bitwiseLogicalAnd True [0xFF] [0x00] => [0x00]

bitwiseLogicalAnd True [0x00] [0xFF] => [0x00]

bitwiseLogicalAnd True [0x4F, 0x00] [0xF4] => [0x44, 0x00]
```

#### `bitwiseLogicalOr`

`bitwiseLogicalOr` takes three arguments; we name and describe them below.

1. Whether padding semantics should be used. If this argument is `False`, truncation semantics are used instead. This is the _padding semantics argument_, and has type `BuiltinBool`.
2. The first input `BuiltinByteString`. This is the _first data argument_.
3. The second input `BuiltinByteString`. This is the _second data argument_.

Let $b_1, b_2$ refer to the semantics-modified first data argument and semantics-modified second data argument respectively, and let $n$ be either of their lengths in bytes; see the [section on padding versus truncation semantics](#padding-versus-truncation-semantics) for the exact specification of this. Let the result of `bitwiseLogicalOr`, given $b_1, b_2$ and some padding semantics argument, be $b_r$, also of length $n$ in bytes. We use $b_1\\{i\\}$ to refer to the byte at index $i$ in $b_1$ (and analogously for $b_2$, $b_r$); see the [section on the bit indexing scheme](#bit-indexing-scheme) for the exact specification of this.

For all $i \in 0, 1, \ldots, n - 1$, we have $b_r\\{i\\} = b_1\\{i\\} \text{ } \| \text{ } b_2\\{i\\}$, where $\|$ refers to a [bitwise OR][bitwise-or].

Some examples of the intended behaviour of `bitwiseLogicalOr` follow. For brevity, we write `BuiltinByteString` literals as lists of hexadecimal values.

```
-- truncation semantics
bitwiseLogicalOr False [] [0xFF] => []

bitwiseLogicalOr False [0xFF] [] => []

bitwiseLogicalOr False [0xFF] [0x00] => [0xFF]

bitwiseLogicalOr False [0x00] [0xFF] => [0xFF]

bitwiseLogicalOr False [0x4F, 0x00] [0xF4] => [0xFF]

-- padding semantics
bitwiseLogicalOr True [] [0xFF] => [0xFF]

bitwiseLogicalOr True [0xFF] [] => [0xFF]

bitwiseLogicalOr True [0xFF] [0x00] => [0xFF]

bitwiseLogicalOr True [0x00] [0xFF] => [0xFF]

bitwiseLogicalOr True [0x4F, 0x00] [0xF4] => [0xFF, 0x00]
```

#### `bitwiseLogicalXor`

`bitwiseLogicalXor` takes three arguments; we name and describe them below.

1. Whether padding semantics should be used. If this argument is `False`, truncation semantics are used instead. This is the _padding semantics argument_, and has type `BuiltinBool`.
2. The first input `BuiltinByteString`. This is the _first data argument_.
3. The second input `BuiltinByteString`. This is the _second data argument_.

Let $b_1, b_2$ refer to the semantics-modified first data argument and semantics-modified second data argument respectively, and let $n$ be either of their lengths in bytes; see the [section on padding versus truncation semantics](#padding-versus-truncation-semantics) for the exact specification of this. Let the result of `bitwiseLogicalXor`, given $b_1, b_2$ and some padding semantics argument, be $b_r$, also of length $n$ in bytes. We use $b_1\\{i\\}$ to refer to the byte at index $i$ in $b_1$ (and analogously for $b_2$, $b_r$); see the [section on the bit indexing scheme](#bit-indexing-scheme) for the exact specification of this.

For all $i \in 0, 1, \ldots, n - 1$, we have $b_r\\{i\\} = b_1\\{i\\} \text{ } \oplus \text{ } b_2\\{i\\}$, where $\oplus$ refers to a [bitwise XOR][bitwise-xor].

Some examples of the intended behaviour of `bitwiseLogicalXor` follow. For brevity, we write `BuiltinByteString` literals as lists of hexadecimal values.

```
-- truncation semantics
bitwiseLogicalXor False [] [0xFF] => []

bitwiseLogicalXor False [0xFF] [] => []

bitwiseLogicalXor False [0xFF] [0x00] => [0xFF]

bitwiseLogicalXor False [0x00] [0xFF] => [0xFF]

bitwiseLogicalXor False [0x4F, 0x00] [0xF4] => [0xBB]

-- padding semantics
bitwiseLogicalXor True [] [0xFF] => [0xFF]

bitwiseLogicalXor True [0xFF] [] => [0xFF]

bitwiseLogicalXor True [0xFF] [0x00] => [0xFF]

bitwiseLogicalXor True [0x00] [0xFF] => [0xFF]

bitwiseLogicalXor True [0x4F, 0x00] [0xF4] => [0xBB, 0x00]
```

#### `bitwiseLogicalComplement`

`bitwiseLogicalComplement` takes a single argument, of type `BuiltinByteString`; let $b$ refer to that argument, and $n$ its length in bytes. Let $b_r$ be the result of `bitwiseLogicalComplement`; its length in bytes is also $n$. We use $b[i]$ to refer to the value at index $i$ of $b$ (and analogously for $b_r$); see the [section on the bit indexing scheme](#bit-indexing-scheme) for the exact specification of this.

For all $i \in 0, 1, \ldots , 8 \cdot n - 1$, we have

$$
b_r[i] = \begin{cases}
    0 & \text{if } b[i] = 1\\
    1 & \text{otherwise}\\
    \end{cases}
$$

Some examples of the intended behaviour of `bitwiseLogicalComplement` follow. For brevity, we write `BuiltinByteString` literals as lists of hexadecimal values.

```
bitwiseLogicalComplement [] => []

bitwiseLogicalComplement [0x0F] => [0xF0]

bitwiseLogicalComplement [0x4F, 0xF4] => [0xB0, 0x0B]
```

#### `readBit`

`readBit` takes two arguments; we name and describe them below.

1. The `BuiltinByteString` in which the bit we want to read can be found. This is the _data argument_.
2. A bit index into the data argument, of type `BuiltinInteger`. This is the _index argument_.

Let $b$ refer to the data argument, of length $n$ in bytes, and let $i$ refer to the index argument. We use $b[i]$ to refer to the value at index $i$ of $b$; see the [section on the bit indexing scheme](#bit-indexing-scheme) for the exact specification of this.

If $i < 0$ or $i \geq 8 \cdot n$, then `readBit` fails. In this case, the resulting error message must specify _at least_ the following information:

* That `readBit` failed due to an out-of-bounds index argument; and
* What `BuiltinInteger` was passed as an index argument.

Otherwise, if $b[i] = 0$, `readBit` returns `False`, and if $b[i] = 1$, `readBit` returns `True`.

Some examples of the intended behaviour of `readBit` follow. For brevity, we write `BuiltinByteString` literals as lists of hexadecimal values.

```
-- Indexing an empty BuiltinByteString fails
readBit [] 0 => error

readBit [] 345 => error

-- Negative indexes fail
readBit [] (-1) => error

readBit [0xFF] (-1) => error

-- Indexing reads 'from the end'
readBit [0xF4] 0 => False

readBit [0xF4] 1 => False

readBit [0xF4] 2 => True

readBit [0xF4] 3 => False

readBit [0xF4] 4 => True

readBit [0xF4] 5 => True

readBit [0xF4] 6 => True

readBit [0xF4] 7 => True

-- Out-of-bounds indexes fail
readBit [0xF4] 8 => error

readBit [0xFF, 0xF4] 16 => error

-- Larger indexes read backwards into the bytes from the end
readBit [0xF4, 0xFF] 10 => True
```

#### `writeBits`

`writeBits` takes two arguments; we name and describe them below.

1. The `BuiltinByteString` in which we want to change some bits. This is the _data argument_.
2. A list of index-value pairs, indicating which positions in the data argument should be changed to which value. This is the _change list argument_. Each index has type `BuiltinInteger`, while each value has type `BuiltinBool`.

Let $b$ refer to the data argument of length $n$ in bytes. We define `writeBits` recursively over the structure of the change list argument. Throughout, we use $b_r$ to refer to the result of `writeBits`, whose length is also $n$. We use $b[i]$ to refer to the value at index $i$ of $b$ (and analogously, $b_r$); see the [section on the bit indexing scheme](#bit-indexing-scheme) for the exact specification of this.

If the change list argument is empty, we return the data argument unchanged. Otherwise, let $(i, v)$ be the head of the change list argument, and $\ell$ its tail.
If $i < 0$ or $i \geq 8 \cdot n$, then `writeBits` fails. In this case, +the resulting error message must specify at _least_ the following information: + +* That `writeBits` failed due to an out-of-bounds index argument; and +* What `BuiltinInteger` was passed as $i$. + +Otherwise, for all $j \in 0, 1, \ldots 8 \cdot n - 1$, we have + +$$ +b_r[j] = \begin{cases} + 0 & \text{if } j = i \text{ and } v = \texttt{False}\\ + 1 & \text{if } j = i \text{ and } v = \texttt{True}\\ + b[j] & \text{otherwise}\\ + \end{cases} +$$ + +Then, if we did not fail as described above, we repeat the `writeBits` +operation, but with $b_r$ as the data argument and $\ell$ as the change list +argument. + +Some examples of the intended behaviour of `writeBits` follow. For +brevity, we write `BuiltinByteString` literals as lists of hexadecimal values. + +``` +-- Writing an empty BuiltinByteString fails +writeBits [] [(0, False)] => error + +-- Irrespective of index +writeBits [] [(15, False)] => error + +-- And value +writeBits [] [(0, True)] => error + +-- And multiplicity +writeBits [] [(0, False), (1, False)] => error + +-- Negative indexes fail +writeBits [0xFF] [((-1), False)] => error + +-- Even when mixed with valid ones +writeBits [0xFF] [(0, False), ((-1), True)] => error + +-- In any position +writeBits [0xFF] [((-1), True), (0, False)] => error + +-- Out-of-bounds indexes fail +writeBits [0xFF] [(8, False)] => error + +-- Even when mixed with valid ones +writeBits [0xFF] [(1, False), (8, False)] => error + +-- In any position +writeBits [0xFF] [(8, False), (1, False)] => error + +-- Bits are written 'from the end' +writeBits [0xFF] [(0, False)] => [0xFE] + +writeBits [0xFF] [(1, False)] => [0xFD] + +writeBits [0xFF] [(2, False)] => [0xFB] + +writeBits [0xFF] [(3, False)] => [0xF7] + +writeBits [0xFF] [(4, False)] => [0xEF] + +writeBits [0xFF] [(5, False)] => [0xDF] + +writeBits [0xFF] [(6, False)] => [0xBF] + +writeBits [0xFF] [(7, False)] => [0x7F] + +-- True value sets the bit +writeBits [0x00] [(5, True)] => [0x20] + +-- False value clears the bit +writeBits [0xFF] [(5, False)] => [0xDF] + +-- Larger indexes write backwards into the bytes from the end +writeBits [0xF4, 0xFF] [(10, False)] => [0xF0, 0xFF] + +-- Multiple items in a change list apply cumulatively +writeBits [0xF4, 0xFF] [(10, False), (1, False)] => [0xF0, 0xFD] + +writeBits (writeBits [0xF4, 0xFF] [(10, False)]) [(1, False)] => [0xF0, 0xFD] + +-- Order within a change list is unimportant among unique indexes +writeBits [0xF4, 0xFF] [(1, False), (10, False)] => [0xF0, 0xFD] + +-- But _is_ important for identical indexes +writeBits [0x00, 0xFF] [(10, True), (10, False)] => [0x00, 0xFF] + +writeBits [0x00, 0xFF] [(10, False), (10, True)] => [0x04, 0xFF] + +-- Setting an already set bit does nothing +writeBits [0xFF] [(0, True)] => [0xFF] + +-- Clearing an already clear bit does nothing +writeBits [0x00] [(0, False)] => [0x00] +``` + +#### `replicateByteString` + +`replicateByteString` takes two arguments; we name and describe them below. + +1. The desired result length, of type `BuiltinInteger`. This is the _length + argument_. +2. The byte to place at each position in the result, represented as a + `BuiltinInteger` (corresponding to the unsigned integer this byte encodes). + This is the _byte argument_. + +Let $n$ be the length argument, and $w$ the byte argument. If $n < 0$, then +`replicateByteString` fails. 
In this case, the resulting error message must specify +_at least_ the following information: + +* That `replicateByteString` failed due to a negative length argument; and +* What `BuiltinInteger` was passed as the length argument. + +If $n \geq 0$, and $w < 0$ or $w > 255$, then `replicateByteString` fails. In this +case, the resulting error message must specify _at least_ the following +information: + +* That `replicateByteString` failed due to the byte argument not being a valid + byte; and +* What `BuiltinInteger` was passed as the byte argument. + +Otherwise, let $b$ be the result of `replicateByteString`, and let $b\\{i\\}$ be the +byte at position $i$ of $b$, as per [the section describing the bit indexing +scheme](#bit-indexing-scheme). We have: + +* The length (in bytes) of $b$ is $n$; and +* For all $i \in 0, 1, \ldots, n - 1$, $b\\{i\\} = w$. + +Some examples of the intended behaviour of `replicateByteString` follow. For +brevity, we write `BuiltinByteString` literals as lists of hexadecimal values. + +``` +-- Replicating a negative number of times fails +replicateByteString (-1) 0 => error + +-- Irrespective of byte argument +replicateByteString (-1) 3 => error + +-- Out-of-bounds byte arguments fail +replicateByteString 1 (-1) => error + +replicateByteString 1 256 => error + +-- Irrespective of length argument +replicateByteString 4 (-1) => error + +replicateByteString 4 256 => error + +-- Length of result matches length argument, and all bytes are the same +replicateByteString 0 0xFF => [] + +replicateByteString 4 0xFF => [0xFF, 0xFF, 0xFF, 0xFF] +``` + +### Laws + +#### Binary operations + +We describe laws for all three operations that work over two +`BuiltinByteStrings`, that is, `bitwiseLogicalAnd`, `bitwiseLogicalOr` and +`bitwiseLogicalXor`, together, as many of them are similar (and related). We +describe padding semantics and truncation semantics laws, as they are slightly +different. + +All three operations above, under both padding and truncation semantics, are +[commutative semigroups][special-semigroups]. Thus, we have: + +```haskell +bitwiseLogicalAnd s x y = bitwiseLogicalAnd s y x + +bitwiseLogicalAnd s x (bitwiseLogicalAnd s y z) = bitwiseLogicalAnd s +(bitwiseLogicalAnd s x y) z + +-- and the same for bitwiseLogicalOr and bitwiseLogicalXor +``` + +Note that the semantics (designated as `s` above) must be consistent in order +for these laws to hold. 
Furthermore, under padding semantics, all the above +operations are [commutative monoids][commutative-monoid]: + +```haskell +bitwiseLogicalAnd True x "" = bitwiseLogicalAnd True "" x = x + +-- and the same for bitwiseLogicalOr and bitwiseLogicalXor +``` + +Under truncation semantics, `""` (that is, the empty `BuiltinByteString`) acts +instead as an [absorbing element][absorbing-element]: + +```haskell +bitwiseLogicalAnd False x "" = bitwiseLogicalAnd False "" x = "" + +-- and the same for bitwiseLogicalOr and bitwiseLogicalXor +``` + +`bitwiseLogicalAnd` and `bitwiseLogicalOr` are also [semilattices][semilattice], +due to their idempotence: + +```haskell +bitwiseLogicalAnd s x x = x + +-- and the same for bitwiseLogicalOr +``` + +`bitwiseLogicalXor` is instead involute: + +```haskell +bitwiseLogicalXor s x (bitwiseLogicalXor s x x) = bitwiseLogicalXor s +(bitwiseLogicalXor s x x) x = x +``` + +Additionally, under padding semantics, `bitwiseLogicalAnd` and +`bitwiseLogicalOr` are [self-distributive][distributive]: + +```haskell +bitwiseLogicalAnd True x (bitwiseLogicalAnd True y z) = bitwiseLogicalAnd True +(bitwiseLogicalAnd True x y) (bitwiseLogicalAnd True x z) + +bitwiseLogicalAnd True (bitwiseLogicalAnd True x y) z = bitwiseLogicalAnd True +(bitwiseLogicalAnd True x z) (bitwiseLogicalAnd True y z) + +-- and the same for bitwiseLogicalOr +``` + +Under truncation semantics, `bitwiseLogicalAnd` is only left-distributive over +itself, `bitwiseLogicalOr` and `bitwiseLogicalXor`: + +```haskell +bitwiseLogicalAnd False x (bitwiseLogicalAnd False y z) = bitwiseLogicalAnd +False (bitwiseLogicalAnd False x y) (bitwiseLogicalAnd False x z) + +bitwiseLogicalAnd False x (bitwiseLogicalOr False y z) = bitwiseLogicalOr False +(bitwiseLogicalAnd False x y) (bitwiseLogicalAnd False x z) + +bitwiseLogicalAnd False x (bitwiseLogicalXor False y z) = bitwiseLogicalXor +False (bitwiseLogicalAnd False x y) (bitwiseLogicalAnd False x z) +``` + +`bitwiseLogicalOr` under truncation semantics is left-distributive over itself +and `bitwiseLogicalAnd`: + +```haskell +bitwiseLogicalOr False x (bitwiseLogicalOr False y z) = bitwiseLogicalOr False +(bitwiseLogicalOr False x y) (bitwiseLogicalOr False x z) + +bitwiseLogicalOr False x (bitwiseLogicalAnd False y z) = bitwiseLogicalAnd False +(bitwiseLogicalOr False x y) (bitwiseLogicalOr False x z) +``` + +If the first and second data arguments to these operations have the same length, +these operations satisfy several additional laws. We describe these briefly +below, with the added note that, in this case, padding and truncation semantics +coincide: + +* `bitwiseLogicalAnd` and `bitwiseLogicalOr` form a [bounded lattice][lattice] +* `bitwiseLogicalAnd` is [distributive][distributive] over itself, `bitwiseLogicalOr` and + `bitwiseLogicalXor` +* `bitwiseLogicalOr` is [distributive][distributive] over itself and `bitwiseLogicalAnd` + +We do not specify these laws here, as they do not hold in general. At the same +time, we expect that any implementation of these operations will be subject to +these laws. 
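To make the padding and truncation behaviour these laws quantify over easier to check, the following is a hedged, off-chain reference sketch over `Data.ByteString` (the helper `binaryOp` and the module layout are ours, not part of the proposal); an implementation could, for instance, property-test the laws above against it.

```haskell
-- Reference sketch only (not the Plutus Core implementation): the padding and
-- truncation semantics of the three binary operations, over Data.ByteString.
import Data.Bits (xor, (.&.), (.|.))
import qualified Data.ByteString as BS
import Data.Word (Word8)

-- byteOp is the per-byte operation; padByte is 0xFF for AND, 0x00 for OR/XOR.
binaryOp :: (Word8 -> Word8 -> Word8) -> Word8 -> Bool -> BS.ByteString -> BS.ByteString -> BS.ByteString
binaryOp byteOp padByte usePadding x y
  | usePadding = BS.pack (BS.zipWith byteOp (extend x) (extend y))
  | otherwise  = BS.pack (BS.zipWith byteOp x y) -- zipWith truncates to the shorter argument
  where
    longest  = max (BS.length x) (BS.length y)
    extend b = b <> BS.replicate (longest - BS.length b) padByte -- pad at high byte indexes

bitwiseLogicalAnd, bitwiseLogicalOr, bitwiseLogicalXor
  :: Bool -> BS.ByteString -> BS.ByteString -> BS.ByteString
bitwiseLogicalAnd = binaryOp (.&.) 0xFF
bitwiseLogicalOr  = binaryOp (.|.) 0x00
bitwiseLogicalXor = binaryOp xor   0x00
```

For instance, `bitwiseLogicalAnd True (BS.pack [0x4F, 0x00]) (BS.pack [0xF4])` evaluates to `BS.pack [0x44, 0x00]`, matching the padding-semantics example given in the specification above.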
#### `bitwiseLogicalComplement`

The main law of `bitwiseLogicalComplement` is involution:

```haskell
bitwiseLogicalComplement (bitwiseLogicalComplement x) = x
```

In combination with `bitwiseLogicalAnd` and `bitwiseLogicalOr`, `bitwiseLogicalComplement` gives rise to the famous [De Morgan laws][de-morgan], irrespective of semantics:

```haskell
bitwiseLogicalComplement (bitwiseLogicalAnd s x y) = bitwiseLogicalOr s
(bitwiseLogicalComplement x) (bitwiseLogicalComplement y)

bitwiseLogicalComplement (bitwiseLogicalOr s x y) = bitwiseLogicalAnd s
(bitwiseLogicalComplement x) (bitwiseLogicalComplement y)
```

For `bitwiseLogicalXor`, we instead have (again, irrespective of semantics) that combining any argument with its complement yields a result in which every bit is set:

```haskell
bitwiseLogicalXor s x (bitwiseLogicalComplement x) = bitwiseLogicalComplement (bitwiseLogicalXor s x x)
```

#### Bit reading and modification

Throughout, we assume any index arguments to be 'in-bounds'; that is, all the index arguments used in the statements of any law are such that the operation they are applied to wouldn't produce an error.

The first law of `writeBits` is similar to the [set-twice law of lenses][lens-laws]:

```haskell
writeBits bs [(i, b1), (i, b2)] = writeBits bs [(i, b2)]
```

Together with `readBit`, we obtain the remaining two analogues to the lens laws:

```haskell
-- writing to an index, then reading from that index, gets you what you wrote
readBit (writeBits bs [(i, b)]) i = b

-- if you read from an index, then write that value to that same index, nothing
-- happens
writeBits bs [(i, readBit bs i)] = bs
```

Furthermore, given a fixed data argument, `writeBits` acts as a [monoid homomorphism][monoid-homomorphism] from lists under concatenation to functions under composition:

```haskell
writeBits bs [] = bs

writeBits bs (is <> js) = writeBits (writeBits bs is) js
```

#### `replicateByteString`

Given a fixed byte argument, `replicateByteString` acts as a [monoid homomorphism][monoid-homomorphism] from natural numbers under addition to `BuiltinByteString`s under concatenation:

```haskell
replicateByteString 0 w = ""

replicateByteString (n + m) w = replicateByteString n w <> replicateByteString m w
```

Additionally, for any 'in-bounds' index (that is, any index for which `builtinIndexByteString` won't error) `i`, we have

```haskell
builtinIndexByteString (replicateByteString n w) i = w
```

Lastly, we have

```haskell
builtinLengthOfByteString (replicateByteString n w) = n
```

## Rationale: how does this CIP achieve its goals?

The operations, and semantics, described in this CIP provide a set of well-defined bitwise logical operations, as well as bitwise access and modification, to allow cases similar to Case 1 to be performed efficiently and conveniently. Furthermore, the semantics we describe would be reasonably familiar to users of other programming languages (including Haskell) which have provisions for bitwise logical operations of this kind, as well as some way of extending these operations to operate on packed byte vectors. At the same time, there are several choices we have made that are somewhat unusual, or could potentially have been implemented differently based on existing work: most notably, our choice of bit indexing scheme, the padding-versus-truncation semantics, and the definition of bit modification over a whole list of changes. Among existing work, a particularly important example is [CIP-58][cip-58], which makes provisions for operations similar to the ones described here, and from which we differ in several important ways.
We clarify the reasoning behind our choices, +and how they differ from existing work, below. + +Aside from the issues we list below, we don't consider other operations +controversial. Indeed, `bitwiseLogicalComplement` has a direct parallel to the +implementation in [CIP-58][cip-58], and `replicateByteString` is a direct wrapper +around the `replicate` function in `ByteString`. Thus, we do not discuss them +further here. + +### Relationship to CIP-58 and CIP-121 + +Our work relates to both [CIP-58][cip-58] and [CIP-121][cip-121]. Essentially, +our goal with both this CIP and CIP-121 is to both break CIP-58 into more +manageable (and reviewable) parts, and also address some of the design choices +in CIP-58 that were not as good (or as clear) as they could have been. In this +regard, this CIP is a direct continuation of CIP-121; CIP-121 dealt with +conversions between `BuiltinByteString` and `BuiltinInteger`, while this CIP +handles bit indexing more generally, as well as 'parallel' logical operations +that operate on all the bits of a `BuiltinByteString` in bulk. + +We describe how our work in this CIP relates to (and in some cases, supercedes) +CIP-58, as well as how it follows on from CIP-121, in more detail below. + +### Bit indexing scheme + +The bit indexing scheme we describe here is designed around two +considerations. Firstly, we want operations on these bits, as well as those +results, to be as consistent and as predictable as possible: any individual +familiar with such operations on variable-length bitvectors from another +language shouldn't be surprised by the semantics. Secondly, we want to +anticipate future bitwise operation extensions, such as shifts and rotations, +and have the indexing scheme support efficient implementations (and predictable +semantics) for these. + +While prior art for bit access (and modification) exists +in almost any programming language, these are typically over types of fixed +width (usually bytes, machine words, or something similar); for variable-width +types, these typically are either not implemented at all, or if they are +implemented, this is done in an external library, with varying support for +certain operations. An example of the first is Haskell's `ByteString`, which has +no way to even access, much less modify, individual bits; an example of the +second is the [CRoaring][croaring] library for C, which supports all the +operations we describe in this CIP, along with multiple others. In the second +case, the _exact_ arrangement of bits inside the representation is not something +users are exposed to directly: instead, the bitvector type is opaque, and the +library only guarantees consistency of API. In our case, this is not a viable +choice, as we require bit access _and_ byte access to both work on +`BuiltinByteString`, and thus, some consistency of representation is required. + +The scheme for indexing bits within a byte that we describe in [the relevant +section](#bit-indexing-scheme) is the same as the one used by the `Data.Bits` +API in Haskell for `Word8` bit indexing, and mirrors the decisions of most +languages that provide such an API at all, as well as the conventional +definition of such operations as `(w >> i) & 1` for access, `w | (1 << i)` for +setting, and `w & ~(1 << i)` for clearing. 
We could choose to 'flip' this +indexing, by using a similar operation for 'index flipping' as we currently use +for bytes: essentially, instead of + +$$ +\left \lfloor \frac{w}{2^{i}} \right \rfloor \mod 2 \equiv 1 +$$ + +we would instead use + +$$ +\left \lfloor \frac{w}{2^{8 - i - 1}} \right \rfloor \mod 2 \equiv 1 +$$ + +to designate bit $i$ as set (and analogously for clear). Together with the +ability to choose _not_ to flip the _byte_ index, we get four possibilities, +which have [been described previously][too-many-ways-1]. For clarity, we name, +and describe, them below. Throughout, we use `n` as the length of a given +`BuiltinByteString` in bytes. + +The first possibility is that we 'flip' neither bit, nor byte, indexes. We call +this the _no-flip variant_: + +``` +| Byte index | 0 | 1 | ... | n - 1 | +|------------|-------------------------------|-------------------| ... |--------------------------------| +| Byte | w0 | w1 | ... | w(n - 1) | +|------------|-------------------------------|-------------------| ... |--------------------------------| +| Bit index | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | 15 | 14 | ... | 8 | ... | 8n - 1 | 8n - 2 | ... | 8n - 8 | +``` + +The second possibility is that we 'flip' _both_ bit and byte indexes. We call +this the _both-flip variant_: + +``` +| Byte index | 0 | ... | n - 2 | n - 1 | +|------------|--------------------------------| ... |------------------|-------------------------------| +| Byte | w0 | ... | w (n - 2) | w(n - 1) | +|------------|--------------------------------| ... |------------------|-------------------------------| +| Bit index | 8n - 8 | 8n - 7 | ... | 8n - 1 | ... | 8 | 9 | ... | 15 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | +``` + +The third possibility is that we 'flip' _bit_ indexes, but not _byte_ indexes. +We call this the _bit-flip variant_: + +``` +| Byte index | 0 | 1 | ... | n - 1 | +|------------|-------------------------------|--------------| ... |--------------------------------| +| Byte | w0 | w1 | ... | w(n - 1) | +|------------|-------------------------------|--------------| ... |--------------------------------| +| Bit index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | ... | 15 | ... | 8n - 8 | 8n - 7 | ... | 8n - 1 | +``` + +The fourth possibility is the one we describe in the [bit indexing scheme +section](#bit-indexing-scheme), which is also the scheme chosen by CIP-58. We +repeat it below for clarity: + +``` +| Byte index | 0 | 1 | ... | n - 1 | +|------------|--------------------------------|----| ... |-------------------------------| +| Byte | w0 | w1 | ... | w(n - 1) | +|------------|--------------------------------|----| ... |-------------------------------| +| Bit index | 8n - 1 | 8n - 2 | ... | 8n - 8 | ... | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | +``` + +On the face of it, these schemes appear equivalent: they are all consistent, and +all have formal descriptions, and quite similar ones at that. However, we +believe that only the one we chose is the correct one. To explain this, we +introduce two notions that we consider to be both intuitive and important, +then specify why our choice of indexing scheme fits those notions better than +any other. + +The first notion is _index locality_. Intuitively, this states that if +two indexes are 'close' (in that their absolute difference is small), the values +at those indexes should be 'close' (in that their positioning in memory should +be separated less). 
We believe this notion to be reasonable, as this is an +expectation from array indexing (and indeed, `BuiltinByteString` indexing), as +well as the reason why packed array data is efficient on modern memory +hierarchies. Extending this notion to bits, we can observe that the both-flip +and no-flip variants of the bit indexing scheme do not preserve index locality: +the separation between a bit at index $0$ and index $1$ is _significantly_ +different to the separation between a bit at index $7$ and index $8$ in both +representations, despite their absolute difference being identical. Thus, we +believe that these two variants are not viable, as they are not only confusing +from the point of view of behaviour, they would also make implementation of +future operations (such as shifts or rotations) significantly harder to both do, +and also reason about. Thus, only the bit-flip variant, as well as our choice, +remain contenders. + +The second notion is _most-significant-first conversion agreement_. This notion +refers to the [CIP-121 concept of the same name][cip-121-big-endian], and +ensures that (at least for the most-significant-first arrangement), the +following workflow doesn't produce unexpected results: + +1. Convert a `BuiltinInteger` to `BuiltinByteString` using + `builtinIntegerToByteString` with the most-significant-first endianness + argument. +2. Manipulate the bits of the result of step 1 using the operations specified + here. +3. Convert the result of step 2 back to a `BuiltinInteger` using + `builtinByteStringToInteger` with the most-significant-first endianness + argument. + +This workflow is directly relevant to Case 2. The Argon2 family of hashes use +certain inputs (which happen to be numbers) both as numbers (meaning, for +arithmetic operatons) and also as blocks of binary (specifically for XOR). This +is not unique to Argon2, or even hashing, as a range of operations (especially +in cryptographic applications) use similar approaches, whether for performance, +semantics or both. In such cases, users of our primitives (both logical and +conversion) must be confident that their changes 'translate' in the way they +expect between these two 'views' of the data. + +The choice of most-significant-first as the arrangement that we must agree with +seems somewhat arbitrary at a glance, for two reasons: firstly, it's not clear +why we must pick a single arrangement to be consistent with; secondly, the +reasoning for the choice of most-significant-first over most-significant-last +as the arrangement to agree with isn't immediately apparent. To see why this is +the only choice that we consider reasonable, we first observe that, according +to the definition of the bit indexing scheme given [in +the corresponding section](#bit-indexing-scheme), as well as the corresponding +definition for the bit-flip variant, we view a `BuiltinByteString` of length $n$ +as a binary natural number with exactly $8n$ digits, and the value at index $i$ +corresponds to the digits whose place value is either $2^i$ (for the bit-flip +variant), or $2^{8n - i - 1}$ (for our chosen method). Put another way, under +the specification for the bit-flip variant, the least significant binary digit +is first, whereas in our chosen specification, the least significant binary +digit is last. CIP-121's conversion primitives mirror this reasoning: the +most-significant-first arrangement corresponds to our chosen method, while the +most-significant-last arrangement corresponds to the bit-flip variant instead. 
The only difference is the base of the digits: for us, each digit is a bit (base 2), while for CIP-121's conversion primitives, each digit is a byte (base 256).

We also observe that, when we index a `BuiltinByteString`'s _bytes_, we get back a `BuiltinInteger`, which has a numerical value as a natural number in the range $[0, 255]$. Putting these two observations together, we consider it sensible that, given a non-empty `BuiltinByteString`, if we were to get the values at bit indexes $0$ through $7$, then sum their corresponding place values (treating clear bits as $0$ and set bits as the appropriate place value), we should get the same result as indexing whichever byte those bits came from.

Consider the `BuiltinByteString` whose only byte is $42$, whose representation is as follows:

```
| Byte index | 0        |
|------------|----------|
| Byte       | 00101010 |
```

We note that, if we index this `BuiltinByteString` at byte position $0$, we get back the answer $42$. Furthermore, if we use `builtinByteStringToInteger` from CIP-121 with such a `BuiltinByteString`, we get the result $42$ as well, regardless of the endianness argument we choose.

Under the bit-flip variant, the bit indexes of this `BuiltinByteString` would be as follows:

```
| Byte index | 0                             |
|------------|-------------------------------|
| Byte       | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
|------------|-------------------------------|
| Bit index  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
```

However, we immediately see a problem: under this indexing scheme, the digit at place value $2^2 = 4$ is $1$, which would suggest that in the binary representation of $42$, the corresponding digit is also $1$. This is not the case. Under our scheme of choice, by contrast, we get the correct answer:

```
| Byte index | 0                             |
|------------|-------------------------------|
| Byte       | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
|------------|-------------------------------|
| Bit index  | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
```

Here, the digit at place value $4$ is correctly $0$. This demonstrates that of the two indexing scheme possibilities that preserve index locality, only one can be consistent with _any_ choice of byte arrangement, whether most-significant-first or most-significant-last: the one we chose. This implies that we cannot be consistent with both arrangements while also preserving index locality.

Let us now consider a larger example `BuiltinByteString`:

```
| Byte index | 0        | 1        |
|------------|----------|----------|
| Byte       | 00101010 | 11011111 |
```

This would produce two different results when converted with `builtinByteStringToInteger`, depending on the choice of endianness argument:

* For the most-significant-first arrangement, the result is $42 \cdot 256 + 223 = 10975$.
* For the most-significant-last arrangement, the result is $223 \cdot 256 + 42 = 57130$.

These have the following 'breakdowns' in base-2:

* $10975 = 8192 + 2048 + 512 + 128 + 64 + 16 + 8 + 4 + 2 + 1 = 2^{13} + 2^{11} + 2^9 + 2^7 + 2^6 + 2^4 + 2^3 + 2^2 + 2^1 + 2^0$
* $57130 = 32768 + 16384 + 4096 + 2048 + 1024 + 512 + 256 + 32 + 8 + 2 = 2^{15} + 2^{14} + 2^{12} + 2^{11} + 2^{10} + 2^9 + 2^8 + 2^5 + 2^3 + 2^1$

Under the bit-flip variant, the bit indexes of this `BuiltinByteString` would be as follows:

```
| Byte index | 0                             | 1                                   |
|------------|-------------------------------|-------------------------------------|
| Byte       | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1       |
|------------|-------------------------------|-------------------------------------|
| Bit index  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
```

We immediately see a problem, as in this representation, it suggests that the digit at place value $2^1 = 2$ is zero. This is true of _neither_ $10975$ nor $57130$'s base-2 forms, both of which have a $1$ digit at the $2$ place value. This suggests that the bit-flip variant cannot agree with _either_ choice of arrangement in general.

However, if we view the bit indexes using our chosen scheme:

```
| Byte index | 0                                   | 1                             |
|------------|-------------------------------------|-------------------------------|
| Byte       | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0       | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
|------------|-------------------------------------|-------------------------------|
| Bit index  | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
```

the $2$ place value is correctly shown as having a digit value of $1$.

Combining these observations, we note that, assuming we value index locality, choosing our scheme gives us consistency with the most-significant-first arrangement, as well as consistency with byte indexing digit values, but choosing the bit-flip variant gives us _neither_. As we need both index locality _and_ consistency with at least one arrangement, our choice is the correct one. The fact that we also get byte indexing digit values being consistent is another reason for our choice.

### Padding versus truncation

For the operations defined in this CIP taking two `BuiltinByteString` arguments (that is, `bitwiseLogicalAnd`, `bitwiseLogicalOr`, and `bitwiseLogicalXor`), when the two arguments have identical lengths, the semantics are natural, mirroring the corresponding operations on the [Boolean algebra][boolean-algebra-2] $\textbf{2}^{8n}$, where $n$ is the length of either argument in bytes. When the arguments do _not_ have matching lengths, however, the situation becomes more complex, as there are several ways in which we could define these operations. The most natural possibilities are as follows; we repeat some of the definitions used [in the corresponding section](#padding-versus-truncation-semantics).

* Extend the shorter argument with the identity element (all-1s for `bitwiseLogicalAnd`, all-0s otherwise) to match the length of the longer argument, then perform the operation as if on matching-length arguments. We call this _padding semantics_.
* Ignore the bytes of the longer argument whose indexes would not be valid for the shorter argument, then perform the operation as if on matching-length arguments. We call this _truncation semantics_.
* Fail with an error whenever argument lengths don't match. We call this _match semantics_.
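As a brief, hypothetical illustration of the third option (reusing the off-chain reference sketch from the binary-operation laws above, and not part of the proposal): match semantics can be layered on top of padding semantics with a constant-time length check, a point we return to when weighing the options below.

```haskell
import qualified Data.ByteString as BS

-- Hypothetical helper, not part of the proposal: 'match semantics' recovered
-- from padding semantics via a length check. bitwiseLogicalAnd here is the
-- off-chain reference sketch given alongside the laws, not the builtin itself.
bitwiseLogicalAndMatch :: BS.ByteString -> BS.ByteString -> BS.ByteString
bitwiseLogicalAndMatch x y
  | BS.length x == BS.length y = bitwiseLogicalAnd True x y
  | otherwise = error "bitwiseLogicalAndMatch: argument lengths differ"
```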
+ +Furthermore, for both padding and truncation semantics, we can choose to pad (or +truncate) _low_ index bytes or _high_ index bytes. To illustrate the difference, +consider the two `BuiltinByteString`s (written as arrays of bytes for +simplicity) `[0xFF, 0x0F, 0x00]` and `[0x8F, 0x99]`. Under padding semantics, +padding low index bytes would give us `[0x00, 0x8F, 0x99]` (or `[0xFF, 0x8F, +0x99]` depending on operation), while padding high index bytes would give us +`[0x8F, 0x99, 0x00]` (or `[0x8F, 0x99, 0xFF]` depending on operation). Under +truncation semantics, truncating low index bytes would give us `[0x0F, 0x00]`, +while truncating high index bytes would give us `[0xFF, 0x0F]`. + +It is not a priori clear which of these we should choose: they are subject to +different laws (as evidenced by the [corresponding +section](#laws-and-examples)), none of which are strict supersets of each other +(at least not for _all_ inputs possible). While [CIP-58][cip-58] chose match +semantics, we believe this was not the correct decision: we use Case 1 to +justify the benefit of having other semantics described above available. + +Consider the following operation: given a bound $k$, a 'direction' (larger or +smaller), and an integer set, remove all elements indicates by the direction and +$k$ (that is, either smaller than $k$ or larger than $k$, as indicated by the +direction). This could be done using a `bitwiseLogicalAnd` and a mask. However, +under match semantics, this mask would have to have a length equal to the +integer set representation; under padding semantics, the mask would potentially +only need $\Theta(k)$ length, depending on direction. This is noteworthy, as +padding the mask would require an additional copy operation, only to produce a +value that would be discarded immediately. + +Consider instead the following operation: given two integer sets with different +(upper) bounds, take their intersection, producing an integer set whose size is +the minimum of the two. This can once again be done using `bitwiseLogicalAnd`, +but under match semantics (or padding semantics for that matter), we would first +have to slice the longer argument, while under truncation semantics, we wouldn't +need to. + +Match semantics can be useful for Case 1 as well. Consider the case that a +representation of an integer set is supplied as an +input datum (in its `Data` encoding). In order to deserialize it, we need to +verify at least whether it has the right length in bytes to represent an integer +set with a given bound. Under padding or truncation semantics, we would have to +check this at deserialization time; under exact match semantics, provided we +were sure that at least one argument is of a known size, we could simply perform +the necessary operations and let the match semantics error if given something +inappropriate. + +It is also worth noting that truncation semantics are well-established in the +Haskell ecosystem. Viewed another way, all of the operations under discussion in +this sections are specialized versions of the `zipWith` operation; Haskell +libraries provide this type of operation for a range of linear collections, +including lists, `Vector`s, and mostly notably, `ByteString`s. In all of these +cases, truncation semantics are what is implemented; it would be surprising to +developers coming from Haskell to find that they have to do additional work to +replicate them in Plutus. 
+
+While we don't anticipate direct use of Plutus Core primitives by developers
+(although this is not an unheard-of case), we should enable library authors to
+build familiar APIs _on top of_ Plutus Core primitives, which suggests
+truncation semantics should be available, at least as an option.
+
+All the above suggests that no _single_ choice of semantics will satisfy all
+reasonable needs, if only from the point of view of efficiency. Much as with
+the [CIP-121 primitives][conversion-cip] and endianness, this points towards
+allowing a choice of semantics for any given call. Ideally, we would allow a
+choice of any of the three options described above (along with a choice of low
+or high index padding or truncation); however, this is awkward to do in Plutus
+Core. While the choice between _two_ options is straightforward (pass a
+`BuiltinBool`), a choice among more options would require something like a
+`BuiltinInteger` argument with 'designated values' ('0 means match', '1 means
+low-index padding', and so on). This is not ideal, as it involves additional
+checks, argument redundancy, or both. In light of this, we made the following
+decisions:
+
+1. We would choose only two of the three semantics, and have the choice, for
+   any given call, controlled by a `BuiltinBool` flag; and
+2. For padding or truncation semantics, we would fix _either_ low or high
+   index padding (and truncation), rather than offering both.
+
+This leads naturally to two questions: which of the three semantics above we
+can afford to exclude, and whether low or high index padding (and truncation)
+should be chosen. We believe that the correct choices are to exclude match
+semantics, and to use high index padding and truncation, for several reasons.
+
+Firstly, we can simulate match semantics with either padding or truncation
+semantics, together with a length check (we sketch this simulation below).
+While we could similarly simulate padding semantics via match semantics, the
+effort (both developer and computational) required is significantly greater in
+that case: a length check is a constant-time operation, whereas manual padding
+is linear at best (and even then, only by using operations this CIP itself
+provides; otherwise it would be quadratic). On top of that, manual padding is
+much fiddlier and easier to get wrong.
+
+Secondly, truncation semantics are common enough in Haskell that we believe
+excluding them as an option would be both surprising and wrong. Any developer
+familiar with Haskell has interacted with various `zipWith` operations, and
+having our primitives behave differently from these at minimum creates
+friction for implementers of higher-level abstractions atop the primitives in
+this CIP. While Haskellers are not the only users of Plutus primitives
+(whether directly or not), there are enough of them that _not_ having
+truncation semantics available would create a lot of unnecessary friction.
+
+Thirdly, outside of error checking, match semantics give few benefits,
+performance or otherwise. The examples above demonstrate cases where padding
+and truncation semantics lead to better performance, less fiddly
+implementations, or both: finding such a case for match semantics outside of
+error checking is difficult at best.
+
+This combination of reasoning leads us to consider padding and truncation as
+the two semantics we should retain, and this guided our implementation choices
+accordingly.
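+
+As noted above, match semantics can be recovered cheaply from either retained
+semantics with a constant-time length check. The following is a minimal
+sketch; it assumes, purely for illustration, that the `BuiltinBool` flag comes
+first and that `False` selects truncation semantics, and it uses the existing
+`lengthOfByteString` builtin together with `traceError` from
+`PlutusTx.Prelude` for the check and the failure case.
+
+```haskell
+-- Recovering match semantics from a truncating bitwiseLogicalAnd; the
+-- argument order and the meaning of the flag are assumptions.
+matchingLogicalAnd ::
+  BuiltinByteString -> BuiltinByteString -> BuiltinByteString
+matchingLogicalAnd bs1 bs2
+  | lengthOfByteString bs1 == lengthOfByteString bs2 =
+      bitwiseLogicalAnd False bs1 bs2
+  | otherwise = traceError "matchingLogicalAnd: length mismatch"
+```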
+
+With regard to padding (or truncating) at low or high indexes: since we
+necessarily pad (or truncate) whole bytes, the corresponding operations
+effectively view `BuiltinByteString`s as linear collections of bytes rather
+than of bits. Viewed this way, the `zipWith` analogy with Haskell suggests
+that truncating high is the correct choice: truncating low would be quite
+surprising to a Haskeller familiar with how `zipWith`-style operations behave.
+Furthermore, as padding low while truncating high would be confusing (and
+arguably quite strange), padding high seems like the correct choice too. We
+therefore decided to both pad and truncate at high indexes.
+
+### Bit setting
+
+`writeBits` in our description takes a change list argument, allowing multiple
+bits to be changed at once. This is an added complexity, and an argument can
+be made that something similar to the following operation would be sufficient:
+
+```haskell
+writeBit ::
+  BuiltinByteString -> BuiltinInteger -> BuiltinBool -> BuiltinByteString
+```
+
+Essentially, `writeBit bs i v` would be equivalent to `writeBits bs [(i, v)]`
+as currently defined. This was the choice made by [CIP-58][cip-58], with
+simplicity in mind.
+
+At the same time, due to the immutability semantics of Plutus Core, each time
+`writeBit` is called, we would have to copy its `BuiltinByteString` argument.
+Thus, a sequence of $k$ `writeBit` calls in a fold over a `BuiltinByteString`
+of length $n$ would require $\Theta(nk)$ time and $\Theta(nk)$ space.
+Meanwhile, if we instead used `writeBits`, the time drops to $\Theta(n + k)$
+and the space to $\Theta(n)$, which is a non-trivial improvement. While we
+cannot avoid the worst-case copying behaviour of `writeBit` (if we have a
+critical path of read-write dependencies of length $k$, for example), and
+'list packing' carries some cost, we have [benchmarks][benchmarks-bits] that
+show not only that this 'packing cost' is essentially zero, but that for
+`BuiltinByteString`s of 30 bytes or fewer, copying completely overwhelms the
+work required to modify the bits specified in the change list argument. This
+alone is good evidence for having `writeBits` instead; indeed, there is prior
+art for doing this [in the `vector` library][vector], for the exact reasons we
+give here.
+
+An argument could also be made that this design should be extended to the
+other primitive operations in this CIP which both take `BuiltinByteString`
+arguments and produce `BuiltinByteString` results. We believe that this is not
+as justified as in the `writeBits` case, for several reasons. Firstly, for
+`bitwiseLogicalComplement`, it's not clear what benefit this would have at
+all: the only possible signature such an operation could have is
+`[BuiltinByteString] -> [BuiltinByteString]`, which in effect would be a
+specialized form of mapping (as the sketch below makes explicit). While an
+argument could be made for a _general_ form of mapping as a Plutus Core
+primitive, it wouldn't be reasonable for an operation like this to be
+considered for such.
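+
+To make that point concrete, such a 'bulk' complement is definable in a single
+line from the singular operation using ordinary list mapping; the name
+`bitwiseLogicalComplements` below is hypothetical, shown only for
+illustration.
+
+```haskell
+-- A 'bulk' complement adds nothing beyond what list mapping already provides.
+bitwiseLogicalComplements :: [BuiltinByteString] -> [BuiltinByteString]
+bitwiseLogicalComplements = map bitwiseLogicalComplement
+```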
+
+Secondly, the performance benefits of such an operation aren't nearly as
+significant in theory, and likely wouldn't be in practice either. Consider
+this hypothetical operation (with fold semantics):
+
+```haskell
+bitwiseLogicalXors :: BuiltinBool -> [BuiltinByteString] -> BuiltinByteString
+```
+
+Simulating this operation as a fold using `bitwiseLogicalXor`, in the worst
+case, irrespective of padding or truncation semantics, requires $\Theta(nk)$
+time and space, where $n$ is the size of each `BuiltinByteString` in the
+argument list, and $k$ is the length of the argument list itself. Using
+`bitwiseLogicalXors` instead would reduce the space required to $\Theta(n)$,
+but would not affect the time complexity at all.
+
+Lastly, it is questionable whether 'bulk' operations like `bitwiseLogicalXors`
+above would see as much use as `writeBits`. In the context of Case 1,
+`bitwiseLogicalXors` corresponds to taking the symmetric difference of
+multiple integer sets; it seems unlikely that the number of sets we'd want to
+do this with would often exceed two. However, in the same context, `writeBits`
+corresponds to constructing an integer set given a list of members (or, for
+that matter, _non_-members): this is an operation that is both required by the
+case description, and also much more likely to be used often.
+
+On the basis of the above, we believe that implementing `writeBits` as a
+'bulk' operation, while leaving the others 'singular', is the right choice.
+
+## Path to Active
+
+### Acceptance Criteria
+
+We consider the following criteria to be essential for acceptance:
+
+* A proof-of-concept implementation of the operations specified in this
+  document, outside of the Plutus source tree. The implementation must be in
+  GHC Haskell, without relying on the FFI.
+* The proof-of-concept implementation must have tests, demonstrating that it
+  behaves as the specification requires.
+* The proof-of-concept implementation must demonstrate that it will
+  successfully build, and pass its tests, using all GHC versions currently
+  usable to build Plutus (8.10, 9.2 and 9.6 at the time of writing), across
+  all [Tier 1][tier-1-ghc] platforms.
+
+Ideally, the implementation should also demonstrate its performance
+characteristics with well-designed benchmarks.
+
+### Implementation Plan
+
+MLabs has begun the [implementation of the proof-of-concept][mlabs-impl]
+required by the acceptance criteria. Upon completion, we will send a pull
+request to Plutus with the implementation of the primitives for Plutus Core,
+mirroring the proof-of-concept.
+
+## Copyright
+
+This CIP is licensed under [Apache-2.0](http://www.apache.org/licenses/LICENSE-2.0).
+ +[mlabs-impl]: https://github.com/mlabs-haskell/plutus-integer-bytestring +[tier-1-ghc]: https://gitlab.haskell.org/ghc/ghc/-/wikis/platforms#tier-1-platforms +[special-semigroups]: https://en.wikipedia.org/wiki/Special_classes_of_semigroups +[commutative-monoid]: https://en.wikipedia.org/wiki/Monoid#Commutative_monoid +[absorbing-element]: https://en.wikipedia.org/wiki/Zero_element#Absorbing_elements +[semilattice]: https://en.wikipedia.org/wiki/Semilattice +[distributive]: https://en.wikipedia.org/wiki/Distributive_property +[lattice]: https://en.wikipedia.org/wiki/Lattice_(order) +[de-morgan]: https://en.wikipedia.org/wiki/De_Morgan%27s_laws +[lens-laws]: https://oleg.fi/gists/posts/2017-04-18-glassery.html#laws:lens +[monoid-homomorphism]: https://en.wikipedia.org/wiki/Monoid#Monoid_homomorphisms +[succinct-data-structures]: https://en.wikipedia.org/wiki/Succinct_data_structure +[adjacency-matrix]: https://en.wikipedia.org/wiki/Adjacency_matrix +[binary-matrix]: https://en.wikipedia.org/wiki/Logical_matrix +[go-binary-matrix]: https://senseis.xmp.net/?BinMatrix +[finite-state-machine-4vl]: https://en.wikipedia.org/wiki/Four-valued_logic#Matrix_machine +[bitvector-apps]: https://en.wikipedia.org/wiki/Bit_array#Applications +[bitmap-index-compression]: https://en.wikipedia.org/wiki/Bitmap_index#Compression +[cip-58]: https://github.com/cardano-foundation/CIPs/tree/master/CIP-0058 +[croaring]: https://github.com/RoaringBitmap/CRoaring +[too-many-ways-1]: https://fgiesen.wordpress.com/2018/02/19/reading-bits-in-far-too-many-ways-part-1 +[conversion-cip]: https://github.com/mlabs-haskell/CIPs/blob/koz/to-from-bytestring/CIP-0121/README.md +[benchmarks-bits]: https://github.com/mlabs-haskell/plutus-integer-bytestring/blob/main/bench/naive/Main.hs#L74-L83 +[vector]: https://hackage.haskell.org/package/vector-0.13.1.0/docs/Data-Vector.html#v:-47--47- +[boolean-algebra-2]: https://en.wikipedia.org/wiki/Two-element_Boolean_algebra +[hashing]: https://en.wikipedia.org/wiki/Hash_function +[sha256]: https://en.wikipedia.org/wiki/Secure_Hash_Algorithms +[blake2b]: https://en.wikipedia.org/wiki/BLAKE_(hash_function) +[argon2]: https://en.wikipedia.org/wiki/Argon2 +[xor-crypto]: https://en.wikipedia.org/wiki/Exclusive_or#Bitwise_operation +[cip-121-big-endian]: https://github.com/mlabs-haskell/CIPs/blob/koz/to-from-bytestring/CIP-0121/README.md#representation +[bitwise-and]: https://en.wikipedia.org/wiki/Bitwise_operation#AND +[bitwise-or]: https://en.wikipedia.org/wiki/Bitwise_operation#OR +[bitwise-xor]: https://en.wikipedia.org/wiki/Bitwise_operation#XOR