Skip to content

Commit

Permalink
Merge pull request #207 from marshallpierce/mp/api-rework
Browse files Browse the repository at this point in the history
API rework
  • Loading branch information
marshallpierce authored Jan 6, 2023
2 parents 10b73f6 + 726f784 commit 8350376
Show file tree
Hide file tree
Showing 32 changed files with 1,294 additions and 938 deletions.
2 changes: 1 addition & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
[package]
name = "base64"
version = "0.20.0"
version = "0.21.0-rc.1"
authors = ["Alice Maz <alice@alicemaz.com>", "Marshall Pierce <marshall@mpierce.org>"]
description = "encodes and decodes base64 as bytes or utf8"
repository = "https://github.com/marshallpierce/rust-base64"
Expand Down
14 changes: 0 additions & 14 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,20 +14,6 @@ e.g. `decode_engine_slice` decodes into an existing `&mut [u8]` and is pretty fa
whereas `decode_engine` allocates a new `Vec<u8>` and returns it, which might be more convenient in some cases, but is
slower (although still fast enough for almost any purpose) at 2.1 GiB/s.

## Example

```rust
use base64::{encode, decode};

fn main() {
let a = b"hello world";
let b = "aGVsbG8gd29ybGQ=";

assert_eq!(encode(a), b);
assert_eq!(a, &decode(b).unwrap()[..]);
}
```

See the [docs](https://docs.rs/base64) for all the details.

## FAQ
Expand Down
129 changes: 112 additions & 17 deletions RELEASE-NOTES.md
Original file line number Diff line number Diff line change
@@ -1,22 +1,106 @@
# 0.21.0

(not yet released)


## Migration

### Functions

| < 0.20 function | 0.21 equivalent |
|-------------------------|-------------------------------------------------------------------------------------|
| `encode()` | `engine::general_purpose::STANDARD.encode()` or `prelude::BASE64_STANDARD.encode()` |
| `encode_config()` | `engine.encode()` |
| `encode_config_buf()` | `engine.encode_string()` |
| `encode_config_slice()` | `engine.encode_slice()` |
| `decode()` | `engine::general_purpose::STANDARD.decode()` or `prelude::BASE64_STANDARD.decode()` |
| `decode_config()` | `engine.decode()` |
| `decode_config_buf()` | `engine.decode_vec()` |
| `decode_config_slice()` | `engine.decode_slice()` |

The short-lived 0.20 functions were the 0.13 functions with `config` replaced with `engine`.

### Padding

If applicable, use the preset engines `engine::STANDARD`, `engine::STANDARD_NO_PAD`, `engine::URL_SAFE`,
or `engine::URL_SAFE_NO_PAD`.
The `NO_PAD` ones require that padding is absent when decoding, and the others require that
canonical padding is present .

If you need the < 0.20 behavior that did not care about padding, or want to recreate < 0.20.0's predefined `Config`s
precisely, see the following table.

| 0.13.1 Config | 0.20.0+ alphabet | `encode_padding` | `decode_padding_mode` |
|-----------------|------------------|------------------|-----------------------|
| STANDARD | STANDARD | true | Indifferent |
| STANDARD_NO_PAD | STANDARD | false | Indifferent |
| URL_SAFE | URL_SAFE | true | Indifferent |
| URL_SAFE_NO_PAD | URL_SAFE | false | Indifferent |

# 0.21.0-rc.1

- Restore the ability to decode into a slice of precisely the correct length with `Engine.decode_slice_unchecked`.
- Add `Engine` as a `pub use` in `prelude`.

# 0.21.0-beta.2

## Breaking changes

- Re-exports of preconfigured engines in `engine` are removed in favor of `base64::prelude::...` that are better suited to those who wish to `use` the entire path to a name.

# 0.21.0-beta.1

## Breaking changes

- `FastPortable` was only meant to be an interim name, and shouldn't have shipped in 0.20. It is now `GeneralPurpose` to
make its intended usage more clear.
- `GeneralPurpose` and its config are now `pub use`'d in the `engine` module for convenience.
- Change a few `from()` functions to be `new()`. `from()` causes confusing compiler errors because of confusion
with `From::from`, and is a little misleading because some of those invocations are not very cheap as one would
usually expect from a `from` call.
- `encode*` and `decode*` top level functions are now methods on `Engine`.
- `DEFAULT_ENGINE` was replaced by `engine::general_purpose::STANDARD`
- Predefined engine consts `engine::general_purpose::{STANDARD, STANDARD_NO_PAD, URL_SAFE, URL_SAFE_NO_PAD}`
- These are `pub use`d into `engine` as well
- The `*_slice` decode/encode functions now return an error instead of panicking when the output slice is too small
- As part of this, there isn't now a public way to decode into a slice _exactly_ the size needed for inputs that
aren't multiples of 4 tokens. If adding up to 2 bytes to always be a multiple of 3 bytes for the decode buffer is
a problem, file an issue.

## Other changes

- `decoded_len_estimate()` is provided to make it easy to size decode buffers correctly.

# 0.20.0

### Breaking changes
## Breaking changes

- Update MSRV to 1.57.0
- Decoding can now either ignore padding, require correct padding, or require no padding. The default is to require correct padding.
- The `NO_PAD` config now requires that padding be absent when decoding.
- Decoding can now either ignore padding, require correct padding, or require no padding. The default is to require
correct padding.
- The `NO_PAD` config now requires that padding be absent when decoding.

## 0.20.0-alpha.1

### Breaking changes
- Extended the `Config` concept into the `Engine` abstraction, allowing the user to pick different encoding / decoding implementations.
- What was formerly the only algorithm is now the `FastPortable` engine, so named because it's portable (works on any CPU) and relatively fast.
- This opens the door to a portable constant-time implementation ([#153](https://github.com/marshallpierce/rust-base64/pull/153), presumably `ConstantTimePortable`?) for security-sensitive applications that need side-channel resistance, and CPU-specific SIMD implementations for more speed.
- Standard base64 per the RFC is available via `DEFAULT_ENGINE`. To use different alphabets or other settings (padding, etc), create your own engine instance.
- `CharacterSet` is now `Alphabet` (per the RFC), and allows creating custom alphabets. The corresponding tables that were previously code-generated are now built dynamically.
- Since there are already multiple breaking changes, various functions are renamed to be more consistent and discoverable.

- Extended the `Config` concept into the `Engine` abstraction, allowing the user to pick different encoding / decoding
implementations.
- What was formerly the only algorithm is now the `FastPortable` engine, so named because it's portable (works on
any CPU) and relatively fast.
- This opens the door to a portable constant-time
implementation ([#153](https://github.com/marshallpierce/rust-base64/pull/153),
presumably `ConstantTimePortable`?) for security-sensitive applications that need side-channel resistance, and
CPU-specific SIMD implementations for more speed.
- Standard base64 per the RFC is available via `DEFAULT_ENGINE`. To use different alphabets or other settings (
padding, etc), create your own engine instance.
- `CharacterSet` is now `Alphabet` (per the RFC), and allows creating custom alphabets. The corresponding tables that
were previously code-generated are now built dynamically.
- Since there are already multiple breaking changes, various functions are renamed to be more consistent and
discoverable.
- MSRV is now 1.47.0 to allow various things to use `const fn`.
- `DecoderReader` now owns its inner reader, and can expose it via `into_inner()`. For symmetry, `EncoderWriter` can do the same with its writer.
- `DecoderReader` now owns its inner reader, and can expose it via `into_inner()`. For symmetry, `EncoderWriter` can do
the same with its writer.
- `encoded_len` is now public so you can size encode buffers precisely.

# 0.13.1
Expand All @@ -28,8 +112,11 @@
- Config methods are const
- Added `EncoderStringWriter` to allow encoding directly to a String
- `EncoderWriter` now owns its delegate writer rather than keeping a reference to it (though refs still work)
- As a consequence, it is now possible to extract the delegate writer from an `EncoderWriter` via `finish()`, which returns `Result<W>` instead of `Result<()>`. If you were calling `finish()` explicitly, you will now need to use `let _ = foo.finish()` instead of just `foo.finish()` to avoid a warning about the unused value.
- When decoding input that has both an invalid length and an invalid symbol as the last byte, `InvalidByte` will be emitted instead of `InvalidLength` to make the problem more obvious.
- As a consequence, it is now possible to extract the delegate writer from an `EncoderWriter` via `finish()`, which
returns `Result<W>` instead of `Result<()>`. If you were calling `finish()` explicitly, you will now need to
use `let _ = foo.finish()` instead of just `foo.finish()` to avoid a warning about the unused value.
- When decoding input that has both an invalid length and an invalid symbol as the last byte, `InvalidByte` will be
emitted instead of `InvalidLength` to make the problem more obvious.

# 0.12.2

Expand All @@ -47,23 +134,31 @@
- A minor performance improvement in encoding

# 0.11.0

- Minimum rust version 1.34.0
- `no_std` is now supported via the two new features `alloc` and `std`.

# 0.10.1

- Minimum rust version 1.27.2
- Fix bug in streaming encoding ([#90](https://github.com/marshallpierce/rust-base64/pull/90)): if the underlying writer didn't write all the bytes given to it, the remaining bytes would not be retried later. See the docs on `EncoderWriter::write`.
- Fix bug in streaming encoding ([#90](https://github.com/marshallpierce/rust-base64/pull/90)): if the underlying writer
didn't write all the bytes given to it, the remaining bytes would not be retried later. See the docs
on `EncoderWriter::write`.
- Make it configurable whether or not to return an error when decoding detects excess trailing bits.

# 0.10.0

- Remove line wrapping. Line wrapping was never a great conceptual fit in this library, and other features (streaming encoding, etc) either couldn't support it or could support only special cases of it with a great increase in complexity. Line wrapping has been pulled out into a [line-wrap](https://crates.io/crates/line-wrap) crate, so it's still available if you need it.
- `Base64Display` creation no longer uses a `Result` because it can't fail, which means its helper methods for common
configs that `unwrap()` for you are no longer needed
- Remove line wrapping. Line wrapping was never a great conceptual fit in this library, and other features (streaming
encoding, etc) either couldn't support it or could support only special cases of it with a great increase in
complexity. Line wrapping has been pulled out into a [line-wrap](https://crates.io/crates/line-wrap) crate, so it's
still available if you need it.
- `Base64Display` creation no longer uses a `Result` because it can't fail, which means its helper methods for
common
configs that `unwrap()` for you are no longer needed
- Add a streaming encoder `Write` impl to transparently base64 as you write.
- Remove the remaining `unsafe` code.
- Remove whitespace stripping to simplify `no_std` support. No out of the box configs use it, and it's trivial to do yourself if needed: `filter(|b| !b" \n\t\r\x0b\x0c".contains(b)`.
- Remove whitespace stripping to simplify `no_std` support. No out of the box configs use it, and it's trivial to do
yourself if needed: `filter(|b| !b" \n\t\r\x0b\x0c".contains(b)`.
- Detect invalid trailing symbols when decoding and return an error rather than silently ignoring them.

# 0.9.3
Expand Down
40 changes: 18 additions & 22 deletions benches/benchmarks.rs
Original file line number Diff line number Diff line change
@@ -1,36 +1,34 @@
#[macro_use]
extern crate criterion;

use base64::display;
use base64::{
decode, decode_engine_slice, decode_engine_vec, encode, encode_engine_slice,
encode_engine_string, write,
display,
engine::{general_purpose::STANDARD, Engine},
write,
};

use base64::engine::DEFAULT_ENGINE;
use criterion::{black_box, Bencher, BenchmarkId, Criterion, Throughput};
use rand::{Rng, SeedableRng};
use std::io::{self, Read, Write};

fn do_decode_bench(b: &mut Bencher, &size: &usize) {
let mut v: Vec<u8> = Vec::with_capacity(size * 3 / 4);
fill(&mut v);
let encoded = encode(&v);
let encoded = STANDARD.encode(&v);

b.iter(|| {
let orig = decode(&encoded);
let orig = STANDARD.decode(&encoded);
black_box(&orig);
});
}

fn do_decode_bench_reuse_buf(b: &mut Bencher, &size: &usize) {
let mut v: Vec<u8> = Vec::with_capacity(size * 3 / 4);
fill(&mut v);
let encoded = encode(&v);
let encoded = STANDARD.encode(&v);

let mut buf = Vec::new();
b.iter(|| {
decode_engine_vec(&encoded, &mut buf, &DEFAULT_ENGINE).unwrap();
STANDARD.decode_vec(&encoded, &mut buf).unwrap();
black_box(&buf);
buf.clear();
});
Expand All @@ -39,28 +37,28 @@ fn do_decode_bench_reuse_buf(b: &mut Bencher, &size: &usize) {
fn do_decode_bench_slice(b: &mut Bencher, &size: &usize) {
let mut v: Vec<u8> = Vec::with_capacity(size * 3 / 4);
fill(&mut v);
let encoded = encode(&v);
let encoded = STANDARD.encode(&v);

let mut buf = Vec::new();
buf.resize(size, 0);
b.iter(|| {
decode_engine_slice(&encoded, &mut buf, &DEFAULT_ENGINE).unwrap();
STANDARD.decode_slice(&encoded, &mut buf).unwrap();
black_box(&buf);
});
}

fn do_decode_bench_stream(b: &mut Bencher, &size: &usize) {
let mut v: Vec<u8> = Vec::with_capacity(size * 3 / 4);
fill(&mut v);
let encoded = encode(&v);
let encoded = STANDARD.encode(&v);

let mut buf = Vec::new();
buf.resize(size, 0);
buf.truncate(0);

b.iter(|| {
let mut cursor = io::Cursor::new(&encoded[..]);
let mut decoder = base64::read::DecoderReader::from(&mut cursor, &DEFAULT_ENGINE);
let mut decoder = base64::read::DecoderReader::new(&mut cursor, &STANDARD);
decoder.read_to_end(&mut buf).unwrap();
buf.clear();
black_box(&buf);
Expand All @@ -71,7 +69,7 @@ fn do_encode_bench(b: &mut Bencher, &size: &usize) {
let mut v: Vec<u8> = Vec::with_capacity(size);
fill(&mut v);
b.iter(|| {
let e = encode(&v);
let e = STANDARD.encode(&v);
black_box(&e);
});
}
Expand All @@ -80,7 +78,7 @@ fn do_encode_bench_display(b: &mut Bencher, &size: &usize) {
let mut v: Vec<u8> = Vec::with_capacity(size);
fill(&mut v);
b.iter(|| {
let e = format!("{}", display::Base64Display::from(&v, &DEFAULT_ENGINE));
let e = format!("{}", display::Base64Display::new(&v, &STANDARD));
black_box(&e);
});
}
Expand All @@ -90,7 +88,7 @@ fn do_encode_bench_reuse_buf(b: &mut Bencher, &size: &usize) {
fill(&mut v);
let mut buf = String::new();
b.iter(|| {
encode_engine_string(&v, &mut buf, &DEFAULT_ENGINE);
STANDARD.encode_string(&v, &mut buf);
buf.clear();
});
}
Expand All @@ -101,9 +99,7 @@ fn do_encode_bench_slice(b: &mut Bencher, &size: &usize) {
let mut buf = Vec::new();
// conservative estimate of encoded size
buf.resize(v.len() * 2, 0);
b.iter(|| {
encode_engine_slice(&v, &mut buf, &DEFAULT_ENGINE);
});
b.iter(|| STANDARD.encode_slice(&v, &mut buf).unwrap());
}

fn do_encode_bench_stream(b: &mut Bencher, &size: &usize) {
Expand All @@ -114,7 +110,7 @@ fn do_encode_bench_stream(b: &mut Bencher, &size: &usize) {
buf.reserve(size * 2);
b.iter(|| {
buf.clear();
let mut stream_enc = write::EncoderWriter::from(&mut buf, &DEFAULT_ENGINE);
let mut stream_enc = write::EncoderWriter::new(&mut buf, &STANDARD);
stream_enc.write_all(&v).unwrap();
stream_enc.flush().unwrap();
});
Expand All @@ -125,7 +121,7 @@ fn do_encode_bench_string_stream(b: &mut Bencher, &size: &usize) {
fill(&mut v);

b.iter(|| {
let mut stream_enc = write::EncoderStringWriter::from(&DEFAULT_ENGINE);
let mut stream_enc = write::EncoderStringWriter::new(&STANDARD);
stream_enc.write_all(&v).unwrap();
stream_enc.flush().unwrap();
let _ = stream_enc.into_inner();
Expand All @@ -139,7 +135,7 @@ fn do_encode_bench_string_reuse_buf_stream(b: &mut Bencher, &size: &usize) {
let mut buf = String::new();
b.iter(|| {
buf.clear();
let mut stream_enc = write::EncoderStringWriter::from_consumer(&mut buf, &DEFAULT_ENGINE);
let mut stream_enc = write::EncoderStringWriter::from_consumer(&mut buf, &STANDARD);
stream_enc.write_all(&v).unwrap();
stream_enc.flush().unwrap();
let _ = stream_enc.into_inner();
Expand Down
8 changes: 4 additions & 4 deletions examples/base64.rs
Original file line number Diff line number Diff line change
Expand Up @@ -61,21 +61,21 @@ fn main() {
};

let alphabet = opt.alphabet.unwrap_or_default();
let engine = engine::fast_portable::FastPortable::from(
let engine = engine::GeneralPurpose::new(
&match alphabet {
Alphabet::Standard => alphabet::STANDARD,
Alphabet::UrlSafe => alphabet::URL_SAFE,
},
engine::fast_portable::PAD,
engine::general_purpose::PAD,
);

let stdout = io::stdout();
let mut stdout = stdout.lock();
let r = if opt.decode {
let mut decoder = read::DecoderReader::from(&mut input, &engine);
let mut decoder = read::DecoderReader::new(&mut input, &engine);
io::copy(&mut decoder, &mut stdout)
} else {
let mut encoder = write::EncoderWriter::from(&mut stdout, &engine);
let mut encoder = write::EncoderWriter::new(&mut stdout, &engine);
io::copy(&mut input, &mut encoder)
};
if let Err(e) = r {
Expand Down
2 changes: 1 addition & 1 deletion fuzz/fuzzers/decode_random.rs
Original file line number Diff line number Diff line change
Expand Up @@ -11,5 +11,5 @@ fuzz_target!(|data: &[u8]| {

// The data probably isn't valid base64 input, but as long as it returns an error instead
// of crashing, that's correct behavior.
let _ = decode_engine(data, &engine);
let _ = engine.decode(data);
});
6 changes: 3 additions & 3 deletions fuzz/fuzzers/roundtrip.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,10 @@
#[macro_use] extern crate libfuzzer_sys;
extern crate base64;

use base64::engine::DEFAULT_ENGINE;
use base64::{Engine as _, engine::general_purpose::STANDARD};

fuzz_target!(|data: &[u8]| {
let encoded = base64::encode_engine(data, &DEFAULT_ENGINE);
let decoded = base64::decode_engine(&encoded, &DEFAULT_ENGINE).unwrap();
let encoded = STANDARD.encode(data);
let decoded = STANDARD.decode(&encoded).unwrap();
assert_eq!(data, decoded.as_slice());
});
Loading

0 comments on commit 8350376

Please sign in to comment.