"fusion" AES-GCM engine #310

Merged Jun 17, 2020 (64 commits)
32f6c7b
it works
kazuho May 5, 2020
fa13ede
it works
kazuho May 6, 2020
58f04f4
unaligned access
kazuho May 6, 2020
50b3568
clang-format
kazuho May 6, 2020
ac9f2d0
remove dead code
kazuho May 6, 2020
7936cdd
constantify
kazuho May 6, 2020
2842536
do ~ 16384 bytes in the test code too
kazuho May 7, 2020
0a1dc47
use loop to optimize for size
kazuho May 7, 2020
083f531
unroll hot loops
kazuho May 7, 2020
2ef1c0f
clang-format
kazuho May 7, 2020
cd0b7f0
Merge pull request #308 from h2o/kazuho/fusion-O2
kazuho May 7, 2020
274a572
precompute the entire ghash table
kazuho May 7, 2020
9a1143c
remove unused function
kazuho May 7, 2020
f198c1b
let the user specify the maximum size
kazuho May 7, 2020
f5f0f64
add benchmark
kazuho May 7, 2020
1e586c0
Merge branch 'kazuho/fusion-once' into kazuho/fusion
kazuho May 7, 2020
8363d78
comments
kazuho May 7, 2020
e46529c
add aesecb api
kazuho May 8, 2020
8289564
tests!
kazuho May 8, 2020
303153d
abandon unnecessary AES calculation
kazuho May 8, 2020
fb5bc58
add test case
kazuho May 8, 2020
bb320d8
fix off-by-one block
kazuho May 8, 2020
a1a81e6
wip
kazuho May 8, 2020
91c3b18
bail out as soon as learning that only GHASH calculation is necessary
kazuho May 9, 2020
bdabc76
parameterize the benchmark
kazuho May 9, 2020
8b4dfee
decryption
kazuho May 10, 2020
a891e31
add option to benchmark decryption speed
kazuho May 10, 2020
f94669f
add fusionbench to xcodeproj
kazuho May 10, 2020
ae95e4c
be explicit about the origin
kazuho May 10, 2020
9f2fb30
CTR mode
kazuho May 11, 2020
94feca2
expose fusion to the picotls API
kazuho May 11, 2020
4879386
unaligned access
kazuho May 12, 2020
faedb81
remove unnecessary assert
kazuho May 12, 2020
66a95e5
apply XOR
kazuho May 12, 2020
e68d6a3
handle non-zero vectors
kazuho May 12, 2020
977cf3d
follow the API change
kazuho May 12, 2020
1cf91f6
delay supplementary operation until the dependent region of the AES-G…
kazuho May 13, 2020
02ca0f0
we can make it a contractual obligation that IV can be loaded as 16-b…
kazuho May 13, 2020
079b1d0
use 128-bit load when the entire data is on the same page
kazuho May 13, 2020
56c572a
add API for initializing AEAD directly
kazuho May 13, 2020
ba2b960
let AEAD impls retain static_iv themselves using the formats they prefer
kazuho May 13, 2020
076982f
oops, argument to slli is in bytes
kazuho May 14, 2020
6d1eaab
set `-mavx2` as well
kazuho May 14, 2020
9c230ef
create dependency
kazuho May 14, 2020
3ee790b
check CPU features
kazuho May 14, 2020
3604f8b
old versions of GCC (e.g. 5.4) cannot detect support for aes,pclmul
kazuho May 14, 2020
efce043
__get_cpuid_count is also unavailable on older versions of GCC
kazuho May 14, 2020
31ebd7d
new / free are the terms that we use
kazuho May 14, 2020
4c19f50
AES256
kazuho May 14, 2020
77f1b8b
organize tests
kazuho May 15, 2020
b531bae
run AEAD test vectors using minicrypto
kazuho May 15, 2020
6b84978
expose picotls identifiers for fusion-aes256, add test
kazuho May 15, 2020
93dbbda
lessen the output (for travis)
kazuho May 18, 2020
7fd7c84
auto-expand
kazuho May 18, 2020
f950d65
remove obsolete FIXME
kazuho May 18, 2020
ea21c50
reduce redundancy
kazuho May 18, 2020
d8dc699
run GHASH of AAD and first AES permutation in parallel
kazuho May 19, 2020
122dd00
add test for loadn
kazuho May 17, 2020
eeff164
use pshufb when avoiding cross-page-boundary load
herumi May 19, 2020
db930f1
use pshufb when avoiding cross-page load
kazuho May 19, 2020
c1cae38
Merge branch 'master' into kazuho/fusion
kazuho Jun 14, 2020
ae2aeda
at the internal API-level, preserve the capability of setting IV
kazuho Jun 14, 2020
5e8d4e3
t/fusion.c not used by picotls-core
kazuho Jun 14, 2020
2ab530c
move fusionbench.c out from picotls; it's now available at https://gi…
kazuho Jun 14, 2020
20 changes: 19 additions & 1 deletion CMakeLists.txt
@@ -64,6 +64,7 @@ ENDIF ()

ADD_LIBRARY(picotls-core ${CORE_FILES})
TARGET_LINK_LIBRARIES(picotls-core ${CORE_EXTRA_LIBS})

ADD_LIBRARY(picotls-minicrypto
${MINICRYPTO_LIBRARY_FILES}
lib/cifra.c
@@ -90,7 +91,6 @@ ADD_EXECUTABLE(test-minicrypto.t
lib/cifra/aes128.c
lib/cifra/aes256.c
lib/cifra/random.c)

SET(TEST_EXES test-minicrypto.t)

FIND_PACKAGE(OpenSSL)
@@ -131,6 +131,24 @@ ELSE ()
MESSAGE(WARNING "Disabling OpenSSL support (requires 1.0.1 or newer)")
ENDIF ()

IF ((CMAKE_SIZEOF_VOID_P EQUAL 8) AND
(CMAKE_SYSTEM_PROCESSOR STREQUAL "x86_64") OR
(CMAKE_SYSTEM_PROCESSOR STREQUAL "amd64") OR
(CMAKE_SYSTEM_PROCESSOR STREQUAL "AMD64"))
MESSAGE(STATUS " Enabling fusion support")
ADD_LIBRARY(picotls-fusion lib/fusion.c)
SET_TARGET_PROPERTIES(picotls-fusion PROPERTIES COMPILE_FLAGS "-mavx2 -maes -mpclmul")
TARGET_LINK_LIBRARIES(picotls-fusion picotls-core)
ADD_EXECUTABLE(test-fusion.t
deps/picotest/picotest.c
lib/picotls.c
t/fusion.c)
TARGET_LINK_LIBRARIES(test-fusion.t picotls-minicrypto)
SET_TARGET_PROPERTIES(test-fusion.t PROPERTIES COMPILE_FLAGS "-mavx2 -maes -mpclmul")
ADD_DEPENDENCIES(test-fusion.t generate-picotls-probes)
SET(TEST_EXES ${TEST_EXES} test-fusion.t)
ENDIF ()

ADD_CUSTOM_TARGET(check env BINARY_DIR=${CMAKE_CURRENT_BINARY_DIR} prove --exec '' -v ${CMAKE_CURRENT_BINARY_DIR}/*.t t/*.t WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR} DEPENDS ${TEST_EXES} cli)

IF (CMAKE_SYSTEM_NAME STREQUAL "Linux")
4 changes: 4 additions & 0 deletions cmake/dtrace-utils.cmake
@@ -27,10 +27,14 @@ FUNCTION (DEFINE_DTRACE_DEPENDENCIES d_file prefix)
OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/${prefix}-probes.h
COMMAND dtrace -o ${CMAKE_CURRENT_BINARY_DIR}/${prefix}-probes.h -s ${d_file} -h
DEPENDS ${d_file})
ADD_CUSTOM_TARGET(generate-${prefix}-probes DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/${prefix}-probes.h)
SET_SOURCE_FILES_PROPERTIES(${CMAKE_CURRENT_BINARY_DIR}/${prefix}-probes.h PROPERTIES GENERATED TRUE)
IF (DTRACE_USES_OBJFILE)
ADD_CUSTOM_COMMAND(
OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/${prefix}-probes.o
COMMAND dtrace -o ${CMAKE_CURRENT_BINARY_DIR}/${prefix}-probes.o -s ${d_file} -G
DEPENDS ${d_file})
ADD_DEPENDENCIES(generate-${prefix}-probes ${CMAKE_CURRENT_BINARY_DIR}/${prefix}-probes.o)
SET_SOURCE_FILES_PROPERTIES(${CMAKE_CURRENT_BINARY_DIR}/${prefix}-probes.o PROPERTIES GENERATED TRUE)
ENDIF ()
ENDFUNCTION ()
73 changes: 58 additions & 15 deletions include/picotls.h
@@ -32,6 +32,7 @@ extern "C" {

#include <assert.h>
#include <inttypes.h>
#include <string.h>
#include <sys/types.h>

#if __GNUC__ >= 3
@@ -303,19 +304,26 @@ typedef const struct st_ptls_cipher_algorithm_t {
int (*setup_crypto)(ptls_cipher_context_t *ctx, int is_enc, const void *key);
} ptls_cipher_algorithm_t;

typedef struct st_ptls_aead_supplementary_encryption_t {
ptls_cipher_context_t *ctx;
const void *input;
uint8_t output[16];
} ptls_aead_supplementary_encryption_t;

/**
* AEAD context. AEAD implementations are allowed to stuff data at the end of the struct. The size of the memory allocated for the
* struct is governed by ptls_aead_algorithm_t::context_size.
*/
typedef struct st_ptls_aead_context_t {
const struct st_ptls_aead_algorithm_t *algo;
uint8_t static_iv[PTLS_MAX_IV_SIZE];
/* field above this line must not be altered by the crypto binding */
void (*dispose_crypto)(struct st_ptls_aead_context_t *ctx);
void (*do_encrypt_init)(struct st_ptls_aead_context_t *ctx, const void *iv, const void *aad, size_t aadlen);
void (*do_encrypt_init)(struct st_ptls_aead_context_t *ctx, uint64_t seq, const void *aad, size_t aadlen);
size_t (*do_encrypt_update)(struct st_ptls_aead_context_t *ctx, void *output, const void *input, size_t inlen);
size_t (*do_encrypt_final)(struct st_ptls_aead_context_t *ctx, void *output);
size_t (*do_decrypt)(struct st_ptls_aead_context_t *ctx, void *output, const void *input, size_t inlen, const void *iv,
void (*do_encrypt)(struct st_ptls_aead_context_t *ctx, void *output, const void *input, size_t inlen, uint64_t seq,
const void *aad, size_t aadlen, ptls_aead_supplementary_encryption_t *supp);
size_t (*do_decrypt)(struct st_ptls_aead_context_t *ctx, void *output, const void *input, size_t inlen, uint64_t seq,
const void *aad, size_t aadlen);
} ptls_aead_context_t;
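
With this hunk the AEAD callbacks receive the record sequence number (`uint64_t seq`) instead of a prebuilt `iv`, so each implementation derives the per-record nonce itself. A minimal sketch of that derivation, assuming the usual TLS 1.3 construction (the sequence number, big-endian and left-padded with zeros, XORed into the static IV). `build_nonce` and its parameters are hypothetical names for illustration, not the picotls function, and the sketch assumes an IV of at least 8 bytes:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of picotls): derives the per-record nonce by
 * XORing the big-endian sequence number into the trailing 8 bytes of the
 * static IV. Assumes iv_size >= 8. */
static void build_nonce(uint8_t *nonce, const uint8_t *static_iv, size_t iv_size, uint64_t seq)
{
    size_t i;
    int shift;

    /* bytes ahead of the final 8 are copied unchanged */
    for (i = 0; i + 8 < iv_size; ++i)
        nonce[i] = static_iv[i];

    /* final 8 bytes: static IV XOR sequence number, most significant byte first */
    for (shift = 56; shift >= 0; shift -= 8, ++i)
        nonce[i] = static_iv[i] ^ (uint8_t)(seq >> shift);
}
```

For a 12-byte static IV of `01 02 ... 0c` and `seq = 1`, only the last byte changes (`0c` becomes `0d`).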

@@ -355,7 +363,7 @@ typedef const struct st_ptls_aead_algorithm_t {
/**
* callback that sets up the crypto
*/
int (*setup_crypto)(ptls_aead_context_t *ctx, int is_enc, const void *key);
int (*setup_crypto)(ptls_aead_context_t *ctx, int is_enc, const void *key, const void *iv);
} ptls_aead_algorithm_t;

/**
@@ -1192,15 +1200,24 @@ static void ptls_cipher_encrypt(ptls_cipher_context_t *ctx, void *output, const
*/
ptls_aead_context_t *ptls_aead_new(ptls_aead_algorithm_t *aead, ptls_hash_algorithm_t *hash, int is_enc, const void *secret,
const char *label_prefix);
/**
* instantiates an AEAD cipher given key and iv
* @param aead
* @param is_enc 1 if creating a context for encryption, 0 if creating a context for decryption
* @return pointer to an AEAD context if successful, otherwise NULL
*/
ptls_aead_context_t *ptls_aead_new_direct(ptls_aead_algorithm_t *aead, int is_enc, const void *key, const void *iv);
/**
* destroys an AEAD cipher context
*/
void ptls_aead_free(ptls_aead_context_t *ctx);
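/**
* encrypts an AEAD record; ptls_aead_encrypt returns the size of the output (inlen plus the tag size)
*/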
size_t ptls_aead_encrypt(ptls_aead_context_t *ctx, void *output, const void *input, size_t inlen, uint64_t seq, const void *aad,
size_t aadlen);
static size_t ptls_aead_encrypt(ptls_aead_context_t *ctx, void *output, const void *input, size_t inlen, uint64_t seq,
const void *aad, size_t aadlen);
static void ptls_aead_encrypt_s(ptls_aead_context_t *ctx, void *output, const void *input, size_t inlen, uint64_t seq,
const void *aad, size_t aadlen, ptls_aead_supplementary_encryption_t *supp);
/**
* initializes the internal state of the encryptor
*/
@@ -1251,7 +1268,12 @@ int ptls_server_handle_message(ptls_t *tls, ptls_buffer_t *sendbuf, size_t epoch
/**
* internal
*/
void ptls_aead__build_iv(ptls_aead_context_t *ctx, uint8_t *iv, uint64_t seq);
void ptls_aead__build_iv(ptls_aead_algorithm_t *algo, uint8_t *iv, const uint8_t *static_iv, uint64_t seq);
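/**
* internal; implements one-shot encryption by calling do_encrypt_init, do_encrypt_update, and do_encrypt_final, then performs
* the supplementary encryption if supp is non-NULL
*/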
static void ptls_aead__do_encrypt(ptls_aead_context_t *ctx, void *output, const void *input, size_t inlen, uint64_t seq,
const void *aad, size_t aadlen, ptls_aead_supplementary_encryption_t *supp);
/**
* internal
*/
@@ -1374,12 +1396,22 @@ inline void ptls_cipher_encrypt(ptls_cipher_context_t *ctx, void *output, const
ctx->do_transform(ctx, output, input, len);
}

inline void ptls_aead_encrypt_init(ptls_aead_context_t *ctx, uint64_t seq, const void *aad, size_t aadlen)
inline size_t ptls_aead_encrypt(ptls_aead_context_t *ctx, void *output, const void *input, size_t inlen, uint64_t seq,
const void *aad, size_t aadlen)
{
uint8_t iv[PTLS_MAX_IV_SIZE];
ctx->do_encrypt(ctx, output, input, inlen, seq, aad, aadlen, NULL);
return inlen + ctx->algo->tag_size;
}

inline void ptls_aead_encrypt_s(ptls_aead_context_t *ctx, void *output, const void *input, size_t inlen, uint64_t seq,
const void *aad, size_t aadlen, ptls_aead_supplementary_encryption_t *supp)
{
ctx->do_encrypt(ctx, output, input, inlen, seq, aad, aadlen, supp);
}

ptls_aead__build_iv(ctx, iv, seq);
ctx->do_encrypt_init(ctx, iv, aad, aadlen);
inline void ptls_aead_encrypt_init(ptls_aead_context_t *ctx, uint64_t seq, const void *aad, size_t aadlen)
{
ctx->do_encrypt_init(ctx, seq, aad, aadlen);
}

inline size_t ptls_aead_encrypt_update(ptls_aead_context_t *ctx, void *output, const void *input, size_t inlen)
@@ -1392,13 +1424,24 @@ inline size_t ptls_aead_encrypt_final(ptls_aead_context_t *ctx, void *output)
return ctx->do_encrypt_final(ctx, output);
}

inline void ptls_aead__do_encrypt(ptls_aead_context_t *ctx, void *output, const void *input, size_t inlen, uint64_t seq,
const void *aad, size_t aadlen, ptls_aead_supplementary_encryption_t *supp)
{
ctx->do_encrypt_init(ctx, seq, aad, aadlen);
ctx->do_encrypt_update(ctx, output, input, inlen);
ctx->do_encrypt_final(ctx, (uint8_t *)output + inlen);

if (supp != NULL) {
ptls_cipher_init(supp->ctx, supp->input);
memset(supp->output, 0, sizeof(supp->output));
ptls_cipher_encrypt(supp->ctx, supp->output, supp->output, sizeof(supp->output));
}
}

inline size_t ptls_aead_decrypt(ptls_aead_context_t *ctx, void *output, const void *input, size_t inlen, uint64_t seq,
const void *aad, size_t aadlen)
{
uint8_t iv[PTLS_MAX_IV_SIZE];

ptls_aead__build_iv(ctx, iv, seq);
return ctx->do_decrypt(ctx, output, input, inlen, iv, aad, aadlen);
return ctx->do_decrypt(ctx, output, input, inlen, seq, aad, aadlen);
}

#define ptls_define_hash(name, ctx_type, init_func, update_func, final_func) \
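
The inline `ptls_aead__do_encrypt` in the hunk above assembles a one-shot `do_encrypt` from the three streaming callbacks and runs the supplementary encryption afterwards when `supp` is non-NULL. The same shape, reduced to a toy that compiles on its own, might look like the sketch below. Every `toy_*` name is hypothetical, and the XOR "cipher" and one-byte checksum are placeholders with no security value:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy types for illustration only; none of this is picotls code. */
typedef struct toy_aead {
    uint8_t key;  /* single-byte toy key */
    uint8_t mask; /* per-record keystream byte, derived from key and seq */
    uint8_t sum;  /* running XOR checksum standing in for the AEAD tag */
} toy_aead_t;

typedef struct toy_supp {
    const uint8_t *input; /* 16 bytes to be transformed alongside the record */
    uint8_t output[16];
} toy_supp_t;

/* step 1: bind the record to its sequence number and absorb the AAD */
static void toy_init(toy_aead_t *ctx, uint64_t seq, const void *aad, size_t aadlen)
{
    const uint8_t *a = aad;
    size_t i;

    ctx->mask = ctx->key ^ (uint8_t)seq;
    ctx->sum = 0;
    for (i = 0; i != aadlen; ++i)
        ctx->sum ^= a[i];
}

/* step 2: encrypt the payload, folding the ciphertext into the checksum */
static size_t toy_update(toy_aead_t *ctx, void *output, const void *input, size_t inlen)
{
    const uint8_t *src = input;
    uint8_t *dst = output;
    size_t i;

    for (i = 0; i != inlen; ++i) {
        dst[i] = src[i] ^ ctx->mask;
        ctx->sum ^= dst[i];
    }
    return inlen;
}

/* step 3: emit the one-byte "tag" */
static size_t toy_final(toy_aead_t *ctx, void *output)
{
    *(uint8_t *)output = ctx->sum;
    return 1;
}

/* one-shot entry point composed from the three steps, followed by the
 * optional supplementary transform; returns the number of bytes written */
static size_t toy_encrypt(toy_aead_t *ctx, void *output, const void *input, size_t inlen, uint64_t seq, const void *aad,
                          size_t aadlen, toy_supp_t *supp)
{
    uint8_t *dst = output;
    size_t i;

    toy_init(ctx, seq, aad, aadlen);
    dst += toy_update(ctx, dst, input, inlen);
    dst += toy_final(ctx, dst);

    if (supp != NULL) {
        for (i = 0; i != sizeof(supp->output); ++i)
            supp->output[i] = supp->input[i] ^ ctx->key;
    }
    return (size_t)(dst - (uint8_t *)output);
}
```

Keeping the supplementary request in the one-shot signature is what lets an engine such as fusion replace the whole body and, per the `fusion.h` comments, process the extra block in parallel with the record rather than after it.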
99 changes: 99 additions & 0 deletions include/picotls/fusion.h
@@ -0,0 +1,99 @@
/*
* Copyright (c) 2020 Fastly, Kazuho Oku
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to
* deal in the Software without restriction, including without limitation the
* rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
* sell copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#ifndef picotls_fusion_h
#define picotls_fusion_h

#ifdef __cplusplus
extern "C" {
#endif

#include <stddef.h>
#include <emmintrin.h>
#include "../picotls.h"

#define PTLS_FUSION_AES128_ROUNDS 10
#define PTLS_FUSION_AES256_ROUNDS 14

typedef struct ptls_fusion_aesecb_context {
__m128i keys[PTLS_FUSION_AES256_ROUNDS + 1];
unsigned rounds;
} ptls_fusion_aesecb_context_t;

typedef struct ptls_fusion_aesgcm_context ptls_fusion_aesgcm_context_t;

void ptls_fusion_aesecb_init(ptls_fusion_aesecb_context_t *ctx, int is_enc, const void *key, size_t key_size);
void ptls_fusion_aesecb_dispose(ptls_fusion_aesecb_context_t *ctx);
void ptls_fusion_aesecb_encrypt(ptls_fusion_aesecb_context_t *ctx, void *dst, const void *src);

/**
* Creates an AES-GCM context.
* @param key the AES key (128 or 256 bits)
* @param key_size size of the AES key
* @param capacity maximum size of AEAD record (i.e. AAD + encrypted payload)
*/
ptls_fusion_aesgcm_context_t *ptls_fusion_aesgcm_new(const void *key, size_t key_size, size_t capacity);
/**
* Updates the capacity.
*/
ptls_fusion_aesgcm_context_t *ptls_fusion_aesgcm_set_capacity(ptls_fusion_aesgcm_context_t *ctx, size_t capacity);
/**
* Destroys an AES-GCM context.
*/
void ptls_fusion_aesgcm_free(ptls_fusion_aesgcm_context_t *ctx);
/**
* Encrypts an AEAD block, and in parallel, optionally encrypts one block using AES-ECB.
* @param ctx context
* @param output output buffer
* @param input payload to be encrypted
* @param inlen size of the payload to be encrypted
* @param ctr counter
* @param aad AAD
* @param aadlen size of AAD
* @param supp (optional) supplementary encryption context
*/
void ptls_fusion_aesgcm_encrypt(ptls_fusion_aesgcm_context_t *ctx, void *output, const void *input, size_t inlen, __m128i ctr,
const void *aad, size_t aadlen, ptls_aead_supplementary_encryption_t *supp);
/**
* Decrypts an AEAD block, and in parallel, optionally encrypts one block using AES-ECB. Returns whether decryption was successful.
* @param ctx context
* @param ctr counter
* @param output output buffer
* @param input payload to be decrypted
* @param inlen size of the payload to be decrypted
* @param aad AAD
* @param aadlen size of AAD
* @param tag the AEAD tag being received from peer
*/
int ptls_fusion_aesgcm_decrypt(ptls_fusion_aesgcm_context_t *ctx, void *output, const void *input, size_t inlen, __m128i ctr,
const void *aad, size_t aadlen, const void *tag);
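
Both functions take the counter as a single 128-bit value (`__m128i ctr`). As a rough sketch of what a GCM counter block holds, assuming the standard layout for a 96-bit IV (12 bytes of IV followed by a 32-bit big-endian block counter): `gcm_counter_block` is a hypothetical helper, and the byte order fusion keeps inside the `__m128i` may well differ.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of fusion): builds the 16-byte GCM counter
 * block for a 96-bit IV, i.e. IV followed by a 32-bit big-endian counter. */
static void gcm_counter_block(uint8_t block[16], const uint8_t iv[12], uint32_t counter)
{
    memcpy(block, iv, 12);
    block[12] = (uint8_t)(counter >> 24);
    block[13] = (uint8_t)(counter >> 16);
    block[14] = (uint8_t)(counter >> 8);
    block[15] = (uint8_t)counter;
}
```

In that layout, counter value 1 is reserved for masking the tag and the payload keystream starts at 2.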

extern ptls_cipher_algorithm_t ptls_fusion_aes128ctr, ptls_fusion_aes256ctr;
extern ptls_aead_algorithm_t ptls_fusion_aes128gcm, ptls_fusion_aes256gcm;

/**
* Returns a boolean indicating if fusion can be used.
*/
int ptls_fusion_is_supported_by_cpu(void);

#ifdef __cplusplus
}
#endif

#endif
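
Because the library is built with `-mavx2 -maes -mpclmul`, callers are expected to consult `ptls_fusion_is_supported_by_cpu()` before selecting the fusion algorithms. The commit log notes that older GCC releases lack `__get_cpuid_count`, so a check along the following lines would have to rely on the older `<cpuid.h>` primitives. This is only a sketch of how such a probe could be written with GCC or Clang, not the implementation in `lib/fusion.c`; `cpu_has_aesni_pclmul_avx2` is a hypothetical name, and the sketch skips the OS-level check (OSXSAVE/XGETBV) that a complete AVX2 test would also perform:

```c
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical stand-in for a CPU capability probe; returns 1 if the CPU
 * reports AES-NI, PCLMULQDQ and AVX2, otherwise 0. */
static int cpu_has_aesni_pclmul_avx2(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* leaf 1: ECX bit 25 = AES-NI, ECX bit 1 = PCLMULQDQ */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    if ((ecx & (1u << 25)) == 0 || (ecx & (1u << 1)) == 0)
        return 0;

    /* leaf 7, subleaf 0: EBX bit 5 = AVX2 */
    if (__get_cpuid_max(0, NULL) < 7)
        return 0;
    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    return (ebx & (1u << 5)) != 0;
#else
    /* not an x86 target: the instructions cannot be present */
    return 0;
#endif
}
```

A caller would typically fall back to another AEAD implementation whenever the probe returns 0.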