This repository contains a Proof of Concept for a key-value store built on Fully Homomorphic Encryption. Here, FHE enables the client to query for a single item of data without the server knowing which item was queried or even if anything was returned.
Part of the requirements for this project was that the server must be able to accept queries that are 32-bits in size. I decided to start small and begin with 4-bit queries. When extending the database to 32 bits, I represent 32-bit data as a sequence of 4-bit groups.
Many operations were unsupported in concrete-numpy
that would have been useful for building the final database. To work around these, I was able to encode certain operations on two variables into lookup tables on a single input.
on
Run python four_bit_database.py
to see it in action and have a look inside the four_bit_database.py
file to see how it was implemented.
The idea for creating a fully homomorphic database revolves around representing operations in such a way that they can be represented as some function on transformations of the entries. For the homomorphic database. In this example, I choose the addition operator and created a transformation that returns zero when the entry contains the queried data and returns the queried data otherwise. This way an operation on the database
The server will represent the database as a sequence of key-value pairs. In python, we can do this by using a list.
class HomomorphicDatabase:
def __init__(self, *args):
self.base = []
Insertion is quite easy and can be implemented as appending key-value pairs to the list.
def insert(self, key, value):
self.base.append((key, value))
For the retrieval operation,
Here, we can break down
The data-fetching part takes two inputs equal
and value
. If equal
is one, then we should return value
. If equal is zero then we should return zero.
The equality function can be written by comparing the bits in each place of two numbers. If we can perform an equality check on 1-bit integers, then we can check equality on integers of any size. For now, let's assume that each number fits into 4 bits and later extend this to 32 bits.
For a 4 bit integer, we can begin by checking the most significant bit. We can clear the other bits by right-shifting the number by 3. After performing this operation on both integers, we can compare these bits. We can subtract these bits from the original numbers to achieve and repeat the process, but this time with a shift of 2, then 1, then 0. After completing this process, we're left with 4 bits which we need to verify are all 1. This is done using a table look-ups on 2-bit integers.
Represent an AND
gate using a table lookup
Input | Output |
---|---|
00 | 0 |
01 | 0 |
10 | 0 |
11 | 1 |
Using this we now haveAND(x, y) = Table[x << 1 + y]
.
For checking 1-bit equality we can do the same thing.
Input | Output |
---|---|
00 | 1 |
01 | 0 |
10 | 0 |
11 | 1 |
And EQUAL(x, y) = Table[x << 1 + y]
.
Putting this together, we get
import concrete.numpy as cnp
AND = cnp.LookupTable([0, 0, 0, 1])
EQUAL = cnp.LookupTable([1, 0, 0, 1])
one_bit_equality(left, right):
return EQUAL[left << 1 + right]
def bit_and(left, right):
return AND[left << 1 + right]
However the and operator can be represented as a function of the sum of the two bits. That is, there is a table such that AND(X, Y) = Table[x + y]
. Look at the following table
xy | x + y | AND(X, Y) |
---|---|---|
00 | 00 | 0 |
01 | 01 | 0 |
10 | 01 | 0 |
11 | 10 | 1 |
THe corresponding table satisfies the requirements
Input | Output |
---|---|
00 | 0 |
01 | 0 |
10 | 1 |
11 | * |
In fact, we can do this for any commutative operator on two bits as the sum uniquely represents the two-bit combination.
Our code now looks like this.
import concrete.numpy as cnp
AND = cnp.LookupTable([0, 0, 1])
EQUAL = cnp.LookupTable([1, 0, 1])
one_bit_equality(left, right):
return EQUAL[left + right]
def bit_and(left, right):
return AND[left + right]
Using the algorithm from earlier, the code for checking 4-bit equality is as follows.
def four_bit_equality(left, right):
x = 1
for i in range(1, 5):
k = 4 - i
lk = left >> k
rk = right >> k
x = AND[one_bit_equality(lk, rk) + x]
left -= lk << k
right -= rk << k
return x
To reduce the number of table lookups, we can instead define one table to check that all the bits are 1. This implementation looks like this
ALL_ONE = cnp.LookupTable([0 for _ in range(2**4 - 1)] + [1])
def four_bit_equality(left, right):
x = 0
for i in range(1, 5):
k = 4 - i
lk = left >> k
rk = right >> k
x += one_bit_equality(lk, rk) << k
left -= lk << k
right -= rk << k
return ALL_ONE[x]
The data fetching component of the function takes two bits. We can take the same approach as earlier and successively operate on each bit of the integer. What we do here, is essentially a bitwise and between a sequence of 1s or a sequences of 0s against the original data. In code, this becomes the following.
def fetch_data(control, data):
result = 0
for i in range(1, 5):
k = 4 - i
bit = data >> k
result += AND[control + bit] << k
data -= bit << k
return result
Getting the data from an entry with an encrypted key and value given an arbitrary query is now simple! The code looks like this
def read_entry(key, value, query):
equal = four_bit_equality(key, query)
return fetch_data(equal, value)
Using all of these, the code for retrieving an item from the database is as follows
def get(self, query):
result = 0
for (entry_key, entry_value) in self.base:
result += read_entry(entry_key, entry_value, query)
return result
Replacing values can be thought of as applying an update operation to every single entry. Where the entry's value remains the same if the keys is different to the query and becomes the new value if the keys are the same. We can implement this function using the functions defined earlier! This is the same as fetching data from the entry in the case that the key is not equal to the query and fetching data from the updated value if the key is the equal to query. At most one of these is non-zero so we can simply add these values.
The code for the FHE circuit is as follows
def replace(old_key, old_value, new_key, new_value):
equal = four_bit_equality(old_key, new_key)
return fetch_data(equal, new_value) + fetch_data(1 - equal, old_value)
The code for the database operation looks like this
def update(self, query, value):
for index, (entry_key, entry_value) in enumerate(self.base):
new_value = replace(entry_key, query, value)
self.base[index] = (entry_key, new_value)
The deletion function requires a bit of thinking to come up with. Zero-ing out a row with FHE will not allow us to reclaim the space occupied by the item that the client wishes to delete. We can construct an algorithm to perform entry deletion from the database in such a way that allows us to free up space with each operation while still ensuring that the server knows nothing about which item was deleted. The process is as follows.
Take the first entry in the database, remove it from the database and store it in a variable. Now, for each remaining item in the database, replace the entry with the entry selected at the start if the key of this entry matches the key specified in the client's deletion query. If the client wanted to delete the first entry, then the match is never successful and the entry is never re-inserted to the database so it will successfuly be deleted. If the client wants to delete any other entry then it will have been replaced with the first entry after matching the client's query and never again re-inserted into the database so it will have been deleted successfully.
The code implementation for the required FHE circuits looks like this
def delete_circuit(first_key, first_value, key, value, query):
equal = four_bit_equality(key, query)
new_key = fetch_data(equal, first_key) + fetch_data(1 - equal, key)
new_value = fetch_data(equal, first_value) + fetch_data(1 - equal, value)
return (new_key, new_value)
The delete database deletion operation will then be as follows
def delete(self, query):
first_key, first_value = self.base.pop()
for index, (first_key, first_value) in enumerate(self.base):
new_key, new_value = delete_circuit(first_key, first_value, entry_key, entry_value, query)
self.base[index] = (new_key, new_value)
The database can be extended so that it can support 32 bit values. To do this, represent as 32-bit values as a sequence of 8 4-bit values. When checking equality of two 32-bit values, check the equality of each pair of 4-bit values and ensure they are all equal using the AND operation. When performing any conditional statements on database entry values, perform them on all of the 4-bit data that compose the entry's value and proceed as normal. The definition for the eight-bit equality circuit can now be written as follows
def all8(z1, z2, z3, z4, z5, z6, z7, z8):
z = AND[z1 + z2]
z = AND[z + z3]
z = AND[z + z4]
z = AND[z + z5]
z = AND[z + z6]
z = AND[z + z7]
z = AND[z + z8]
return z
def eight_bit_equality(
left1, left2, left3, left4, left5, left6, left7, left8,
right1, right2, right3, right4, right5, right6, right7, right8,
):
z1 = EQUAL[left1 + right1]
z2 = EQUAL[left2 + right2]
z3 = EQUAL[left3 + right3]
z4 = EQUAL[left4 + right4]
z5 = EQUAL[left5 + right5]
z6 = EQUAL[left6 + right6]
z7 = EQUAL[left7 + right7]
z8 = EQUAL[left8 + right8]
return all8(z1, z2, z3, z4, z5, z6, z7, z8)
We can remove the sequential AND
lookups by defining a new table, AND8
AND8 = cnp.LookupTable([0] * 7 + [1])
def all8(z1, z2, z3, z4, z5, z6, z7, z8):
return AND8[z1 + z2 + z3 + z4 + z5 + z6 + z7 + z8]
The fetch_data
operation, essentially allows us to construct if statements in FHE. We calculate the result from both branches of computation and then use the function to perform selection on the data, simulating substituting in as an if statement. In my original code, this function is called partial_multipy
as it is equivalent to multiplying data by either one or zero. fetch_data
can be renamed as fetch_if
and the fetch_if(condition, value1) + fetch_if(1 - condition, value2)
idiom can be used to construct a new macro if_then_else(condition, value1, value2)
. This could also be used to construct if statements at the compiler level and also bounded control flow statements (e.g. a while loop with a maximum number of iterations).