README.md



Partitioned Filters
This repository contains our implementations of four approximate filters: the Bloom filter, the Cuckoo filter, theMorton filter, and the Xor filter. We used the code in our paper A four-dimensional Analysis of Partitioned Filters.
In addition to our optimized filter implementations, the repository also contains the code of state-of-the-art competitors we compare to and extensive test cases. We generate the benchmarks using python scripts and included our results on an Intel i9-9900x (Skylake-X) with 64 GiB memory.

Using the Code

Prerequisites

C++20 (we used GCC 10.2)
CMake version 3.10 & make
Linux (we used Ubuntu 20.10 with kernel version 5.8)
Python3 + virtualenv (python is only needed for generating the benchmarks and the plots)


Tests
make test executes all available test (~400 test cases that test the correctness of our implementation (no false-negatives) and different aspects like multi-threading, partitioning, vectorization, etc)

Benchmarks
make benchmark generates the benchmark and the user can select the benchmark to execute. The results are written to csv-files in the benchmark/paper folder. All measurements are repeated five times to get stable results.
Executing all benchmarks takes roughly 1 week and requires 64 GiB memory. Some of the benchmarks do measure only the false-positive rate and the failed builds and, thus, should be executed with all available threads.
The csv includes the following fields:


Field
Unit
Description


name
Text
Configuration: <BenchmarkName>_<k>/  <Fixture>/  <s> / <n_threads> / <n_partitions> / <elements_build> / <elements_lookup> / <shared_elements / _ / _


real_time
milliseconds
execution time per iteration


DTLB-misses
float
data translation lookaside buffer misses per iteration


ITLB-misses
float
instruction translation lookaside buffer misses per iteration


L1D-misses
float
level 1 data cache misses per iteration


L1I-misses
float
level 1 instruction cache misses per iteration


LLC-misses
float
last-level (L3) cache misses per iteration


branch-misses
float
branch misses per iteration


cycles
float
cycles per iteration


instruction
float
executed instructions per iteration


FPR
float
false-positive rate of the filter (only available when lookup is benchmarked)


failures
integer
number of failed builds


retries
integer
number of retries needed to build the filter


bits
float
number of bits per key allocated to the filter


size
integer
size of the filter in bytes


Repository Structure

Dependencies

Related Work
We included the following state-the-art-filter implementations: