README.md

# Partitioned Filters

This repository contains our implementations of four approximate filters: the Bloom filter, the Cuckoo filter, theMorton filter, and the Xor filter. We used the code in our paper [A four-dimensional Analysis of Partitioned Filters](https://www.db.in.tum.de).

In addition to our optimized filter implementations, the repository also contains the code of state-of-the-art competitors we compare to and extensive test cases. We generate the benchmarks using python scripts and included our results on an Intel i9-9900x (Skylake-X) with 64 GiB memory.

## Using the Code

### Prerequisites

- C++20 (we used GCC 10.2)
- CMake version 3.10 & make
- Linux (we used Ubuntu 20.10 with kernel version 5.8)
- Python3 + virtualenv (python is only needed for generating the benchmarks and the plots)

### Tests

```make test``` executes all available test (~400 test cases that test the correctness of our implementation (no false-negatives) and different aspects like multi-threading, partitioning, vectorization, etc)

### Benchmarks

```make benchmark``` generates the benchmark and the user can select the benchmark to execute. The results are written to csv-files in the ```benchmark/paper``` folder. All measurements are repeated five times to get stable results.

Executing all benchmarks takes roughly 1 week and requires 64 GiB memory. Some of the benchmarks do measure only the false-positive rate and the failed builds and, thus, should be executed with all available threads.

The csv includes the following fields:
| Field         | Unit         | Description                                                                                                                                                               |
| ------------- | ------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| name          | Text         | Configuration: `<BenchmarkName>_<k>`/  `<Fixture>`/  `<s>` / `<n_threads>` / `<n_partitions>` / `<elements_build>` / `<elements_lookup>` / `<shared_elements` / `_` / `_` |
| real_time     | milliseconds | execution time per iteration                                                                                                                                              |
| DTLB-misses   | float        | data translation lookaside buffer misses per iteration                                                                                                                    |
| ITLB-misses   | float        | instruction translation lookaside buffer misses per iteration                                                                                                             |
| L1D-misses    | float        | level 1 data cache misses per iteration                                                                                                                                   |
| L1I-misses    | float        | level 1 instruction cache misses per iteration                                                                                                                            |
| LLC-misses    | float        | last-level (L3) cache misses per iteration                                                                                                                                |
| branch-misses | float        | branch misses per iteration                                                                                                                                               |
| cycles        | float        | cycles per iteration                                                                                                                                                      |
| instruction   | float        | executed instructions per iteration                                                                                                                                       |
| FPR           | float        | false-positive rate of the filter (only available when lookup is benchmarked)                                                                                             |
| failures      | integer      | number of failed builds                                                                                                                                                   |
| retries       | integer      | number of retries needed to build the filter                                                                                                                              |
| bits          | float        | number of bits per key allocated to the filter                                                                                                                            |
| size          | integer      | size of the filter in bytes                                                                                                                                               |

### Repository Structure

### Dependencies

## Related Work

We included the following state-the-art-filter implementations: