Boost C++ 库

……在世界上享有盛誉、设计精湛的 C++ 库项目之一。 — Herb Sutter 和 Andrei Alexandrescu, C++ Coding Standards

Boost.Bloom - Boost C++ 函数库

介绍

Boost.Bloom 提供了类模板 boost::bloom::filter，它可以配置为实现经典的 Bloom 过滤器以及文献中讨论的变体，如块过滤器、多块过滤器等。

#include <boost/bloom.hpp>
#include <cassert>
#include <iostream>
#include <string>

int main()
{
  // Bloom filter of strings with 5 bits set per insertion
  using filter = boost::bloom::filter<std::string, 5>;

  // create filter with a capacity of 1'000'000 bits
  filter f(1'000'000);

  // insert elements (they can't be erased, Bloom filters are insert-only)
  f.insert("hello");
  f.insert("Boost");

  // elements inserted are always correctly checked as such
  assert(f.may_contain("hello") == true);

  // elements not inserted may incorrectly be identified as such with a
  // false positive rate (FPR) which is a function of the array capacity,
  // the number of bits set per element and generally how the boost::bloom::filter
  // was specified
  if(f.may_contain("bye")) { // likely false
    std::cout << "false positive\n";
  }
  else {
    std::cout << "everything worked as expected\n";
  }
}

支持的不同过滤器变体在编译时作为 boost::bloom::filter 实例化定义的一部分指定。Boost.Bloom 的实现重点在于性能；SIMD 技术（如 AVX2、Neon 和 SSE2）可以用来加速操作。

入门

有关如何安装整个 Boost 项目或仅安装 Boost.Bloom 及其依赖项的信息，请参阅网站部分。

Boost.Bloom 是一个仅头文件的库，因此不需要额外的构建阶段。需要 C++11 或更高版本。该库已成功在 GCC 4.8、Clang 3.9 和 Visual Studio 2015（及更高版本）上运行。您可以通过编译上面显示的示例程序来验证您的环境是否设置正确。

如果您不熟悉 Bloom 过滤器，请参阅入门教程；否则，您可以直接跳转到教程。

Bloom 过滤器（以发明者 Burton Howard Bloom 的名字命名）是一种概率数据结构，其中插入的元素可以 100% 准确地查找，而查找未插入的元素可能会以一定的概率失败，称为过滤器的误报率或 FPR。这里的权衡是，Bloom 过滤器比传统的非概率容器占用的空间小得多（通常每个元素约 8-20 位），同时保持可接受的低 FPR。过滤器的容量（以位为单位）越大，最终的 FPR 就越低。

通常，Bloom 过滤器对于防止/减轻对大型数据集的查询非常有用，当精确检索成本高昂且/或无法在主内存中完成时。

示例：加速数据库的失败请求

Bloom 过滤器和类似数据结构的一个主要应用是防止成本高昂的磁盘/网络访问，当这些访问无法检索到特定信息时。例如，假设我们正在开发一个访问时间为 10 毫秒的数据库前端，并且我们知道 50% 的请求将不成功（记录不存在）。插入一个查找时间为 200 纳秒、FPR 为 0.5% 的 Bloom 过滤器将把系统的平均响应时间从 10 毫秒降低到

(10 + 0.0002) × 50.25% + 0.0002 × 49.75% ≅ 5.03 毫秒，

也就是说，我们获得了 1.99 倍的整体加速。如果数据库包含 10 亿条记录，一个每个元素占用 8 位内存的过滤器将占用 0.93 GB，这是完全可以实现的。

图 1. 使用 Bloom 过滤器改善数据库负访问时间。

已在 Web 缓存、字典压缩、网络路由和基因组学等领域描述了其应用。 Broder 和 Mitzenmacher 提供了相当广泛的用例回顾，重点关注网络。

实现

经典 Bloom 过滤器的实现包括一个由 m 位组成的数组，最初设置为零。插入元素 x 归结为伪随机选择 k 个位置（借助 k 个独立的哈希函数）并将它们设置为一。

图 2. 经典 Bloom 过滤器中插入 k = 6 个不同哈希函数。插入 x 归结为将位置 10、14、43、58、1 和 39 的位设置为一，如 h₁(x)、…、h₆(x) 所示。

为了检查元素 y 是否在过滤器中，我们执行相同的过程，并检查选定的位是否全部设置为一。在示例图中，有两个未设置的位，这明确表明 y 未插入过滤器。

图 3. 经典 Bloom 过滤器中的查找。值 y 不在过滤器中，因为位置 20 和 61 的位未设置为一。

当被检查的位由于其他不相关的插入而恰好全部设置为一时，会发生误报。随着我们将更多元素添加到过滤器中，发生误报的概率会增加，而对于给定数量 n 的已插入元素，容量更大（更大的位数组）的过滤器将具有更低的 FPR。每个操作设置的位数 k 也会影响 FPR，尽管方式更复杂：当数组稀疏填充时，较高的 k 值会改善（降低）FPR，因为我们更有可能命中一个未设置的位；但是，如果 k 非常高，随着新元素的插入，数组中设置为一的位数会越来越多，最终会达到一个点，我们会输给一个具有较低 k 值因此设置位数比例较低的过滤器。对于给定的 n 和 m 值，最佳 k 为\(\lfloor k_{\text{opt}}\rfloor\)或\(\lceil k_{\text{opt}}\rceil\)，其中

\(k_{\text{opt}}=\displaystyle\frac{m\cdot\ln2}{n},\)

，对于最低 FPR 接近\(1/2^{k_{\text{opt}}} \approx 0.6185^{m/n}\)。有关更多详细信息，请参阅有关 FPR 估计的附录。

图 4. 两个具有 m = 10⁵ 位的过滤器的 FPR 与已插入元素数量的关系。k = 6（红色）对于较小的 n 值比 k = 2（蓝色）具有更好的（较低的）FPR，但随着 n 的增长最终会退化得更多。虚线显示了通过为每个 n 值选择最佳 k 值可达到的最低 FPR。

经典过滤器的变体

块过滤器

Bloom 过滤器上的操作涉及访问内存中的 k 个不同位置，对于大型数组而言，这会导致 k 次 CPU 缓存未命中，并影响操作的性能。一种称为块过滤器的经典方法的变体，通过将所有位设置/检查集中在一个伪随机选择的 b 位的小块中来最小化缓存未命中。如果块足够小，它将适合 CPU 缓存行，从而大大减少缓存未命中的数量。

图 5. 块过滤器。根据 h₀(x) 选择一个 b 位的块，所有后续的位设置都限制在该块中。

缺点是，与具有相同 n、m 和 k 值的经典过滤器相比，最终的 FPR 更差。直观地说，块过滤器降低了数组中位的分布均匀性，这最终会损害其概率性能。

图 6. 相同 k = 4、m = 10⁵ 位的经典过滤器和块过滤器的 FPR（对数尺度）与已插入元素数量的关系。

这个想法的进一步变体是让操作选择 k 个块，每个块设置 k' 位。这同样会比具有每个操作 k·k' 位的经典过滤器具有更差的 FPR，但比普通的 k·k' 块过滤器有所改进。

图 7. 具有多重插入的块过滤器。选择 k = 2 个块，并在每个块中设置 k' = 3 位。

多块过滤器

多块过滤器进一步发展了块过滤器的思想，通过在连续的 b 位块序列上进行位设置/检查，使得每个块占用一位。这仍然保持了良好的缓存局部性，但与块过滤器相比，FPR 得到了改善，因为设置为一的位在数组中分布得更开。

图 8. 多块过滤器。根据 h₀(x) 选择一个 k' = 4 个连续块的范围，并在每个块中设置一位。

多块过滤器也可以与多重插入结合使用。通常，对于每个操作相同的位数以及相等的 n 和 m 值，经典 Bloom 过滤器将具有更好的（较低的）FPR，其次是多块过滤器，然后是块过滤器。执行速度大致按相反的顺序进行。在考虑具有多重插入的块/多块过滤器时，可用配置的数量会迅速增长，您需要进行一些实验来找到（FPR、容量、速度）权衡空间中的首选点。

教程

boost::bloom::filter 可以被视为一个位数组，该数组被划分为子数组，这些子数组在插入时根据伪随机选择（基于哈希函数）：每个子数组被传递给一个子过滤器，该子过滤器根据某种关联策略标记其中的几个位。

请注意，尽管 boost::bloom::filter 模仿了容器的接口并提供了 insert 等操作，但它实际上不是一个容器：例如，插入不涉及元素在数据结构中的实际存储，而仅仅是根据元素的哈希值在内部数组中设置一些位。

过滤器定义

template<
  typename T, std::size_t K,
  typename Subfilter = block<unsigned char, 1>, std::size_t Stride = 0,
  typename Hash = boost::hash<T>,
  typename Allocator = std::allocator<unsigned char>
>
class filter;

T：插入的元素的类型。
K：每次插入标记的子数组的数量。
Subfilter：使用的子过滤器的类型。
Stride：连续子数组的起始位置之间的字节距离。
Hash：T 的哈希函数。
Allocator：unsigned char 的分配器。

`子过滤器`

子过滤器定义了在位数组的选定子数组内设置或检查位的局部策略。它确定每次操作修改的位数、它们在内存中的排列方式以及内存的访问方式。以下子过滤器是提供的

子过滤器描述优点缺点

子过滤器	描述	优点	缺点
`block<Block, K'>`	在 `Block` 类型的子数组中设置 `K'` 位	访问时间非常快	FPR 越差（越高），`Block` 越小
`multiblock<Block, K'>`	在 `Block[K']` 子数组的每个元素中设置一位	对于相同的 `Block` 类型，FPR 比 `block<Block, K'>` 更好（更低）	如果访问子数组时跨越缓存行边界，性能可能会下降
`fast_multiblock32<K'>`	在统计上等同于 `multiblock<uint32_t, K'>`，但在编译时启用 SSE2、AVX2 或 Neon 时使用更快的基于 SIMD 的算法	当 SSE2/AVX2/Neon 可用时，始终优先于 `multiblock<uint32_t, K'>`	对于相同的 `K'`，FPR 比 `fast_multiblock64<K'>` 更差（更高）
`fast_multiblock64<K'>`	在统计上等同于 `multiblock<uint64_t, K'>`，但在编译时启用 AVX2 时使用更快的基于 SIMD 的算法	当 AVX2 可用时，始终优先于 `multiblock<uint64_t, K'>`	对于相同的 `K'`，速度比 `fast_multiblock32<K'>` 慢

block<Block, K'>

在 Block 类型的子数组中设置 K' 位

访问时间非常快

FPR 越差（越高），Block 越小

multiblock<Block, K'>

在 Block[K'] 子数组的每个元素中设置一位

对于相同的 Block 类型，FPR 比 block<Block, K'> 更好（更低）

如果访问子数组时跨越缓存行边界，性能可能会下降

fast_multiblock32<K'>

在统计上等同于 multiblock<uint32_t, K'>，但在编译时启用 SSE2、AVX2 或 Neon 时使用更快的基于 SIMD 的算法

当 SSE2/AVX2/Neon 可用时，始终优先于 multiblock<uint32_t, K'>

对于相同的 K'，FPR 比 fast_multiblock64<K'> 更差（更高）

fast_multiblock64<K'>

在统计上等同于 multiblock<uint64_t, K'>，但在编译时启用 AVX2 时使用更快的基于 SIMD 的算法

当 AVX2 可用时，始终优先于 multiblock<uint64_t, K'>

对于相同的 K'，速度比 fast_multiblock32<K'> 慢

上表中的 Block 可以是无符号整数类型（例如 unsigned char、uint32_t、uint64_t），或 2^N 个无符号整数的数组（例如 uint64_t[8]）。通常，Block 越宽，最终的 FPR 越好（越低），但缓存局部性会变差，性能可能会受到影响。

请注意，boost::bloom::filter<T, K, subfilter<…, K'>> 的总设置/检查位数是 K * K'。默认配置 boost::bloom::filter<T, K> = boost::bloom::filter<T, K, block<unsigned char, 1>>，对应于经典 Bloom 过滤器，在具有相同每操作位数的所有过滤器中具有最佳（最低）FPR，但也是最慢的：为每个设置/检查的位访问一个新的子数组。请参阅基准测试部分以了解 FPR 和性能之间的不同权衡。

一旦选择了子过滤器，就可以根据预期的插入元素数量调整参数 K' 到其最佳值（最小 FPR），如专用部分所述；否则，通常低值的 K' 比高值的 K' 更快且更优选，只要最终的 FPR 在可接受的水平。

`步长`

正如我们所见，Subfilter 定义了 boost::bloom::filter 使用的子数组（在 block<Block, K'> 的情况下为 Block，在 multiblock<Block, K'> 的情况下为 Block[K']）：然后访问底层位数组的连续部分并将其视为这些子数组。Stride 参数控制连续子数组起始位置之间的字节距离。

当使用默认值 0 时，步长会自动设置为子数组的大小，因此它们之间没有重叠。如果 Stride 设置的值小于该大小，则连续的子数组会相互叠加：对于较小的值 Stride，重叠级别更大，当 Stride 为 1 字节时发生最大重叠。

图 9. 两个不同的 Stride 配置：（a）非重叠子数组，（b）重叠子数组。
每个子数组都与相同颜色的步长相关联。

与非重叠情况相比，重叠情况会改善（降低）最终的 FPR，权衡是子数组可能未在内存中对齐，这可能会对性能产生负面影响。

`哈希`

与其他需要每个操作多个哈希函数的 Bloom 过滤器实现不同，boost::bloom::filter 只使用一个。默认情况下，使用 Boost.ContainerHash。如果您需要为自己的类型扩展 boost::hash，请查阅该库的专用部分。

当提供的哈希函数质量足够高时，它会被直接使用；否则，会对哈希值应用位混合后处理，以提高其统计属性，从而使最终的 FPR 接近理论极限。通过 boost::hash_is_avalanching trait 来确定哈希函数是否为高质量（更准确地说，是否具有所谓的雪崩属性）。

容量

过滤器的内部数组大小在构造时指定

using filter = boost::bloom::filter<std::string, 8>;
filter f(1'000'000); // array of 1'000'000 bits
std::cout << f.capacity(); // >= 1'000'000

请注意，boost::bloom::filter 的默认构造函数指定了零容量，这通常没什么用——分配的数组为空。

而不是直接指定数组的容量，我们可以让库根据我们计划插入的元素数量和所需的 FPR 来计算它

// we'll insert 100'000 elements and want a FPR ~ 1%
filter f(100'000, 0.01);

// this is equivalent
filter f2(filter::capacity_for(100'000, 0.01));

当指定的 FPR 非常小时要小心，因为由此产生的容量可能太大而无法放入内存

// resulting capacity ~ 1.4E12, out of memory std::bad_alloc is thrown
filter f3(100'000, 1E-50);

一旦过滤器构造完成，其数组就是固定的（例如，它不会随着元素的插入而动态增长）。改变它的唯一方法是通过赋值/交换另一个过滤器，或者使用 reset

f.reset(2'000'000); // change to 2'000'000 bits and clears the filter
f.reset(100'000, 0.005); // equivalent to reset(filter::capacity_for(100'000, 0.005));
f.reset(); // null array (capacity == 0)

插入和查找

插入方式与传统容器非常相似

f.insert("hello");
f.insert(data.begin(), data.end());

当然，在这种情况下，“插入”并不涉及元素在过滤器中的任何实际存储，而是根据这些元素的哈希值在内部数组中设置位。查找如下所示

bool b1 = f.may_contain("hello"); // b1 is true since we actually inserted "hello"
bool b2 = f.may_contain("bye"); // b2 is most likely false

顾名思义，may_contain 即使在元素之前未插入时也可能返回 true，也就是说，它可能产生误报——这是概率数据结构的本质。fpr_for 提供误报率的估计

// we have inserted 100 elements so far, what's our FPR?
std::cout<< filter::fpr_for(100, f.capacity());

请注意，在示例中我们外部提供了数字 100：boost::bloom::filter 不跟踪已插入元素的数量——换句话说，它没有 size 操作。

一旦插入，就没有办法从过滤器中删除特定元素。我们只能完全清空过滤器

f.clear(); // sets all the bits in the array to zero

批量操作

通常，以下代码

f.insert(data.begin(), data.end());

比...更快

for(const auto& x: data) f.insert(x);

这是因为前者以 bulk_insert_size 的块大小处理范围，并使用一些内部的简化技术来减少执行时间。类似地，may_contain 也可以批量执行，如下所示

f.may_contain(
  input.begin(), input.end(), // range of elements to do lookup on
  [](value_type& x, bool b) { // called for each of the elements with their lookup result
    if(b) std::cout << x << "likely in the filter";
	else  std::cout << x << "not in the filter";
  });

批量 may_contain 以 bulk_may_contain_size 个元素的块大小处理范围。

批量模式可以将性能提高 2 倍或更多，但这非常依赖于过滤器配置、使用的编译器和环境，在某些情况下会导致净性能损失。通常，对于较大的数组大小，加速会更高。有关更多信息，请参阅专用基准测试部分和相关仓库。

过滤器组合

可以通过对它们的数组进行逻辑 OR 操作来组合 boost::bloom::filter

filter f2 = ...;
...
f |= f2; // f and f2 must have exactly the same capacity

结果等同于一个“包含”f 和 f2 元素集合并集的过滤器。另一方面，AND 组合会产生一个包含元素交集的过滤器

filter f3 = ...;
...
f &= f3; // f and f3 must have exactly the same capacity

对于 AND 组合，请注意，最终的 FPR 通常会比从头开始通过仅插入公共元素构建过滤器时更差（更高）——在这种情况下，不要相信 fpr_for。

直接访问数组

可以通过 array 成员函数直接访问位数组的内容，该函数可用于过滤器序列化

filter f1 = ...;
...

// save filter
std::ofstream out("filter.bin", std::ios::binary);
std::size_t c1=f1.capacity();
out.write(reinterpret_cast<const char*>(&c1), sizeof(c1)); // save capacity (bits)
boost::span<const unsigned char> s1 = f1.array();
out.write(reinterpret_cast<const char*>(s1.data()), s1.size()); // save array
out.close();

// load filter
filter f2;
std::ifstream in("filter.bin", std::ios::binary);
std::size_t c2;
in.read(reinterpret_cast<char*>(&c2), sizeof(c2));
f2.reset(c2); // restore capacity
boost::span<unsigned char> s2 = f2.array();
in.read(reinterpret_cast<char*>(s2.data()), s2.size()); // load array
in.close();

请注意，array() 是一个 unsigned char 的 span，而容量是以位为单位测量的，因此 array.size() 是 capacity() / CHAR_BIT。如果您在与保存过滤器相同的计算机之外的计算机上加载序列化过滤器，请考虑两端的 CPU 体系结构必须具有相同的字节序才能使重构工作正常进行。

调试

Visual Studio Natvis

将 boost_bloom.natvis 可视化工具添加到您的项目中，以方便用户检查 boost::bloom::filter。

图 10. 使用 boost_bloom.natvis 查看 boost::bloom::filter。

GDB pretty-printer

boost::bloom::filter 附带一个专用的 pretty-printer，用于在 GDB 调试时进行可视化检查

(gdb) print f
$1 = boost::bloom::filter with {capacity = 2000, data = 0x6da840, size = 250} = {[0] = 0 '\000',
  [1] = 0 '\000', [2] = 0 '\000', [3] = 0 '\000', [4] = 0 '\000', [5] = 1 '\001'...}

(gdb) # boost::bloom::filter does not have an operator[]. The following expression
(gdb) # is used in place of print f.array()[30]
(gdb) print f[30]
$2 = 128 '\200'

请记住在 GDB 中启用 pretty-printing（通常是一次性设置）

(gdb) set print pretty on

如果您的编译二进制格式是 ELF 并且宏 BOOST_ALL_NO_EMBEDDED_GDB_SCRIPTS 未定义，则 pretty-printer 会自动嵌入到程序中；使用此命令（或默认通过将其添加到您的 .gdbinit 配置文件）可以在 GDB 会话中为特定 GDB 会话启用嵌入式 pretty-printer。

(gdb) add-auto-load-safe-path <path-to-executable>

作为使用嵌入式 pretty-printer 的替代方法，您可以显式加载 boost_bloom_printers.py 脚本

(gdb) source <path-to-boost>/libs/bloom/extra/boost_bloom_printers.py

选择过滤器配置

Boost.Bloom 提供了大量的编译时和运行时配置选项，因此选择起来可能很困难。如果您正在为特定的 FPR 或容量进行目标设置，并希望选择最合适的过滤器类型，那么下面的图表可能会有所帮助。

图 11. 不同过滤器类型的 FPR 与 c 的关系。

该图绘制了几个 boost::bloom::filter 的 FPR 与 c（容量 / 已插入元素数量）的关系，其中 K 已设置为其最佳值（最小 FPR），如下表所示。

	c = 容量 / 已插入元素数量
	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24
`filter<T,1,block<uint32_t,K>>`	3	3	3	4	4	5	5	5	5	5	5	5	6	6	7	7	7	7	7	7	7
`filter<T,1,block<uint32_t,K>,1>`	2	3	4	4	4	4	5	5	5	6	6	6	6	6	6	6	7	7	7	7	7
`filter<T,1,block<uint64_t,K>>`	2	3	4	4	5	5	5	5	5	6	6	6	6	6	7	7	7	7	7	7	7
`filter<T,1,block<uint64_t,K>,1>`	2	3	4	4	4	5	6	6	6	7	7	7	7	7	8	8	8	8	8	9	9
`filter<T,1,multiblock<uint32_t,K>>`	3	3	4	5	6	6	8	8	8	8	9	9	9	10	13	13	15	15	15	16	16
`filter<T,1,block<uint64_t[8],K>>`	4	4	4	5	5	6	7	7	7	8	8	9	9	10	10	11	12	12	12	12	12
`filter<T,1,multiblock<uint32_t,K>,1>`	3	3	4	5	6	6	7	7	8	8	9	9	10	10	12	12	14	14	14	14	15
`filter<T,1,block<uint64_t[8],K>,1>`	3	3	4	5	6	6	7	7	7	8	8	8	10	11	11	12	12	12	12	12	13
`filter<T,1,multiblock<uint64_t,K>>`	4	4	5	5	6	6	6	7	8	8	10	10	12	13	14	15	15	15	15	16	17
`filter<T,1,multiblock<uint64_t,K>,1>`	3	3	4	5	5	6	6	7	9	10	10	11	11	12	12	13	13	13	15	16	16
`filter<T,K>`	3	4	4	5	5	6	6	8	8	9	10	11	12	13	13	13	14	16	16	16	17

让我们以一个例子来看如何使用。假设我们计划插入 1000 万个元素，并希望将 FPR 保持在 10^-4。图表为我们提供了五个不同的 c 值（数组容量除以元素数量，在本例中为 1000 万）

filter<T, K> → 每元素 c ≅ 19 位
filter<T, 1, multiblock<uint64_t, K>, 1> → 每元素 c ≅ 20 位
filter<T, 1, multiblock<uint64_t, K>> → 每元素 c ≅ 21 位
filter<T, 1, block<uint64_t[8], K>, 1> → 每元素 c ≅ 21 位
filter<T, 1, multiblock<uint32_t, K>, 1> → 每元素 c ≅ 21.5 位
filter<T, 1, block<uint64_t[8], K>> → 每元素 c ≅ 22 位
filter<T, 1, multiblock<uint32_t, K>> → 每元素 c ≅ 23 位

这些选项在使用的空间和性能方面有不同的权衡。如果我们选择 filter<T, 1, multiblock<uint32_t, K>, 1> 作为折衷（或者更好的是 filter<T, 1, fast_multiblock32<K>, 1>），那么最后一步就是查阅表中 c = 21 或 22 的 K 值，我们就得到了最终的配置

using my_filter=filter<std::string, 1, fast_multiblock32<14>, 1>;

最终的过滤器可以通过以下任一方式构建

// 1) calculate the capacity from the value of c we got from the chart
my_filter f((std::size_t)(10'000'000 * 21.5));

// 2) let the library calculate the capacity from n and target fpr
// expect some deviation from the capacity in 1)
my_filter f(10'000'000, 1E-4);

// 3) equivalent to 2)
my_filter f(my_filter::capacity_for(10'000'000, 1E-4));

基准测试

（更多结果请参见专用仓库。）

这些表格显示了 boost::bloom::filter<int, ...> 的若干配置在插入 1000 万个元素时的每操作的误报率和执行时间（纳秒）。

ins.：插入。
succ. lkp.：成功查找（元素在过滤器中）。
uns. lkp.：不成功查找（元素不在过滤器中，尽管查找可能返回 true）。
mixed lkp.：混合查找（10% 成功，90% 不成功）。

过滤器使用容量 c * N（位）构造，因此 c 是每个元素使用的位数。对于 c 和给定过滤器配置的每种组合，我们都选择了最佳的 K 值（产生最小 FPR 的值）。使用标准的发布模式设置；AVX2 在 Visual Studio 生成（/arch:AVX2）和 GCC/Clang 生成（-march=native）中表示，这会导致 fast_multiblock32 和 fast_multiblock64 使用其 AVX2 变体。

GCC 14, x64

	`filter<int,K>`						`filter<int,1,block<uint64_t,K>>`						`filter<int,1,block<uint64_t,K>,1>`
c	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	6	2.1519	15.40	17.18	21.01	22.53	4	3.3467	5.04	5.67	5.65	5.63	5	3.0383	5.31	5.97	5.98	6.00
12	9	0.3180	52.20	57.42	28.83	33.49	5	1.0300	11.45	12.50	12.56	12.47	6	0.8268	12.07	12.59	12.59	12.54
16	11	0.0469	85.91	95.79	34.35	43.78	6	0.4034	16.94	18.74	18.73	18.72	7	0.2883	19.20	19.38	19.37	19.38
20	14	0.0065	122.99	136.63	40.00	54.53	7	0.1887	22.76	22.40	22.40	22.38	8	0.1194	23.06	25.67	25.67	25.69
	`filter<int,1,multiblock<uint64_t,K>>`						`filter<int,1,multiblock<uint64_t,K>,1>`						`filter<int,1,fast_multiblock32<K>>`
c	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	2.4510	6.20	6.93	6.91	6.90	5	2.3157	6.98	8.73	8.72	8.75	5	2.7361	4.26	4.05	4.06	4.09
12	8	0.4207	13.54	17.01	17.02	17.03	8	0.3724	17.59	22.15	22.21	22.10	8	0.5415	8.09	8.41	8.42	8.46
16	11	0.0764	29.76	31.32	31.29	31.29	11	0.0642	35.49	35.18	35.16	35.16	11	0.1179	19.46	21.14	15.17	16.59
20	13	0.0150	37.30	39.43	39.43	39.45	14	0.0122	39.98	52.28	51.49	52.16	13	0.0275	21.92	24.21	17.10	18.71
	`filter<int,1,fast_multiblock32<K>,1>`						`filter<int,1,fast_multiblock64<K>>`						`filter<int,1,fast_multiblock64<K>,1>`
c	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	2.4788	4.34	4.01	4.00	4.02	5	2.4546	6.96	7.39	7.48	7.41	5	2.3234	5.87	5.85	5.85	5.86
12	8	0.4394	8.88	8.75	8.77	8.74	8	0.4210	8.69	9.66	9.72	9.69	8	0.3754	11.26	12.49	12.48	12.47
16	11	0.0865	18.94	20.93	14.90	16.43	11	0.0781	25.21	26.82	21.99	23.32	11	0.0642	23.97	27.41	22.21	23.47
20	13	0.0178	21.62	24.21	16.72	18.43	13	0.0160	30.99	37.48	25.27	26.90	14	0.0110	30.44	37.40	25.38	26.93
	`filter<int,1,block<uint64_t[8],K>>`						`filter<int,1,block<uint64_t[8],K>,1>`						`filter<int,1,multiblock<uint64_t[8],K>>`
c	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	2.3292	8.27	9.21	16.05	17.80	6	2.2986	12.49	10.04	20.31	21.25	7	2.3389	15.10	15.21	15.26	15.29
12	7	0.4140	16.00	17.87	18.36	21.24	7	0.3845	23.85	22.82	22.12	24.50	10	0.3468	29.46	31.95	31.99	31.98
16	9	0.0852	28.48	29.03	22.33	27.03	10	0.0714	35.97	34.60	26.73	32.18	11	0.0493	44.35	50.15	50.14	50.21
20	12	0.0196	42.78	43.70	25.31	32.06	12	0.0152	52.29	52.03	29.21	34.86	15	0.0076	69.93	74.25	74.10	74.15

Clang 18, x64

	`filter<int,K>`						`filter<int,1,block<uint64_t,K>>`						`filter<int,1,block<uint64_t,K>,1>`
c	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	6	2.1519	15.00	15.48	20.32	21.79	4	3.3467	4.73	4.73	4.70	4.69	5	3.0383	5.64	6.25	6.23	6.24
12	9	0.3180	47.79	49.35	26.32	31.11	5	1.0300	11.49	11.45	11.46	11.47	6	0.8268	11.70	12.80	12.81	12.82
16	11	0.0469	86.52	87.04	31.84	40.27	6	0.4034	18.14	18.24	18.23	18.25	7	0.2883	17.03	18.96	18.96	18.99
20	14	0.0065	119.92	119.60	36.84	52.66	7	0.1887	22.16	22.00	22.00	22.02	8	0.1194	14.02	15.27	15.27	15.25
	`filter<int,1,multiblock<uint64_t,K>>`						`filter<int,1,multiblock<uint64_t,K>,1>`						`filter<int,1,fast_multiblock32<K>>`
c	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	2.4510	5.27	6.18	6.18	6.20	5	2.3157	4.50	4.98	5.00	5.01	5	2.7361	4.14	4.05	4.06	4.08
12	8	0.4207	8.73	10.49	10.48	10.46	8	0.3724	9.55	12.13	12.13	12.08	8	0.5415	7.94	7.63	7.64	7.62
16	11	0.0764	18.48	22.46	22.46	22.46	11	0.0642	17.81	22.51	22.48	22.50	11	0.1179	15.25	16.61	12.59	13.95
20	13	0.0150	23.02	29.67	29.71	29.71	14	0.0122	24.11	30.94	30.96	30.98	13	0.0275	16.95	18.48	13.96	15.71
	`filter<int,1,fast_multiblock32<K>,1>`						`filter<int,1,fast_multiblock64<K>>`						`filter<int,1,fast_multiblock64<K>,1>`
c	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	2.4788	3.68	3.45	3.42	3.39	5	2.4546	5.19	5.56	5.57	5.59	5	2.3234	5.22	5.39	5.37	5.38
12	8	0.4394	7.90	7.93	7.99	7.92	8	0.4210	9.32	9.97	9.97	9.99	8	0.3754	10.96	12.42	12.43	12.43
16	11	0.0865	14.44	16.40	12.03	13.44	11	0.0781	19.44	21.68	17.18	19.06	11	0.0642	18.27	21.52	17.03	18.84
20	13	0.0178	15.88	18.32	13.35	15.19	13	0.0160	24.18	29.36	18.90	21.11	14	0.0110	24.52	26.84	18.96	20.93
	`filter<int,1,block<uint64_t[8],K>>`						`filter<int,1,block<uint64_t[8],K>,1>`						`filter<int,1,multiblock<uint64_t[8],K>>`
c	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	2.3292	7.46	7.85	15.05	17.13	6	2.2986	12.93	10.03	19.77	20.99	7	2.3389	13.53	13.34	13.37	13.32
12	7	0.4140	18.93	17.17	18.34	21.21	7	0.3845	23.02	19.22	19.54	22.44	10	0.3468	32.32	34.01	34.01	33.94
16	9	0.0852	29.47	27.26	21.68	25.51	10	0.0714	36.97	33.58	24.96	29.48	11	0.0493	44.86	49.30	49.36	49.36
20	12	0.0196	44.02	36.49	25.43	32.45	12	0.0152	53.64	41.22	26.37	32.60	15	0.0076	73.24	70.58	70.14	70.39

Clang 15, ARM64

	`filter<int,K>`						`filter<int,1,block<uint64_t,K>>`						`filter<int,1,block<uint64_t,K>,1>`
c	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	6	2.1519	7.84	6.33	13.07	13.65	4	3.3467	2.12	2.07	2.07	2.08	5	3.0383	2.18	2.10	2.09	2.09
12	9	0.3180	13.12	11.56	16.21	17.94	5	1.0300	3.75	3.82	3.57	3.92	6	0.8268	3.48	3.24	3.19	3.21
16	11	0.0469	31.85	25.68	18.28	22.44	6	0.4034	7.19	6.65	6.41	6.58	7	0.2883	6.78	6.16	6.13	6.01
20	14	0.0065	53.59	39.04	20.54	27.09	7	0.1887	9.40	8.05	8.11	7.94	8	0.1194	7.73	6.32	6.83	6.65
	`filter<int,1,multiblock<uint64_t,K>>`						`filter<int,1,multiblock<uint64_t,K>,1>`						`filter<int,1,fast_multiblock32<K>>`
c	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	2.4510	2.75	2.63	2.64	2.60	5	2.3157	2.77	2.68	2.69	2.67	5	2.7361	2.44	2.62	2.66	2.64
12	8	0.4207	4.11	4.24	4.55	4.35	8	0.3724	4.34	4.66	4.54	4.60	8	0.5415	2.68	3.31	3.33	3.33
16	11	0.0764	10.25	9.42	9.30	10.01	11	0.0642	11.16	9.82	10.07	9.91	11	0.1179	8.25	8.07	5.90	6.98
20	13	0.0150	14.14	12.81	12.90	12.86	14	0.0122	15.87	13.62	13.61	13.90	13	0.0275	10.31	11.20	6.69	7.76
	`filter<int,1,fast_multiblock32<K>,1>`						`filter<int,1,fast_multiblock64<K>>`						`filter<int,1,fast_multiblock64<K>,1>`
c	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	2.4788	2.40	2.57	2.62	2.60	5	2.4510	2.79	2.65	2.67	2.64	5	2.3157	2.76	2.67	2.70	2.69
12	8	0.4394	3.27	3.14	3.11	3.36	8	0.4207	4.08	4.64	4.36	4.39	8	0.3724	4.36	4.78	4.93	4.94
16	11	0.0865	7.87	8.12	6.03	6.68	11	0.0764	9.78	9.01	9.01	9.58	11	0.0642	11.18	9.98	10.05	10.00
20	13	0.0178	10.28	10.90	6.43	7.91	13	0.0150	15.65	12.84	13.19	12.87	14	0.0122	15.83	13.80	13.05	12.52
	`filter<int,1,block<uint64_t[8],K>>`						`filter<int,1,block<uint64_t[8],K>,1>`						`filter<int,1,multiblock<uint64_t[8],K>>`
c	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	2.3292	4.71	4.40	11.49	12.49	6	2.2986	9.07	5.03	13.73	14.25	7	2.3389	9.10	7.06	7.05	7.05
12	7	0.4140	9.22	8.50	12.67	15.09	7	0.3845	13.27	8.10	13.37	15.47	10	0.3468	14.73	12.49	12.21	12.53
16	9	0.0852	16.66	13.68	14.34	16.94	10	0.0714	24.53	16.40	16.90	20.26	11	0.0493	27.25	23.91	23.51	24.00
20	12	0.0196	22.15	16.19	15.03	18.30	12	0.0152	27.21	19.06	16.37	20.16	15	0.0076	45.60	35.50	35.49	35.50

VS 2022, x64

	`filter<int,K>`						`filter<int,1,block<uint64_t,K>>`						`filter<int,1,block<uint64_t,K>,1>`
c	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	6	2.1519	15.74	39.75	24.15	27.06	4	3.3467	4.66	4.87	4.84	4.83	5	3.0383	4.93	4.83	4.61	4.57
12	9	0.3180	39.82	39.21	20.53	23.85	5	1.0300	8.55	8.28	8.44	8.35	6	0.8268	9.77	9.51	9.46	9.42
16	11	0.0469	71.07	83.07	26.88	34.78	6	0.4034	16.76	14.54	14.64	14.63	7	0.2883	17.56	16.92	16.90	16.95
20	14	0.0065	103.76	118.04	31.19	42.73	7	0.1887	19.82	18.42	18.45	18.55	8	0.1194	20.96	22.75	22.79	22.75
	`filter<int,1,multiblock<uint64_t,K>>`						`filter<int,1,multiblock<uint64_t,K>,1>`						`filter<int,1,fast_multiblock32<K>>`
c	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	2.4510	7.06	4.81	4.60	4.67	5	2.3157	7.85	5.28	5.03	4.98	5	2.7361	4.06	3.57	3.57	3.55
12	8	0.4207	12.78	11.44	11.42	11.32	8	0.3724	19.79	13.42	13.63	13.60	8	0.5415	6.09	6.06	5.23	6.43
16	11	0.0764	27.65	24.32	24.34	24.46	11	0.0642	31.18	29.82	29.78	29.84	11	0.1179	14.42	15.78	11.22	12.37
20	13	0.0150	36.26	35.31	35.21	35.25	14	0.0122	39.58	37.78	37.75	37.78	13	0.0275	16.33	17.85	12.38	13.61
	`filter<int,1,fast_multiblock32<K>,1>`						`filter<int,1,fast_multiblock64<K>>`						`filter<int,1,fast_multiblock64<K>,1>`
c	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	2.4788	3.30	7.65	7.66	7.65	5	2.4546	4.24	3.70	3.47	3.45	5	2.3234	4.62	3.77	3.62	3.55
12	8	0.4394	8.19	8.95	7.71	8.57	8	0.4210	8.66	8.26	7.42	8.71	8	0.3754	10.37	9.63	8.42	9.19
16	11	0.0865	14.86	15.64	11.14	12.09	11	0.0781	23.99	19.63	16.58	17.51	11	0.0642	21.43	19.33	16.35	17.11
20	13	0.0178	17.75	17.90	12.37	13.80	13	0.0160	28.54	26.97	18.98	20.10	14	0.0110	28.58	26.89	19.02	19.88
	`filter<int,1,block<uint64_t[8],K>>`						`filter<int,1,block<uint64_t[8],K>,1>`						`filter<int,1,multiblock<uint64_t[8],K>>`
c	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	FPR [%]	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	2.3292	8.16	7.72	12.16	13.28	6	2.2986	10.74	9.89	14.82	15.44	7	2.3389	11.56	9.94	9.93	9.72
12	7	0.4140	14.41	14.90	15.39	17.45	7	0.3845	22.28	20.72	18.53	20.75	10	0.3468	21.52	21.28	21.28	21.30
16	9	0.0852	27.96	28.29	20.87	24.11	10	0.0714	34.15	33.63	20.57	25.46	11	0.0493	42.62	40.23	40.30	40.28
20	12	0.0196	36.03	35.12	23.05	28.27	12	0.0152	42.51	41.13	21.94	28.03	15	0.0076	68.84	64.56	64.67	64.67

批量操作

（更多结果请参见专用仓库。）

下表显示了之前描述的插入和查找操作以批量模式执行时的相对性能。大于 1.0 的值表示批量模式更快（并非总是如此）。

GCC 14, x64

	`filter<int,K>`					`filter<int,1,block<uint64_t,K>>`					`filter<int,1,block<uint64_t,K>,1>`
c	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	6	1.00	0.78	2.11	1.43	4	1.20	1.34	1.21	1.32	5	1.09	1.34	1.25	1.22
12	9	1.86	1.54	2.27	1.38	5	2.07	2.26	2.10	2.02	6	1.89	1.91	1.90	1.88
16	11	2.22	2.08	2.45	1.46	6	2.92	2.97	2.86	2.82	7	2.69	2.65	2.66	2.66
20	14	2.19	2.24	2.57	1.43	7	3.50	3.24	3.15	3.33	8	2.71	3.06	3.29	3.16
	`filter<int,1,multiblock<uint64_t,K>>`					`filter<int,1,multiblock<uint64_t,K>,1>`					`filter<int,1,fast_multiblock32<K>>`
c	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	1.32	1.32	1.33	1.34	5	1.27	1.54	1.54	1.55	5	1.28	1.31	1.32	1.38
12	8	1.98	2.13	2.35	2.13	8	2.28	2.65	2.66	2.60	8	2.09	2.09	2.25	2.16
16	11	2.15	2.55	2.52	2.48	11	1.67	2.47	2.48	2.48	11	2.70	3.01	3.02	3.06
20	13	2.07	2.29	2.29	2.28	14	2.24	2.89	2.74	2.80	13	2.44	2.66	2.58	2.64
	`filter<int,1,fast_multiblock32<K>,1>`					`filter<int,1,fast_multiblock64<K>>`					`filter<int,1,fast_multiblock64<K>,1>`
c	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	1.12	1.22	1.18	1.17	5	1.28	1.27	1.26	1.34	5	1.24	1.27	1.29	1.29
12	8	2.01	2.10	2.10	2.10	8	2.18	2.25	2.01	2.12	8	2.19	2.41	2.44	2.41
16	11	2.62	2.95	2.96	2.95	11	1.96	2.15	2.15	2.15	11	1.82	2.12	2.12	2.11
20	13	2.35	2.61	2.53	2.60	13	1.77	1.89	1.89	1.89	14	1.69	1.87	1.87	1.87
	`filter<int,1,block<uint64_t[8],K>>`					`filter<int,1,block<uint64_t[8],K>,1>`					`filter<int,1,multiblock<uint64_t[8],K>>`
c	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	1.25	1.37	1.26	1.35	6	0.99	1.38	1.38	1.52	7	1.21	1.24	1.25	1.25
12	7	1.84	1.81	1.80	1.80	7	1.61	2.35	2.41	2.43	10	1.50	1.65	1.65	1.65
16	9	2.92	2.56	2.56	2.56	10	2.40	3.44	3.43	3.42	11	1.39	1.65	1.64	1.64
20	12	3.50	3.27	3.27	3.27	12	3.21	4.03	4.03	4.03	15	1.21	1.42	1.42	1.41

Clang 18, x64

	`filter<int,K>`					`filter<int,1,block<uint64_t,K>>`					`filter<int,1,block<uint64_t,K>,1>`
c	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	6	1.06	0.88	2.58	1.71	4	1.50	1.38	1.39	1.39	5	1.23	1.32	1.36	1.32
12	9	1.90	1.63	2.28	1.46	5	2.46	2.32	2.32	2.29	6	2.11	2.35	2.35	2.35
16	11	2.25	2.01	2.39	1.43	6	3.60	3.40	3.44	3.40	7	2.75	3.13	3.13	3.13
20	14	2.15	2.18	2.50	1.38	7	3.95	3.80	3.81	3.80	8	1.78	2.00	1.99	2.00
	`filter<int,1,multiblock<uint64_t,K>>`					`filter<int,1,multiblock<uint64_t,K>,1>`					`filter<int,1,fast_multiblock32<K>>`
c	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	1.20	1.32	1.34	1.34	5	1.25	1.54	1.54	1.54	5	1.22	1.32	1.31	1.29
12	8	2.64	2.59	2.61	2.60	8	2.09	2.59	2.65	2.64	8	2.46	2.58	2.59	2.58
16	11	1.66	1.96	1.93	1.96	11	1.55	1.87	1.85	1.84	11	2.36	2.68	2.68	2.68
20	13	1.49	1.88	1.87	1.89	14	1.48	1.86	1.88	1.91	13	1.78	2.09	2.09	2.09
	`filter<int,1,fast_multiblock32<K>,1>`					`filter<int,1,fast_multiblock64<K>>`					`filter<int,1,fast_multiblock64<K>,1>`
c	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	1.29	1.31	1.36	1.31	5	1.34	1.48	1.48	1.47	5	1.30	1.41	1.45	1.40
12	8	1.96	2.03	2.05	2.02	8	2.48	2.45	2.42	2.45	8	2.34	2.67	2.68	2.67
16	11	2.26	2.65	2.66	2.65	11	1.66	1.87	1.88	1.85	11	1.58	1.81	1.81	1.80
20	13	1.74	2.04	2.04	2.03	13	1.46	1.89	1.89	1.89	14	1.45	1.79	1.79	1.78
	`filter<int,1,block<uint64_t[8],K>>`					`filter<int,1,block<uint64_t[8],K>,1>`					`filter<int,1,multiblock<uint64_t[8],K>>`
c	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	1.70	1.66	1.69	1.68	6	1.02	1.54	1.54	1.54	7	1.09	1.18	1.17	1.18
12	7	2.16	2.14	2.13	2.14	7	1.59	2.53	2.53	2.53	10	1.44	1.58	1.58	1.57
16	9	3.09	2.95	2.96	2.95	10	2.29	3.82	3.83	3.82	11	1.36	1.59	1.58	1.60
20	12	3.69	3.44	3.44	3.41	12	3.28	4.16	4.16	4.16	15	1.24	1.32	1.32	1.32

Clang 15, ARM64

	`filter<int,K>`					`filter<int,1,block<uint64_t,K>>`					`filter<int,1,block<uint64_t,K>,1>`
c	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	6	0.89	0.68	3.16	2.13	4	1.28	1.32	1.29	1.31	5	0.98	1.18	1.17	1.17
12	9	1.03	0.87	3.08	2.03	5	1.97	2.05	2.03	2.12	6	1.42	1.72	1.66	1.62
16	11	1.39	1.25	2.44	1.41	6	2.75	3.11	2.94	3.07	7	1.92	2.05	2.05	1.94
20	14	1.42	1.16	2.18	1.04	7	2.93	3.10	2.97	3.34	8	1.76	1.87	1.67	1.92
	`filter<int,1,multiblock<uint64_t,K>>`					`filter<int,1,multiblock<uint64_t,K>,1>`					`filter<int,1,fast_multiblock32<K>>`
c	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	1.09	1.28	1.27	1.28	5	1.04	1.29	1.28	1.31	5	1.13	1.47	1.47	1.46
12	8	2.04	2.33	1.99	2.24	8	1.49	1.98	1.97	2.10	8	1.72	2.33	2.23	2.13
16	11	2.01	2.25	2.31	2.33	11	1.75	2.15	2.18	2.14	11	2.02	2.27	2.46	2.71
20	13	2.19	2.32	2.49	2.22	14	2.03	2.52	2.42	2.42	13	1.95	2.70	2.66	2.71
	`filter<int,1,fast_multiblock32<K>,1>`					`filter<int,1,fast_multiblock64<K>>`					`filter<int,1,fast_multiblock64<K>,1>`
c	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	1.12	1.47	1.43	1.44	5	1.10	1.29	1.26	1.28	5	1.05	1.33	1.31	1.31
12	8	1.50	1.67	1.62	1.71	8	2.11	2.18	2.24	2.23	8	1.48	2.10	2.19	1.99
16	11	2.24	2.37	2.72	2.76	11	2.11	2.16	2.17	2.17	11	2.00	2.18	2.17	2.14
20	13	2.02	2.65	2.67	2.71	13	2.16	2.32	2.33	2.34	14	2.16	2.40	2.41	2.40
	`filter<int,1,block<uint64_t[8],K>>`					`filter<int,1,block<uint64_t[8],K>,1>`					`filter<int,1,multiblock<uint64_t[8],K>>`
c	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	1.33	1.76	1.75	1.78	6	1.49	1.57	1.56	1.58	7	1.16	1.31	1.30	1.30
12	7	1.78	2.55	2.55	2.55	7	1.97	2.30	2.29	2.29	10	1.29	1.45	1.45	1.46
16	9	2.66	3.70	3.69	3.70	10	2.62	3.49	3.47	3.65	11	1.55	1.55	1.55	1.55
20	12	3.10	4.08	4.08	4.14	12	3.25	4.40	4.40	4.46	15	1.72	1.60	1.60	1.60

VS 2022, x64

	`filter<int,K>`					`filter<int,1,block<uint64_t,K>>`					`filter<int,1,block<uint64_t,K>,1>`
c	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	6	0.92	0.69	2.11	1.33	4	1.50	3.79	3.77	3.77	5	3.32	1.29	1.29	1.28
12	9	1.35	0.80	2.14	1.15	5	1.91	1.66	1.66	1.70	6	2.16	0.94	0.97	0.95
16	11	2.20	3.09	3.57	2.13	6	4.32	3.52	3.49	3.51	7	3.01	3.92	3.93	3.93
20	14	1.86	2.60	2.64	1.45	7	4.84	4.20	4.19	4.18	8	3.05	3.92	3.90	3.91
	`filter<int,1,multiblock<uint64_t,K>>`					`filter<int,1,multiblock<uint64_t,K>,1>`					`filter<int,1,fast_multiblock32<K>>`
c	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	1.57	1.15	1.16	1.16	5	1.23	1.43	1.43	1.43	5	1.24	0.88	0.87	0.88
12	8	1.93	4.55	4.53	4.53	8	1.88	2.85	2.79	2.85	8	2.39	2.03	2.02	2.01
16	11	3.90	4.65	4.68	4.76	11	2.35	1.54	1.55	1.50	11	1.25	0.83	0.83	0.83
20	13	3.80	3.23	3.26	3.02	14	2.27	2.45	2.44	2.32	13	1.26	1.11	1.10	1.11
	`filter<int,1,fast_multiblock32<K>,1>`					`filter<int,1,fast_multiblock64<K>>`					`filter<int,1,fast_multiblock64<K>,1>`
c	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	1.20	3.26	3.26	3.25	5	0.52	0.97	0.97	0.97	5	1.14	3.61	3.61	3.62
12	8	1.50	2.22	2.21	2.23	8	2.17	4.37	4.35	4.37	8	1.40	4.08	4.03	4.06
16	11	1.36	1.30	1.30	1.30	11	2.69	0.81	0.78	0.82	11	2.83	1.69	1.59	1.48
20	13	1.35	1.20	1.21	1.20	13	1.54	1.58	1.86	1.75	14	2.44	1.45	1.53	1.58
	`filter<int,1,block<uint64_t[8],K>>`					`filter<int,1,block<uint64_t[8],K>,1>`					`filter<int,1,multiblock<uint64_t[8],K>>`
c	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.	K	ins.	succ. lkp.	uns. lkp.	mixed lkp.
8	5	1.42	1.37	1.37	1.37	6	1.09	0.32	0.33	0.32	7	1.58	0.43	0.43	0.42
12	7	3.14	1.55	1.57	1.57	7	1.67	0.66	0.66	0.66	10	1.11	1.33	1.33	1.33
16	9	3.66	2.76	2.74	2.75	10	1.64	1.26	1.26	1.25	11	1.26	0.99	0.99	0.99
20	12	3.33	3.25	3.24	3.25	12	1.83	1.68	1.68	1.68	15	1.23	1.55	1.56	1.55

参考

`<boost/bloom.hpp>`

包含此参考中列出的所有其他头文件的便捷头文件。

`<boost/bloom/filter.hpp>`

定义 boost::bloom::filter 和相关函数。

namespace boost{
namespace bloom{

template<
  typename T, std::size_t K,
  typename Subfilter = block<unsigned char, 1>, std::size_t Stride = 0,
  typename Hash = boost::hash<T>,
  typename Allocator = std::allocator<unsigned char>
>
class filter;

template<
  typename T, std::size_t K, typename SF, std::size_t S, typename H, typename A
>
bool operator==(
  const filter<T, K, SF, S, H, A>& x, const filter<T, K, SF, S, H, A>& y);

template<
  typename T, std::size_t K, typename SF, std::size_t S, typename H, typename A
>
bool operator!=(
  const filter<T, K, SF, S, H, A>& x, const filter<T, K, SF, S, H, A>& y);

template<
  typename T, std::size_t K, typename SF, std::size_t S, typename H, typename A
>
void swap(filter<T, K, SF, S, H, A>& x, filter<T, K, SF, S, H, A>& y)
  noexcept(noexcept(x.swap(y)));

} // namespace bloom
} // namespace boost

类模板 `filter`

boost::bloom::filter — 一种数据结构，它支持元素插入和概率性查找，其中一个元素可以被确定为以高置信度在过滤器中，否则则不能确定。查找错误地将不存在的元素分类为存在的概率称为过滤器的误报率（FPR）。

boost::bloom::filter 维护一个由 m 位组成的内部数组，其中 m 是过滤器的容量。与传统容器不同，插入元素 x 不会将 x 的副本存储在过滤器中，而是导致数组中固定数量的位设置为一，这些位的确切位置是通过 x 的哈希值伪随机生成的。查找 y 只是检查与 y 关联的所有位是否都已设置为一。

对于给定的过滤器，FPR 随着新元素的插入而增加。
对于给定数量的已插入元素，容量更高的过滤器具有更低的 FPR。

按照惯例，当其容量为零或内部数组中的所有位都设置为零时，我们称过滤器为空。

提要

// #include <boost/bloom/filter.hpp>

namespace boost{
namespace bloom{

template<
  typename T, std::size_t K,
  typename Subfilter = block<unsigned char, 1>, std::size_t Stride = 0,
  typename Hash = boost::hash<T>,
  typename Allocator = std::allocator<unsigned char>
>
class filter
{
public:
  // types and constants
  using value_type                    = T;
  static constexpr std::size_t k      = K;
  using subfilter                     = Subfilter;
  static constexpr std::size_t stride = see below;
  using hasher                        = Hash;
  using allocator_type                = Allocator;
  using size_type                     = std::size_t;
  using difference_type               = std::ptrdiff_t;
  using reference                     = value_type&;
  using const_reference               = const value_type&;
  using pointer                       = value_type*;
  using const_pointer                 = const value_type*;
  static constexpr std::size_t
    bulk_insert_size                  = implementation-defined;
  static constexpr std::size_t
    bulk_may_contain_size             = implementation-defined;

  // construct/copy/destroy
  filter();
  explicit filter(
    size_type m, const hasher& h = hasher(),
    const allocator_type& al = allocator_type());
  filter(
    size_type n, double fpr, const hasher& h = hasher(),
    const allocator_type& al = allocator_type());
  template<typename InputIterator>
    filter(
      InputIterator first, InputIterator last,
      size_type m, const hasher& h = hasher(),
      const allocator_type& al = allocator_type());
  template<typename InputIterator>
    filter(
      InputIterator first, InputIterator last,
      size_type n, double fpr, const hasher& h = hasher(),
      const allocator_type& al = allocator_type());
  filter(const filter& x);
  filter(filter&& x);
  template<typename InputIterator>
    filter(
      InputIterator first, InputIterator last,
      size_type m, const allocator_type& al);
  template<typename InputIterator>
    filter(
      InputIterator first, InputIterator last,
      size_type n, double fpr, const allocator_type& al);
  explicit filter(const allocator_type& al);
  filter(const filter& x, const allocator_type& al);
  filter(filter&& x, const allocator_type& al);
  filter(
    std::initializer_list<value_type> il,
    size_type m, const hasher& h = hasher(),
    const allocator_type& al = allocator_type());
  filter(
    std::initializer_list<value_type> il,
    size_type n, double fpr, const hasher& h = hasher(),
    const allocator_type& al = allocator_type());
  filter(size_type m, const allocator_type& al);
  filter(size_type n, double fpr, const allocator_type& al);
  filter(
    std::initializer_list<value_type> il,
    size_type m, const allocator_type& al);
  filter(
    std::initializer_list<value_type> il,
    size_type n, double fpr, const allocator_type& al);
  ~filter();
  filter& operator=(const filter& x);
  filter& operator=(filter&& x)
    noexcept(
	  std::allocator_traits<Allocator>::is_always_equal::value ||
      std::allocator_traits<Allocator>::propagate_on_container_move_assignment::value);
  filter& operator=(std::initializer_list<value_type> il);
  allocator_type get_allocator() const noexcept;

  // capacity
  size_type capacity() const noexcept;
  static size_type capacity_for(size_type n, double fpr);
  static double fpr_for(size_type n,size_type m)

  // data access
  boost::span<unsigned char>       array() noexcept;
  boost::span<const unsigned char> array() const noexcept;

  // modifiers
  void insert(const value_type& x);
  template<typename U>
    void insert(const U& x);
  template<typename InputIterator>
    void insert(InputIterator first, InputIterator last);
  void insert(std::initializer_list<value_type> il);

  void swap(filter& x)
    noexcept(std::allocator_traits<Allocator>::is_always_equal::value ||
             std::allocator_traits<Allocator>::propagate_on_container_swap::value);
  void clear() noexcept;
  void reset(size_type m = 0);
  void reset(size_type n, double fpr);

  filter& operator&=(const filter& x);
  filter& operator|=(const filter& x);

  // observers
  hasher hash_function() const;

  // lookup
  bool may_contain(const value_type& x) const;
  template<typename U>
    bool may_contain(const U& x) const;
  template<typename ForwardIterator, typename F>
    void may_contain(ForwardIterator first, ForwardIterator last, F f) const;
};

} // namespace bloom
} // namespace boost

描述

模板参数

T

插入到过滤器的元素的 cv-unqualified 对象类型。

K

每次插入或查找时调用相关子过滤器的次数。K 必须大于零。

子过滤器

一种子过滤器类型，提供用于在过滤器内部数组中设置/检查位的确切算法。子过滤器在对 K 个伪随机选择的数组部分（子数组）执行的每次操作上调用 K 次，子数组的宽度为 used-value-size<Subfilter>。

步长

连续子数组的起始位置之间的字节距离。如果 Stride 指定为零，则实际距离会自动选择为 used-value-size<Subfilter>（非重叠子数组）。否则，Stride 必须不大于 used-value-size<Subfilter>。

哈希

T 上的 Hash 类型。

Allocator

值为 unsigned char 的 Allocator。

内部数组的分配和释放通过提供的分配器的内部副本进行。如果 stride 是 a = alignof(Subfilter::value_type) 的倍数，则数组将按字节对齐到 max(64, a)。

如果 boost::hash_is_avalanching<Hash>::value 为 true 且 sizeof(std::size_t) >= 8，则哈希函数按原样使用；否则，会添加一个位混合后处理阶段，以提高哈希质量，但会增加额外的计算成本。

异常安全保证

除非另有明确说明，所有以非 const 引用接受 boost::bloom::filter 的非 const 成员函数和关联函数都提供基本异常保证，而所有以 const 引用接受 boost::bloom::filter 的 const 成员函数和关联函数都提供强异常保证。

除非另有明确说明，否则任何操作都不会抛出异常，除非该异常是由过滤器的 Hash 或 Allocator 对象（如果有）抛出的。

类型和常量

static constexpr std::size_t stride;

如果 Stride 参数被指定为非零值，则等于 Stride。否则，等于 used-value-size<subfilter>。

static constexpr std::size_t bulk_insert_size;

在批量插入操作中内部使用的块大小。

static constexpr std::size_t bulk_may_contain_size;

在批量 may_contain 操作中内部使用的块大小。

构造函数

默认构造函数

filter();

使用 hasher() 作为哈希函数，并使用 allocator_type() 作为分配器，构造一个空过滤器。

前置条件	`hasher` 和 `allocator_type` 必须是 DefaultConstructible。
后置条件	`capacity() == 0`.

容量构造函数

explicit filter(
  size_type m, const hasher& h = hasher(),
  const allocator_type& al = allocator_type());
filter(
  size_type n, double fpr, const hasher& h = hasher(),
  const allocator_type& al = allocator_type());

使用 h 和 al 的副本作为哈希函数和分配器，分别构造一个空过滤器。

前置条件	`fpr` 的值在 0.0 到 1.0 之间。
后置条件	`capacity() == 0`（如果 `m == 0`），否则 `capacity() >= m`（第一个重载）。 `capacity() == capacity_for(n, fpr)`（第二个重载）。

迭代器范围构造函数

template<typename InputIterator>
  filter(
    InputIterator first, InputIterator last,
    size_type m, const hasher& h = hasher(),
    const allocator_type& al = allocator_type());
template<typename InputIterator>
  filter(
    InputIterator first, InputIterator last,
    size_type n, double fpr, const hasher& h = hasher(),
    const allocator_type& al = allocator_type());

使用 h 和 al 的副本分别作为哈希函数和分配器来构造过滤器，并调用 insert(first, last)。

前置条件	`InputIterator` 是一个 LegacyInputIterator，其解引用的值为可插入到过滤器中的值。 `[first, last)` 是一个有效范围。 `fpr` 的值在 0.0 到 1.0 之间。
后置条件	`capacity() == 0`（如果 `m == 0`），否则 `capacity() >= m`（第一个重载）。 `capacity() == capacity_for(n, fpr)`（第二个重载）。对于 `[first, last)` 中的所有值 `x`，均满足 `may_contain(x)`。

复制构造函数

filter(const filter& x);

使用 x 的内部数组、x.hash_function() 和 std::allocator_traits<Allocator>::select_on_container_copy_construction(x.get_allocator()) 的副本构造过滤器。

后置条件	`*this == x`.

移动构造函数

filter(filter&& x);

构造一个过滤器，将 x 的内部数组转移到 *this，并使用从 x 的哈希函数和分配器分别移动构造而来的哈希函数和分配器。

后置条件	`x.capacity() == 0`.

带分配器的迭代器范围构造函数

template<typename InputIterator>
  filter(
    InputIterator first, InputIterator last,
    size_type m, const allocator_type& al);
template<typename InputIterator>
  filter(
    InputIterator first, InputIterator last,
    size_type n, double fpr, const allocator_type& al);

等同于 filter(first, last, m, hasher(), al)（第一个重载）或 filter(first, last, n, fpr, hasher(), al)（第二个重载）。

分配器构造函数

explicit filter(const allocator_type& al);

使用 hasher() 作为哈希函数，并使用 al 的副本作为分配器，构造一个空过滤器。

前置条件	`hasher` 必须是 DefaultConstructible。
后置条件	`capacity() == 0`.

带分配器的复制构造函数

filter(const filter& x, const allocator_type& al);

使用 x 的内部数组、x.hash_function() 和 al 的副本构造过滤器。

后置条件	`*this == x`.

带分配器的移动构造函数

filter(filter&& x, const allocator_type& al);

如果 al == x.get_allocator()，则将 x 的内部数组转移到 *this，否则使用该数组的副本构造过滤器。新过滤器的哈希函数从 x 的哈希函数移动构造而来，分配器是 al 的副本。

后置条件	`x.capacity() == 0`.

初始化列表构造函数

filter(
  std::initializer_list<value_type> il,
  size_type m, const hasher& h = hasher(),
  const allocator_type& al = allocator_type());
filter(
  std::initializer_list<value_type> il,
  size_type n, double fpr, const hasher& h = hasher(),
  const allocator_type& al = allocator_type());

等同于 filter(il.begin(), il.end(), m, h, al)（第一个重载）或 filter(il.begin(), il.end(), n, fpr, h, al)（第二个重载）。

带分配器的容量构造函数

filter(size_type m, const allocator_type& al);
filter(size_type n, double fpr, const allocator_type& al);

等同于 filter(m, hasher(), al)（第一个重载）或 filter(n, fpr, hasher(), al)（第二个重载）。

带分配器的初始化列表构造函数

filter(
  std::initializer_list<value_type> il,
  size_type m, const allocator_type& al);
filter(
  std::initializer_list<value_type> il,
  size_type n, double fpr, const allocator_type& al);

等同于 filter(il, m, hasher(), al)（第一个重载）或 filter(il, n, fpr, hasher(), al)（第二个重载）。

析构函数

~filter();

释放内部数组，并销毁内部哈希函数和分配器。

赋值

复制赋值

filter& operator=(const filter& x);

令 pocca 为 std::allocator_traits<Allocator>::propagate_on_container_copy_assignment::value。如果 pocca 为 true，则用 x.get_allocator() 的副本替换内部分配器 al。如果 capacity() != x.capacity() 或 pocca && al != x.get_allocator()，则用具有 x.capacity() 容量的新数组替换内部数组。复制 x 的内部数组的值。用 x.hash_function() 的副本替换内部哈希函数。

前置条件	如果 `pocca` 为 `true`，则 `Allocator` 是不抛出异常的 CopyAssignable。 `hasher` 是不抛出异常的 Swappable。
后置条件	`*this == x`.
返回	`*this`.
异常安全性	强异常保证。

移动赋值

filter& operator=(filter&& x)
  noexcept(
    std::allocator_traits<Allocator>::is_always_equal::value ||
    std::allocator_traits<Allocator>::propagate_on_container_move_assignment::value);

令 pocma 为 std::allocator_traits<Allocator>::propagate_on_container_move_assignment::value。如果 pocma 为 true，则用 x.get_allocator() 的副本替换内部分配器。如果 get_allocator() == x.get_allocator()，则将 x 的内部数组转移到 *this；否则，用具有 x.capacity() 容量的新数组替换内部数组，并复制 x 的内部数组的值。用 x.hash_function() 的副本替换内部哈希函数。

前置条件	如果 `pocma` 为 `true`，则 `Allocator` 是不抛出异常的 CopyAssignable。 `hasher` 是不抛出异常的 Swappable。
后置条件	`x.capacity() == 0`.
返回	`*this`.
异常安全性	如指示般不抛出异常，否则具有强异常保证。

初始化列表赋值

filter& operator=(std::initializer_list<value_type> il);

清空过滤器，并插入 il 中的值。

*this.

容量

size_type capacity() const noexcept;

后置条件	`capacity()` 是 `CHAR_BIT` 的倍数。
返回	内部数组的大小（以位为单位）。

容量估算

static size_type capacity_for(size_type n, double fpr);

前置条件	`fpr` 的值在 0.0 到 1.0 之间。
后置条件	`filter(capacity_for(n, fpr)).capacity() == capacity_for(n, fpr)`. `capacity_for(n, 1.0) == 0`.
返回	当插入了 `n` 个不同的元素时，为了达到 `fpr` 的误报率而需要的过滤器的容量估算。

FPR 估算

static double fpr_for(size_type n, size_type m);

后置条件	`fpr_for(n, m)` 在 0.0 和 1.0 之间。 `fpr_for(n, 0) == 1.0`. `fpr_for(0, m) == 0.0`（如果 `m != 0`）。
返回	当 `n` 个不同的元素被插入到容量为 `m` 的过滤器中时，预期的误报率估算。

数据访问

Array

boost::span<unsigned char>       array() noexcept;
boost::span<const unsigned char> array() const noexcept;

后置条件	`array().size() == capacity() / CHAR_BIT`.
返回	内部数组的跨度。

修改器

插入

void insert(const value_type& x);
template<typename U> void insert(const U& x);

如果 capacity() != 0，则从哈希值 hash_function()(x) 中确定性地选择 k * subfilter::k（不一定不同）个内部数组的位，并将它们设置为一。

后置条件	`may_contain(x)`.
异常安全性	强异常保证。
注意	第二个重载仅在 `hasher::is_transparent` 是一个有效的成员类型时才参与重载解析。

迭代器范围插入

template<typename InputIterator>
  void insert(InputIterator first, InputIterator last);

等同于 while(first != last) insert(*first++)。

如果 InputIterator 是 LegacyForwardIterator（C++11 到 C++17）或满足 std::forward_iterator（C++20 及更高版本），则范围 [first, last) 将以 bulk_insert_size 的块大小进行处理，利用内部优化技术来提高相对于逐个元素插入的性能。

前置条件	`InputIterator` 是一个 LegacyInputIterator，其解引用的值为可插入到过滤器中的值。 `[first, last)` 是一个有效范围。

初始化列表插入

void insert(std::initializer_list<value_type> il);

等同于 insert(il.begin(), il.end())。

Swap

void swap(filter& x)
  noexcept(std::allocator_traits<Allocator>::is_always_equal::value ||
           std::allocator_traits<Allocator>::propagate_on_container_swap::value);

令 pocs 为 std::allocator_traits<Allocator>::propagate_on_container_swap::value。交换内部数组和哈希函数与 x 的。如果 pocs 为 true，则交换内部分配器与 x 的。

前置条件	`pocs \|\| get_allocator() == x.get_allocator()`. 如果 `pocs` 为 `true`，则 `Allocator` 是不抛出异常的 Swappable。 `hasher` 是不抛出异常的 Swappable。
异常安全性	不抛出异常。

清空

void clear() noexcept;

将内部数组中的所有位设置为零。

重置

void reset(size_type m = 0);
void reset(size_type n, double fpr);

第一个重载：如果从 m 计算出的预期容量不等于 capacity()，则替换内部数组，并清空过滤器。
第二个重载：等同于 reset(capacity_for(n, fpr))。

前置条件	`fpr` 的值在 0.0 到 1.0 之间。
后置条件	通常情况下，`capacity() >= m`。如果 `m == 0` 或 `m == capacity()` 或 `m == capacity_for(n, fpr)`（对于某些 `n` 和 `fpr`），则 `capacity() == m`。
异常安全性	如果 `m == 0` 或 `capacity_for(n, fpr) == 0`，则不抛出异常，否则具有强异常保证。

按位与合并

filter& operator&=(const filter& x);

如果 capacity() != x.capacity()，则抛出 std::invalid_argument 异常；否则，将内部数组中每个位的更改为该位与 x 中相应位的逻辑 AND 操作的结果。

前置条件	`x` 和 `y` 的 `Hash` 对象等价。
返回	`*this`;
异常安全性	强异常保证。

按位或合并

filter& operator|=(const filter& x);

如果 capacity() != x.capacity()，则抛出 std::invalid_argument 异常；否则，将内部数组中每个位的更改为该位与 x 中相应位的逻辑 OR 操作的结果。

前置条件	`x` 和 `y` 的 `Hash` 对象等价。
返回	`*this`;
异常安全性	强异常保证。

观察器

get_allocator

allocator_type get_allocator() const noexcept;

内部分配器的副本。

hash_function

hasher hash_function() const;

返回	内部哈希函数的副本。

查找

may_contain

bool may_contain(const value_type& x) const;
template<typename U> bool may_contain(const U& x) const;

返回	当且仅当所有由假设的 `insert(x)` 操作选择的位都设置为一，则返回 `true`。
注意	第二个重载仅在 `hasher::is_transparent` 是一个有效的成员类型时才参与重载解析。

批量 may_contain

template<typename ForwardIterator, typename F>
  void may_contain(ForwardIterator first, ForwardIterator last, F f) const;

等同于 for( ; first != last; ++first) f(*first, may_contain(*first))。

范围 [first, last) 以 bulk_may_contain_size 的块大小进行处理，利用内部优化技术来提高相对于逐个元素查找的性能。

前置条件	`ForwardIterator` 是 LegacyForwardIterator（C++11 到 C++17）或满足 `std::forward_iterator`（C++20 及更高版本）。 `ForwardIterator` 解引用得到的值可被逐个元素的 `may_contain` 接受。 `[first, last)` 是一个有效范围。

比较

operator==

template<
  typename T, std::size_t K, typename S, std::size_t B, typename H, typename A
>
bool operator==(
  const filter<T, K, S, B, H, A>& x, const filter<T, K, S, B, H, A>& y);

前置条件	`x` 和 `y` 的 `Hash` 对象等价。
返回	当且仅当 `x.capacity() == y.capacity()` 且 `x` 和 `y` 的内部数组按位相同，则返回 `true`。

operator!=

template<
  typename T, std::size_t K, typename S, std::size_t B, typename H, typename A
>
bool operator!=(
  const filter<T, K, S, B, H, A>& x, const filter<T, K, S, B, H, A>& y);

前置条件	`x` 和 `y` 的 `Hash` 对象等价。
返回	`!(x == y)`.

Swap

template<
  typename T, std::size_t K, typename S, std::size_t B, typename H, typename A
>
void swap(filter<T, K, S, B, H, A>& x, filter<T, K, S, B, H, A>& y)
  noexcept(noexcept(x.swap(y)));

等同于 x.swap(y)。

子过滤器

子过滤器 为 boost::bloom::filter 实现特定的位设置（插入）和位检查（查找）算法。子过滤器在过滤器内部数组的称为子数组的部分上操作。这些子数组的确切宽度在静态上取决于子过滤器类型。

子过滤器的完整接口未公开，因此用户无法提供自己的子过滤器，只能使用库原生提供的。以下是公开可用的接口。

Subfilter::k

结果	一个编译时 `std::size_t` 值，表示每次操作设置/检查的（不一定不同的）位的数量。

typename Subfilter::value_type

结果	一个 cv-unqualified 的、TriviallyCopyable 类型，子过滤器将投影的子数组分配给它。

Subfilter::used_value_size

结果	一个编译时 `std::size_t` 值，表示用于位设置/检查的 `Subfilter::value_type` 的有效部分的大小（假定从内存的最低地址开始）。
后置条件	大于零且不大于 `sizeof(Subfilter::value_type)`。
注意	可选。

used-value-size

template<typename Subfilter>
constexpr std::size_t used-value-size; // exposition only

used-value-size<Subfilter> 是 Subfilter::used_value_size（如果此嵌套常量存在），否则为 sizeof(Subfilter::value_type)。该值为给定子过滤器操作的子数组的有效大小（以字节为单位）。

`<boost/bloom/block.hpp>`

namespace boost{
namespace bloom{

template<typename Block, std::size_t K>
struct block;

} // namespace bloom
} // namespace boost

类模板 `block`

boost::bloom::block — 一个基于整数类型的子过滤器。

提要

// #include <boost/bloom/block.hpp>

namespace boost{
namespace bloom{

template<typename Block, std::size_t K>
struct block
{
  static constexpr std::size_t k = K;
  using value_type               = Block;

  // the rest of the interface is not public

} // namespace bloom
} // namespace boost

描述

模板参数

Block

一个无符号整数类型或一个大小为 2^N 的无符号整数类型数组。

K

每次操作设置/检查的位数。必须大于零。

`<boost/bloom/multiblock.hpp>`

namespace boost{
namespace bloom{

template<typename Block, std::size_t K>
struct multiblock;

} // namespace bloom
} // namespace boost

类模板 `multiblock`

boost::bloom::multiblock — 一个基于整数类型数组的子过滤器。

提要

// #include <boost/bloom/multiblock.hpp>

namespace boost{
namespace bloom{

template<typename Block, std::size_t K>
struct multiblock
{
  static constexpr std::size_t k = K;
  using value_type               = Block[k];

  // the rest of the interface is not public

} // namespace bloom
} // namespace boost

描述

模板参数

Block

一个无符号整数类型或一个大小为 2^N 的无符号整数类型数组。

K

每次操作设置/检查的位数。必须大于零。

设置/检查的 K 位分别位于 Block[K] 数组的不同元素中。

`<boost/bloom/fast_multiblock32.hpp>`

namespace boost{
namespace bloom{

template<std::size_t K>
struct fast_multiblock32;

} // namespace bloom
} // namespace boost

类模板 `fast_multiblock32`

boost::bloom::fast_multiblock32 — multiblock<std::uint32_t, K> 的更快替代方案。

提要

// #include <boost/bloom/fast_multiblock32.hpp>

namespace boost{
namespace bloom{

template<std::size_t K>
struct fast_multiblock32
{
  static constexpr std::size_t k               = K;
  using value_type                             = implementation-defined;

  // might not be present
  static constexpr std::size_t used_value_size = implementation-defined;

  // the rest of the interface is not public

} // namespace bloom
} // namespace boost

描述

模板参数

`K`	每次操作设置/检查的位数。必须大于零。

fast_multiblock32<K> 在统计上等同于 multiblock<std::uint32_t, K>，但在编译时可用时利用选定的 SIMD 技术来提高性能。当前支持：AVX2、小端 Neon、SSE2。非 SIMD 情况会回退到常规的 multiblock。

used-value-size<fast_multiblock32<K>> 为 4 * K。

`<boost/bloom/fast_multiblock64.hpp>`

namespace boost{
namespace bloom{

template<std::size_t K>
struct fast_multiblock64;

} // namespace bloom
} // namespace boost

类模板 `fast_multiblock64`

boost::bloom::fast_multiblock64 — multiblock<std::uint64_t, K> 的更快替代方案。

提要

// #include <boost/bloom/fast_multiblock64.hpp>

namespace boost{
namespace bloom{

template<std::size_t K>
struct fast_multiblock64
{
  static constexpr std::size_t k               = K;
  using value_type                             = implementation-defined;

  // might not be present
  static constexpr std::size_t used_value_size = implementation-defined;

  // the rest of the interface is not public

} // namespace bloom
} // namespace boost

描述

模板参数

`K`	每次操作设置/检查的位数。必须大于零。

fast_multiblock64<K> 在统计上等同于 multiblock<std::uint64_t, K>，但在编译时可用时利用选定的 SIMD 技术来提高性能。当前支持：AVX2。非 SIMD 情况会回退到常规的 multiblock。

used-value-size<fast_multiblock64<K>> 为 8 * K。

未来工作

Boost.Bloom 的评论者和用户提出的许多功能被考虑纳入库的未来版本。

`try_insert`

为了避免插入已存在的元素，我们现在必须这样做

if(!f.may_contain(x)) f.insert(x);

这两次调用可以组合成一次潜在更快的单次操作。

bool res = f.try_insert(x); // returns true if x was not present

已插入元素的数量估计

对于经典的 Bloom 过滤器，可以从数组中设置为一的位数 B 的数量估算实际插入的元素数量 n 为\(B\)为

\(n\approx-\displaystyle\frac{m}{k}\ln\left(1-\displaystyle\frac{B}{m}\right),\)

这可以用于实现成员函数 estimated_size。在撰写本文时，我们不知道如何将该公式扩展到块过滤器和多块过滤器的场景。任何关于此问题的帮助都将不胜感激。

运行时指定 k

当前，每次操作设置的位数 k 是在编译时配置的。可以提供 boost::bloom::filter 的一个变体（或扩展），其中 k 的值在运行时指定，其代价是性能会比静态情况差（初步实验显示执行时间增加了大约 10-20%）。

替代过滤器

我们可以考虑添加其他数据结构，如 uckoo 和 xor 过滤器，它们更节省空间且可能更快。

附录 A: FPR 估计

对于经典的 Bloom 过滤器，在一些简化假设下，理论误报率由下式给出：

\(\text{FPR}(n,m,k)=\left(1 - \left(1 - \displaystyle\frac{1}{m}\right)^{kn}\right)^k \approx \left(1 - e^{-kn/m}\right)^k\)对于较大的\(m\),

其中\(n\)是插入过滤器的元素数量，\(m\)其容量（以位为单位），以及\(k\)每次插入设置的位数（参见此公式的推导）。对于固定的反向负载因子\(c=m/n\)，表达式在

\(k_{\text{opt}}=c\cdot\ln2\)

处达到最小值\(1/2^{k_{\text{opt}}} \approx 0.6185^{c}\)。最优的\(k\)，它必须是一个整数，是\(\lfloor k_{\text{opt}}\rfloor\)或\(\lceil k_{\text{opt}}\rceil\).

在 boost::bloom::filter<T, K, block<Block, K'>> 形式的过滤器的情况下，我们可以从 Putze 等人的方法推导出（近似但非常精确）的公式

\(\text{FPR}_{\text{block}}(n,m,b,k,k')=\left(\displaystyle\sum_{i=0}^{\infty} \text{Pois}(i,nbk/m) \cdot \text{FPR}(i,b,k')\right)^{k},\)

其中

\(\text{Pois}(i,\lambda)=\displaystyle\frac{\lambda^i e^{-\lambda}}{i!}\)

是均值为\(\lambda\)和\(b\)是 Block 的大小（以位为单位）。如果我们使用 multiblock<Block,K'>，我们有

\(\text{FPR}_\text{multiblock}(n,m,b,k,k')=\left(\displaystyle\sum_{i=0}^{\infty} \text{Pois}(i,nbkk'/m) \cdot \text{FPR}(i,b,1)^{k'}\right)^{k}.\)

正如我们在之前所评论的，通常情况下

\(\text{FPR}_\text{block}(n,m,b,k,k') \geq \text{FPR}_\text{multiblock}(n,m,b,k,k') \geq \text{FPR}(n,m,kk'),\)

也就是说，对于每次插入设置相同的位数，块过滤器和多块过滤器的 FPR 比经典过滤器差，但它们会更快。我们有特殊情况

\(\text{FPR}_{\text{block}}(n,m,b,k,1)=\text{FPR}_{\text{multiblock}}(n,m,b,k,1)=\text{FPR}(n,m,k),\)

这仅仅是因为观察到使用 {block|multiblock}<Block, 1> 的行为与经典 Bloom 过滤器完全相同。

我们不知道对于块过滤器和多块过滤器，当 Stride 不是其“自然”大小 used-value-size<Subfilter>（即子数组重叠时）的 FPR，是否存在封闭的简单公式。我们可以使用以下近似（\(s\)= Stride（以位为单位）)

\(\text{FPR}_{\text{block}}(n,m,b,s,k,k')=\left(\displaystyle\sum_{i=0}^{\infty} \text{Pois}\left(i,\frac{n(2b-s)k}{m}\right) \cdot \text{FPR}(i,2b-s,k')\right)^{k},\)
\(\text{FPR}_\text{multiblock}(n,m,b,s,k,k')=\left(\displaystyle\sum_{i=0}^{\infty} \text{Pois}\left(i,\frac{n(2bk'-s)k}{m}\right) \cdot \text{FPR}\left(i,\frac{2bk'-s}{k'},1\right)^{k'}\right)^{k},\)

其中\(b\)带有\(2b-s\)(或\(bk'\)带有\(2bk'-s\)（对于多块过滤器）的替换，是因为哈希位置的窗口由于重叠而扩展。请注意，当\(s\)取其默认值（块为 \(b\)，多块为 \(bk'\)）时，这些公式会简化为非重叠情况。这些近似对于\(k'\)的低值是可接受的，但随着\(k'\)的增长，它们会低估实际的 FPR。一般来说，对于典型的过滤器配置，重叠的使用可以将 FPR 提高（降低）0.6 到 0.9 的因子。

\(\text{FPR}_{\text{block}}(n,m,b,s,k,k')\)and\(\text{FPR}_\text{multiblock}(n,m,b,s,k,k')\)是 boost::filter::fpr_for 的实现所使用的公式。

附录 B: 实现说明

哈希混合

这是我们用于提高哈希函数统计属性的位混合后处理，当它不具有雪崩效应时：

\(m\leftarrow\text{mul}(h,C)\),
\(h'\leftarrow\text{high}(m)\text{ xor }\text{low}(m)\),

其中\(\text{mul}\)表示两个 64 位因子的 128 位乘法，\(\text{high}(m)\)and\(\text{low}(m)\)分别是\(m\)的高低 64 位字，\(C=\lfloor 2^{64}/\varphi \rfloor\)and\(\varphi\)是黄金分割比。

32 位模式

在内部，即使在使用 32 位模式时，我们也总是使用 64 位哈希值，此时用户提供的哈希函数会产生 32 位输出。为了将 32 位哈希值扩展到 64 位，我们使用与上面描述的相同的混合过程。

免除多个哈希函数

直接实现具有\(k\)每次操作的位数\(k\)需要不同且独立的哈希函数\(h_i(x)\)，这会带来显著的性能损失，特别是当对象哈希开销很大（例如字符串）时。 Kirsch 和 Mitzenmacher 展示了如何将此要求放宽到只需要两个不同的哈希函数\(h_1(x)\)and\(h_2(x)\)，它们线性组合为

\(g_i(x)=h_1(x)+ih_2(x).\)

在没有正式证明的情况下，我们甚至将此要求放宽到只有一个初始哈希值\(h_0=h_0(x)\)，其中新值\(h_i\)是从\(h_{i-1}\)通过非常廉价的混合方案计算得出的。在下文中，\(k\), \(k'\)是在 boost::bloom::filter<T, K, {block|multiblock}<Block, K'>> 形式的过滤器中的同名值，\(b\)是 sizeof(Block) * CHAR_BIT，而\(r\)是过滤器中的子数组数量。

子数组位置

为了从\(p\)in\([0,r)\)中生成一个位置（即一个数字\(h_{i-1}\)），而不是直接但昂贵的过程\(p\leftarrow h_{i-1}\bmod r\)，我们采用了 Lemire 的 fastrange 技术

\(m\leftarrow\text{mul}(h_{i-1},r),\)
\(p\leftarrow\lfloor m/2^{64} \rfloor=\text{high}(m).\)

为了使\(p\)与哈希值的进一步使用解耦，我们生成\(h_{i}\)template< class PtrSequence > void transfer( iterator before, typename PtrSequence::iterator object, PtrSequence& from );\(h_{i-1}\)和

\(h_i\leftarrow c \cdot h_{i-1} \bmod 2^{64}=\text{low}(c \cdot h_{i-1}),\)

带有\(c=\text{0xf1357aea2e62a9c5}\)（64 位模式），\(c=\text{0xe817fb2d}\)（32 位模式），来自 Steele 和 Vigna。变换\(h_{i-1} \rightarrow h_i\)是一个简单的线性同余生成器在\(2^{64}\)上。为了使该 MCG 产生长周期，\(h_0\)必须是奇数，因此实现会调整\(h_0\)到\(h_0'= (h_0\text{ or }1)\)，这使得\(h_i\)的最低有效位不适合伪随机化（它总是为一）。

位选择

在子过滤器内部，我们必须生成\(k'\)值\(h_i\)个在范围\([0,b)\)（\(k'\)位的。我们通过依次从\(\log_2b\)位中提取\(h_i\)来做到这一点，而不使用包含其最低有效位（如我们所讨论的，该位总是为一）的部分。如果我们用完了位数（当\(k'> 63/\log_2b\)时发生），我们通过已经描述过的混合过程生成一个新的哈希值\(h_{i+1}\)template< class PtrSequence > void transfer( iterator before, typename PtrSequence::iterator object, PtrSequence& from );\(h_{i}\)。

SIMD 算法

`fast_multiblock32`

使用 AVX2 时，我们通过创建 32 位值的 __m256i 来一次选择最多 8 个位\((x_0,x_1,...,x_7)\)其中每个\(x_i\)由哈希值的不同 5 位部分构造，并由此计算 __m256i\((2^{x_0},2^{x_1},...,2^{x_7})\)使用 _mm256_sllv_epi32。如果需要更多位，我们生成一个新的哈希值，如前面所述，然后重复。

对于小端 Neon，算法类似，但计算以两个 uint32x4_t 并行进行，因为 Neon 没有 256 位寄存器。

在 SSE2 的情况下，我们没有 _mm256_sllv_epi32 的 128 位等价物，因此我们使用以下略有趣的技巧：一个形式为

\(((x_0+127)\cdot 2^{23},(x_1+127)\cdot 2^{23},(x_2+127)\cdot 2^{23},(x_3+127)\cdot 2^{23}),\)

其中每个\(x_i\)存在于\([0,32)\)的 __m128i 可以被 reinterpret_cast 为（即具有相同的二进制表示）__m128（浮点数寄存器）

\((2^{x_0},2^{x_1},2^{x_2},2^{x_3}),\)

，从中可以通过 _mm_cvttps_epi32 获得我们想要的移位 1 的 __m128i。

`fast_multiblock64`

我们只为 AVX2 提供 SIMD 实现，它依赖于两个并行的 __m256i 来生成最多 8 个 64 位移位 1 的值。对于 Neon 和 SSE2，通过 4 个 128 位寄存器进行的模拟比非 SIMD multiblock<uint64_t, K> 慢。

发布说明

Boost 1.90

增加了批量插入和查找模式以提高性能。
使 block、fast_multiblock32 和 fast_multiblock64 的查找实现无分支化，从而带来一些性能提升，特别是对于混合的成功/不成功的查询。

Boost 1.89

初始发布。

致谢

Peter Dimov 和 Christian Mazakas 在开发阶段审查了代码和文档的许多部分。Sam Darwin 提供了 CI 设置和文档构建支持。Braden Ganetsky 贡献了 boost::bloom::filter 的 GDB pretty-printer。

Boost 接受审查于 2025 年 5 月 13 日至 22 日进行。非常感谢 Arnaud Becheler 的专业管理。以下人员参与了审查：Dmitry Arkhipov、David Bien、Claudio DeSouza、Peter Dimov、Vinnie Falco、Alexander Grund、Seth Heeren、Andrzej Krzemieński、Ivan Matek、Christian Mazakas、Rubén Pérez、Kostas Savvidis、Peter Turcan、Tomer Vromen。非常感谢他们所有人提供的非常有用的反馈。

Boost.Bloom 于 2025 年 1 月至 6 月在 Cáceres 和 Oropesa 设计和编写。

版权和许可

关于本文档

根据 Boost Software License, Version 1.0 分发。