C++ 技巧

计算 $\mathrm{highbit}$

unsigned int highbit(unsigned int x)
{
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    return x - (x >> 1);
}

$2$ 到 $7$ 行是将数字 $x$ 变成 $\mathrm{highbit}(x) \times 2 - 1$ 的格式。
例子： $00110110 \Rightarrow 00111111$ 。
再通过 $x - \frac{x}{2}$ 得到 $\mathrm{highbit}$ 。

`bitset` 快速输出数字二进制

void print_binary(unsigned int x)
{
    std::cout << std::bitset<32>(x);
}

`emplace` 系列

使用 emplace 通常会比 push 之类的函数快一点，前者省去了一个拷贝过程。

lambda 函数递归

类似函数式编程的 yc

auto f = [](auto &f, int i) -> int {
    if (i <= 1) return 1;
    else return f(f, i - 1) + f(f, i - 2);
}

迭代器技巧

取末尾元素

非常 pythonic

std::vector a{0, 1, 2, 3, 4};
a[a.size() - 1] == 4;       // true
a.rbegin()[0] == 4;         // true
a.end()[-1] == 4;           // true
std::end(a)[-1] == 4;       // true

逆序排序

std::sort(a.begin(), a.end()); // 顺序
std::sort(a.rbegin(), a.rend()); // 逆序

std::ranges

C++20 带来的十分强大的功能，算是对迭代器之类东西的补充。

cppreference.com: 范围库 (C++20)

对于参数含 [begin, end) 的迭代器对的函数，通常都有它的 ranges 版本：

std::sort(a.begin(), a.end());
std::ranges::sort(a);

std::unique(a.begin(), a.end());
std::ranges::unique(a);

std::lower_bound(a.begin(), a.end(), x);
std::ranges::lower_bound(a, x);

就相当于每处 .begin()、.end() 都可以节省 8 个非空白字符。

值得注意的是，min，max 也可以这样：

*std::max_element(a.begin(), a.end());
std::ranges::max(a);

其他没用过，其他管道、过滤器啥的好像挺高级。

更多算法

std::fill

cppreference.com: std::fill

与 memset 相比，优点在于可以任意设置填充的值，std::fill(a.begin(), a.end(), 114514)，或 std::ranges::fill(a, 114514)，更加人性化。

性能上，不开 O2 与手写 for 循环填充相当，开了 O2 都差不多。

#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <ctime>

uint8_t a[1 << 20];

int main()
{
    auto t1 = clock();
    for (int i = 0; i < (1 << 10); i++) {
        std::memset(a, 0, sizeof(a));
    }
    auto t2 = clock();
    printf("memset, filled with %u:  %lf\n", a[0], (double)(t2 - t1) / CLOCKS_PER_SEC);
    for (int i = 0; i < (1 << 10); i++) {
        std::fill(a, a + (1 << 20), 0);
    }
    auto t3 = clock();
    printf("std::fill, filled with %u: %lf\n", a[0], (double)(t3 - t2) / CLOCKS_PER_SEC);
    for (int i = 0; i < (1 << 10); i++) {
        for (int j = 0; j < (1 << 20); j++) {
            a[j] = 0;
        }
    }
    auto t4 = clock();
    printf("for, filled with %u: %lf\n", a[0], (double)(t4 - t3) / CLOCKS_PER_SEC);
}

在我的电脑上的结果：

无优化（-O0）：

memset, filled with 0:  0.022000
std::fill, filled with 0: 2.298235
for, filled with 0: 2.282090

优化（-O2)：

memset, filled with 0:  0.023301
std::fill, filled with 0: 0.026282
for, filled with 0: 0.022732

堆

cppreference.com: std::make_heap

这里的堆不是 std::priority_queue，而是利用 std::make_heap、std::push_heap、 std::pop_heap 等函数实现的堆。

个人认为优点在于建堆是 $O(n)$ 的，而建 std::priority_queue 只能是 $O(n \log n)$ 的。

update: std::priority_queue 可以 $O(n)$ 建堆，使用其构造函数。这个东西我不知道有什么优势。

注意用法：

std::vector<int> a{9, 9, 8, 2, 4, 4, 3, 5, 3};
std::make_heap(a.begin(), a.end()); // O(n) 建堆，等同于 std::ranges::make_heap(a);

a.emplace_back(6);
std::push_heap(a.begin(), a.end()); // 将 a.back() 加到堆中，等同于 std::ranges::push_heap(a);

std::pop_heap(a.begin(), a.end()); // 将 a.front() 与堆尾交换，并维护堆，等同于 std::ranges::pop_heap(a);
a.pop_back();

std::iota

cppreference.com: std::iota

从某个值开始，依次自增填充数组。

std::vector<int> a(10);
std::iota(a.begin(), a.end(), 0); // a = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

注意：std::ranges::iota 是 C++23 才加入的~~漏网之鱼~~。

std::accumulate

cppreference.com: std::accumulate

给定初始值，求给定范围的和。注意：返回类型由初值决定。

std::vector<int> a{3, 1, 4, 1, 5}
int sum = std::accumulate(a.begin(), a.end(), 0); // 14
int proc = std::accumulate(a.begin(), a.end(), 1, std::multiplies<int>()); // 60

bit

注意：不是 bits。

cppreference.com: 位操纵

C++20 的新东西，用于二进制位的操作。通常来说比手写快（编程效率和执行效率）。

以下的 x 均为 unsigned。

x != 0 && x == lowbit(x);
std::has_single_bit(x);

highbit(x + 1); // 就是开头那个 highbit
std::bit_ceil(x);

x <= 1 ? 1 : highbit(x - 1) * 2;
std::bit_ceil(x);

x == 0 ? 1 : highbit(x);
std::bit_floor(x);

x == 0 ? 0 : (int)std::log2(x) + 1;
std::bit_width(x);

__builtin_clz(x); // count leading zero
std::countl_zero(x); // count left zero

__buildin_ctz(x); // count trailing zero
std::countr_zero(x); // count right zero

__builtin_clz(~x);
std::countl_one(x);

__builtin_ctz(~x);
std::countr_one(x);

__builtin_popount(x);
std::popcount(x);

比较

三路比较

C++20 引入的新东西，主要用于简写多个比较运算符。

cppreference.com: 三路比较

例子：

template<int P> struct modint 
{
    int val;
    modint(int x) {
        val = (x % P + P) % P;
    }
    // ...
    auto operator<=>(const modint &x) const {
        return val <=> x.val;
    }
};

`std::greater<>` 等

C++14 之后，所有的这种比较类都有了这种特化，当你不在模板参数中填写内容，或者填写 void 的时候，可用于任意两个类型的比较。大概长这样：

template< class T, class U >

constexpr auto operator()( T&& lhs, U&& rhs ) const
    -> decltype(std::forward<T>(lhs) > std::forward<U>(rhs));

这意味着你在大多是需要传这种比较类的地方可以这样写了：

std::priority_queue<std::pair<int, int>, std::vector<int, int>, std::greater<>>
std::sort(a.begin(), a.end(), std::greater<>());

透明比较器

见用于关联式容器的透明比较器技巧。

计算 highbit\mathrm{highbit}highbit

bitset 快速输出数字二进制

emplace 系列