@royjacobson

std::unordered_map::emplace typically allocates a node holding a std::pair<const Key, Value> on the heap before it checks for a duplicate, even if the key is already in the map (the standard permits this, and common standard library implementations do it). This patch replaces the call to std::unordered_map::emplace with a call to std::unordered_map::find, followed by a call to std::unordered_map::emplace only when the object is not registered yet. In practice, the object is in the versions map on every call except the first, so the additional lookup on that first call is not significant.
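For readers who want to see the pattern in isolation, here is a minimal sketch of the two lookup strategies. It is not cereal's actual code: `VersionMap`, `lookup_with_emplace`, and `lookup_with_find` are hypothetical names, and the map from a type hash to a version number is an assumption made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a registry mapping a type hash to a class version.
using VersionMap = std::unordered_map<std::size_t, std::uint32_t>;

// Direct emplace: returns the stored version, inserting it if missing.
// Many implementations build the node (a heap allocation) before they
// discover that the key already exists, then throw the node away.
inline std::uint32_t lookup_with_emplace(VersionMap& m, std::size_t hash,
                                         std::uint32_t version) {
  return m.emplace(hash, version).first->second;
}

// Find first, emplace only on a miss: the common "already registered"
// case performs a single lookup and no allocation.
inline std::uint32_t lookup_with_find(VersionMap& m, std::size_t hash,
                                      std::uint32_t version) {
  auto it = m.find(hash);
  if (it != m.end()) {
    return it->second;
  }
  return m.emplace(hash, version).first->second;
}
```

Both functions return the same value for the same inputs; only the cost of the hit path differs. The miss path of `lookup_with_find` does two lookups, which matters little if a miss happens once per type.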

In some cases these allocations slowed cereal::detail::Versions::find considerably. For example, the serialization time of the attached benchmark dropped by about 37% after this change (from about 2.02 ms to about 1.28 ms per iteration).

#include <sstream>
#include <vector>

#include <benchmark/benchmark.h>
#include "cereal/cereal.hpp"
#include "cereal/archives/binary.hpp"
#include "cereal/types/vector.hpp"

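struct A {
  // The second (version) parameter makes this a versioned serialize function,
  // so serializing each object exercises the version lookup path.
  template <class Archive>
  void serialize(Archive& ar, int /*version*/) { ar & x; }
  int x;
};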

static void BM_serialize_with_version(benchmark::State& state) {
  std::vector<A> a(10000);
  for (auto _ : state) {
    std::stringstream ss;
    cereal::BinaryOutputArchive oarchive(ss);
    oarchive(a);
  }
}

BENCHMARK(BM_serialize_with_version);
BENCHMARK_MAIN();

Before:

--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_serialize_with_version    2023361 ns      2021515 ns          348

After:

--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_serialize_with_version    1279180 ns      1276762 ns          549

@AzothAmmo AzothAmmo added this to the v1.3.3 milestone Apr 24, 2023
@AzothAmmo
Contributor

Will be merged after I sort out CI.
