Commit Graph

10 Commits

Author SHA1 Message Date
Michael Schellenberger Costa b3504262fe
P0619R4 Removing C++17-Deprecated Features (#380)
Fixes #28 and fixes #478.
2020-02-03 02:55:53 -08:00
Billy O'Neal 94d39ed515
Avoid double strlen for string operator+ and implement P1165R1 (#467)
Resolves GH-53.
Resolves GH-456.

Co-authored by: @barcharcraz
Co-authored by: @ArtemSarmini 

This change adds a bespoke constructor to `basic_string` to handle string concat use cases, removing any EH states we previously emitted in our operator+s and avoiding double strlen in our operator+s.

The EH states problem comes from our old pattern:

```
S operator+(a, b) {
    S result;
    result.reserve(a.size() + b.size()); // throws
    result += a; // throws
    result += b; // throws
    return result;
}
```

Here, the compiler does not know that the append operation can't throw, because it doesn't understand `basic_string` and doesn't know that the `reserve` has made that always safe. As a result, the compiler emitted EH handling code to call `result`'s destructor after each of the `reserve` and `operator+=` calls.

Using a bespoke concatenating constructor avoids these problems because there is only one throwing operation (in IDL0 mode). As expected, this results in a small performance win in all concats from not having to set up EH stuff, and a large performance win for the `const char*` concats from the avoided second `strlen`.
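
As a rough illustration of the idea (not the actual `basic_string` constructor), the sketch below uses a hypothetical `ToyString` type and a hypothetical `concat` helper. It assumes the combined length does not overflow `size_t`. The allocation is the only operation that can throw, and the C string is measured exactly once:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical toy type for illustration only; this is not basic_string.
class ToyString {
public:
    // Concatenating constructor: the allocation is the only operation that can throw.
    ToyString(const char* left, std::size_t left_size, const char* right, std::size_t right_size)
        : size_(left_size + right_size), data_(new char[size_ + 1]) {
        std::memcpy(data_.get(), left, left_size);               // cannot throw
        std::memcpy(data_.get() + left_size, right, right_size); // cannot throw
        data_[size_] = '\0';
    }

    std::size_t size() const noexcept { return size_; }
    const char* c_str() const noexcept { return data_.get(); }

private:
    std::size_t size_;             // declared first so it is initialized before data_
    std::unique_ptr<char[]> data_; // owns size_ + 1 characters
};

// Hypothetical helper: measures the C string once and passes the length along.
ToyString concat(const ToyString& left, const char* right) {
    assert(right != nullptr);
    const std::size_t right_size = std::strlen(right); // the only strlen
    return ToyString(left.c_str(), left.size(), right, right_size);
}
```

Because nothing after the allocation can throw, there is no partially built object that needs cleanup on an exceptional path, which is the property the paragraph above relies on.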

Performance:

```
#include <benchmark/benchmark.h>
#include <stdint.h>
#include <string>

constexpr size_t big = 2 << 12;
constexpr size_t multiplier = 64;

static void string_concat_string(benchmark::State &state) {
    std::string x(static_cast<size_t>(state.range(0)), 'a');
    std::string y(static_cast<size_t>(state.range(1)), 'b');
    for (auto _ : state) {
        (void)_;
        benchmark::DoNotOptimize(x + y);
    }
}

BENCHMARK(string_concat_string)->RangeMultiplier(multiplier)->Ranges({{2, big}, {2, big}});

static void string_concat_ntbs(benchmark::State &state) {
    std::string x(static_cast<size_t>(state.range(0)), 'a');
    std::string yBuf(static_cast<size_t>(state.range(1)), 'b');
    const char *const y = yBuf.c_str();
    for (auto _ : state) {
        (void)_;
        benchmark::DoNotOptimize(x + y);
    }
}

BENCHMARK(string_concat_ntbs)->RangeMultiplier(multiplier)->Ranges({{2, big}, {2, big}});

static void string_concat_char(benchmark::State &state) {
    std::string x(static_cast<size_t>(state.range(0)), 'a');
    for (auto _ : state) {
        (void)_;
        benchmark::DoNotOptimize(x + 'b');
    }
}

BENCHMARK(string_concat_char)->Range(2, big);

static void ntbs_concat_string(benchmark::State &state) {
    std::string xBuf(static_cast<size_t>(state.range(0)), 'a');
    const char *const x = xBuf.c_str();
    std::string y(static_cast<size_t>(state.range(1)), 'b');
    for (auto _ : state) {
        (void)_;
        benchmark::DoNotOptimize(x + y);
    }
}

BENCHMARK(ntbs_concat_string)->RangeMultiplier(multiplier)->Ranges({{2, big}, {2, big}});

static void char_concat_string(benchmark::State &state) {
    std::string x(static_cast<size_t>(state.range(0)), 'a');
    for (auto _ : state) {
        (void)_;
        benchmark::DoNotOptimize('b' + x);
    }
}

BENCHMARK(char_concat_string)->Range(2, big);

BENCHMARK_MAIN();

```

Times are in nanoseconds on a Ryzen Threadripper 3970X; improvements are computed as `((Old/New)-1)*100`.

|                                 | old x64 | new x64 | improvement | old x86 | new x86 | improvement |
| ------------------------------- | ------- | ------- | ----------- | ------- |-------- | ----------- |
| string_concat_string/2/2        | 12.8697 | 5.78125 |     122.61% | 13.9029 | 11.0696 |      25.60% |
| string_concat_string/64/2       |  62.779 | 61.3839 |       2.27% | 66.4394 | 61.6296 |       7.80% |
| string_concat_string/4096/2     | 125.558 | 124.512 |       0.84% | 124.477 | 117.606 |       5.84% |
| string_concat_string/8192/2     | 188.337 | 184.152 |       2.27% | 189.982 | 185.598 |       2.36% |
| string_concat_string/2/64       | 64.5229 | 64.1741 |       0.54% | 67.1338 | 61.4962 |       9.17% |
| string_concat_string/64/64      | 65.5692 | 59.9888 |       9.30% | 66.7742 | 60.4781 |      10.41% |
| string_concat_string/4096/64    | 122.768 | 122.768 |       0.00% | 126.774 | 116.327 |       8.98% |
| string_concat_string/8192/64    |  190.43 | 181.362 |       5.00% | 188.516 | 186.234 |       1.23% |
| string_concat_string/2/4096     | 125.558 | 119.978 |       4.65% | 120.444 | 111.524 |       8.00% |
| string_concat_string/64/4096    | 125.558 | 119.978 |       4.65% | 122.911 | 117.136 |       4.93% |
| string_concat_string/4096/4096  | 188.337 | 184.152 |       2.27% | 193.337 | 182.357 |       6.02% |
| string_concat_string/8192/4096  | 273.438 | 266.811 |       2.48% | 267.656 | 255.508 |       4.75% |
| string_concat_string/2/8192     | 205.078 | 194.964 |       5.19% | 175.025 | 170.181 |       2.85% |
| string_concat_string/64/8192    | 205.078 | 188.337 |       8.89% | 191.676 |  183.06 |       4.71% |
| string_concat_string/4096/8192  | 266.811 | 256.696 |       3.94% | 267.455 | 255.221 |       4.79% |
| string_concat_string/8192/8192  |  414.69 | 435.965 |      -4.88% | 412.784 |  403.01 |       2.43% |
| string_concat_ntbs/2/2          | 12.8348 |  5.9375 |     116.17% |   14.74 |  11.132 |      32.41% |
| string_concat_ntbs/64/2         | 71.1496 |  59.375 |      19.83% | 70.6934 | 60.9371 |      16.01% |
| string_concat_ntbs/4096/2       | 128.697 | 114.397 |      12.50% | 126.626 | 121.887 |       3.89% |
| string_concat_ntbs/8192/2       | 194.964 | 176.479 |      10.47% | 196.641 |  186.88 |       5.22% |
| string_concat_ntbs/2/64         | 100.446 |  74.986 |      33.95% | 109.082 | 83.3939 |      30.80% |
| string_concat_ntbs/64/64        | 106.027 | 78.4738 |      35.11% | 109.589 | 84.3635 |      29.90% |
| string_concat_ntbs/4096/64      | 164.969 | 138.114 |      19.44% | 165.417 | 142.116 |      16.40% |
| string_concat_ntbs/8192/64      | 224.958 | 200.195 |      12.37% | 228.769 | 200.347 |      14.19% |
| string_concat_ntbs/2/4096       | 2040.32 | 1074.22 |      89.94% | 2877.33 | 1362.74 |     111.14% |
| string_concat_ntbs/64/4096      | 1994.98 | 1074.22 |      85.71% | 2841.93 | 1481.62 |      91.81% |
| string_concat_ntbs/4096/4096    | 2050.78 | 1147.46 |      78.72% | 2907.78 | 1550.82 |      87.50% |
| string_concat_ntbs/8192/4096    | 2148.44 | 1227.68 |      75.00% | 2966.92 | 1583.78 |      87.33% |
| string_concat_ntbs/2/8192       | 3934.14 | 2099.61 |      87.37% | 5563.32 | 2736.56 |     103.30% |
| string_concat_ntbs/64/8192      | 3989.95 | 1994.98 |     100.00% | 5456.84 | 2823.53 |      93.26% |
| string_concat_ntbs/4096/8192    | 4049.24 | 2197.27 |      84.29% | 5674.02 | 2957.04 |      91.88% |
| string_concat_ntbs/8192/8192    | 4237.58 | 2249.58 |      88.37% | 5755.07 | 3095.65 |      85.91% |
| string_concat_char/2            | 12.8348 | 3.44936 |     272.09% | 11.1104 | 10.6976 |       3.86% |
| string_concat_char/8            | 8.99833 | 3.45285 |     160.61% | 11.1964 | 10.6928 |       4.71% |
| string_concat_char/64           | 65.5692 | 60.9375 |       7.60% | 65.7585 | 60.0182 |       9.56% |
| string_concat_char/512          | 72.5446 | 69.7545 |       4.00% |  83.952 | 79.5254 |       5.57% |
| string_concat_char/4096         | 125.558 | 119.978 |       4.65% | 123.475 | 117.103 |       5.44% |
| string_concat_char/8192         |  190.43 | 187.988 |       1.30% | 189.181 | 185.174 |       2.16% |
| ntbs_concat_string/2/2          | 13.4975 | 6.13839 |     119.89% | 14.8623 |   11.09 |      34.02% |
| ntbs_concat_string/64/2         |  104.98 | 79.5201 |      32.02% | 112.207 | 83.7111 |      34.04% |
| ntbs_concat_string/4096/2       | 2085.66 | 1098.63 |      89.84% | 2815.19 | 1456.08 |      93.34% |
| ntbs_concat_string/8192/2       | 3899.27 | 2099.61 |      85.71% | 5544.52 | 2765.16 |     100.51% |
| ntbs_concat_string/2/64         | 71.4983 |  62.779 |      13.89% | 72.6602 | 63.1953 |      14.98% |
| ntbs_concat_string/64/64        |  104.98 | 80.2176 |      30.87% | 111.073 | 81.8413 |      35.72% |
| ntbs_concat_string/4096/64      | 2085.66 | 1074.22 |      94.16% | 2789.73 |  1318.7 |     111.55% |
| ntbs_concat_string/8192/64      | 3989.95 | 2085.66 |      91.30% | 5486.85 | 2693.83 |     103.68% |
| ntbs_concat_string/2/4096       | 136.719 | 128.348 |       6.52% | 122.605 |  114.44 |       7.13% |
| ntbs_concat_string/64/4096      | 167.411 | 142.997 |      17.07% | 168.572 | 138.566 |      21.65% |
| ntbs_concat_string/4096/40      | 2099.61 | 1171.88 |      79.17% | 2923.85 | 1539.02 |      89.98% |
| ntbs_concat_string/8192/40      | 4098.07 | 2246.09 |      82.45% | 5669.34 | 3005.25 |      88.65% |
| ntbs_concat_string/2/8192       |   213.1 | 199.498 |       6.82% | 178.197 | 168.532 |       5.73% |
| ntbs_concat_string/64/8192      | 223.214 | 214.844 |       3.90% | 232.263 | 203.722 |      14.01% |
| ntbs_concat_string/4096/81      | 2148.44 | 1255.58 |      71.11% | 2980.78 | 1612.97 |      84.80% |
| ntbs_concat_string/8192/81      | 4237.58 | 2406.53 |      76.09% | 5775.55 | 3067.94 |      88.25% |
| char_concat_string/2            | 11.1607 | 3.60631 |     209.48% | 11.2101 | 10.7192 |       4.58% |
| char_concat_string/8            | 11.4746 | 3.52958 |     225.10% | 11.4595 |  10.709 |       7.01% |
| char_concat_string/64           | 65.5692 | 66.9643 |      -2.08% | 66.6272 | 60.8601 |       9.48% |
| char_concat_string/512          | 68.0106 | 73.2422 |      -7.14% | 91.1946 | 83.0791 |       9.77% |
| char_concat_string/4096         | 125.558 | 122.768 |       2.27% | 119.432 | 110.031 |       8.54% |
| char_concat_string/8192         | 199.498 | 199.498 |       0.00% | 171.895 | 169.173 |       1.61% |

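For reference, the improvement columns can be reproduced from the old and new timings with a small helper. The function name `improvement_percent` is made up for this sketch; the formula is the `((Old/New)-1)*100` one stated with the table:

```cpp
#include <cassert>

// Percentage improvement as defined with the table: ((Old / New) - 1) * 100.
// old_ns and new_ns are timings in nanoseconds; new_ns must be positive.
double improvement_percent(double old_ns, double new_ns) {
    assert(new_ns > 0.0);
    return ((old_ns / new_ns) - 1.0) * 100.0;
}
```

For the `string_concat_string/2/2` x64 row, `improvement_percent(12.8697, 5.78125)` gives about 122.61. Under this definition, 100% means the new code takes half the time, and a negative value means the new code is slower.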

Code size:
```
#include <string>

std::string strings(const std::string& a, const std::string& b) {
    return a + b;
}
std::string string_ntbs(const std::string& a, const char * b) {
    return a + b;
}
std::string string_char(const std::string& a, char b) {
    return a + b;
}
std::string ntbs_string(const char * a, const std::string& b) {
    return a + b;
}
std::string char_string(char a, const std::string& b) {
    return a + b;
}
```

Sizes are in bytes for the `.obj` produced by `cl /EHsc /W4 /WX /c /O2 .\code_size.cpp`; "Times Original" is New/Old:

| Bytes | Before | After  | Times Original |
| ----- | ------ | ------ | -------------- |
| x64   | 70,290 | 34,192 |          0.486 |
| x86   | 47,152 | 28,792 |          0.611 |
2020-01-31 16:45:39 -08:00
Billy O'Neal 3447e56030 Implement constexpr algorithms. (#425)
* Implement constexpr algorithms.

Resolves GH-6 ( P0202R3 ), resolves GH-38 ( P0879R0 ), and drive-by fixes GH-414.

Everywhere: Add constexpr, _CONSTEXPR20, and _CONSTEXPR20_ICE to things.

skipped_tests.txt: Turn on all tests previously blocked by missing constexpr algorithms (and exchange and swap). Mark those algorithms that cannot be turned on that we have outstanding PRs for with their associated PRs.
yvals_core.h: Turn on feature test macros.
xutility:
* Move the _Ptr_cat family down to copy, and fix associated SHOUTY comments to indicate that this is really an implementation detail of copy, not something the rest of the standard library intends to use directly. Removed and clarified some of the comments as requested by Casey Carter.
* Extract _Copy_n_core which implements copy_n using only the core language (rather than memcpy-as-an-intrinsic). Note that we cannot use __builtin_memcpy or similar to avoid the is_constant_evaluated check here; __builtin_memcpy only works in constexpr contexts when the inputs are of type char.
numeric: Refactor as suggested by GH-414.

* Attempt alternate fix of GH-414 suggested by Stephan.

* Stephan product code PR comments:

* _Swap_ranges_unchecked => _CONSTEXPR20
* _Idl_dist_add => _NODISCARD (and remove comments)
* is_permutation => _NODISCARD
* Add yvals_core.h comments.

* Delete unused _Copy_n_core and TRANSITION, DevCom-889321 comment.

* Put the comments in the right place and remove phantom braces.
2020-01-22 17:57:27 -08:00
Daniel Marshall 48c7f31413 <utility> Deprecate std::rel_ops & resolve #403 (#402)
Co-authored-by: Casey Carter <cartec69@gmail.com>
Co-authored-by: Billy O'Neal <billy.oneal@gmail.com>
2020-01-08 19:16:40 -08:00
Casey Carter 1e8b8d4eef
[range.iter.ops], default_sentinel, and unreachable_sentinel (#329)
Implements iterator primitive operations `std::ranges::advance`, `std::ranges::next`, `std::ranges::prev`, and `std::ranges::distance`; as well as `std::default_sentinel` and `std::unreachable_sentinel`.

This change reworks the STL's iterator unwrapping machinery to enable unwrapping of C++20 move-only single-pass iterators (and `if constexpr`s all the things). Consequently, `_Iter_ref_t`, `_Iter_value_t`, and `_Iter_diff_t` resolve to `iter_reference_t`, `iter_value_t`, and `iter_difference_t` (respectively) in `__cpp_lib_concepts` (soon to be C++20) mode. This change necessitates some fixes to `unique_copy` and `_Fill_memset_is_safe` which both assume that `_Iter_value_t<T>` is well-formed for any iterator `T`. (`iter_value_t<T>` does not have that property: it is only well-formed when `readable<T>`.)

I notably haven't unified `default_sentinel_t` with `_Default_sentinel` out of an abundance of paranoia. Our `move_iterator` is comparable with `_Default_sentinel`, which is not the case for `std::default_sentinel`.

Drive-by:
* This change `if constexpr`-izes `unique_copy`.
2019-12-02 15:32:14 -08:00
Daniel Marshall 28ec9a3295 P1209R0 erase_if(), erase() (#236)
Resolves #55.

* Deprecate experimental::erase

* Implement P1209R0

* Update yvals_core.h

* Update deprecations and remove <experimental/xutility>

* move and reorder erase/erase_if

* moved _Erase and remove and friends to <xmemory>

* Consistently place erase_if() definitions.
2019-11-01 14:32:39 -07:00
Nathan Ward 379e61781a P0356R5 bind_front() (#158)
Resolves #13.
2019-10-22 17:15:35 -07:00
Stephan T. Lavavej 712b7971bd
Update comments to follow custom autolink syntax (#168)
* Use custom autolinks.

* Also update .clang-format.

* Use ArchivedOS.
2019-10-11 13:43:06 -07:00
Casey Carter c5e2e3f799 Ranges <range> machinery
* Implements a selection of support machinery in `std::ranges` and adds the `<ranges>` header. Primarily consists of the range access customization point objects:
  * `ranges::begin`
  * `ranges::end`
  * `ranges::cbegin`
  * `ranges::cend`
  * `ranges::rbegin`
  * `ranges::rend`
  * `ranges::crbegin`
  * `ranges::crend`
  * `ranges::size`
  * `ranges::empty`
  * `ranges::data`
  * `ranges::cdata`

  and range concepts:

  * `ranges::range`
  * `ranges::output_range`
  * `ranges::input_range`
  * `ranges::forward_range`
  * `ranges::bidirectional_range`
  * `ranges::random_access_range`
  * `ranges::contiguous_range`
  * `ranges::sized_range`
  * `ranges::view`
  * `ranges::common_range`

  and the associated type aliases:

  * `ranges::iterator_t`
  * `ranges::sentinel_t`
  * `ranges::range_value_t`
  * `ranges::range_reference_t`
  * `ranges::range_difference_t`
  * `ranges::range_rvalue_reference_t`

* Adds `<ranges>` - which is mostly empty since the support machinery is defined in `<xutility>` so as to be visible to `<algorithm>`

* Annotates [P0896R4](https://wg21.link/p0896r4) as partially implemented in the "`_HAS_CXX20` directly controls" section of `<yvals_core.h>`.

* Touches `<regex>`, `<set>`, and `<unordered_set>` to add partial specializations of `ranges::enable_view` for `match_results` and `(unordered_)?multi?set` as mandated by the WD to override the heuristic.

* Partially implements [P1474R1 "Helpful pointers for `ContiguousIterator`"](https://wg21.link/p1474r1):
  * Push `pointer_traits` from `<xmemory>` and `to_address` from `<memory>` up into `<xutility>`
  * Add `to_address` expression requirement to `contiguous_iterator` concept, and update `P0896R4_ranges_iterator_machinery` appropriately
  * Implement the changes to `ranges::data` (but not `view_interface` since it isn't yet implemented)

* Drive-by:
  * Simplify the definition of `pointer_traits::_Reftype` by eschewing `add_lvalue_reference_t`.
  * Strengthen `reverse_iterator`'s constructors and `make_reverse_iterator` so `ranges::rbegin` and `ranges::rend` can be `noexcept` in more cases
  * Since we're using `_Rng` as the template parameter name for models of `std::ranges::range`, rename a local variable `_Rng` to `_Generator` in `<random>`
2019-09-09 15:31:26 -07:00
Stephan T. Lavavej 219514876e Initial commit. 2019-09-04 15:57:56 -07:00