- Dec 10, 2021
-
-
Florian Fischer authored
-
Florian Fischer authored
Waitfree work stealing is configured with the meson option 'waitfree_work_stealing'.

The retry logic is intentionally left in the queues and not lifted to the scheduler, so that the value loaded by an unsuccessful CAS can be reused.

Consider the following pseudo code examples:

    steal() -> bool:                         steal() -> res
      load                                     load
    loop:                                      if empty return EMPTY
      if empty return EMPTY                    cas
      cas                                      return cas ? STOLEN : LOST_RACE
      if not WAITFREE and not cas:
        goto loop                            outer():
      return cas ? STOLEN : LOST_RACE        loop:
                                               res = steal()
    outer():                                   if not WAITFREE and res == LOST_RACE:
      steal()                                    goto loop

In the right example the value loaded by a possibly unsuccessful CAS cannot be reused, and a loop of unsuccessful CAS attempts results in double loads.

The retries are configurable through a template variable maxRetries.
* maxRetries < 0: retry indefinitely
* maxRetries >= 0: retry at most maxRetries times
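The left variant (retry inside the queue) can be sketched in C++ roughly as follows. This is a minimal illustration, not EMPER's actual queue: the single-counter "queue", the function name steal and the exact meaning of maxRetries >= 0 (give up after that many retries) are assumptions made for the example. The point it shows is that a failed compare-exchange already hands back the current value, so the retry needs no second load.

```cpp
#include <atomic>
#include <cstdint>

enum class StealingResult { Empty, LostRace, Stolen };

// Hypothetical stand-in for a work-stealing queue: `items` only counts
// how many elements could still be stolen.
// maxRetries < 0  -> retry until the CAS succeeds or the queue is empty
// maxRetries >= 0 -> give up with LostRace after maxRetries retries
template <int maxRetries>
StealingResult steal(std::atomic<uint32_t>& items) {
  // The only explicit load of the shared state.
  uint32_t observed = items.load(std::memory_order_acquire);
  int retries = 0;
  for (;;) {
    if (observed == 0) return StealingResult::Empty;
    // On failure compare_exchange_strong stores the current value in
    // `observed`, so the next iteration reuses it instead of loading again.
    if (items.compare_exchange_strong(observed, observed - 1,
                                      std::memory_order_acq_rel,
                                      std::memory_order_acquire)) {
      return StealingResult::Stolen;
    }
    if (maxRetries >= 0 && retries++ >= maxRetries) {
      return StealingResult::LostRace;
    }
  }
}
```

With maxRetries set to 0 the function performs exactly one CAS and is therefore wait-free; a caller-side retry loop would instead have to call steal() again and pay for a fresh load each time.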
-
- Dec 09, 2021
-
-
Florian Fischer authored
-
Florian Schmaus authored
Multiple changes to improve IO stealing See merge request !288
-
- Dec 08, 2021
-
-
Florian Fischer authored
This has the benefit of adequately sized intermediate arrays, reducing the needed stack size.
-
Florian Schmaus authored
[gitlab-ci] Cache subprojects/packagecache See merge request !287
-
Florian Fischer authored
-
Florian Fischer authored
-
Florian Fischer authored
This removes the rather expensive (reported by perf) initialization of the std::arrays.
-
Florian Fischer authored
To distinguish the outcomes of the waitfree reap attempt, a new enum StealingResult::{Empty, LostRace, Stolen} is introduced.
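A caller can use the three outcomes to tell "nothing there" apart from "something was there but another worker took it". The sketch below is only an illustration of that idea: the enum values come from the commit message, while StealSummary, tryVictims and the use of std::function to model a victim are hypothetical.

```cpp
#include <functional>
#include <vector>

enum class StealingResult { Empty, LostRace, Stolen };

// Hypothetical summary of one pass over all victims.
struct StealSummary {
  bool stolen = false;         // some attempt returned Stolen
  bool sawContention = false;  // some attempt returned LostRace
};

// Each victim is modelled as a callable performing one waitfree attempt.
StealSummary tryVictims(
    const std::vector<std::function<StealingResult()>>& victims) {
  StealSummary summary;
  for (const auto& attempt : victims) {
    switch (attempt()) {
      case StealingResult::Stolen:
        summary.stolen = true;
        return summary;  // stop at the first success
      case StealingResult::LostRace:
        summary.sawContention = true;  // work existed, someone else won
        break;
      case StealingResult::Empty:
        break;  // nothing to take from this victim
    }
  }
  return summary;
}
```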
-
Florian Schmaus authored
Fix includes (as reported by IWYU 0.17) and update CI container image See merge request !285
-
Florian Schmaus authored
-
- Dec 06, 2021
-
-
Florian Schmaus authored
[meson] set check_anywhere_queue_while_steal automatic See merge request !286
-
Florian Fischer authored
We introduced the check_anywhere_queue_while_steal configuration as an optimization to get the IO completions reaped by the completer into the normal WSQ faster. But EMPER now has configurations where we don't use a completer, which makes this optimization useless or even harmful. By default, automatically decide the value of check_anywhere_queue_while_steal based on the value of io_completer_behavior.
-
Florian Schmaus authored
[io/Stats] do not record steal attempts See merge request !284
-
Florian Schmaus authored
-
Florian Schmaus authored
-
- Dec 05, 2021
-
-
Florian Fischer authored
Recording every steal attempt is rather excessive and we are not doing it for normal work. Flamegraphs have shown that IO stealing takes considerably more time than normal work stealing because of the recording of steal attempts, especially if we are using atomics to aggregate them.
-
- Dec 03, 2021
-
-
Florian Schmaus authored
reduce test load when logging See merge request !282
-
Florian Fischer authored
I suspect that some tests which scale with the number of CPUs time out, mostly on jenkins2. This patch reduces the load when logging is active and increases the load when logging is off. Therefore our test builds with debugoptimized will do less and hopefully only time out when they actually hang, and the release tests will do more.
-
Florian Schmaus authored
load CQ->tail only once during lockless stealing See merge request !281
-
- Dec 02, 2021
-
-
Florian Fischer authored
Currently we load the CQ->tail with acquire semantics to determine if we should steal from the victim, and load it again in the actual stealing logic, which will also immediately abort if there are no CQEs to steal. Keep the optimization for the locked case.
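The idea of a single tail load can be sketched as follows. This is a simplified model and not EMPER's io_uring code: CompletionQueue, stealOne and the plain head/tail counters are assumptions made for illustration. The emptiness check and the claim both use the one value of tail that was loaded with acquire semantics.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a completion queue: the producer advances
// `tail`, consumers (including thieves) advance `head`.
struct CompletionQueue {
  std::atomic<uint32_t> head{0};
  std::atomic<uint32_t> tail{0};
};

// Tries to claim one completion without a lock.
// Returns 1 if an entry was claimed, 0 if the queue was empty or the
// claim lost a race against another consumer.
uint32_t stealOne(CompletionQueue& cq) {
  uint32_t head = cq.head.load(std::memory_order_relaxed);
  // Single acquire load of tail: no separate "should we steal?" pre-check.
  const uint32_t tail = cq.tail.load(std::memory_order_acquire);
  if (head == tail) return 0;  // nothing to steal, abort immediately
  return cq.head.compare_exchange_strong(head, head + 1,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)
             ? 1
             : 0;
}
```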
-
Florian Schmaus authored
EchoClient: improve the help message See merge request !280
-
- Nov 29, 2021
-
-
Florian Fischer authored
-
- Nov 24, 2021
-
-
Florian Schmaus authored
add concurrent BPS test See merge request !279
-
Florian Schmaus authored
echoclient: add a state variable for debugging See merge request !278
-
- Nov 23, 2021
-
-
Florian Fischer authored
-
Florian Fischer authored
The test introduces multiple cycles of semaphores and a Fiber for each semaphore, which blocks on it and signals the next. Through work-stealing the fibers of a cycle should be spread across different workers and thus test concurrent use of BinaryPrivateSemaphores.

Cycle of length 3: Sem A -> Sem B -> Sem C -> Sem A -> ...

Algorithm:
    if isFirstInCycle
        signal next
    wait
    if not isFirstInCycle
        signal next
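The algorithm can be tried out with plain threads. The sketch below is not the EMPER test: it uses std::thread instead of fibers, and the BinarySemaphore class is a hypothetical stand-in built from a mutex and a condition variable, since the real BinaryPrivateSemaphore interface is not shown here. runCycle is likewise an illustrative name.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a binary semaphore.
class BinarySemaphore {
 public:
  void signal() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      signaled_ = true;
    }
    cv_.notify_one();
  }
  void wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return signaled_; });
    signaled_ = false;
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool signaled_ = false;
};

// Runs one cycle with `length` participants and returns how many of them
// finished. Participant i waits on semaphore i and signals semaphore i+1.
std::size_t runCycle(std::size_t length) {
  std::vector<BinarySemaphore> sems(length);
  std::vector<std::thread> threads;
  std::atomic<std::size_t> finished{0};
  for (std::size_t i = 0; i < length; ++i) {
    threads.emplace_back([&sems, &finished, i, length] {
      const bool isFirstInCycle = (i == 0);
      BinarySemaphore& next = sems[(i + 1) % length];
      if (isFirstInCycle) next.signal();   // kick off the cycle
      sems[i].wait();                      // block until the predecessor signals
      if (!isFirstInCycle) next.signal();  // pass the signal on
      finished.fetch_add(1);
    });
  }
  for (auto& t : threads) t.join();
  return finished.load();
}
```

The first participant signals before it waits, which is what keeps the cycle from deadlocking: the signal travels once around the ring and finally wakes the first participant again.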
-
- Nov 15, 2021
-
-
Florian Schmaus authored
[PipeSleepStrategy] fix notifyFromAnywhere See merge request !277
-
Florian Fischer authored
Don't decrease the sleeper count in the CAS loop further than -count, which is the threshold we need to ensure that the notification will be observed. Decreasing it further than our threshold is not faulty; it just results in unnecessarily skipped sleeps. Don't call writeNotifications with a negative count, which would be interpreted as an unsigned value and thus result in writing far too many hints to the pipe, jamming it. If the original value before a successful CAS was already negative, we called writeNotifications with this negative value. This is fixed by using max(toWakeup, 0).
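The two fixes can be sketched as a small CAS loop. This is an illustration under assumptions, not the actual PipeSleepStrategy code: the function returns the number of hints instead of writing them to a pipe, the signature is invented for the example, and the early return once the count is already at or below -count is an assumption added so the clamp never raises the counter.

```cpp
#include <algorithm>
#include <atomic>
#include <cstdint>

// `sleepers` > 0: that many workers are sleeping.
// `sleepers` < 0: that many upcoming sleeps will be skipped.
// Returns how many hints would be passed to writeNotifications.
int64_t notifyFromAnywhere(std::atomic<int64_t>& sleepers, int64_t count) {
  int64_t observed = sleepers.load(std::memory_order_relaxed);
  int64_t toWakeup = 0;
  int64_t updated = 0;
  do {
    // Assumption: at or below the threshold there is nothing left to do.
    if (observed <= -count) return 0;
    toWakeup = std::min(observed, count);
    // Never decrease the counter further than -count.
    updated = std::max(observed - count, -count);
  } while (!sleepers.compare_exchange_weak(observed, updated,
                                           std::memory_order_release,
                                           std::memory_order_relaxed));
  // `observed` may have been negative, so clamp before it is used as an
  // unsigned number of hints.
  return std::max<int64_t>(toWakeup, 0);
}
```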
-
- Nov 11, 2021
-
-
Florian Schmaus authored
make the victim count in work-stealing configurable See merge request !276
-
- Nov 10, 2021
-
-
Florian Fischer authored
Add two new mutually exclusive meson options:
* work_stealing_victim_count: sets an absolute number of victims
* work_stealing_victim_denominator: sets the victim count to #workers/denominator
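How the two options could translate into a victim count is sketched below. Everything beyond the two option semantics is an assumption for illustration: the function name victimCount, the use of 0 for "option not set", integer division that rounds down, and the fallback to all other workers when neither option is given.

```cpp
#include <cstddef>
#include <stdexcept>

// absoluteCount models work_stealing_victim_count,
// denominator models work_stealing_victim_denominator.
// A value of 0 means "not set" (an assumption of this sketch).
std::size_t victimCount(std::size_t workers, std::size_t absoluteCount,
                        std::size_t denominator) {
  if (absoluteCount != 0 && denominator != 0) {
    throw std::invalid_argument("the two options are mutually exclusive");
  }
  if (absoluteCount != 0) return absoluteCount;
  if (denominator != 0) return workers / denominator;  // rounds down
  // Hypothetical default: consider every other worker as a victim.
  return workers > 0 ? workers - 1 : 0;
}
```

For example, with 16 workers a denominator of 4 yields 4 victims per steal attempt, the same as setting the absolute count to 4.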
-
Florian Schmaus authored
[tools] Update check-format (from Mazstab) See merge request !275
-
Florian Schmaus authored
Fixes for clang-tidy 13 See merge request !274
-
Florian Schmaus authored
Sync tools/check-format of EMPER and Mazstab by using the newer Mazstab version of the script.
-
Florian Schmaus authored
While we do not yet have LLVM 13 in the gitlab-ci, I use it on my systems. So fix the new warnings found with clang-tidy 13.
-
- Oct 29, 2021
-
-
Florian Schmaus authored
Random computation echoserver See merge request !272
-
- Oct 28, 2021
-
-
Florian Schmaus authored
Ci debian testing dev bump See merge request !273
-
Florian Schmaus authored
-
Florian Fischer authored
Now three variants of computation are available:
* fixed (echoserver <port> <computation>): This will always consume computation us before sending the echo back to the client.
* random range (echoserver <port> <computation> <computation-max>): This will consume a random computation time uniformly selected from the interval [computation, computation-max] us.
* random min-max (echoserver <port> <computation> <computation-max> <max-probability>): This will consume either computation or computation-max us. The max computation is chosen randomly with the specified probability.

All random values are generated with a thread_local mt19937 generator and uniformly distributed with uniform_{int,real}_distribution.
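The three variants could pick the computation time roughly as sketched here. The function names, the seeding with std::random_device and the use of uint64_t are assumptions for illustration; only the thread_local mt19937 generator and the uniform_{int,real}_distribution types are taken from the commit message.

```cpp
#include <cstdint>
#include <random>

// One generator per thread; the seeding strategy is an assumption.
thread_local std::mt19937 generator{std::random_device{}()};

// fixed: always the same computation time.
uint64_t fixedComputation(uint64_t computation) { return computation; }

// random range: uniform in the closed interval [computation, computationMax].
uint64_t randomRangeComputation(uint64_t computation, uint64_t computationMax) {
  std::uniform_int_distribution<uint64_t> dist(computation, computationMax);
  return dist(generator);
}

// random min-max: computationMax with probability maxProbability,
// computation otherwise.
uint64_t randomMinMaxComputation(uint64_t computation, uint64_t computationMax,
                                 double maxProbability) {
  std::uniform_real_distribution<double> dist(0.0, 1.0);  // values in [0, 1)
  return dist(generator) < maxProbability ? computationMax : computation;
}
```

The min-max variant is handy for modelling a workload with occasional expensive requests: a small max-probability produces mostly short requests with rare long outliers.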
-