- Dec 22, 2021

Florian Fischer authored
Make Scheduler within Runtime public API
See merge request !298

Florian Schmaus authored

Florian Schmaus authored
The Runtime class was never a good place for inRuntime(): due to its central role in EMPER's architecture, it easily causes include cycles. Furthermore, the API was not ideal: once Runtime moves into the emper:: namespace, the call would become emper::Runtime::inRuntime(). We declare emper::Emper::inRuntime() and emper::Emper::assertInRuntime() as static struct member functions to avoid multiple definitions, and then use constexpr function "aliases" to make those functions available in the emper:: namespace.
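How this pattern might look, as a minimal sketch with assumed signatures (the actual declarations live in EMPER's headers and may differ):

    // emper.hpp (hypothetical)
    namespace emper {

    struct Emper {
      // Static member functions have a single out-of-line definition,
      // avoiding multiple-definition errors across translation units.
      static auto inRuntime() -> bool;
      static void assertInRuntime();
    };

    // constexpr function "aliases": constexpr variables at namespace scope
    // have internal linkage, so this is safe in a header and callers can
    // simply write emper::inRuntime().
    constexpr auto inRuntime = Emper::inRuntime;
    constexpr auto assertInRuntime = Emper::assertInRuntime;

    }  // namespace emper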

- Dec 17, 2021

Florian Schmaus authored
[IO] overhaul SQPOLL support
See merge request !210

Florian Schmaus authored
[CI] print the number of available CPUs
See merge request !290

Florian Schmaus authored
[qsort] fix implicit widening clang-tidy error
See merge request !292

Florian Schmaus authored
[ConcurrentNetworkEchoTest] add optional computation per echo
See merge request !293

- Dec 16, 2021

Florian Fischer authored
This makes the test closer to our echoserver.

- Dec 14, 2021

Florian Fischer authored

Florian Fischer authored

Florian Fischer authored
Two meson options control the io_uring SQPOLL feature:

* io_uring_sqpoll - enable SQ polling
* io_uring_shared_poller - share the polling thread between all io_urings

Since Linux 5.12, IORING_SETUP_ATTACH_WQ only causes sharing of the poller threads, not of the work queues. See: https://github.com/axboe/liburing/issues/349

When using SQPOLL, user space has no good way to know how many SQEs the kernel has consumed; therefore we wait for available SQEs using io_uring_sqring_wait() if there was no usable SQE.

Remove the GlobalIoContext::registerLock and register all worker io_uring eventfd reads at the beginning of the completer function. Also register all the worker io_uring eventfds, since they never change and this hopefully reduces overhead in the global io_uring.
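The waiting logic might look roughly like this sketch against liburing's public API (the helper name is invented):

    #include <liburing.h>

    // With SQPOLL the kernel consumes SQEs asynchronously, so when the SQ
    // ring is full we block via io_uring_sqring_wait() instead of spinning
    // until io_uring_get_sqe() succeeds.
    static struct io_uring_sqe* getSqeBlocking(struct io_uring* ring) {
      struct io_uring_sqe* sqe = io_uring_get_sqe(ring);
      while (sqe == nullptr) {
        io_uring_sqring_wait(ring);  // sleeps until an SQE slot is usable
        sqe = io_uring_get_sqe(ring);
      }
      return sqe;
    }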

Florian Fischer authored

- Dec 13, 2021

Florian Fischer authored
fix {RwLocked,SharedMutex}UnboundedQueues
See merge request !291

Florian Fischer authored
A "fast check" consists of our smoke tests and the fast static analysis; this ensures that the EMPER variants at least build successfully and are not totally broken.

Florian Fischer authored

Florian Fischer authored
Previous RwLockUnboundedQueue versions did not drop their read lock before trying to grab the write lock, which may result in a deadlock. Also wrap the noisy POSIX function names in cleaner helper functions.
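A sketch of the deadlock and its fix; the queue internals and names are illustrative, not EMPER's actual ones:

    #include <pthread.h>
    #include <queue>

    // Thin helpers hiding the noisy pthread_rwlock_* names.
    class RwLock {
      pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;

     public:
      void readLock() { pthread_rwlock_rdlock(&lock); }
      void writeLock() { pthread_rwlock_wrlock(&lock); }
      void unlock() { pthread_rwlock_unlock(&lock); }
    };

    template <typename T>
    class RwLockedQueue {
      RwLock rwLock;
      std::queue<T> queue;

     public:
      bool dequeue(T& out) {
        rwLock.readLock();
        bool empty = queue.empty();
        // The fix: POSIX rwlocks cannot be upgraded in place, so the read
        // lock must be released before acquiring the write lock; otherwise
        // two concurrent dequeuers can deadlock each other.
        rwLock.unlock();
        if (empty) return false;
        rwLock.writeLock();
        if (queue.empty()) {  // re-check: the queue may have drained meanwhile
          rwLock.unlock();
          return false;
        }
        out = queue.front();
        queue.pop();
        rwLock.unlock();
        return true;
      }
    };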

- Dec 10, 2021

Florian Schmaus authored
Introduce waitfree work/io-stealing
See merge request !289

Florian Fischer authored

Florian Fischer authored
Waitfree IO-stealing can be enabled with the meson option -Dio_waitfree_stealing.

Florian Fischer authored

Florian Fischer authored
Waitfree work stealing is configured with the meson option 'waitfree_work_stealing'.

The retry logic is intentionally left in the Queues and not lifted to the scheduler in order to reuse the load of an unsuccessful CAS. Consider the following pseudo code examples:

    # Variant A: retry inside steal()
    steal() -> bool:
      load
    loop:
      if empty return EMPTY
      cas
      if not WAITFREE and not cas:
        goto loop
      return cas ? STOLEN : LOST_RACE

    outer():
      steal()

    # Variant B: retry in the caller
    steal() -> res:
      load
      if empty return EMPTY
      cas
      return cas ? STOLEN : LOST_RACE

    outer():
    loop:
      res = steal()
      if not WAITFREE and res == LOST_RACE:
        goto loop

In variant B the value loaded by a possibly unsuccessful CAS cannot be reused, and a loop of unsuccessful CASes results in double loads.

The retries are configurable through a template variable maxRetries:

* maxRetries < 0: retry indefinitely
* maxRetries >= 0: retry at most maxRetries times
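A C++ sketch of variant A's retry policy over a minimal single-slot "queue" (EMPER's real queues are more involved); the StealingResult names follow the enum from the Dec 08 entry below:

    #include <atomic>

    enum class StealingResult { Empty, LostRace, Stolen };

    // maxRetries < 0 retries indefinitely; maxRetries >= 0 bounds the
    // retries, making the steal attempt wait-free.
    template <int maxRetries, typename T>
    auto steal(std::atomic<T*>& slot) -> StealingResult {
      T* item = slot.load();
      for (int retries = 0;; ++retries) {
        if (item == nullptr) return StealingResult::Empty;
        // On failure, compare_exchange_strong writes the freshly observed
        // value into `item`, so the next iteration reuses that load
        // instead of performing a second one.
        if (slot.compare_exchange_strong(item, nullptr)) return StealingResult::Stolen;
        if (maxRetries >= 0 && retries >= maxRetries) return StealingResult::LostRace;
      }
    }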

- Dec 09, 2021

Florian Fischer authored

Florian Schmaus authored
Multiple changes to improve IO stealing
See merge request !288

- Dec 08, 2021

Florian Fischer authored
This has the benefit of adequately sized intermediate arrays, reducing the needed stack size.

Florian Schmaus authored
[gitlab-ci] Cache subprojects/packagecache
See merge request !287

Florian Fischer authored

Florian Fischer authored

Florian Fischer authored
This removes the rather expensive (reported by perf) initialization of the std::arrays.
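The mechanism is presumably dropping value-initialization, as in this sketch (element type and size are made up):

    #include <array>

    struct Fiber;  // assumed element type

    void example() {
      // Value-initialized: zeroes all 256 pointers on every call.
      std::array<Fiber*, 256> zeroed{};

      // Default-initialized: pointer elements are left indeterminate, so
      // no zeroing work happens; every slot must be written before use.
      std::array<Fiber*, 256> raw;

      (void)zeroed;
      (void)raw;
    }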

Florian Fischer authored
To distinguish the outcomes of the waitfree reap attempt, a new enum StealingResult::{Empty, LostRace, Stolen} is introduced.

Florian Schmaus authored
Fix includes (as reported by IWYU 0.17) and update CI container image
See merge request !285

Florian Schmaus authored

- Dec 06, 2021

Florian Schmaus authored
[meson] set check_anywhere_queue_while_steal automatic
See merge request !286

Florian Fischer authored
We introduced the check_anywhere_queue_while_steal configuration as an optimization to get IO completions reaped by the completer into the normal WSQ (work-stealing queue) faster. But EMPER now has configurations that do not use a completer, making this optimization useless or even harmful. By default, automatically decide the value of check_anywhere_queue_while_stealing based on the value of io_completer_behavior.
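The resulting compile-time default might look like this sketch with assumed config constants (EMPER generates such constants from its meson options):

    // Hypothetical generated config values.
    enum class IoCompleterBehavior { None, Schedule };

    inline constexpr IoCompleterBehavior ioCompleterBehavior = IoCompleterBehavior::Schedule;

    // Without a completer nothing moves reaped IO completions into the
    // anywhere queue, so checking it while stealing cannot help.
    inline constexpr bool checkAnywhereQueueWhileStealing =
        ioCompleterBehavior != IoCompleterBehavior::None;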

Florian Schmaus authored
[io/Stats] do not record steal attempts
See merge request !284

Florian Schmaus authored

Florian Schmaus authored

- Dec 05, 2021

Florian Fischer authored
Recording every steal attempt is rather excessive, and we do not do it for normal work. Flamegraphs have shown that IO-stealing takes considerably more time than normal work stealing because of the recording of steal attempts, especially when atomics are used to aggregate them.
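Why the atomic aggregation is costly on the steal path, as a sketch with invented counter names:

    #include <atomic>
    #include <cstdint>

    // Shared atomic counter: every steal attempt performs a contended
    // read-modify-write, which is what the flamegraphs show.
    std::atomic<uint64_t> totalIoStealAttempts{0};

    // Per-worker counter: an uncontended plain increment; values could be
    // summed at shutdown if aggregate stats were wanted at all.
    thread_local uint64_t workerIoStealAttempts = 0;

    void onStealAttempt() {
      // totalIoStealAttempts.fetch_add(1, std::memory_order_relaxed);  // costly
      ++workerIoStealAttempts;  // cheap; the commit removes recording entirely
    }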

- Dec 03, 2021

Florian Schmaus authored
reduce test load when logging
See merge request !282

Florian Fischer authored
I suspect some tests, which scale with the number of CPUs, time out mostly on jenkins2. This patch reduces the load when logging is active and increases the load when logging is off. Therefore our debugoptimized test builds will do less and hopefully only time out when they actually hang, and the release tests will do more.

Florian Schmaus authored
load CQ->tail only once during lockless stealing
See merge request !281