- Mar 08, 2021
-
-
Florian Fischer authored
Nagle's algorithm, which tries to prevent small TCP frames, is harmful to our throughput when we send small echoes. Arithmetic means are inaccurate if the sample has extreme outliers; therefore we record and report the total execution times.
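Disabling Nagle's algorithm is done per socket via the TCP_NODELAY option. A minimal sketch, assuming an already connected TCP socket; this is generic POSIX code, not EMPER's actual echo client:

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Disable Nagle's algorithm so small echo frames are sent immediately
// instead of being coalesced into larger TCP segments (error handling omitted).
void disableNagle(int sockfd) {
	int enable = 1;
	setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &enable, sizeof(enable));
}
```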
-
Florian Schmaus authored
[IO] parallelize IO startup See merge request i4/manycore/emper!122
-
Florian Schmaus authored
[meson] remove obsolete io_batch_anywhere_completions option See merge request i4/manycore/emper!121
-
Florian Fischer authored
GlobalIoContext::registerWorkerIo() now protects the GlobalIoContext's SQ with a mutex, and the globalCompleter waits until all workers have registered their IoContext using the new Semaphore Runtime.ioReadySem.
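A minimal sketch of the described startup handshake, using std::counting_semaphore as a stand-in for EMPER's Semaphore; the free function names are hypothetical:

```cpp
#include <cstddef>
#include <mutex>
#include <semaphore>

std::mutex globalSqMutex;                 // protects the GlobalIoContext's SQ
std::counting_semaphore<> ioReadySem{0};  // stand-in for Runtime.ioReadySem

// Each worker registers its IoContext under the SQ mutex and signals once.
void registerWorkerIo(/* IoContext& workerIo */) {
	{
		std::lock_guard<std::mutex> lock(globalSqMutex);
		// ... prepare the SQE registering the worker's IoContext ...
	}
	ioReadySem.release();
}

// The globalCompleter waits until every worker has registered
// before entering its completion loop.
void waitForWorkerIo(size_t workerCount) {
	for (size_t i = 0; i < workerCount; ++i) ioReadySem.acquire();
}
```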
-
Florian Fischer authored
Since 8f38dbed the globalCompleter always reaps and schedules in batches through IoContext::reapAndSchedule<CallerEnvironment::ANYWHERE> -> Runtime::scheduleFromAnywhere(InputIt begin, InputIt end) -> AnywhereQueue::insert(InputIt begin, InputIt end).
-
- Mar 03, 2021
-
-
Florian Schmaus authored
EMPER shutdown See merge request i4/manycore/emper!111
-
Florian Schmaus authored
-
- Mar 02, 2021
-
-
Florian Schmaus authored
-
Florian Schmaus authored
-
Florian Schmaus authored
-
Florian Schmaus authored
Fix reap completion race See merge request i4/manycore/emper!117
-
- Mar 01, 2021
-
-
Florian Fischer authored
-
Florian Fischer authored
Our current naive try lock protecting a worker's IoContext's CQ is racy. This fact alone is no problem: a try lock is by design racy in the sense that two threads race for who can take the lock. The actual problem is: while a worker holds the lock, additional completions may arrive which the worker does not observe, because it may already have finished iterating the CQ. If the worker still holds the lock, preventing the globalCompleter from reaping these additional completions, there is a lost wakeup problem, possibly leading to a completely sleeping runtime with runnable completions in a worker's IoContext.

To prevent this lost wakeup the cq_lock now counts the unsuccessful lock attempts from the globalCompleter. If a worker observes that the globalCompleter tried to reapCompletions more than once, we know that a lost wakeup could have occurred and we try to reap again. Observing one attempt is normal, since the globalCompleter and the worker owning the IoContext race for the cq_lock required to reap completions. See the sketch below.

Additionally:
* Reduce the critical section in which the cq_lock is held by copying all seen CQEs and completing the Futures after the lock is released.
* Don't immediately schedule blocked Fibers or Callbacks; rather collect them and return them as a batch. Maybe the caller knows better what to do with a batch of runnable Fibers.
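A minimal sketch of the counting try lock idea, assuming only the single globalCompleter uses the counting path; the class and method names are hypothetical, not EMPER's actual cq_lock:

```cpp
#include <atomic>
#include <cstdint>

class CountingTryLock {
	// Bit 0 is the lock bit; the remaining bits count failed
	// lock attempts by the globalCompleter.
	std::atomic<uint32_t> state{0};
	static constexpr uint32_t LOCKED = 1;
	static constexpr uint32_t ATTEMPT = 2;

 public:
	// Workers simply try to set the lock bit.
	auto tryLock() -> bool {
		return (state.fetch_or(LOCKED, std::memory_order_acquire) & LOCKED) == 0;
	}

	// The globalCompleter records every unsuccessful attempt.
	auto tryLockFromCompleter() -> bool {
		if ((state.fetch_or(LOCKED, std::memory_order_acquire) & LOCKED) == 0)
			return true;
		state.fetch_add(ATTEMPT, std::memory_order_relaxed);
		return false;
	}

	// Release the lock, reset the counter, and report how many completer
	// attempts were observed while the lock was held.
	auto unlock() -> uint32_t {
		return state.exchange(0, std::memory_order_release) / ATTEMPT;
	}
};
```

The worker owning the IoContext would then reap in a loop: if unlock() reports more than one completer attempt, completions may have arrived after its CQ iteration, so it reaps again.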
-
Florian Schmaus authored
[meson] remove duplicate overflow_queue meson option See merge request i4/manycore/emper!118
-
Florian Fischer authored
-
Florian Schmaus authored
Prepare the IO subsystem for emper's future See merge request i4/manycore/emper!114
-
Florian Fischer authored
All code related to the global IO is moved from IoContext to the GlobalIoContext child class.

Remember in each IoContext the runtime which created it. This allows the IoContexts to be independent of a global runtime reference. Because of this change an IoContext (global or worker) must be retrieved from a runtime object using Runtime::getIo().

The only point in the IO subsystem where we depend on Runtime::getRuntime() is when resubmitting a PartialCompletableFuture. To be sure we use the correct Runtime object, Runtime::getRuntime() should return the runtime which started the thread (worker or globalCompleter). This could be achieved by remembering the runtime in a thread_local variable, for example (see the sketch below).

The global completer thread is now tied to a GlobalIoContext object and can be terminated using the GlobalIoContext's eventfd through GlobalIoContext::initiateTermination() and GlobalIoContext::waitUntilFinished().
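A minimal sketch of the suggested thread_local approach; the member and method names besides Runtime::getRuntime() are assumptions:

```cpp
#include <thread>

class Runtime {
	// Each thread started by a Runtime remembers its creator.
	static thread_local Runtime* currentRuntime;

 public:
	// Returns the runtime which started the calling thread
	// (worker or globalCompleter).
	static auto getRuntime() -> Runtime* { return currentRuntime; }

	auto spawnThread() -> std::thread {
		return std::thread([this] {
			currentRuntime = this;  // remember the creating runtime
			// ... worker or globalCompleter loop ...
		});
	}
};

thread_local Runtime* Runtime::currentRuntime = nullptr;
```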
-
Florian Schmaus authored
Improve boost dependency and bump gitlab-ci docker image See merge request !116
-
Florian Schmaus authored
-
Florian Schmaus authored
-
Florian Schmaus authored
[build] Re-enable non-virtual-dtor warning See merge request !115
-
Florian Schmaus authored
This should improve the CI response time, as we now (potentially) perform iwyu and clang-tidy in parallel.
-
Florian Schmaus authored
-
Florian Schmaus authored
-
- Feb 26, 2021
-
-
Florian Schmaus authored
Make LockedUnboundedQueue implementation configurable See merge request i4/manycore/emper!113
-
Florian Fischer authored
Available implementations, configurable through the meson option 'locked_unbounded_queue_implementation', are:
* mutex - our current LockedUnboundedQueue implementation using std::mutex
* rwlock - an implementation with pthread_rwlock. The implementation tries to upgrade its rdlock and, on failure, drops it and acquires a wrlock
* shared_mutex - an implementation using std::shared_mutex. dequeue() acquires a shared lock at first, drops it and acquires a unique lock
* boost_shared_mutex - an implementation using boost::shared_mutex. dequeue() acquires an upgradable lock and upgrades it to a unique lock if necessary

A sketch of the shared_mutex variant's dequeue() strategy follows below.
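A minimal sketch of the shared_mutex variant, assuming a std::queue backing store; the internals are illustrative, not EMPER's actual LockedUnboundedQueue:

```cpp
#include <mutex>
#include <optional>
#include <queue>
#include <shared_mutex>

template <typename T>
class SharedMutexQueue {
	std::queue<T> queue;
	mutable std::shared_mutex mutex;

 public:
	void enqueue(T item) {
		std::unique_lock<std::shared_mutex> lock(mutex);
		queue.push(std::move(item));
	}

	auto dequeue() -> std::optional<T> {
		{
			// Cheap emptiness check under a shared lock only.
			std::shared_lock<std::shared_mutex> lock(mutex);
			if (queue.empty()) return std::nullopt;
		}
		// The queue may have been drained in between: re-check
		// under the unique lock before popping.
		std::unique_lock<std::shared_mutex> lock(mutex);
		if (queue.empty()) return std::nullopt;
		T item = std::move(queue.front());
		queue.pop();
		return item;
	}
};
```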
-
Florian Fischer authored
The emper header LockedUnboundedQueue.hpp may depend on different libraries according to the chosen implementation. To link those dependencies with everything that includes LockedUnboundedQueue.hpp, we propagate all emper_dependencies through emper_dep. Using emper_dep as a dependency seems better anyway than essentially writing down emper_dep's contents manually each time. emper_dep essentially is: (link_with: emper, include_directories: emper_all_include)
-
Florian Schmaus authored
add a batch optimization for the global completer See merge request !110
-
Florian Schmaus authored
Minor improvements See merge request !112
-
Florian Fischer authored
This change introduces new scheduleFromAnywhere methods which take a range of Fibers to schedule.

Blockable gets a new method returning the fiber used to start the unblocked context, which is used by Future/PartialCompletableFuture to provide a way of completing and returning the continuation Fiber to the caller, so they may schedule the continuation how they want.

If the meson option io_batch_anywhere_completions is set, the global completer will collect all callback and continuation fibers before scheduling them at once when it is done reaping the completions. The idea is that taking the AnywhereQueue write lock and calling onNewWork must only be done once (see the sketch below).

TODO: investigate whether onNewWork should be extended by an amountOfWork argument which determines how many workers can be awoken and have work to do. This should be trivial since our WorkerWakeupSemaphore implementations already support notify_many(), which may be implemented in terms of notify_all() though.
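A minimal sketch of the batch optimization; the names follow the commit, but the internals are assumptions:

```cpp
#include <deque>
#include <mutex>

class Fiber;

class AnywhereQueue {
	std::deque<Fiber*> queue;
	std::mutex mutex;  // stands in for the AnywhereQueue write lock

	// Wake sleeping workers, e.g. via WorkerWakeupSemaphore::notify_many().
	void onNewWork() {}

 public:
	// Insert a whole batch under a single lock acquisition and call
	// onNewWork() only once, however many fibers were reaped.
	template <typename InputIt>
	void insert(InputIt begin, InputIt end) {
		{
			std::lock_guard<std::mutex> lock(mutex);
			queue.insert(queue.end(), begin, end);
		}
		onNewWork();
	}
};
```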
-
Florian Schmaus authored
-
Florian Schmaus authored
-
- Feb 25, 2021
-
-
Florian Schmaus authored
[Runtime] add env options to define workerCount and pinningOffset See merge request !109
-
Florian Fischer authored
Introduced environment variables:
* EMPER_WORKER_COUNT: specifies how many workers the Runtime default constructor starts.
* EMPER_PINNING_OFFSET: specifies the CPU id where the workers should be pinned round robin.

Both variables allow multiple emper processes to run on the same system by splitting the available cores:

EMPER_WORKER_COUNT=1 build/apps/echoserver & EMPER_WORKER_COUNT=1 EMPER_PINNING_OFFSET=1 build/apps/echoclient

starts a single-threaded echoserver on core 0 and a single-threaded echo client on core 1.
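A minimal sketch of how such variables could be consumed; the parsing helper and defaults are hypothetical, only the variable names come from the commit:

```cpp
#include <cstdlib>
#include <string>
#include <thread>

// Read an unsigned value from the environment, falling back to a default.
static auto envOr(const char* name, unsigned fallback) -> unsigned {
	const char* value = std::getenv(name);
	return value ? static_cast<unsigned>(std::stoul(value)) : fallback;
}

int main() {
	const unsigned workerCount =
			envOr("EMPER_WORKER_COUNT", std::thread::hardware_concurrency());
	const unsigned pinningOffset = envOr("EMPER_PINNING_OFFSET", 0);
	// Workers 0..workerCount-1 would be pinned round robin to CPUs
	// pinningOffset..pinningOffset+workerCount-1.
	(void)workerCount;
	(void)pinningOffset;
}
```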
-
Florian Schmaus authored
Minor fixes See merge request !108
-
Florian Schmaus authored
-
Florian Schmaus authored
-
- Feb 24, 2021
-
-
Florian Schmaus authored
add three semaphore implementations and make wakeupSem configurable See merge request !107
-
- Feb 23, 2021
-
-
Florian Fischer authored
LockedSemaphore is the already existing Semaphore using a mutex and a condition variable. PosixSemaphore is a thin wrapper around a POSIX semaphore. SpuriousFutexSemaphore is an atomic/futex based implementation prone to spurious wakeups, which is fine for the worker wakeup use case. A sketch of the futex-based variant follows below.
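A minimal Linux-specific sketch of the futex-based idea, illustrating the spurious-wakeup-tolerant design rather than EMPER's actual class; woken workers that find the count already taken simply return to checking their queues:

```cpp
#include <atomic>
#include <cstdint>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

class SpuriousFutexSemaphore {
	std::atomic<uint32_t> count{0};

	auto futex(int op, uint32_t val) -> long {
		return syscall(SYS_futex, reinterpret_cast<uint32_t*>(&count), op, val,
		               nullptr, nullptr, 0);
	}

 public:
	void release() {
		count.fetch_add(1, std::memory_order_release);
		futex(FUTEX_WAKE_PRIVATE, 1);  // wake at most one waiter
	}

	// May return "spuriously": a woken thread can lose the race for the
	// unit it was woken for, which is harmless for worker wakeup.
	void acquire() {
		uint32_t current = count.load(std::memory_order_acquire);
		for (;;) {
			while (current > 0) {
				if (count.compare_exchange_weak(current, current - 1,
				                                std::memory_order_acquire))
					return;
			}
			futex(FUTEX_WAIT_PRIVATE, 0);  // sleep only while the count is 0
			current = count.load(std::memory_order_acquire);
		}
	}
};
```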
-
Florian Schmaus authored
[Runtime] Fix skipWakeupThreshold value See merge request !106
-