emper merge requests
https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests
2021-08-20T12:10:12Z

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/239
[build] Emit a message if liburing subproject is used
2021-08-20T12:10:12Z
Florian Schmaus

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/190
[IO] add shutdown(3) support
2021-08-19T12:37:21Z
Florian Fischer
io_uring has shutdown support since Linux 5.11.

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/238
Consume liburing as meson wrap if native is not recent enough
2021-08-19T11:46:42Z
Florian Fischer
This was initially a part of !210, but I noticed that the old !190 would benefit from it as well.
So I decided the problems of !210 should not hold back those meson/liburing changes.

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/237
[Future] Also log 'res' in LOGD of setCompletion(int32_t res)
2021-08-19T11:40:32Z
Florian Schmaus
It can't hurt to provide more information in log messages, and
'res' is a good candidate in this case.

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/236
[GlobalIoContext] Add CompleterSchedParam option
2021-08-19T11:40:23Z
Florian Schmaus
This makes the scheduling parameters of the completer
thread configurable via a meson option.

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/235
[IoContext] Add missing error handling in submitPreparedSqesAndWait()
2021-08-19T08:56:12Z
Florian Schmaus
Within the
do { reapAndScheduleCompletions() } while (io_uring_submit() == -EBUSY)
loop, the return value of io_uring_submit could be a negative value other
than -EBUSY. In that case, we did not DIE.
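The loop shape described above can be sketched as follows. This is only an illustration under stated assumptions, not EMPER's code: `submitWithRetry` is a hypothetical name, and the `submit` and `reap` callables are stand-ins for io_uring_submit() and reapAndScheduleCompletions(), so the sketch does not depend on liburing.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <functional>

// Hypothetical helper: `submit` stands in for io_uring_submit() and `reap`
// for reapAndScheduleCompletions(). Returns the non-negative submit result
// and aborts on any negative result other than -EBUSY.
int submitWithRetry(const std::function<int()>& submit,
                    const std::function<void()>& reap) {
    assert(submit && reap);
    for (;;) {
        reap();
        const int res = submit();
        if (res >= 0) return res;     // success
        if (res == -EBUSY) continue;  // busy: reap again and retry
        // Any other negative value is an error that must not be ignored
        // (the "DIE" case from the description).
        std::fprintf(stderr, "submit failed with %d\n", res);
        std::abort();
    }
}
```

Compared to a bare `while (submit() == -EBUSY)` condition, the result has to be kept in a variable and inspected on every iteration, which is presumably where the small extra overhead mentioned in the description comes from.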
Looking at the SubmitActor, where we have a very similar loop, the
error handling is correct. This changes the error handling in
IoContext to match the one of SubmitActor, even though it has a little
bit more overhead.

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/233
[IoContext] Remove duplicate CQE_BATCH_COUNT declaration and definition
2021-08-19T08:55:59Z
Florian Schmaus

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/230
[IO] Implement configurable "simple architecture"
2021-08-18T10:04:37Z
Florian Fischer
Introduce a new meson option io_single_uring which causes EMPER
to only use the GlobalIoContexts for all IO.
SQEs are submitted to the io_uring SQ via the SubmitActor.
Futures can be in a new state where they are submitted to the SubmitActor
but not to the io_uring yet.
In this state (isSubmitted && !isPrepared) the Future must not be destroyed.
To ensure this, we yield when forgetting a Future until it is prepared
and thus safe to destroy.
This commit contains no optimizations (no batching, no attempt at a
non-blocking syscall first, ...)

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/232
Move all definitions from emper.hpp in compilation unit
2021-08-18T09:53:51Z
Florian Schmaus

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/231
Move emper::sleep() implementation from header in compilation unit
2021-08-18T09:10:39Z
Florian Schmaus

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/229
[AbstractWorkStealingScheduler] Get rid of "rand() % workerCount"
2021-08-11T14:30:56Z
Florian Schmaus
The "rand() % workerCount" construct used in the work-stealing
scheduler is flawed. It has a bias toward lower worker IDs due to the
modulo operation. This is something I always wanted to get rid of, but
never found the time to do it. Until now.
Get rid of it and replace it with
std::uniform_int_distribution<workerid_t> (as a field of the Worker
instance).
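To make the bias concrete, here is a minimal sketch. `moduloHistogram` counts where equally likely raw values land after the modulo, and `RandomWorkerPicker` is a hypothetical stand-in for the per-worker distribution field; the `workerid_t` alias and the choice of std::mt19937 are assumptions and may differ from EMPER's actual definitions.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

using workerid_t = std::uint16_t;  // assumption: EMPER's real typedef may differ

// Count how many of the equally likely raw values 0..rawValues-1 map to each
// worker ID when reduced with "% workerCount".
std::vector<unsigned> moduloHistogram(unsigned rawValues, unsigned workerCount) {
    std::vector<unsigned> hits(workerCount, 0);
    for (unsigned v = 0; v < rawValues; ++v) ++hits[v % workerCount];
    return hits;
}

// Hypothetical per-worker helper in the spirit of nextRandomWorkerId().
class RandomWorkerPicker {
    std::mt19937 engine;
    std::uniform_int_distribution<workerid_t> dist;

public:
    RandomWorkerPicker(workerid_t workerCount, std::uint32_t seed)
        : engine(seed), dist(0, static_cast<workerid_t>(workerCount - 1)) {
        assert(workerCount > 0);
    }

    // Uniformly distributed worker ID in [0, workerCount).
    workerid_t next() { return dist(engine); }
};
```

With 8 raw values and 3 workers, `moduloHistogram(8, 3)` yields the counts 3, 3, 2: the highest worker ID is picked less often than the lower ones. The distribution avoids this because it maps the engine output onto the requested range without favoring any value.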
The changes in AbstractWorkStealingScheduler.cpp look larger than they
actually are. I had to introduce a new scope since the goto
instruction would otherwise skip the initialization of currentWorker.
The main changes in AbstractWorkStealingScheduler are
- use currentWorker->nextRandomWorkerId() (instead of the flawed construct)
- currentWorker->getWorkerId() (instead of Runtime::getWorkerId())

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/224
[stats] blocked contexts
2021-08-10T10:42:18Z
Florian Schmaus

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/228
Support distributing multiple echoclient over the network
2021-08-09T16:15:11Z
Florian Fischer
We achieve synchronization through an external coordinator process awaiting n
connections and a message on each connection.
After receiving all messages it sends "Go" back on each connection and terminates.
The echo clients synchronize using the coordinator after establishing all TCP
connections.

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/221
[meson] allow building EMPER on systems without <filesystem>
2021-08-02T19:38:53Z
Florian Fischer
Check if `std::filesystem::recursive_directory_iterator` and `std::filesystem::path`
are available before using those in EMPER code.
Allow `meson.build` files in `emper/` subdirectories to add configuration options
by consuming the `conf_data` object after all subdirectories were visited.
Introduce a quasi naming standard for cpp feature flags in meson code:
`cpp_has_<namespace>_<feature>`
Examples:
`cpp_has_fs_path`

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/207
Add meson option for "check anywhere queue while stealing"
2021-08-02T16:41:52Z
Florian Schmaus

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/226
[io.hpp] add blocking functions using timeouts
2021-08-02T14:52:13Z
Florian Fischer

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/223
[stats] Add max-queue-length stats to AbstractWorkStealingScheduler
2021-08-02T13:01:40Z
Florian Schmaus

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/225
[echoclient] print a short description of parameters
2021-07-29T11:59:25Z
Florian Fischer

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/214
Implement sleep strategy using the IO subsystem
2021-07-27T11:49:05Z
Florian Fischer
Implement a pipe-based sleep strategy using the IO subsystem
# Design goals
* Wake up either on external newWork notifications or on local IO completions
-> Sleep strategy is sound without the IO completer
* Do as little as possible in a system saturated with work
* Pass a hint where to find new work to suspended workers
# Algorithm
```
Data:
    Global:
        hint pipe
        sleepers count
    Per worker:
        dispatch hint buffer
        in flight flag

Sleep:
    if we have no sleep request in flight
        Atomic increment sleep count
        Remember that we are sleeping
        Prepare read cqe from the hint pipe to dispatch hint buffer
    Prevent the completer from reaping completions on this worker's IoContext
    Wait until IO completions occurred

NotifyEmper(n):
    if observed sleepers <= 0
        return

    // Determine how many we are responsible to wake
    do
        toWakeup = min(observed sleepers, n)
    while (!CAS(sleepers, toWakeup))

    write toWakeup hints to the hint pipe

NotifyAnywhere(n):
    // Ensure all n notifications take effect
    while (!CAS(sleepers, observed sleepers - n))
        if observed sleeping <= -n
            return

    toWakeup = min(observed sleeping, n)
    write toWakeup hints to the hint pipe

onNewWorkCompletion:
    reset in flight flag
    allow completer to reap completions on this IoContext
```
```
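As a rough illustration of the NotifyEmper counter update, the following sketch claims sleepers with a CAS loop. It is one reading of the pseudocode, not EMPER's implementation: `claimSleepers` is a hypothetical name, the successful CAS is assumed to store the observed count minus toWakeup, the zero check is moved inside the loop, and writing the hints to the pipe is left out.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: atomically claim up to n sleepers and return how many
// the caller is now responsible for waking (0 if nobody is sleeping).
std::int64_t claimSleepers(std::atomic<std::int64_t>& sleepers, std::int64_t n) {
    assert(n > 0);
    std::int64_t observed = sleepers.load(std::memory_order_relaxed);
    std::int64_t toWakeup = 0;
    do {
        if (observed <= 0) return 0;  // no sleepers observed
        toWakeup = std::min(observed, n);
        // On failure, compare_exchange_weak reloads `observed`, so the next
        // iteration recomputes toWakeup from the fresh value.
    } while (!sleepers.compare_exchange_weak(observed, observed - toWakeup,
                                             std::memory_order_acq_rel,
                                             std::memory_order_relaxed));
    return toWakeup;
}
```

Because each notifier only claims sleepers it removed from the counter itself, two concurrent notifiers cannot both write hints for the same sleeper.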
# Notes
* We must decrement the sleepers count on the notifier side. Otherwise,
multiple notifiers could all observe the same number of sleepers and try
to wake up the same sleepers by writing to the pipe, jamming it up with
unconsumed hints. A notifier would then block in the notify write,
resulting in a deadlock.
* The CAS loops on the notifier side are needed because decrementing
and then incrementing the excess is racy: two notifiers can each observe the
sum of both their excess decrements and increment too much, resulting in a
broken counter.
* Add the dispatch hint code in `AbstractWorkStealingScheduler::nextFiber`.
This allows workers to check the dispatch hint after there
was no local work to execute.
This is a trade-off: we accept a slower wakeup (a just-awoken worker
first checks for local work) in exchange for a faster dispatch hot path when
we have work to do in our local WSQ.
* The completer thread must not reap completions on the IoContexts of
sleeping workers because this introduces a race for CQEs and a possible
lost wakeup if the completer consumes the completions before the worker
is actually waiting for them.
* When notifying sleeping workers from anywhere, we must ensure that all
notifications take effect. This is needed, for example, when terminating
the runtime, to prevent sleep attempts by worker threads that are
about to sleep but have not incremented the sleeper count yet.
We achieve this by always decrementing the sleeper count by the notification
count.
Thanks to Florian Schmaus <flow@cs.fau.de> for spotting bugs and suggesting
improvements.

https://gitlab.cs.fau.de/i4/manycore/emper/-/merge_requests/219
Add docker tooling
2021-07-27T09:02:22Z
Florian Fischer
Usage: run "docker.sh <your command>" to execute <your command> in the
docker image extracted from .gitlab-ci.yml in the emper root directory
Example: `docker.sh make test`