- Apr 24, 2022
-
-
Florian Fischer authored
-
- Apr 10, 2022
-
-
Florian Fischer authored
When EMPER is built with -Dio_synchronous, each Future will be completed synchronously when calling Future::wait().
-
- Mar 24, 2022
-
-
Florian Schmaus authored
-
- Feb 28, 2022
-
-
Florian Schmaus authored
This further split up the stats machinery into smaller parts.
-
Florian Schmaus authored
The idea of the new stats machinery is that 'stats' becomes an option that enables the basic stats gathering infrastructure in EMPER. At some point, it should become a non-user option, i.e., it should be removed from meson_options.txt. Then comes a layer of fine-grained stats control switches, which default to 'auto'. Third, a new option called 'stats_all' is added, which, if enabled, activates all fine-grained stats knobs that are set to 'auto'.
-
- Feb 27, 2022
-
-
Florian Schmaus authored
-
- Feb 24, 2022
-
-
Florian Schmaus authored
-
- Feb 18, 2022
-
-
Florian Fischer authored
Since Linux 5.15, io_uring can limit the number of iow threads created using IORING_REGISTER_IOWQ_MAX_WORKERS. Bump the liburing wrap to version 2.1 to use io_uring_register_iowq_max_workers. Expose this via the meson variable io_unbounded_iow_max and the environment variable EMPER_IO_UNBOUNDED_IOW_MAX. For a detailed explanation, see: https://blog.cloudflare.com/missing-manuals-io_uring-worker-pool
-
Florian Schmaus authored
This required breaking an include cycle between Fibril and LockedQueue.
-
- Feb 16, 2022
-
-
Florian Schmaus authored
-
- Feb 15, 2022
-
-
Florian Schmaus authored
-
- Feb 11, 2022
-
-
Florian Schmaus authored
-
Florian Fischer authored
-
- Feb 10, 2022
-
-
Florian Schmaus authored
-
Florian Schmaus authored
Also keep the context size at 64 KiB (there was a comment that erroneously indicated that the context size is 4 MiB).
-
- Feb 09, 2022
-
-
Florian Schmaus authored
-
- Feb 07, 2022
-
-
Florian Schmaus authored
Thanks to Nicolas Pfeiffer for writing the initial prototypical implementation of continuation stealing and the cactus stack mechanism, on which this is based.

Co-authored-by: Nicolas Pfeiffer <pfeiffer@cs.fau.de>
-
- Jan 23, 2022
-
-
Florian Fischer authored
I think wakeup hints should never be ignored, but having the option seems useful to observe their benefits and costs.
-
- Jan 21, 2022
-
-
Florian Fischer authored
-
Florian Fischer authored
The SpuriousFutex2Semaphore is able to notify a specific worker by using two futexes to wait on: one working like a normal semaphore, used for global non-specific notifications via notify() and notify_many(), and a second one per worker, which is based on a SleeperState. To notify a specific worker, we change its SleeperState to Notified and call FUTEX_WAKE if needed.
-
- Jan 14, 2022
-
-
Florian Schmaus authored
-
- Dec 14, 2021
-
-
Florian Fischer authored
-
Florian Fischer authored
Two meson options control the io_uring sqpoll feature:

* io_uring_sqpoll - enable sq polling
* io_uring_shared_poller - share the polling thread between all io_urings

Since Linux 5.12, IORING_SETUP_ATTACH_WQ only causes sharing of poller threads, not of the work queues. See: https://github.com/axboe/liburing/issues/349

When using SQPOLL, userspace has no good way to know how many sqes the kernel has consumed. Therefore, we wait for available sqes using io_uring_sqring_wait if there was no usable sqe.

Remove the GlobalIoContext::registerLock and register all worker io_uring eventfd reads at the beginning of the completer function.

Also register all the worker io_uring eventfds since they never change and it hopefully reduces overhead in the global io_uring.
-
- Dec 10, 2021
-
-
Florian Fischer authored
-
Florian Fischer authored
Waitfree work stealing is configured with the meson option 'waitfree_work_stealing'.

The retry logic is intentionally left in the Queues and not lifted to the scheduler to reuse the load of an unsuccessful CAS.

Consider the following pseudo code examples:

    steal() -> bool:                        steal() -> res
      load                                    load
    loop:                                     if empty return EMPTY
      if empty return EMPTY                   cas
      cas                                     return cas ? STOLEN : LOST_RACE
      if not WAITFREE and not cas:
        goto loop                           outer():
      return cas ? STOLEN : LOST_RACE       loop:
                                              res = steal()
    outer():                                  if not WAITFREE and res == LOST_RACE:
      steal()                                   goto loop

In the right example, the value loaded by a possibly unsuccessful CAS cannot be reused, and a loop of unsuccessful CASes will result in double loads.

The retries are configurable through a template variable maxRetries:

* maxRetries < 0: retry indefinitely
* maxRetries >= 0: retry at most maxRetries times
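The queue-internal retry described above can be sketched in C++ roughly as follows. This is a minimal illustration under assumptions, not EMPER's queue: ToyQueue, StealResult, and the fixed-size, non-wrapping array are hypothetical. The point it shows is that a failed compare_exchange_strong writes the current value of top back into the expected variable, so the retry does not need a second explicit load.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class StealResult { Stolen, LostRace, Empty };

// Hypothetical toy queue (not EMPER's WSQ): filled by one producer,
// drained by concurrent stealers, never wraps around.
template <int maxRetries>
class ToyQueue {
  std::array<int, 16> items{};
  std::atomic<uint64_t> top{0};
  std::atomic<uint64_t> bottom{0};

 public:
  // Single producer only.
  void push(int value) {
    const uint64_t b = bottom.load(std::memory_order_relaxed);
    assert(b < items.size());
    items[b] = value;
    bottom.store(b + 1, std::memory_order_release);
  }

  StealResult steal(int& out) {
    // The only explicit load of top.
    uint64_t t = top.load(std::memory_order_acquire);
    int retries = 0;
    for (;;) {
      if (t >= bottom.load(std::memory_order_acquire)) return StealResult::Empty;
      out = items[t];
      // On failure the CAS stores the current top into t,
      // so the next iteration reuses that value.
      if (top.compare_exchange_strong(t, t + 1, std::memory_order_acq_rel,
                                      std::memory_order_acquire)) {
        return StealResult::Stolen;
      }
      // maxRetries < 0: retry indefinitely; otherwise give up after maxRetries retries.
      if (maxRetries >= 0 && retries++ >= maxRetries) return StealResult::LostRace;
    }
  }
};
```

With ToyQueue<0>, a lost race is reported after the first failed CAS, which corresponds to the waitfree configuration. ToyQueue<-1> keeps retrying until the steal succeeds or the queue is empty.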
-
- Dec 06, 2021
-
-
Florian Fischer authored
We introduced the check_anywhere_queue_while_stealing configuration as an optimization to get the IO completions reaped by the completer into the normal WSQ faster. But EMPER now has configurations where we don't use a completer, making this optimization useless or even harmful. By default, automatically decide the value of check_anywhere_queue_while_stealing based on the value of io_completer_behavior.
-
- Nov 10, 2021
-
-
Florian Fischer authored
Add two new mutually exclusive meson options:

* work_stealing_victim_count: sets an absolute number of victims
* work_stealing_victim_denominator: sets the victim count to #workers/denominator
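How the two options might resolve to a victim count can be sketched as below. The helper victimCount is hypothetical, as are the conventions that 0 means "option not set", that the division rounds down with a minimum of one victim, and that all workers are victims when neither option is set. None of these details are stated in the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper; a value of 0 stands for "option not set".
inline std::size_t victimCount(std::size_t workers, std::size_t absolute,
                               std::size_t denominator) {
  assert(workers > 0);
  // The two options are mutually exclusive.
  assert(absolute == 0 || denominator == 0);
  if (absolute != 0) return absolute;
  // Assumption: integer division, but always at least one victim.
  if (denominator != 0) return std::max<std::size_t>(1, workers / denominator);
  // Assumed fallback when neither option is set.
  return workers;
}
```

For 16 workers, an absolute count of 4 and a denominator of 4 both yield four victims under these assumptions.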
-
- Oct 13, 2021
-
-
Florian Fischer authored
The lockless algorithm can now be configured by setting -Dio_lockless_cq=true and the used memory ordering by setting -Dio_lockless_memory_order={weak,strong}.

* io_lockless_memory_order=weak: read with acquire, write with release
* io_lockless_memory_order=strong: read with seq_cst, write with seq_cst
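A compile-time mapping from the weak/strong setting to C++ memory orders could look like the following sketch. The names MemoryOrder, Orders, and SharedIndex are hypothetical and only illustrate the mapping listed above; they are not EMPER's types.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the io_lockless_memory_order option.
enum class MemoryOrder { weak, strong };

template <MemoryOrder order>
struct Orders {
  // weak: acquire reads, release writes; strong: seq_cst for both.
  static constexpr std::memory_order read =
      (order == MemoryOrder::weak) ? std::memory_order_acquire : std::memory_order_seq_cst;
  static constexpr std::memory_order write =
      (order == MemoryOrder::weak) ? std::memory_order_release : std::memory_order_seq_cst;
};

// Hypothetical shared ring index that uses the configured orders.
template <MemoryOrder order>
class SharedIndex {
  std::atomic<uint32_t> value{0};

 public:
  uint32_t load() const { return value.load(Orders<order>::read); }
  void store(uint32_t v) { value.store(v, Orders<order>::write); }
};
```

Selecting the orders through a template parameter keeps the choice out of the hot path, since the compiler sees a constant memory order at every call site.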
-
- Oct 11, 2021
-
-
Florian Fischer authored
TODO: think about stats and possible ring buffer pointers overflow and ABA.
-
Florian Fischer authored
IO stealing is analogous to work stealing and means that worker threads without work will try to steal IO completions (CQEs) from other workers' IoContexts. The work stealing algorithm is modified to check a victim's CQ after finding their work queue empty.

This approach, in combination with future additions (global notifications on IO completions and lock-free CQE consumption), is a realistic candidate to replace the completer thread without losing its benefits.

To allow IO stealing, the CQ must be synchronized, which is already the case with the IoContext::cq_lock. Currently, stealing workers always try to pop a single CQE (this could be configurable). Steal attempts are recorded in the IoContext's Stats object and successfully stolen IO continuations in the AbstractWorkStealingWorkerStats.

I moved the code transforming CQEs into continuation Fibers from reapCompletions into a separate function to make the rather complicated function more readable and thus easier to understand.

Remove the default CallerEnvironment template arguments to make the code more explicit and prevent easy errors (not propagating the caller environment or forgetting that the function takes a caller environment).

io::Stats now needs to use atomics because multiple threads may increment them in parallel from EMPER and the OWNER. And since using std::atomic<T*> in std::map is not easily possible, we use the compiler's __atomic_* builtins.

Add, adjust and fix some comments.
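The idea of updating plain integers stored in a std::map with the GCC/Clang __atomic_* builtins can be sketched like this. IoStats, Op, and the counter layout are hypothetical and much simpler than io::Stats. The sketch also assumes the map is fully populated before any concurrent access, so only the mapped values change.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical operation kinds; the real keys are not shown in the commit.
enum class Op { Read, Write };

class IoStats {
  // Plain integers as mapped values: std::atomic is neither copyable nor
  // movable, which makes it awkward to store in a std::map.
  std::map<Op, uint64_t> counters{{Op::Read, 0}, {Op::Write, 0}};

 public:
  // May be called from several threads: the map structure is never
  // modified after construction, only the mapped values are updated.
  void record(Op op) {
    auto it = counters.find(op);
    assert(it != counters.end());
    __atomic_fetch_add(&it->second, 1, __ATOMIC_RELAXED);
  }

  uint64_t get(Op op) {
    auto it = counters.find(op);
    assert(it != counters.end());
    return __atomic_load_n(&it->second, __ATOMIC_RELAXED);
  }
};
```

Relaxed ordering is chosen here because the counters are pure statistics and do not synchronize other data. That choice is an assumption of the sketch.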
-
- Sep 27, 2021
-
-
Florian Fischer authored
std::localtime takes a global lock and is therefore not scalable and unsuitable for analyzing timing-sensitive bugs. Introduce a new option to add UTC timestamps. On my system, this allows doubling the CPU load while using mmapped logging. Also increase the LogBuffer size from 1MB to 1GB because I had some crashes where a renewed buffer was still used.
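One lock-free way to produce a UTC time-of-day stamp is plain integer arithmetic on the time since the epoch, which needs no time zone lookup at all. The function utcTimeOfDay below is a hypothetical sketch of that idea and not necessarily how EMPER formats its timestamps. It assumes a non-negative time and ignores leap seconds.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>
#include <string>

// Hypothetical helper: formats the UTC time of day as HH:MM:SS.uuuuuu
// using integer arithmetic only (no std::localtime, no time zone data).
inline std::string utcTimeOfDay(std::chrono::microseconds sinceEpoch) {
  const long long total = sinceEpoch.count();
  assert(total >= 0);
  const long long micros = total % 1000000;
  const long long secondOfDay = (total / 1000000) % 86400;
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%02lld:%02lld:%02lld.%06lld",
                secondOfDay / 3600, (secondOfDay / 60) % 60, secondOfDay % 60, micros);
  return buf;
}
```

A caller would typically pass std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::system_clock::now().time_since_epoch()), which counts from the Unix epoch on common platforms.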
-
- Sep 22, 2021
-
-
Florian Fischer authored
-
- Sep 20, 2021
-
-
Florian Fischer authored
Add a new 'throttle' wakeup strategy inspired by the algorithm used by zap, go and tokio. This tries to prevent a possible thundering herd problem and reduce contention on the scheduler by only waking a single worker at a time. It further ensures that the next worker is only notified if the previous one successfully found work.
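The throttling idea can be pictured with a single "wakeup in flight" flag, as in the sketch below. ThrottledWakeup and its method names are hypothetical; the commit does not show EMPER's actual state machine, and the runtimes named above use more elaborate variants.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical sketch: at most one wakeup is in flight at any time.
class ThrottledWakeup {
  std::atomic<bool> wakeupInFlight{false};

 public:
  // Called when new work appears. Returns true if the caller should wake
  // exactly one sleeping worker, false if a wakeup is already in flight.
  bool shouldWake() {
    return !wakeupInFlight.exchange(true, std::memory_order_acq_rel);
  }

  // Called by the woken worker after it has searched for work. Returns
  // true if this worker should now wake the next one, which is only the
  // case when it found work itself.
  bool finishWakeup(bool foundWork) {
    wakeupInFlight.store(false, std::memory_order_release);
    return foundWork && shouldWake();
  }
};
```

Because only the worker that found work passes the wakeup on, a burst of new work wakes the sleepers one after another instead of all at once.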
-
- Aug 19, 2021
-
-
Florian Schmaus authored
This makes the scheduling parameters of the completer thread configurable via a meson option.
-
- Aug 18, 2021
-
-
Florian Fischer authored
-
Florian Fischer authored
Introduce a new meson option io_single_uring which causes EMPER to only use the GlobalIoContexts for all IO. To submit SQEs to the io_uring SQ, a SubmitActor is used.

Futures can be in a new state where they are submitted to the SubmitActor but not to the io_uring yet. In this state (isSubmitted && !isPrepared) the Future must not be destroyed. To ensure this, we yield when forgetting a Future until it is prepared and thus safe to destroy.

This commit contains no optimizations (no batching, no trying a non-blocking syscall first, ...).

Refactor GlobalIoContext.cpp:

* rename globalCompleter to completer
* make the completer loop non-static
-
- Aug 02, 2021
-
-
Florian Schmaus authored
-
- Jul 14, 2021
-
-
Florian Fischer authored
Design goals
============

* Wake up either on external newWork notifications or on local IO completions
  -> Sleep strategy is sound without the IO completer
* Do as little as possible in a system saturated with work
* Pass a hint where to find new work to suspended workers

Algorithm
=========

Data:
    Global:
        hint pipe
        sleepers count
    Per worker:
        dispatch hint buffer
        in flight flag

Sleep:
    if we have no sleep request in flight
        Atomic increment sleep count
        Remember that we are sleeping
        Prepare read cqe from the hint pipe to dispatch hint buffer
    Prevent the completer from reaping completions on this worker's IoContext
    Wait until IO completions occurred

NotifyEmper(n):
    if observed sleepers <= 0
        return

    // Determine how many we are responsible to wake
    do
        toWakeup = min(observed sleepers, n)
    while (!CAS(sleepers, toWakeup))

    write toWakeup hints to the hint pipe

NotifyAnywhere(n):
    // Ensure all n notifications take effect
    while (!CAS(sleepers, observed sleepers - n))
        if observed sleeping <= -n
            return

    toWakeup = min(observed sleeping, n)
    write toWakeup hints to the hint pipe

onNewWorkCompletion:
    reset in flight flag
    allow completer to reap completions on this IoContext

Notes
=====

* We must decrement the sleepers count on the notifier side to prevent multiple notifiers from all observing the same number of sleepers, trying to wake up the same sleepers by writing to the pipe, jamming it up with unconsumed hints, and thus blocking in the notify write, resulting in a deadlock.
* The CAS loops on the notifier side are needed because decrementing and then incrementing the excess is racy: two notifiers can observe the sum of both their excess decrements and increment too much, resulting in a broken counter.
* Add the dispatch hint code in AbstractWorkStealingScheduler::nextFiber. This allows workers to check the dispatch hint after there was no local work to execute. This is a trade-off where we trade slower wakeup - a just awoken worker will check for local work - against a faster dispatch hot path when we have work to do in our local WSQ.
* The completer thread must not reap completions on the IoContexts of sleeping workers because this introduces a race for cqes and a possible lost wakeup if the completer consumes the completions before the worker is actually waiting for them.
* When notifying sleeping workers from anywhere, we must ensure that all notifications take effect. This is needed, for example, when terminating the runtime to prevent sleep attempts from worker threads which are about to sleep but have not incremented the sleeper count yet. We achieve this by always decrementing the sleeper count by the notification count.

Thanks to Florian Schmaus <flow@cs.fau.de> for spotting bugs and suggesting improvements.
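The sleeper counter handling on the notifier side can be sketched in C++ as follows. This is one reading of the pseudo code under assumptions: SleeperCounter is a hypothetical class, the CAS in NotifyEmper is taken to store the observed count minus toWakeup, a negative result in NotifyAnywhere is clamped to zero hints, and a worker that increments a negative counter is assumed to skip sleeping because a notification is already pending. Writing the hints to the pipe is left out.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical sketch of the sleepers count; the hint pipe is omitted.
class SleeperCounter {
  std::atomic<int64_t> sleepers{0};

 public:
  // Sleep side. Assumption: a negative previous value means notifications
  // are pending, so the worker should not go to sleep.
  bool beginSleep() { return sleepers.fetch_add(1, std::memory_order_acq_rel) >= 0; }

  // Notifier inside the runtime: claims at most n sleepers and returns
  // how many hints the caller is responsible for writing.
  int64_t notifyEmper(int64_t n) {
    assert(n > 0);
    int64_t observed = sleepers.load(std::memory_order_relaxed);
    int64_t toWakeup = 0;
    do {
      if (observed <= 0) return 0;
      toWakeup = std::min(observed, n);
    } while (!sleepers.compare_exchange_weak(observed, observed - toWakeup,
                                             std::memory_order_acq_rel,
                                             std::memory_order_relaxed));
    return toWakeup;
  }

  // Notifier from anywhere: always subtracts n so that all n notifications
  // take effect, even for workers that have not incremented the count yet.
  int64_t notifyAnywhere(int64_t n) {
    assert(n > 0);
    int64_t observed = sleepers.load(std::memory_order_relaxed);
    while (!sleepers.compare_exchange_weak(observed, observed - n,
                                           std::memory_order_acq_rel,
                                           std::memory_order_relaxed)) {
      // Enough notifications are already pending.
      if (observed <= -n) return 0;
    }
    // Assumption: never report a negative number of hints.
    return std::max<int64_t>(0, std::min(observed, n));
  }
};
```

Because each notifier claims its sleepers through the CAS before writing, two concurrent notifiers cannot both write hints for the same sleeper, which is the pipe-jamming scenario the first note describes.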
-
- May 05, 2021
-
-
Florian Schmaus authored
-
- Mar 23, 2021
-
-
Florian Fischer authored
Available behaviors:

* none - the completer thread is not started
* schedule (default) - the completer thread will reap and schedule available completions from worker IoContexts
* wakeup - the completer thread will wake up all workers if it observes completions in a worker IoContext. The Fiber produced by the completion will be scheduled when the worker in whose IoContext the cqe lies reaps its completions.
-