  1. Feb 07, 2022
  2. Jan 23, 2022
  3. Jan 14, 2022
  4. Jan 11, 2022
    • [meson] add boost as dependency · e976a496
      Florian Fischer authored
      I set up a new development environment and emper did not compile because
      emper::io::Stats uses the circular_buffer provided by boost.
      Boost was not installed and our build system failed to detect it.
      
      This change adds the header-only boost dependency to emper.
      https://mesonbuild.com/Dependencies.html#boost
      The header-only dependency is enough to build emper's default configuration.
      
      When linking against boost is required we use the 'modules' kwarg.
  5. Dec 14, 2021
    • fdff953f
    • [IO] overhaul SQPOLL support · 50c965e4
      Florian Fischer authored
      Two meson options control the io_uring sqpoll feature:
      * io_uring_sqpoll - enable sq polling
      * io_uring_shared_poller - share the polling thread between all io_urings
      
      Since kernel 5.12 IORING_SETUP_ATTACH_WQ only causes sharing of the
      poller thread, not the work queues.
      See: https://github.com/axboe/liburing/issues/349
      
      When using SQPOLL userspace has no good way to know how many sqes
      the kernel has consumed. Therefore we wait for available sqes
      using io_uring_sqring_wait if there was no usable sqe.
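
      A minimal sketch of this wait-for-sqe pattern using liburing (the
      getSqe helper is hypothetical; io_uring_get_sqe and
      io_uring_sqring_wait are the liburing calls named above):

      #include <liburing.h>

      // Get an SQE, waiting for the kernel-side poller to consume
      // entries when the SQ ring is currently full (SQPOLL mode).
      static struct io_uring_sqe* getSqe(struct io_uring* ring) {
      	struct io_uring_sqe* sqe = io_uring_get_sqe(ring);
      	while (!sqe) {
      		// Blocks until at least one SQ ring entry is usable again.
      		if (io_uring_sqring_wait(ring) < 0) return nullptr;
      		sqe = io_uring_get_sqe(ring);
      	}
      	return sqe;
      }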
      
      Remove the GlobalIoContext::registerLock and register all worker
      io_uring eventfd reads at the beginning of the completer function.
      Also register all the worker io_uring eventfds since they never change
      and it hopefully reduces overhead in the global io_uring.
  6. Dec 10, 2021
    • Introduce waitfree workstealing · 1c538024
      Florian Fischer authored
      Waitfree work stealing is configured with the meson option
      'waitfree_work_stealing'.
      
      The retry logic is intentionally left in the Queues and not lifted to
      the scheduler to reuse the load of an unsuccessful CAS.
      
      Consider the following pseudo code examples:

      steal() -> bool:
        load
      loop:
        if empty return EMPTY
        cas
        if not WAITFREE and not cas:
          goto loop
        return cas ? STOLEN : LOST_RACE

      outer():
        steal()

      steal() -> res:
        load
        if empty return EMPTY
        cas
        return cas ? STOLEN : LOST_RACE

      outer():
      loop:
        res = steal()
        if not WAITFREE and res == LOST_RACE:
          goto loop

      In the second example the value loaded by a possibly unsuccessful CAS
      cannot be reused, and a loop of unsuccessful CAS' will result in
      double loads.
      
      The number of retries is configurable through a template variable
      maxRetries (see the sketch below):
      * maxRetries < 0: retry indefinitely
      * maxRetries >= 0: retry at most maxRetries times
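
      A C++ sketch of the second variant with bounded retries (the
      single-word queue head and the isEmpty/advance helpers are
      assumptions for illustration): on failure compare_exchange_weak
      writes the freshly read value back into its expected argument,
      which is exactly the reused load described above.

      #include <atomic>
      #include <cstdint>

      enum class StealResult { Empty, Stolen, LostRace };

      static bool isEmpty(uint64_t head) { return head == 0; }     // hypothetical
      static uint64_t advance(uint64_t head) { return head + 1; }  // hypothetical

      // maxRetries < 0: retry indefinitely; maxRetries >= 0: bounded retries.
      template <int maxRetries>
      StealResult steal(std::atomic<uint64_t>& head) {
      	uint64_t observed = head.load(std::memory_order_acquire);  // single explicit load
      	int retries = maxRetries;
      	for (;;) {
      		if (isEmpty(observed)) return StealResult::Empty;
      		// A failed CAS stores the current value into 'observed'.
      		if (head.compare_exchange_weak(observed, advance(observed),
      		                               std::memory_order_acq_rel))
      			return StealResult::Stolen;
      		if (maxRetries >= 0 && retries-- <= 0) return StealResult::LostRace;
      	}
      }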
  7. Dec 06, 2021
    • [meson] set check_anywhere_queue_while_stealing automatic · 7da8e687
      Florian Fischer authored
      We introduced the check_anywhere_queue_while_stealing configuration
      as an optimization to get IO completions reaped by the completer
      into the normal WSQ faster.
      But emper now has configurations where we don't use a completer,
      making this optimization useless or even harmful.

      By default the value of check_anywhere_queue_while_stealing is now
      decided automatically based on the value of io_completer_behavior.
  8. Nov 10, 2021
  9. Oct 13, 2021
    • [meson] introduce lockless memory order and rename lockless option · 67b0c77a
      Florian Fischer authored
      The lockless algorithm can now be enabled by setting -Dio_lockless_cq=true
      and the memory ordering it uses by setting -Dio_lockless_memory_order={weak,strong}.
      
      io_lockless_memory_order=weak:
          read with acquire
          write with release
      
      io_lockless_memory_order=strong:
          read with seq_cst
          write with seq_cst
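
      A sketch of what the two settings translate to in C++ (the macro
      name is made up for illustration):

      #include <atomic>

      #ifdef IO_LOCKLESS_MEMORY_ORDER_WEAK
      // io_lockless_memory_order=weak
      constexpr std::memory_order readOrder = std::memory_order_acquire;
      constexpr std::memory_order writeOrder = std::memory_order_release;
      #else
      // io_lockless_memory_order=strong
      constexpr std::memory_order readOrder = std::memory_order_seq_cst;
      constexpr std::memory_order writeOrder = std::memory_order_seq_cst;
      #endif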
  10. Oct 11, 2021
    • [IoContext] implement lockless CQ reaping · d9d350d9
      Florian Fischer authored
      TODO: think about stats and possible ring buffer pointer overflows and ABA problems.
    • implement IO stealing · 0abc29ad
      Florian Fischer authored
      IO stealing is analogous to work-stealing: worker threads
      without work try to steal IO completions (CQEs) from other workers'
      IoContexts. The work-stealing algorithm is modified to check a victim's
      CQ after finding its work queue empty.
      
      This approach, in combination with future additions (global notifications
      on IO completions and lock-free CQE consumption), is a realistic candidate
      to replace the completer thread without losing its benefits.
      
      To allow IO stealing the CQ must be synchronized which is already the
      case with the IoContext::cq_lock.
      Currently stealing workers always try to pop a single CQE (this could
      be made configurable).
      Steal attempts are recorded in the IoContext's Stats object and
      successfully stolen IO continuations in the AbstractWorkStealingWorkerStats.
      
      I moved the code transforming CQEs into continuation Fibers from
      reapCompletions into a separate function to make the rather complicated
      function more readable and thus easier to understand.
      
      Remove the default CallerEnvironment template arguments to make
      the code more explicit and prevent easy errors (not propagating
      the caller environment or forgetting the function takes a caller environment).
      
      io::Stats now need to use atomics because multiple threads may increment
      them in parallel from the EMPER and OWNER caller environments.
      And since using std::atomic<T*> in a std::map is not easily possible we
      use the compiler's __atomic_* builtins.
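
      A sketch of such an increment using the builtins (the counter
      layout is an assumption):

      #include <cstdint>

      // Owner and stealing workers may bump the same counter in
      // parallel, so a plain increment would be a data race.
      inline void incrementStat(uint64_t* counter, uint64_t n = 1) {
      	__atomic_fetch_add(counter, n, __ATOMIC_RELAXED);
      }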
      
      Add, adjust and fix some comments.
  11. Sep 22, 2021
  12. Aug 19, 2021
  13. Aug 18, 2021
    • [IO] Implement configurable "simple architecture" · 06b5bf0f
      Florian Fischer authored
      Introduce a new meson option io_single_uring which causes EMPER
      to only use the GlobalIoContext for all IO.
      
      To submit SQEs to the io_uring SQ, the SubmitActor is used.
      
      Futures can be in a new state where they are submitted to the SubmitActor
      but not to the io_uring yet.
      In this state (isSubmitted && !isPrepared) the Future must not be destroyed.
      To ensure this we yield when forgetting a Future until it is prepared
      and thus safe to destroy.
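
      A sketch of that rule with stand-in types (Future's flags and the
      yield are assumptions for illustration):

      #include <atomic>

      struct Future {
      	std::atomic<bool> submitted{false};  // handed to the SubmitActor
      	std::atomic<bool> prepared{false};   // sqe prepared on the io_uring
      };

      inline void yieldFiber() { /* stand-in for emper's fiber yield */ }

      // Do not destroy a Future while isSubmitted && !isPrepared:
      // yield until the SubmitActor has prepared it.
      void forgetFuture(Future& future) {
      	while (future.submitted.load() && !future.prepared.load())
      		yieldFiber();  // give the SubmitActor a chance to run
      	// now destroying the Future is safe
      }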
      
      This commit contains no optimizations (no batching, no trying the
      non-blocking syscall first, ...).
      
      Refactor GlobalIoContext.cpp:
      
      * rename globalCompleter to completer
      * make the completer loop non-static
  14. Aug 02, 2021
  15. Jul 14, 2021
    • implement a pipe based sleep strategy using the IO subsystem · 4ec30fd4
      Florian Fischer authored
      Design goals
      ============
      
      * Wakeup either on external newWork notifications or on local IO completions
        -> the sleep strategy is sound without the IO completer
      * Do as little as possible in a system saturated with work
      * Pass a hint where to find new work to suspended workers
      
      Algorithm
      =========
      
      Data:
      	Global:
      		hint pipe
      		sleepers count
      	Per worker:
      		dispatch hint buffer
      		in flight flag
      
      Sleep:
      	if we have no sleep request in flight
      		Atomically increment the sleepers count
      		Remember that we are sleeping
      		Prepare a read from the hint pipe into the dispatch hint buffer
      	Prevent the completer from reaping completions on this worker's IoContext
      	Wait until IO completions occurred
      
      NotifyEmper(n):
      	if observed sleepers <= 0
      		return
      
      	// Determine how many we are responsible to wake
      	do
      		toWakeup = min(observed sleepers, n)
      	while (!CAS(sleepers, observed sleepers - toWakeup))
      
      	write toWakeup hints to the hint pipe
      
      NotifyAnywhere(n):
      	// Ensure all n notifications take effect
      	while (!CAS(sleepers, observed sleepers - n))
      		if observed sleepers <= -n
      			return
      
      	toWakeup = min(observed sleepers, n)
      	write toWakeup hints to the hint pipe
      
      onNewWorkCompletion:
      	reset in flight flag
      	allow completer to reap completions on this IoContext
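
      A C++ sketch of the NotifyEmper loop above (the hint pipe write is
      elided; names are assumptions):

      #include <algorithm>
      #include <atomic>
      #include <cstdint>

      static std::atomic<int64_t> sleepers{0};

      void notifyEmper(int64_t n) {
      	int64_t observed = sleepers.load();
      	int64_t toWakeup;
      	do {
      		if (observed <= 0) return;  // nobody is sleeping
      		toWakeup = std::min(observed, n);
      		// Decrement on the notifier side so concurrent notifiers
      		// cannot claim the same sleepers (see Notes below).
      	} while (!sleepers.compare_exchange_weak(observed, observed - toWakeup));
      	// write toWakeup hints to the hint pipe (elided)
      }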
      
      Notes
      =====
      
      * We must decrement the sleepers count on the notifier side to
        prevent multiple notifiers from observing the same amount of sleepers
        and trying to wake up the same sleepers by writing to the pipe,
        jamming it up with unconsumed hints and thus blocking in the notify
        write, resulting in a deadlock.
      * The CAS loops on the notifier side are needed because decrementing
        and then incrementing back any excess is racy: two notifiers can both
        observe the combined excess of their decrements and increment back
        too much, resulting in a broken counter.
      * Add the dispatch hint code in AbstractWorkStealingScheduler::nextFiber.
        This allows workers to check the dispatch hint when there
        was no local work to execute.
        This is a trade-off where we trade slower wakeup - a just awoken worker
        will check for local work first - against a faster dispatch hot path when
        we have work to do in our local WSQ.
      * The completer thread must not reap completions on the IoContexts of
        sleeping workers because this introduces a race for cqes and a possible
        lost wakeup if the completer consumes the completions before the worker
        is actually waiting for them.
      * When notifying sleeping workers from anywhere we must ensure that all
        notifications take effect. This is needed, for example, when terminating
        the runtime, to prevent sleep attempts from worker threads which are
        about to sleep but have not incremented the sleepers count yet.
        We achieve this by always decrementing the sleepers count by the
        notification count.
      
      Thanks to Florian Schmaus <flow@cs.fau.de> for spotting bugs and suggesting
      improvements.
  16. May 05, 2021
  17. Mar 23, 2021
    • [IO] make the behavior of the completer thread configurable · 5ea44519
      Florian Fischer authored
      Available behaviors:
        * none - the completer thread is not started
      
        * schedule (default) - the completer thread will reap and schedule available
                               completions from worker IoContexts
      
        * wakeup - the completer thread will wake up all workers if it observes
                   completions in a worker IoContext. The Fiber produced by a
                   completion will be scheduled when the worker in whose
                   IoContext the cqe lies reaps its completions.
  18. Mar 12, 2021
  19. Mar 09, 2021
    • [IO] make the lock implementation protecting an IoContext's cq configurable · a619ba3e
      Florian Fischer authored
      This change introduces a new synchronization primitive "PseudoCountingTryLock"
      which takes an actual lock type as template parameter and provides the
      CountingTryLock interface.
      By using a PseudoCountingTryLock we don't have to change any synchronization
      code in IoContext::reapCompletion.
      
      Since all PseudoCountingTryLock code is defined in a header the compiler
      should see our constant return values and hopefully optimize away any
      check depending on them.
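
      A sketch of the idea, assuming a CountingTryLock-style interface
      of try_lock() plus an unlock() that reports contention:

      #include <cstdint>
      #include <mutex>

      // Wraps an ordinary lock behind the counting interface so the
      // synchronization code in IoContext::reapCompletion stays unchanged.
      template <typename Lock>
      class PseudoCountingTryLock {
      	Lock lock;

       public:
      	bool try_lock() { return lock.try_lock(); }

      	// A real CountingTryLock reports how many lock attempts happened
      	// while it was held; the pseudo variant constantly reports none,
      	// which the inliner can use to drop the dependent checks.
      	uint32_t unlock() {
      		lock.unlock();
      		return 0;
      	}
      };

      using CqLock = PseudoCountingTryLock<std::mutex>;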
      
      Options:
      * spin_lock - naive CAS spin lock
      * mutex - std::mutex
      * counting_try_lock (default) - our own lightweight special
                                      purpose synchronization primitive
    • [meson] Fix 'iwyu' target for meson >= 0.57 · 08224cd2
      Florian Schmaus authored
      The run_target() function requires an absolute path in meson >= 0.57.
  20. Mar 08, 2021
  21. Mar 01, 2021
  22. Feb 26, 2021
    • Make LockedUnboundedQueue implementation configurable · 9b949e49
      Florian Fischer authored
      Available implementations, configurable through the meson option
      'locked_unbounded_queue_implementation', are:
      
      mutex - our current LockedUnboundedQueue implementation using std::mutex
      
      rwlock - An implementation using pthread_rwlock. The implementation tries
               to upgrade its rdlock, dropping it and acquiring a wrlock on failure
      
      shared_mutex - An implementation using std::shared_mutex.
               dequeue() acquires a shared lock at first, drops it and
               acquires a unique lock
      
      boost_shared_mutex - An implementation using boost::shared_mutex.
               dequeue() acquires an upgradable lock and upgrades it
               to a unique lock if necessary
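
      A sketch of the shared_mutex variant described above (illustrative
      only; the real LockedUnboundedQueue interface may differ):

      #include <mutex>
      #include <optional>
      #include <queue>
      #include <shared_mutex>

      template <typename T>
      class SharedMutexQueue {
      	std::shared_mutex mutex;
      	std::queue<T> queue;

       public:
      	void enqueue(T item) {
      		std::unique_lock lock(mutex);
      		queue.push(std::move(item));
      	}

      	std::optional<T> dequeue() {
      		{
      			// Cheap concurrent emptiness check under the shared lock.
      			std::shared_lock lock(mutex);
      			if (queue.empty()) return std::nullopt;
      		}
      		// Drop the shared lock, then acquire the unique lock.
      		std::unique_lock lock(mutex);
      		if (queue.empty()) return std::nullopt;  // raced with another dequeue
      		T item = std::move(queue.front());
      		queue.pop();
      		return item;
      	}
      };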
    • add a batch optimization for the global completer · 17776ba2
      Florian Fischer authored
      This change introduces new scheduleFromAnywhere methods which take
      a range of Fibers to schedule.
      
      Blockable gets a new method returning the fiber used to start
      the unblocked context. It is used by Future/PartialCompletableFuture
      to provide a way of completing the Future and returning the continuation
      Fiber to the caller so they may schedule the continuation how they want.
      
      If the meson option io_batch_anywhere_completions is set the global
      completer will collect all callback and continuation fibers and
      schedule them all at once when it is done reaping the completions.
      The idea is that taking the AnywhereQueue write lock and calling onNewWork
      must only be done once.
      
      TODO: investigate if onNewWork should be extended by an amountOfWork
      argument which determines how many workers can be awoken and have work to
      do. This should be trivial since our WorkerWakeupSemaphore implementations
      already support notify_many(), which may be implemented in terms of
      notify_all though.
  23. Feb 23, 2021
    • [WorkerWakeupSemaphore] add three possible implementations · 3cde3e16
      Florian Fischer authored
      LockedSemaphore is the already existing Semaphore using
      a mutex and a condition variable.
      PosixMutex is a thin wrapper around a POSIX semaphore.
      SpuriousFutexSemaphore is an atomic/futex-based implementation
      prone to spurious wakeups, which is fine for the worker wakeup use case.
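
      A sketch of the futex-based variant (illustrative; EMPER's actual
      implementation may differ):

      #include <atomic>
      #include <cstdint>
      #include <linux/futex.h>
      #include <sys/syscall.h>
      #include <unistd.h>

      // Spurious wakeups are acceptable here: a woken worker just
      // rechecks its queues and goes back to sleep if they are empty.
      class SpuriousFutexSemaphore {
      	std::atomic<uint32_t> value{0};

      	long futex(int op, uint32_t val) {
      		return syscall(SYS_futex, reinterpret_cast<uint32_t*>(&value),
      		               op, val, nullptr, nullptr, 0);
      	}

       public:
      	void notify_many(uint32_t n) {
      		value.fetch_add(1, std::memory_order_release);
      		futex(FUTEX_WAKE_PRIVATE, n);  // wake at most n waiters
      	}

      	void wait() {
      		// Sleep only if no notification is pending; FUTEX_WAIT
      		// rechecks the value in the kernel, and callers tolerate
      		// spurious returns.
      		if (value.exchange(0, std::memory_order_acquire) == 0)
      			futex(FUTEX_WAIT_PRIVATE, 0);
      	}
      };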
  24. Feb 22, 2021
  25. Feb 10, 2021
  26. Jan 26, 2021
    • [IO] introduce emper::io an IO subsystem using io_uring · 460c2f05
      Florian Fischer authored
      Emper's IO design is based on a proactor pattern where each worker
      can issue IO requests through its exclusive IoContext object, which wraps
      an io_uring instance.
      
      IO completions are reaped at 4 places:
      1. After a submit to collect inline completions
      2. Before dispatching a new Fiber
      3. When no new IO can be submitted because the completion queue is full
      4. And by a global completer thread which gets notified about completions
         on worker IoContexts through registered eventfds
      
      All IO requests are modeled as Future objects which can either be
      instantiated and submitted manually, retrieved from POSIX-like
      non-blocking functions, or used implicitly through POSIX-like
      blocking functions.
      
      User facing API is exported in the following headers:
      * emper/io.hpp (POSIX-like)
      * emper.h (POSIX-like)
      * emper/io/Future.hpp
      
      Catching short writes/reads/sends and resubmitting the request without
      unblocking the Fiber is supported.
      
      Using AlarmFuture objects Fibers have an emper-native way to sleep for
      a given time.
      
      IO request timeouts are supported with the TimeoutWrapper class.
      Request cancellation is supported with Future::cancel() or the
      CancelWrapper Future class.
      
      A proactor design demands that buffers are committed to the kernel
      as long as the request is active. To guarantee memory safety, Futures
      get canceled in their destructor, which will only return after the
      committed memory is free to use again.
      
      Linking Futures into chains is supported using the Future::SetDependency()
      method. Futures are submitted when the last Future of their chain gets
      submitted. A linked request will start when the previous one has finished.
      Errors or partial completions will cancel the not yet started tail of a chain.
      
      TODO: Handle possible situations where the CQ of the global completer is full
      and no more sqes can be submitted to the SQ.
    • [Blockable] add global set of all blocked contexts for debugging · a745c865
      Florian Fischer authored
      This feature must be activated using the blocked_context_set meson option.
  27. Jan 22, 2021
  28. Jan 13, 2021
  29. Jan 11, 2021
  30. Jan 05, 2021