  1. Sep 05, 2022
  2. Apr 24, 2022
  3. Apr 10, 2022
  4. Mar 24, 2022
  5. Mar 15, 2022
    • Florian Fischer's avatar
      depend on liburing 2.2 · 171ae9d4
      Florian Fischer authored
      Liburing 2.2 and Linux 5.18 support IORING_REGISTER_RING_FDS, which avoids
      the fget(ring_fd) overhead of each io_uring_enter call, as well as
      IORING_OP_MSG_RING, which greatly simplifies the IO-based sleep strategy code.
      171ae9d4
  6. Feb 28, 2022
    • Florian Schmaus's avatar
      Add stats_blocked_context(_count) · 2d43c259
      Florian Schmaus authored
      This further splits up the stats machinery into smaller parts.
      2d43c259
    • Florian Schmaus's avatar
      Add worker sleep stats, rework stats machinery · 46302a0f
      Florian Schmaus authored
      The idea of the new stats machinery is that 'stats' becomes an option
      that enables the basic stats gathering infrastructure in EMPER. At
      some point, it should become a non-user option, i.e., it should be
      removed from meson_options.txt. Then comes a layer of fine-grained
      stats control switches, which default to 'auto'. Third, a new option
      called 'stats_all' is added, which, if enabled, activates all
      fine-grained stats knobs that are set to 'auto'.
      46302a0f
  7. Feb 24, 2022
  8. Feb 18, 2022
  9. Feb 16, 2022
  10. Feb 15, 2022
  11. Feb 11, 2022
  12. Feb 10, 2022
  13. Feb 09, 2022
  14. Feb 07, 2022
  15. Jan 23, 2022
  16. Jan 14, 2022
  17. Jan 11, 2022
    • Florian Fischer's avatar
      [meson] add boost as dependency · e976a496
      Florian Fischer authored
      I set up a new development environment and emper did not compile because
      emper::io::Stats uses the circular_buffer provided by boost.
      Boost was not installed and our build system failed to detect the missing dependency.
      
      This change adds the header-only boost dependency to emper.
      https://mesonbuild.com/Dependencies.html#boost
      The header-only dependency is enough to build emper's default configuration.
      
      When linking against boost is required, we use the 'modules' kwarg.
      e976a496
  18. Dec 14, 2021
    • Florian Fischer's avatar
      fdff953f
    • Florian Fischer's avatar
      [IO] overhaul SQPOLL support · 50c965e4
      Florian Fischer authored
      Two meson options control the io_uring sqpoll feature:
      * io_uring_sqpoll - enable sq polling
      * io_uring_shared_poller - share the polling thread between all io_urings
      
      Since Linux 5.12, IORING_SETUP_ATTACH_WQ only causes sharing of the
      poller threads, not of the work queues.
      See: https://github.com/axboe/liburing/issues/349
      
      When using SQPOLL, userspace has no good way to know how many
      SQEs the kernel has consumed. Therefore, we wait for available
      SQEs using io_uring_sqring_wait if there was no usable SQE.
      
      Remove the GlobalIoContext::registerLock and register all worker
      io_uring eventfd reads at the beginning of the completer function.
      Also register all the worker io_uring eventfds since they never change
      and it hopefully reduces overhead in the global io_uring.
      50c965e4
  19. Dec 10, 2021
    • Florian Fischer's avatar
      Introduce waitfree workstealing · 1c538024
      Florian Fischer authored
      Waitfree work stealing is configured with the meson option
      'waitfree_work_stealing'.
      
      The retry logic is intentionally left in the Queues and not lifted to
      the scheduler to reuse the load of an unsuccessful CAS.
      
      Consider the following pseudo code examples:
      
      steal() -> bool:                       steal() -> res
        load                                   load
      loop:                                    if empty return EMPTY
        if empty return EMPTY                  cas
        cas                                    return cas ? STOLEN : LOST_RACE
        if not WAITFREE and not cas:
          goto loop                          outer():
        return cas ? STOLEN : LOST_RACE      loop:
                                               res = steal()
      outer():                                 if not WAITFREE and res == LOST_RACE:
        steal()                                  goto loop
      
      In the right example, the value loaded by a possibly unsuccessful CAS
      cannot be reused, and a loop of unsuccessful CASes results in
      double loads.
      
      The retries are configurable through a template variable maxRetries.
      * maxRetries < 0: retry indefinitely
      * maxRetries >= 0: retry at most maxRetries times
      1c538024
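The retry logic described in the commit above can be illustrated with a minimal C++ sketch. It is not EMPER's queue implementation: `ToyStealQueue`, `StealResult`, and the fixed-size array are hypothetical names chosen for illustration. The point it shows is that a failed `compare_exchange_strong` writes the current value back into its `expected` argument, so a retry inside the queue does not need a second explicit load.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

enum class StealResult { Stolen, Empty, LostRace };

// Hypothetical toy queue: a fixed array whose head index is advanced by thieves.
// maxRetries < 0 retries indefinitely, maxRetries >= 0 bounds the number of retries.
template <int maxRetries>
class ToyStealQueue {
  static constexpr std::size_t kCapacity = 8;
  std::array<int, kCapacity> items{};
  std::atomic<std::size_t> head{0};
  std::size_t tail;

 public:
  explicit ToyStealQueue(std::size_t count) : tail(count) {
    assert(count <= kCapacity);
    for (std::size_t i = 0; i < count; ++i) items[i] = 10 + static_cast<int>(i);
  }

  StealResult steal(int& out) {
    // The only explicit load; later iterations reuse what a failed CAS wrote back.
    std::size_t h = head.load(std::memory_order_acquire);
    int retries = 0;
    for (;;) {
      if (h >= tail) return StealResult::Empty;
      const int candidate = items[h];
      // On failure, compare_exchange_strong stores the current head into h.
      if (head.compare_exchange_strong(h, h + 1, std::memory_order_acq_rel,
                                       std::memory_order_acquire)) {
        out = candidate;
        return StealResult::Stolen;
      }
      if (maxRetries >= 0 && retries++ >= maxRetries) return StealResult::LostRace;
    }
  }
};
```

With `ToyStealQueue<0>` a lost race is reported immediately after the first failed CAS, while `ToyStealQueue<-1>` keeps retrying until the steal succeeds or the queue is empty.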
  20. Dec 06, 2021
    • Florian Fischer's avatar
      [meson] set check_anywhere_queue_while_stealing automatic · 7da8e687
      Florian Fischer authored
      We introduced the check_anywhere_queue_while_stealing configuration
      as an optimization to get the IO completions reaped by the completer
      faster into the normal WSQ.
      But EMPER now has configurations where we don't use a completer,
      making this optimization useless or even harmful.
      
      By default automatically decide the value of
      check_anywhere_queue_while_stealing based on the value of
      io_completer_behavior.
      7da8e687
  21. Nov 10, 2021
  22. Oct 13, 2021
    • Florian Fischer's avatar
      [meson] introduce lockless memory order and rename lockless option · 67b0c77a
      Florian Fischer authored
      The lockless algorithm can now be configured by setting -Dio_lockless_cq=true
      and the used memory ordering by setting -Dio_lockless_memory_order={weak,strong}.
      
      io_lockless_memory_order=weak:
          read with acquire
          write with release
      
      io_lockless_memory_order=strong:
          read with seq_cst
          write with seq_cst
      67b0c77a
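The mapping from the two option values to C++ memory orders, as listed in the commit above, might look like the following sketch. The names `LocklessMemoryOrder`, `Orders`, and `ToyCqHead` are hypothetical and only illustrate the idea; how EMPER actually wires the meson option into its code is not shown in the commit message.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical compile-time switch mirroring io_lockless_memory_order={weak,strong}.
enum class LocklessMemoryOrder { weak, strong };

template <LocklessMemoryOrder order>
struct Orders {
  // weak: read with acquire, write with release; strong: seq_cst for both.
  static constexpr std::memory_order read =
      order == LocklessMemoryOrder::weak ? std::memory_order_acquire
                                         : std::memory_order_seq_cst;
  static constexpr std::memory_order write =
      order == LocklessMemoryOrder::weak ? std::memory_order_release
                                         : std::memory_order_seq_cst;
};

// Toy stand-in for a shared ring index that is read and published locklessly.
template <LocklessMemoryOrder order>
class ToyCqHead {
  std::atomic<std::uint32_t> head{0};

 public:
  std::uint32_t load() const { return head.load(Orders<order>::read); }
  void publish(std::uint32_t value) { head.store(value, Orders<order>::write); }
};
```

Selecting the ordering through a template parameter keeps the choice a compile-time decision, so neither variant pays for a runtime branch.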
  23. Oct 11, 2021
    • Florian Fischer's avatar
      [IoContext] implement lockless CQ reaping · d9d350d9
      Florian Fischer authored
      TODO: think about stats and possible ring buffer pointers overflow and ABA.
      d9d350d9
    • Florian Fischer's avatar
      implement IO stealing · 0abc29ad
      Florian Fischer authored
      IO stealing is analogous to work stealing and means that worker threads
      without work will try to steal IO completions (CQEs) from other workers'
      IoContexts. The work stealing algorithm is modified to check a victim's
      CQ after finding their work queue empty.
      
      This approach, in combination with future additions (global notifications
      on IO completions and lock-free CQE consumption), is a realistic candidate
      to replace the completer thread without losing its benefits.
      
      To allow IO stealing, the CQ must be synchronized, which is already the
      case with the IoContext::cq_lock.
      Currently, stealing workers always try to pop a single CQE (this could
      be configurable).
      Steal attempts are recorded in the IoContext's Stats object and
      successfully stolen IO continuations in the AbstractWorkStealingWorkerStats.
      
      I moved the code transforming CQEs into continuation Fibers from
      reapCompletions into a separate function to make the rather complicated
      function more readable and thus easier to understand.
      
      Remove the default CallerEnvironment template arguments to make
      the code more explicit and prevent easy errors (not propagating
      the caller environment or forgetting the function takes a caller environment).
      
      io::Stats now needs to use atomics because multiple threads may increment
      its counters in parallel from EMPER and the OWNER.
      And since using std::atomic<T*> in std::map is not easily possible, we
      use the compiler's __atomic_* builtins.
      
      Add, adjust and fix some comments.
      0abc29ad
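The last point of the commit above, updating counters stored in a std::map with the compiler's `__atomic_*` builtins, can be sketched as follows. This is an illustration under assumptions, not the io::Stats code: `ToyStats` and `Op` are hypothetical, the counters are plain integers rather than the types EMPER uses, and the sketch assumes GCC or Clang. It also assumes the map's key set is fixed before any concurrent use, since only the increments are atomic, not structural changes to the map.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <map>

enum class Op { Read, Write };  // hypothetical operation kinds

class ToyStats {
  // Plain integers as mapped values; atomicity comes from the builtins below.
  std::map<Op, std::uint64_t> counters;

 public:
  // All keys are inserted up front so the map is never resized concurrently.
  ToyStats(std::initializer_list<Op> ops) {
    for (Op op : ops) counters.emplace(op, 0);
  }

  void record(Op op) {
    auto it = counters.find(op);
    assert(it != counters.end());
    // GCC/Clang builtin: atomic increment of a non-std::atomic object.
    __atomic_fetch_add(&it->second, 1, __ATOMIC_RELAXED);
  }

  std::uint64_t get(Op op) const {
    // at() throws std::out_of_range for an unknown key.
    return __atomic_load_n(&counters.at(op), __ATOMIC_RELAXED);
  }
};
```

Relaxed ordering is used here because the counters are only statistics; whether EMPER uses the same ordering is not stated in the commit message.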
  24. Sep 22, 2021
  25. Aug 19, 2021
  26. Aug 18, 2021
    • Florian Fischer's avatar
      [IO] Implement configurable "simple architecture" · 06b5bf0f
      Florian Fischer authored
      Introduce a new meson option io_single_uring which causes EMPER
      to only use the GlobalIoContexts for all IO.
      
      To submit SQEs to the io_uring SQ SubmitActor is used.
      
      Futures can be in a new state where they are submitted to the SubmitActor
      but not to the io_uring yet.
      In this state (isSubmitted && !isPrepared) the Future must not be destroyed.
      To ensure this, we yield when forgetting a Future until it is prepared
      and thus safe to destroy.
      
      This commit contains no optimizations (no batching, no try non blocking
      syscall first, ...)
      
      Refactor GlobalIoContext.cpp:
      
      * rename globalCompleter to completer
      * make the completer loop non-static
      06b5bf0f
  27. Aug 02, 2021