  1. Jan 26, 2021
    • Florian Schmaus's avatar
[Scheduler] s/dequeFiberFromAnywhereQueue/dequeueFiberFromAnywhereQueue/ · e51a130e
      Florian Schmaus authored
The operation is called 'dequeue'; a 'deque' is a double-ended queue.
      e51a130e
    • Florian Schmaus's avatar
      Merge branch 'worker_exclusive_uring' into 'master' · f3d39a5c
      Florian Schmaus authored
      Worker exclusive uring
      
      See merge request i4/manycore/emper!54
      f3d39a5c
    • Florian Fischer's avatar
      [IO] add an echo client implementation suitable for > C10k experiments · 07782b4f
      Florian Fischer authored
The echo client establishes X connections and starts the
echo phase after all sockets are connected.
      
      Each client Fiber measures the time from sending the message until
      receiving the echo.
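
A hedged sketch of that per-echo measurement using std::chrono; the
sendAndWait/recvAndWait helpers are assumed names, not the verbatim API:

   #include <chrono>

   void echoOnce(int socket, char* buf, size_t len) {
   	auto start = std::chrono::steady_clock::now();
   	emper::io::sendAndWait(socket, buf, len, 0);  // assumed helper
   	emper::io::recvAndWait(socket, buf, len, 0);  // assumed helper
   	// Per-Fiber latency: time from sending the message to receiving the echo.
   	auto latency = std::chrono::steady_clock::now() - start;
   	(void)latency;
   }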
      07782b4f
    • Florian Fischer's avatar
      [IO] add a simple echo server app · e34f1f5d
      Florian Fischer authored
      e34f1f5d
    • Florian Fischer's avatar
[IO] introduce emper::io, an IO subsystem using io_uring · 460c2f05
      Florian Fischer authored
Emper's IO design is based on the proactor pattern: each worker
can issue IO requests through its exclusive IoContext object, which wraps an
io_uring instance.
      
IO completions are reaped in four places:
      1. After a submit to collect inline completions
      2. Before dispatching a new Fiber
      3. When no new IO can be submitted because the completion queue is full
      4. And by a global completer thread which gets notified about completions
         on worker IoContexts through registered eventfds
      
All IO requests are modeled as Future objects, which can be
instantiated and submitted manually, retrieved from POSIX-like non-blocking
functions, or used implicitly through POSIX-like blocking functions.
      
The user-facing API is exported in the following headers:
      * emper/io.hpp (POSIX-like)
      * emper.h (POSIX-like)
      * emper/io/Future.hpp
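
A minimal sketch of the two usage styles; the function and class names
below (recvAndWait, SendFuture, submitAndWait) are assumptions derived from
the description above, not the verbatim API:

   #include "emper/io.hpp"         // POSIX-like blocking wrappers (assumed)
   #include "emper/io/Future.hpp"  // explicit Future objects (assumed)

   void echo(int socket, char* buf, size_t len) {
   	// Implicit style: a POSIX-like call that blocks only the calling Fiber.
   	ssize_t received = emper::io::recvAndWait(socket, buf, len, 0);

   	// Explicit style: instantiate the Future manually, submit it, and
   	// block the Fiber until the result is available.
   	emper::io::SendFuture send(socket, buf, static_cast<size_t>(received), 0);
   	ssize_t sent = send.submitAndWait();
   	(void)sent;
   }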
      
Catching short writes/reads/sends and resubmitting the request without
unblocking the Fiber is supported.
      
AlarmFuture objects give Fibers an emper-native way to sleep for
a given time.
      
IO request timeouts are supported with the TimeoutWrapper class.
Request cancellation is supported with Future::cancel() or the
CancelWrapper Future class.
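
A hedged sketch of how such a wrapper might be used; the constructor
signatures and the timeout representation are assumptions:

   // Hypothetical usage: fail the recv if it takes longer than one second.
   emper::io::RecvFuture recv(socket, buf, len, 0);
   emper::io::TimeoutWrapper timedRecv(recv, {/* s */ 1, /* ns */ 0});
   ssize_t res = timedRecv.submitAndWait();  // assumed to return an error on timeout
   // Alternatively, cancel an in-flight request via recv.cancel() or a
   // CancelWrapper around the Future, per the description above.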
      
A proactor design demands that buffers stay committed to the kernel
as long as the request is active. To guarantee memory safety, Futures
are canceled in their destructor, which returns only after the committed
memory is safe to use again.
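
A minimal sketch of that destructor invariant, assuming hypothetical
completed/cancel() members:

   Future::~Future() {
   	// While the request is in flight the kernel may still write to the
   	// committed buffer, so cancel and wait before the storage is reused.
   	if (!completed) cancel();  // returns only after the kernel released the buffer
   }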
      
Linking Futures into chains is supported using the Future::SetDependency()
method. Futures are submitted when the last Future of their chain gets
submitted. A linked request will start once the previous one has finished.
Errors and partial completions cancel the not-yet-started tail of a chain.
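
For example, a hedged sketch of an echo implemented as a two-request chain;
RecvFuture, SendFuture and submitAndWait() are assumed names:

   emper::io::RecvFuture recv(socket, buf, len, 0);
   emper::io::SendFuture send(socket, buf, len, 0);
   send.SetDependency(recv);             // send starts only after recv finished
   ssize_t sent = send.submitAndWait();  // submitting the last Future submits the chain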
      
      TODO: Handle possible situations where the CQ of the global completer is full
      and no more sqe can be submitted to the SQ.
      460c2f05
    • Florian Fischer's avatar
      [Blockable] add global set of all blocked contexts for debugging · a745c865
      Florian Fischer authored
      This feature must be activated using the blocked_context_set meson option.
      a745c865
  2. Jan 25, 2021
  3. Jan 22, 2021
  4. Jan 14, 2021
  5. Jan 13, 2021
  6. Jan 12, 2021
  7. Jan 11, 2021
  8. Jan 06, 2021
  9. Jan 05, 2021
  10. Jan 04, 2021
  11. Dec 19, 2020
    • Florian Fischer's avatar
      [Runtime] notify only one sleeping worker on new work · c2686509
      Florian Fischer authored
This prevents a thundering herd effect on the AnywhereQueue and the workerSleep mutex.
      
On workloads where not all workers are busy this greatly reduces the used
CPU time because not all workers wake up just to sleep again.

For high-intensity workloads where all workers are busy handling their own
work, this change should have no impact because sleeping workers are rare.
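
A hedged sketch of the change in terms of a plain std::condition_variable;
emper's actual sleep/wake mechanism may differ:

   #include <condition_variable>
   #include <mutex>

   std::mutex workerSleepMutex;
   std::condition_variable workerSleepCv;

   void notifyWorkers() {
   	// Before: notify_all() woke every sleeping worker, but only one could
   	// win the new work item and the rest went back to sleep (herd effect).
   	// After: wake exactly one sleeping worker per new work item.
   	workerSleepCv.notify_one();
   }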
      
This claim is backed by the experiments I did on faui49big02 (40 cores / 80 hw threads).
I measured the time and resources used by our tests/EchoServer handling
an increasing number of connections (one connection per client process,
each connection issuing 100000 echos)
      
             con   time for 100k echos [ns]   user-time[s]   sys-time[s]   cpu-used
notify_all     1                 49008685297          217.86        626.26      1650%
notify_one     1                 31304750273            9.40          8.33        53%

notify_all    10                 76487793595          665.45       1295.19      2484%
notify_one    10                 35674140605          188.77         68.26       656%

...

notify_all    40                102469333659         4255.30        363.86      4399%
notify_one    40                105289161995         4167.43        322.69      4169%

notify_all    80                 76883202092         3418.44        409.64      4762%
notify_one    80                 68856748614         2878.56        397.66      4548%
      
I would not fully trust these numbers because they are from only one run,
and quite a bit of randomness is inherent to emper because of its work-stealing
scheduling. Nonetheless they show three interesting points:
1. CPU usage for low-intensity workloads is drastically reduced.
2. The impact of notify_one gets smaller the more intense the workload gets.
3. Somehow emper performs significantly worse for 40 connections than for 80.
      
      Command used to generate results:
      for i in 1 10 20 30 40 80; do /usr/bin/time -v build-release/tests/EchoServer 2>> notify_all.txt & sleep 2; tests/echo_client.py -p ${i} -c 1 -i 100000 >> notify_all.txt && echo "quit" | nc localhost 12345; done
      
      Full results can be found here:
      notify_all: https://termbin.com/6zba
      notify_one: https://termbin.com/3bsi
      c2686509
  12. Dec 18, 2020
  13. Dec 17, 2020
    • Florian Schmaus's avatar
      Merge branch 'ubmpscq_handle_spurious_wakeup' into 'master' · a84a8752
      Florian Schmaus authored
      handle UnboundedBlockingMpscQueue spurious wake-ups
      
      Closes #4
      
      See merge request i4/manycore/emper!57
      a84a8752
    • Florian Schmaus's avatar
      Merge branch 'test-laws' into 'master' · 59bcc092
      Florian Schmaus authored
      Add meson option for scheduling strategy and according CI jobs
      
      See merge request !53
      59bcc092
    • Florian Fischer's avatar
      handle UnboundedBlockingMpscQueue spurious wake-ups · 82cf159a
      Florian Fischer authored
A spurious wake-up can be produced by the new UnblockOnMainActorTest, which
triggers the assert(!mpscQueue.empty()) in UnboundedBlockingMpscQueue::get.

Such spurious wake-ups are possible because the push and wake-up pair in
UnboundedBlockingMpscQueue::put is not atomic.
      The following sequence diagram demonstrates a spurious wake-up:
      
   T1          T2            Q
   .           .            { }
  put(e)       .            { }
  push(e)      .            {e}
   .         get()          {e}
   .        consume e       { }
   .           .            { }
   .         get()          { }
   .         block          { }
  unblock      .            { }
   .           .            { }
   .         wakeup         { }
   .           .            { }
               X
       assert(!queue.empty())
      
To deal with spurious wake-ups we recheck the wake-up condition (a non-empty
queue) and block again if we find the queue empty.
We assume spurious wake-ups are rare because they were difficult to reproduce
even with a dedicated test (the new UnblockOnMainActorTest); therefore we
declare the empty-queue branch as unlikely.
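
A minimal sketch of that recheck loop, assuming hypothetical block() and
pop() helpers rather than emper's exact internals:

   template <typename T>
   T UnboundedBlockingMpscQueue<T>::get() {
   	block();  // sleep until a put() wakes this worker
   	// A wake-up does not guarantee an element: the push/wake-up pair in
   	// put() is not atomic, so recheck and block again on an empty queue.
   	while (unlikely(mpscQueue.empty())) {  // unlikely: the branch hint from above
   		block();
   	}
   	return mpscQueue.pop();
   }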
      
      Fixes #4.
      82cf159a
    • Florian Fischer's avatar
      [Runtime] don't allocate threads array twice · e7dee47b
      Florian Fischer authored
The threads array is initialized in the Runtime::Runtime initializer list
and then again in the constructor body.
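
A hedged sketch of the bug class being fixed; the member names are
assumptions:

   #include <thread>

   class Runtime {
   	std::thread* threads;

    public:
   	explicit Runtime(unsigned workerCount)
   	    : threads(new std::thread[workerCount]) {  // first allocation
   		// The bug: a second allocation here leaked the array
   		// created in the initializer list above.
   		// threads = new std::thread[workerCount];
   	}
   	~Runtime() { delete[] threads; }
   };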
      e7dee47b