  1. May 04, 2021
    • Florian Fischer's avatar
      [IO] use our affinity capabilities in the io subsystem · 388bd882
      Florian Fischer authored
      Set the affinity for Fibers created from the completer thread
      to the workerId where the completion originated from.
      
      First, we split the Callback type into a user-facing and an internal one.
      The internal one has a new affinity member which provides the memory for
      the new Fiber's affinity buffer. This is memory safe because the Callback
      object is always heap allocated and only freed when the CallbackFiber terminates.
      
      Second, each Future has an affinity member as well, which is passed
      to the BinaryPrivateSemaphore when calling signalFromAnywhere to
      again hint at the originating worker.
      
      Both affinity values are set while preparing the Future in
      IoContext::prepareFutureChain, because at that point we know in which
      worker's IoContext the completion will be generated.
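      
      A minimal sketch of the split described above; the type names and member
      layout (UserCallback, InternalCallback, workeraffinity_t) are illustrative
      assumptions, not the actual emper code:
      
      	#include <cstdint>
      	#include <functional>
      	#include <utility>
      	
      	using workeraffinity_t = int32_t;
      	
      	// User-facing callback type: only the completion handler supplied by the user.
      	using UserCallback = std::function<void(int32_t result)>;
      	
      	// Internal callback type: wraps the user callback and additionally provides
      	// the memory backing the new Fiber's affinity. Because this object is heap
      	// allocated and only freed when the CallbackFiber terminates, the affinity
      	// storage stays valid for the whole lifetime of the Fiber using it.
      	struct InternalCallback {
      		UserCallback callback;
      		workeraffinity_t affinity = -1;  // set while preparing the Future (sketch)
      	
      		explicit InternalCallback(UserCallback cb) : callback(std::move(cb)) {}
      	};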
      388bd882
    • Florian Fischer's avatar
      [laws] also schedule Fibers to their priorityQueues from anywhere · 5b1c238e
      Florian Fischer authored
      First of all, I think it makes sense for laws to also consider the affinities
      of Fibers scheduled to the AnywhereQueue.
      The actual reason, however, is that IO completions scheduled from the completer
      can then be dispatched from the priorityQueue before they are found in the
      expensive AnywhereQueue.
      
      This comes at the cost of a virtual call (vcall) for scheduleFromAnywhere.
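      
      A rough sketch of the idea with simplified, assumed queue types and names;
      the real laws scheduler additionally keeps such Fibers reachable through the
      AnywhereQueue, which this sketch omits:
      
      	#include <cstddef>
      	#include <cstdint>
      	#include <deque>
      	#include <mutex>
      	#include <vector>
      	
      	struct Fiber {
      		int32_t affinity = -1;  // hinted worker id, -1 means "no affinity"
      	};
      	
      	class LawsSchedulerSketch {
      		struct WorkerQueues {
      			std::mutex mutex;
      			std::deque<Fiber*> priorityQueue;  // consulted early in the dispatch loop
      		};
      	
      		std::vector<WorkerQueues> workers;
      		std::mutex anywhereMutex;
      		std::deque<Fiber*> anywhereQueue;  // expensive: shared by all workers
      	
      	public:
      		explicit LawsSchedulerSketch(std::size_t workerCount) : workers(workerCount) {}
      	
      		// Virtual in the real scheduler interface so laws can override it
      		// (the vcall mentioned above); a plain member function in this sketch.
      		void scheduleFromAnywhere(Fiber* fiber) {
      			if (fiber->affinity >= 0 &&
      					static_cast<std::size_t>(fiber->affinity) < workers.size()) {
      				// Honor the affinity hint: make the Fiber dispatchable from the
      				// hinted worker's priorityQueue before the AnywhereQueue is searched.
      				auto& w = workers[static_cast<std::size_t>(fiber->affinity)];
      				std::lock_guard<std::mutex> lock(w.mutex);
      				w.priorityQueue.push_back(fiber);
      				return;
      			}
      			std::lock_guard<std::mutex> lock(anywhereMutex);
      			anywhereQueue.push_back(fiber);
      		}
      	};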
      5b1c238e
    • Florian Fischer's avatar
      [Runtime] switch batch schedule interface from STL to C array style · a328256c
      Florian Fischer authored
      Using the STL iterator-based approach has caused more harm than it
      has brought advantages.

      First of all, it is unneeded: we know how the Fibers we want to schedule
      are stored and therefore do not need a generic algorithmic abstraction.

      We keep the Fiber pointers in contiguous memory for memory locality benefits,
      so we can simply pass the Fibers to the scheduler C-style as a pointer
      and a count.
      
      This removes the template from the batched schedule methods and allows us
      to use virtual functions to specialize the batched scheduleFromAnywhere
      method.
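      
      A sketch of the resulting interface; class and parameter names are assumptions:
      
      	#include <cstddef>
      	
      	class Fiber;
      	
      	class SchedulerSketch {
      	public:
      		virtual ~SchedulerSketch() = default;
      	
      		// Old style (for comparison): an iterator-based template method cannot
      		// be virtual and therefore cannot be specialized by a subclass.
      		//
      		//   template <class InputIt>
      		//   void scheduleFromAnywhere(InputIt begin, InputIt end);
      	
      		// New style: the Fiber* already live in contiguous memory, so a pointer
      		// plus a count suffices -- and as a plain (non-template) method it can
      		// be virtual and overridden, e.g. by a locality-aware scheduler.
      		virtual void scheduleFromAnywhere(Fiber** fibers, std::size_t count) = 0;
      	};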
      a328256c
    • Florian Schmaus's avatar
      Merge branch 'laws-set-affinity' into 'master' · 60c3afb7
      Florian Schmaus authored
      [LAWS] Set fiber affinity *before* dispatching it, not after
      
      See merge request !189
      60c3afb7
  2. May 03, 2021
  3. May 01, 2021
  4. Apr 30, 2021
  5. Apr 27, 2021
  6. Apr 21, 2021
    • Florian Fischer's avatar
      [IO] provide memory to store continuation Fiber* by the caller · fe367535
      Florian Fischer authored
      The reason for this change is that we suspect the dynamic memory allocation
      for the return vector in reapCompletions to be rather costly.

      We know the maximum amount of memory needed to store the continuation Fiber*
      at compile time because the size of the CQs is fixed.
      This allows the caller to pass a buffer big enough to store all possible
      continuation Fiber* to reapCompletions, removing the need to dynamically
      allocate the returned vector.
      The caller has to ensure that the memory is free of data races.
      Currently this is ensured by allocating the buffer on the stack.
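      
      A sketch of the resulting calling convention; the names and the CQ size
      constant are assumptions:
      
      	#include <array>
      	#include <cstddef>
      	
      	class Fiber;
      	
      	class IoContextSketch {
      	public:
      		// Upper bound on completions per reap, known at compile time because
      		// the completion queue (CQ) size is fixed.
      		static constexpr std::size_t CQ_ENTRIES = 256;
      	
      		// The caller provides the storage for the continuation Fiber*, so no
      		// vector has to be allocated per reap. The caller must ensure the
      		// buffer is free of data races, e.g. by allocating it on its stack.
      		std::size_t reapCompletions(Fiber** continuations, std::size_t capacity) {
      			std::size_t reaped = 0;
      			// ... pop up to `capacity` completions from the CQ and store each
      			// continuation Fiber* into continuations[reaped++] (omitted) ...
      			(void)continuations;
      			(void)capacity;
      			return reaped;
      		}
      	};
      	
      	// Usage sketch: a stack-allocated buffer sized for the whole CQ.
      	inline void reapExample(IoContextSketch& io) {
      		std::array<Fiber*, IoContextSketch::CQ_ENTRIES> continuationBuf;
      		std::size_t n = io.reapCompletions(continuationBuf.data(), continuationBuf.size());
      		(void)n;  // schedule the n continuation Fibers ...
      	}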
      fe367535
  7. Apr 20, 2021
  8. Apr 19, 2021
  9. Apr 18, 2021
    • Florian Schmaus's avatar
      Merge branch 'worker_wakeup_strategy' into 'master' · b5efa992
      Florian Schmaus authored
      introduce new CRTP based worker sleep algorithm abstraction
      
      See merge request !172
      b5efa992
    • Florian Fischer's avatar
      introduce new CRTP based worker sleep algorithm abstraction · 03bdd4c3
      Florian Fischer authored
      Introduce AbstractWorkerSleepStrategy, a CRTP interface class for worker
      sleep algorithms.
      A concrete WorkerSleepStrategy must implement the following functions:
      
      	template <CallerEnvironment callerEnvironment>
      	void notifyMany(unsigned count);
      
      	template <CallerEnvironment callerEnvironment>
      	void notifyOne();
      
      	template <CallerEnvironment callerEnvironment>
      	void notifyAll();
      
      	void notifySpecific(workerid_t workerId);
      
      	void sleep();
      
      The runtime uses this interface to notify the workers about new work
      as well as to ensure that all workers get notified on termination.
      
      All semaphore-based worker sleep algorithm code was moved from the Runtime
      into SemaphoreWorkerSleepStrategy, which takes the Semaphore to use as a
      template parameter.
      
      This interface should be a zero-cost abstraction.
      
      Encapsulating the worker sleep algorithm behind an interface makes it
      easier to experiment with different approaches not based on semaphores
      ("Wait in the io_uring", "Empty flag per WSQ").
      
      Implement a generic notifySpecific algorithm for SemaphoreWorkerSleepStrategy.
      This algorithm comes with runtime overhead and is only used when the runtime
      actually needs notifySpecific and the semaphore implementation does not
      provide its own.
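      
      A condensed sketch of the pattern, covering only part of the interface
      listed above; the semaphore choice and all names beyond that interface
      are assumptions:
      
      	#include <semaphore>  // C++20
      	
      	enum class CallerEnvironment { EMPER, ANYWHERE };
      	
      	// CRTP base: forwards to the concrete strategy type, so every call is
      	// resolved at compile time and the abstraction itself adds no overhead.
      	template <typename Impl>
      	class AbstractWorkerSleepStrategy {
      		auto& impl() { return static_cast<Impl&>(*this); }
      	
      	public:
      		template <CallerEnvironment callerEnvironment>
      		void notifyOne() { impl().template notifyOne<callerEnvironment>(); }
      	
      		template <CallerEnvironment callerEnvironment>
      		void notifyMany(unsigned count) { impl().template notifyMany<callerEnvironment>(count); }
      	
      		void sleep() { impl().sleep(); }
      	};
      	
      	// Concrete strategy backed by a counting semaphore.
      	class SemaphoreWorkerSleepStrategySketch
      			: public AbstractWorkerSleepStrategy<SemaphoreWorkerSleepStrategySketch> {
      		std::counting_semaphore<> sem{0};
      	
      	public:
      		template <CallerEnvironment callerEnvironment>
      		void notifyOne() { sem.release(); }
      	
      		template <CallerEnvironment callerEnvironment>
      		void notifyMany(unsigned count) { sem.release(count); }
      	
      		void sleep() { sem.acquire(); }
      	};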
      03bdd4c3
  10. Apr 17, 2021
    • Florian Schmaus's avatar
      Merge branch 'echo_client_histo' into 'master' · 354cdbce
      Florian Schmaus authored
      [EchoClient] support latency histograms
      
      See merge request !174
      354cdbce
    • Florian Fischer's avatar
      [EchoClient] support latency histograms · a71a6127
      Florian Fischer authored
      Histograms can only be collected when using a fixed number of iterations.
      
      When the '--histogram <file>' argument is passed, each Client
      collects 4 timestamps (8 bytes each):
      
      1. Before requesting the send operation
      2. After requesting the send operation
      3. After getting unblocked and dispatched because the send operation finished
      4. After getting unblocked and dispatched because the recv operation finished
      
      Taking the timestamps is enabled via a template parameter and thus introduces
      no runtime cost when they are not used, apart from a larger binary.
      
      Before termination, three latencies are calculated for each client and each
      echo and written to the histogram file as CSV data (sketched after the list below).
      
      1. total_latency := (T4 - T1)
      2. after_send_latency := (T4 - T2)
      3. after_send_dispatch_latency := (T4 - T3)
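      
      A sketch of the latency computation from the four timestamps; struct and
      function names are assumptions:
      
      	#include <chrono>
      	
      	struct EchoTimestamps {
      		using clock = std::chrono::steady_clock;
      		clock::time_point t1;  // 1. before requesting the send operation
      		clock::time_point t2;  // 2. after requesting the send operation
      		clock::time_point t3;  // 3. after being dispatched again once the send finished
      		clock::time_point t4;  // 4. after being dispatched again once the recv finished
      	};
      	
      	struct EchoLatencies {
      		EchoTimestamps::clock::duration total;              // total_latency := T4 - T1
      		EchoTimestamps::clock::duration afterSend;          // after_send_latency := T4 - T2
      		EchoTimestamps::clock::duration afterSendDispatch;  // after_send_dispatch_latency := T4 - T3
      	};
      	
      	// Each EchoLatencies value corresponds to one row of the per-echo CSV output.
      	inline EchoLatencies calcLatencies(const EchoTimestamps& ts) {
      		return {ts.t4 - ts.t1, ts.t4 - ts.t2, ts.t4 - ts.t3};
      	}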
      a71a6127
  11. Apr 16, 2021