  1. Aug 11, 2021
    • Merge branch 'random-worker-id' into 'master' · 96d27755
      Florian Schmaus authored
      [AbstractWorkStealingScheduler] Get rid of "rand() % workerCount"
      
      See merge request !229
      96d27755
    • [AbstractWorkStealingScheduler] Get rid of "rand() % workerCount" · bf8cf516
      Florian Schmaus authored
      The "rand() % workerCount" construct used in the work-stealing
      scheduler is flawed. It has a bias toward lower worker IDs due to the
      modulo operation. This is something I always wanted to get rid of, but
      never found the time to do. Until now.
      
      Get rid of it and replace it with
      std::uniform_int_distribution<workerid_t> (as a field of the Worker
      instance).
      
      The main changes in AbstractWorkStealingScheduler are
      - use currentWorker->nextRandomWorkerId() (instead of the flawed construct)
      - currentWorker->getWorkerId() (instead of Runtime::getWorkerId())
      bf8cf516
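The bias described in this commit can be made concrete with a small sketch. The block below is an illustration under assumptions, not the code from the merge request: `moduloHistogram` is a hypothetical helper, `workerid_t` is assumed to be `unsigned int` here, and the `std::mt19937` engine and the constructor signature of this `Worker` are made up for the example. Only the names `Worker`, `workerid_t`, `getWorkerId()` and `nextRandomWorkerId()` come from the commit message.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Assumption: the real workerid_t may be a different integer type.
using workerid_t = unsigned int;

// Hypothetical helper: count how often each worker ID is produced when every
// raw value in [0, rawMax] is reduced with "% workerCount".
std::vector<unsigned> moduloHistogram(unsigned rawMax, unsigned workerCount) {
	assert(workerCount > 0);
	std::vector<unsigned> hits(workerCount, 0);
	for (unsigned raw = 0; raw <= rawMax; ++raw) {
		hits[raw % workerCount]++;
	}
	return hits;
}

// Sketch of a worker that owns its own distribution as a field.
class Worker {
	workerid_t workerId;
	std::mt19937 rng;  // assumption: any standard engine would do
	std::uniform_int_distribution<workerid_t> randomWorkerIdGenerator;

 public:
	Worker(workerid_t id, workerid_t workerCount, unsigned seed)
			: workerId(id), rng(seed), randomWorkerIdGenerator(0, workerCount - 1) {
		assert(workerCount > 0);
	}

	workerid_t getWorkerId() const { return workerId; }

	// Uniformly distributed over [0, workerCount - 1], without modulo bias.
	workerid_t nextRandomWorkerId() { return randomWorkerIdGenerator(rng); }
};
```

With a raw range of eight values and three workers, `moduloHistogram(7, 3)` yields the counts 3, 3 and 2, so the highest worker ID is chosen less often than the lower ones. The distribution avoids this whenever the raw range is not a multiple of the worker count.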
  2. Aug 10, 2021
  3. Aug 09, 2021
  4. Aug 08, 2021
  5. Aug 02, 2021
  6. Jul 29, 2021
  7. Jul 28, 2021
  8. Jul 27, 2021
  9. Jul 26, 2021
  10. Jul 23, 2021
  11. Jul 21, 2021
  12. Jul 15, 2021
  13. Jul 14, 2021
    • implement a pipe based sleep strategy using the IO subsystem · 4ec30fd4
      Florian Fischer authored
      Design goals
      ============
      
      * Wakeup either on external newWork notifications or on local IO completions
        -> Sleep strategy is sound without the IO completer
      * Do as little as possible in a system saturated with work
      * Pass a hint where to find new work to suspended workers
      
      Algorithm
      =========
      
      Data:
      	Global:
      		hint pipe
      		sleepers count
      	Per worker:
      		dispatch hint buffer
      		in flight flag
      
      Sleep:
      	if we have no sleep request in flight
      		Atomically increment the sleepers count
      		Remember that we are sleeping
      		Prepare read cqe from the hint pipe to dispatch hint buffer
      	Prevent the completer from reaping completions on this worker's IoContext
      	Wait until IO completions occurred
      
      NotifyEmper(n):
      	if observed sleepers <= 0
      		return
      
      	// Determine how many we are responsible to wake
      	do
      		toWakeup = min(observed sleepers, n)
      	while (!CAS(sleepers, observed sleepers - toWakeup))
      
      	write toWakeup hints to the hint pipe
      
      NotifyAnywhere(n):
      	// Ensure all n notifications take effect
      	while (!CAS(sleepers, observed sleepers - n))
      		if observed sleepers <= -n
      			return
      
      	toWakeup = min(observed sleepers, n)
      	write toWakeup hints to the hint pipe
      
      onNewWorkCompletion:
      	reset in flight flag
      	allow completer to reap completions on this IoContext
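The two notifier loops above can be sketched with `std::atomic`. This is a minimal illustration, not the code from the commit: `SleeperCounter`, `onSleep`, `claimFromEmper` and `claimFromAnywhere` are hypothetical names, the methods only return how many hints a notifier would write, and clamping a negative result to zero in `claimFromAnywhere` is an assumption the pseudocode does not state.

```cpp
#include <algorithm>
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the global sleepers count.
struct SleeperCounter {
	std::atomic<std::int64_t> sleepers{0};

	// Sleep: a worker announces that it is about to sleep.
	void onSleep() { sleepers.fetch_add(1); }

	// NotifyEmper(n): claim at most as many sleepers as were observed.
	// Returns the number of hints the caller would write to the hint pipe.
	std::int64_t claimFromEmper(std::int64_t n) {
		std::int64_t observed = sleepers.load();
		std::int64_t toWakeup;
		do {
			if (observed <= 0) return 0;
			toWakeup = std::min(observed, n);
			// On failure, compare_exchange_weak reloads 'observed'.
		} while (!sleepers.compare_exchange_weak(observed, observed - toWakeup));
		return toWakeup;
	}

	// NotifyAnywhere(n): always subtract n, so the counter may go negative
	// and a worker that is about to sleep still sees the notification.
	std::int64_t claimFromAnywhere(std::int64_t n) {
		std::int64_t observed = sleepers.load();
		do {
			if (observed <= -n) return 0;
		} while (!sleepers.compare_exchange_weak(observed, observed - n));
		// Assumption: never report a negative number of hints.
		return std::max<std::int64_t>(0, std::min(observed, n));
	}
};
```

Because the decrement and the decision how many sleepers to wake happen in a single compare-and-swap, two concurrent notifiers cannot both claim the same sleepers.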
      
      Notes
      =====
      
      * We must decrement the sleepers count on the notifier side. Otherwise
        multiple notifiers could all observe the same number of sleepers and
        try to wake up the same sleepers by writing to the pipe, jamming it up
        with unconsumed hints and thus blocking in the notify write, resulting
        in a deadlock.
      * The CAS loops on the notifier side are needed because decrementing
        first and then incrementing by the excess is racy: two notifiers can
        each observe the sum of both their decrements and then both increment
        too much, resulting in a broken counter.
      * Add the dispatch hint code in AbstractWorkStealingScheduler::nextFiber.
        This allows workers to check the dispatch hint when there
        was no local work to execute.
        This is a trade-off where we trade slower wakeup - a just awoken worker
        will check for local work first - against a faster dispatch hot path
        when we have work to do in our local WSQ.
      * The completer thread must not reap completions on the IoContexts of
        sleeping workers because this introduces a race for cqes and a possible
        lost wakeup if the completer consumes the completions before the worker
        is actually waiting for them.
      * When notifying sleeping workers from anywhere we must ensure that all
        notifications take effect. This is needed, for example, when terminating
        the runtime, to prevent sleep attempts from worker threads which are
        about to sleep but have not incremented the sleepers count yet.
        We achieve this by always decrementing the sleepers count by the
        notification count.
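The hint pipe itself can be sketched as a plain POSIX pipe. The block below is an illustration under assumptions, not the implementation from the commit. It uses blocking `read`/`write` where the commit prepares the read through the IO subsystem. `DispatchHint`, its `sourceWorkerId` field and the `HintPipe` class are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical hint payload: where a woken worker might find new work.
struct DispatchHint {
	unsigned sourceWorkerId;
};

// Minimal wrapper around a pipe that carries dispatch hints.
class HintPipe {
	int fds[2];

 public:
	HintPipe() {
		int rc = pipe(fds);
		assert(rc == 0);
		(void)rc;
	}
	~HintPipe() {
		close(fds[0]);
		close(fds[1]);
	}
	HintPipe(const HintPipe&) = delete;
	HintPipe& operator=(const HintPipe&) = delete;

	// Notifier side: write one hint per sleeper that was claimed.
	bool writeHints(const DispatchHint& hint, std::size_t toWakeup) {
		for (std::size_t i = 0; i < toWakeup; ++i) {
			ssize_t written = write(fds[1], &hint, sizeof(hint));
			if (written != static_cast<ssize_t>(sizeof(hint))) return false;
		}
		return true;
	}

	// Sleeper side: block until a hint arrives in the dispatch hint buffer.
	bool readHint(DispatchHint& buffer) {
		ssize_t got = read(fds[0], &buffer, sizeof(buffer));
		return got == static_cast<ssize_t>(sizeof(buffer));
	}
};
```

Each hint written wakes exactly one reader. This is why a notifier must claim sleepers before writing: hints that nobody consumes stay in the pipe, and once the pipe is full the next write blocks.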
      
      Thanks to Florian Schmaus <flow@cs.fau.de> for spotting bugs and suggesting
      improvements.
      4ec30fd4