    implement a pipe based sleep strategy using the IO subsystem
    Florian Fischer authored
    Design goals
    ============
    
    * Wakeup either on external newWork notifications or on local IO completions
      -> Sleep strategy is sound without the IO completer
    * Do as little as possible in a system saturated with work
    * Pass a hint where to find new work to suspended workers
    
    Algorithm
    =========
    
    Data:
    	Global:
    		hint pipe
    		sleepers count
    	Per worker:
    		dispatch hint buffer
    		in flight flag
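
    The following is a minimal C++ sketch of this state; all names are
    illustrative and not EMPER's actual identifiers:

    	// Illustrative sketch of the global and per-worker sleep state (hypothetical names).
    	#include <atomic>
    	#include <cstdint>

    	struct GlobalSleepState {
    		int hintPipe[2];                  // hintPipe[0] read end, hintPipe[1] write end
    		std::atomic<int64_t> sleepers{0}; // may drop below zero, see NotifyAnywhere
    	};

    	struct WorkerSleepState {
    		uint64_t dispatchHintBuffer = 0;   // target buffer of the pending pipe read
    		bool sleepRequestInFlight = false; // is a sleep request already prepared?
    	};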
    
    Sleep:
    	if we have no sleep request in flight
    		Atomically increment the sleepers count
    		Remember that we are sleeping
    		Prepare a read sqe from the hint pipe into the dispatch hint buffer
    	Prevent the completer from reaping completions on this worker's IoContext
    	Wait until IO completions occurred
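
    A hedged C++ sketch of the sleep path, using liburing directly where EMPER
    goes through its IoContext (all names are illustrative):

    	// Illustrative only: EMPER prepares the read via its IoContext, not raw liburing.
    	#include <liburing.h>
    	#include <atomic>
    	#include <cstdint>

    	void sleepOnHintPipe(io_uring& ring, std::atomic<int64_t>& sleepers,
    	                     int hintPipeReadFd, uint64_t& dispatchHintBuffer,
    	                     bool& sleepRequestInFlight) {
    		if (!sleepRequestInFlight) {
    			sleepers.fetch_add(1, std::memory_order_acq_rel); // announce the sleeper
    			sleepRequestInFlight = true;
    			// Prepare a read of one hint from the hint pipe into the dispatch hint buffer.
    			io_uring_sqe* sqe = io_uring_get_sqe(&ring);
    			io_uring_prep_read(sqe, hintPipeReadFd, &dispatchHintBuffer,
    			                   sizeof(dispatchHintBuffer), 0);
    		}
    		// The real implementation additionally blocks the completer from reaping
    		// completions on this worker's IoContext before going to sleep.
    		io_uring_submit_and_wait(&ring, 1); // sleep until at least one completion arrives
    	}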
    
    NotifyEmper(n):
    	if observed sleepers <= 0
    		return
    
    	// Determine how many we are responsible to wake
    	do
    		toWakeup = min(observed sleepers, n)
    	while (!CAS(sleepers, observed sleepers - toWakeup))
    
    	write toWakeup hints to the hint pipe
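
    A C++ sketch of this CAS loop, continuing the hypothetical state and includes
    from the sketches above (the hint encoding is made up for illustration):

    	// Illustrative sketch: claim exactly the sleepers we will wake, then write hints.
    	#include <unistd.h>
    	#include <algorithm>

    	void notifyEmper(std::atomic<int64_t>& sleepers, int hintPipeWriteFd,
    	                 uint64_t hint, int64_t n) {
    		int64_t observed = sleepers.load(std::memory_order_acquire);
    		int64_t toWakeup;
    		do {
    			if (observed <= 0) return; // nobody is sleeping
    			toWakeup = std::min(observed, n);
    		} while (!sleepers.compare_exchange_weak(observed, observed - toWakeup,
    		                                         std::memory_order_acq_rel));

    		// Every hint written wakes exactly one pending pipe read of a sleeper.
    		for (int64_t i = 0; i < toWakeup; ++i)
    			(void)write(hintPipeWriteFd, &hint, sizeof(hint));
    	}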
    
    NotifyAnywhere(n):
    	// Ensure all n notifications take effect
    	while (!CAS(sleepers, observed sleepers - n))
    		if observed sleepers <= -n
    			return
    
    	toWakeup = min(observed sleepers, n)
    	write toWakeup hints to the hint pipe
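
    A sketch of the anywhere variant, again with hypothetical names; the count is
    always decremented by n so the notification cannot be missed:

    	// Illustrative sketch: unconditionally decrement by n so notifications also reach
    	// workers that are about to sleep but have not incremented the sleepers count yet.
    	void notifyAnywhere(std::atomic<int64_t>& sleepers, int hintPipeWriteFd,
    	                    uint64_t hint, int64_t n) {
    		int64_t observed = sleepers.load(std::memory_order_acquire);
    		do {
    			if (observed <= -n) return; // enough notifications are already in flight
    		} while (!sleepers.compare_exchange_weak(observed, observed - n,
    		                                         std::memory_order_acq_rel));

    		// Only workers already counted as sleeping consume hints from the pipe.
    		int64_t toWakeup = std::min(observed, n);
    		for (int64_t i = 0; i < toWakeup; ++i)
    			(void)write(hintPipeWriteFd, &hint, sizeof(hint));
    	}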
    
    onNewWorkCompletion:
    	reset in flight flag
    	allow completer to reap completions on this IoContext
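
    The completion handler itself stays tiny; a sketch using the hypothetical
    per-worker state from above:

    	// Illustrative sketch: the hint was consumed, so the sleep request is done and
    	// the completer may reap completions on this worker's IoContext again.
    	void onNewWorkCompletion(WorkerSleepState& worker) {
    		worker.sleepRequestInFlight = false;
    		// In EMPER this point also releases the worker's IoContext for the completer.
    	}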
    
    Notes
    =====
    
    * We must decrement the sleepers count on the notifier side to prevent
      multiple notifiers from all observing the same number of sleepers and
      trying to wake up the same sleepers by writing to the pipe, jamming it
      up with unconsumed hints, and thus blocking in the notify write,
      resulting in a deadlock.
    * The CAS loops on the notifier side are needed because decrementing and
      then incrementing back the excess is racy: two notifiers can both observe
      the sum of their excess decrements and both increment back too much,
      resulting in a broken counter.
    * The dispatch hint code is added in AbstractWorkStealingScheduler::nextFiber.
      This allows workers to check the dispatch hint after there
      was no local work to execute (see the sketch after these notes).
      This is a trade-off: we trade a slower wakeup - a just awoken worker
      will first check for local work - against a faster dispatch hot path when
      we have work to do in our local WSQ.
    * The completer thread must not reap completions on the IoContexts of
      sleeping workers because this introduces a race for cqes and a possible
      lost wakeup if the completer consumes the completions before the worker
      is actually waiting for them.
    * When notifying sleeping workers from anywhere we must ensure that all
      notifications take effect. This is needed for example when terminating
      the runtime to prevent sleep attempts from worker threads which are
      about to sleep but have not incremented the sleeper count yet.
      We achieve this by always decrementing the sleeper count by the notification
      count.
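
    As referenced above, a hypothetical sketch of where the dispatch hint check
    sits in the dispatch path; it only loosely follows
    AbstractWorkStealingScheduler::nextFiber, and all helpers are made up for
    illustration:

    	// Hypothetical dispatch order: local queue first, then the dispatch hint, then stealing.
    	Fiber* nextFiber() {
    		if (Fiber* fiber = popLocalWorkQueue()) return fiber;   // hot path stays unchanged
    		if (Fiber* fiber = followDispatchHint()) return fiber;  // hint read by the sleep cqe
    		return stealFromOtherWorkers();
    	}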
    
    Thanks to Florian Schmaus <flow@cs.fau.de> for spotting bugs and suggesting
    improvements.