Commit dcffd713 authored by Johannes Knoedtel

add all the code for version 1.0.0

doc/
.yardoc/
Boost Software License - Version 1.0 - August 17th, 2003
Permission is hereby granted, free of charge, to any person or organization
obtaining a copy of the software and accompanying documentation covered by
this license (the "Software") to use, reproduce, display, distribute,
execute, and transmit the Software, and to prepare derivative works of the
Software, and to permit third-parties to whom the Software is furnished to
do so, all subject to the following:
The copyright notices in the Software and this entire statement, including
the above license grant, this restriction and the following disclaimer,
must be included in all copies of the Software, in whole or in part, and
all derivative works of the Software, unless such copies or derivative
works are solely in the form of machine-executable object code generated by
a source language processor.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT
SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE
FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
FODOR
=====
FODOR is a program for optimizing compiler flags. It offers:
- Extensible set of algorithms
- Offloading of the evaluations to different hosts via SSH and Slurm
- CSV reports for the behavior of the algorithm
- All steps are recorded and can be analysed later
- Well-documented and modular design allows easy extension in almost all aspects
require 'rake'
require 'yard'
require 'md2man/rakefile'
YARD::Rake::YardocTask.new do |t|
t.files = ['lib/**/*.rb', '-', 'docs/*.md']
t.options = ['--no-private', '--title', 'FODOR']
t.stats_options = ['--list-undoc']
end
#!/usr/bin/env ruby
$0 = 'fodor'
require 'rubygems'
require 'fodor'
# Adoption
If you want to use FODOR to optimize the flags of your own program, you will need a custom {Evaluator}.
There are currently three ways to create such an Evaluator: creating a factory method in {EvaluatorFactory} and making it accessible via the {OptimizeTask#read_evaluator} method; as a variation of that, simply subclassing {Evaluator}; or, better, using an {EvaluatorTemplate}.
## Evaluator Template
If your evaluation step is simple and matches the following template, you can create an evaluator by simply writing an XML file:
1. Remotely compile the program.
2. Upload it to a storage host.
3. Download it on the evaluation host.
4. Evaluate it.
5. Use a Ruby function to parse and interpret the results of the evaluation.
If you need to deviate from this template, please skip this section.
### Format
An evaluator is specified in an evaluator tag:
```xml
<evaluator>
...
</evaluator>
```
# Commandlines
The commandlines for the steps are given in compile-cmdline, upload-cmdline, download-cmdline and run-cmdline tags. They use a specific format that allows variable substitution. The line itself is given in a shell tag as a format string. Write %s at every place where a value should be substituted. Since this is a normal format string, a literal % is written as %%. Please refrain from using other format specifiers such as %d, as all parameters are currently strings. The string is currently not checked. The values for the substitution are given in a vars tag as a space-separated list.
```xml
<compile-cmdline>
<shell><![CDATA[
mkdir -p %s ; cd %s && cp -r /proj/ciptmp/ne32jilo/libgeodecomp/libgeodecomp-0.3.1/ . && cd libgeodecomp-0.3.1 && cd src && ./compile.sh "%s %s" && rm **/*.{cpp,h,cmake,txt,pc}
]]></shell>
<vars>hash hash flags standard-flags</vars>
</compile-cmdline>
<upload-cmdline>
<shell><![CDATA[
rsync -a .. %s/%s
]]></shell>
<vars>storage hash</vars>
</upload-cmdline>
<download-cmdline>
<shell><![CDATA[
cp -r %s/%s/* .
]]></shell>
<vars>storage-path hash</vars>
</download-cmdline>
<run-cmdline>
<shell><![CDATA[
cd src/; LD_LIBRARY_PATH=. testbed/performancetests/performancetests 1234 0
]]></shell>
<vars></vars>
</run-cmdline>
```
The following variable substitutions are currently possible
- hash :: This is a (non-cryptographic) hash of the flags and the standard_flags. You can expect this to be unique and reproducible for all runs.
- flags :: The string representation of the FlagState.
- standard-flags :: The standard flags of the current optimizer task.
- storage :: The SCP-able path of the storage given for the current optimizer task.
- storage-path :: The local path portion of storage.
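To illustrate the substitution, here is a minimal sketch of how a shell format string and its vars list could be combined. The helper name `build_cmdline` and the example values are assumptions made for illustration; this is not FODOR's actual implementation.

```ruby
# Hypothetical helper: fills the %s placeholders of a shell format string
# with the values named in the space-separated vars list.
def build_cmdline(shell, vars, values)
  args = vars.split.map { |name| values.fetch(name) }
  format(shell.strip, *args)
end

# Made-up example values for the documented variables.
values = {
  "hash"         => "1a2b3c",
  "storage"      => "user@host:/scratch",
  "storage-path" => "/scratch"
}

puts build_cmdline("rsync -a .. %s/%s", "storage hash", values)
# A literal percent sign has to be written as %%.
puts build_cmdline("echo 100%% done in %s", "hash", values)
```

Using `fetch` makes a misspelled variable name fail loudly instead of silently inserting an empty string.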
# Parsers
To evaluate the output of the compilation and evaluation jobs, you need to provide a parser. These parsers are simply standard Ruby lambdas. They need to accept just one string argument representing the output of the command.
```xml
<eval-output-parser><![CDATA[
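lambda do |string|
  keys = %w{rev date host device order family species dimensions perf unit}.map(&:to_sym)
  wanted = { family: "RegionCount", species: "gold", dimensions: "(128, 128, 128)" }
  # Skip the header line, then turn every ";"-separated row into a hash keyed by column name.
  rows = string.lines.drop(1).map { |line| Hash[keys.zip(line.strip.split(";"))] }
  # Pick the first row that matches all wanted fields and return its perf column.
  rows.find { |row| wanted.all? { |key, value| row[key] == value } }[:perf]
end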
]]></eval-output-parser>
```
The parsers are given in the compile-output-parser tag and the eval-output-parser tag.
## Subclassing Evaluator
It is also possible to subclass the {Evaluator} class. You need to expose the new class in the {OptimizeTask#read_evaluator} method.
Implement your evaluation logic in the [] method. This method takes an array of {FlagSet}s, evaluates them and returns an array of {JobResult}s. Caching results is a useful feature, so consider implementing it.
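The caching idea can be sketched as follows. This is a standalone illustration under stated assumptions: the class name `CachingEvaluatorSketch` is hypothetical, it does not derive from the real {Evaluator}, and plain strings and floats stand in for {FlagSet} and {JobResult}.

```ruby
# Hypothetical stand-in for an Evaluator subclass. Plain Ruby objects take
# the place of FODOR's FlagSet and JobResult classes.
class CachingEvaluatorSketch
  attr_reader :evaluations

  def initialize(&measure)
    @measure = measure   # block that performs the expensive evaluation
    @cache = {}          # maps a flag string to its cached result
    @evaluations = 0     # number of evaluations that were really executed
  end

  # Takes an array of flag strings and returns one result per entry,
  # evaluating only those entries that are not cached yet.
  def [](flag_strings)
    flag_strings.map do |flags|
      @cache.fetch(flags) do
        @evaluations += 1
        @cache[flags] = @measure.call(flags)
      end
    end
  end
end

evaluator = CachingEvaluatorSketch.new { |flags| flags.length.to_f }
p evaluator[["-O2", "-O3 -funroll-loops", "-O2"]]
p evaluator.evaluations
```

The repeated "-O2" entry is served from the cache, so only two evaluations are executed for three requested results.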
# Algorithms
There are different algorithms available for the optimization process. They all evaluate a fixed number of candidates per step. The set of candidates of a step is called a generation.
All algorithms share the seed parameter that sets the seed of the random generator.
## Local Search
The Local Search takes the best candidate of the last generation and mutates it randomly to generate the candidates of the next generation.
### Parameters
- `parallel` :: Number of candidates per generation
- `distance` :: Number of flags to be altered for candidates of a generation
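A minimal sketch of how one Local Search generation could be built from these two parameters. It assumes that flags are plain booleans and that mutating a flag means flipping it; the helper names `mutate` and `next_generation` are hypothetical and not taken from FODOR.

```ruby
# Flags are modelled as a hash from flag name to true/false. This is a
# simplification, real flags may also have ranges or lists as domains.
def mutate(candidate, distance, rng)
  changed = candidate.keys.sample(distance, random: rng)
  candidate.merge(changed.to_h { |name| [name, !candidate[name]] })
end

# Builds the next generation: `parallel` candidates, each differing from
# the best candidate of the last generation in exactly `distance` flags.
def next_generation(best, parallel:, distance:, rng:)
  Array.new(parallel) { mutate(best, distance, rng) }
end

rng  = Random.new(20) # plays the role of the shared seed parameter
best = { "omit-frame-pointer" => true, "unroll-loops" => false,
         "tree-vectorize" => false, "inline-functions" => true }

generation = next_generation(best, parallel: 5, distance: 2, rng: rng)
generation.each { |candidate| p candidate }
```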
## Random Search
The Random Search algorithm picks different parameters at random.
### Parameters
- `parallel` :: Number of random walks that should be executed in parallel
## Hill Climber
The Hill Climber takes the best candidate of the last generation and mutates one flag at each step. If the mutated candidate is better, it continues with it; otherwise it continues with the old candidate. When all flags are done, it starts over with the first flag.
### Parameters
- `parallel` :: Number of Hillclimbers that should be executed in parallel
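The behavior of a single Hill Climber can be sketched like this. The sketch assumes boolean flags and a made-up cost function where lower is better; `hill_climb_step` is a hypothetical name.

```ruby
# One hill-climbing step: flip the flag at `index`, keep the change only
# if it lowers the cost, then move on to the next flag (wrapping around).
def hill_climb_step(current, index, cost)
  name   = current.keys[index]
  trial  = current.merge(name => !current[name])
  winner = cost.call(trial) < cost.call(current) ? trial : current
  [winner, (index + 1) % current.size]
end

# Made-up cost function: counts the flags that differ from a fixed optimum.
optimum = { "a" => true, "b" => false, "c" => true }
cost = ->(candidate) { candidate.count { |name, value| value != optimum[name] } }

state = { "a" => false, "b" => false, "c" => false }
index = 0
6.times { state, index = hill_climb_step(state, index, cost) }
p state
```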
## Genetic Algorithm
This algorithm is a bit more complicated. Each step consists of three phases: selection, repopulation and mutation. First, new candidates are generated by crossover and mutation. Crossover fuses two randomly chosen candidates by mixing their flags: flags are taken from one candidate, randomly switching to the other. Mutation simply alters random flags. The candidates are then selected via a selection algorithm that favors good candidates.
There are two different selection algorithms available:
### Truncation Select
This method just takes the best candidates and drops the rest.
### Proportional Select
Here candidates are selected randomly, with good ones having a better chance of survival.
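The two selection methods can be sketched as follows. The candidates are modelled as `[name, cost]` pairs with a positive cost where lower is better. Weighting by the inverse of the cost and drawing with replacement is an assumption; the text does not say which proportional scheme FODOR uses.

```ruby
# Truncation: keep the `count` best candidates and drop the rest.
def truncation_select(population, count)
  population.min_by(count) { |_, cost| cost }
end

# Proportional selection (one common variant, assumed here): draw `count`
# candidates with replacement, weighting each by the inverse of its cost.
def proportional_select(population, count, rng)
  weights = population.map { |_, cost| 1.0 / cost }
  total   = weights.sum
  Array.new(count) do
    pick  = rng.rand * total
    index = weights.each_index.find { |i| (pick -= weights[i]) < 0 }
    population[index || -1] # fall back to the last entry on rounding errors
  end
end

population = [["a", 4.0], ["b", 1.0], ["c", 2.0], ["d", 8.0]]
p truncation_select(population, 2)
p proportional_select(population, 3, Random.new(1))
```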
### Parameters
- `population_size` :: Number of candidates per generation
- `selection_method` :: The selection method. This may be "truncate" or "proportional".
- `crossovers` :: Number of candidates to generate via crossover
- `crossover_probability` :: Probability to switch between candidates during crossover
- `mutations` :: Number of candidates to generate via mutation
- `mutation_probability` :: Probability for each single flag to be mutated during mutation
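How `crossover_probability` and `mutation_probability` could act on a candidate is sketched below. The sketch assumes boolean flags stored in a hash; the function names are hypothetical and the real implementation may differ.

```ruby
# Crossover: copy flags from one parent and, before each flag, switch to
# the other parent with probability `switch_probability`.
def crossover(parent_a, parent_b, switch_probability, rng)
  source, other = parent_a, parent_b
  parent_a.keys.to_h do |name|
    source, other = other, source if rng.rand < switch_probability
    [name, source[name]]
  end
end

# Mutation: flip every flag independently with `mutation_probability`.
def mutate(candidate, mutation_probability, rng)
  candidate.to_h do |name, value|
    [name, rng.rand < mutation_probability ? !value : value]
  end
end

rng = Random.new(20)
a = { "f1" => true,  "f2" => true,  "f3" => true,  "f4" => true }
b = { "f1" => false, "f2" => false, "f3" => false, "f4" => false }

child = crossover(a, b, 0.5, rng)
p child
p mutate(child, 0.1, rng)
```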
# File Formats
All files a user has to write in order to use FODOR are XML-based. XML Schema files can be found in the schema folder. Not many data validation and consistency checks are in place at this point, so be careful when writing the files.
The results and the evaluator cache are in the Ruby Marshal format. For more information on handling the results see the {file:docs/results.md results documentation}.
All paths used are relative to the working directory. ~ and ~exampleuser are expanded to the user's home directory and exampleuser's home directory respectively.
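As a reminder of how Ruby's Marshal format is read and written, here is a minimal round trip. The data structure is made up for illustration and does not reflect the real layout of FODOR's result files. Note that `Marshal.load` should only be used on files you trust.

```ruby
require "tmpdir"

# Made-up stand-in for result data.
results = [{ flags: "-O2 -funroll-loops", value: 12.5 }]

Dir.mktmpdir do |dir|
  path = File.join(dir, "out.fodor")
  # Marshal data is binary, so use the binary read and write helpers.
  File.binwrite(path, Marshal.dump(results))
  loaded = Marshal.load(File.binread(path))
  p loaded
end
```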
## General Settings
All settings are given in the root tag named `settings`.
### SSH Agent Settings
If you want to use an SSH agent, use an `ssh-agent` tag. The keys to load are stated as `add-key` tags inside the `ssh-agent` tag, containing the paths of the keys to be loaded. With the `askpass` attribute of the `ssh-agent` tag, the program for the password entry can be chosen.
```xml
<ssh-agent askpass="/usr/lib/ssh/x11-ssh-askpass">
<add-key>~/.ssh/id_rsa</add-key>
</ssh-agent>
```
### Load Host Sets
Host Sets can be loaded with the `hostset-file` tag.
```xml
<hostset-file>hostset.xml</hostset-file>
```
### Load Host Group Sets
Host Group Sets can be loaded with the `hostgroupset-file` tag.
```xml
<hostgroupset-file>hostgroupset.xml</hostgroupset-file>
```
### Callbacks
Jobs can trigger callbacks. A callback has a certain type and a selector. The selector is a regular expression that is matched against the job group name. Currently only the completion of a group can trigger a callback. All callbacks are represented as `callback` tags in a `callbacks` tag. Its attributes are `type`, `callback-type` and `selector`. For `type` only `group_complete` is valid. The selector is a Perl-compatible regex.
The `callback-type` determines what will be done when the callback is called. If it has parameters, they can be given via `argument` tags. The name of the parameter is contained in the `name` attribute. The value is given as the content of the tag.
```xml
<callbacks>
<callback type="group_complete" callback-type="mail" selector=".*">
<argument name="to">user@example.com</argument>
<argument name="batch_size">1</argument>
</callback>
</callbacks>
```
#### Mail Callback
Currently only the mail callback is available. It sends a mail once a certain number of job groups are finished. This number is given via the `batch_size` parameter. The recipient is given via the `to` parameter.
The mail will be sent via the mail library. You can configure it in the `pre-exec` tag. Have a look at the official documentation at: {https://github.com/mikel/mail/wiki}
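The interplay of `selector` and `batch_size` can be sketched as follows. The class `BatchCallbackSketch` is hypothetical, and sending a mail is replaced by collecting messages in an array, so no mail library is needed to run it. Ruby's own regular expressions are used for matching.

```ruby
# Hypothetical model of a batching callback: it fires once `batch_size`
# job groups whose names match the selector have completed.
class BatchCallbackSketch
  attr_reader :sent

  def initialize(selector:, batch_size:)
    @selector   = Regexp.new(selector)
    @batch_size = batch_size
    @pending    = []
    @sent       = [] # stands in for the mails that would be sent
  end

  def group_complete(group_name)
    return unless @selector.match?(group_name)
    @pending << group_name
    return if @pending.size < @batch_size
    @sent << "finished: #{@pending.join(', ')}"
    @pending = []
  end
end

callback = BatchCallbackSketch.new(selector: "^compile", batch_size: 2)
%w[compile-1 eval-1 compile-2 compile-3].each { |name| callback.group_complete(name) }
p callback.sent
```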
### Load Flag Set
Flag Sets can be loaded with the `flagset-file` tag.
```xml
<flagset-file>flags.xml</flagset-file>
```
### Slurm Poll Delay
Controls the time between two status updates on the Slurm control host.
```xml
<slurm-poll-delay>2</slurm-poll-delay>
```
### Pre Exec
If you need to run some code before the main program runs, you can do so via the `pre-exec` tag. This tag contains Ruby code that will be executed beforehand.
## Optimization Run
Each optimization run is defined in its own file. An optimization run is also called a task.
```xml
<optimize-task name="Local Search with Distance = 5">
...
</optimize-task>
```
The task is described with an `optimize-task` tag. The name must be given in the `name` attribute. This helps with the analysis of the result data and can be used to automatically generate graphs with useful labelling.
### Algorithm
```xml
<algorithm>Local Search</algorithm>
```
The algorithm has to be given in an `algorithm` tag. All possible algorithms have to be accessible through the {Algorithm} module; for a list of all available algorithms see the {file:docs/algorithms.md algorithms documentation}.
```xml
<algorithm-values>parallel=5,distance=5,seed=20</algorithm-values>
```
Parameters to the algorithms are given in an `algorithm-values` tag in the following format: `parameter1=value1,parameter2=value2`.
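For illustration, this format can be split into a hash with a few lines of Ruby. The helper name `parse_algorithm_values` is hypothetical; all values are kept as strings here, and any conversion to numbers is left to the caller.

```ruby
# Splits "parameter1=value1,parameter2=value2" into a hash of strings.
def parse_algorithm_values(text)
  text.split(",").to_h do |pair|
    name, value = pair.split("=", 2)
    [name.strip, value.strip]
  end
end

p parse_algorithm_values("parallel=5,distance=5,seed=20")
```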
```xml
<termination-criterion type="Steps">200</termination-criterion>
```
The termination criterion is selected with a `termination-criterion` tag. The type of the criterion is given through the `type` attribute. If the criterion has a variable parameter, it is given as the content of the tag.
### Flag Sets
```xml
<flag-set>GCC Standard Optimizations</flag-set>
```
To include a flag set in this run, add a `flag-set` tag with the name of the flag set as the content.
```xml
<exclude-flag-group>unsafe</exclude-flag-group>
<exclude-flag-group>graphite</exclude-flag-group>
<exclude-flag-group>bug</exclude-flag-group>
```
Flag groups can be removed from the set of tested flags with `exclude-flag-group` tags, which have the name of the group to be excluded as their content.
```xml
<standard-flags>-march=native</standard-flags>
```
For flags that should be given to the compiler independently of the flag set, you can use the `standard-flags` tag. Simply put the fixed part of the commandline in the content of this tag.
### Host Groups
```xml
<host-group-compile>cip-00</host-group-compile>
<host-group-eval>whistler</host-group-eval>
```
The host groups for the compilation and the evaluation are given as contents of the `host-group-compile` and `host-group-eval` tags.
### Folders
```xml
<folder-compile>/var/tmp/ne32jilo/</folder-compile>
<folder-eval>/tmp/ne32jilo/</folder-eval>
```
The folders on the remote machines that should be used for this run are given as the content of the `folder-compile` and `folder-eval` tags for the compilation and evaluation respectively.
```xml
<storage>ne32jilo@faui36b.informatik.uni-erlangen.de:/scratch/ne32jilo/</storage>
```
An scp target should be given as the storage where the program will be stored for the transfer from the compilation machine to the evaluation machine. This string should be put in the `storage` tag.
### Evaluator
```xml
<evaluator>libgeodecomp</evaluator>
```
The evaluator for the library under test must be stated in the `evaluator` tag. See the {file:docs/adoption.md Adoption documentation} for libraries other than libgeodecomp.
### Versions
```xml
<versions>
<version of="gcc">4.9</version>
<version of="libgeodecomp">3.1.0</version>
</versions>
```
Most evaluators can load a cache file containing performance data for already executed flag states. Because this data depends on the tested versions of the compiler and the optimization target, each cache entry is tagged with the respective versions. This version description has to be given in the `versions` tag. For each variable version, the `versions` tag should include a `version` tag with the associated program as the `of` attribute (as in the example above) and the version as the content.
## Automated Runs
```xml
<runs>
<parallel>
<run output="out/baz1">dummy-runs/hillclimber_1.xml</run>
<serial>
<run output="out/foo1">dummy-runs/genetic_crossover_0.1.xml</run>
<run output="out/bar1">dummy-runs/genetic_crossover_0.7.xml</run>
</serial>
<run output="out/baz2">dummy-runs/hillclimber_5.xml</run>
</parallel>
</runs>
```
All runs are given within the root tag named `runs`. The runs at the top level are executed serially. Runs can be grouped in `serial` and `parallel` tags and are then executed in serial or in parallel respectively. These groups can be nested.
The content of a `run` tag is the filename of a run definition in the format described above. The result will be written to the file specified in the `output` attribute.
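The nesting semantics can be sketched in Ruby. In this sketch the XML tree is modelled as nested arrays, and the use of one thread per parallel child is an assumption; `run_tree` is a hypothetical name and the block stands in for executing one optimization run.

```ruby
# Groups are [:serial, ...] or [:parallel, ...] arrays with run file names
# as leaves. The block is called once per run file.
def run_tree(node, &execute)
  return execute.call(node) if node.is_a?(String)
  kind, *children = node
  case kind
  when :serial
    children.each { |child| run_tree(child, &execute) }
  when :parallel
    children.map { |child| Thread.new { run_tree(child, &execute) } }.each(&:join)
  else
    raise ArgumentError, "unknown group #{kind.inspect}"
  end
end

# The example from above; the top level of `runs` is treated as serial.
tree = [:serial,
        [:parallel,
         "dummy-runs/hillclimber_1.xml",
         [:serial, "dummy-runs/genetic_crossover_0.1.xml", "dummy-runs/genetic_crossover_0.7.xml"],
         "dummy-runs/hillclimber_5.xml"]]

finished = Queue.new # thread-safe collection of the completed runs
run_tree(tree) { |file| finished << file }
puts finished.size
```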
## Host Set
```xml
<hosts>
<host aliases="faui36b 36b" user="ne32jilo" hostname="faui36b.informatik.uni-erlangen.de" />
<host user="ne32jilo" hostname="faui36c" slurm-host="36b" partition="usaji" />
</hosts>
```
Hosts are declared with a `host` tag. The host attributes `user`, `hostname` and `port` are set with the XML attributes of the same name. If you need aliases for a host, simply add a space-separated list as the `aliases` attribute.
All of the `host` tags are collected in the root tag `hosts`.
### Slurm Hosts
SSH hosts and Slurm hosts are distinguished by the presence of the `slurm-host` attribute. This attribute names the Slurm control server. It must name an SSH host that already appeared earlier in the list of hosts. If the host belongs to a certain partition, this can be configured via the `partition` attribute.
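The ordering rule for `slurm-host` can be expressed as a small check. In this sketch hosts are plain hashes rather than parsed XML, and `check_slurm_hosts` is a hypothetical helper, not part of FODOR.

```ruby
# Checks that every slurm-host names an SSH host (by hostname or alias)
# that was declared earlier in the list.
def check_slurm_hosts(hosts)
  known = []
  hosts.each do |host|
    control = host[:slurm_host]
    if control && !known.include?(control)
      raise ArgumentError, "unknown slurm-host #{control.inspect} for #{host[:hostname]}"
    end
    # Only SSH hosts (those without slurm-host) can act as control hosts.
    known.concat([host[:hostname], *host.fetch(:aliases, "").split]) unless control
  end
  true
end

hosts = [
  { hostname: "faui36b.informatik.uni-erlangen.de", aliases: "faui36b 36b", user: "ne32jilo" },
  { hostname: "faui36c", user: "ne32jilo", slurm_host: "36b", partition: "usaji" }
]
p check_slurm_hosts(hosts)
```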
## Host Groups
```xml
<host-groups>
<host-group name="cip-00" description="irgendsoeinrechner">
<host>00b</host>
<host>00c</host>
<host>00d</host>
<host>00e</host>
</host-group>
<host-group name="blubb" description="irgendsoeinrechner">
<host>faui36c</host>
<host>faui36g</host>
<host>faui36i</host>
<host>faui36j</host>
<host>nomad</host>
</host-group>
</host-groups>
```
All host groups are given in `host-group` tags. The `name` attribute sets the name of the group and the `description` attribute describes it. The description is not mandatory but recommended for debugging and documentation. It usually contains a description of the hardware. The content of a `host-group` tag consists of `host` tags that contain the names of hosts previously defined in the host set. Aliases are allowed here. All `host-group` tags are collected in the root element named `host-groups`.
## Flag Set
```xml
<flags name="GCC PowerPC Optimizations">
<flag type="gcc-machine" name="pointers-to-nested-functions" domain-type="boolean_no" />
<flag type="gcc-machine" name="block-move-inline-limit" domain-type="Range" >
<range from="32" to="INTMAX" />
</flag>
<flag type="gcc-machine" name="cmodel" domain-type="list" >
<list-element value="small"/>
<list-element value="medium"/>
<list-element value="large"/>
</flag>
</flags>
```
All flags are defined with `flag` tags. A flag needs to have a `name` attribute that is the name of the flag as it is used on the commandline, a `type` attribute and a `domain-type`.
The type of a flag has a strong influence on how it will be represented in the commandline for the compilation. Currently there are four different types:
- `gcc` :: For regular gcc flags with the "-f" prefix like -fomit-frame-pointer
- `gcc-machine` :: For flags with the "-m" prefix like -mcmodel=small
- `gcc-param` :: For parameter flags like --param min-crossjump-insns=10
- `gcc-define` :: For defines like -DDEBUG
The prefix for a type can be overridden using the `prefix` attribute.
Because flags can take various kinds of values, the domain type was introduced. It defines which values are possible for a flag. Currently the following domain types are possible:
- `boolean`, `boolean_no` :: This is a boolean flag. The `boolean` variant simply is not present on the commandline if it is false, while the `boolean_no` version is used for flags like -fno-omit-frame-pointer
- `Range` :: This denotes all possible values of an integer range. The bounds of this range are given as a `range` tag inside the `flag` tag. The `range` tag denotes the bounds via the `from` and `to` attributes, which contain integers for the lower and upper limit. There is also a special limit value named `INTMAX` that corresponds to 2^31 - 1.
- `list` :: For flags having a string parameter there is the `list` type. All possible values are to be given inside the `flag` tag as `list-element` tags with the string parameters as the `value` attribute.
All flags in a set are collected in the root element named `flags`. This tag requires a `name` attribute containing a name that can be referenced in the definition of an optimization run.
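How the type and the domain type could combine into a commandline fragment is sketched below. The prefixes follow the examples in the lists above; the function names and the exact rendering rules (for example `name=value` for ranges and lists) are assumptions, not FODOR's actual code.

```ruby
# Default prefixes per flag type, taken from the examples above.
def default_prefix(type)
  { "gcc" => "-f", "gcc-machine" => "-m",
    "gcc-param" => "--param ", "gcc-define" => "-D" }.fetch(type)
end

# Renders one flag for the commandline; returns nil if nothing is emitted.
def render_flag(type:, name:, domain_type:, value:, prefix: default_prefix(type))
  case domain_type
  when "boolean"       then value ? "#{prefix}#{name}" : nil
  when "boolean_no"    then value ? "#{prefix}#{name}" : "#{prefix}no-#{name}"
  when "Range", "list" then "#{prefix}#{name}=#{value}"
  else raise ArgumentError, "unknown domain-type #{domain_type}"
  end
end

puts render_flag(type: "gcc", name: "omit-frame-pointer", domain_type: "boolean_no", value: false)
puts render_flag(type: "gcc-machine", name: "cmodel", domain_type: "list", value: "small")
puts render_flag(type: "gcc-param", name: "min-crossjump-insns", domain_type: "Range", value: 10)
```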
## Logging Configuration
Logging in FODOR is based on Log4r. You need to put a configuration file called `log4r_config.xml` in your current working directory. Logging is handled individually for each class. Be sure to have a `logger` tag for each class. It is best to copy the following example config and modify it to your own needs:
```xml
<log4r_config>
<pre_config>
<custom_levels>DEBUG, INFO, WARN, ERROR, FATAL</custom_levels>
<global level="ALL"/>
</pre_config>
<outputter name="console" type="StdoutOutputter" level="DEBUG" >
<formatter type="Log4r::PatternFormatter">
<pattern>=>[%5l %d] %C: %M [%t]</pattern>
</formatter>
</outputter>
<outputter name="file_outputter" type="FileOutputter">
<filename>log/fodor.log</filename>
<formatter type="Log4r::PatternFormatter">
<pattern>=>[%5l %d] %C: %M [%t]</pattern>
</formatter>
</outputter>
<!-- Loggers -->
<logger name="JobManager" level="ALL" additive="false" trace="true">
<outputter>console</outputter>
<outputter>file_outputter</outputter>
</logger>
<logger name="FlagSet" level="ALL" additive="false" trace="true">
<outputter>console</outputter>
<outputter>file_outputter</outputter>
</logger>
<logger name="HostSet" level="ALL" additive="false" trace="true">
<outputter>console</outputter>
<outputter>file_outputter</outputter>
</logger>
<logger name="HostGroup" level="ALL" additive="false" trace="true">
<outputter>console</outputter>
<outputter>file_outputter</outputter>
</logger>
<logger name="HostGroupSet" level="ALL" additive="false" trace="true">
<outputter>console</outputter>
<outputter>file_outputter</outputter>
</logger>
<logger name="OptimizeTask" level="ALL" additive="false" trace="true">
<outputter>console</outputter>
<outputter>file_outputter</outputter>
</logger>
<logger name="Host" level="ALL" additive="false" trace="true">
<outputter>console</outputter>
<outputter>file_outputter</outputter>
</logger>
<logger name="SlurmHost" level="ALL" additive="false" trace="true">
<outputter>console</outputter>
<outputter>file_outputter</outputter>
</logger>
<logger name="EvaluatorTemplate" level="ALL" additive="false" trace="true">
<outputter>console</outputter>
<outputter>file_outputter</outputter>
</logger>
<logger name="GeneticOptimizer" level="ALL" additive="false" trace="true">
<outputter>console</outputter>
<outputter>file_outputter</outputter>
</logger>
<logger name="OptimizerUtil" level="ALL" additive="false" trace="true">
<outputter>console</outputter>
<outputter>file_outputter</outputter>
</logger>
</log4r_config>
```
## Evaluator Definition
If you want to define a new Evaluator see the {file:docs/adoption.md adoption documentation}.
# Getting Started
tl;dr For a short usage guide see the {file:docs/quickstart.md Quick Start Guide}.
## What's FODOR? (And what is it not?)
FODOR stands for "Flag Optimization using Discrete Optimization Algorithms in Ruby".
It was initially developed for the {https://www.libgeodecomp.org LibGeoDecomp} project to find out if the performance of the library can be improved by using "better" compiler flags. Most academic research implies that automated fiddling with the flags can yield performance gains. The goal of FODOR is to make use of these performance gains by applying them to real-world applications.
While FODOR is not a monolithic set of shell scripts but a moderately modular framework, most of its components were designed for the optimization of compiler flags and can't really be applied to generic optimization tasks without some changes to the core parts of the source code. This is certainly doable, and although I don't intend to change this in the near future, I am happy to accept any patches that do so, assuming the code doesn't break anything and doesn't degrade the overall code quality.
## Recommended Experimentation Setup
You should have at least a few different hosts that are accessible via SSH or Slurm. If you do performance measurements by timing a program, note that you should be the sole user of the system. Otherwise the load of other users distorts the test results, because you can't monopolize the hardware. If you use hardware counters or other metrics, this might be unproblematic.
In general, the shorter the evaluation runtime the more candidates you can evaluate. If you can shorten the runtime while keeping the accuracy of your measurement, please do so.
## Usage
In order to use this program correctly, it is helpful to understand its inner workings.
1. The step function of the optimizer is called.
2. In the step function new candidates are generated and evaluated.
3. The evaluation is done via an instance of the Evaluator class.
4. The evaluator looks into its cache and, if the cache can't provide a result, generates jobs that are executed.
5. Evaluations consist of compilation and evaluation parts.
6. These individual parts are executed on hosts within specific host groups.
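The first four steps can be condensed into a toy loop. Everything here is an assumption made for illustration: candidates are integers instead of flag states, the `step` and `evaluate` lambdas stand in for the optimizer's step function and the Evaluator, and a fixed number of steps serves as the termination criterion.

```ruby
# `step` takes the best candidate so far and returns new candidates.
# `evaluate` returns a positive value per candidate, lower is better.
def optimize(initial, steps:, step:, evaluate:)
  cache = {} # results of earlier evaluations, keyed by candidate
  best  = initial
  steps.times do
    generation = step.call(best)
    scored     = generation.map { |c| [c, cache[c] ||= evaluate.call(c)] }
    challenger = scored.min_by { |_, value| value }.first
    best = challenger if cache[challenger] < (cache[best] ||= evaluate.call(best))
  end
  best
end

calls    = 0
evaluate = lambda do |candidate|
  calls += 1
  (candidate - 7)**2 + 1 # made-up cost with its minimum at 7
end
step = ->(best) { [best - 1, best + 1] }

p optimize(0, steps: 10, step: step, evaluate: evaluate)
p calls
```

Because of the cache, revisiting a neighbor that was already measured in an earlier step does not trigger a new evaluation.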
You can see that there are a lot of parameters. For the configuration of the program see the {file:docs/formats.md format documentation}. If you need a simple and quick guide see the {file:docs/quickstart.md Quick Start Guide}.
For multiple runs defined in an XML file, just call the program with the name of that file:
```
$ fodor runs.xml
```
If you want to run a single run, simply call the program with the `--single` flag and the optimizer file as the argument. The output file is given via the `--output` flag. This parameter is mandatory:
```
$ fodor --single hillclimber.xml --output out.fodor
```
# Quick Start Guide
If you simply want to optimize the compilation result of a project, follow these steps:
## Write an Evaluator Template
Write down how you compile your project and measure its runtime. These two steps are separate and can be executed on different hosts. This allows the compilation to happen on any host, while the critical evaluation step can run on a set of identical hosts. We will call the resulting file `myProject.xml`. All commandlines are format strings in the `shell` tag with their variables in the `vars` tag. We have four commandlines:
- `compile-cmdline` :: compiles the project
- `upload-cmdline` :: uploads the compiled project to the storage
- `download-cmdline` :: downloads the compiled project from the storage
- `run-cmdline` :: evaluates the performance
The output of the evaluation is parsed in the `eval-output-parser` tag. Its content is Ruby code: a lambda that takes one argument, the output of the eval step, and converts it into a {Numeric}. FODOR tries to minimize this value. It has to be a positive number (not 0!).
```xml
<evaluator>
<compile-cmdline>
<shell><![CDATA[
mkdir -p %s ; cd %s && cp -r /home/example-user/project/ . && cd project && cd src && CFLAGS="%s %s" make
]]></shell>
<vars>hash hash flags standard-flags</vars>
</compile-cmdline>
<upload-cmdline>
<shell><![CDATA[
rsync -a .. %s/%s
]]></shell>
<vars>storage hash</vars>
</upload-cmdline>
<download-cmdline>
<shell><![CDATA[
cp -r %s/%s/* .
]]></shell>
<vars>storage-path hash</vars>
</download-cmdline>
<run-cmdline>
<shell><![CDATA[
cd src/ && time ./project-performance-test
]]></shell>
<vars></vars>
</run-cmdline>
<eval-output-parser><![CDATA[
lambda do |string|