----------------------------------------------------------------------
Proem
-----

Author's email: 

Please understand this README before using ba_sim.  Read the INSTALL
file for build instructions.  Read David Petrou's PhD dissertation for
an explanation of why ba_sim and its tools exist and what David
learned through them.  The dissertation is `Cluster scheduling for
explicitly-speculative tasks,' Electrical and Computer Engineering,
Carnegie Mellon University, CMU-PDL-04-112.  The dissertation and an
archive of this software may be found off of .

This file documents the `ba_sim' simulator executable.  Also included
in this archive are Perl scripts for processing ba_sim output.  They
are not documented.  If there is interest, I can provide a description
of each source file used to form ba_sim along with descriptions of the
Perl scripts.

----------------------------------------------------------------------
Terminology
-----------

When an idea is hatched, the initial terminology and organization
often do not reflect the idea well.  There are mismatches of terms and
concepts between the simulator and the dissertation models because I
have not taken the time to realign the implementation with my
increasingly better described model.  These mismatches do not affect
results because there is a mapping between the simulator model as
implemented and the simulator model as described in the dissertation.
In the simulator:

  batch users disclose task sets and request tasks when needed; when
  making a scheduling decision, the simulator combines the disclosed
  and requested queues to present candidate tasks to a single policy

  interactive users disclose task sets and request tasks when needed;
  the simulator only presents candidate tasks from the requested
  queue to a single policy

  batchactive users disclose task sets and request tasks when needed;
  the simulator presents the queues to two policies; if the requested
  task queue is not empty, the requested queue policy sees its tasks;
  otherwise, the disclosed queue policy sees tasks from the disclosed
  queue

  all users cancel task sets when outputs suggest no need to continue

  a task's visible response time is calculated as when the task
  executes minus when the task was requested (with negative times set
  to 0)

In the dissertation:

  batch users request task sets; the scheduler presents these tasks
  to a single policy; at some point, unknown to the scheduler, the
  users need the tasks

  interactive users request tasks when needed; the scheduler presents
  these tasks to a single policy

  batchactive users disclose task sets and request tasks when needed;
  the scheduler presents the queues to two policies as above

  all users cancel task sets when outputs suggest no need to continue

  a task's visible response time is calculated as when the task
  executes minus when the task was needed (with negative times set
  to 0)

The description in the dissertation is better because no `disclosed'
action exists in existing, non-speculative schedulers, which is where
users behave in a batch or interactive manner.  The simulator
implementation is isomorphic to the dissertation description under
the above mapping, and is also simpler to implement.
Another difference between the implementation and the textual
dissertation description is that when a task is `finished,' the
implementation removes it from the executed list, whereas the
dissertation keeps it in the executed list to keep certain task state
transition diagrams and descriptions simpler.

The `harness' script referenced by the dissertation is actually
called `ba_harness.pl'.  The `present' script is actually called
`ba_present.pl'.  The `improvement' script is actually called
`ba_analyze.pl'.

The three database tables in MySQL are named poorly.  I list the
tables, their function, and what they should be named.

analyze: the improvement factor between proposed and baseline user
behaviors and scheduling policies; better named `improvement'

output: ba_sim output; better named `simulation'

trace: parameters determining a particular simulation run; better
named `parameters' (the use of `trace' makes no sense and has nothing
to do with the ability of ba_sim to take trace input or generate
trace output)

----------------------------------------------------------------------
Options
-------

Run ba_sim with the `-h' option to get a list of options.  Some
listed options have been disabled deep in the source.  Thus, output
from the `-h' option is superseded by the information in this file.

The disabled options set simulator behaviors not used for thesis
results, and thus may not have been adequately tested.  `Disabled'
means that, if an attempt to use these options is made, the simulator
may die with a meaningful error message on startup, or may run and
output erroneous values.

The trace input and trace output options are turned off.  The
simulator mode of running until a certain number of users are spawned
according to an arrival process is turned off.  Heeding the specified
maximum number of requested and disclosed tasks under this simulation
mode is turned off.
----------------------------------------------------------------------
Scheduling policies
-------------------

The following are the supported simulator policies for either the
requested or disclosed queues.  The second set is only for the
disclosed queue when simulating batchactive scheduling.

fcfs: fcfs
ufb: user-fb
srpt: srpt
rfcfs: requested-fcfs
urfb: user-requested-fb

hrp: highest request probability
hrr: highest requested resources

The following policies are implemented but not described in the
dissertation and have not been recently tested:

rsrpt: requested-srpt
gsvsd: greatest visible slowdown
rand: random task
randu: random user

----------------------------------------------------------------------
Metrics
-------

The simulator outputs metrics.  Some metrics are scalars.  Others are
sets of six comma-separated numbers.  In order, they represent mean,
total, min, max, standard deviation, and count (# of samples).

The exceptions are the requested and disclosed candidates queue
metrics.  They consist of two numbers each: the average queue length
and the average weight (amount of work) in the queue.

The 6-tuples consist of one sample per user, except for visible
response time and visible slowdown, whose samples are per task.  For
example, for `all resource usage,' each user on completion reports
all the resources it used.  If there are 10 users, there will be 10
samples, and the mean `all resource usage' will be the total reported
by the users divided by 10.

To track 25% and 75% quantile statistics, set USE_QUANT in ba_stats.c
to 1.

In the following list, the name on the left of the colon is the
metric as reported in the dissertation.  On the right of the colon is
its name as output by the simulator.  The word in parentheses names
the appropriate part of the 6-tuple.
mean visible response time: visible response time (mean)

mean visible slowdown: visible slowdown (mean)

visible task throughput: finished tasks (total)

variance of visible response time: visible response time (std. dev.)

deadlines met: deadlines met (total)

variance of user requested resource usage: billable resources
(std. dev.)

mean scaled billed resources: depends on user behavior; when
simulating interactive or batchactive users, this metric should be
ignored, because the mean scaled billed resources is always 1 (the
simulator is not smart enough to report this metric as 1); when
simulating batch users, the metric is scaled resource usage (mean)

load: load

requested load: depends on user behavior; for interactive and batch
users, it is load; for batchactive users, it is requested load

uncharged load: depends on user behavior; for interactive and batch
users, it is 0; for batchactive users, it is load - requested load

# of scheduler decisions: see below for the option that enables this
metric

Additional metrics are reported that are not considered in the
dissertation.  I do not document them.  Beware in using them: some
refer to disclosed and possibly eventually requested tasks, while
others refer to disclosed and never requested tasks.

----------------------------------------------------------------------
Warmup period
-------------

Simulator metrics take time to converge to steady state according to
the warmup period documented in the dissertation.  This warmup period
is set by the constant warmup_abs_time in ba_report.c.  In this
release, the warmup period is 2 days.

When the macro BA_SIM_INSPECT (defined in ba_sim.h) is 1, no warmup
time is elided.  When BA_SIM_INSPECT is 0, the warmup time held in
warmup_abs_time is elided.  See below for the main use of
BA_SIM_INSPECT.

Only the relevant reported metrics in the dissertation heed this
warmup period.  At least the following metrics are tabulated only
after this warmup period expires.
Metrics not listed probably track the entire length of the
simulation.  (Yes, it would be nice if ba_sim output reflected which
metrics did and did not heed the warmup period.)

requested candidates queue
disclosed candidates queue
all resource usage
requested resource usage
disclosed only resource usage
scaled resource usage
billable resources
load
requested load
requested load ignoring idle time
finished tasks
visible response time
visible slowdown
deadlines met

----------------------------------------------------------------------
Watch out for uncomputed metrics
--------------------------------

Certain simulator output metrics are conditionally not computed to
save memory during simulation.  These metrics were not needed for my
thesis.  To compute these metrics, set SAVE_MEMORY in ba_report.c
to 0.  To not compute them, set this macro to 1.

The metrics affected by this macro are:

non-work conserving time
task sets alive time
no task sets alive time
tasks ready time

The internal variables in ba_report.c that hold these values are:

non_work_conserving_delta_time
task_set_alive_delta_time
no_task_sets_delta_time
task_set_ready_tasks_delta_time

which are computed from task_set_alive_periods,
task_set_ready_tasks_periods, no_task_sets_periods, and
non_work_conserving_periods.

----------------------------------------------------------------------
Adding metrics
--------------

In the dissertation I argue why visible response time, which reflects
task output need, is more important than response time, which
reflects when users submit tasks to non-speculative schedulers in an
attempt to minimize the response times they actually experience.
This is why I advocate and report visible response time.

If one wishes to know the traditional response time of tasks, the
simulator can be extended as follows:

for batch and batchactive users, response time for a task begins when
the task set in which it resides is submitted
for interactive users, response time for a task begins when that task
is requested

----------------------------------------------------------------------
Miscellaneous controls
----------------------

Certain code assertions are turned off to increase simulation speed.
To turn them on, set THOROUGH_CHECKS to 1 in ba_user.c.

To output statistics concerning the number of scheduling decisions
and how long they took to make, set COUNT_SCHEDULER_DECISIONS in
ba_sched.c to 1.

To output the number of tasks in queue at each point in time, set one
of the following macros in ba_sched.c to 1: PRINT_NUM_ALL_CANDIDATES,
PRINT_NUM_REQ_CANDIDATES, or PRINT_NUM_DIS_CANDIDATES.  The output
can be fed to qlen_to_graph.pl.

To output the major scheduling events used to generate the simulator
verification in the dissertation, set BA_SIM_INSPECT to 1 in ba_sim.h.

----------------------------------------------------------------------
[eof]