The following image presents an overview of the classes used by the simulator (purple rectangles identify internal objects; teal rectangles represent the objects defined by the user for a simulation).

The Application object defines the simulated applications. The user must define:
- the number of processing units required by each application
- the submission time
- the total walltime
- the request walltimes (a sequence of request times, used in order for consecutive submissions after failures)
- optionally, a resubmission factor, in case the user wishes to resubmit the application until it succeeds
The Application class uses a JobChangeType object internally to keep track of changes to any of its parameters. By default, an application is resubmitted after a failure using the request walltimes in the given sequence. Once all values in the sequence have been used and the last request is still smaller than the walltime, the application fails without being resubmitted. If the resubmission factor is provided, the application keeps being resubmitted after the last request value in the sequence, with this last value increased by the given factor on each attempt, until the application executes successfully.
Example usage:
job = ScheduleFlow.Application(10, 0, 120, [100, 130])
This creates an application requiring 10 processing units, submitted at time 0 (relative to the start of the simulation), needing 120 time units to complete, and making up to two submissions: the first requests 100 time units and ends in a failure; the second requests 130 time units.
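The resubmission rule described above can be illustrated with a small standalone sketch. It does not use ScheduleFlow itself; the function name `submission_attempts`, the `max_attempts` safety cap, and the choice to apply the resubmission factor by multiplication are assumptions made for illustration only.

```python
def submission_attempts(walltime, requests, resubmit_factor=None, max_attempts=50):
    """Return a list of (request_time, succeeded) pairs for one application.

    A submission succeeds when its request time covers the walltime.
    """
    attempts = []
    # First, walk through the user-provided request sequence in order.
    for request in requests:
        succeeded = request >= walltime
        attempts.append((request, succeeded))
        if succeeded:
            return attempts

    # Sequence exhausted: without a resubmission factor the application fails.
    if resubmit_factor is None or not requests:
        return attempts

    # Assumption: the last request is multiplied by the factor on each retry.
    request = requests[-1]
    while len(attempts) < max_attempts:  # hypothetical cap to avoid endless loops
        request = request * resubmit_factor
        succeeded = request >= walltime
        attempts.append((request, succeeded))
        if succeeded:
            break
    return attempts


# Mirrors the example above: walltime 120, requests of 100 and then 130.
print(submission_attempts(120, [100, 130]))
```

With a single request of 50 and a factor of 2, the sketch retries with 100 and then 200 time units before succeeding.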
The Scheduler class defines the methods and policies of a scheduler. Currently it has two implementations:
- a reservation-based batch scheduler that schedules applications in batches (creates reservations for all applications in a batch in advance)
- an online scheduler that chooses the next job to schedule each time a job finishes
Both schedulers use a larger-jobs-first policy, in which larger jobs are scheduled before smaller ones (to increase the utilization of the machine). The batch scheduler implements a backfilling algorithm that uses small jobs (that have not yet been given a reservation) to fill in the gaps created by underestimating the walltime (request time < walltime).
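As a rough illustration of the larger-jobs-first idea, the sketch below picks which waiting jobs to start given the free processing units. It is a simplified greedy selection, not the ScheduleFlow implementation: the function name `pick_next_jobs`, the `(job_id, units)` tuples, and the choice to skip over a job that does not fit are all assumptions.

```python
def pick_next_jobs(waiting, free_units):
    """Choose jobs to start now, considering the largest jobs first.

    waiting: list of (job_id, processing_units) tuples.
    Returns the ids of the jobs that fit, in the order they were selected.
    """
    started = []
    # Sort by requested processing units, largest first.
    for job_id, units in sorted(waiting, key=lambda job: job[1], reverse=True):
        if units <= free_units:
            started.append(job_id)
            free_units -= units  # these units are now in use
    return started


# On a system with 10 free units, the 8-unit job goes first, then the 2-unit job.
print(pick_next_jobs([("a", 4), ("b", 8), ("c", 2)], 10))
```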
To create a scheduler, the user must provide:
- a system
- the batch size (in case the Batch Scheduler is used)
The System object is defined by the total number of processing units it contains. Internally, it keeps track of free and used nodes during the simulation.
Example usage:
sch = ScheduleFlow.BatchScheduler(ScheduleFlow.System(10))
This creates a Batch Scheduler over a system of 10 processing units.
The Simulator class defines global simulation parameters and can run multiple scenarios within these parameters. Upon creation, the user must define:
- generate_gif: whether the simulation creates an animation of every job schedule (False by default)
- check_correctness: whether the simulation checks the correctness of the final schedule (False by default)
- output_file_handler: where the output is written (stdout or a file pointer)
Users can use the run_scenario method to trigger a scheduling simulation by providing:
- the scheduler to be used (one of the classes inheriting the Scheduler object)
- a list of applications
Internally, the Simulator uses a Runtime object to keep track of all the events and to interact with the scheduler and the system. The Runtime uses an EventQueue object to implement the queue of events related to the states a job goes through during scheduling (job in the waiting queue, job start, job end, new schedule is triggered).
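To make the event queue idea concrete, here is a minimal time-ordered queue built on Python's `heapq`. It is only a sketch of the general technique: the `EventType` names, the `push`/`pop` methods, and the tie-breaking counter are assumptions and do not reflect the actual ScheduleFlow EventQueue interface.

```python
import heapq
from enum import IntEnum


class EventType(IntEnum):
    # Hypothetical names for the job states mentioned in the text.
    JOB_SUBMISSION = 0
    JOB_START = 1
    JOB_END = 2
    TRIGGER_SCHEDULE = 3


class EventQueue:
    """Priority queue that returns events in order of simulation time."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # keeps insertion order for events at the same time

    def push(self, time, event_type, job_id=None):
        heapq.heappush(self._heap, (time, self._counter, event_type, job_id))
        self._counter += 1

    def pop(self):
        time, _, event_type, job_id = heapq.heappop(self._heap)
        return time, event_type, job_id

    def __len__(self):
        return len(self._heap)


queue = EventQueue()
queue.push(100, EventType.JOB_END, "job0")
queue.push(0, EventType.JOB_SUBMISSION, "job0")
queue.push(0, EventType.JOB_START, "job0")

# Events come back sorted by time, regardless of the order they were pushed.
order = [queue.pop() for _ in range(len(queue))]
for time, event_type, job_id in order:
    print(time, event_type.name, job_id)
```

A runtime loop built on such a queue would repeatedly pop the earliest event and update the scheduler and system state accordingly.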
In addition, the Simulator internally uses a Visualization Engine for generating the GIF animation and a Statistics Engine for providing different metrics (system utilization, job response time, job wait time) for a simulation.
Example usage:
simulator = ScheduleFlow.Simulator(check_correctness=True,
                                   generate_gif=False,
                                   output_file_handler=sys.stdout)
simulator.run_scenario("test", sch, job_list=job_list)
This creates a Simulator that checks for correctness, does not generate an animation, and prints the statistics to stdout. The code then runs the "test" scenario using the sch scheduler and the job_list list of applications.
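The metrics named above (system utilization, job response time, job wait time) can be sketched from a finished schedule as follows. The definitions used here are common ones and are assumptions, not necessarily those of the Statistics Engine: wait time is start minus submission, response time is end minus submission, and utilization is the used unit-time divided by the unit-time available over the makespan. The function name `schedule_metrics` and the dictionary layout are hypothetical.

```python
def schedule_metrics(jobs, total_units):
    """Compute simple metrics for a completed schedule.

    jobs: list of dicts with keys "submit", "start", "end", "units".
    total_units: number of processing units in the system.
    """
    first_submit = min(job["submit"] for job in jobs)
    last_end = max(job["end"] for job in jobs)
    makespan = last_end - first_submit

    # Unit-time actually consumed by running jobs.
    used = sum((job["end"] - job["start"]) * job["units"] for job in jobs)

    count = len(jobs)
    return {
        "utilization": used / (makespan * total_units),
        "avg_wait_time": sum(job["start"] - job["submit"] for job in jobs) / count,
        "avg_response_time": sum(job["end"] - job["submit"] for job in jobs) / count,
    }


# Two jobs on a 10-unit system: one fills the machine, the next uses half of it.
example_jobs = [
    {"submit": 0, "start": 0, "end": 10, "units": 10},
    {"submit": 0, "start": 10, "end": 20, "units": 5},
]
metrics = schedule_metrics(example_jobs, total_units=10)
print(metrics)
```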