- Static mode: you define your cluster ahead of Swift runs.
- Dynamic mode: cloud resources are provisioned dynamically.
Prerequisites: Java 1.7, Ant, and Python 2.7.
The following steps install Swift:
# Install swift-trunk from git
https://github.com/swift-lang/swift-k.git
# Extract package
tar xfz swift-0.95-RC6.tar.gz
# Add swift to the PATH environment variable
export PATH=$PATH:/path/to/swift-0.95-RC6/bin
Clone the repository from GitHub:
git clone https://github.com/yadudoc/swift-on-cloud.git
cd swift-on-cloud
Or, download the zip file from GitHub and unpack it:
# Download
wget https://github.com/yadudoc/swift-on-cloud/archive/master.zip
unzip master.zip
mv swift-on-cloud-master swift-on-cloud
cd swift-on-cloud
To run the tutorial on Google Compute Engine (GCE), follow the instructions here:
https://github.com/yadudoc/swift-on-cloud/tree/master/compute-engine
or follow the GCE instructions in the compute-engine folder of the swift-on-cloud
repository. Once your instances are running, connect to the headnode. Everything
that you require for the swift-cloud-tutorial is already set up for you on the headnode.
This tutorial is based on two intentionally trivial example programs,
simulate.sh and stats.sh (implemented as bash shell scripts),
that serve as easy-to-understand proxies for real science
applications. These "programs" behave as follows.
The simulate.sh script serves as a trivial proxy for any more complex scientific simulation application. It generates and prints a set of one or more random integers in the range [0-2^62), as controlled by its command-line arguments:
$ ./app/simulate.sh --help
./app/simulate.sh: usage:
-b|--bias offset bias: add this integer to all results [0]
-B|--biasfile file of integer biases to add to results [none]
-l|--log generate a log in stderr if not null [y]
-n|--nvalues print this many values per simulation [1]
-r|--range range (limit) of generated results [100]
-s|--seed use this integer [0..32767] as a seed [none]
-S|--seedfile use this file (containing integer seeds [0..32767]) one per line [none]
-t|--timesteps number of simulated "timesteps" in seconds (determines runtime) [1]
-x|--scale scale the results by this integer [1]
-h|-?|?|--help print this help
$
All of these arguments are optional, with default values indicated above as [n].
With no arguments, simulate.sh prints 1 number in the range of 1-100. Otherwise it generates n numbers of the form (R*scale)+bias where R is a random integer. By default it logs information about its execution environment to stderr. Here are some examples of its usage:
$ simulate.sh 2>log
5
$ head -4 log
Called as: /home/wilde/swift/tut/CIC_2013-08-09/app/simulate.sh:
Start time: Thu Aug 22 12:40:24 CDT 2013
Running on node: login01.osgconnect.net
$ simulate.sh -n 4 -r 1000000 2>log
239454
386702
13849
873526
$ simulate.sh -n 3 -r 1000000 -x 100 2>log
6643700
62182300
5230600
$ simulate.sh -n 2 -r 1000 -x 1000 2>log
565000
636000
$ time simulate.sh -n 2 -r 1000 -x 1000 -t 3 2>log
336000
320000
real 0m3.012s
user 0m0.005s
sys 0m0.006s
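The value formula described above, (R*scale)+bias, can be sketched in a few lines of shell. This is only an illustration and not the actual simulate.sh source: the function name apply_scale_bias is hypothetical, and the raw values R are read from stdin instead of being drawn at random so that the result is reproducible.

```shell
# apply_scale_bias SCALE BIAS (hypothetical helper, not part of the tutorial):
# read one raw integer R per line from stdin and print (R*SCALE)+BIAS.
apply_scale_bias() {
    scale=$1
    bias=$2
    while read -r r; do
        echo $(( $r * $scale + $bias ))
    done
}

# Raw values 565 and 636, scaled by 1000 with no bias
printf '565\n636\n' | apply_scale_bias 1000 0
```

With these inputs the sketch prints 565000 and 636000, the same values shown in the -x 1000 example above.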
The stats.sh script serves as a trivial model of an "analysis" program. It reads N files, each containing M integers, and simply prints the average of all those numbers to stdout. Like simulate.sh, it logs environmental information to stderr.
$ ls f*
f1 f2 f3 f4
$ cat f*
25
60
40
75
$ stats.sh f* 2>log
50
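The averaging step can be sketched as follows. This is a hypothetical stand-in for stats.sh, not its actual source: the function name average_files is an assumption, as are the one-integer-per-line input format and the choice to truncate the average to an integer.

```shell
# average_files FILE... (hypothetical stand-in for stats.sh):
# print the integer average of all numbers in the given files,
# assuming one integer per line. Blank lines are skipped.
average_files() {
    cat "$@" | awk 'NF > 0 { sum += $1; n++ }
                    END { if (n > 0) printf "%d\n", sum / n }'
}
```

Called as `average_files f*` on four files containing 25, 60, 40 and 75, the sketch prints 50, as in the session above.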
- Swift scripts are text files ending in .swift. The swift command runs on any host, and executes these scripts. swift is a Java application, which you can install almost anywhere. On Linux, just unpack the distribution tar file and add its bin/ directory to your PATH.
- Swift scripts run ordinary applications, just like shell scripts do. Swift makes it easy to run these applications on parallel and remote computers (from laptops to supercomputers). If you can ssh to the system, Swift can likely run applications there.
- The details of where to run applications and how to get files back and forth are described in configuration files separate from your program. Swift speaks ssh, PBS, Condor, SLURM, LSF, SGE, Cobalt, and Globus to run applications, and scp, http, ftp, and GridFTP to move data.
- The Swift language has 5 main data types: boolean, int, string, float, and file. Collections of these are dynamic, sparse arrays of arbitrary dimension and structures of scalars and/or arrays defined by the type declaration.
- Swift file variables are "mapped" to external files. Swift sends files to and from remote systems for you automatically.
- Swift variables are "single assignment": once you set them you can’t change them (in a given block of code). This makes Swift a natural, "parallel data flow" language. This programming model keeps your workflow scripts simple and easy to write and understand.
- Swift lets you define functions to "wrap" application programs, and to cleanly structure more complex scripts. Swift app functions take files and parameters as inputs and return files as outputs.
- A compact set of built-in functions for string and file manipulation, type conversions, high level IO, etc. is provided. Swift’s equivalent of printf() is tracef(), with limited and slightly different format codes.
- Swift’s foreach {} statement is the main parallel workhorse of the language, and executes all iterations of the loop concurrently. The actual number of parallel tasks executed is based on available resources and settable "throttles".
- In fact, Swift conceptually executes all the statements, expressions and function calls in your program in parallel, based on data flow. These are similarly throttled based on available resources and settings.
- Swift also has if and switch statements for conditional execution. These are seldom needed in simple workflows but they enable very dynamic workflow patterns to be specified.
We’ll see many of these points in action in the examples below. Let’s get started!
The first Swift script, p1.swift, runs simulate.sh to generate a single random number. It writes the number to a file.
sys::[cat ../part01/p1.swift]
To run this script, run the following command:
$ cd part01
$ swift p1.swift
Swift 0.94.1 RC2 swift-r6895 cog-r3765
RunID: 20130827-1413-oa6fdib2
Progress: time: Tue, 27 Aug 2013 14:13:33 -0500
Final status: Tue, 27 Aug 2013 14:13:33 -0500 Finished successfully:1
$ cat sim.out
84
$ swift p1.swift
$ cat sim.out
36
To clean up the directory and remove all outputs (including the log files and directories that Swift generates), run the cleanup script, which is located in the tutorial PATH:
$ cleanup
Note: You’ll also find two Swift configuration files in each partNN
directory of this tutorial. These specify the environment-specific
details of where to find application programs (file apps) and where
to run them (file sites.xml). These files will be explained in more
detail in parts 4-6, and can be ignored for now.
The p2.swift script introduces the foreach parallel iteration
construct to run many concurrent simulations.
sys::[cat ../part02/p2.swift]
The script also shows an
example of naming the output files of an ensemble run. In this case, the output files will be named
output/sim_N.out.
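As a rough analogy for what the foreach loop does, the following shell sketch starts several independent tasks at once and gives each its own output/sim_N.out file. It is not how Swift is implemented: the function name run_ensemble and the placeholder task are hypothetical, and unlike Swift the sketch applies no throttling and runs everything on the local host.

```shell
# run_ensemble NSIM OUTDIR (hypothetical helper): start NSIM independent tasks
# at once, each writing OUTDIR/sim_N.out, then wait for all of them to finish.
run_ensemble() {
    nsim=$1
    outdir=$2
    mkdir -p "$outdir"
    i=0
    while [ "$i" -lt "$nsim" ]; do
        # Placeholder task; a real run would invoke the simulation here.
        ( echo "result of task $i" > "$outdir/sim_$i.out" ) &
        i=$(( i + 1 ))
    done
    wait   # block until every background task has completed
}
```

Calling `run_ensemble 10 output` leaves ten files, sim_0.out through sim_9.out, in the output directory.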
In part 2, we also update the apps file. Instead of using the shell script (simulate.sh), we use the equivalent Python version (simulate.py). The new apps file now looks like this:
sys::[cat ../part02/apps]
Swift does not need to know anything about the language an application is written in. The application can be written in Perl, Python, Java, Fortran, or any other language.
To run the script and view the output:
$ cd ../part02
$ swift p2.swift
$ ls output
sim_0.out sim_1.out sim_2.out sim_3.out sim_4.out sim_5.out sim_6.out sim_7.out sim_8.out sim_9.out
$ more output/*
::::::::::::::
output/sim_0.out
::::::::::::::
44
::::::::::::::
output/sim_1.out
::::::::::::::
55
...
::::::::::::::
output/sim_9.out
::::::::::::::
82
After all the parallel simulations in an ensemble run have completed,
it’s typically necessary to gather and analyze their results with some
kind of post-processing analysis program or script. p3.swift
introduces such a post-processing step. In this case, the files created
by all of the parallel runs of simulate.sh will be averaged by
the trivial "analysis application" stats.sh:
sys::[cat ../part03/p3.swift]
To run:
$ cd ../part03
$ swift p3.swift
Note that in p3.swift we expose more of the capabilities of the
simulate.sh application to the simulation() app function:
app (file o) simulation (int sim_steps, int sim_range, int sim_values)
{
simulate "--timesteps" sim_steps "--range" sim_range "--nvalues" sim_values stdout=filename(o);
}
p3.swift also shows how to fetch application-specific values from
the swift command line in a Swift script using arg() which
accepts a keyword-style argument and its default value:
int nsim = toInt(arg("nsim","10"));
int steps = toInt(arg("steps","1"));
int range = toInt(arg("range","100"));
int values = toInt(arg("values","5"));
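The keyword-with-default lookup that arg() performs can be imitated in shell for arguments of the form -name=value. This is only a sketch of the idea, not Swift's implementation; the function name get_arg is hypothetical.

```shell
# get_arg NAME DEFAULT [ARG...] (hypothetical helper): print the value given as
# -NAME=value among the arguments, or DEFAULT when no such argument is present.
get_arg() {
    prefix="-$1="
    value=$2
    shift 2
    for a in "$@"; do
        case $a in
            "$prefix"*) value=${a#"$prefix"} ;;   # strip the "-NAME=" prefix
        esac
    done
    echo "$value"
}

# Keyword "nsim" with default 10, looked up in an example argument list
get_arg nsim 10 -nsim=3 -steps=10
```

The call at the end prints 3. With no -nsim= argument in the list, it would print the default, 10.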
Now we can specify that more runs should be performed, that each should run for more timesteps, and that each should produce more than one value within a specified range, using command-line arguments placed after the Swift script name in the form -parameterName=value:
$ swift p3.swift -nsim=3 -steps=10 -values=4 -range=1000000
Swift 0.94.1 RC2 swift-r6895 cog-r3765
RunID: 20130827-1439-s3vvo809
Progress: time: Tue, 27 Aug 2013 14:39:42 -0500
Progress: time: Tue, 27 Aug 2013 14:39:53 -0500 Active:2 Stage out:1
Final status: Tue, 27 Aug 2013 14:39:53 -0500 Finished successfully:4
$ ls output/
average.out sim_0.out sim_1.out sim_2.out
$ more output/*
::::::::::::::
output/average.out
::::::::::::::
651368
::::::::::::::
output/sim_0.out
::::::::::::::
735700
886206
997391
982970
::::::::::::::
output/sim_1.out
::::::::::::::
260071
264195
869198
933537
::::::::::::::
output/sim_2.out
::::::::::::::
201806
213540
527576
944233
Now try running (-nsim=) 100 simulations of (-steps=) 1 second each:
$ swift p3.swift -nsim=100 -steps=1
Swift 0.94.1 RC2 swift-r6895 cog-r3765
RunID: 20130827-1444-rq809ts6
Progress: time: Tue, 27 Aug 2013 14:44:55 -0500
Progress: time: Tue, 27 Aug 2013 14:44:56 -0500 Selecting site:79 Active:20 Stage out:1
Progress: time: Tue, 27 Aug 2013 14:44:58 -0500 Selecting site:58 Active:20 Stage out:1 Finished successfully:21
Progress: time: Tue, 27 Aug 2013 14:44:59 -0500 Selecting site:37 Active:20 Stage out:1 Finished successfully:42
Progress: time: Tue, 27 Aug 2013 14:45:00 -0500 Selecting site:16 Active:20 Stage out:1 Finished successfully:63
Progress: time: Tue, 27 Aug 2013 14:45:02 -0500 Active:15 Stage out:1 Finished successfully:84
Progress: time: Tue, 27 Aug 2013 14:45:03 -0500 Finished successfully:101
Final status: Tue, 27 Aug 2013 14:45:03 -0500 Finished successfully:101
We can see from Swift’s "progress" status that the tutorial’s default
swift.properties parameters for local execution allow Swift to run up to 20
application invocations concurrently on the login node. We’ll look at
this in more detail in the next sections where we execute applications
on the site’s compute nodes.
p4.swift will run our mock "simulation"
applications on compute nodes. The script is similar to
p3.swift, but specifies that each simulation app invocation should
additionally return the log file which the application writes to
stderr.
Now when you run swift p4.swift you’ll see that two types of output
files will be placed in the output/ directory: sim_N.out and
sim_N.log. The log files provide data on the runtime environment of
each app invocation. For example:
$ cat output/sim_0.log
Called as: simulate.sh: --timesteps 1 --range 100 --nvalues 5
Start time: Tue Oct 22 14:54:11 CDT 2013
Running as user: uid=5116(davidk) gid=311(collab) groups=311(collab),104(fuse),1349(swift),45053(swat)
Running on node: stomp
Node IP address: 140.221.9.237
Simulation parameters:
bias=0
biasfile=none
initseed=none
log=yes
paramfile=none
range=100
scale=1
seedfile=none
timesteps=1
output width=8
Environment:
EDITOR=vim
HOME=/homes/davidk
JAVA_HOME=/nfs/proj-davidk/jdk1.7.0_01
LANG=C
....
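Capturing an application's stdout and stderr into a matching pair of sim_N.out and sim_N.log files can be sketched in plain shell as follows. This only illustrates the file pairing, not how Swift stages files: run_and_capture and demo_app are hypothetical names, and demo_app is a stand-in application rather than simulate.sh.

```shell
# demo_app (hypothetical stand-in application): print a result on stdout
# and one log line on stderr.
demo_app() {
    echo "Called as: demo_app $*" >&2
    echo 42
}

# run_and_capture INDEX OUTDIR COMMAND [ARG...] (hypothetical helper):
# run COMMAND, sending stdout to OUTDIR/sim_INDEX.out
# and stderr to OUTDIR/sim_INDEX.log.
run_and_capture() {
    index=$1
    outdir=$2
    shift 2
    mkdir -p "$outdir"
    "$@" > "$outdir/sim_$index.out" 2> "$outdir/sim_$index.log"
}
```

For example, `run_and_capture 0 output demo_app --nvalues 5` writes 42 to output/sim_0.out and the "Called as" line to output/sim_0.log.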
To test with larger runs, there are two changes that are required. The first is a change to the command line arguments. The example below will run 1000 simulations with each simulation taking 5 seconds.
$ swift p6.swift -steps=5 -nsim=1000
This section is under development.
p6.swift expands the workflow pattern of p4.swift to add additional stages to the workflow. Here, we generate a dynamic seed value that will be used by all of the simulations, and for each simulation, we run a pre-processing application to generate a unique "bias file". This pattern is shown below, followed by the Swift script.
sys::[cat ../part06/p6.swift]
Note that the workflow is based on data flow dependencies: each simulation depends on the seed value, calculated in this statement:
seedfile = genseed(1);
and on the bias file, computed and then consumed in these two dependent statements:
biasfile = genbias(1000, 20, simulate_script);
(simout,simlog) = simulation(steps, range, biasfile, 1000000, values, simulate_script, seedfile);
To run:
$ cd ../part06
$ swift p6.swift
The default parameters result in the following execution log:
$ swift p6.swift
Swift 0.94.1 RC2 swift-r6895 cog-r3765
RunID: 20130827-1917-jvs4gqm5
Progress: time: Tue, 27 Aug 2013 19:17:56 -0500
*** Script parameters: nsim=10 range=100 num values=10
Progress: time: Tue, 27 Aug 2013 19:17:57 -0500 Stage in:1 Submitted:10
Generated seed=382537
Progress: time: Tue, 27 Aug 2013 19:17:59 -0500 Active:9 Stage out:1 Finished successfully:11
Final status: Tue, 27 Aug 2013 19:18:00 -0500 Finished successfully:22
which produces the following output:
$ ls -lrt output
total 264
-rw-r--r-- 1 p01532 61532 9 Aug 27 19:17 seed.dat
-rw-r--r-- 1 p01532 61532 180 Aug 27 19:17 bias_9.dat
-rw-r--r-- 1 p01532 61532 180 Aug 27 19:17 bias_8.dat
-rw-r--r-- 1 p01532 61532 180 Aug 27 19:17 bias_7.dat
-rw-r--r-- 1 p01532 61532 180 Aug 27 19:17 bias_6.dat
-rw-r--r-- 1 p01532 61532 180 Aug 27 19:17 bias_5.dat
-rw-r--r-- 1 p01532 61532 180 Aug 27 19:17 bias_4.dat
-rw-r--r-- 1 p01532 61532 180 Aug 27 19:17 bias_3.dat
-rw-r--r-- 1 p01532 61532 180 Aug 27 19:17 bias_2.dat
-rw-r--r-- 1 p01532 61532 180 Aug 27 19:17 bias_1.dat
-rw-r--r-- 1 p01532 61532 180 Aug 27 19:17 bias_0.dat
-rw-r--r-- 1 p01532 61532 90 Aug 27 19:17 sim_9.out
-rw-r--r-- 1 p01532 61532 14897 Aug 27 19:17 sim_9.log
-rw-r--r-- 1 p01532 61532 14897 Aug 27 19:17 sim_8.log
-rw-r--r-- 1 p01532 61532 90 Aug 27 19:17 sim_7.out
-rw-r--r-- 1 p01532 61532 90 Aug 27 19:17 sim_6.out
-rw-r--r-- 1 p01532 61532 14897 Aug 27 19:17 sim_6.log
-rw-r--r-- 1 p01532 61532 90 Aug 27 19:17 sim_5.out
-rw-r--r-- 1 p01532 61532 14897 Aug 27 19:17 sim_5.log
-rw-r--r-- 1 p01532 61532 90 Aug 27 19:17 sim_4.out
-rw-r--r-- 1 p01532 61532 14897 Aug 27 19:17 sim_4.log
-rw-r--r-- 1 p01532 61532 14897 Aug 27 19:17 sim_1.log
-rw-r--r-- 1 p01532 61532 90 Aug 27 19:18 sim_8.out
-rw-r--r-- 1 p01532 61532 14897 Aug 27 19:18 sim_7.log
-rw-r--r-- 1 p01532 61532 90 Aug 27 19:18 sim_3.out
-rw-r--r-- 1 p01532 61532 14897 Aug 27 19:18 sim_3.log
-rw-r--r-- 1 p01532 61532 90 Aug 27 19:18 sim_2.out
-rw-r--r-- 1 p01532 61532 14898 Aug 27 19:18 sim_2.log
-rw-r--r-- 1 p01532 61532 90 Aug 27 19:18 sim_1.out
-rw-r--r-- 1 p01532 61532 90 Aug 27 19:18 sim_0.out
-rw-r--r-- 1 p01532 61532 14897 Aug 27 19:18 sim_0.log
-rw-r--r-- 1 p01532 61532 9 Aug 27 19:18 average.out
-rw-r--r-- 1 p01532 61532 14675 Aug 27 19:18 average.log
Each sim_N.out file is the sum of its bias file plus newly "simulated" random output scaled by 1,000,000:
$ cat output/bias_0.dat
302
489
81
582
664
290
839
258
506
310
293
508
88
261
453
187
26
198
402
555
$ cat output/sim_0.out
64000302
38000489
32000081
12000582
46000664
36000290
35000839
22000258
49000506
75000310
We produce 20 values in each bias file. Simulations that generate fewer than 20 values ignore the unneeded bias values, while simulations that generate more than 20 use the last bias value for all remaining values past 20. As an exercise, adjust the code to produce the same number of bias values as is needed for each simulation. As a further exercise, modify the script to generate a unique seed value for each simulation, which is a common practice in ensemble computations.
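The rule for pairing values with biases (use the i-th bias for the i-th value, and reuse the last bias once the bias file runs out) can be sketched with awk. This illustrates the rule only and is not taken from simulate.sh: the function name add_bias is hypothetical, and the sketch assumes a non-empty bias file with one integer per line.

```shell
# add_bias BIASFILE (hypothetical helper): read already-scaled values from stdin,
# one per line, and add the bias found on the same line number of BIASFILE.
# Values beyond the end of BIASFILE reuse its last bias.
# Assumes BIASFILE is non-empty.
add_bias() {
    awk 'NR == FNR { bias[++nb] = $1; next }        # first input: load biases
         { k = (FNR <= nb) ? FNR : nb               # clamp index to last bias
           printf "%d\n", $1 + bias[k] }' "$1" -
}
```

With a two-line bias file containing 302 and 489, the input values 64000000, 38000000 and 32000000 become 64000302, 38000489 and 32000489: the third value reuses the last bias.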
- When you start instances on OSDC, use the standard Ubuntu image.
- Ensure that your SSH key is added to the instance for passwordless login.
- Swift should run on the OSDC headnode.
- You can use the following command within coaster-service.conf to automatically populate WORKER_HOSTS with the IP addresses of all your active instances:
export WORKER_HOSTS=$( nova list | grep ACTIVE | sed -e 's/^.*private=//' -e 's/ .*//' |sed ':a;N;$!ba;s/\n/ /g' )
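To see what the pipeline above extracts, the same filtering can be wrapped in a function and fed sample text instead of live nova output. This is a sketch under assumptions: the function name extract_worker_hosts is hypothetical, the sample table layout in the usage note is invented for illustration (real nova list output has more columns), and the final line-joining step uses paste in place of the sed loop in the original command.

```shell
# extract_worker_hosts (hypothetical helper): read nova-list-style text on stdin
# and print the private IPs of ACTIVE instances, separated by single spaces.
extract_worker_hosts() {
    grep ACTIVE |
        sed -e 's/^.*private=//' -e 's/ .*//' |   # keep only the address
        paste -s -d ' ' -                         # join all lines with spaces
}
```

Given two hypothetical ACTIVE rows ending in `private=10.0.0.5 |` and `private=10.0.0.7 |`, plus a non-ACTIVE row, the function prints `10.0.0.5 10.0.0.7`.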



