<?php  
            include_once( $_SERVER['DOCUMENT_ROOT']."/static/includes/common.inc.php" );
            do_html_header("Documentation");
        ?><div id="content">
<div class="navheader">
<table width="100%" summary="Navigation header"><tr>
<td width="20%" align="left">
<a accesskey="p" href="monitoring_debugging_stats.php">Prev</a> </td>
<td width="60%" align="center"><a accesskey="h" href="index.php">Table of Contents</a></td>
<td width="20%" align="right"> <a accesskey="n" href="reference.php">Next</a>
</td>
</tr></table>
<hr>
</div>
<div class="chapter" title="Chapter 9. Example Workflows">
<div class="titlepage"><div><div><h2 class="title">
<a name="example_workflows"></a>Chapter 9. Example Workflows</h2></div></div></div>
<div class="toc"><dl>
<dt><span class="section"><a href="example_workflows.php#grid_examples">9.1. Grid Examples</a></span></dt>
<dt><span class="section"><a href="example_workflows.php#condor_examples">9.2. Condor Examples</a></span></dt>
<dt><span class="section"><a href="example_workflows.php#local_shell_examples">9.3. Local Shell Examples</a></span></dt>
<dt><span class="section"><a href="example_workflows.php#notifications_example">9.4. Notifications Example</a></span></dt>
<dt><span class="section"><a href="example_workflows.php#workflow_of_workflows">9.5. Workflow of Workflows</a></span></dt>
</dl></div>
<p>These examples are included in the Pegasus distribution and can be
  found under <code class="filename">share/pegasus/examples</code> in your Pegasus
  install (<code class="filename">/usr/share/pegasus/examples</code> for native
  packages).</p>
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Note</h3>
<p>These examples are intended as a starting point for creating your own
    workflows and for seeing how other workflows are set up. They will
    probably not work in your environment without modifications. Site and
    transformation catalogs contain site- and user-specific settings, such as
    paths to scratch directories and installed software, so at least minor
    modifications are required to get the workflows to plan and run.</p>
</div>
<div class="section" title="9.1. Grid Examples">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="grid_examples"></a>9.1. Grid Examples</h2></div></div></div>
<div class="toc"><dl>
<dt><span class="section"><a href="example_workflows.php#example_black_diamond">9.1.1. Black Diamond</a></span></dt>
<dt><span class="section"><a href="example_workflows.php#idp8268512">9.1.2. NASA/IPAC Montage</a></span></dt>
<dt><span class="section"><a href="example_workflows.php#idp15149888">9.1.3. Rosetta</a></span></dt>
</dl></div>
<p>These examples assume you have access to a cluster with Globus
    installed. A pre-ws gatekeeper and a gridftp server are required. You also
    need Globus and Pegasus installed on both the machine you are submitting
    from and the cluster.</p>
<div class="section" title="9.1.1. Black Diamond">
<div class="titlepage"><div><div><h3 class="title">
<a name="example_black_diamond"></a>9.1.1. Black Diamond</h3></div></div></div>
<p>Pegasus ships with three different Black Diamond examples for the
      grid, one for each of the available DAX APIs: Java, Perl and Python.
      The examples can be found under:</p>
<pre class="programlisting">share/pegasus/examples/grid-blackdiamond-java/
share/pegasus/examples/grid-blackdiamond-perl/
share/pegasus/examples/grid-blackdiamond-python/</pre>
<p>The workflow has four nodes, laid out in a diamond shape, with files
      (f.*) being passed between them:</p>
<div class="mediaobject" align="center"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0"><tr><td align="center" valign="middle"><img src="images/examples-diamond.jpg" align="middle"></td></tr></table></div>
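<p>The diamond shape can also be written down as plain data. The following
      is a minimal Python sketch; it is not part of the Pegasus distribution
      and does not use the DAX API. The job names and the individual file
      names (f.a, f.b1 and so on) are assumptions chosen to match the f.*
      pattern. The sketch derives the parent/child edges from the files each
      job reads and writes, then computes one order in which the jobs could
      run.</p>

```python
from collections import deque

# Hypothetical job and file names, following the f.* naming in the text.
# Each job lists the files it reads and the files it writes.
JOBS = {
    "preprocess": {"inputs": ["f.a"], "outputs": ["f.b1", "f.b2"]},
    "findrange_1": {"inputs": ["f.b1"], "outputs": ["f.c1"]},
    "findrange_2": {"inputs": ["f.b2"], "outputs": ["f.c2"]},
    "analyze": {"inputs": ["f.c1", "f.c2"], "outputs": ["f.d"]},
}


def derive_edges(jobs):
    """Return sorted (parent, child) pairs: a child reads a file its parent writes."""
    producer = {}
    for name, spec in jobs.items():
        for filename in spec["outputs"]:
            producer[filename] = name
    edges = set()
    for name, spec in jobs.items():
        for filename in spec["inputs"]:
            if filename in producer:
                edges.add((producer[filename], name))
    return sorted(edges)


def run_order(jobs):
    """Kahn's algorithm: a job becomes ready once all of its parents have run."""
    edges = derive_edges(jobs)
    pending = {name: 0 for name in jobs}
    for _, child in edges:
        pending[child] += 1
    ready = deque(sorted(name for name, count in pending.items() if count == 0))
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)
        for parent, child in edges:
            if parent == job:
                pending[child] -= 1
                if pending[child] == 0:
                    ready.append(child)
    return order


if __name__ == "__main__":
    for parent, child in derive_edges(JOBS):
        print(parent, "->", child)
    print(run_order(JOBS))
```

<p>The input file f.a has no producer inside the workflow, so it yields no
      edge; in the shipped examples such files are the ones listed in the
      replica catalog.</p>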
<p>The binary for the nodes is a simple "mock application" named
      <span class="command"><strong>keg</strong></span> ("canonical example for the grid"), which reads
      the input files designated by its arguments, writes them back to output
      files, and prints to STDOUT a summary of where and when it was run. Keg
      ships with Pegasus in the bin directory.</p>
<p>This example ships with a "submit" script which will build the
      replica catalog, the transformation catalog, and the site catalog. When
      you create your own workflows, such a submit script is not needed if you
      want to maintain those catalogs manually.</p>
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Note</h3>
<p>The <code class="filename">./submit</code> scripts in these
        examples exist only to make the examples easier to run out of the
        box. For a production site, the catalogs (transformation, replica,
        site) may be static, or they may be generated by other tooling.</p>
</div>
<p>To test the examples, edit the <span class="command"><strong>submit</strong></span> script
      and change the cluster config to the setup and install locations for
      your cluster. Then run:</p>
<pre class="programlisting">$ <span class="bold"><strong>./submit</strong></span></pre>
<p>The workflow should now be submitted, and the output should include
      the work directory location for the instance. Using that directory, you
      can monitor the workflow with:</p>
<pre class="programlisting">$ <span class="bold"><strong>pegasus-status [workdir]</strong></span></pre>
<p>Once the workflow is done, you can make sure it was successful
      with:</p>
<pre class="programlisting">$ <span class="bold"><strong>pegasus-analyzer -d [workdir]</strong></span></pre>
</div>
<div class="section" title="9.1.2. NASA/IPAC Montage">
<div class="titlepage"><div><div><h3 class="title">
<a name="idp8268512"></a>9.1.2. NASA/IPAC Montage</h3></div></div></div>
<p>This example can be found under</p>
<pre class="programlisting"><code class="filename">share/pegasus/examples/grid-montage/</code></pre>
<p>The NASA/IPAC Montage (<a class="ulink" href="http://montage.ipac.caltech.edu/" target="_top">http://montage.ipac.caltech.edu/</a>)
      workflow projects and montages a set of input images from telescopes like
      Hubble and ends up with images like <a class="ulink" href="http://montage.ipac.caltech.edu/images/m104.jpg" target="_top">http://montage.ipac.caltech.edu/images/m104.jpg</a>.
      The test workflow is for a 1 by 1 degree tile. It has about 45 input
      images, all of which have to be projected, background modeled and
      adjusted to come out as one seamless image.</p>
<p>Just like the <a class="xref" href="example_workflows.php#example_black_diamond" title="9.1.1. Black Diamond">Black Diamond</a> above, this example uses a <code class="filename">./submit</code>
      script.</p>
<p>The Montage DAX is generated by
      <code class="filename">mDAG</code>, a tool shipped with Montage.</p>
</div>
<div class="section" title="9.1.3. Rosetta">
<div class="titlepage"><div><div><h3 class="title">
<a name="idp15149888"></a>9.1.3. Rosetta</h3></div></div></div>
<p>This example can be found under</p>
<pre class="programlisting"><code class="filename">share/pegasus/examples/grid-rosetta/</code></pre>
<p>Rosetta (<a class="ulink" href="http://www.rosettacommons.org/" target="_top">http://www.rosettacommons.org/</a>)
      is high-resolution protein prediction and design software. Highlights
      in this example are:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>Using the Pegasus Java API to generate the DAX</p></li>
<li class="listitem"><p>The DAX generator loops over the input PDBs and creates a job
          for each input</p></li>
<li class="listitem"><p>The jobs all have a dependency on a flatfile database. For
          simplicity, each job depends on all the files in the database
          directory.</p></li>
<li class="listitem"><p>Job clustering is turned on to make each grid job run longer
          and better utilize the compute cluster</p></li>
</ul></div>
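<p>The looping pattern described above can be sketched in a few lines. The
      example itself uses the Pegasus Java API; the Python below is only an
      illustration of the idea with plain dictionaries. The function name,
      the job ID format, the paths and the output file naming are all
      hypothetical.</p>

```python
import os


def build_jobs(pdb_paths, database_files):
    """Create one job description per input PDB.

    For simplicity every job lists all database files as inputs,
    mirroring the approach described for the example.
    """
    jobs = []
    for index, pdb in enumerate(sorted(pdb_paths), start=1):
        stem = os.path.splitext(os.path.basename(pdb))[0]
        jobs.append({
            "id": "ID%05d" % index,                    # hypothetical ID format
            "inputs": [pdb] + sorted(database_files),  # the PDB plus the whole database
            "outputs": [stem + ".out"],                # hypothetical output name
        })
    return jobs


if __name__ == "__main__":
    for job in build_jobs(["pdbs/2abc.pdb", "pdbs/1xyz.pdb"],
                          ["db/b.txt", "db/a.txt"]):
        print(job["id"], job["inputs"], job["outputs"])
```

<p>Declaring the whole database as input to every job keeps the generator
      simple, at the cost of listing more files per job than each job may
      strictly need.</p>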
<p>Just like the <a class="xref" href="example_workflows.php#example_black_diamond" title="9.1.1. Black Diamond">Black Diamond</a> above, this example uses a <code class="filename">./submit</code>
      script.</p>
</div>
</div>
<div class="section" title="9.2. Condor Examples">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="condor_examples"></a>9.2. Condor Examples</h2></div></div></div>
<div class="toc"><dl><dt><span class="section"><a href="example_workflows.php#idp9842640">9.2.1. Black Diamond - condorio</a></span></dt></dl></div>
<div class="section" title="9.2.1. Black Diamond - condorio">
<div class="titlepage"><div><div><h3 class="title">
<a name="idp9842640"></a>9.2.1. Black Diamond - condorio</h3></div></div></div>
<p>A set of Condor examples is available, highlighting
      different <a class="link" href="running_workflows.php#data_staging_configuration" title="5.3. Data Staging Configuration">data staging
      configurations</a>. The most basic one is condorio, and the example
      can be found under:</p>
<pre class="programlisting"><code class="filename">share/pegasus/examples/condor-blackdiamond-condorio/</code></pre>
<p>This example uses the same abstract workflow as the <a class="xref" href="example_workflows.php#example_black_diamond" title="9.1.1. Black Diamond">Black Diamond</a> grid example above, and can be executed either on the submit
      machine (universe="local") or on a local Condor pool
      (universe="vanilla").</p>
<p>You can run this example with the <code class="filename">./submit</code>
      script. Example:</p>
<pre class="programlisting">$ <span class="bold"><strong>./submit</strong></span></pre>
</div>
</div>
<div class="section" title="9.3. Local Shell Examples">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="local_shell_examples"></a>9.3. Local Shell Examples</h2></div></div></div>
<div class="toc"><dl><dt><span class="section"><a href="example_workflows.php#idp6954032">9.3.1. Black Diamond</a></span></dt></dl></div>
<div class="section" title="9.3.1. Black Diamond">
<div class="titlepage"><div><div><h3 class="title">
<a name="idp6954032"></a>9.3.1. Black Diamond</h3></div></div></div>
<p>To aid in workflow development and debugging, Pegasus can now map
      a workflow to a local shell script. One advantage is that you do not
      need a remote compute resource.</p>
<p>This example uses the same abstract workflow as the <a class="xref" href="example_workflows.php#example_black_diamond" title="9.1.1. Black Diamond">Black Diamond</a> grid example above. The difference is that a property is set
      in pegasusrc to force shell execution:</p>
<pre class="programlisting"># tell pegasus to generate shell version of
# the workflow
pegasus.code.generator = Shell</pre>
<p>You can run this example with the <code class="filename">./submit</code>
      script.</p>
</div>
</div>
<div class="section" title="9.4. Notifications Example">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="notifications_example"></a>9.4. Notifications Example</h2></div></div></div>
<p>Notifications are a new feature in Pegasus 3.1. While the workflow
    is running, a monitoring tool runs alongside it and issues user-defined
    notifications when certain events take place, such as job completion or
    failure. See the <a class="link" href="reference.php#notifications" title="10.7. Notifications">notifications
    section</a> for detailed information. A workflow example with
    notifications can be found under examples/notifications. This workflow is
    based on the Black Diamond; the only changes are the notifications added
    to the DAX generator. For example, notifications are added at the
    workflow level:</p>
<pre class="programlisting"># Create an abstract DAG
diamond = ADAG("diamond")
# dax level notifications
diamond.invoke('all', os.getcwd() + "/my-notify.sh")</pre>
<p>The DAX generator also contains job level notifications:</p>
<pre class="programlisting"># job level notifications - in this case for at_end events
frr.invoke('at_end', os.getcwd() + "/my-notify.sh")</pre>
<p>These invoke lines specify that the <span class="command"><strong>my-notify.sh</strong></span>
    script will be invoked for the events generated (<span class="bold"><strong>all</strong></span> in the first case, <span class="bold"><strong>at_end</strong></span> in the second). The
    <span class="command"><strong>my-notify.sh</strong></span> script contains callouts to sample
    notification tools shipped with Pegasus, one for email and one for
    Jabber/GTalk (commented out by default):</p>
<pre class="programlisting">#!/bin/bash

# Pegasus ships with a couple of basic notification tools. Below
# we show how to notify via email and gtalk.

# all notifications will be sent to email
# change $USER to your full email address
$PEGASUS_HOME/libexec/notification/email -t $USER

# this sends notifications about failed jobs to gtalk.
# note that you can also set which events to trigger on in your DAX.
# set jabberid to your gmail address, and put in your
# password
# uncomment to enable
if [ "x$PEGASUS_STATUS" != "x" -a "$PEGASUS_STATUS" != "0" ]; then
    $PEGASUS_HOME/libexec/notification/jabber --jabberid FIXME@gmail.com \
                                              --password FIXME \
                                              --host talk.google.com
fi
</pre>
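<p>The failure test used in the script above can be expressed in other
    languages too, for instance if you prefer to write your notification
    script in Python. The following is a minimal sketch: it relies only on
    the <code class="literal">PEGASUS_STATUS</code> environment variable that
    appears in the script, and the function name is hypothetical. The logic
    mirrors the shell condition: the variable must be set, non-empty and
    different from "0".</p>

```python
import os


def is_failure(env):
    """Return True when PEGASUS_STATUS is set, non-empty and not "0"."""
    status = env.get("PEGASUS_STATUS", "")
    return status != "" and status != "0"


if __name__ == "__main__":
    if is_failure(os.environ):
        # A real script would call a notification tool at this point.
        print("job failed with status", os.environ["PEGASUS_STATUS"])
    else:
        print("no failure reported")
```

<p>Passing the environment in as an argument, rather than reading
    <code class="literal">os.environ</code> inside the function, keeps the
    check easy to try out with hand-written values.</p>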
</div>
<div class="section" title="9.5. Workflow of Workflows">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="workflow_of_workflows"></a>9.5. Workflow of Workflows</h2></div></div></div>
<div class="toc"><dl><dt><span class="section"><a href="example_workflows.php#idp18626368">9.5.1. Galactic Plane</a></span></dt></dl></div>
<div class="section" title="9.5.1. Galactic Plane">
<div class="titlepage"><div><div><h3 class="title">
<a name="idp18626368"></a>9.5.1. Galactic Plane</h3></div></div></div>
<p>The <a class="ulink" href="http://en.wikipedia.org/wiki/Galactic_plane" target="_top">Galactic Plane</a>
      workflow is a workflow of many Montage workflows. The output is a set of
      tiles that viewer software can combine into a seamless image which can
      be scrolled and zoomed into. As this is more of a production workflow
      than an example one, it can be a little harder to get running in your
      environment.</p>
<p>Highlights of the example are:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>The subworkflow DAXes are generated as jobs in the parent
          workflow. This is an example of how to make more dynamic workflows.
          For example, if you need a job in your workflow to determine the
          number of jobs in the next level, you can have the first job create
          a subworkflow with the right number of jobs.</p></li>
<li class="listitem"><p>DAGMan job categories are used to limit the number of
          concurrent jobs in certain places. This is used to limit the number
          of concurrent connections to the data find service, as well as to
          limit the number of concurrent subworkflows to manage disk usage on
          the compute cluster.</p></li>
<li class="listitem"><p>Job priorities are used to make sure we overlap staging and
          computation. Pegasus sets default priorities, which for most jobs
          are fine, but the priority of the data find job is set explicitly to
          a higher priority.</p></li>
<li class="listitem"><p>A specific output site is defined in the site catalog and
          specified with the --output option of subworkflows.</p></li>
</ul></div>
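<p>To make the effect of a per-category limit concrete, here is a toy
      model in Python. It is not how DAGMan is implemented and it does not use
      any Pegasus or Condor API; the function name, the job names and the
      limit values are hypothetical. Only the category names are taken from
      the example. Given the jobs that are ready to run, the number of jobs
      already running in each category and a limit per category, it picks the
      jobs that may start now.</p>

```python
def pick_startable(ready_jobs, running_per_category, limits):
    """Return the ready jobs that may start without exceeding a category limit.

    ready_jobs is a list of (job_name, category) pairs; a category of None,
    or a category without an entry in limits, is treated as unlimited.
    """
    running = dict(running_per_category)  # copy, so the caller's dict is untouched
    started = []
    for name, category in ready_jobs:
        limit = limits.get(category)
        if limit is None or limit - running.get(category, 0) >= 1:
            started.append(name)
            running[category] = running.get(category, 0) + 1
    return started


if __name__ == "__main__":
    ready = [
        ("setup_1", "remote_tile_setup"),
        ("setup_2", "remote_tile_setup"),
        ("setup_3", "remote_tile_setup"),
        ("tile_1", "subworkflow"),
        ("cleanup", None),
    ]
    limits = {"remote_tile_setup": 2, "subworkflow": 1}  # hypothetical values
    print(pick_startable(ready, {"subworkflow": 1}, limits))
```

<p>Jobs that are held back stay ready and are considered again once a
      running job in the same category finishes.</p>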
<p>The DAX API has support for subworkflows:</p>
<pre class="programlisting">    remote_tile_setup = Job(namespace="gp", name="remote_tile_setup", version="1.0")
    remote_tile_setup.addArguments("%05d" % (tile_id))
    remote_tile_setup.addProfile(Profile("dagman", "CATEGORY", "remote_tile_setup"))
    remote_tile_setup.uses(params, link=Link.INPUT, register=False)
    remote_tile_setup.uses(mdagtar, link=Link.OUTPUT, register=False, transfer=True)
    uberdax.addJob(remote_tile_setup)
...
    subwf = DAX("%05d.dax" % (tile_id), "ID%05d" % (tile_id))
    subwf.addArguments("-Dpegasus.schema.dax=%s/etc/dax-2.1.xsd" %(os.environ["PEGASUS_HOME"]),
                       "-Dpegasus.catalog.replica.file=%s/rc.data" % (tile_work_dir),
                       "-Dpegasus.catalog.site.file=%s/sites.xml" % (work_dir),
                       "-Dpegasus.transfer.links=true",
                       "--sites", cluster_name,
                       "--cluster", "horizontal",
                       "--basename", "tile-%05d" % (tile_id),
                       "--force",
                       "--output", output_name)
    subwf.addProfile(Profile("dagman", "CATEGORY", "subworkflow"))
    subwf.uses(subdax_file, link=Link.INPUT, register=False)
    uberdax.addDAX(subwf)

</pre>
</div>
</div>
</div>
<div class="navfooter">
<hr>
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left">
<a accesskey="p" href="monitoring_debugging_stats.php">Prev</a> </td>
<td width="20%" align="center"> </td>
<td width="40%" align="right"> <a accesskey="n" href="reference.php">Next</a>
</td>
</tr>
<tr>
<td width="40%" align="left" valign="top">Chapter 8. Monitoring, Debugging and Statistics </td>
<td width="20%" align="center"><a accesskey="h" href="index.php">Table of Contents</a></td>
<td width="40%" align="right" valign="top"> Chapter 10. Reference Manual</td>
</tr>
</table>
</div>
</div><?php  
            do_html_footer();
        ?>
