<?php  
            include_once( $_SERVER['DOCUMENT_ROOT']."/static/includes/common.inc.php" );
            do_html_header("Documentation");
        ?><div id="content">
<div class="navheader">
<table width="100%" summary="Navigation header"><tr>
<td width="20%" align="left">
<a accesskey="p" href="ch02s02.php">Prev</a> </td>
<td width="60%" align="center"><a accesskey="h" href="index.php">Table of Contents</a></td>
<td width="20%" align="right"> <a accesskey="n" href="ch02s04.php">Next</a>
</td>
</tr></table>
<hr>
</div>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="idp56813520"></a>2.3. Generating the Workflow</h2></div></div></div>
<p>We will be creating and running a simple diamond-shaped workflow
    that looks like this:</p>
<div class="figure">
<a name="idp56814784"></a><p class="title"><b>Figure 2.1. Diamond Workflow</b></p>
<div class="figure-contents"><div class="mediaobject"><img src="images/concepts-diamond.jpg" alt="Diamond Workflow"></div></div>
</div>
<br class="figure-break"><p>In this diagram, the ovals represent computational jobs, the
    dog-eared squares are files, and the arrows are dependencies.</p>
<p>Pegasus reads workflow descriptions from DAX files. The term “DAX”
    is short for “Directed Acyclic Graph in XML”. DAX is an XML file format
    for expressing jobs, arguments, files, and dependencies.</p>
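<p>To give a rough feel for how jobs, arguments, files, and dependencies
    can be laid out in XML, here is a minimal sketch that uses only the Python
    standard library. The element names, attribute names, job IDs, file names,
    and argument string are assumptions chosen for illustration. The result is
    not a schema-valid DAX and is not what the Pegasus libraries produce.</p>

```python
import xml.etree.ElementTree as ET


def build_sketch():
    """Build a small, DAX-like XML tree (illustrative names only)."""
    # Root element standing in for the workflow as a whole.
    root = ET.Element("adag", {"name": "diamond"})

    # A job with an argument string and the files it reads and writes.
    job1 = ET.SubElement(root, "job", {"id": "ID1", "name": "preprocess"})
    ET.SubElement(job1, "argument").text = "-i f.a -o f.b1"
    ET.SubElement(job1, "uses", {"name": "f.a", "link": "input"})
    ET.SubElement(job1, "uses", {"name": "f.b1", "link": "output"})

    # A second job that consumes the first job's output file.
    job2 = ET.SubElement(root, "job", {"id": "ID2", "name": "findrange"})
    ET.SubElement(job2, "uses", {"name": "f.b1", "link": "input"})

    # Dependency: ID2 may start only after ID1 has finished.
    child = ET.SubElement(root, "child", {"ref": "ID2"})
    ET.SubElement(child, "parent", {"ref": "ID1"})
    return root


print(ET.tostring(build_sketch(), encoding="unicode"))
```

<p>In practice you would not build the XML by hand like this. The DAX
    generator libraries take care of producing the file.</p>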
<p>To create a DAX, you write a small program called a DAX
    generator. Pegasus comes with Perl, Java, and Python libraries for writing
    DAX generators. This tutorial uses the Python library.</p>
<p>The DAX generator for the diamond workflow is in the file
    <code class="filename">generate_dax.py</code>. Look at the file by typing:</p>
<pre class="programlisting">$ <span class="bold"><strong>more generate_dax.py</strong></span>
...</pre>
<div class="tip" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Tip</h3>
<p>We will be using the <code class="literal">more</code> command to inspect
      several files in this tutorial. <code class="literal">more</code> is a pager
      application, meaning that it splits text files into pages and displays
      the pages one at a time. You can view the next page of a file by
      pressing the spacebar. Type 'h' to get help on using
      <code class="literal">more</code>. When you are done, you can type 'q' to close
      the file.</p>
</div>
<p>The code has 5 sections:</p>
<div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem"><p>A few system libraries and the Pegasus.DAX3 library are
        imported. The search path is modified to include the directory with
        the Pegasus Python library.</p></li>
<li class="listitem"><p>The name for the DAX output file is retrieved from the
        arguments.</p></li>
<li class="listitem"><p>A new ADAG object is created. This is the main object to which
        jobs and dependencies are added.</p></li>
<li class="listitem"><p>Jobs and files are added. The 4 jobs in the diagram above are
        added and the 6 files are referenced. Arguments are defined using
        strings and File objects. The input and output files are defined for
        each job. This step is important because it lets Pegasus track the
        files and stage the data if necessary. Workflow outputs are tagged
        with “transfer=true”.</p></li>
<li class="listitem"><p>Dependencies are added. These are shown as arrows in the diagram
        above. They define the parent/child relationships between the jobs.
        When the workflow is executing, the order in which the jobs will be
        run is determined by the dependencies between them.</p></li>
</ol></div>
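<p>The last point, that dependencies determine the run order, can be
    illustrated with a short sketch. It does not use the Pegasus.DAX3 library
    and is not the contents of <code class="filename">generate_dax.py</code>.
    The job identifiers are assumed names based on the diagram, and the
    ordering comes from the standard <code class="literal">graphlib</code>
    module (Python 3.9 or later). It shows only the ordering constraint, not
    how Pegasus actually plans or schedules jobs.</p>

```python
from graphlib import TopologicalSorter

# Stand-in data, not the Pegasus.DAX3 API. The identifiers are assumed
# names for the four jobs in the diamond.
JOBS = ["preprocess", "findrange_left", "findrange_right", "analyze"]

# (parent, child) pairs: one per parent/child relationship in the diamond.
DEPENDENCIES = [
    ("preprocess", "findrange_left"),
    ("preprocess", "findrange_right"),
    ("findrange_left", "analyze"),
    ("findrange_right", "analyze"),
]


def execution_waves(jobs, dependencies):
    """Group jobs into waves; a job joins a wave once all its parents are done."""
    sorter = TopologicalSorter({job: set() for job in jobs})
    for parent, child in dependencies:
        sorter.add(child, parent)  # child must wait for parent
    sorter.prepare()  # raises CycleError if the graph is not acyclic
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


for number, wave in enumerate(execution_waves(JOBS, DEPENDENCIES), start=1):
    print(number, wave)
```

<p>With the diamond's four dependencies, the two middle jobs land in the
    same wave: neither depends on the other, so nothing in the graph forces
    one to run before the other.</p>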
<p>Generate a DAX file named <code class="filename">diamond.dax</code> by
    typing:</p>
<pre class="programlisting">$ <span class="bold"><strong>./generate_dax.py diamond.dax</strong></span>
Creating ADAG...
Adding preprocess job...
Adding left Findrange job...
Adding right Findrange job...
Adding Analyze job...
Adding control flow dependencies...
Writing diamond.dax</pre>
<p>The <code class="filename">diamond.dax</code> file should contain an XML
    representation of the diamond workflow. You can inspect it by
    typing:</p>
<pre class="programlisting">$ <span class="bold"><strong>more diamond.dax</strong></span>
...</pre>
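<p>If you prefer a quick summary over paging through the file, a few lines
    of Python can count the XML elements it contains. This is a sketch using
    only the standard library. The helper name
    <code class="literal">count_elements</code> is hypothetical, and the code
    makes no assumption about which element names appear in the file.</p>

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_elements(path):
    """Return a Counter mapping element names to how often they occur."""
    counts = Counter()
    for element in ET.parse(path).getroot().iter():
        # ElementTree reports namespaced tags as "{uri}name"; keep the name.
        name = element.tag.rsplit("}", 1)[-1]
        counts[name] += 1
    return counts
```

<p>For example, calling <code class="literal">count_elements("diamond.dax")</code>
    from the directory that contains the generated file returns the tally,
    which you can print or inspect interactively.</p>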
</div>
<div class="navfooter">
<hr>
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left">
<a accesskey="p" href="ch02s02.php">Prev</a> </td>
<td width="20%" align="center"><a accesskey="u" href="tutorial.php">Up</a></td>
<td width="40%" align="right"> <a accesskey="n" href="ch02s04.php">Next</a>
</td>
</tr>
<tr>
<td width="40%" align="left" valign="top">2.2. Getting Started </td>
<td width="20%" align="center"><a accesskey="h" href="index.php">Table of Contents</a></td>
<td width="40%" align="right" valign="top"> 2.4. Information Catalogs</td>
</tr>
</table>
</div>
</div><?php  
            do_html_footer();
        ?>
