<?php  
            require('/srv/new-pegasus.isi.edu/includes/common.php'); 
            pegasus_header("Chapter 5. Running Workflows");
        ?><div class="breadcrumbs">
<span class="breadcrumb-link"><a href="index.php">Pegasus 4.8.0 User Guide</a></span> &gt; <span class="breadcrumb-node">Running Workflows</span>
</div><hr><div class="chapter">
<div class="titlepage"><div><div><h1 class="title">
<a name="running_workflows"></a>Chapter 5. Running Workflows</h1></div></div></div>
<div class="toc"><dl class="toc">
<dt><span class="section"><a href="running_workflows.php#executable_workflows">5.1. Executable Workflows (DAG)</a></span></dt>
<dt><span class="section"><a href="mapping_refinement_steps.php">5.2. Mapping Refinement Steps</a></span></dt>
<dt><span class="section"><a href="data_staging_configuration.php">5.3. Data Staging Configuration</a></span></dt>
<dt><span class="section"><a href="pegasuslite.php">5.4. PegasusLite</a></span></dt>
<dt><span class="section"><a href="pegasus-plan.php">5.5. Pegasus-Plan</a></span></dt>
<dt><span class="section"><a href="BasicProperties.php">5.6. Basic Properties</a></span></dt>
</dl></div>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="executable_workflows"></a>5.1. Executable Workflows (DAG)</h2></div></div></div>
<p>The DAG is an executable (concrete) workflow that can be executed
    over a variety of resources. When the workflow tasks are mapped to
    multiple resources that do not share a file system, explicit nodes are
    added to the workflow to orchestrate data transfer between the
    tasks.</p>
<p>When you take the DAX workflow created in <a class="link" href="creating_workflows.php" title="Chapter 4. Creating Workflows">Creating Workflows</a> and plan it for
    execution on a single remote grid site, here a site with handle <span class="bold"><strong>hpcc</strong></span>, without clean-up nodes,
    the following concrete workflow is built:</p>
<div class="figure">
<a name="concepts-fig-dag"></a><p class="title"><b>Figure 5.1. Black Diamond DAG</b></p>
<div class="figure-contents"><div class="mediaobject" align="center"><table border="0" summary="manufactured viewport for HTML img" style="cellpadding: 0; cellspacing: 0;"><tr><td align="center" valign="middle"><img src="images/concepts-diamond-dag.png" align="middle" alt="Black Diamond DAG"></td></tr></table></div></div>
</div>
<p><br class="figure-break"></p>
<p>Planning augments the original abstract workflow with ancillary
    tasks to facilitate the proper execution of the workflow. These tasks
    include:</p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem"><p>the creation of remote working directories. These directories
        typically have a name that seeks to avoid conflicts with other
        simultaneously running similar workflows. Such tasks use a job prefix
        of <code class="code">create_dir</code>.</p></li>
<li class="listitem"><p>the stage-in of input files before any task which requires these
        files. Any file consumed by a task needs to be staged to the task, if
        it does not already exist on that site. Such tasks use a job prefix of
        <code class="code">stage_in</code>. If multiple files from various sources need to
        be transferred, multiple stage-in jobs will be created. Additional
        advanced options let you control the size and number of these jobs,
        and whether multiple compute tasks can share stage-in jobs.</p></li>
<li class="listitem"><p>the original DAX job is concretized into a compute task in the
        DAG. Compute job names are a concatenation of the job's <span class="bold"><strong>name</strong></span> and <span class="bold"><strong>id</strong></span>
        attributes from the DAX file.</p></li>
<li class="listitem"><p>the stage-out of data products to a collecting site. Data
        products with their <span class="bold"><strong>transfer</strong></span> flag set
        to <code class="literal">false</code> will not be staged to the output site.
        However, they may still be eligible for staging to other, dependent
        tasks. Stage-out tasks use a job prefix of
        <code class="code">stage_out</code>.</p></li>
<li class="listitem"><p>the transfer of data between sites. If compute jobs run at
        different sites, an intermediary staging task with prefix
        <code class="code">stage_inter</code> is inserted between the compute jobs in
        the workflow, ensuring that the data products of the parent are
        available to the child job.</p></li>
<li class="listitem"><p>the registration of data products in a replica catalog. Data
        products with their <span class="bold"><strong>register</strong></span> flag set
        to <code class="literal">false</code> will not be registered.</p></li>
<li class="listitem"><p>the clean-up of transient files and working directories. These
        steps can be omitted with the <span class="command"><strong>--no-cleanup</strong></span> option
        to the planner.</p></li>
</ul></div>
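<p>The job prefixes listed above can be used to tell auxiliary jobs apart
    from compute jobs when inspecting a planned workflow. The following is a
    minimal sketch, not part of Pegasus: the function name
    <code class="code">classify_job</code> and the category labels are
    hypothetical, and only the prefixes named in this section are recognized.
    Registration and clean-up jobs are not distinguished because their
    prefixes are not given here.</p>

```python
# Prefixes taken from the list above; the category labels are illustrative.
AUXILIARY_PREFIXES = [
    ("create_dir", "directory creation"),
    ("stage_in", "stage-in"),
    ("stage_out", "stage-out"),
    ("stage_inter", "inter-site transfer"),
]


def classify_job(job_name):
    """Return a coarse category based only on the job-name prefix."""
    for prefix, category in AUXILIARY_PREFIXES:
        # The trailing underscore keeps "stage_in" from also matching
        # job names that start with "stage_inter".
        if job_name.startswith(prefix + "_"):
            return category
    # Everything else is treated as a compute job in this sketch, which
    # also lumps in any job type whose prefix is not listed above.
    return "compute (or other)"


if __name__ == "__main__":
    for name in ("create_dir_diamond_0_hpcc", "stage_in_local_hpcc_0",
                 "preprocess_ID000001", "stage_out_local_hpcc_2_0"):
        print(name, "=", classify_job(name))
```

<p>Matching on the prefix alone is a heuristic: a compute job whose
    transformation name happens to begin with one of these prefixes would be
    misclassified.</p>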
<p>The <a class="link" href="data_management.php" title="Chapter 10. Data Management">Data Management</a> chapter
    details more about when and how staging nodes are inserted into the
    workflow.</p>
<p>The DAG is written to the file <code class="filename">diamond-0.dag</code>,
    whose name is constructed from the <span class="bold"><strong>name</strong></span> and <span class="bold"><strong>index</strong></span> attributes found in the root element of the
    DAX file.</p>
<pre class="programlisting">######################################################################
# PEGASUS WMS GENERATED DAG FILE
# DAG diamond
# Index = 0, Count = 1
######################################################################

JOB create_dir_diamond_0_hpcc create_dir_diamond_0_hpcc.sub
SCRIPT POST create_dir_diamond_0_hpcc /opt/pegasus/default/bin/pegasus-exitcode create_dir_diamond_0_hpcc.out

JOB stage_in_local_hpcc_0 stage_in_local_hpcc_0.sub
SCRIPT POST stage_in_local_hpcc_0 /opt/pegasus/default/bin/pegasus-exitcode stage_in_local_hpcc_0.out

JOB preprocess_ID000001 preprocess_ID000001.sub
SCRIPT POST preprocess_ID000001 /opt/pegasus/default/bin/pegasus-exitcode preprocess_ID000001.out

JOB findrange_ID000002 findrange_ID000002.sub
SCRIPT POST findrange_ID000002 /opt/pegasus/default/bin/pegasus-exitcode findrange_ID000002.out

JOB findrange_ID000003 findrange_ID000003.sub
SCRIPT POST findrange_ID000003 /opt/pegasus/default/bin/pegasus-exitcode findrange_ID000003.out

JOB analyze_ID000004 analyze_ID000004.sub
SCRIPT POST analyze_ID000004 /opt/pegasus/default/bin/pegasus-exitcode analyze_ID000004.out

JOB stage_out_local_hpcc_2_0 stage_out_local_hpcc_2_0.sub
SCRIPT POST stage_out_local_hpcc_2_0 /opt/pegasus/default/bin/pegasus-exitcode stage_out_local_hpcc_2_0.out

PARENT findrange_ID000002 CHILD analyze_ID000004
PARENT findrange_ID000003 CHILD analyze_ID000004
PARENT preprocess_ID000001 CHILD findrange_ID000002
PARENT preprocess_ID000001 CHILD findrange_ID000003
PARENT analyze_ID000004 CHILD stage_out_local_hpcc_2_0
PARENT stage_in_local_hpcc_0 CHILD preprocess_ID000001
PARENT create_dir_diamond_0_hpcc CHILD findrange_ID000002
PARENT create_dir_diamond_0_hpcc CHILD findrange_ID000003
PARENT create_dir_diamond_0_hpcc CHILD preprocess_ID000001
PARENT create_dir_diamond_0_hpcc CHILD analyze_ID000004
PARENT create_dir_diamond_0_hpcc CHILD stage_in_local_hpcc_0
######################################################################
# End of DAG
######################################################################
</pre>
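<p>The names in this listing follow the conventions described in this
    section: the DAG file name comes from the DAX name and index, and each
    compute job name joins the DAX job's name and id. A small sketch of both
    rules follows. The helper names are hypothetical, and the hyphen and
    underscore separators are inferred from the example above rather than
    taken from a specification.</p>

```python
def dag_file_name(dax_name, dax_index):
    """Build the DAG file name from the DAX root element's name and index."""
    # Separator inferred from the example file name "diamond-0.dag".
    return "{}-{}.dag".format(dax_name, dax_index)


def compute_job_name(job_name, job_id):
    """Join a DAX job's name and id attributes into the DAG job name."""
    # Separator inferred from listing entries such as "preprocess_ID000001".
    return "{}_{}".format(job_name, job_id)


if __name__ == "__main__":
    print(dag_file_name("diamond", 0))
    print(compute_job_name("preprocess", "ID000001"))
```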
<p>The DAG file declares all jobs and links each one to the Condor submit
    file that describes the planned, concrete job. The directory that holds
    the DAG file also contains the Condor submit files for all jobs shown in
    the figure, plus a number of additional helper files.</p>
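<p>To make the structure of the DAG file concrete, the following sketch
    reads <code class="code">JOB</code> and <code class="code">PARENT ... CHILD</code>
    lines and derives an order in which every parent precedes its children.
    It is an illustration only, not how DAGMan or Pegasus process the file:
    the function names and the shortened <code class="code">SAMPLE</code> text
    are made up for this example, and all other DAG instructions (such as
    <code class="code">SCRIPT POST</code>) are ignored.</p>

```python
from collections import defaultdict

# Shortened excerpt of the listing above, used only for illustration.
SAMPLE = """\
JOB create_dir_diamond_0_hpcc create_dir_diamond_0_hpcc.sub
JOB stage_in_local_hpcc_0 stage_in_local_hpcc_0.sub
JOB preprocess_ID000001 preprocess_ID000001.sub
PARENT stage_in_local_hpcc_0 CHILD preprocess_ID000001
PARENT create_dir_diamond_0_hpcc CHILD preprocess_ID000001
PARENT create_dir_diamond_0_hpcc CHILD stage_in_local_hpcc_0
"""


def parse_dag(text):
    """Return (jobs, children) from DAG text.

    jobs maps a job name to its submit file; children maps a parent job
    name to the list of its child job names.
    """
    jobs = {}
    children = defaultdict(list)
    for line in text.splitlines():
        words = line.split()
        if not words or words[0].startswith("#"):
            continue  # skip blank lines and comments
        if words[0] == "JOB" and len(words) >= 3:
            jobs[words[1]] = words[2]
        elif words[0] == "PARENT" and "CHILD" in words:
            split_at = words.index("CHILD")
            for parent in words[1:split_at]:
                children[parent].extend(words[split_at + 1:])
    return jobs, children


def topological_order(jobs, children):
    """Order jobs so that each parent comes before its children."""
    pending = {name: 0 for name in jobs}  # unfinished parents per job
    for kids in children.values():
        for kid in kids:
            pending[kid] = pending.get(kid, 0) + 1
    ready = sorted(name for name, count in pending.items() if count == 0)
    order = []
    while ready:
        job = ready.pop(0)
        order.append(job)
        for kid in children.get(job, []):
            pending[kid] -= 1
            if pending[kid] == 0:
                ready.append(kid)
    if len(order) != len(pending):
        raise ValueError("cycle or undeclared parent job in DAG")
    return order


if __name__ == "__main__":
    jobs, children = parse_dag(SAMPLE)
    for name in topological_order(jobs, children):
        print(name, jobs[name])
```

<p>For the excerpt, the directory creation job comes first, then the
    stage-in job, then <code class="code">preprocess_ID000001</code>, which
    matches the dependencies declared in the listing.</p>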
<p>The various instructions that can be put into a DAG file are
    described in <a class="ulink" href="http://www.cs.wisc.edu/condor/manual/v7.5/2_10DAGMan_Applications.html" target="_top">Condor's
    DAGMan documentation</a>. The constituents of the submit directory are
    described in the <a class="link" href="submit_directory.php" title="Chapter 14. Submit Directory Details">"Submit Directory
    Details"</a> chapter.</p>
</div>
</div><div class="navfooter">
<hr>
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left">
<a accesskey="p" href="variable_expansion.php">Prev</a> </td>
<td width="20%" align="center"> </td>
<td width="40%" align="right"> <a accesskey="n" href="mapping_refinement_steps.php">Next</a>
</td>
</tr>
<tr>
<td width="40%" align="left" valign="top">4.5. Variable Expansion </td>
<td width="20%" align="center"><a accesskey="h" href="index.php">Table of Contents</a></td>
<td width="40%" align="right" valign="top"> 5.2. Mapping Refinement Steps</td>
</tr>
</table>
</div><?php  
            pegasus_footer();
        ?>
