<?php  
            include_once( $_SERVER['DOCUMENT_ROOT']."/static/includes/common.inc.php" );
            do_html_header("Documentation");
        ?><div id="content">
<div class="navheader">
<table width="100%" summary="Navigation header"><tr>
<td width="20%" align="left">
<a accesskey="p" href="static_bp_file.php">Prev</a> </td>
<td width="60%" align="center"><a accesskey="h" href="index.php">Table of Contents</a></td>
<td width="20%" align="right"> <a accesskey="n" href="dax_generator_api.php">Next</a>
</td>
</tr></table>
<hr>
</div>
<div class="chapter">
<div class="titlepage"><div><div><h1 class="title">
<a name="api"></a>Chapter 14. API Reference</h1></div></div></div>
<div class="toc"><dl class="toc">
<dt><span class="section"><a href="api.php#idp56147856">14.1. DAX XML Schema</a></span></dt>
<dt><span class="section"><a href="dax_generator_api.php">14.2. DAX Generator API</a></span></dt>
<dt><span class="section"><a href="ch14s03.php">14.3. DAX Generator without a Pegasus DAX API</a></span></dt>
</dl></div>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="idp56147856"></a>14.1. DAX XML Schema</h2></div></div></div>
<div class="toc"><dl class="toc">
<dt><span class="section"><a href="api.php#idp55528432">14.1.1. DAX XML Schema In Detail</a></span></dt>
<dt><span class="section"><a href="api.php#idp62587536">14.1.2. DAX XML Schema Example</a></span></dt>
</dl></div>
<p>The DAX format is described by the XML schema instance document
    <a class="ulink" href="http://pegasus.isi.edu/wms/docs/schemas/dax-3.3/dax-3.3.xsd" target="_top">dax-3.3.xsd</a>.
    A local copy of the schema definition is provided in the
    <span class="quote">“<span class="quote">etc</span>”</span> directory. The documentation of the XML schema and its
    elements can be found in <a class="ulink" href="http://pegasus.isi.edu/wms/docs/schemas/dax-3.3/dax-3.3.html" target="_top">dax-3.3.html</a>
    as well as locally in
    <code class="filename">doc/schemas/dax-3.3/dax-3.3.html</code> in your Pegasus
    distribution.</p>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="idp55528432"></a>14.1.1. DAX XML Schema In Detail</h3></div></div></div>
<p>The DAX file format has four major sections, with the second
      section further divided into sub-sections. The DAX format works on the
      abstract or logical level, letting you focus on the shape of the
      workflows: what to do and what to work upon.</p>
<div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem">
<p>Workflow-level Notifications</p>
<p>Very simple workflow-level notifications. These are defined in
          the <a class="link" href="notifications.php" title="6.4. Notifications">Notification</a>
          section.</p>
</li>
<li class="listitem">
<p>Catalogs</p>
<p>This section deals with included catalogs. While we
          recommend using external replica and transformation catalogs, it
          is possible to include some replicas and transformations in the
          DAX file itself. Any DAX-included entry takes precedence over
          regular replica catalog (RC) and transformation catalog (TC)
          entries.</p>
<p>This section (and any of its sub-sections) is completely
          optional.</p>
<div class="orderedlist"><ol class="orderedlist" type="a">
<li class="listitem"><p>The first sub-section deals with included replica
              descriptions.</p></li>
<li class="listitem"><p>The second sub-section deals with included transformation
              descriptions.</p></li>
<li class="listitem"><p>The third sub-section declares multi-item
              executables.</p></li>
</ol></div>
</li>
<li class="listitem">
<p>Job List</p>
<p>The jobs section defines the job or task descriptions. For
          each task to conduct, a three-part logical name declares the task
          and aids in identifying it in the transformation catalog or one of the
          <span class="emphasis"><em>executable</em></span> sections above. During planning, the
          logical name is translated into the physical executable location on
          the chosen target site. Because jobs are declared abstractly, physical
          layout considerations of the target sites do not matter. The job's
          <span class="emphasis"><em>id</em></span> uniquely identifies the job within this
          workflow.</p>
<p>The arguments declare what command-line arguments to pass to
          the job. If you are passing filenames, you should refer to the
          logical filename using the <span class="emphasis"><em>file</em></span> element in the
          argument list.</p>
<p>Important for properly planning the task is the list of files
          consumed by the task, its input files, and the files produced by the
          task, its output files. Each file is described with a
          <span class="emphasis"><em>uses</em></span> element inside the task.</p>
<p>Elements exist to link a logical file to any of the stdio file
          descriptors. The <span class="emphasis"><em>profile</em></span> element is Pegasus's
          way to abstract site-specific data.</p>
<p>Jobs are nodes in the workflow graph. Other nodes include
          unplanned workflows (DAX), which are planned and then run when the
          node runs, and planned workflows (DAG), which are simply
          executed.</p>
</li>
<li class="listitem">
<p>Control-flow Dependencies</p>
<p>This section lists the dependencies between the tasks.
          The relationships are defined as child-parent relationships, and
          thus determine the order in which tasks are run. No cyclic
          dependencies are permitted.</p>
<p>Dependencies are directed edges in the workflow graph.</p>
</li>
</ol></div>
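<p>The rule that no cyclic dependencies are permitted can be checked
      before planning. The following is a minimal sketch, not part of Pegasus:
      the function name <code class="literal">find_run_order</code> and the job
      identifiers are hypothetical, and the child-parent relationships are
      assumed to be available as a plain mapping from each child to its
      parents.</p>

```python
from collections import defaultdict, deque


def find_run_order(jobs, child_to_parents):
    """Return one valid run order, or raise ValueError on a cycle.

    jobs: list of job ids (hypothetical ids, for illustration only).
    child_to_parents: dict mapping a child id to a list of parent ids.
    """
    children = defaultdict(list)          # parent id -> list of child ids
    pending = {job: 0 for job in jobs}    # number of unfinished parents
    for child, parents in child_to_parents.items():
        for parent in parents:
            children[parent].append(child)
            pending[child] += 1

    # Jobs without parents can run immediately.
    ready = deque(job for job in jobs if pending[job] == 0)
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)
        for child in children[job]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)

    # Any job left over is part of, or depends on, a cycle.
    if len(order) != len(jobs):
        raise ValueError("cyclic dependency detected")
    return order


diamond = {"ID2": ["ID1"], "ID3": ["ID1"], "ID4": ["ID2", "ID3"]}
order = find_run_order(["ID1", "ID2", "ID3", "ID4"], diamond)
print(order)
```

<p>Every parent appears before its children in the returned list. A
      mapping such as <code class="literal">{"A": ["B"], "B": ["A"]}</code>
      raises <code class="literal">ValueError</code> instead.</p>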
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="idp60680112"></a>14.1.1.1. XML Intro</h4></div></div></div>
<p>If you have seen the DAX schema before, you will not find many new items
        in the root element. <span class="emphasis"><em>However</em></span>, we did retire the
        (old) attributes ending in <span class="emphasis"><em>Count</em></span>.</p>
<pre class="programlisting">&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!-- generated: 2011-07-28T18:29:57Z --&gt;
&lt;adag xmlns="http://pegasus.isi.edu/schema/DAX" 
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://pegasus.isi.edu/schema/DAX http://pegasus.isi.edu/schema/dax-3.3.xsd" 
      version="3.3" 
      name="diamond" 
      index="0" 
      count="1"&gt;</pre>
<p>The following attributes are supported for the root element
        <span class="emphasis"><em>adag</em></span>.</p>
<div class="table">
<a name="idp60684144"></a><p class="title"><b>Table 14.1. </b></p>
<div class="table-contents"><table border="1">
<colgroup>
<col>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>attribute</th>
<th>optional?</th>
<th>type</th>
<th>meaning</th>
</tr></thead>
<tbody>
<tr>
<td>version</td>
<td>required</td>
<td>
                  <span class="emphasis"><em>VersionPattern</em></span>
                </td>
<td>Version number of DAX instance document. Must be
                3.3.</td>
</tr>
<tr>
<td>name</td>
<td>required</td>
<td>string</td>
<td>name of this DAX (or set of DAXes).</td>
</tr>
<tr>
<td>count</td>
<td>optional</td>
<td>positiveInteger</td>
<td>size of list of DAXes with this
                <span class="emphasis"><em>name</em></span>. Defaults to 1.</td>
</tr>
<tr>
<td>index</td>
<td>optional</td>
<td>nonNegativeInteger</td>
<td>current index of DAX with same
                <span class="emphasis"><em>name</em></span>. Defaults to 0.</td>
</tr>
<tr>
<td>fileCount</td>
<td>removed</td>
<td>nonNegativeInteger</td>
<td>Old 2.1 attribute, removed, do not use.</td>
</tr>
<tr>
<td>jobCount</td>
<td>removed</td>
<td>positiveInteger</td>
<td>Old 2.1 attribute, removed, do not use.</td>
</tr>
<tr>
<td>childCount</td>
<td>removed</td>
<td>nonNegativeInteger</td>
<td>Old 2.1 attribute, removed, do not use.</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p>The <span class="emphasis"><em>version</em></span> attribute is restricted to the
        regular expression <code class="code">\d+(\.\d+(\.\d+)?)?</code>. This expression
        represents the <span class="emphasis"><em>VersionPattern</em></span> type that is used
        in other places, too. It is a more restrictive expression than before,
        but allows us to compute comparable version numbers using the following
        formula:</p>
<div class="informaltable"><table border="1">
<tr>
            <td>version1: a.b.c</td>

            <td>version2: d.e.f</td>
          </tr>
<tr>
            <td>n = a * 1,000,000 + b * 1,000 + c</td>

            <td>m = d * 1,000,000 + e * 1,000 + f</td>
          </tr>
<tr>
            <td align="center" colspan="2">version1 &gt; version2 if n &gt;
            m</td>
          </tr>
</table></div>
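<p>The formula above can be sketched in a few lines of Python. This is an
        illustration only: the function name
        <code class="literal">version_number</code> is hypothetical, and
        treating a missing minor or patch component as zero is an assumption
        that the text above does not state.</p>

```python
import re

# The VersionPattern regular expression quoted above.
VERSION_PATTERN = re.compile(r"\d+(\.\d+(\.\d+)?)?")


def version_number(version):
    """Map 'a.b.c' to a * 1,000,000 + b * 1,000 + c."""
    if VERSION_PATTERN.fullmatch(version) is None:
        raise ValueError(f"not a valid version: {version!r}")
    parts = [int(part) for part in version.split(".")]
    # Assumption: missing components count as zero, so "3.3" is "3.3.0".
    a, b, c = parts + [0] * (3 - len(parts))
    return a * 1_000_000 + b * 1_000 + c


# version1 > version2 if n > m
newer = version_number("3.10.1") > version_number("3.3")
print(version_number("3.3"), newer)
```

<p>Comparing the computed integers avoids the pitfall of comparing
        version strings lexically, where "3.10" would sort before "3.3".</p>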
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="idp56622320"></a>14.1.1.2. Workflow-level Notifications</h4></div></div></div>
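<p>Workflow-level notifications are declared with
        <span class="emphasis"><em>invoke</em></span> elements, as in the following snippet.</p>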
<pre class="programlisting">  &lt;!-- part 1.1: invocations --&gt;
  &lt;invoke when="at_end"&gt;/bin/date -Ins &amp;gt;&amp;gt; my.log&lt;/invoke&gt;</pre>
<p>The above snippet appends the current time to a log file in
        the current directory, that is, the working directory of the monitord
        instance acting on the <a class="link" href="notifications.php" title="6.4. Notifications">notification</a>.</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="idp60982672"></a>14.1.1.3. The Catalogs Section</h4></div></div></div>
<p>The catalogs section features three sub-sections:</p>
<div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem"><p>a catalog of files used,</p></li>
<li class="listitem"><p>a catalog of transformations used, and</p></li>
<li class="listitem"><p>compound transformation declarations.</p></li>
</ol></div>
<div class="section">
<div class="titlepage"><div><div><h5 class="title">
<a name="dax_replica_catalog"></a>14.1.1.3.1. The Replica Catalog Section</h5></div></div></div>
<p>The file section acts as an in-file replica catalog (RC). Any
          files declared in this section take precedence over files in
          external replica catalogs during planning.</p>
<pre class="programlisting">  &lt;!-- part 1.2: included replica catalog --&gt;
  &lt;file name="example.a" &gt;
    &lt;!-- profiles are optional --&gt;
    &lt;!-- The "stat" namespace is ONLY AN EXAMPLE --&gt;
    &lt;profile namespace="stat" key="size"&gt;/* integer to be defined */&lt;/profile&gt;
    &lt;profile namespace="stat" key="md5sum"&gt;/* 32 char hex string */&lt;/profile&gt;
    &lt;profile namespace="stat" key="mtime"&gt;/* ISO-8601 timestamp */&lt;/profile&gt;

    &lt;!-- metadata is currently NOT SUPPORTED --&gt;
    &lt;metadata key="timestamp" type="int"&gt;/* ISO-8601 *or* 20100417134523:int */&lt;/metadata&gt;
    &lt;metadata key="origin" type="string"&gt;ocean&lt;/metadata&gt;
    
    &lt;!-- PFN to by-pass replica catalog --&gt;
    &lt;!-- The "site" attribute is optional --&gt;
    &lt;pfn url="file:///tmp/example.a" site="local"&gt;
      &lt;profile namespace="stat" key="owner"&gt;voeckler&lt;/profile&gt;
    &lt;/pfn&gt;
    &lt;pfn url="file:///storage/funky.a" site="local"/&gt;    
  &lt;/file&gt;

  &lt;!-- a more typical example from the black diamond --&gt;
  &lt;file name="f.a"&gt;
    &lt;pfn url="file:///Users/voeckler/f.a" site="local"/&gt;
  &lt;/file&gt;</pre>
<p>The first <span class="emphasis"><em>file</em></span> entry above is an example
          of a data file with two replicas. The <span class="emphasis"><em>file</em></span>
          element requires a logical file <span class="emphasis"><em>name</em></span>. Each
          logical filename may have additional information associated with it,
          enumerated by <span class="emphasis"><em>profile</em></span> elements. Each file entry
          may have 0 or more <span class="emphasis"><em>metadata</em></span> associated with it.
          Each piece of metadata has a <span class="emphasis"><em>key</em></span> string and
          <span class="emphasis"><em>type</em></span> attribute describing the element's
          value.</p>
<div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Warning</h3>
<p>The <span class="emphasis"><em>metadata</em></span> element is not supported as
            of this writing! Details may change in the future.</p>
</div>
<p>The <span class="emphasis"><em>file</em></span> element can provide 0 or more
          <span class="emphasis"><em>pfn</em></span> locations, taking precedence over the
          replica catalog. A <span class="emphasis"><em>file</em></span> element that does not
          name any <span class="emphasis"><em>pfn</em></span> child elements will still
          require look-ups in external replica catalogs. Each
          <span class="emphasis"><em>pfn</em></span> element names a concrete location of a
          file. Multiple locations constitute replicas of the same file, and
          are assumed to be usable interchangeably. The
          <span class="emphasis"><em>url</em></span> attribute is mandatory, and typically would
          use a file-scheme URL. The <span class="emphasis"><em>site</em></span> attribute is
          optional, and defaults to the value <span class="emphasis"><em>local</em></span> if
          missing. A <span class="emphasis"><em>pfn</em></span> element may have
          <span class="emphasis"><em>profile</em></span> child elements, which refer to
          attributes of the physical file. The file-level profiles refer to
          attributes of the logical file.</p>
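<p>For readers who generate DAX files programmatically without the
          Pegasus DAX API, the following sketch builds one
          <span class="emphasis"><em>file</em></span> entry with the Python standard
          library. The helper name <code class="literal">replica_entry</code>
          is hypothetical, and the DAX XML namespace is omitted for brevity,
          so the output is a fragment rather than a complete, schema-valid
          document.</p>

```python
import xml.etree.ElementTree as ET


def replica_entry(lfn, replicas):
    """Build a <file> element for one logical filename.

    replicas: list of (url, site) pairs; pass site=None to omit the
    optional site attribute.
    """
    file_el = ET.Element("file", name=lfn)
    for url, site in replicas:
        attrs = {"url": url}          # the url attribute is mandatory
        if site is not None:
            attrs["site"] = site      # the site attribute is optional
        ET.SubElement(file_el, "pfn", attrs)
    return file_el


entry = replica_entry(
    "example.a",
    [("file:///tmp/example.a", "local"), ("file:///storage/funky.a", None)],
)
print(ET.tostring(entry, encoding="unicode"))
```

<p>Passing an empty replica list yields a
          <span class="emphasis"><em>file</em></span> element without
          <span class="emphasis"><em>pfn</em></span> children, which, as described above,
          still requires a look-up in an external replica catalog.</p>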
<div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Note</h3>
<p>The <code class="literal">stat</code> profile namespace is only an
            example, and details about stat are not yet implemented. The
            proper namespaces <code class="literal">pegasus</code>,
            <code class="literal">condor</code>, <code class="literal">dagman</code>,
            <code class="literal">env</code>, <code class="literal">hints</code>,
            <code class="literal">globus</code> and <code class="literal">selector</code> enjoy
            full support.</p>
</div>
<p>The second <span class="emphasis"><em>file</em></span> entry above shows a usage
          example from the black-diamond example workflow that you are more
          likely to encounter or write.</p>
<p>The presence of an in-file replica catalog lets you declare a
          couple of interesting advanced features. The DAG and DAX file
          declarations are just files for all practical purposes. For deferred
          planning, the location of the site catalog (SC) can be captured in a
          file, too, which is passed as a logical filename to the job dealing
          with the deferred planning.</p>
<pre class="programlisting">  &lt;file name="black.dax" &gt;
    &lt;!-- specify the location of the DAX file --&gt;
    &lt;pfn url="file:///Users/vahi/Pegasus/work/dax-3.0/blackdiamond_dax.xml" site="local"/&gt;
  &lt;/file&gt;

  &lt;file name="black.dag" &gt;
    &lt;!-- specify the location of the DAG file --&gt;
    &lt;pfn url="file:///Users/vahi/Pegasus/work/dax-3.0/blackdiamond.dag" site="local"/&gt;
  &lt;/file&gt;
  
  &lt;file name="sites.xml" &gt;
    &lt;!-- specify the location of a site catalog to use for deferred planning --&gt;
    &lt;pfn url="file:///Users/vahi/Pegasus/work/dax-3.0/conf/sites.xml" site="local"/&gt;
  &lt;/file&gt;</pre>
</div>
<div class="section">
<div class="titlepage"><div><div><h5 class="title">
<a name="dax_transformation_catalog"></a>14.1.1.3.2. The Transformation Catalog Section</h5></div></div></div>
<p>The executable section acts as an in-file transformation
          catalog (TC). Any transformations declared in this section take
          precedence over the external transformation catalog during
          planning.</p>
<pre class="programlisting">  &lt;!-- part 1.3: included transformation catalog --&gt;
  &lt;executable namespace="example" name="mDiffFit" version="1.0" 
              arch="x86_64" os="linux" installed="true" &gt;
    &lt;!-- profiles are optional --&gt;
    &lt;!-- The "stat" namespace is ONLY AN EXAMPLE! --&gt;
    &lt;profile namespace="stat" key="size"&gt;5000&lt;/profile&gt;
    &lt;profile namespace="stat" key="md5sum"&gt;AB454DSSDA4646DS&lt;/profile&gt;
    &lt;profile namespace="stat" key="mtime"&gt;2010-11-22T10:05:55.470606000-0800&lt;/profile&gt;

    &lt;!-- metadata is currently NOT SUPPORTED! --&gt;
    &lt;metadata key="timestamp" type="int"&gt;/* see above */&lt;/metadata&gt;
    &lt;metadata key="origin" type="string"&gt;ocean&lt;/metadata&gt;
 
    &lt;!-- PFN to by-pass transformation catalog --&gt;
    &lt;!-- The "site" attribute is optional --&gt;
    &lt;pfn url="file:///tmp/mDiffFit"          site="local"/&gt;     
    &lt;pfn url="file:///tmp/storage/mDiffFit"  site="local"/&gt;     
  &lt;/executable&gt;

  &lt;!-- to be used in compound transformation later --&gt;
  &lt;executable namespace="example" name="mDiff" version="1.0" 
              arch="x86_64" os="linux" installed="true" &gt;
    &lt;pfn url="file:///tmp/mDiff" site="local"/&gt;        
  &lt;/executable&gt;

  &lt;!-- to be used in compound transformation later --&gt;
  &lt;executable namespace="example" name="mFitplane" version="1.0"
              arch="x86_64" os="linux" installed="true" &gt;
    &lt;pfn url="file:///tmp/mDiffFitplane"  site="local"&gt;
      &lt;profile namespace="stat" key="md5sum"&gt;0a9c38b919c7809cb645fc09011588a6&lt;/profile&gt;
    &lt;/pfn&gt;
    &lt;invoke when="at_end"&gt;/path/to/my_send_email some args&lt;/invoke&gt;
  &lt;/executable&gt;

  &lt;!-- a more likely example from the black diamond --&gt;
  &lt;executable namespace="diamond" name="preprocess" version="2.0" 
              arch="x86_64"
              os="linux" 
              osversion="2.6.18"&gt;
    &lt;pfn url="file:///opt/pegasus/default/bin/keg" site="local" /&gt;
  &lt;/executable&gt;</pre>
<p>Logical filenames pertaining to a single executable in the
          transformation catalog use the <span class="emphasis"><em>executable</em></span>
          element. Any <span class="emphasis"><em>executable</em></span> element features the
          optional <span class="emphasis"><em>namespace</em></span> attribute, a mandatory
          <span class="emphasis"><em>name</em></span> attribute, and an optional
          <span class="emphasis"><em>version</em></span> attribute. The
          <span class="emphasis"><em>version</em></span> attribute defaults to "1.0" when
          absent. An executable typically needs additional attributes to
          describe it properly, like the architecture, OS release and other
          flags typically seen with transformations, or found in the
          transformation catalog.</p>
<div class="table">
<a name="idp63505936"></a><p class="title"><b>Table 14.2. </b></p>
<div class="table-contents"><table border="1">
<colgroup>
<col>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>attribute</th>
<th>optional?</th>
<th>type</th>
<th>meaning</th>
</tr></thead>
<tbody>
<tr>
<td>name</td>
<td>required</td>
<td>string</td>
<td>logical transformation name</td>
</tr>
<tr>
<td>namespace</td>
<td>optional</td>
<td>string</td>
<td>namespace of logical transformation, defaults to
                  <span class="emphasis"><em>null</em></span> value.</td>
</tr>
<tr>
<td>version</td>
<td>optional</td>
<td>VersionPattern</td>
<td>version of logical transformation, defaults to
                  "1.0".</td>
</tr>
<tr>
<td>installed</td>
<td>optional</td>
<td>boolean</td>
<td>whether to stage the file (false), or not (true,
                  default).</td>
</tr>
<tr>
<td>arch</td>
<td>optional</td>
<td>Architecture</td>
<td>restricted set of tokens, see schema definition
                  file.</td>
</tr>
<tr>
<td>os</td>
<td>optional</td>
<td>OSType</td>
<td>restricted set of tokens, see schema definition
                  file.</td>
</tr>
<tr>
<td>osversion</td>
<td>optional</td>
<td>VersionPattern</td>
<td>kernel version as beginning of `uname -r`.</td>
</tr>
<tr>
<td>glibc</td>
<td>optional</td>
<td>VersionPattern</td>
<td>version of libc.</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p>The rationale for giving these flags in the
          <span class="emphasis"><em>executable</em></span> element header is that PFNs are just
          identical replicas or instances of a given LFN. If you need a
          different 32-bit or 64-bit architecture or OS release, the underlying
          PFN would be different, and thus the LFN for it should be different,
          too.</p>
<div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Note</h3>
<p>We are still discussing some details and implications of
            this decision.</p>
</div>
<p>The initial examples come with the same caveats as for the
          included replica catalog.</p>
<div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Warning</h3>
<p>The <span class="emphasis"><em>metadata</em></span> element is not supported as
            of this writing! Details may change in the future.</p>
</div>
<p>Similar to the replica catalog, each
          <span class="emphasis"><em>executable</em></span> element may have zero or more
          <span class="emphasis"><em>profile</em></span> elements abstracting away site-specific
          details, zero or more <span class="emphasis"><em>metadata</em></span> elements, and
          zero or more <span class="emphasis"><em>pfn</em></span> elements. If there are no
          <span class="emphasis"><em>pfn</em></span> elements, the transformation must still be
          searched for in the external transformation catalog. As before, the
          <span class="emphasis"><em>pfn</em></span> element may have
          <span class="emphasis"><em>profile</em></span> child elements, referring to
          attributes of the physical filename itself.</p>
<p>Each <span class="emphasis"><em>executable</em></span> element may also feature
          <span class="emphasis"><em>invoke</em></span> elements. These enable notifications at
          the appropriate point when every job that uses this executable
          reaches the point of notification. Please refer to the <a class="link" href="notifications.php" title="6.4. Notifications">notification section</a> for details and
          caveats.</p>
<p>The last example above comes from the black diamond example
          workflow, and presents the kind and extent of attributes you are
          most likely to see and use in your own workflows.</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h5 class="title">
<a name="idp55374512"></a>14.1.1.3.3. The Compound Transformation Section</h5></div></div></div>
<p>The compound transformation section declares a transformation
          that comprises multiple plain transformations. You can think of a
          compound transformation like a script interpreter and the script
          itself. In order to properly run the application, you must start
          both the script interpreter and the script passed to it. The
          compound transformation helps Pegasus to properly deal with this
          case, especially when it needs to stage executables.</p>
<pre class="programlisting">  &lt;transformation namespace="example" version="1.0" name="mDiffFit" &gt;
    &lt;uses name="mDiffFit" /&gt;
    &lt;uses name="mDiff" namespace="example" version="2.0" /&gt;
    &lt;uses name="mFitplane" /&gt;
    &lt;uses name="mDiffFit.config" executable="false" /&gt;
  &lt;/transformation&gt;</pre>
<p>A <span class="emphasis"><em>transformation</em></span> element declares a set
          of purely logical entities, executables and config (data) files,
          that are all required together for the same job. Being purely
          logical entities, the lookup happens only when the transformation
          element is referenced (or instantiated) by a job element later
          on.</p>
<p>The <span class="emphasis"><em>namespace</em></span> and
          <span class="emphasis"><em>version</em></span> attributes of the transformation
          element are optional, and provide the defaults for the inner uses
          elements. They are also essential for matching the transformation
          with a job.</p>
<p>The <span class="emphasis"><em>transformation</em></span> is made up of 1 or
          more <span class="emphasis"><em>uses</em></span> elements. Each
          <span class="emphasis"><em>uses</em></span> has a boolean attribute
          <span class="emphasis"><em>executable</em></span>, <code class="literal">true</code> by default,
          or <code class="literal">false</code> to indicate a data file. The
          <span class="emphasis"><em>name</em></span> is a mandatory attribute, referring to an
          LFN declared previously in the File Catalog
          (<span class="emphasis"><em>executable</em></span> is <code class="literal">false</code>),
          Executable Catalog (<span class="emphasis"><em>executable</em></span> is
          <code class="literal">true</code>), or to be looked up as necessary at
          instantiation time. The lookup catalog is determined by the
          <span class="emphasis"><em>executable</em></span> attribute.</p>
<p>After <span class="emphasis"><em>uses</em></span> elements, any number of
          <span class="emphasis"><em>invoke</em></span> elements may occur to add a <a class="link" href="notifications.php" title="6.4. Notifications">notification</a> each whenever this
          transformation is instantiated.</p>
<p>The <span class="emphasis"><em>namespace</em></span> and
          <span class="emphasis"><em>version</em></span> attributes' default values inside
          <span class="emphasis"><em>uses</em></span> elements are inherited from the
          <span class="emphasis"><em>transformation</em></span> attributes of the same name.
          There is no such inheritance for <span class="emphasis"><em>uses</em></span> elements
          with <span class="emphasis"><em>executable</em></span> attribute of
          <code class="literal">false</code>.</p>
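<p>The inheritance rule can be illustrated with a short Python sketch
          that resolves the effective namespace and version of each
          <span class="emphasis"><em>uses</em></span> element. The function name
          <code class="literal">resolve_uses</code> is hypothetical, the DAX XML
          namespace is omitted for brevity, and the code only mirrors the
          rule as described above rather than the planner's actual
          implementation.</p>

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<transformation namespace="example" version="1.0" name="mDiffFit">
  <uses name="mDiffFit" />
  <uses name="mDiff" namespace="example" version="2.0" />
  <uses name="mDiffFit.config" executable="false" />
</transformation>
"""


def resolve_uses(transformation):
    """Return (name, namespace, version, is_executable) per uses element."""
    resolved = []
    for uses in transformation.findall("uses"):
        # The executable attribute is boolean and true by default.
        is_executable = uses.get("executable", "true") in ("true", "1")
        namespace = uses.get("namespace")
        version = uses.get("version")
        if is_executable:
            # Executables inherit missing values from the transformation.
            if namespace is None:
                namespace = transformation.get("namespace")
            if version is None:
                version = transformation.get("version")
        # Data files (executable="false") inherit nothing.
        resolved.append((uses.get("name"), namespace, version, is_executable))
    return resolved


resolved = resolve_uses(ET.fromstring(SNIPPET.strip()))
for row in resolved:
    print(row)
```

<p>In this sketch the first entry picks up the namespace and version of
          the enclosing transformation, the second keeps its own explicit
          version, and the data file keeps neither.</p>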
</div>
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="api-graph-nodes"></a>14.1.1.4. Graph Nodes</h4></div></div></div>
<p>The nodes in the DAX comprise regular job nodes, already
        instantiated sub-workflows as dag nodes, and still to be instantiated
        dax nodes. Each of the graph nodes has a mandatory
        <span class="emphasis"><em>id</em></span> attribute. The <span class="emphasis"><em>id</em></span>
        attribute is currently of type
        <span class="emphasis"><em>NodeIdentifierPattern</em></span>, which is a
        restriction of the <code class="code">xs:NMTOKEN</code> type to letters, digits,
        hyphen and underscore.</p>
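<p>A generator can reject unsuitable identifiers before writing the DAX.
        The sketch below is an approximation: the regular expression assumes
        ASCII letters and digits, and the authoritative pattern is the one in
        the schema definition file. The function name
        <code class="literal">is_valid_node_id</code> is hypothetical.</p>

```python
import re

# Assumption: ASCII letters, digits, hyphen and underscore, at least one
# character. Consult the schema for the authoritative pattern.
NODE_ID = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_node_id(node_id):
    """Return True if node_id only uses the permitted characters."""
    return NODE_ID.fullmatch(node_id) is not None


checks = {
    candidate: is_valid_node_id(candidate)
    for candidate in ("ID000001", "pre-process_1", "job 1", "a.b", "")
}
print(checks)
```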
<p>The <span class="emphasis"><em>level</em></span> attribute is deprecated, as the
        planner will trust its own re-computation more than user input. Please
        neither use nor produce any <span class="emphasis"><em>level</em></span>
        attribute.</p>
<p>The <span class="emphasis"><em>node-label</em></span> attribute is optional. It
        applies to the use-case when every transformation has the same name,
        but its arguments determine what it really does. In the presence of a
        <span class="emphasis"><em>node-label</em></span> value, a workflow grapher could use
        the label value to show graph nodes to the user. It may also come in
        handy while debugging.</p>
<p>Any job-like graph node has the following set of children
        elements, as defined in the <span class="emphasis"><em>AbstractJobType</em></span>
        declaration in the schema definition:</p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem"><p>0 or 1 <span class="emphasis"><em>argument</em></span> element to declare the
            command-line of the job's invocation.</p></li>
<li class="listitem"><p>0 or more <span class="emphasis"><em>profile</em></span> elements to abstract
            away site-specific or job-specific details.</p></li>
<li class="listitem"><p>0 or 1 <span class="emphasis"><em>stdin</em></span> element to link a logical
            file to the job's standard input.</p></li>
<li class="listitem"><p>0 or 1 <span class="emphasis"><em>stdout</em></span> element to link a logical
            file to the job's standard output.</p></li>
<li class="listitem"><p>0 or 1 <span class="emphasis"><em>stderr</em></span> element to link a logical
            file to the job's standard error.</p></li>
<li class="listitem"><p>0 or more <span class="emphasis"><em>uses</em></span> elements to declare
            consumed data files and produced data files.</p></li>
<li class="listitem"><p>0 or more <span class="emphasis"><em>invoke</em></span> elements to solicit
            <a class="link" href="notifications.php" title="6.4. Notifications">notifications</a> when a job
            reaches a certain state in its life-cycle.</p></li>
</ul></div>
<div class="section">
<div class="titlepage"><div><div><h5 class="title">
<a name="api-job-nodes"></a>14.1.1.4.1. Job Nodes</h5></div></div></div>
<p>A job element has a number of attributes. In addition to the
          <span class="emphasis"><em>id</em></span> and <span class="emphasis"><em>node-label</em></span>
          described under Graph Nodes above, the optional
          <span class="emphasis"><em>namespace</em></span>, mandatory <span class="emphasis"><em>name</em></span>
          and optional <span class="emphasis"><em>version</em></span> identify the
          transformation, and provide the look-up handle: first in the DAX's
          <span class="emphasis"><em>transformation</em></span> elements, then in the
          <span class="emphasis"><em>executable</em></span> elements, and finally in an external
          transformation catalog.</p>
<pre class="programlisting">  &lt;!-- part 2: definition of all jobs (at least one) --&gt;
  &lt;job id="ID000001" namespace="example" name="mDiffFit" version="1.0" 
       node-label="preprocess" &gt;
    &lt;argument&gt;-a top -T 6  -i &lt;file name="f.a"/&gt;  -o &lt;file name="f.b1"/&gt;&lt;/argument&gt;

    &lt;!-- profiles are optional --&gt;
    &lt;profile namespace="execution" key="site"&gt;isi_viz&lt;/profile&gt;
    &lt;profile namespace="condor" key="getenv"&gt;true&lt;/profile&gt;

    &lt;uses name="f.a" link="input"  register="false" transfer="true" type="data" /&gt;
    &lt;uses name="f.b1" link="output" register="false" transfer="true" type="data" /&gt;
    
    &lt;!-- 'WHEN' enumeration: never, start, on_error, on_success, at_end, all --&gt;
    &lt;!-- PEGASUS_* env-vars: event, status, submit dir, wf/job id, stdout, stderr --&gt;
    &lt;invoke when="start"&gt;/path/to arg arg&lt;/invoke&gt;
    &lt;invoke when="on_success"&gt;&lt;![CDATA[/path/to arg arg]]&gt;&lt;/invoke&gt;
    &lt;invoke when="at_end"&gt;&lt;![CDATA[/path/to arg arg]]&gt;&lt;/invoke&gt;
  &lt;/job&gt;</pre>
<p>The <span class="emphasis"><em>argument</em></span> element contains the
          complete command-line that is needed to invoke the executable. The
          only variable components are logical filenames, as included
          <span class="emphasis"><em>file</em></span> elements.</p>
<p>The <span class="emphasis"><em>profile</em></span> argument lets you encapsulate
          site-specific knowledge .</p>
<p>The <span class="emphasis"><em>stdin</em></span>, <span class="emphasis"><em>stdout</em></span>
          and <span class="emphasis"><em>stderr</em></span> elements permit you to connect a
          stdio file descriptor to a logical filename. Note that you
          still have to declare these files in the <span class="emphasis"><em>uses</em></span>
          section below.</p>
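<p>As an illustration, the following sketch connects
          <span class="emphasis"><em>stdout</em></span> and <span class="emphasis"><em>stderr</em></span>
          to logical filenames and declares both in <span class="emphasis"><em>uses</em></span>
          elements. The filenames <code class="literal">f.out</code> and
          <code class="literal">f.err</code> are made up for this example, and the
          <code class="code">name</code> and <code class="code">link</code> attributes on the
          stdio elements are assumed to mirror those of <span class="emphasis"><em>uses</em></span>;
          the schema documentation remains the authoritative reference.</p>
<pre class="programlisting">  &lt;job id="ID000001" namespace="example" name="mDiffFit" version="1.0"&gt;
    &lt;argument&gt;-a top -T 6 -i &lt;file name="f.a"/&gt;&lt;/argument&gt;

    &lt;!-- hypothetical logical filenames for the captured streams --&gt;
    &lt;stdout name="f.out" link="output"/&gt;
    &lt;stderr name="f.err" link="output"/&gt;

    &lt;!-- the stdio files still need matching uses declarations --&gt;
    &lt;uses name="f.a"   link="input"  register="false" transfer="true" type="data" /&gt;
    &lt;uses name="f.out" link="output" register="false" transfer="true" type="data" /&gt;
    &lt;uses name="f.err" link="output" register="false" transfer="true" type="data" /&gt;
  &lt;/job&gt;</pre>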
<p>The <span class="emphasis"><em>uses</em></span> element enumerates all the files
          that the task consumes or produces. Not every file has to appear
          on the command line, but it is imperative that you declare every
          file that your task requires in this section, including hidden
          ones, so that the proper ancillary staging and
          clean-up tasks can be generated during planning.</p>
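<p>For instance, a task might read a configuration file that never
          appears in its arguments. The sketch below assumes a hypothetical
          logical file <code class="literal">mdiff.cfg</code>; it is declared as an
          input even though the <span class="emphasis"><em>argument</em></span> element does
          not mention it.</p>
<pre class="programlisting">  &lt;job id="ID000001" namespace="example" name="mDiffFit" version="1.0"&gt;
    &lt;argument&gt;-a top -T 6 -i &lt;file name="f.a"/&gt; -o &lt;file name="f.b1"/&gt;&lt;/argument&gt;

    &lt;!-- files that are named on the command line --&gt;
    &lt;uses name="f.a"  link="input"  register="false" transfer="true" type="data" /&gt;
    &lt;uses name="f.b1" link="output" register="false" transfer="true" type="data" /&gt;

    &lt;!-- hidden input (hypothetical): read by the task, absent from the command line --&gt;
    &lt;uses name="mdiff.cfg" link="input" register="false" transfer="true" type="data" /&gt;
  &lt;/job&gt;</pre>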
<p>The <span class="emphasis"><em>invoke</em></span> element may be specified
          multiple times, as needed. It has a mandatory <code class="code">when</code>
          attribute that takes one of the following values:</p>
<div class="table">
<a name="idp61010560"></a><p class="title"><b>Table 14.3. Values of the when attribute</b></p>
<div class="table-contents"><table border="1">
<colgroup>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th align="center">keyword</th>
<th align="center">job life-cycle state</th>
<th align="center">meaning</th>
</tr></thead>
<tbody>
<tr>
<td>never</td>
<td>never</td>
<td>(default). Never notify of anything. This is useful
                  to temporarily disable an existing notification.</td>
</tr>
<tr>
<td>start</td>
<td>submit</td>
<td>create a notification when the job is
                  submitted.</td>
</tr>
<tr>
<td>on_error</td>
<td>end</td>
<td>after a job finishes with failure (exitcode !=
                  0).</td>
</tr>
<tr>
<td>on_success</td>
<td>end</td>
<td>after a job finishes with success (exitcode ==
                  0).</td>
</tr>
<tr>
<td>at_end</td>
<td>end</td>
<td>after a job finishes, regardless of exitcode.</td>
</tr>
<tr>
<td>all</td>
<td>always</td>
<td>like start and at_end combined.</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Warning</h3>
<p>In clustered jobs, a notification can only be sent at the
            start or end of the clustered job, not for each member.</p>
</div>
<p>Each <span class="emphasis"><em>invoke</em></span> is a simple local invocation
          of an executable or script with the specified arguments. The
          executable inside the invoke body will see the following environment
          variables:</p>
<div class="table">
<a name="idp57020624"></a><p class="title"><b>Table 14.4. Environment variables visible to the invoke executable</b></p>
<div class="table-contents"><table border="1">
<colgroup>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th align="center">variable</th>
<th align="center">job life-cycle state</th>
<th align="center">meaning</th>
</tr></thead>
<tbody>
<tr>
<td>PEGASUS_EVENT</td>
<td>always</td>
<td>The value of the <code class="code">when</code> attribute</td>
</tr>
<tr>
<td>PEGASUS_STATUS</td>
<td>end</td>
<td>The exit status of the graph node. Only available for
                  end notifications.</td>
</tr>
<tr>
<td>PEGASUS_SUBMIT_DIR</td>
<td>always</td>
<td>In which directory to find the job (or
                  workflow).</td>
</tr>
<tr>
<td>PEGASUS_JOBID</td>
<td>always</td>
<td>The job (or workflow) identifier. This is potentially
                  more than merely the value of the <span class="emphasis"><em>id</em></span>
                  attribute.</td>
</tr>
<tr>
<td>PEGASUS_STDOUT</td>
<td>always</td>
<td>The filename where <span class="emphasis"><em>stdout</em></span> goes.
                  Empty and possibly non-existent at submit time (though we
                  still have the filename). For job nodes, this is the
                  kickstart record.</td>
</tr>
<tr>
<td>PEGASUS_STDERR</td>
<td>always</td>
<td>The filename where <span class="emphasis"><em>stderr</em></span> goes.
                  Empty and possibly non-existent at submit time (though we
                  still have the filename).</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p>Generators should wrap the value of the invoke
          element in a CDATA section to minimize interference. Unfortunately,
          CDATA sections cannot be nested, so if the user invocation itself
          contains a CDATA section, we suggest using carefully XML-entity-escaped
          strings instead. The <a class="link" href="notifications.php" title="6.4. Notifications">notifications section</a> describes these
          in further detail.</p>
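<p>The two encodings can be contrasted as in the sketch below. The
          mail command and the address are placeholders chosen for
          illustration only.</p>
<pre class="programlisting">  &lt;!-- CDATA: no entity escaping is needed inside the section --&gt;
  &lt;invoke when="on_error"&gt;&lt;![CDATA[/bin/mailx -s 'job failed' user@some.domain]]&gt;&lt;/invoke&gt;

  &lt;!-- entity-escaped: the same command line without a CDATA section --&gt;
  &lt;invoke when="on_error"&gt;/bin/mailx -s &amp;apos;job failed&amp;apos; user@some.domain&lt;/invoke&gt;</pre>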
</div>
<div class="section">
<div class="titlepage"><div><div><h5 class="title">
<a name="idp53818448"></a>14.1.1.4.2. DAG Nodes</h5></div></div></div>
<p>A workflow that has already been concretized, either by an
          earlier run of Pegasus, or otherwise constructed for DAGMan
          execution, can be included into the current workflow using the
          <span class="emphasis"><em>dag</em></span> element.</p>
<pre class="programlisting">  &lt;dag id="ID000003" name="black.dag" node-label="foo" &gt;
    &lt;profile namespace="dagman" key="DIR"&gt;/dag-dir/test&lt;/profile&gt;
    &lt;invoke&gt; &lt;!-- optional, should be possible --&gt; &lt;/invoke&gt;
    &lt;uses file="sites.xml" link="input" register="false" transfer="true" type="data"/&gt;     
  &lt;/dag&gt;</pre>
<p>The <span class="emphasis"><em>id</em></span> and
          <span class="emphasis"><em>node-label</em></span> attributes were described <a class="link" href="api.php#api-graph-nodes" title="14.1.1.4. Graph Nodes">previously</a>. The
          <span class="emphasis"><em>name</em></span> attribute refers to a file from the File
          Catalog that provides the actual DAGMan DAG as data content. The
          <span class="emphasis"><em>dag</em></span> element features optional
          <span class="emphasis"><em>profile</em></span> elements. These would most likely
          pertain to the <code class="literal">dagman</code> and <code class="literal">env</code>
          profile namespaces. It should be possible to have the optional
          <span class="emphasis"><em>invoke</em></span> element in the same manner as for
          jobs.</p>
<p>A graph node that is a dag instead of a job would just use a
          different submit file generator to create a DAGMan invocation. There
          can be an <span class="emphasis"><em>argument</em></span> element to modify the
          command-line passed to DAGMan.</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h5 class="title">
<a name="idp54419280"></a>14.1.1.4.3. DAX Nodes</h5></div></div></div>
<p>A workflow that still has to be planned incurs an invocation of the
          Pegasus planner as part of the enclosing workflow. Such a still-abstract
          sub-workflow uses the <span class="emphasis"><em>dax</em></span> element.</p>
<pre class="programlisting">  &lt;dax id="ID000002" name="black.dax" node-label="bar" &gt;
    &lt;profile namespace="env" key="foo"&gt;bar&lt;/profile&gt;
    &lt;argument&gt;-Xmx1024 -Xms512 -Dpegasus.dir.storage=storagedir  -Dpegasus.dir.exec=execdir -o local --dir ./datafind -vvvvv --force -s dax_site &lt;/argument&gt;
    &lt;invoke&gt; &lt;!-- optional, may not be possible here --&gt; &lt;/invoke&gt;
    &lt;uses file="sites.xml" link="input" register="false" transfer="true" type="data" /&gt;
  &lt;/dax&gt;</pre>
<p>The <span class="emphasis"><em>id</em></span> and
          <span class="emphasis"><em>node-label</em></span> attributes are described in <a class="link" href="api.php#api-graph-nodes" title="14.1.1.4. Graph Nodes">Graph Nodes</a>. The
          <span class="emphasis"><em>name</em></span> attribute refers to a file from the File
          Catalog that provides the to-be-planned DAX as external file data
          content. The <span class="emphasis"><em>dax</em></span> element features optional
          <span class="emphasis"><em>profile</em></span> elements. These would most likely
          pertain to the <code class="literal">pegasus</code>, <code class="literal">dagman</code>
          and <code class="literal">env</code> profile namespaces. It may be possible to
          have the optional <span class="emphasis"><em>invoke</em></span> element in the same
          manner as for jobs.</p>
<p>A graph node that is a <span class="emphasis"><em>dax</em></span> instead of a
          job would just use yet another submit file and pre-script generator
          to create a DAGMan invocation. The <span class="emphasis"><em>argument</em></span>
          string pertains to the command line of the to-be-generated DAGMan
          invocation.</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h5 class="title">
<a name="idp55559376"></a>14.1.1.4.4. Inner ADAG Nodes</h5></div></div></div>
<p>While completeness would argue for a recursive nesting of
          <span class="emphasis"><em>adag</em></span> elements, such recursive nesting is
          currently not supported, not even in the schema. If you need to nest
          workflows, please use the <span class="emphasis"><em>dax</em></span> or
          <span class="emphasis"><em>dag</em></span> element to achieve the same goal.</p>
</div>
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="idp53253312"></a>14.1.1.5. The Dependency Section</h4></div></div></div>
<p>This section describes the dependencies between the jobs.</p>
<pre class="programlisting">  &lt;!-- part 3: list of control-flow dependencies --&gt;
  &lt;child ref="ID000002"&gt;
    &lt;parent ref="ID000001" edge-label="edge1" /&gt;
  &lt;/child&gt;
  &lt;child ref="ID000003"&gt;
    &lt;parent ref="ID000001" edge-label="edge2" /&gt;
  &lt;/child&gt;
  &lt;child ref="ID000004"&gt;
    &lt;parent ref="ID000002" edge-label="edge3" /&gt;
    &lt;parent ref="ID000003" edge-label="edge4" /&gt;
  &lt;/child&gt;</pre>
<p>Each <span class="emphasis"><em>child</em></span> element contains one or more
        <span class="emphasis"><em>parent</em></span> elements. Both elements use the
        <span class="emphasis"><em>ref</em></span> attribute to refer to the
        <span class="emphasis"><em>id</em></span> attribute of a
        <span class="emphasis"><em>job</em></span>, <span class="emphasis"><em>dag</em></span> or
        <span class="emphasis"><em>dax</em></span> element. In this version, we relaxed the
        <code class="code">xs:IDREF</code> constraint in favor of a restriction on the
        <code class="code">xs:NMTOKEN</code> type to permit a larger set of
        identifiers.</p>
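<p>Because <span class="emphasis"><em>ref</em></span> may point at any graph node, a
        dependency can mix node types. The sketch below reuses identifiers
        and names from the earlier fragments and abbreviates the node
        bodies; it makes a <span class="emphasis"><em>dax</em></span> sub-workflow wait for
        a <span class="emphasis"><em>job</em></span>.</p>
<pre class="programlisting">  &lt;job id="ID000001" namespace="example" name="mDiffFit" version="1.0" /&gt;
  &lt;dax id="ID000002" name="black.dax" /&gt;

  &lt;!-- the sub-workflow ID000002 starts only after job ID000001 has finished --&gt;
  &lt;child ref="ID000002"&gt;
    &lt;parent ref="ID000001" /&gt;
  &lt;/child&gt;</pre>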
<p>The <span class="emphasis"><em>parent</em></span> element has an optional
        <span class="emphasis"><em>edge-label</em></span> attribute.</p>
<div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Warning</h3>
<p>The <span class="emphasis"><em>edge-label</em></span> attribute is currently
          unused.</p>
</div>
<p>Its goal is to annotate edges when drawing workflow
        graphs.</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="idp62585264"></a>14.1.1.6. Closing</h4></div></div></div>
<p>Like any XML element, the root element needs to be closed.</p>
<pre class="programlisting">&lt;/adag&gt;</pre>
</div>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="idp62587536"></a>14.1.2. DAX XML Schema Example</h3></div></div></div>
<p>The following code example shows the XML instance document
      representing the diamond workflow.</p>
<pre class="programlisting">&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;adag xmlns="http://pegasus.isi.edu/schema/DAX"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://pegasus.isi.edu/schema/DAX http://pegasus.isi.edu/schema/dax-3.3.xsd"
 version="3.3" name="diamond" index="0" count="1"&gt;
  &lt;!-- part 1.1: invocations --&gt;
  &lt;invoke when="on_error"&gt;/bin/mailx -s &amp;apos;diamond failed&amp;apos; use@some.domain&lt;/invoke&gt;

  &lt;!-- part 1.2: included replica catalog --&gt;
  &lt;file name="f.a"&gt;
    &lt;pfn url="file:///lfs/voeckler/src/svn/pegasus/trunk/examples/grid-blackdiamond-perl/f.a" site="local" /&gt;
  &lt;/file&gt;

  &lt;!-- part 1.3: included transformation catalog --&gt;
  &lt;executable namespace="diamond" name="preprocess" version="2.0" arch="x86_64" os="linux" installed="false"&gt;
    &lt;profile namespace="globus" key="maxtime"&gt;2&lt;/profile&gt;
    &lt;profile namespace="dagman" key="RETRY"&gt;3&lt;/profile&gt;
    &lt;pfn url="file:///opt/pegasus/latest/bin/keg" site="local" /&gt;
  &lt;/executable&gt;
  &lt;executable namespace="diamond" name="analyze" version="2.0" arch="x86_64" os="linux" installed="false"&gt;
    &lt;profile namespace="globus" key="maxtime"&gt;2&lt;/profile&gt;
    &lt;profile namespace="dagman" key="RETRY"&gt;3&lt;/profile&gt;
    &lt;pfn url="file:///opt/pegasus/latest/bin/keg" site="local" /&gt;
  &lt;/executable&gt;
  &lt;executable namespace="diamond" name="findrange" version="2.0" arch="x86_64" os="linux" installed="false"&gt;
    &lt;profile namespace="globus" key="maxtime"&gt;2&lt;/profile&gt;
    &lt;profile namespace="dagman" key="RETRY"&gt;3&lt;/profile&gt;
    &lt;pfn url="file:///opt/pegasus/latest/bin/keg" site="local" /&gt;
  &lt;/executable&gt;

  &lt;!-- part 2: definition of all jobs (at least one) --&gt;
  &lt;job namespace="diamond" name="preprocess" version="2.0" id="ID000001"&gt;
    &lt;argument&gt;-a preprocess -T60 -i &lt;file name="f.a" /&gt; -o &lt;file name="f.b1" /&gt; &lt;file name="f.b2" /&gt;&lt;/argument&gt;
    &lt;uses name="f.b2" link="output" register="false" transfer="true" /&gt;
    &lt;uses name="f.b1" link="output" register="false" transfer="true" /&gt;
    &lt;uses name="f.a" link="input" /&gt;
  &lt;/job&gt;
  &lt;job namespace="diamond" name="findrange" version="2.0" id="ID000002"&gt;
    &lt;argument&gt;-a findrange -T60 -i &lt;file name="f.b1" /&gt; -o &lt;file name="f.c1" /&gt;&lt;/argument&gt;
    &lt;uses name="f.b1" link="input" register="false" transfer="true" /&gt;
    &lt;uses name="f.c1" link="output" register="false" transfer="true" /&gt;
  &lt;/job&gt;
  &lt;job namespace="diamond" name="findrange" version="2.0" id="ID000003"&gt;
    &lt;argument&gt;-a findrange -T60 -i &lt;file name="f.b2" /&gt; -o &lt;file name="f.c2" /&gt;&lt;/argument&gt;
    &lt;uses name="f.b2" link="input" register="false" transfer="true" /&gt;
    &lt;uses name="f.c2" link="output" register="false" transfer="true" /&gt;
  &lt;/job&gt;
  &lt;job namespace="diamond" name="analyze" version="2.0" id="ID000004"&gt;
    &lt;argument&gt;-a analyze -T60 -i &lt;file name="f.c1" /&gt; &lt;file name="f.c2" /&gt; -o &lt;file name="f.d" /&gt;&lt;/argument&gt;
    &lt;uses name="f.c2" link="input" register="false" transfer="true" /&gt;
    &lt;uses name="f.d" link="output" register="false" transfer="true" /&gt;
    &lt;uses name="f.c1" link="input" register="false" transfer="true" /&gt;
  &lt;/job&gt;

  &lt;!-- part 3: list of control-flow dependencies --&gt;
  &lt;child ref="ID000002"&gt;
    &lt;parent ref="ID000001" /&gt;
  &lt;/child&gt;
  &lt;child ref="ID000003"&gt;
    &lt;parent ref="ID000001" /&gt;
  &lt;/child&gt;
  &lt;child ref="ID000004"&gt;
    &lt;parent ref="ID000002" /&gt;
    &lt;parent ref="ID000003" /&gt;
  &lt;/child&gt;
&lt;/adag&gt;
</pre>
<p>The above workflow defines the black diamond from the abstract
      workflow section of the <a class="link" href="about.php" title="Chapter 1. Introduction">Introduction</a>
      chapter. It will require minimal configuration, because the catalog
      sections include all necessary declarations.</p>
<p>Please note the following:</p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem"><p>The <span class="bold"><strong>file</strong></span> element declares the
          required input file "f.a" in terms of the local machine. Please note
          that if you plan the workflow for a remote site, there has to be some
          way for the file to be staged from the local site to the remote
          site. While Pegasus will augment the workflow with such ancillary
          jobs, the site catalog as well as the local and remote sites have to be
          set up properly. For a locally run workflow you don't need to do
          anything.</p></li>
<li class="listitem"><p>The <span class="bold"><strong>executable</strong></span> elements
          declare the same executable keg that is to be run for each
          logical transformation, here in terms of the site
          <code class="literal">local</code>. To declare it for a remote site, you
          would have to adjust the <span class="emphasis"><em>site</em></span> attribute's value
          to the name of that site. This section also shows that the same
          executable may come in different guises as transformations.</p></li>
<li class="listitem"><p>The <span class="bold"><strong>job</strong></span> elements define the
          workflow's logical constituents, the way to invoke the
          <code class="literal">keg</code> command, where to put filenames on the
          command line, and what files are consumed or produced. In addition to
          the direction of files, further attributes determine whether to
          register the file with a replica catalog and whether to transfer it
          to the output site in case of a product. We are only interested in
          the final data product "f.d" in this workflow, and not any
          intermediary files. Typically, you would also want to register the
          data products in the replica catalog, especially in larger
          scenarios.</p></li>
<li class="listitem"><p>The <span class="bold"><strong>child</strong></span> elements define the
          control flow between the jobs.</p></li>
</ul></div>
</div>
</div>
</div>
<div class="navfooter">
<hr>
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left">
<a accesskey="p" href="static_bp_file.php">Prev</a> </td>
<td width="20%" align="center"> </td>
<td width="40%" align="right"> <a accesskey="n" href="dax_generator_api.php">Next</a>
</td>
</tr>
<tr>
<td width="40%" align="left" valign="top">13.6. Pegasus static.bp File </td>
<td width="20%" align="center"><a accesskey="h" href="index.php">Table of Contents</a></td>
<td width="40%" align="right" valign="top"> 14.2. DAX Generator API</td>
</tr>
</table>
</div>
</div><?php  
            do_html_footer();
        ?>
