<?php  
            include_once( $_SERVER['DOCUMENT_ROOT']."/static/includes/common.inc.php" );
            do_html_header("Documentation");
        ?><div id="content">
<div class="section" title="12.3. Properties">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="properties"></a>12.3. Properties</h2></div></div></div>
<div class="toc"><dl>
<dt><span class="section"><a href="properties.php#local_dir_props">12.3.1. Local Directories Properties</a></span></dt>
<dt><span class="section"><a href="properties.php#site_dir_props">12.3.2. Site Directories Properties</a></span></dt>
<dt><span class="section"><a href="properties.php#schema_props">12.3.3. Schema File Location Properties</a></span></dt>
<dt><span class="section"><a href="properties.php#db_props">12.3.4. Database Drivers For All Relational Catalogs</a></span></dt>
<dt><span class="section"><a href="properties.php#catalog_props">12.3.5. Catalog Related Properties</a></span></dt>
<dt><span class="section"><a href="properties.php#replica_sel_props">12.3.6. Replica Selection Properties</a></span></dt>
<dt><span class="section"><a href="properties.php#site_sel_props">12.3.7. Site Selection Properties</a></span></dt>
<dt><span class="section"><a href="properties.php#data_conf_props">12.3.8. Data Staging Configuration Properties</a></span></dt>
<dt><span class="section"><a href="properties.php#transfer_props">12.3.9. Transfer Configuration Properties</a></span></dt>
<dt><span class="section"><a href="properties.php#monitoring_props">12.3.10. Monitoring Properties</a></span></dt>
<dt><span class="section"><a href="properties.php#job_clustering_props">12.3.11. Job Clustering Properties</a></span></dt>
<dt><span class="section"><a href="properties.php#logging_props">12.3.12. Logging Properties</a></span></dt>
<dt><span class="section"><a href="properties.php#cleanup_props">12.3.13. Cleanup Properties</a></span></dt>
<dt><span class="section"><a href="properties.php#misc__props">12.3.14. Miscellaneous Properties</a></span></dt>
</dl></div>
<p>Properties are primarily used to configure the behavior of the
    Pegasus Workflow Planner at a global level. The properties file is a
    standard Java properties file and follows the same syntax
    conventions.</p>
<p>Please note that the values rely on proper capitalization, unless
    explicitly noted otherwise.</p>
<p>The defaults of some properties depend on the values of other
    properties. As a notation, a property name enclosed in ${ and } refers to
    the value of the named property. For instance, ${pegasus.home} means that
    the value depends on the value of the pegasus.home property plus any noted
    additions. You can use this notation to refer to other properties, though
    the extent of the substitutions is limited. Usually, you want to refer to
    the standard system properties. Nesting is not allowed. Substitutions will
    only be done once.</p>
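<p>As an illustration of the substitution notation, the following sketch
    refers to the standard Java system property user.home from within a
    property value. The catalogs directory and the rc.data file name are
    hypothetical and chosen only for this example:</p>
<pre class="screen">
# hypothetical path below ${user.home}; adjust to your own layout
pegasus.catalog.replica       File
pegasus.catalog.replica.file  ${user.home}/catalogs/rc.data
</pre>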
<p>Properties are read and evaluated in a fixed order of priority.
    Usually one does not need to worry about the priorities.
    However, it is good to know when each property applies
    and how one property is able to override another. The following is a
    mutually exclusive list (highest priority first) of property file
    locations.</p>
<div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem">
           The --conf option to the tools. Almost all of the clients that use properties have a --conf option to specify the property file to pick up. 
        </li>
<li class="listitem">
           The submit-dir/pegasus.xxxxxxx.properties file. All tools that work on the submit directory (i.e., after Pegasus has planned a workflow) pick up the pegasus.xxxxxxx.properties file from the submit directory. The location of the pegasus.xxxxxxx.properties file is read from the braindump file. 
        </li>
<li class="listitem">
           The properties defined in the user property file 

          <span class="emphasis"><em>${user.home}/.pegasusrc</em></span>

           have lowest priority. 
        </li>
</ol></div>
<p>Commandline properties have the highest priority. These override any
    property loaded from a property file. Each commandline property is
    introduced by a -D argument. Note that these arguments are parsed by the
    shell wrapper, and thus the -D arguments must be the first arguments to
    any command. Commandline properties are useful for debugging
    purposes.</p>
<p>From the Pegasus 3.1 release onwards, support has been dropped for the
    following properties, which were used to specify the location of the
    properties file:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem">
           pegasus.properties 
        </li>
<li class="listitem">
           pegasus.user.properties 
        </li>
</ul></div>
<p>The following example provides a sensible set of properties to be
    set in the user property file. These properties use mostly non-default
    settings. It is an example only and will not work unmodified for your
    installation:</p>
<pre class="screen">
pegasus.catalog.replica              File
pegasus.catalog.replica.file         ${pegasus.home}/etc/sample.rc.data
pegasus.catalog.transformation       Text
pegasus.catalog.transformation.file  ${pegasus.home}/etc/sample.tc.text
pegasus.catalog.site.file            ${pegasus.home}/etc/sample.sites.xml
</pre>
<p>If you are in doubt about which properties are actually in effect,
    note that during planning Pegasus writes all properties, after reading
    and prioritizing them, to a file in the submit directory whose name ends
    with the suffix properties.</p>
<div class="section" title="12.3.1. Local Directories Properties">
<div class="titlepage"><div><div><h3 class="title">
<a name="local_dir_props"></a>12.3.1. Local Directories Properties</h3></div></div></div>
<p>This section describes the GNU directory structure conventions.
      GNU distinguishes between architecture independent and thus sharable
      directories, and directories with data specific to a platform, and thus
      often local. It also distinguishes between frequently modified data and
      rarely changing data. These two axes form a space of four distinct
      directories.</p>
<div class="table">
<a name="idp42613056"></a><p class="title"><b>Table 12.10. Local Directories Related Properties</b></p>
<div class="table-contents"><table summary="Local Directories Related Properties" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Key Attributes
                </strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.home.datadir<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>file path<br>
<span class="bold"><strong>Default     :</strong></span> ${pegasus.home}/share</p></div></td>
<td>The datadir directory contains broadly visible and
                possibly exported configuration files that rarely change. This
                directory is currently unused.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.home.sysconfdir<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>file path<br>
<span class="bold"><strong>Default     :</strong></span> ${pegasus.home}/etc</p></div></td>
<td>The system configuration directory contains
                configuration files that are specific to the machine or
                installation, and that rarely change. This is the directory
                where the XML schema definition copies are stored, and where
                the base pool configuration file is stored.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.home.sharedstatedir<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>file path<br>
<span class="bold"><strong>Default     :</strong></span> ${pegasus.home}/com</p></div></td>
<td>Frequently changing files that are broadly visible are
                stored in the shared state directory. This is currently
                unused.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.home.localstatedir<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>file path<br>
<span class="bold"><strong>Default     :</strong></span> ${pegasus.home}/var</p></div></td>
<td>Frequently changing files that are specific to a
                machine and/or installation are stored in the local state
                directory. This is currently unused.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.dir.submit.logs<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>file path<br>
<span class="bold"><strong>Default     :</strong></span> (no default)</p></div></td>
<td>
<p>This property can be used to specify the
                directory where the Condor logs for the workflow should go.
                By default, starting with the 4.2.1 release, Pegasus sets up
                the log in the workflow submit directory. This can create
                problems if the user's submit directory is on
                NFS.</p>
<p>Setting this property ensures that the logs are
                created in a local directory even though the submit directory
                may be on NFS.</p>
</td>
</tr>
</tbody>
</table></div>
</div>
<p><br class="table-break"></p>
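<p>A minimal sketch of redirecting the Condor logs away from an NFS-mounted
      submit directory. The path /var/tmp/pegasus-logs is a hypothetical local
      directory assumed to exist on the submit host:</p>
<pre class="screen">
# hypothetical local (non-NFS) directory for the Condor logs
pegasus.dir.submit.logs  /var/tmp/pegasus-logs
</pre>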
</div>
<div class="section" title="12.3.2. Site Directories Properties">
<div class="titlepage"><div><div><h3 class="title">
<a name="site_dir_props"></a>12.3.2. Site Directories Properties</h3></div></div></div>
<p>The site directory properties modify the behavior of remotely run
      jobs. On rare occasions, they may also pertain to locally run compute
      jobs.</p>
<div class="table">
<a name="idp32010720"></a><p class="title"><b>Table 12.11. Site Directories Related Properties</b></p>
<div class="table-contents"><table summary="Site Directories Related Properties" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Key Attributes </strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.dir.useTimestamp<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.1<br>
<span class="bold"><strong>Type        : </strong></span>Boolean<br>
<span class="bold"><strong>Default     :</strong></span> false</p></div></td>
<td>While creating the submit directory, Pegasus employs a
              run numbering scheme. Users can use this Boolean property to use
              a timestamp based numbering scheme instead of the runxxxx
              scheme.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.dir.exec<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>file path<br>
<span class="bold"><strong>Default     :</strong></span> (no default)</p></div></td>
<td>This property modifies the remote location work directory
              in which all your jobs will run. If the path is relative then it
              is appended to the work directory (associated with the site), as
              specified in the site catalog. If the path is absolute then it
              overrides the work directory specified in the site
              catalog.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.dir.storage.mapper<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 4.3<br>
<span class="bold"><strong>Type        : </strong></span>Enumeration<br>
<span class="bold"><strong>Values      :</strong></span> Flat|Fixed|Hashed|Replica<br>
<span class="bold"><strong>Default     :</strong></span> Flat</p></div></td>
<td>This property determines how the output files
              are mapped on the output site storage location. <p></p>In order
              to preserve backward compatibility, setting the boolean property
              pegasus.dir.storage.deep results in the Hashed output mapper
              being loaded, if no output mapper property is specified.
              <div class="variablelist"><dl>
<dt><span class="term">Flat</span></dt>
<dd>
                       By default, Pegasus will place the output files in the storage directory specified in the site catalog for the output site. 
                    </dd>
<dt><span class="term">Fixed</span></dt>
<dd>
                       Using this mapper, users can specify an externally accessible URL to the storage directory in their properties file. The following property needs to be set. 

                      <pre class="screen">
pegasus.dir.storage.mapper.fixed.url  an externally accessible URL to the
storage directory on the output site
e.g. gsiftp://outputs.isi.edu/shared/outputs
</pre>

                       Note: For hierarchical workflows, the above property needs to be set separately for each DAX job if you want the sub-workflow outputs to go to a different directory. 
                    </dd>
<dt><span class="term">Hashed</span></dt>
<dd>
                       This mapper results in the creation of a deep directory structure on the output site while populating the results. The base directory on the remote end is determined from the site catalog. Depending on the number of files being staged to the remote site, a hashed file structure is created that ensures that only 256 files reside in one directory. To create this directory structure on the storage site, Pegasus relies on the directory creation feature of the GridFTP server, which appeared in Globus 4.0.x. 
                    </dd>
<dt><span class="term">Replica</span></dt>
<dd>
                       This mapper determines the path for an output file on the output site by querying an output replica catalog. The output site is the one that is passed on the command line. The output replica catalog can be configured by specifying properties with the prefix pegasus.dir.storage.mapper.replica. By default, a Regex file based backend is assumed unless overridden. For example: 

                      <pre class="screen">
pegasus.dir.storage.mapper.replica       Regex|File
pegasus.dir.storage.mapper.replica.file  the RC file at the backend to use if using a file based RC
</pre>
</dd>
</dl></div>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.dir.storage.deep<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.1<br>
<span class="bold"><strong>Type        : </strong></span>Boolean<br>
<span class="bold"><strong>Default     :</strong></span> false</p></div></td>
<td>
<p>This Boolean property results in the creation of a
              deep directory structure on the output site, while populating
              the results. The base directory on the remote end is determined
              from the site catalog.</p>
<p>To this base directory, the
              relative submit directory structure (
              $user/$vogroup/$label/runxxxx ) is
              appended.</p>
<p>$storage = $base +
              $relative_submit_directory</p>
<p>This is the base
              directory that is passed to the storage
              mapper.</p>
<p>Note: To preserve backward compatibility,
              setting this property results in the Hashed mapper being loaded
              unless pegasus.dir.storage.mapper is explicitly specified.
              Before 4.3, this property resulted in a hashed directory
              structure.</p>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.dir.create.strategy<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.2<br>
<span class="bold"><strong>Type        : </strong></span>Enumeration<br>
<span class="bold"><strong>Values      :</strong></span> HourGlass|Tentacles|Minimal<span class="bold"><strong><br>
Default     :</strong></span> Minimal</p></div></td>
<td>
<p>If the --randomdir option is given
              to the planner at runtime, the Pegasus planner adds nodes that
              create the random directories at the remote pool sites before
              any jobs are actually run. The modes listed below determine the
              placement of these nodes and their dependencies on the rest of
              the graph.</p>
<div class="variablelist"><dl>
<dt><span class="term">HourGlass</span></dt>
<dd>
                       This mode adds the make directory nodes at the top level of the graph, all of which feed into a single dummy job before branching out to the root nodes of the original (concrete) DAG. This introduces a classic X shape at the top of the graph, hence the name HourGlass. 
                    </dd>
<dt><span class="term">Tentacles</span></dt>
<dd>
                       This option places the jobs creating directories at the top of the graph. However, instead of constricting the graph to an hourglass shape, this mode links each create dir node to all the relevant nodes for which the create dir job is necessary. It looks as if the node spreads its tentacles all around. This puts more load on DAGMan because of the added dependencies, but removes the restriction of the plan progressing only when all the create directory jobs have completed on the remote pools, as is the case in the HourGlass model. 
                    </dd>
<dt><span class="term">Minimal</span></dt>
<dd>
                       This strategy walks the graph in BFS order and updates a bit set associated with each job based on the bit sets of its parent jobs. The bit set indicates whether an edge exists from the create dir job to an ancestor of the node. For a node, the bit set is the union of all the parents' bit sets. The BFS traversal ensures that the bit set of a node is only updated once its parents have been processed. 
                    </dd>
</dl></div>
</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p></p>
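<p>The site directory properties above can be combined as in the following
      sketch. The values are illustrative assumptions only: the URL is the
      example URL used for the Fixed mapper above, and the relative path
      myproject/runs is hypothetical.</p>
<pre class="screen">
# use timestamp based submit directory names instead of runxxxx
pegasus.dir.useTimestamp              true
# hypothetical relative path; appended to the site's work directory
pegasus.dir.exec                      myproject/runs
# place outputs under one externally accessible URL
pegasus.dir.storage.mapper            Fixed
pegasus.dir.storage.mapper.fixed.url  gsiftp://outputs.isi.edu/shared/outputs
# default strategy, stated explicitly
pegasus.dir.create.strategy           Minimal
</pre>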
</div>
<div class="section" title="12.3.3. Schema File Location Properties">
<div class="titlepage"><div><div><h3 class="title">
<a name="schema_props"></a>12.3.3. Schema File Location Properties</h3></div></div></div>
<p>This section defines the location of XML schema files that are
      used to parse the various XML document instances in Pegasus. The
      schema backups in the installed file system permit Pegasus to operate
      without being online.</p>
<div class="table">
<a name="idp40000272"></a><p class="title"><b>Table 12.12. Schema File Location Properties</b></p>
<div class="table-contents"><table summary="Schema File Location Properties" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Key Attributes
                </strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.schema.dax<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>file path<br>
<span class="bold"><strong>Default     :</strong></span> ${pegasus.home.sysconfdir}/dax-3.4.xsd</p></div></td>
<td>This file is a copy of the XML schema that describes
                abstract DAG files that are the result of the abstract
                planning process, and input into any concrete planning.
                Providing a copy of the schema enables the parser to use the
                local copy instead of reaching out to the Internet, and
                obtaining the latest version from the Pegasus website
                dynamically.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.schema.sc<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>file path<br>
<span class="bold"><strong>Default     :</strong></span> ${pegasus.home.sysconfdir}/sc-4.0.xsd</p></div></td>
<td>This file is a copy of the XML schema that describes
                the XML description of the site catalog. Providing a copy of
                the schema enables the parser to use the local copy instead of
                reaching out to the Internet, and obtaining the latest version
                from the GriPhyN website dynamically.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.schema.ivr<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>file path<br>
<span class="bold"><strong>Default     :</strong></span> ${pegasus.home.sysconfdir}/iv-2.0.xsd</p></div></td>
<td>This file is a copy of the XML schema that describes
                invocation record files that are the result of a grid
                launch in a remote or local site. Providing a copy of the
                schema enables the parser to use the local copy instead of
                reaching out to the Internet, and obtaining the latest version
                from the Pegasus website dynamically.</td>
</tr>
</tbody>
</table></div>
</div>
<p><br class="table-break"></p>
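<p>As a sketch, a schema location can be pointed at a different local copy
      by overriding the corresponding key. The directory /opt/pegasus-schemas
      is a hypothetical location assumed to hold copies of the schema files
      named in the defaults above:</p>
<pre class="screen">
# hypothetical directory holding local schema copies
pegasus.schema.dax  /opt/pegasus-schemas/dax-3.4.xsd
pegasus.schema.sc   /opt/pegasus-schemas/sc-4.0.xsd
pegasus.schema.ivr  /opt/pegasus-schemas/iv-2.0.xsd
</pre>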
</div>
<div class="section" title="12.3.4. Database Drivers For All Relational Catalogs">
<div class="titlepage"><div><div><h3 class="title">
<a name="db_props"></a>12.3.4. Database Drivers For All Relational Catalogs</h3></div></div></div>
<p></p>
<div class="table">
<a name="idp43868592"></a><p class="title"><b>Table 12.13. Database Driver Properties</b></p>
<div class="table-contents"><table summary="Database Driver Properties" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Property Key </strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.catalog.*.db.driver<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>Enumeration<br>
<span class="bold"><strong>Values      : </strong></span>MySQL|PostGres|SQLite<span class="bold"><strong><br>
Default     :</strong></span> (no default)</p></div></td>
<td>
<p>The database driver class is dynamically loaded, as
              required by the schema. Currently, only MySQL 5.x, PostgreSQL
              7.3, and SQLite are supported. Their respective JDBC3 drivers
              are provided as part of Pegasus.</p>
<p>The * in
              the property name can be replaced by a catalog name to apply the
              property only for that catalog. Valid catalog names
              are</p>
<pre class="screen">replica
</pre>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.catalog.*.db.url<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>Database URL<br>
<span class="bold"><strong>Default     :</strong></span> (no default)</p></div></td>
<td>Each database has its own string to contact the database
              on a given host, port, and database. Although most driver URLs
              allow arbitrary arguments to be passed, please use the
              pegasus.catalog.[catalog-name].db.* keys or
              pegasus.catalog.*.db.* to preload these arguments. <p></p>THE
              URL IS A MANDATORY PROPERTY FOR ANY DBMS BACKEND.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.catalog.*.db.user<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>String<br>
<span class="bold"><strong>Default     :</strong></span> </p></div></td>
<td>
<p>In order to access a database, you must provide the
              name of your account on the DBMS. This property is
              database-independent. THIS IS A MANDATORY PROPERTY FOR MANY DBMS
              BACKENDS.</p>
<p>The * in the property name can be replaced
              by a catalog name to apply the property only for that catalog.
              Valid catalog names are</p>
<pre class="screen">replica</pre>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.catalog.*.db.password<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>String<br>
<span class="bold"><strong>Default     :</strong></span> (no default)</p></div></td>
<td>
<p>In order to access a database, you may need to provide the
              password of your account on the DBMS. This property is
              database-independent. THIS IS A MANDATORY PROPERTY, IF YOUR DBMS
              BACKEND ACCOUNT REQUIRES A PASSWORD.</p>
<p>The * in the
              property name can be replaced by a catalog name to apply the
              property only for that catalog. Valid catalog names are</p>
<pre class="screen">replica</pre>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.catalog.*.db.*<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>String<br>
<span class="bold"><strong>Default     :</strong></span> (no default)</p></div></td>
<td>
<p></p>
<p>Each database has a multitude of options to
              control its behavior in fine detail. You may want to
              check the JDBC3 documentation of the JDBC driver for your
              database for details. The keys will be passed as part of the
              connect properties by stripping the
              "pegasus.catalog.[catalog-name].db." prefix from them. The
              catalog-name can be replaced by the following values: provenance
              for the Provenance Catalog (PTC), replica for the Replica
              Catalog (RC).</p>
<p>Postgres 7.3 parses the following properties:
              </p>
<pre class="screen">
pegasus.catalog.*.db.user
pegasus.catalog.*.db.password
pegasus.catalog.*.db.PGHOST
pegasus.catalog.*.db.PGPORT
pegasus.catalog.*.db.charSet
pegasus.catalog.*.db.compatible
</pre>
<p>MySQL 5.0 parses the following
              properties:</p>
<pre class="screen">
pegasus.catalog.*.db.user
pegasus.catalog.*.db.password
pegasus.catalog.*.db.databaseName
pegasus.catalog.*.db.serverName
pegasus.catalog.*.db.portNumber
pegasus.catalog.*.db.socketFactory
pegasus.catalog.*.db.strictUpdates
pegasus.catalog.*.db.ignoreNonTxTables
pegasus.catalog.*.db.secondsBeforeRetryMaster
pegasus.catalog.*.db.queriesBeforeRetryMaster
pegasus.catalog.*.db.allowLoadLocalInfile
pegasus.catalog.*.db.continueBatchOnError
pegasus.catalog.*.db.pedantic
pegasus.catalog.*.db.useStreamLengthsInPrepStmts
pegasus.catalog.*.db.useTimezone
pegasus.catalog.*.db.relaxAutoCommit
pegasus.catalog.*.db.paranoid
pegasus.catalog.*.db.autoReconnect
pegasus.catalog.*.db.capitalizeTypeNames
pegasus.catalog.*.db.ultraDevHack
pegasus.catalog.*.db.strictFloatingPoint
pegasus.catalog.*.db.useSSL
pegasus.catalog.*.db.useCompression
pegasus.catalog.*.db.socketTimeout
pegasus.catalog.*.db.maxReconnects
pegasus.catalog.*.db.initialTimeout
pegasus.catalog.*.db.maxRows
pegasus.catalog.*.db.useHostsInPrivileges
pegasus.catalog.*.db.interactiveClient
pegasus.catalog.*.db.useUnicode
pegasus.catalog.*.db.characterEncoding
</pre>
<p>MS SQL Server 2000 supports the following properties
              (keys are case-insensitive, e.g. both "user" and "User" are
              valid):</p>
<pre class="screen">
pegasus.catalog.*.db.User
pegasus.catalog.*.db.Password
pegasus.catalog.*.db.DatabaseName
pegasus.catalog.*.db.ServerName
pegasus.catalog.*.db.HostProcess
pegasus.catalog.*.db.NetAddress
pegasus.catalog.*.db.PortNumber
pegasus.catalog.*.db.ProgramName
pegasus.catalog.*.db.SendStringParametersAsUnicode
pegasus.catalog.*.db.SelectMethod
</pre>
<p>The * in the property name can be replaced by a catalog
              name to apply the property only for that catalog. Valid catalog
              names are</p>
<pre class="screen">replica</pre>
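<p>As an illustration of the prefix stripping described above, the
              following sketch configures the database connection of a MySQL
              backed replica catalog. The host, database name, port and
              credentials are placeholder values chosen for this example, not
              defaults.</p>
<pre class="screen">
pegasus.catalog.replica.db.user         = database-user
pegasus.catalog.replica.db.password     = database-password
pegasus.catalog.replica.db.serverName   = database-host.isi.edu
pegasus.catalog.replica.db.databaseName = database-name
pegasus.catalog.replica.db.portNumber   = 3306
</pre>
<p>With these settings, the driver would receive the keys user,
              password, serverName, databaseName and portNumber as connect
              properties.</p>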
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.catalog.*.timeout<span class="bold"><strong><span class="bold"><strong><br>
                      Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
                      Scope       :</strong></span> Properties<br>
                      <span class="bold"><strong>Since       :</strong></span> 4.5.1<br>
                      <span class="bold"><strong>Type        : </strong></span>Integer<br>
                      <span class="bold"><strong>Default     :</strong></span> (no default)</p></div></td>
<td>
<p>This property sets a busy handler that sleeps for a
              specified amount of time (in seconds) when a table is locked.
              This property only takes effect for SQLite databases.</p>
              <p>The * in the property name can be replaced by a catalog
              name to apply the property only for that catalog. Valid catalog
              names are</p>
<pre class="screen">master
workflow</pre>
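<p>For example, the following setting would apply the busy handler
              only to the workflow catalog. The value 10 is purely
              illustrative.</p>
<pre class="screen">pegasus.catalog.workflow.timeout = 10</pre>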
</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break">
</div>
<div class="section" title="12.3.5. Catalog Related Properties">
<div class="titlepage"><div><div><h3 class="title">
<a name="catalog_props"></a>12.3.5. Catalog Related Properties</h3></div></div></div>
<p></p>
<div class="table">
<a name="idp46601408"></a><p class="title"><b>Table 12.14. Replica Catalog Properties</b></p>
<div class="table-contents"><table summary="Replica Catalog Properties" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Key Attributes</strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong>Property Key: </strong></span>pegasus.catalog.replica<span class="bold"><strong><br>
Profile  Key: </strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Default     :</strong></span> File<br>
</p></div></td>
<td>
<p>Pegasus queries a Replica Catalog to discover the
              physical filenames (PFN) for input files specified in the DAX.
              Pegasus can interface with various types of Replica Catalogs.
              This property specifies which type of Replica Catalog to use
              during the planning process.</p>
<div class="variablelist"><dl>
<dt><span class="term">JDBCRC</span></dt>
<dd>
                       In this mode, Pegasus queries a SQL based replica catalog that is accessed via JDBC. The SQL schemas for this catalog can be found in the $PEGASUS_HOME/sql directory. To use JDBCRC, the user additionally needs to set the following properties:

                      <div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem">pegasus.catalog.replica.db.driver =
                        mysql</li>
<li class="listitem">pegasus.catalog.replica.db.url = jdbc url to
                        database e.g
                        jdbc:mysql://database-host.isi.edu/database-name</li>
<li class="listitem">pegasus.catalog.replica.db.user =
                        database-user</li>
<li class="listitem">pegasus.catalog.replica.db.password =
                        database-password</li>
</ol></div>
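<p>Put together, a properties file that selects JDBCRC might look
                      like the following sketch. The host, database name and
                      credentials are the placeholder values used in the list
                      above and must be replaced with real ones.</p>
<pre class="screen">
pegasus.catalog.replica             = JDBCRC
pegasus.catalog.replica.db.driver   = mysql
pegasus.catalog.replica.db.url      = jdbc:mysql://database-host.isi.edu/database-name
pegasus.catalog.replica.db.user     = database-user
pegasus.catalog.replica.db.password = database-password
</pre>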
</dd>
<dt><span class="term">File</span></dt>
<dd>
<p>In this mode, Pegasus queries a file based replica
                      catalog. It is neither transactionally safe, nor advised
                      to use for production purposes in any way. Multiple
                      concurrent instances <span class="emphasis"><em>will clobber</em></span>
                      each other! The site attribute should be specified
                      whenever possible. The attribute key for the site
                      attribute is "site".</p>
<p>The LFN may or may not be quoted. If it contains
                      linear whitespace, quotes, a backslash or an equality
                      sign, it must be quoted and escaped. The same applies to
                      the PFN. The attribute key-value pairs are separated by an
                      equality sign without any whitespace. The value may be
                      quoted; the quoting rules for the LFN apply to it as
                      well.</p>
<pre class="screen">
LFN PFN
LFN PFN a=b [..]
LFN PFN a="b" [..]
"LFN w/LWS" "PFN w/LWS" [..]
</pre>
<p>To use File, the user additionally needs to
                      set the pegasus.catalog.replica.file property to
                      the path to the file based RC.</p>
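<p>A minimal sketch of a File based setup, assuming the catalog
                      lives at the hypothetical path /path/to/rc.data:</p>
<pre class="screen">
pegasus.catalog.replica      = File
pegasus.catalog.replica.file = /path/to/rc.data
</pre>
<p>The catalog file itself could then contain entries such as the
                      following. The file names and the site name are made up
                      for illustration; the second entry is quoted because its
                      LFN and PFN contain whitespace.</p>
<pre class="screen">
f.a file:///Vol/input/f.a site="local"
"f b" "file:///Vol/input/f b" site="local"
</pre>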
</dd>
<dt><span class="term">Regex</span></dt>
<dd>
<p>In this mode, Pegasus queries a file based replica
                      catalog. It is neither transactionally safe, nor advised
                      to use for production purposes in any way. Multiple
                      concurrent accesses to the file will end up clobbering the
                      contents of the file. The site attribute should be
                      specified whenever possible. The attribute key for the
                      site attribute is "site".</p>
<p>The LFN may or may not be quoted. If it contains
                      linear whitespace, quotes, a backslash or an equality
                      sign, it must be quoted and escaped. The same applies to
                      the PFN. The attribute key-value pairs are separated by an
                      equality sign without any whitespace. The value may be
                      quoted; the quoting rules for the LFN apply to it as
                      well.</p>
<p>In addition, users can specify regular expression
                      based LFNs. A regular expression based entry should be
                      qualified with an attribute named 'regex'. When set to
                      true, the regex attribute identifies the catalog entry as a
                      regular expression based entry. Regular expressions
                      should follow Java regular expression syntax.</p>
<p>For example, consider a replica catalog as shown
                      below.</p>
<p>Entry 1 refers to an entry which does not use a
                      regular expression. This entry would only match a file
                      named 'f.a', and nothing else. Entry 2 refers to an
                      entry which uses a regular expression. In this entry, f.a
                      matches files named f[any-character]a, i.e.
                      faa, f.a, f0a, etc.</p>
<pre class="screen">
f.a file:///Vol/input/f.a site="local"
f.a file:///Vol/input/f.a site="local" regex="true"
</pre>
<p>Regular expression based entries also support
                      substitutions. For example, consider the regular
                      expression based entry shown below.</p>
<p>Entry 3 will match files named alpha.csv,
                      alpha.txt or alpha.xml. In addition, values matched in the
                      expression can be used to generate a PFN.</p>
<p>For the entry below, if the file being looked up is
                      alpha.csv, the PFN for the file would be generated as
                      file:///Vol/input/csv/alpha.csv. Similarly, if
                      the file being looked up was alpha.xml, the PFN for the
                      file would be generated as
                      file:///Vol/input/xml/alpha.xml, i.e. the
                      sections [0] and [1] will be replaced. Section [0] refers to
                      the entire string, i.e. alpha.csv. Section [1] refers to
                      a partial match in the input, i.e. csv, txt or xml.
                      Users can utilize as many sections as they wish.</p>
<pre class="screen">
alpha\.(csv|txt|xml) file:///Vol/input/[1]/[0] site="local" regex="true"
</pre>
<p>To use Regex, the user additionally needs to
                      set the pegasus.catalog.replica.file property to
                      the path to the file based RC.</p>
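<p>As a worked illustration of the substitution rule, the table
                      below traces a few lookups against the entry
                      alpha\.(csv|txt|xml) with the PFN template
                      file:///Vol/input/[1]/[0]. The LFN alpha.dat is a
                      hypothetical name added here to show a lookup that the
                      expression does not match.</p>
<pre class="screen">
LFN looked up   [0]         [1]   generated PFN
alpha.csv       alpha.csv   csv   file:///Vol/input/csv/alpha.csv
alpha.txt       alpha.txt   txt   file:///Vol/input/txt/alpha.txt
alpha.xml       alpha.xml   xml   file:///Vol/input/xml/alpha.xml
alpha.dat       (no match)
</pre>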
</dd>
<dt><span class="term">Directory</span></dt>
<dd>
<p>In this mode, Pegasus does a directory listing on
                      an input directory to create the LFN to PFN mappings.
                      The directory listing is performed recursively,
                      resulting in deep LFN mappings. For example, if an input
                      directory $input is specified with the following
                      structure </p>
<pre class="screen">
$input
$input/f.1
$input/f.2
$input/D1
$input/D1/f.3
</pre>
<p>Pegasus will create the following LFN PFN mappings
                      internally </p>
<pre class="screen">
f.1 file://$input/f.1  site="local"
f.2 file://$input/f.2  site="local"
D1/f.3 file://$input/D1/f.3 site="local"
</pre>
<p>If you don't want the deep LFNs to be created,
                      you can set
                      pegasus.catalog.replica.directory.flat.lfn to true. In
                      that case, for the previous example, Pegasus will create
                      the following LFN PFN mappings internally. </p>
<pre class="screen">
f.1 file://$input/f.1  site="local"
f.2 file://$input/f.2  site="local"
f.3 file://$input/D1/f.3 site="local"
</pre>
<p>pegasus-plan has --input-dir option that can be
                      used to specify an input directory.</p>
<p>Users can optionally specify additional properties
                      to configure the behavior of this
                      implementation.</p>
<p>Set pegasus.catalog.replica.directory.site to specify
                      a site attribute other than local to associate with the
                      mappings.</p>
<p>Set pegasus.catalog.replica.directory.url.prefix to
                      associate a URL prefix with the PFNs constructed. If not
                      specified, the URL prefix defaults to file://</p>
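<p>As a sketch of how these options combine, the following
                      properties select the Directory implementation, flatten
                      the LFNs and associate the mappings with a site named
                      isi. The site name is only an example.</p>
<pre class="screen">
pegasus.catalog.replica                    = Directory
pegasus.catalog.replica.directory.flat.lfn = true
pegasus.catalog.replica.directory.site     = isi
</pre>
<p>For the $input directory shown earlier, and with the URL prefix
                      left at its default, the mappings created internally
                      would then be expected to look like this:</p>
<pre class="screen">
f.1 file://$input/f.1  site="isi"
f.2 file://$input/f.2  site="isi"
f.3 file://$input/D1/f.3 site="isi"
</pre>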
</dd>
<dt><span class="term">MRC</span></dt>
<dd>
<p>In this mode, Pegasus queries multiple replica
                      catalogs to discover the file locations on the grid. To
                      use it, set</p>
<pre class="screen">
pegasus.catalog.replica MRC
</pre>
<p>Each associated replica catalog can be configured
                      via properties as follows.</p>
<p>The user associates a variable name, referred to as
                      [value], with each of the catalogs, where [value] is any
                      legal identifier (concretely [A-Za-z][_A-Za-z0-9]*). For
                      each associated replica catalog, the user specifies the
                      following properties.</p>
<pre class="screen">
pegasus.catalog.replica.mrc.[value]       specifies the type of \
                                          replica catalog.
pegasus.catalog.replica.mrc.[value].key   specifies a property name\
                                          key for a particular catalog
</pre>
<pre class="screen">
pegasus.catalog.replica.mrc.directory1 Directory
pegasus.catalog.replica.mrc.directory1.url /input/dir1
pegasus.catalog.replica.mrc.directory2 Directory
pegasus.catalog.replica.mrc.directory2.url /input/dir2
</pre>
<p>In the above example, directory1 and directory2 are
                      arbitrary valid identifier names, and url is the property
                      key that needs to be specified.</p>
</dd>
</dl></div>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong>Property Key:</strong></span></strong></span></strong></span></strong></span><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong> </strong></span></strong></span></strong></span></strong></span>pegasus.catalog.replica.chunk.size<span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Default     :</strong></span> 1000<br>
</p></div></td>
<td><p>The pegasus-rc-client takes an input file
              containing the mappings upon which to work. This property
              determines the number of lines that are read in at a time and
              worked upon together. This allows operations such as
              insert and delete to happen in bulk if the underlying replica
              implementation supports it.</p></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong>Property Key:</strong></span></strong></span></strong></span></strong></span></strong></span></strong></span> </strong></span></strong></span>pegasus.catalog.replica.cache.asrc<span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><br>
Profile Key :</strong></span></strong></span></strong></span></strong></span><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong> </strong></span></strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Default     :</strong></span> false<br>
</p></div></td>
<td>
<p>This Boolean property determines whether to treat
              the cache file specified as a supplemental replica catalog or
              not. Users can specify a comma separated list of cache files on
              the pegasus-plan command line using the --cache option. By
              default, the LFN-&gt;PFN mappings contained in the cache file
              are treated as a cache, i.e. if an entry is found in a cache
              file, the replica catalog is not queried. As a result, only the
              entry specified in the cache file is available for replica
              selection.</p>
<p>Setting this property to true results in the
              cache files being treated as supplemental replica catalogs. The
              mappings found in the replica catalog (as
              specified by pegasus.catalog.replica) are then merged with the ones
              found in the cache files. Thus, mappings for a particular LFN
              found in both the cache and the replica catalog are available
              for replica selection.</p></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong>Property Key:</strong></span></strong></span></strong></span></strong></span></strong></span></strong></span> </strong></span></strong></span>pegasus.catalog.replica.dax.asrc<span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><br>
Profile Key :</strong></span></strong></span></strong></span></strong></span><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong> </strong></span></strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 4.5.2<br>
<span class="bold"><strong>Default     :</strong></span> false<br>
</p></div></td>
<td>
<p>This Boolean property determines whether to treat
              the locations of files recorded in the DAX as a supplemental
              replica catalog or not. By default, the LFN-&gt;PFN mappings
              contained in the DAX file override any specified in a replica
              catalog. As a result, only the entry specified in the DAX
              file is available for replica selection.</p>
<p>Setting this
              property to true results in the locations of files recorded in
              the DAX file being treated as a supplemental replica catalog.
              The mappings found in the replica catalog (as
              specified by pegasus.catalog.replica) are then merged with the ones
              found in the DAX. Thus, mappings for a particular LFN
              found in both the DAX and the replica catalog are available for
              replica selection.</p></td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><div class="table">
<a name="idp45643840"></a><p class="title"><b>Table 12.15. Site Catalog Properties</b></p>
<div class="table-contents"><table summary="Site Catalog Properties" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Key Attributes</strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span></strong></span></strong></span></strong></span></strong></span></strong></span></strong></span></strong></span>pegasus.catalog.site<span class="bold"><strong><br>
Profile  Key: </strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Default     :</strong></span> XML<br>
</p></div></td>
<td>Pegasus supports two different types of site catalogs in
              XML format, conforming to the following schemas: <div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>sc-3.0.xsd
                    http://pegasus.isi.edu/schema/sc-3.0.xsd</p></li>
<li class="listitem"><p>sc-4.0.xsd
                    http://pegasus.isi.edu/schema/sc-4.0.xsd</p></li>
</ul></div>Pegasus is able to auto-detect which schema a
              user site catalog refers to. Hence, this property no longer
              needs to be set.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key:</strong></span> </strong></span>pegasus.catalog.site.file<span class="bold"><strong><span class="bold"><strong><br>
Profile Key : </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Default     :</strong></span> ${pegasus.home.sysconfdir}/sites.xml</p></div></td>
<td>The path to the site catalog file that describes the
              various sites and their layouts to Pegasus.</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><div class="table">
<a name="idp46010928"></a><p class="title"><b>Table 12.16. Transformation Catalog Properties</b></p>
<div class="table-contents"><table summary="Transformation Catalog Properties" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Key Attributes</strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong>Property Key: </strong></span>pegasus.catalog.transformation<span class="bold"><strong><br>
Profile  Key: </strong></span>N/A<span class="bold"><strong><br>
Scope       : </strong></span>Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Default     :</strong></span> Text<br>
</p></div></td>
<td>
<p>The only recommended and supported version of
              Transformation Catalog for Pegasus is Text. For the old File
              based formats, users should use pegasus-tc-converter to convert
              File format to Text Format.</p>
<div class="variablelist"><dl>
<dt><span class="term">Text</span></dt>
<dd>
<p>In this mode, a multiline file based format is
                      understood. The file is read and cached in memory. Any
                      modification, such as adding or deleting an entry, updates
                      the in-memory representation and hence the underlying
                      file. All queries are done against the in-memory
                      representation.</p>
<p>The file sample.tc.text in the etc directory
                      contains an example.</p>
<p>Here is a sample textual format for a transformation
                      catalog containing one transformation on two
                      sites:</p>
<pre class="screen">
tr example::keg:1.0 {
    #specify profiles that apply for all the sites for the transformation
    #in each site entry the profile can be overridden
    profile env "APP_HOME" "/tmp/karan"
    profile env "JAVA_HOME" "/bin/app"
    site isi {
        profile env "me" "with"
        profile condor "more" "test"
        profile env "JAVA_HOME" "/bin/java.1.6"
        pfn "/path/to/keg"
        arch  "x86"
        os    "linux"
        osrelease "fc"
        osversion "4"
        type "INSTALLED"
    }
    site wind {
        profile env "me" "with"
        profile condor "more" "test"
        pfn "/path/to/keg"
        arch  "x86"
        os    "linux"
        osrelease "fc"
        osversion "4"
        type "STAGEABLE"
    }
}
</pre>
</dd>
</dl></div>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span></strong></span>pegasus.catalog.transformation.file<span class="bold"><strong><span class="bold"><strong><br>
Profile Key : </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Default     :</strong></span> ${pegasus.home.sysconfdir}/tc.text </p></div></td>
<td>The path to the transformation catalog file that
              describes the locations of the executables.</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break">
</div>
<div class="section" title="12.3.6. Replica Selection Properties">
<div class="titlepage"><div><div><h3 class="title">
<a name="replica_sel_props"></a>12.3.6. Replica Selection Properties</h3></div></div></div>
<div class="table">
<a name="idp46212800"></a><p class="title"><b>Table 12.17. Replica Selection Properties</b></p>
<div class="table-contents"><table summary="Replica Selection Properties" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Key Attributes</strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong>Property Key: </strong></span>pegasus.selector.replica<span class="bold"><strong><br>
Profile  Key: </strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>String<br>
<span class="bold"><strong>Default     :</strong></span> Default<br>
<span class="bold"><strong>See Also    :</strong></span> pegasus.selector.replica.*.ignore.stagein.sites<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.selector.replica.*.prefer.stagein.sites<br>
</p></div></td>
<td>
<p>Each job in the DAX may be associated with input
                LFNs denoting the files that are required for the job to run.
                To determine the physical replica (PFN) for an LFN, Pegasus
                queries the replica catalog to get all the PFNs (replicas)
                associated with the LFN. Pegasus then calls out to a replica
                selector to select one replica from those
                returned. This property determines the replica selector to use
                for selecting the replicas.</p>
<div class="variablelist"><dl>
<dt><span class="term">Default</span></dt>
<dd>
                         If a PFN is found that is a file URL (starting with file:///) and has a "site" attribute matching the site handle of the site where the compute job is to be run, then that PFN is returned. Otherwise, a random PFN is selected from all the PFNs that have a "site" attribute matching the site handle of the site where the compute job is to be run. Failing that, a random PFN is selected from all the PFNs. 
                      </dd>
<dt><span class="term">Restricted</span></dt>
<dd>
<p>This replica selector allows the user to
                        specify good sites and bad sites for staging in data
                        to a particular compute site. A good site for a
                        compute site X is a preferred site from which
                        replicas should be staged to site X. If more
                        than one good site has a particular replica, then
                        a random site is selected from these preferred
                        sites.</p>
<p>A bad site for a compute site X is a site from
                        which replicas should not be staged. The reasons for
                        not accessing a replica from a bad site can vary from
                        the link being down to the user not having
                        permissions on that site's data.</p>
<p>The good and bad sites are specified by the
                        properties</p>
<pre class="screen">
pegasus.selector.replica.*.prefer.stagein.sites
pegasus.selector.replica.*.ignore.stagein.sites
</pre>
<p>where the * in the property name denotes the
                        name of the compute site. A * in the property key is
                        taken to mean all sites.</p>
<p>The pegasus.selector.replica.*.prefer.stagein.sites
                        property takes precedence over the
                        pegasus.selector.replica.*.ignore.stagein.sites property, i.e.
                        if, for a site X, a site Y is specified in both the
                        ignored and the preferred set, then site Y is treated
                        only as a preferred site for site X.</p>
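<p>The precedence rule can be illustrated with the following
                        sketch, which uses the property keys as they are
                        listed in this table. The site names isi, usc and
                        wind are only examples.</p>
<pre class="screen">
pegasus.selector.replica                          = Restricted
pegasus.selector.replica.isi.prefer.stagein.sites = usc
pegasus.selector.replica.isi.ignore.stagein.sites = usc,wind
</pre>
<p>For compute site isi, site usc appears in both lists and is
                        therefore treated as preferred only, while replicas
                        from site wind are never staged in to isi.</p>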
</dd>
<dt><span class="term">Regex</span></dt>
<dd>
<p>This replica selector allows the
                        user to specify regular expressions that are used to
                        rank the various PFNs returned from the Replica Catalog
                        for a particular LFN. This replica selector selects
                        the highest ranked PFN, i.e. the replica with the lowest
                        rank value.</p>
<p>The regular expressions are assigned different
                        ranks that determine the order in which the
                        expressions are employed. The rank values for the
                        regular expressions can be expressed in user properties
                        using the property</p>
<pre class="screen">
pegasus.selector.replica.regex.rank.[value]   regex-expression
</pre>
<p>The value is an integer that denotes the
                        rank of an expression, with a rank value of 1 being the
                        highest rank.</p>
<p>Please note that before applying any regular
                        expressions to the PFNs, the file URLs that don't
                        match the preferred site are explicitly filtered
                        out.</p>
</dd>
<dt><span class="term">Local</span></dt>
<dd>
                         This replica selector prefers replicas that are on the local host and start with a file: URL scheme. It is useful when users want to stage in files to a remote site from the submit host using the Condor file transfer mechanism. 
                      </dd>
</dl></div>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.selector.replica.*.ignore.stagein.sites<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Default     :</strong></span> (no default)<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.selector.replica<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.selector.replica.*.prefer.stagein.sites</p></div></td>
<td>
<p>A comma separated list of storage sites from
                which to never stage in data to a compute site. The property
                can apply to all or a single compute site, depending on how
                the * in the property name is expanded.</p>
<p>The * in
                the property name means all compute sites unless replaced by a
                site name.</p>
<p>For example, setting
                pegasus.selector.replica.*.ignore.stagein.sites to usc means
                that all replicas from site usc are ignored when staging in to
                any compute site. Setting
                pegasus.selector.replica.isi.ignore.stagein.sites
                to usc means that all replicas from site usc are ignored when
                staging in data to site isi.</p>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.selector.replica.*.prefer.stagein.sites<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Default     :</strong></span> (no default)<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.selector.replica<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.selector.replica.*.ignore.stagein.sites</p></div></td>
<td>
<p>A comma separated list of preferred storage sites
                from which to stage in data to a compute site. The property
                can apply to all or a single compute site, depending on how
                the * in the property name is expanded.</p>
<p>The * in
                the property name means all compute sites unless replaced by a
                site name.</p>
<p>For example, setting
                pegasus.selector.replica.*.prefer.stagein.sites to usc means
                that replicas from site usc are preferred when staging in to
                any compute site. Setting
                pegasus.selector.replica.isi.prefer.stagein.sites
                to usc means that replicas from site usc are preferred when
                staging in data to site isi.</p>
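<p>Because the value is a comma separated list, more than
                one preferred site can be given. The fragment below is a
                sketch only; the site names usc, isi and local are
                hypothetical.</p>
<pre class="screen">
# Illustrative only: prefer replicas at usc or isi for every compute site
pegasus.selector.replica.*.prefer.stagein.sites    usc,isi
# Illustrative only: prefer replicas at local when staging in to site isi
pegasus.selector.replica.isi.prefer.stagein.sites  local
</pre>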
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.selector.replica.regex.rank.[value]<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.3.0<br>
<span class="bold"><strong>Default     :</strong></span> (no default)<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.selector.replica</p></div></td>
<td>
<p>Specifies the regular expressions to be applied to
                the PFNs returned for a particular LFN. For information on
                how to construct a regular expression, refer to</p>
<pre class="screen">
http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html
</pre>
<p>The [value] in the property key must
                be replaced by an integer that designates the rank of
                the regular expression applied by the Regex replica
                selector.</p>
<p>The example below indicates a preference
                for file URLs over URLs that refer to the GridFTP server at
                example.isi.edu.</p>
<pre class="screen">
pegasus.selector.replica.regex.rank.1 file://.*
pegasus.selector.replica.regex.rank.2 gsiftp://example\.isi\.edu.*
</pre>
</td>
</tr>
</tbody>
</table></div>
</div>
<p><br class="table-break"></p>
</div>
<div class="section" title="12.3.7. Site Selection Properties">
<div class="titlepage"><div><div><h3 class="title">
<a name="site_sel_props"></a>12.3.7. Site Selection Properties</h3></div></div></div>
<div class="table">
<a name="idp37240224"></a><p class="title"><b>Table 12.18. Site Selection Properties</b></p>
<div class="table-contents"><table summary="Site Selection Properties" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Key Attributes</strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong>Property Key: </strong></span>pegasus.selector.site<span class="bold"><strong><br>
Profile  Key: </strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>String<br>
<span class="bold"><strong>Default     :</strong></span> Random<br>
<span class="bold"><strong>See Also    :</strong></span> pegasus.selector.site.path<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.selector.site.timeout<br>
<span class="bold"><strong>See Also    :</strong></span> pegasus.selector.site.keep.tmp<span class="bold"><strong><br>
See Also    : </strong></span>pegasus.selector.site.env.*</p></div></td>
<td>
<p>Site selection in Pegasus can be based on
                any of the following strategies.</p>
<div class="variablelist"><dl>
<dt><span class="term">Random</span></dt>
<dd>
                         In this mode, the jobs will be randomly distributed among the sites that can execute them. 
                      </dd>
<dt><span class="term">RoundRobin</span></dt>
<dd>
                         In this mode, the jobs will be assigned in a round robin manner amongst the sites that can execute them. Since not every site can execute every type of job, the round robin scheduling is done per level on a sorted list. The sorting is on the basis of the number of jobs a particular site has been assigned in that level so far. If a job cannot be run on the first site in the queue (due to no matching entry in the transformation catalog for the transformation referred to by the job), it goes to the next one and so on. This implementation defaults to classic round robin in the case where all the jobs in the workflow can run on all the sites. 
                      </dd>
<dt><span class="term">NonJavaCallout</span></dt>
<dd>
<p>In this mode, Pegasus will call out to an
                        external site selector. A temporary file
                        is prepared containing the job information and is
                        passed to the site selector as an argument when
                        invoking it. The path to the site selector is
                        specified by setting the property
                        pegasus.selector.site.path. The environment variables
                        that need to be set to run the site selector can be
                        specified using the properties with a
                        pegasus.selector.site.env. prefix. The temporary file
                        contains information about the job that needs to be
                        scheduled. It contains key value pairs, with each
                        pair on a new line and the key and value separated
                        by a =.</p>
<p>The following pairs are currently generated for
                        the site selector temporary file in
                        the NonJavaCallout mode.</p>
<div class="informaltable"><table border="0">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td align="left">version</td>
<td align="left">is the version of the site selector
                                  API, currently 2.0.</td>
</tr>
<tr>
<td align="left">transformation</td>
<td align="left">is the fully-qualified definition identifier
                                  for the transformation (TR)
                                  namespace::name:version.</td>
</tr>
<tr>
<td align="left">derivation</td>
<td align="left">is the fully-qualified definition identifier
                                  for the derivation (DV),
                                  namespace::name:version.</td>
</tr>
<tr>
<td align="left">job.level</td>
<td align="left">is the job's depth in the tree of the
                                  workflow DAG.</td>
</tr>
<tr>
<td align="left">job.id</td>
<td align="left">is the job's ID, as used in the DAX
                                  file.</td>
</tr>
<tr>
<td align="left">resource.id</td>
<td align="left">is a site handle, followed by whitespace,
                                  followed by a gridftp server. Typically, each
                                  gridftp server is enumerated once, so you may have
                                  multiple occurrences of the same site. There can be
                                  multiple occurrences of this key.</td>
</tr>
<tr>
<td align="left">input.lfn</td>
<td align="left">is an input LFN, optionally followed by
                                  whitespace and the file size. There can be multiple
                                  occurrences of this key, one for each input LFN
                                  required by the job.</td>
</tr>
<tr>
<td align="left">wf.name</td>
<td align="left">is the label of the DAX, as found in the DAX's root
                                  element.</td>
</tr>
<tr>
<td align="left">wf.index</td>
<td align="left">is the DAX index, which is
                                  incremented for each partition in case of deferred
                                  planning.</td>
</tr>
<tr>
<td align="left">wf.time</td>
<td align="left">is the mtime of the workflow.</td>
</tr>
<tr>
<td align="left">wf.manager</td>
<td align="left">is the name of the workflow manager being
                                  used, e.g. condor.</td>
</tr>
<tr>
<td align="left">vo.name</td>
<td align="left">is the name of the virtual organization that
                                  is running this workflow. It is currently set to
                                  NONE</td>
</tr>
<tr>
<td align="left">vo.group</td>
<td align="left">unused at present and is set to NONE.</td>
</tr>
</tbody>
</table></div>
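<p>As an illustration of the format described above, a
                        temporary file for one job might look roughly like
                        the following. Every value shown is hypothetical
                        (transformation and derivation names, job ID, site
                        handles, server URLs, file names and sizes), the
                        wf.time key is omitted because its exact format is
                        not described here, and the absence of spaces around
                        the = is an assumption.</p>
<pre class="screen">
version=2.0
transformation=example::findrange:1.0
derivation=example::right:1.0
job.level=2
job.id=ID000003
resource.id=isi gsiftp://gridftp.isi.example.org
resource.id=usc gsiftp://gridftp.usc.example.org
input.lfn=f.b2 1024
input.lfn=f.c1
wf.name=diamond
wf.index=0
wf.manager=condor
vo.name=NONE
vo.group=NONE
</pre>
<p>Note that resource.id and input.lfn appear more than
                        once, so a site selector that reads this file should
                        collect repeated keys into lists rather than keep
                        only the last value.</p>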
</dd>
<dt><span class="term">Group</span></dt>
<dd>
                         In this mode, a group of jobs will be assigned to the same site that can execute them. The use of the PEGASUS profile key group in the DAX associates a job with a particular group. The jobs that do not have the profile key associated with them will be put in the default group. The jobs in the default group are handed over to the "Random" Site Selector for scheduling. 
                      </dd>
<dt><span class="term">Heft</span></dt>
<dd>
<p>In this mode, a version of the HEFT processor
                        scheduling algorithm is used to schedule jobs in the
                        workflow to multiple grid sites. The implementation
                        assumes default data communication costs when jobs are
                        not scheduled on to the same site. Later on this may
                        be made more configurable.</p>
<p>The runtime for the jobs is specified in the
                        transformation catalog by associating the pegasus
                        profile key runtime with the entries.</p>
<p>The number of processors in a site is picked up
                        from the attribute idle-nodes associated with the
                        vanilla jobmanager of the site in the site
                        catalog.</p>
</dd>
</dl></div>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.selector.site.path<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Default     :</strong></span> (no default)</p></div></td>
<td>When an external site selector is called using the
                NonJavaCallout mode, this property specifies the path where
                the site selector is installed. It does not need to be set
                when other strategies are used.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.selector.site.env.*<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Default     :</strong></span> (no default)</p></div></td>
<td>
<p>The environment variables that need to be set
                when calling out to the site selector. These are the variables
                that the user would set if running the site selector on the
                command line. The name of the environment variable is obtained
                by stripping the prefix "pegasus.selector.site.env."
                from the property key. The value of the environment variable
                is the value of the property.</p>
<p>For example,
                pegasus.selector.site.env.LD_LIBRARY_PATH /globus/lib would
                lead to the site selector being called with
                LD_LIBRARY_PATH set to /globus/lib.</p>
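<p>Putting the related properties together, a minimal
                sketch of a NonJavaCallout setup could look like this. The
                selector path is a hypothetical placeholder, and the
                environment variable is the one from the example above.</p>
<pre class="screen">
# Illustrative only: the path below is a placeholder
pegasus.selector.site                      NonJavaCallout
pegasus.selector.site.path                 /path/to/site-selector
pegasus.selector.site.env.LD_LIBRARY_PATH  /globus/lib
</pre>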
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.selector.site.timeout<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.3.0<br>
<span class="bold"><strong>Default     :</strong></span> 60<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.selector.site</p></div></td>
<td>It sets the number of seconds Pegasus waits to hear
                back from an external site selector using the NonJavaCallout
                interface before timing out.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.selector.site.keep.tmp<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.3.0<br>
<span class="bold"><strong>Values</strong></span>      : onerror|always|never<br>
<span class="bold"><strong>Default     :</strong></span> onerror<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.selector.site</p></div></td>
<td>
<p>It determines whether Pegasus deletes the
                temporary input files that are generated in the temp directory
                or not. These temporary input files are passed as input to the
                external site selectors.</p>
<p>A temporary input file is
                created for each job that needs to be scheduled.</p>
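<p>When debugging an external site selector, it can help
                to keep its input files and allow it more time to respond.
                The fragment below is a sketch: the timeout of 120 seconds
                is an arbitrary illustrative value, and it assumes that
                always means the temporary files are kept in every case,
                as the property name suggests.</p>
<pre class="screen">
# Illustrative only: assumed debugging settings
pegasus.selector.site.timeout   120
pegasus.selector.site.keep.tmp  always
</pre>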
</td>
</tr>
</tbody>
</table></div>
</div>
<p><br class="table-break"></p>
</div>
<div class="section" title="12.3.8. Data Staging Configuration Properties">
<div class="titlepage"><div><div><h3 class="title">
<a name="data_conf_props"></a>12.3.8. Data Staging Configuration Properties</h3></div></div></div>
<div class="table">
<a name="idp43205632"></a><p class="title"><b>Table 12.19. Data Configuration Properties</b></p>
<div class="table-contents"><table summary="Data Configuration Properties" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Key Attributes</strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.data.configuration<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>data.configuration<span class="bold"><strong><br>
Scope       :</strong></span> Properties, Site Catalog<br>
<span class="bold"><strong>Since       :</strong></span> 4.0.0<br>
<span class="bold"><strong>Values</strong></span>      : sharedfs|nonsharedfs|condorio<br>
<span class="bold"><strong>Default     :</strong></span> sharedfs<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.transfer.bypass.input.staging</p></div></td>
<td>
<p>This property sets up Pegasus to run in different
                environments. For Pegasus 4.5.0 and above, users can associate
                the pegasus profile data.configuration with the sites in their
                site catalog to run multisite workflows, with each site having
                a different data configuration.</p>
<div class="variablelist"><dl>
<dt><span class="term">sharedfs</span></dt>
<dd>
                         If this is set, Pegasus will be set up to execute jobs on the shared filesystem on the execution site. This assumes that the head node of a cluster and the worker nodes share a filesystem. The staging site in this case is the same as the execution site. Pegasus adds a create dir job to the executable workflow that creates a workflow specific directory on the shared filesystem. The data transfer jobs in the executable workflow ( stage_in_ , stage_inter_ , stage_out_ ) transfer the data to this directory. The compute jobs in the executable workflow are launched in the directory on the shared filesystem. 
                      </dd>
<dt><span class="term">condorio</span></dt>
<dd>
                         If this is set, Pegasus will be set up to run jobs in a pure condor pool, with the nodes not sharing a filesystem. Data is staged to the compute nodes from the submit host using Condor File IO. The planner is automatically set up to use the submit host ( site local ) as the staging site. All the auxiliary jobs added by the planner to the executable workflow ( create dir, data stage-in and stage-out, cleanup ) refer to the workflow specific directory on the local site. The data transfer jobs in the executable workflow ( stage_in_ , stage_inter_ , stage_out_ ) transfer the data to this directory. When the compute jobs start, the input data for each job is shipped from the workflow specific directory on the submit host to the compute/worker node using Condor file IO. The output data for each job is similarly shipped back to the submit host from the compute/worker node. This setup is particularly helpful when running workflows in a cloud environment, where setting up a shared filesystem across the VMs may be tricky. On loading this property, internally the following properties are set 

                        <pre class="screen">pegasus.gridstart                    PegasusLite
pegasus.transfer.worker.package      true
</pre>
</dd>
<dt><span class="term">nonsharedfs</span></dt>
<dd>
                         If this is set, Pegasus will be set up to execute jobs on an execution site without relying on a shared filesystem between the head node and the worker nodes. You can specify a staging site ( using the --staging-site option to pegasus-plan ) to indicate the site to use as a central storage location for a workflow. The staging site is independent of the execution sites on which a workflow executes. All the auxiliary jobs added by the planner to the executable workflow ( create dir, data stage-in and stage-out, cleanup ) refer to the workflow specific directory on the staging site. The data transfer jobs in the executable workflow ( stage_in_ , stage_inter_ , stage_out_ ) transfer the data to this directory. When the compute jobs start, the input data for each job is shipped from the workflow specific directory on the staging site to the compute/worker node using pegasus-transfer. The output data for each job is similarly shipped back to the staging site from the compute/worker node. The protocols supported at this time are SRM, GridFTP, iRods and S3. This setup is particularly helpful when running workflows on OSG, where most of the execution sites don't have enough data storage. Only a few sites have large amounts of data storage exposed that can be used to place data during a workflow run. This setup is also helpful when running workflows in a cloud environment, where setting up a shared filesystem across the VMs may be tricky. On loading this property, internally the following properties are set 

                        <pre class="screen">pegasus.gridstart                    PegasusLite
pegasus.transfer.worker.package      true
</pre>
</dd>
</dl></div>
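<p>As a minimal sketch, selecting one of the
                configurations above in the properties file takes a single
                line. The properties listed above as set internally do not
                need to be repeated by the user.</p>
<pre class="screen">
# Illustrative only: run without a shared filesystem, using a staging site
pegasus.data.configuration  nonsharedfs
</pre>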
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.transfer.bypass.input.staging<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 4.3.0<br>
<span class="bold"><strong>Type        : </strong></span>Boolean<br>
<span class="bold"><strong>Default     :</strong></span> false<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.data.configuration</p></div></td>
<td>
<p>When executing in a non shared filesystem setup,
                i.e. data configuration set to nonsharedfs or condorio, Pegasus
                by default stages the input files through the staging site,
                i.e. the stage-in job stages in data from the input site to
                the staging site. The PegasusLite jobs that start up on the
                worker nodes then pull the input data from the staging site
                for each job.</p>
<p>This property can be used to set up the
                PegasusLite jobs to pull input data directly from the input
                site without going through the staging server. This is based
                on the assumption that the worker nodes can access the input
                site. If users set this to true, they should be aware that
                access to the input site is no longer throttled ( as in the
                case of stage-in jobs ). If a large number of compute jobs
                start at the same time in a workflow, the input server will
                see a connection from each job.</p>
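<p>A sketch of enabling the bypass is shown below. The
                choice of condorio here is only an example; per the
                description above, the property applies to both non shared
                filesystem configurations.</p>
<pre class="screen">
# Illustrative only: pull inputs directly from the input site
pegasus.data.configuration             condorio
pegasus.transfer.bypass.input.staging  true
</pre>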
</td>
</tr>
</tbody>
</table></div>
</div>
<p><br class="table-break"></p>
</div>
<div class="section" title="12.3.9. Transfer Configuration Properties">
<div class="titlepage"><div><div><h3 class="title">
<a name="transfer_props"></a>12.3.9. Transfer Configuration Properties</h3></div></div></div>
<div class="table">
<a name="idp40919824"></a><p class="title"><b>Table 12.20. Transfer Configuration Properties</b></p>
<div class="table-contents"><table summary="Transfer Configuration Properties" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Key Attributes</strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.transfer.*.impl<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0.0<br>
<span class="bold"><strong>Values</strong></span>      : Transfer|GUC<br>
<span class="bold"><strong>Default     :</strong></span> Transfer<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.transfer.refiner</p></div></td>
<td><table width="100%" border="0">
<colgroup><col></colgroup>
<tbody><tr><td>
<p>Each compute job usually has data products
                      that are required to be staged in to the execution site,
                      materialized data products staged out to a final resting
                      place, or staged to another job running at a different
                      site. This property determines the underlying grid
                      transfer tool that is used to manage the
                      transfers.</p>
<p>The * in the property name can be
                      replaced to achieve finer grained control to dictate
                      what type of transfer jobs need to be managed with which
                      grid transfer tool.</p>
<p>Usually, the arguments
                      with which the client is invoked can be specified by
                      either of the following:</p>
<pre class="screen">
- the property pegasus.transfer.arguments
- associating the PEGASUS profile key transfer.arguments
</pre>
<p>The table below illustrates all the possible variations
                      of the property.</p>
<div class="informaltable"><table border="0">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td align="left">Property Name</td>
<td align="left">Applies to</td>
</tr>
<tr>
<td align="left">pegasus.transfer.stagein.impl</td>
<td align="left">the stage in transfer jobs</td>
</tr>
<tr>
<td align="left">pegasus.transfer.stageout.impl</td>
<td align="left">the stage out transfer jobs</td>
</tr>
<tr>
<td align="left">pegasus.transfer.inter.impl</td>
<td align="left">the inter site transfer jobs</td>
</tr>
<tr>
<td align="left">pegasus.transfer.setup.impl</td>
<td align="left">the setup transfer job</td>
</tr>
<tr>
<td align="left">pegasus.transfer.*.impl</td>
<td align="left">all types of transfer jobs</td>
</tr>
</tbody>
</table></div>
<p>Note: Since version 2.2.0
                      the worker package is staged automatically during
                      staging of executables to the remote site. This is
                      achieved by adding a setup transfer job to the workflow.
                      The setup transfer job by default uses GUC to stage the
                      data. The implementation to use can be configured by
                      setting the property</p>
<pre class="screen">pegasus.transfer.setup.impl </pre>
<p>However, if you have pegasus.transfer.*.impl set in your
                      properties file, then you need to set
                      pegasus.transfer.setup.impl to GUC.</p>
<p>The
                      various grid transfer tools that can be used to manage
                      data transfers are explained
                      below</p>
<div class="variablelist"><dl>
<dt><span class="term">Transfer</span></dt>
<dd>
<p>This results in pegasus-transfer being used
                              for transferring files. It is a Python based
                              wrapper around various transfer clients like
                              globus-url-copy, lcg-copy, wget, cp and ln.
                              pegasus-transfer looks at the source and destination
                              URLs and figures out automatically which underlying
                              client to use. pegasus-transfer is distributed with
                              Pegasus and can be found at
                              $PEGASUS_HOME/bin/pegasus-transfer.</p>
<p>For remote sites, Pegasus constructs the
                              default path to pegasus-transfer on the basis of the
                              PEGASUS_HOME env profile specified in the site
                              catalog. To specify a different path to the
                              pegasus-transfer client, users can add an entry
                              into the transformation catalog with the fully
                              qualified logical name pegasus::pegasus-transfer.</p>
</dd>
<dt><span class="term">GUC</span></dt>
<dd>
                               This refers to the new guc client that does multiple file transfers per invocation. The globus-url-copy client distributed with Globus 4.x is compatible with this mode. 
                            </dd>
</dl></div>
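<p>The note about the setup transfer job can be
                      sketched as a properties fragment. It only restates
                      the combination described above: a wildcard setting
                      for all transfer jobs, with the setup job explicitly
                      set to GUC.</p>
<pre class="screen">
# Illustrative only: use pegasus-transfer for all transfer jobs
pegasus.transfer.*.impl      Transfer
# Needed once the wildcard property is set, per the note above
pegasus.transfer.setup.impl  GUC
</pre>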
</td></tr></tbody>
</table></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.transfer.arguments<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>transfer.arguments<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0.0<br>
<span class="bold"><strong>Type        : </strong></span>String<span class="bold"><strong><br>
Default     :</strong></span> (no default)<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.transfer.lite.arguments</p></div></td>
<td><p>This determines the extra arguments with which
                the transfer implementation is invoked. The transfer
                executable that is invoked depends on the transfer mode
                that has been selected. The property can be overridden by
                associating the pegasus profile key transfer.arguments either
                with the site in the site catalog or the corresponding
                transfer executable in the transformation
                catalog.</p></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.transfer.threads<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>transfer.threads<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 4.4.0<br>
<span class="bold"><strong>Type        : </strong></span>Integer<span class="bold"><strong><br>
Default     :</strong></span> 2</p></div></td>
<td><p>This property sets the number of threads
                pegasus-transfer uses to transfer the files. This property
                applies to the separate data transfer nodes that are added by
                Pegasus to the executable workflow. The property can be
                overridden by associating the pegasus profile key
                transfer.threads either with the site in the site catalog or
                the corresponding transfer executable in the transformation
                catalog.</p></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.transfer.lite.arguments<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>transfer.lite.arguments<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 4.4.0<br>
<span class="bold"><strong>Type        : </strong></span>String<span class="bold"><strong><br>
Default     :</strong></span> (no default)<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.transfer.arguments</p></div></td>
<td>This determines the extra arguments with which the
                PegasusLite transfer implementation is invoked. The transfer
                executable that is invoked depends on the PegasusLite
                transfer implementation that has been selected.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.transfer.worker.package<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0.0<br>
<span class="bold"><strong>Type        : </strong></span>Boolean<span class="bold"><strong><br>
Default     :</strong></span> false<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.data.configuration</p></div></td>
<td>
<p>By default, Pegasus relies on the worker package
                being installed in a directory accessible to the worker nodes
                on the remote sites. Pegasus uses the value of the PEGASUS_HOME
                environment profile in the site catalog for the remote sites
                to construct paths to pegasus auxiliary executables like
                kickstart, pegasus-transfer, seqexec etc.</p>
<p>If the
                Pegasus worker package is not installed on the remote sites,
                users can set this property to true to get Pegasus to deploy
                the worker package on the nodes.</p>
<p>In the case of
                a sharedfs setup, the worker package is deployed on the shared
                scratch directory for the workflow, which is accessible to all
                the compute nodes of the remote sites.</p>
<p>When
                running in nonsharedfs environments, the worker package is
                first brought to the submit directory and then transferred to
                the worker node filesystem using Condor file
                IO.</p>
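<p>A sketch of asking Pegasus to deploy the worker
                package on a shared filesystem site is shown below. Both
                lines use only properties and values documented in this
                section.</p>
<pre class="screen">
# Illustrative only: deploy the worker package to the shared scratch directory
pegasus.data.configuration       sharedfs
pegasus.transfer.worker.package  true
</pre>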
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.transfer.links<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0.0<br>
<span class="bold"><strong>Type        : </strong></span>Boolean<span class="bold"><strong><br>
Default     :</strong></span> false<span class="bold"><strong><br>
</strong></span></p></div></td>
<td>This property takes effect when it is set and the transfer
                implementation is set to Transfer, i.e. the transfer
                executable distributed with Pegasus is used. When this
                property is set, if Pegasus, while fetching data from the
                Replica Catalog, sees a "site" attribute associated with the
                PFN that matches the execution site to which the data has to
                be transferred, Pegasus replaces the URL returned by the
                Replica Catalog with a file based URL. This is based on the
                assumption that if the "site" attributes match, the
                filesystems are visible to the remote execution directory
                where input data resides. On seeing both the source and
                destination URLs as file based URLs, the transfer executable
                spawns a job that creates a symbolic link by calling ln -s on
                the remote site.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.transfer.*.remote.sites<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0.0<br>
<span class="bold"><strong>Type        : </strong></span>comma separated list of sites<span class="bold"><strong><br>
Default     :</strong></span> (no default)<span class="bold"><strong><br>
</strong></span></p></div></td>
<td>
<p>By default, Pegasus looks at the source and
                destination URLs to determine whether the associated
                transfer job runs on the submit host or on the head node of a
                remote site, with preference given to running the transfer
                job on the submit host.</p>
<p>Pegasus will run transfer jobs on
                the remote sites in the following cases:</p>
<pre class="screen">
-  the file server for the compute site is file based, i.e. has the URL prefix file://
-  symlink jobs need to be added, which requires the symlink transfer jobs to
be run remotely.
</pre>
<p>This property can be used to change the default
                behaviour of Pegasus and force Pegasus to run the different types
                of transfer jobs on the remote site for the sites
                specified.</p>
<p>The table below illustrates all the possible
                variations of the property.</p>
<div class="informaltable"><table border="0">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td align="left">Property Name</td>
<td align="left">Applies to</td>
</tr>
<tr>
<td align="left">pegasus.transfer.stagein.remote.sites</td>
<td align="left">the stage in transfer jobs</td>
</tr>
<tr>
<td align="left">pegasus.transfer.stageout.remote.sites</td>
<td align="left">the stage out transfer jobs</td>
</tr>
<tr>
<td align="left">pegasus.transfer.inter.remote.sites</td>
<td align="left">the inter site transfer jobs</td>
</tr>
<tr>
<td align="left">pegasus.transfer.*.remote.sites</td>
<td align="left">all types of transfer jobs</td>
</tr>
<tr>
<td align="left"> </td>
<td class="auto-generated"> </td>
</tr>
</tbody>
</table></div>
<p>In addition, * can be specified
                as the property value to designate that the property applies
                to all sites.</p>
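<p>As a minimal sketch of how these variations might appear in a
                properties file: the site names siteA and siteB below are
                hypothetical placeholders, not sites defined anywhere in this
                guide.</p>
<pre class="screen">
# run stage-in transfer jobs remotely for two (hypothetical) sites
pegasus.transfer.stagein.remote.sites = siteA,siteB

# run stage-out transfer jobs remotely for all sites
pegasus.transfer.stageout.remote.sites = *
</pre>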
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.transfer.staging.delimiter<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0.0<br>
<span class="bold"><strong>Type        : </strong></span>String<span class="bold"><strong><br>
Default     :</strong></span> :<span class="bold"><strong><br>
</strong></span></p></div></td>
<td>Pegasus supports executable staging as part of the
                workflow. Currently, only staging of statically linked
                executables is supported. An executable is normally staged to the
                work directory for the workflow/partition on the remote site.
                The basename of the staged executable is derived from the
                namespace, name and version of the transformation in the
                transformation catalog. This property sets the delimiter that
                is used to construct the name of the staged
                executable.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.transfer.disable.chmod.sites<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0.0<br>
<span class="bold"><strong>Type        : </strong></span>comma separated list of sites<span class="bold"><strong><br>
Default     :</strong></span> (no default)<span class="bold"><strong><br>
</strong></span></p></div></td>
<td>
<p>During staging of executables to remote sites,
                chmod jobs are added to the workflow. These jobs run on the
                remote sites and do a chmod on the staged executable. For some
                sites, this may not be required. The permissions might be
                preserved, or there may be an automatic mechanism that sets
                them.</p>
<p>This property allows you to specify the list
                of sites where you do not want the chmod jobs to be executed.
                For those sites, the chmod jobs are replaced by NoOP jobs. The
                NoOP jobs are not run by Condor; instead, a terminate event is
                immediately written to the job log file and the jobs are removed
                from the queue.</p>
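<p>For illustration, a properties file entry that disables the
                chmod jobs for two sites could look like the sketch below.
                The site names siteA and siteB are hypothetical
                placeholders.</p>
<pre class="screen">
# replace chmod jobs with NoOP jobs on these (hypothetical) sites
pegasus.transfer.disable.chmod.sites = siteA,siteB
</pre>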
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.transfer.setup.source.base.url<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0.0<br>
<span class="bold"><strong>Type        : </strong></span>URL<br>
<span class="bold"><strong>Default     :</strong></span> (no default)<span class="bold"><strong><br>
</strong></span></p></div></td>
<td>This property specifies the base URL of the directory
                containing the Pegasus worker package builds. During staging
                of executables, the Pegasus worker package is also staged to
                the remote site. The worker packages are by default pulled
                from the HTTP server at pegasus.isi.edu. This property can be
                used to override the location from which the worker packages
                are staged. This may be required if the remote compute sites
                do not allow file transfers from an HTTP server.</td>
</tr>
</tbody>
</table></div>
</div>
<p><br class="table-break"></p>
</div>
<div class="section" title="12.3.10. Monitoring Properties">
<div class="titlepage"><div><div><h3 class="title">
<a name="monitoring_props"></a>12.3.10. Monitoring Properties</h3></div></div></div>
<div class="table">
<a name="idp47296672"></a><p class="title"><b>Table 12.21. Monitoring Properties</b></p>
<div class="table-contents"><table summary="Monitoring Properties" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Key Attributes</strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.monitord.events<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 3.0.2<br>
<span class="bold"><strong>Type  </strong></span>      : Boolean<br>
<span class="bold"><strong>Default     :</strong></span> true<span class="bold"><strong><br>
See Also    : </strong></span>pegasus.catalog.workflow.url</p></div></td>
<td>This property determines whether pegasus-monitord
              generates log events. If log events are disabled using this
              property, no bp file or database will be created, even if the
              pegasus.monitord.output property is specified.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.catalog.workflow.url<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 4.5<br>
<span class="bold"><strong>Type  </strong></span>      : String<br>
<span class="bold"><strong>Default     :</strong></span> SQLite database in submit<br>
              directory.<br>
<span class="bold"><strong>See Also    :</strong></span> pegasus.monitord.events</p></div></td>
<td>
<p>This property specifies the destination for
              log events generated by pegasus-monitord. By default, events are
              stored in a SQLite database in the workflow directory, which
              is created with the workflow's name and a ".stampede.db"
              extension. Users can specify an alternative database by using a
              SQLAlchemy connection string. Details are available at: </p>
<pre class="screen">
http://www.sqlalchemy.org/docs/05/reference/dialects/index.html
</pre>
<p>Note that users need to have the appropriate
              database interface library installed. SQLAlchemy is a
              wrapper around the database interface library (for instance,
              the MySQL one); it does not provide a MySQL driver itself. The
              Pegasus distribution includes both SQLAlchemy and the SQLite
              Python driver. Also note that, unlike with SQLite databases,
              when using SQLAlchemy with other database servers,
              e.g. MySQL or Postgres, the target database needs to exist.
              Users can also specify a file name using this property in order
              to create a file with the log events.</p>
<p>Example values
              for the SQLAlchemy connection string for various end points are
              listed below.</p>
<div class="informaltable"><table border="0">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td align="left">SQLAlchemy End Point</td>
<td align="left">Example Value</td>
</tr>
<tr>
<td align="left">Netlogger BP File</td>
<td align="left">file:///submit/dir/myworkflow.bp</td>
</tr>
<tr>
<td align="left">SQLite Database</td>
<td align="left">sqlite:///submit/dir/myworkflow.db</td>
</tr>
<tr>
<td align="left">MySQL Database</td>
<td align="left">mysql://user:password@host:port/databasename</td>
</tr>
<tr>
<td align="left"> </td>
<td class="auto-generated"> </td>
</tr>
</tbody>
</table></div>
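<p>As a sketch, one of the example values from the table above
              could be set in a properties file as shown below. The user,
              password, host, port and databasename parts are placeholders
              to be replaced with real connection details, and, as noted
              above, the MySQL database is assumed to already exist.</p>
<pre class="screen">
# send pegasus-monitord events to an existing MySQL database
pegasus.catalog.workflow.url = mysql://user:password@host:port/databasename
</pre>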
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.catalog.master.url<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 4.2<br>
<span class="bold"><strong>Type  </strong></span>      : String<br>
<span class="bold"><strong>Default     :</strong></span> sqlite database in $HOME/.pegasus/workflow.db<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.catalog.workflow.url</p></div></td>
<td>
<p>This property specifies the destination for the
              workflow dashboard database. By default, the workflow dashboard
              database is a SQLite database named workflow.db in the
              $HOME/.pegasus directory. This database is shared by all
              workflows run as a particular user. Users can specify an
              alternative database by using a SQLAlchemy connection string.
              Details are available at: </p>
<pre class="screen">
http://www.sqlalchemy.org/docs/05/reference/dialects/index.html
</pre>
<p>Note that users need to have the appropriate
              database interface library installed. SQLAlchemy is a
              wrapper around the database interface library (for instance,
              the MySQL one); it does not provide a MySQL driver itself. The
              Pegasus distribution includes both SQLAlchemy and the SQLite
              Python driver. Also note that, unlike with SQLite databases,
              when using SQLAlchemy with other database servers,
              e.g. MySQL or Postgres, the target database needs to exist.
              Users can also specify a file name using this property in order
              to create a file with the log events.</p>
<p>Example values
              for the SQLAlchemy connection string for various end points are
              listed below.</p>
<div class="informaltable"><table border="0">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td align="left">SQLAlchemy End Point</td>
<td align="left">Example Value</td>
</tr>
<tr>
<td align="left">SQLite Database</td>
<td align="left">sqlite:///shared/myworkflow.db</td>
</tr>
<tr>
<td align="left">MySQL Database</td>
<td align="left">mysql://user:password@host:port/databasename</td>
</tr>
<tr>
<td align="left"> </td>
<td class="auto-generated"> </td>
</tr>
</tbody>
</table></div>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.monitord.output<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 3.0.2<br>
<span class="bold"><strong>Type  </strong></span>      : String<br>
<span class="bold"><strong>Default     :</strong></span> SQLite database in submit<br>
              directory.<br>
<span class="bold"><strong>See Also    :</strong></span> pegasus.monitord.events</p></div></td>
<td><p>This property has been deprecated in favor of
              pegasus.catalog.workflow.url, which was introduced in the 4.5
              release. Support for this property will be dropped in future
              releases.
              </p></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.dashboard.output<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 4.2<br>
<span class="bold"><strong>Type  </strong></span>      : String<br>
<span class="bold"><strong>Default     :</strong></span> sqlite database in $HOME/.pegasus/workflow.db<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.monitord.output</p></div></td>
<td><p>This property has been deprecated in favor of
              pegasus.catalog.master.url, which was introduced in the 4.5
              release. Support for this property will be dropped in future
              releases.
              </p></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.monitord.notifications<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 3.1.0<br>
<span class="bold"><strong>Type        : </strong></span>Boolean<br>
<span class="bold"><strong>Default     :</strong></span> true<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.monitord.notifications.max<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.monitord.notifications.timeout</p></div></td>
<td>This property determines whether pegasus-monitord
              processes notifications. When notifications are enabled,
              pegasus-monitord will parse the .notify file generated by
              pegasus-plan and will invoke notification scripts whenever
              conditions match one of the notifications.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.monitord.notifications.max<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 3.1.0<br>
<span class="bold"><strong>Type        : </strong></span>Integer<br>
<span class="bold"><strong>Default     :</strong></span> 10<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.monitord.notifications <span class="bold"><strong><br>
See Also    :</strong></span> pegasus.monitord.notifications.timeout</p></div></td>
<td>This property determines how many notification scripts
              pegasus-monitord will call concurrently. Upon reaching this
              limit, pegasus-monitord will wait for one notification script to
              finish before issuing another one. This is a way to keep the
              number of processes on the submit host under control. Setting
              this property to 0 will disable notifications
              completely.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.monitord.notifications.timeout<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 3.1.0<br>
<span class="bold"><strong>Type        : </strong></span>Integer<br>
<span class="bold"><strong>Default     :</strong></span> 0<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.monitord.notifications<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.monitord.notifications.max</p></div></td>
<td>This property determines how long pegasus-monitord will
              let notification scripts run before terminating them. When this
              property is set to 0 (default), pegasus-monitord will not
              terminate any notification scripts, letting them run
              indefinitely. If some notification scripts misbehave, this can
              starve pegasus-monitord's
              notification slots (see the pegasus.monitord.notifications.max
              property) and block further notifications. In addition, users
              should be aware that pegasus-monitord will not exit until all
              notification scripts have finished.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.monitord.stdout.disable.parsing<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 3.1.1<br>
<span class="bold"><strong>Type        : </strong></span>Boolean<br>
<span class="bold"><strong>Default     :</strong></span> false</p></div></td>
<td>By default, pegasus-monitord parses the stdout/stderr
              section of the kickstart record to populate the application's
              captured stdout and stderr in the job instance table of the
              stampede schema. For large workflows, this may slow down
              monitord, especially if the application writes a lot of output
              to its stdout and stderr. This property can be used to turn off
              the database population.</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p></p>
</div>
<div class="section" title="12.3.11. Job Clustering Properties">
<div class="titlepage"><div><div><h3 class="title">
<a name="job_clustering_props"></a>12.3.11. Job Clustering Properties</h3></div></div></div>
<div class="table">
<a name="idp39479424"></a><p class="title"><b>Table 12.22. Job Clustering Properties</b></p>
<div class="table-contents"><table summary="Job Clustering Properties" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Key Attributes</strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.clusterer.job.aggregator<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type  </strong></span>      : String<br>
<span class="bold"><strong>Values</strong></span>      : seqexec|mpiexec<br>
<span class="bold"><strong>Default     :</strong></span> seqexec</p></div></td>
<td>
<p>A large number of workflows executed through the
              Virtual Data System are composed of several jobs that run for
              only a few seconds or so. The overhead of running any job on the
              grid is usually 60 seconds or more. Hence, it makes sense to
              collapse small independent jobs into a larger job. This property
              determines the executable that will be used for running the
              larger job on the remote site.</p>
<div class="variablelist"><dl>
<dt><span class="term">seqexec</span></dt>
<dd>
                       In this mode, the executable used to run the merged job is "pegasus-cluster", which runs each of the smaller jobs sequentially on the same node. The executable "pegasus-cluster" is a Pegasus tool distributed in the Pegasus worker package, and can usually be found at {pegasus.home}/bin/seqexec. 
                    </dd>
<dt><span class="term">mpiexec</span></dt>
<dd>
                       In this mode, the executable used to run the merged job is "pegasus-mpi-cluster" (PMC), which runs the smaller jobs via MPI on n nodes, where n is the nodecount associated with the merged job. The executable "pegasus-mpi-cluster" is a Pegasus tool distributed in the Pegasus distribution and is built only if an MPI compiler is available. 
                    </dd>
</dl></div>
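<p>As an illustrative sketch, the MPI based aggregator described
              above could be selected in a properties file as follows. This
              assumes that pegasus-mpi-cluster has been built and is
              available on the remote site.</p>
<pre class="screen">
# use pegasus-mpi-cluster instead of the default seqexec aggregator
pegasus.clusterer.job.aggregator = mpiexec
</pre>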
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.clusterer.job.aggregator.seqexec.log<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.3<br>
<span class="bold"><strong>Type        : </strong></span>Boolean<br>
<span class="bold"><strong>Default     :</strong></span> false<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.clusterer.job.aggregator<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.clusterer.job.aggregator.seqexec.log.global</p></div></td>
<td>
<p>The tool pegasus-cluster logs the progress of the
              jobs that it runs in a progress file on the remote
              cluster where it is executed.</p>
<p>This property sets the
              Boolean flag that indicates whether to turn on this
              logging.</p>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.clusterer.job.aggregator.seqexec.log.global<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.3<br>
<span class="bold"><strong>Type        : </strong></span>Boolean<br>
<span class="bold"><strong>Default     :</strong></span> false<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.clusterer.job.aggregator<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.clusterer.job.aggregator.seqexec.log</p></div></td>
<td>
<p>The tool pegasus-cluster logs the progress of the
              jobs that it runs in a progress file on the remote
              cluster where it is executed. The progress log is useful for
              tracking the progress of your computations and for remote grid
              debugging. The progress log file can either be shared by multiple
              pegasus-cluster jobs that are running on a particular cluster as
              part of the same workflow, or be kept per
              job.</p>
<p>This property sets the Boolean flag that
              indicates whether to have a single global log for all the
              pegasus-cluster jobs on a particular cluster or a progress log per
              job.</p>
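<p>As a sketch, progress logging with a single shared log per
              cluster might be requested with the two properties below. The
              combination is only an illustration of how the two flags
              relate, not a recommended setting.</p>
<pre class="screen">
# turn on pegasus-cluster progress logging
pegasus.clusterer.job.aggregator.seqexec.log = true

# share one progress log among all pegasus-cluster jobs on a cluster
pegasus.clusterer.job.aggregator.seqexec.log.global = true
</pre>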
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.clusterer.job.aggregator.seqexec.firstjobfail<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.2<br>
<span class="bold"><strong>Type        : </strong></span>Boolean<br>
<span class="bold"><strong>Default     :</strong></span> true<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.clusterer.job.aggregator</p></div></td>
<td>
<p>By default, "pegasus-cluster" does not stop
              execution even if one of the clustered jobs it is executing
              fails. This is because "pegasus-cluster" tries to get as much
              work done as possible.</p>
<p>This property sets the
              Boolean flag that indicates whether to make "pegasus-cluster"
              stop on the first job failure it detects.</p>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.clusterer.label.key<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type  </strong></span>      : String<br>
<span class="bold"><strong>Default     :</strong></span> label<br>
</p></div></td>
<td><table width="100%" border="0">
<colgroup><col></colgroup>
<tbody><tr><td>
<p>While clustering jobs in the workflow into
                    larger jobs, you can optionally label your graph to
                    control which jobs are clustered and to which clustered
                    job they belong. This is done using a label-based clustering
                    scheme, by associating a profile/label key in
                    the PEGASUS namespace with the jobs in the DAX. All jobs
                    that have the same value for this profile key
                    are put in the same clustered job.</p>
<p>This
                    property allows you to specify the PEGASUS profile key
                    that you want to use for label-based
                    clustering.</p>
</td></tr></tbody>
</table></td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p></p>
</div>
<div class="section" title="12.3.12. Logging Properties">
<div class="titlepage"><div><div><h3 class="title">
<a name="logging_props"></a>12.3.12. Logging Properties</h3></div></div></div>
<div class="table">
<a name="idp42449680"></a><p class="title"><b>Table 12.23. Logging Properties</b></p>
<div class="table-contents"><table summary="Logging Properties" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Key Attributes</strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.log.manager<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.2.0<br>
<span class="bold"><strong>Type  </strong></span>      : String<br>
<span class="bold"><strong>Values</strong></span>      : Default|Log4J<br>
<span class="bold"><strong>Default     :</strong></span> Default<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.log.manager.formatter</p></div></td>
<td>
<p>This property sets the logging implementation to
              use for logging.</p>
<div class="variablelist"><dl>
<dt><span class="term">Default</span></dt>
<dd>
                       This implementation refers to the legacy Pegasus logger, which logs directly to stdout and stderr. It does, however, have the concept of levels similar to log4j or syslog. 
                    </dd>
<dt><span class="term">Log4j</span></dt>
<dd>
                       This implementation uses Log4j to log messages. The log4j properties can be specified in a properties file, the location of which is specified by the property 

                      <pre class="screen">
pegasus.log.manager.log4j.conf
</pre>
</dd>
</dl></div>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.log.manager.formatter<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.2.0<br>
<span class="bold"><strong>Type  </strong></span>      : String<br>
<span class="bold"><strong>Values</strong></span>      : Simple|Netlogger<br>
<span class="bold"><strong>Default     :</strong></span> Simple<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.log.manager</p></div></td>
<td><table width="100%" border="0">
<colgroup><col></colgroup>
<tbody><tr><td>
<p>This property sets the formatter to use for
                    formatting the log messages while
                    logging.</p>
<div class="variablelist"><dl>
<dt><span class="term">Simple</span></dt>
<dd>
                             This formats the messages in a simple format. The messages are logged as is with minimal formatting. Below are sample log messages in this format while ranking a dax according to performance. 

                            <pre class="screen">
event.pegasus.ranking dax.id se18-gda.dax  - STARTED
event.pegasus.parsing.dax dax.id se18-gda-nested.dax  - STARTED
event.pegasus.parsing.dax dax.id se18-gda-nested.dax  - FINISHED
job.id jobGDA
job.id jobGDA query.name getpredicted performace time 10.00
event.pegasus.ranking dax.id se18-gda.dax  - FINISHED
</pre>
</dd>
<dt><span class="term">Netlogger</span></dt>
<dd>
<p>This formats the messages in the Netlogger
                            format, which is based on key-value pairs. The
                            Netlogger format is useful for loading the logs into
                            a database for analysis. Below are
                            sample log messages in this format, produced while ranking a
                            DAX according to performance. </p>
<pre class="screen">
ts=2008-09-06T12:26:20.100502Z event=event.pegasus.ranking.start \
msgid=6bc49c1f-112e-4cdb-af54-3e0afb5d593c \
eventId=event.pegasus.ranking_8d7c0a3c-9271-4c9c-a0f2-1fb57c6394d5 \
dax.id=se18-gda.dax prog=Pegasus
ts=2008-09-06T12:26:20.100750Z event=event.pegasus.parsing.dax.start \
msgid=fed3ebdf-68e6-4711-8224-a16bb1ad2969 \
eventId=event.pegasus.parsing.dax_887134a8-39cb-40f1-b11c-b49def0c5232 \
dax.id=se18-gda-nested.dax prog=Pegasus
ts=2008-09-06T12:26:20.100894Z event=event.pegasus.parsing.dax.end \
msgid=a81e92ba-27df-451f-bb2b-b60d232ed1ad \
eventId=event.pegasus.parsing.dax_887134a8-39cb-40f1-b11c-b49def0c5232
ts=2008-09-06T12:26:20.100395Z event=event.pegasus.ranking \
msgid=4dcecb68-74fe-4fd5-aa9e-ea1cee88727d \
eventId=event.pegasus.ranking_8d7c0a3c-9271-4c9c-a0f2-1fb57c6394d5 \
job.id="jobGDA"
ts=2008-09-06T12:26:20.100395Z event=event.pegasus.ranking \
msgid=4dcecb68-74fe-4fd5-aa9e-ea1cee88727d \
eventId=event.pegasus.ranking_8d7c0a3c-9271-4c9c-a0f2-1fb57c6394d5 \
job.id="jobGDA" query.name="getpredicted performace" time="10.00"
ts=2008-09-06T12:26:20.102003Z event=event.pegasus.ranking.end \
msgid=31f50f39-efe2-47fc-9f4c-07121280cd64 \
eventId=event.pegasus.ranking_8d7c0a3c-9271-4c9c-a0f2-1fb57c6394d5
</pre>
</dd>
</dl></div>
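<p>A minimal sketch of switching to the Netlogger formatter in the properties file, assuming the logging implementation itself is left at its default:</p>
<pre class="screen">
# Emit key=value formatted messages instead of the Simple format
pegasus.log.manager.formatter = Netlogger
</pre>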
</td></tr></tbody>
</table></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.log.*<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>file path<br>
<span class="bold"><strong>Default     :</strong></span> no default</p></div></td>
<td>This property sets the path to the file to which all
              Pegasus logging is redirected. Both stdout and stderr
              are logged to the specified file.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.log.memory.usage<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 4.3.4<br>
<span class="bold"><strong>Type        : </strong></span>Boolean<br>
<span class="bold"><strong>Default     :</strong></span> false</p></div></td>
<td>If this property is set to true, the planner
              writes out JVM heap memory statistics at the INFO level at the
              end of the planning process. This is useful for users who
              want to fine-tune their Java memory settings by setting
              JAVA_HEAPMAX and JAVA_HEAPMIN for large workflows.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.metrics.app<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 4.3.0<br>
<span class="bold"><strong>Type        : </strong></span>String<br>
<span class="bold"><strong>Default     :</strong></span> (no default)</p></div></td>
<td>
<p>This property namespace allows users to pass
              application-level metrics to the metrics server. The value of
              this property is the name of the
              application.</p>
<p>Additional application-specific
              attributes can be passed by using the prefix pegasus.metrics.app.
              </p>
<pre class="screen">
pegasus.metrics.app.[attribute-name]       attribute-value
</pre>
<p>Note: an attribute cannot be named "name". That attribute
              is automatically assigned the value of
              pegasus.metrics.app.</p>
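<p>For example, an application might be described as in the sketch below. The application name and the attribute names (version, campaign) are hypothetical and chosen only for illustration.</p>
<pre class="screen">
# Name of the application reported to the metrics server (hypothetical)
pegasus.metrics.app = my-workflow-app
# Additional attributes; the attribute names here are hypothetical
pegasus.metrics.app.version = 1.2
pegasus.metrics.app.campaign = test-run
</pre>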
</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p></p>
</div>
<div class="section" title="12.3.13. Cleanup Properties">
<div class="titlepage"><div><div><h3 class="title">
<a name="cleanup_props"></a>12.3.13. Cleanup Properties</h3></div></div></div>
<div class="table">
<a name="idp46031280"></a><p class="title"><b>Table 12.24. Cleanup Properties</b></p>
<div class="table-contents"><table summary="Cleanup Properties" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Key Attributes</strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.file.cleanup.strategy<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.2-<br>
<span class="bold"><strong>Type  </strong></span>      : String<br>
<span class="bold"><strong>Default     :</strong></span> InPlace</p></div></td>
<td>
<p>This property selects the strategy by which
              cleanup nodes are added to the executable
              workflow.</p>
<div class="variablelist"><dl>
<dt><span class="term">InPlace</span></dt>
<dd>
                       This is the only mode available. 
                    </dd>
</dl></div>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.file.cleanup.impl<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.2<br>
<span class="bold"><strong>Type        : </strong></span>String<br>
<span class="bold"><strong>Default     :</strong></span> Cleanup</p></div></td>
<td>
<p>This property is used to select the executable that
              is used to delete files on the compute
              sites.</p>
<div class="variablelist"><dl>
<dt><span class="term">Cleanup</span></dt>
<dd>
                       The default executable used to delete files is the "pegasus-cleanup" executable shipped with Pegasus. It is found at $PEGASUS_HOME/bin/pegasus-cleanup in the Pegasus distribution. For this mode to work, either an entry for transformation pegasus::dirmanager must exist in the Transformation Catalog, or the PEGASUS_HOME environment variable must be specified in the site catalog for the sites. 
                    </dd>
<dt><span class="term">RM</span></dt>
<dd>
                       In this mode, the rm executable is used to delete files from remote directories. The rm executable is standard on *nix systems and is usually found at /bin/rm. 
                    </dd>
</dl></div>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.file.cleanup.clusters.num<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 4.2.0<br>
<span class="bold"><strong>Type        : </strong></span>Integer<br>
<span class="bold"><strong>Default     :</strong></span> 2</p></div></td>
<td>In case of the InPlace strategy for adding the cleanup
              nodes to the workflow, this property specifies the maximum
              number of cleanup jobs that are added to the executable workflow
              on each level.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.file.cleanup.clusters.size<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 4.2.0<br>
<span class="bold"><strong>Type        : </strong></span>Integer<br>
<span class="bold"><strong>Default     :</strong></span> 2</p></div></td>
<td>In case of the InPlace strategy this property sets the
              number of cleanup jobs that get clustered into a bigger cleanup
              job. This parameter is only used if
              pegasus.file.cleanup.clusters.num is not set.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.file.cleanup.scope<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.3.0<br>
<span class="bold"><strong>Type        : </strong></span>Enumeration<br>
<span class="bold"><strong>Value       : </strong></span>fullahead|deferred<br>
<span class="bold"><strong>Default     :</strong></span> fullahead</p></div></td>
<td>
<p>By default, in case of deferred planning, InPlace
              file cleanup is turned OFF, because the cleanup
              algorithm does not work across partitions. This property can be
              used to turn on the cleanup in case of deferred planning.
              </p>
<div class="variablelist"><dl>
<dt><span class="term">fullahead</span></dt>
<dd>
                       This is the default scope. The Pegasus cleanup algorithm does not work across partitions in deferred planning. Hence, cleanup is always turned OFF when deferred planning occurs and the cleanup scope is set to fullahead. 
                    </dd>
<dt><span class="term">deferred</span></dt>
<dd>
                       If the scope is set to deferred, then Pegasus will not disable file cleanup in case of deferred planning. This is useful for scenarios where the partitions themselves are independent (i.e., do not share files). Even if the scope is set to deferred, users can turn off cleanup by specifying the --nocleanup option to pegasus-plan. 
                    </dd>
</dl></div>
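<p>The cleanup properties in this table might be combined as in the sketch below. The values are illustrative assumptions, not recommendations.</p>
<pre class="screen">
# InPlace is the only available strategy and also the default
pegasus.file.cleanup.strategy = InPlace
# Delete files with rm instead of pegasus-cleanup
pegasus.file.cleanup.impl = RM
# Allow at most 4 cleanup jobs per workflow level (illustrative value)
pegasus.file.cleanup.clusters.num = 4
# Keep cleanup enabled under deferred planning; only appropriate
# when the partitions do not share files
pegasus.file.cleanup.scope = deferred
</pre>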
</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p></p>
</div>
<div class="section" title="12.3.14. Miscellaneous Properties">
<div class="titlepage"><div><div><h3 class="title">
<a name="misc__props"></a>12.3.14. Miscellaneous Properties</h3></div></div></div>
<div class="table">
<a name="idp44303152"></a><p class="title"><b>Table 12.25. Miscellaneous Properties</b></p>
<div class="table-contents"><table summary="Miscellaneous Properties" border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><span class="bold"><strong>Key Attributes</strong></span></td>
<td><span class="bold"><strong>Description</strong></span></td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.code.generator<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 3.0<br>
<span class="bold"><strong>Type  </strong></span>      : String<br>
<span class="bold"><strong>Values</strong></span>      : Condor|Shell|PMC<br>
<span class="bold"><strong>Default     :</strong></span> Condor<span class="bold"><strong><br>
See Also    :</strong></span> pegasus.log.manager.formatter</p></div></td>
<td>
<p>This property is used to load the appropriate Code
              Generator to use for writing out the executable
              workflow.</p>
<div class="variablelist"><dl>
<dt><span class="term">Condor</span></dt>
<dd>
                       This is the default code generator for Pegasus. This generator generates the executable workflow as a Condor DAG file and associated job submit files. The Condor DAG file is passed as input to Condor DAGMan for job execution. 
                    </dd>
<dt><span class="term">Shell</span></dt>
<dd>
                       This Code Generator generates the executable workflow as a shell script that can be executed on the submit host. When using this code generator, all the jobs should be mapped to site local, i.e., specify --sites local to pegasus-plan. 
                    </dd>
<dt><span class="term">PMC</span></dt>
<dd>
                       This Code Generator generates the executable workflow as a PMC task workflow. This is useful on platforms where it is not feasible to run Condor, such as the new XSEDE machines, for example Blue Waters. In this mode, Pegasus also generates a sample PBS submit script that submits this workflow. 
                    </dd>
</dl></div>
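<p>A minimal sketch of selecting the Shell code generator in the properties file. As described for the Shell mode, the workflow must then be planned with --sites local.</p>
<pre class="screen">
# Generate the executable workflow as a shell script
pegasus.code.generator = Shell
</pre>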
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.register<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 4.1.-<br>
<span class="bold"><strong>Type  </strong></span>      : Boolean<br>
<span class="bold"><strong>Default     :</strong></span> true</p></div></td>
<td>
<p>Pegasus creates registration jobs to register the
              output files in the replica catalog. An output file is
              registered only if</p>
<p>1) a user has configured a
              replica catalog in the properties, and 2) the register flags for the
              output files in the DAX are set to true.</p>
<p>This
              property can be used to turn off the creation of the
              registration jobs even though the files may be marked to be
              registered in the replica catalog.</p>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.data.reuse.scope<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 4.5.0<br>
<span class="bold"><strong>Type        : </strong></span>Enumeration<br>
<span class="bold"><strong>Value       : </strong></span>none|partial|full<br>
<span class="bold"><strong>Default     :</strong></span> full</p></div></td>
<td>
<p>This property controls the behavior of
              the data reuse algorithm in Pegasus.</p>
<div class="variablelist"><dl>
<dt><span class="term">none</span></dt>
<dd>
                       This is the same as disabling data reuse. It is equivalent to passing the --force option to pegasus-plan on the command line. 
                    </dd>
<dt><span class="term">partial</span></dt>
<dd>
                       In this case, only certain jobs (those that have the pegasus profile key enable_for_data_reuse set to true) are checked for the presence of output files in the replica catalog. This gives users control over which jobs are deleted as part of the data reuse algorithm. 
                    </dd>
<dt><span class="term">full</span></dt>
<dd>
                       This is the default behavior, where the output files of all jobs are looked up in the replica catalog. 
                    </dd>
</dl></div>
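<p>A minimal sketch of restricting data reuse to explicitly marked jobs:</p>
<pre class="screen">
# Only jobs whose pegasus profile key enable_for_data_reuse is true
# are checked against the replica catalog
pegasus.data.reuse.scope = partial
</pre>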
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.catalog.transformation.mapper<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>Enumeration<br>
<span class="bold"><strong>Value       : </strong></span>All|Installed|Staged|Submit<br>
<span class="bold"><strong>Default     :</strong></span> All</p></div></td>
<td>
<p>Pegasus supports transfer of statically linked
              executables as part of the executable workflow. At present,
              staging is supported only for executables referred to by
              the compute jobs specified in the DAX file. Pegasus determines
              the source locations of the binaries from the transformation
              catalog, where it searches for entries of type STATIC_BINARY for
              a particular architecture type. The PFN for these entries should
              refer to a remote URL that is accessible and valid for
              globus-url-copy. For
              transfer of executables, Pegasus constructs a soft state map
              that resides on top of the transformation catalog and helps in
              determining the locations from where an executable can be staged
              to the remote site.</p>
<p>This property determines how
              that map is created. </p>
<div class="variablelist"><dl>
<dt><span class="term">All</span></dt>
<dd>
                       In this mode, all sources with entries of type STATIC_BINARY for a particular transformation are considered valid sources for the transfer of executables. This is the most general mode, and results in the map being constructed as the cartesian product of the matches. 
                    </dd>
<dt><span class="term">Installed</span></dt>
<dd>
                       In this mode, only entries that are of type INSTALLED are used while constructing the soft state map. This results in Pegasus never doing any transfer of executables as part of the workflow. It always prefers the installed executables at the remote sites. 
                    </dd>
<dt><span class="term">Staged</span></dt>
<dd>
                       In this mode, only entries that are of type STATIC_BINARY are used while constructing the soft state map. This results in the concrete workflow referring only to the staged executables, irrespective of the fact that the executables are already installed at the remote end. 
                    </dd>
<dt><span class="term">Submit</span></dt>
<dd>
                       In this mode, only entries that are of type STATIC_BINARY and reside at the submit host ("site" local) are used while constructing the soft state map. This is especially helpful when users want to use the latest compute code for their computations on the grid and that code resides on their submit host. 
                    </dd>
</dl></div>
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.selector.transformation<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.0<br>
<span class="bold"><strong>Type        : </strong></span>Enumeration<br>
<span class="bold"><strong>Value       : </strong></span>Random|Installed|Staged|Submit<br>
<span class="bold"><strong>Default     :</strong></span> Random</p></div></td>
<td>
<p>In case of transfer of executables, Pegasus could
              have various transformations to select from when it schedules
              a particular compute job to run at a remote site. For example, it can
              have the choice of staging an executable from a particular
              remote site, staging it from the local site (submit host) only, or using the one that
              is installed on the remote site only.</p>
<p>This property
              determines how a transformation is selected from among the various candidate
              transformations, and is applied after the property
              pegasus.catalog.transformation.mapper has been applied. For example, specifying
              pegasus.catalog.transformation.mapper as
              Staged and then pegasus.selector.transformation as Installed
              does not work, because by the time this property is applied, the soft
              state map only has entries of type
              STATIC_BINARY.</p>
<div class="variablelist"><dl>
<dt><span class="term">Random</span></dt>
<dd>
                       In this mode, a random matching candidate transformation is selected to be staged to the remote execution site. 
                    </dd>
<dt><span class="term">Installed</span></dt>
<dd>
                       In this mode, only entries that are of type INSTALLED are selected. This means that the concrete workflow only refers to the transformations already preinstalled on the remote sites. 
                    </dd>
<dt><span class="term">Staged</span></dt>
<dd>
                       In this mode, only entries that are of type STATIC_BINARY are selected, ignoring the ones that are installed at the remote site. 
                    </dd>
<dt><span class="term">Submit</span></dt>
<dd>
                       In this mode, only entries that are of type STATIC_BINARY and reside at the submit host ("site" local), are selected as sources for staging the executables to the remote execution sites. 
                    </dd>
</dl></div>
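<p>Because this property is applied to the soft state map after that map has been built, the selector and the mapper need to be consistent. The sketch below shows one combination that is consistent with the descriptions in this table. It is an illustration, not a recommended setting.</p>
<pre class="screen">
# Build the soft state map from STATIC_BINARY entries only
pegasus.catalog.transformation.mapper = Staged
# From those, select only the entries that reside at the submit host
pegasus.selector.transformation = Submit
</pre>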
</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.parser.dax.preserver.linebreaks<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 2.2.0<br>
<span class="bold"><strong>Type        : </strong></span>Boolean<br>
<span class="bold"><strong>Default     :</strong></span> false</p></div></td>
<td>The DAX Parser normally does not preserve line breaks
              while parsing the CDATA section that appears in the arguments
              section of the job element in the DAX. When this property is set to true,
              the DAX Parser preserves any line breaks that appear in the
              CDATA section.</td>
</tr>
<tr>
<td><div class="literallayout"><p><span class="bold"><strong><span class="bold"><strong>Property Key: </strong></span></strong></span>pegasus.parser.dax.data.dependencies<span class="bold"><strong><span class="bold"><strong><br>
Profile  Key: </strong></span></strong></span>N/A<span class="bold"><strong><br>
Scope       :</strong></span> Properties<br>
<span class="bold"><strong>Since       :</strong></span> 4.4.0<br>
<span class="bold"><strong>Type        : </strong></span>Boolean<br>
<span class="bold"><strong>Default     :</strong></span> true</p></div></td>
<td>If this property is set to true, then the planner
              automatically adds edges between jobs in the DAX on the basis of
              existing data dependencies between jobs. For example, if JobA
              generates an output file that is listed as input for JobB, then
              the planner automatically adds an edge between JobA and
              JobB.</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break">
</div>
</div>
<div class="navfooter">
<hr>
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left">
<a accesskey="p" href="profiles.php">Prev</a> </td>
<td width="20%" align="center"><a accesskey="u" href="configuration.php">Up</a></td>
<td width="40%" align="right"> <a accesskey="n" href="submit_directory.php">Next</a>
</td>
</tr>
<tr>
<td width="40%" align="left" valign="top">12.2. Profiles </td>
<td width="20%" align="center"><a accesskey="h" href="index.php">Table of Contents</a></td>
<td width="40%" align="right" valign="top"> Chapter 13. Submit Directory Details</td>
</tr>
</table>
</div>
</div><?php  
            do_html_footer();
        ?>
