<?php  
            include_once( $_SERVER['DOCUMENT_ROOT']."/static/includes/common.inc.php" );
            do_html_header("Documentation");
        ?><div id="content">
<div class="navheader">
<table width="100%" summary="Navigation header"><tr>
<td width="20%" align="left">
<a accesskey="p" href="execution_environments.php">Prev</a> </td>
<td width="60%" align="center"><a accesskey="h" href="index.php">Table of Contents</a></td>
<td width="20%" align="right"> <a accesskey="n" href="cloud.php">Next</a>
</td>
</tr></table>
<hr>
</div>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="condor_pool"></a>7.2. Condor Pool</h2></div></div></div>
<div class="toc"><dl class="toc">
<dt><span class="section"><a href="condor_pool.php#glideins">7.2.1. Glideins</a></span></dt>
<dt><span class="section"><a href="condor_pool.php#idp47956128">7.2.2. CondorC</a></span></dt>
</dl></div>
<p>An HTCondor pool is a set of machines that use HTCondor for resource
    management. An HTCondor pool can be a cluster of dedicated machines or a
    set of machines with distributed ownership. Pegasus can generate concrete
    workflows that can be executed on an HTCondor pool.</p>
<div class="figure">
<a name="idp48378208"></a><p class="title"><b>Figure 7.1. The distributed resources appear to be part of an HTCondor
      pool.</b></p>
<div class="figure-contents"><div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" style="cellpadding: 0; cellspacing: 0;" width="100%"><tr><td><img src="images/condor_layout.png" height="360" alt="The distributed resources appear to be part of an HTCondor pool."></td></tr></table></div></div>
</div>
<br class="figure-break"><p>The workflow is submitted using DAGMan from one of the job
    submission machines in the HTCondor pool. The Central Manager of the pool
    is responsible for matching the tasks in the workflow submitted by
    DAGMan to the execution machines in the pool. This matching process can be
    guided by including HTCondor-specific attributes in the submit files of
    the tasks. To execute the workflow on the execution
    machines (worker nodes) in an HTCondor pool, the site catalog must define
    a resource that represents these execution machines. The
    universe attribute of the resource should be vanilla. Multiple resources
    can be associated with a single HTCondor pool, where each
    resource identifies a subset of machines (worker nodes) in the pool.</p>
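<p>As a minimal sketch of how a single pool can be described by several
    resources, the two site entries below could both refer to the same
    HTCondor pool while using different requirements expressions to select
    different worker nodes. The site handles and the FileSystemDomain values
    are hypothetical and chosen only for illustration; substitute attributes
    that the machines in your pool actually advertise:</p>
<pre class="programlisting">
    &lt;!-- hypothetical handle: worker nodes in the first file system domain --&gt;
    &lt;site  handle="condorpool_a" arch="x86_64" os="LINUX"&gt;
        &lt;profile namespace="pegasus" key="style" &gt;condor&lt;/profile&gt;
        &lt;profile namespace="condor" key="universe" &gt;vanilla&lt;/profile&gt;
        &lt;profile namespace="condor" key="requirements"&gt;(Target.FileSystemDomain == &amp;quot;cluster-a.example.edu&amp;quot;)&lt;/profile&gt;
    &lt;/site&gt;

    &lt;!-- hypothetical handle: worker nodes in the second file system domain --&gt;
    &lt;site  handle="condorpool_b" arch="x86_64" os="LINUX"&gt;
        &lt;profile namespace="pegasus" key="style" &gt;condor&lt;/profile&gt;
        &lt;profile namespace="condor" key="universe" &gt;vanilla&lt;/profile&gt;
        &lt;profile namespace="condor" key="requirements"&gt;(Target.FileSystemDomain == &amp;quot;cluster-b.example.edu&amp;quot;)&lt;/profile&gt;
    &lt;/site&gt;
</pre>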
<p>When running on an HTCondor pool, the user has to decide how Pegasus
    should transfer data. Please see the <a class="link" href="data_staging_configuration.php" title="5.3. Data Staging Configuration">Data Staging Configuration</a> section for
    the options. The easiest is to use <span class="bold"><strong>condorio</strong></span>, as that mode does not require any extra
    setup: HTCondor will do the transfers using the existing HTCondor
    daemons. For an example of this mode see the example workflow in
    <code class="filename">share/pegasus/examples/condor-blackdiamond-condorio/</code>.
    In condorio mode, the site catalog for the execution site is very
    simple as storage is provided by HTCondor:</p>
<pre class="programlisting">
&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;sitecatalog xmlns="http://pegasus.isi.edu/schema/sitecatalog"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://pegasus.isi.edu/schema/sitecatalog http://pegasus.isi.edu/schema/sc-4.0.xsd"
             version="4.0"&gt;

    &lt;site  handle="local" arch="x86_64" os="LINUX"&gt;
        &lt;directory type="shared-scratch" path="/tmp/wf/work"&gt;
            &lt;file-server operation="all" url="file:///tmp/wf/work"/&gt;
        &lt;/directory&gt;
        &lt;directory type="local-storage" path="/tmp/wf/storage"&gt;
            &lt;file-server operation="all" url="file:///tmp/wf/storage"/&gt;
        &lt;/directory&gt;
    &lt;/site&gt;

    &lt;site  handle="condorpool" arch="x86_64" os="LINUX"&gt;
        &lt;profile namespace="pegasus" key="style" &gt;condor&lt;/profile&gt;
        &lt;profile namespace="condor" key="universe" &gt;vanilla&lt;/profile&gt;
    &lt;/site&gt;

&lt;/sitecatalog&gt;
</pre>
<p>There is a set of HTCondor profiles which are used commonly when
    running Pegasus workflows. You may have to set some or all of these
    depending on the setup of the HTCondor pool:</p>
<pre class="programlisting">  &lt;!-- Change the style to condor for jobs to be executed in the HTCondor pool.
       By default, Pegasus creates jobs suitable for grid execution. --&gt;
  &lt;profile namespace="pegasus" key="style"&gt;condor&lt;/profile&gt;

  &lt;!-- Change the universe to vanilla to make the jobs go to remote compute
       nodes. The default is local which will only run jobs on the submit host --&gt;
  &lt;profile namespace="condor" key="universe" &gt;vanilla&lt;/profile&gt;

  &lt;!-- The requirements expression allows you to limit where your jobs go --&gt;
  &lt;profile namespace="condor" key="requirements"&gt;(Target.FileSystemDomain != &amp;quot;yggdrasil.isi.edu&amp;quot;)&lt;/profile&gt;

  &lt;!-- The following two profiles force HTCondor to always transfer files. They
       have to be used if the pool does not have a shared filesystem --&gt;
  &lt;profile namespace="condor" key="should_transfer_files"&gt;True&lt;/profile&gt;
  &lt;profile namespace="condor" key="when_to_transfer_output"&gt;ON_EXIT&lt;/profile&gt;</pre>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="glideins"></a>7.2.1. Glideins</h3></div></div></div>
<p>In this section we describe how machines from different
      administrative domains and supercomputing centers can be dynamically
      added to an HTCondor pool for a certain timeframe. These machines join the
      HTCondor pool temporarily and can be used to execute jobs in a
      non-preemptive manner. This functionality is achieved using an HTCondor
      feature called <span class="bold"><strong>glideins</strong></span> (see <a class="ulink" href="http://cs.wisc.edu/condor/glidein" target="_top">
      http://cs.wisc.edu/condor/glidein</a>). The startd daemon, which a
      glidein starts on the remote resource, is the
      HTCondor daemon which provides the compute slots and runs the jobs. In
      the glidein case, the submit machine is usually a static machine and the
      glideins are configured to report to that submit machine. The
      glideins can be submitted to any type of resource: a GRAM enabled
      cluster, a campus cluster, a cloud environment such as Amazon AWS, or
      even another HTCondor cluster.</p>
<div class="tip" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Tip</h3>
<p>As glideins usually come from different compute resources,
        and/or run in an administrative domain different
        from the submit node, there is usually no shared filesystem available.
        Thus the most common <a class="link" href="data_staging_configuration.php" title="5.3. Data Staging Configuration">data
        staging modes</a> are <span class="bold"><strong>condorio</strong></span> and
        <span class="bold"><strong>nonsharedfs</strong></span>.</p>
</div>
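<p>Because glidein slots typically have no shared filesystem, a site entry
      for a glidein-backed pool would usually force HTCondor file transfers.
      The following is a minimal sketch that only combines the profiles
      already listed earlier in this section; the site handle glideinpool is
      hypothetical, and your pool may need additional profiles:</p>
<pre class="programlisting">
    &lt;!-- hypothetical handle for a pool whose slots are provided by glideins --&gt;
    &lt;site  handle="glideinpool" arch="x86_64" os="LINUX"&gt;
        &lt;profile namespace="pegasus" key="style" &gt;condor&lt;/profile&gt;
        &lt;profile namespace="condor" key="universe" &gt;vanilla&lt;/profile&gt;

        &lt;!-- no shared filesystem is assumed, so always transfer files --&gt;
        &lt;profile namespace="condor" key="should_transfer_files"&gt;True&lt;/profile&gt;
        &lt;profile namespace="condor" key="when_to_transfer_output"&gt;ON_EXIT&lt;/profile&gt;
    &lt;/site&gt;
</pre>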
<p>There are several useful tools which submit and manage glideins for
      you:</p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem"><p><a class="ulink" href="http://www.uscms.org/SoftwareComputing/Grid/WMS/glideinWMS/" target="_top">
          GlideinWMS</a> is a tool and host environment used mostly on the
          <a class="ulink" href="http://www.opensciencegrid.org/" target="_top">Open Science
          Grid</a>.</p></li>
<li class="listitem"><p><a class="ulink" href="http://pegasus.isi.edu/projects/corralwms" target="_top">
          CorralWMS</a> is a personal frontend for GlideinWMS. CorralWMS
          was developed by the Pegasus team and works very well for high
          throughput workflows.</p></li>
<li class="listitem"><p><a class="ulink" href="http://research.cs.wisc.edu/condor/manual/v7.6/condor_glidein.html" target="_top">
          condor_glidein</a> is a simple glidein tool for Globus GRAM
          clusters. condor_glidein is shipped with HTCondor.</p></li>
<li class="listitem"><p>Glideins can also be created by hand or by scripts. This is a
          useful solution, for example, for clusters which have no external job
          submission mechanisms or do not allow outside networking.</p></li>
</ul></div>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="idp47956128"></a>7.2.2. CondorC</h3></div></div></div>
<p>Using CondorC, users can submit workflows to remote HTCondor
      pools. CondorC is an HTCondor-specific solution for remote submission
      that does not involve setting up GRAM on the headnode. To enable
      CondorC submission to a site, the user needs to associate the pegasus profile
      key named style with the value condorc. If the remote HTCondor
      pool does not have a shared filesystem between the nodes making up the
      pool, users should run Pegasus in the condorio data configuration. In
      this mode, all the data is staged to the remote node in the HTCondor
      pool using HTCondor file transfers, and the job is executed using
      PegasusLite.</p>
<p>A sample site catalog for submission to a CondorC enabled site
      is listed below:</p>
<pre class="programlisting">
&lt;sitecatalog xmlns="http://pegasus.isi.edu/schema/sitecatalog"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://pegasus.isi.edu/schema/sitecatalog http://pegasus.isi.edu/schema/sc-4.0.xsd"
             version="4.0"&gt;
      
    &lt;site  handle="local" arch="x86_64" os="LINUX"&gt;
        &lt;directory type="shared-scratch" path="/tmp/wf/work"&gt;
            &lt;file-server operation="all" url="file:///tmp/wf/work"/&gt;
        &lt;/directory&gt;
        &lt;directory type="local-storage" path="/tmp/wf/storage"&gt;
            &lt;file-server operation="all" url="file:///tmp/wf/storage"/&gt;
        &lt;/directory&gt;
    &lt;/site&gt;

    &lt;site  handle="condorcpool" arch="x86_64" os="LINUX"&gt;
         &lt;!-- the grid gateway entries are used to designate
              the remote schedd for the CondorC pool --&gt;
         &lt;grid type="condor" contact="ccg-condorctest.isi.edu" scheduler="Condor" jobtype="compute" /&gt;
         &lt;grid type="condor" contact="ccg-condorctest.isi.edu" scheduler="Condor" jobtype="auxillary" /&gt;
        
        &lt;!-- enable submission using CondorC --&gt;
        &lt;profile namespace="pegasus" key="style"&gt;condorc&lt;/profile&gt;

        &lt;!-- specify which HTCondor collector to use. 
             If not specified defaults to remote schedd specified in grid gateway --&gt;
        &lt;profile namespace="condor" key="condor_collector"&gt;condorc-collector.isi.edu&lt;/profile&gt;
        
        &lt;profile namespace="condor" key="should_transfer_files"&gt;Yes&lt;/profile&gt;
        &lt;profile namespace="condor" key="when_to_transfer_output"&gt;ON_EXIT&lt;/profile&gt;
        &lt;profile namespace="env" key="PEGASUS_HOME" &gt;/usr&lt;/profile&gt;
        &lt;profile namespace="condor" key="universe"&gt;vanilla&lt;/profile&gt;

    &lt;/site&gt;

&lt;/sitecatalog&gt;
</pre>
<p>To enable PegasusLite in condorio mode, users should set the
      following in their properties:</p>
<pre class="programlisting"># pegasus properties
pegasus.data.configuration    condorio</pre>
</div>
</div>
<div class="navfooter">
<hr>
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left">
<a accesskey="p" href="execution_environments.php">Prev</a> </td>
<td width="20%" align="center"><a accesskey="u" href="execution_environments.php">Up</a></td>
<td width="40%" align="right"> <a accesskey="n" href="cloud.php">Next</a>
</td>
</tr>
<tr>
<td width="40%" align="left" valign="top">Chapter 7. Execution Environments </td>
<td width="20%" align="center"><a accesskey="h" href="index.php">Table of Contents</a></td>
<td width="40%" align="right" valign="top"> 7.3. Cloud (Amazon EC2/S3, Google Cloud, ...)</td>
</tr>
</table>
</div>
</div><?php  
            do_html_footer();
        ?>
