<?php  
            require('/srv/new-pegasus.isi.edu/includes/common.php'); 
            pegasus_header("11.3. How to Scale Large Workflows");
        ?><div class="breadcrumbs">
</div><hr><div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="large_workflows"></a>11.3. How to Scale Large Workflows</h2></div></div></div>
<p><span class="emphasis"><em>Issue:</em></span> Planning and running large
    workflows raises scalability issues at two stages. During planning,
    Pegasus traverses the workflow graph multiple times, and some of the
    graph transformations can be slow when the graph has a large number of
    tasks, files, or dependencies. Once planned, large workflows can also
    hit scalability limits when interacting with the operating system. A
    common problem is too many files in a single directory, such as
    thousands or millions of input or output files.</p>
<p><span class="emphasis"><em>Solution:</em></span> The most common solution to these
    problems is to use <a class="link" href="hierarchial_workflows.php" title="11.4. Hierarchical Workflows">hierarchical
    workflows</a>, which work well if your workflow can be
    logically partitioned into smaller workflows. A hierarchical workflow
    still runs like a single workflow; the difference is that some
    jobs in the workflow are actually sub-workflows.</p>
<p>For workflows with a large number of files, you can control the
    number of files in a single directory by reorganizing the files into a
    deep directory structure.</p>
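<p>As a rough illustration of the idea of a deep directory structure, the
    sketch below maps each file name to a nested relative path derived from a
    hash of the name. This is a generic example, not Pegasus's own mechanism:
    the helper name <code>sharded_path</code> and its <code>levels</code> and
    <code>width</code> parameters are assumptions made for illustration.</p>

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(filename: str, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Return a nested relative path for `filename`.

    Hypothetical helper (not part of Pegasus): the file name is hashed and
    successive groups of `width` hex digits become directory names, so files
    spread roughly evenly over 16 ** (levels * width) leaf directories.
    """
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    # Take `levels` consecutive slices of the hex digest as directory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(*parts, filename)


if __name__ == "__main__":
    # Each name lands in a two-level directory such as "ab/cd/<name>".
    for name in ("f.a", "f.b1", "f.b2"):
        print(sharded_path(name))
```

<p>With the defaults used here (two levels of two hex digits each) there are
    65,536 leaf directories, so one million files would average about 15
    files per directory. Because the path is computed from the name alone,
    any job can locate a file without scanning directories.</p>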
</div><div class="navfooter">
</div><?php  
            pegasus_footer();
        ?>
