<?php  
            require('/srv/new-pegasus.isi.edu/includes/common.php'); 
            pegasus_header("pegasus-s3");
?><div lang="en" class="refentry">
<a name="cli-pegasus-s3"></a><div class="titlepage"></div>
<div class="refnamediv">
<h2>Name</h2>
<p>pegasus-s3 — Upload, download, delete objects in Amazon S3</p>
</div>
<div class="refsynopsisdiv">
<a name="pegasus-s3_synopsis"></a><h2>Synopsis</h2>
<div class="blockquote"><blockquote class="blockquote"><div class="literallayout"><p><span class="strong"><strong>pegasus-s3</strong></span> <span class="strong"><strong>help</strong></span><br>
<span class="strong"><strong>pegasus-s3</strong></span> <span class="strong"><strong>ls</strong></span> [options] <span class="emphasis"><em>URL</em></span><br>
<span class="strong"><strong>pegasus-s3</strong></span> <span class="strong"><strong>mkdir</strong></span> [options] <span class="emphasis"><em>URL…</em></span><br>
<span class="strong"><strong>pegasus-s3</strong></span> <span class="strong"><strong>rmdir</strong></span> [options] <span class="emphasis"><em>URL…</em></span><br>
<span class="strong"><strong>pegasus-s3</strong></span> <span class="strong"><strong>rm</strong></span> [options] [<span class="emphasis"><em>URL…</em></span>]<br>
<span class="strong"><strong>pegasus-s3</strong></span> <span class="strong"><strong>put</strong></span> [options] <span class="emphasis"><em>FILE</em></span> <span class="emphasis"><em>URL</em></span><br>
<span class="strong"><strong>pegasus-s3</strong></span> <span class="strong"><strong>get</strong></span> [options] <span class="emphasis"><em>URL</em></span> [<span class="emphasis"><em>FILE</em></span>]<br>
<span class="strong"><strong>pegasus-s3</strong></span> <span class="strong"><strong>lsup</strong></span> [options] <span class="emphasis"><em>URL</em></span><br>
<span class="strong"><strong>pegasus-s3</strong></span> <span class="strong"><strong>rmup</strong></span> [options] <span class="emphasis"><em>URL</em></span> [<span class="emphasis"><em>UPLOAD</em></span>]<br>
<span class="strong"><strong>pegasus-s3</strong></span> <span class="strong"><strong>cp</strong></span> [options] <span class="emphasis"><em>SRC…</em></span> <span class="emphasis"><em>DEST</em></span></p></div></blockquote></div>
</div>
<div class="refsect1">
<a name="pegasus-s3_description"></a><h2>Description</h2>
<p><span class="strong"><strong>pegasus-s3</strong></span> is a client for the Amazon S3 object storage service
and any other storage services that conform to the Amazon S3 API,
such as Eucalyptus Walrus.</p>
</div>
<div class="refsect1">
<a name="pegasus-s3_options"></a><h2>Options</h2>
<div class="refsect2">
<a name="pegasus-s3_global_options"></a><h3>Global Options</h3>
<div class="variablelist"><dl class="variablelist">
<dt>
<span class="term">
<span class="strong"><strong>-h</strong></span>
, </span><span class="term">
<span class="strong"><strong>--help</strong></span>
</span>
</dt>
<dd>
Show help message for subcommand and exit
</dd>
<dt>
<span class="term">
<span class="strong"><strong>-d</strong></span>
, </span><span class="term">
<span class="strong"><strong>--debug</strong></span>
</span>
</dt>
<dd>
Turn on debugging
</dd>
<dt>
<span class="term">
<span class="strong"><strong>-v</strong></span>
, </span><span class="term">
<span class="strong"><strong>--verbose</strong></span>
</span>
</dt>
<dd>
Show progress messages
</dd>
<dt>
<span class="term">
<span class="strong"><strong>-C</strong></span> <span class="emphasis"><em>FILE</em></span>
, </span><span class="term">
<span class="strong"><strong>--conf</strong></span>=<span class="emphasis"><em>FILE</em></span>
</span>
</dt>
<dd>
Path to configuration file
</dd>
</dl></div>
</div>
<div class="refsect2">
<a name="pegasus-s3_ls_options"></a><h3>ls Options</h3>
<div class="variablelist"><dl class="variablelist">
<dt>
<span class="term">
<span class="strong"><strong>-l</strong></span>
, </span><span class="term">
<span class="strong"><strong>--long</strong></span>
</span>
</dt>
<dd>
Use long listing format that includes size, etc.
</dd>
</dl></div>
</div>
<div class="refsect2">
<a name="pegasus-s3_rm_options"></a><h3>rm Options</h3>
<div class="variablelist"><dl class="variablelist">
<dt>
<span class="term">
<span class="strong"><strong>-f</strong></span>
, </span><span class="term">
<span class="strong"><strong>--force</strong></span>
</span>
</dt>
<dd>
If the URL does not exist, then ignore the error.
</dd>
<dt>
<span class="term">
<span class="strong"><strong>-F</strong></span> <span class="emphasis"><em>FILE</em></span>
, </span><span class="term">
<span class="strong"><strong>--file</strong></span>=<span class="emphasis"><em>FILE</em></span>
</span>
</dt>
<dd>
File containing a list of URLs to delete
</dd>
</dl></div>
</div>
<div class="refsect2">
<a name="pegasus-s3_put_options"></a><h3>put Options</h3>
<div class="variablelist"><dl class="variablelist">
<dt>
<span class="term">
<span class="strong"><strong>-r</strong></span>
, </span><span class="term">
<span class="strong"><strong>--recursive</strong></span>
</span>
</dt>
<dd>
Upload all files in the directory named FILE to keys with prefix URL.
</dd>
<dt>
<span class="term">
<span class="strong"><strong>-c</strong></span> <span class="emphasis"><em>X</em></span>
, </span><span class="term">
<span class="strong"><strong>--chunksize</strong></span>=<span class="emphasis"><em>X</em></span>
</span>
</dt>
<dd>
Set the chunk size for multipart uploads to X MB. A value of 0
disables multipart uploads. The default is 10 MB, the minimum is 5 MB,
and the maximum is 1024 MB. This parameter applies only to sites that
support multipart uploads (see the multipart_uploads configuration
parameter in the <a class="link" href="cli-pegasus-s3.php#CONFIGURATION" title="Configuration"><span class="strong"><strong>CONFIGURATION</strong></span></a> section). The
maximum number of chunks is 10,000, so when you upload a large file
the chunk size is automatically increased to stay within that limit.
Choose smaller values to reduce the impact of transient failures.
</dd>
<dt>
<span class="term">
<span class="strong"><strong>-p</strong></span> <span class="emphasis"><em>N</em></span>
, </span><span class="term">
<span class="strong"><strong>--parallel</strong></span>=<span class="emphasis"><em>N</em></span>
</span>
</dt>
<dd>
Use N threads to upload <span class="emphasis"><em>FILE</em></span> in parallel. The default value is 4, which
enables parallel uploads with 4 threads. This parameter is only valid if
the site supports multipart uploads and the <span class="strong"><strong>--chunksize</strong></span> parameter is not 0.
Otherwise parallel uploads are disabled.
</dd>
<dt>
<span class="term">
<span class="strong"><strong>-b</strong></span>
, </span><span class="term">
<span class="strong"><strong>--create-bucket</strong></span>
</span>
</dt>
<dd>
Create the destination bucket if it does not already exist
</dd>
</dl></div>
</div>
<div class="refsect2">
<a name="pegasus-s3_get_options"></a><h3>get Options</h3>
<div class="variablelist"><dl class="variablelist">
<dt>
<span class="term">
<span class="strong"><strong>-r</strong></span>
, </span><span class="term">
<span class="strong"><strong>--recursive</strong></span>
</span>
</dt>
<dd>
Download all keys that match URL exactly or begin with URL+"/". For example,
<span class="emphasis"><em>pegasus-s3 get -r s3://u@h/bucket/key</em></span> will match both <span class="emphasis"><em>key</em></span> and <span class="emphasis"><em>key/foo</em></span>
but not <span class="emphasis"><em>keyfoo</em></span>. Since S3 allows names to exist as both keys (the bare <span class="emphasis"><em>key</em></span>)
and folders (the <span class="emphasis"><em>key</em></span> in <span class="emphasis"><em>key/foo</em></span>), but file systems do not, you will get
an error when using <span class="strong"><strong>-r</strong></span>/<span class="strong"><strong>--recursive</strong></span> on a bucket that contains such
duplicate names. An entire bucket can be downloaded at once by specifying
only the bucket name in URL.
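<p class="simpara">The matching rule can be illustrated with a short Python sketch. The function name <code class="literal">matches_recursive</code> is hypothetical and is not part of pegasus-s3, and treating an empty prefix as "the whole bucket" is an assumption based on the last sentence above.</p>

```python
def matches_recursive(key, prefix):
    """Return True if key would be selected by a recursive get of prefix."""
    if prefix == "":
        return True  # assumption: bucket-only URL selects every key
    # Exact match, or the prefix followed by a "/" separator.
    return key == prefix or key.startswith(prefix + "/")


keys = ["key", "key/foo", "keyfoo"]
selected = [k for k in keys if matches_recursive(k, "key")]
print(selected)
```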
</dd>
<dt>
<span class="term">
<span class="strong"><strong>-c</strong></span> <span class="emphasis"><em>X</em></span>
, </span><span class="term">
<span class="strong"><strong>--chunksize</strong></span>=<span class="emphasis"><em>X</em></span>
</span>
</dt>
<dd>
Set the chunk size for parallel downloads to X MB. A value of 0
disables chunked reads. This option applies only to sites that support ranged
downloads (see the ranged_downloads configuration parameter). The default chunk
size is 10 MB, the minimum is 1 MB, and the maximum is 1024 MB. Choose smaller
values to reduce the impact of transient failures.
</dd>
<dt>
<span class="term">
<span class="strong"><strong>-p</strong></span> <span class="emphasis"><em>N</em></span>
, </span><span class="term">
<span class="strong"><strong>--parallel</strong></span>=<span class="emphasis"><em>N</em></span>
</span>
</dt>
<dd>
Use N threads to download the object at URL in parallel. The default value is 4, which
enables parallel downloads with 4 threads. This parameter is only valid
if the site supports ranged downloads and the <span class="strong"><strong>--chunksize</strong></span> parameter
is not 0. Otherwise parallel downloads are disabled.
</dd>
</dl></div>
</div>
<div class="refsect2">
<a name="pegasus-s3_rmup_options"></a><h3>rmup Options</h3>
<div class="variablelist"><dl class="variablelist">
<dt>
<span class="term">
<span class="strong"><strong>-a</strong></span>
, </span><span class="term">
<span class="strong"><strong>--all</strong></span>
</span>
</dt>
<dd>
Cancel all uploads for the specified bucket
</dd>
</dl></div>
</div>
<div class="refsect2">
<a name="pegasus-s3_cp_options"></a><h3>cp Options</h3>
<div class="variablelist"><dl class="variablelist">
<dt>
<span class="term">
<span class="strong"><strong>-c</strong></span>
, </span><span class="term">
<span class="strong"><strong>--create-dest</strong></span>
</span>
</dt>
<dd>
Create the destination bucket if it does not exist.
</dd>
<dt>
<span class="term">
<span class="strong"><strong>-r</strong></span>
, </span><span class="term">
<span class="strong"><strong>--recursive</strong></span>
</span>
</dt>
<dd>
If SRC is a bucket, copy all of the keys in that bucket to DEST. In that case
DEST must be a bucket.
</dd>
<dt>
<span class="term">
<span class="strong"><strong>-f</strong></span>
, </span><span class="term">
<span class="strong"><strong>--force</strong></span>
</span>
</dt>
<dd>
If DEST exists, then overwrite it.
</dd>
</dl></div>
</div>
</div>
<div class="refsect1">
<a name="pegasus-s3_subcommands"></a><h2>Subcommands</h2>
<p><span class="strong"><strong>pegasus-s3</strong></span> has several subcommands for different storage service operations.</p>
<div class="variablelist"><dl class="variablelist">
<dt><span class="term">
<span class="strong"><strong>help</strong></span>
</span></dt>
<dd>
The help subcommand lists all available subcommands.
</dd>
<dt><span class="term">
<span class="strong"><strong>ls</strong></span>
</span></dt>
<dd>
The <span class="strong"><strong>ls</strong></span> subcommand lists the contents of a URL. If the URL does not contain
a bucket, then all the buckets owned by the user are listed. If the URL
contains a bucket, but no key, then all the keys in the bucket are listed.
If the URL contains a bucket and a key, then all keys in the bucket that
begin with the specified key are listed.
</dd>
<dt><span class="term">
<span class="strong"><strong>mkdir</strong></span>
</span></dt>
<dd>
The <span class="strong"><strong>mkdir</strong></span> subcommand creates one or more buckets.
</dd>
<dt><span class="term">
<span class="strong"><strong>rmdir</strong></span>
</span></dt>
<dd>
The <span class="strong"><strong>rmdir</strong></span> subcommand deletes one or more buckets from the storage service.
In order to delete a bucket, the bucket must be empty.
</dd>
<dt><span class="term">
<span class="strong"><strong>rm</strong></span>
</span></dt>
<dd>
The <span class="strong"><strong>rm</strong></span> subcommand deletes one or more keys from the storage service.
</dd>
<dt><span class="term">
<span class="strong"><strong>put</strong></span>
</span></dt>
<dd>
<p class="simpara">
The <span class="strong"><strong>put</strong></span> subcommand stores the file specified by FILE in the storage service
under the bucket and key specified by URL. If the URL contains a bucket,
but not a key, then the file name is used as the key. If URL ends with a "/",
then the file name is appended to the URL to create the key name (e.g.
<span class="emphasis"><em>pegasus-s3 put foo s3://u@h/bucket/key</em></span> will create a key called "key", while
<span class="emphasis"><em>pegasus-s3 put foo s3://u@h/bucket/key/</em></span> will create a key called "key/foo").
The same is true of directories when used with the <span class="strong"><strong>-r</strong></span>/<span class="strong"><strong>--recursive</strong></span> option.
</p>
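<p class="simpara">As a rough illustration of the key naming rules above, the following Python sketch derives the destination key for a single file. The helper <code class="literal">put_key</code> is hypothetical; it only mirrors the rules as described here and is not taken from the pegasus-s3 source.</p>

```python
import os


def put_key(file_path, url_key):
    """Derive the destination key for a single-file put.

    url_key is the KEY part of the URL, or "" when only a bucket is given.
    """
    name = os.path.basename(file_path)
    if not url_key:
        return name              # bucket only: the file name becomes the key
    if url_key.endswith("/"):
        return url_key + name    # trailing slash: append the file name
    return url_key               # otherwise the key is used as given


print(put_key("foo", "key"))
print(put_key("foo", "key/"))
```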
<p class="simpara">The <span class="strong"><strong>put</strong></span> subcommand can do both chunked and parallel uploads if the service
supports multipart uploads (see <span class="strong"><strong>multipart_uploads</strong></span> in the
<a class="link" href="cli-pegasus-s3.php#CONFIGURATION" title="Configuration"><span class="strong"><strong>CONFIGURATION</strong></span></a> section). Currently only Amazon S3 supports
multipart uploads.</p>
<p class="simpara">This subcommand will check the size of the file to make sure it can
be stored before attempting to store it.</p>
<p class="simpara">Chunked uploads are useful to reduce the probability of an upload
failing. If an upload is chunked, then <span class="strong"><strong>pegasus-s3</strong></span> issues separate
PUT requests for each chunk of the file. Specifying smaller chunks
(using <span class="strong"><strong>--chunksize</strong></span>) will reduce the chances of an upload failing due
to a transient error. Chunk sizes can range from 5 MB to 1 GB (chunk
sizes smaller than 5 MB produced incomplete uploads on Amazon S3).
The maximum number of chunks for any single file is 10,000, so if a
large file is being uploaded with a small chunk size, then the chunk size
will be increased to fit within the 10,000 chunk limit. By default,
the file will be split into 10 MB chunks if the storage service
supports multipart uploads. Chunked uploads can be disabled by specifying
a chunk size of 0. If the upload is chunked, then each chunk is retried
independently under transient failures. If any chunk fails permanently,
then the upload is aborted.</p>
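<p class="simpara">The automatic adjustment can be sketched in Python as follows. This is only an illustration of the arithmetic: the function names are hypothetical, rounding up to a whole number of MB is an assumption, and validation of the 5 MB to 1 GB range is omitted.</p>

```python
MB = 1024 * 1024
MAX_CHUNKS = 10000


def effective_chunksize_mb(file_size, chunksize_mb=10):
    """Return the chunk size in MB after applying the 10,000-chunk limit."""
    if chunksize_mb == 0:
        return 0  # chunked uploads disabled
    # Smallest whole-MB chunk size that keeps the chunk count within the limit
    # (integer ceiling division; whole-MB rounding is an assumption).
    needed = -(-file_size // (MAX_CHUNKS * MB))
    return max(chunksize_mb, needed)


def chunk_count(file_size, chunksize_mb):
    """Number of chunks needed for file_size bytes (ceiling division)."""
    return -(-file_size // (chunksize_mb * MB))


size = 200 * 1024 * MB  # a 200 GB file
adjusted = effective_chunksize_mb(size, 10)
print(adjusted, chunk_count(size, adjusted))
```

<p class="simpara">With the default 10 MB chunk size a 200 GB file would need more than 10,000 chunks, so the sketch raises the chunk size until the count fits.</p>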
<p class="simpara">Parallel uploads can increase performance for services that support
multipart uploads. In a parallel upload the file is split into N
chunks and each chunk is uploaded concurrently by one of M threads
in first-come, first-served fashion. If the chunksize is set to 0,
then parallel uploads are disabled. If M &gt; N, then the actual number
of threads used will be reduced to N. The number of threads can be
specified using the --parallel argument. If --parallel is 1,
then only a single thread is used. The default value is 4. There is
no maximum number of threads, but it is likely that the link will
be saturated by 4 to 8 threads.</p>
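<p class="simpara">The scheduling described above resembles a bounded thread pool. The Python sketch below uses the standard <code class="literal">concurrent.futures</code> module to show the idea; <code class="literal">run_chunks</code> and the <code class="literal">worker</code> callback are hypothetical stand-ins for the real upload logic, which is not reproduced here.</p>

```python
from concurrent.futures import ThreadPoolExecutor


def run_chunks(chunks, worker, parallel=4):
    """Process N chunks with M threads, using no more threads than chunks."""
    threads = max(1, min(parallel, len(chunks)))
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Queued chunks are picked up by whichever thread is free;
        # map() returns the results in chunk order.
        results = list(pool.map(worker, chunks))
    return threads, results


# Stand-in worker: report the size of each chunk instead of uploading it.
threads, sizes = run_chunks([b"aaaa", b"bb", b"c"], len, parallel=8)
print(threads, sizes)
```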
<p class="simpara">Under certain circumstances, when a multipart upload fails it could
leave behind data on the server. When a failure occurs the <span class="strong"><strong>put</strong></span>
subcommand will attempt to abort the upload. If the upload cannot be
aborted, then a partial upload may remain on the server. To check
for partial uploads run the <span class="strong"><strong>lsup</strong></span> subcommand. If you see an upload
that failed in the output of <span class="strong"><strong>lsup</strong></span>, then run the <span class="strong"><strong>rmup</strong></span> subcommand
to remove it.</p>
</dd>
<dt><span class="term">
<span class="strong"><strong>get</strong></span>
</span></dt>
<dd>
<p class="simpara">
The <span class="strong"><strong>get</strong></span> subcommand retrieves an object from the storage service
identified by URL and stores it in the file specified by FILE. If
FILE is not specified, then the part of the key after the last "/"
is used as the file/directory name, and the results are placed in the
current working directory. If FILE ends with a "/", then the last
component of the key name is appended to FILE to create the output
path (e.g. <span class="emphasis"><em>pegasus-s3 get s3://u@h/bucket/key /tmp/</em></span> will create a
file called <span class="emphasis"><em>/tmp/key</em></span> while <span class="emphasis"><em>pegasus-s3 get s3://u@h/bucket/key /tmp/foo</em></span>
will put the contents of <span class="emphasis"><em>key</em></span> in a file called <span class="emphasis"><em>/tmp/foo</em></span>). The same
is true of folders/directories with the <span class="strong"><strong>-r</strong></span>/<span class="strong"><strong>--recursive</strong></span> option.
</p>
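<p class="simpara">The output path rules for a single object can be sketched in Python like this. The helper <code class="literal">get_path</code> is hypothetical and only restates the rules in this paragraph.</p>

```python
def get_path(key, file_arg=None):
    """Derive the local output path for a single-object get."""
    name = key.rstrip("/").rsplit("/", 1)[-1]  # part after the last "/"
    if not file_arg:
        return name                # no FILE: write into the current directory
    if file_arg.endswith("/"):
        return file_arg + name     # FILE ends with "/": append the key name
    return file_arg                # otherwise FILE is used as given


print(get_path("key", "/tmp/"))
print(get_path("key", "/tmp/foo"))
```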
<p class="simpara">The <span class="strong"><strong>get</strong></span> subcommand can do both chunked and parallel downloads if the
service supports ranged downloads (see <span class="strong"><strong>ranged_downloads</strong></span> in the
<a class="link" href="cli-pegasus-s3.php#CONFIGURATION" title="Configuration"><span class="strong"><strong>CONFIGURATION</strong></span></a> section). Currently only Amazon
S3 has good support for ranged downloads. Eucalyptus Walrus supports
ranged downloads, but version 1.6 is inconsistent with
the Amazon interface and has a bug that causes ranged downloads to hang
in some cases. It is recommended that ranged downloads not be used with
Walrus 1.6.</p>
<p class="simpara">Chunked downloads can be used to reduce the probability of a
download failing. When a download is chunked, <span class="strong"><strong>pegasus-s3</strong></span> issues
separate GET requests for each chunk of the file. Specifying
smaller chunks (using <span class="strong"><strong>--chunksize</strong></span>) will reduce the chances that
a download will fail due to a transient error. Chunk sizes can
range from 1 MB to 1 GB. By default, a download will be split
into 10 MB chunks if the site supports ranged downloads. Chunked
downloads can be disabled by specifying a <span class="strong"><strong>--chunksize</strong></span> of 0. If
a download is chunked, then each chunk is retried independently
under transient failures. If any chunk fails permanently, then
the download is aborted.</p>
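<p class="simpara">To make the chunking concrete, the Python sketch below computes the byte ranges that separate GET requests could ask for, written as standard HTTP <code class="literal">Range</code> header values. It assumes that ranged downloads are expressed this way; the function name <code class="literal">byte_ranges</code> is hypothetical.</p>

```python
MB = 1024 * 1024


def byte_ranges(object_size, chunksize_mb=10):
    """Return HTTP Range header values covering object_size bytes."""
    if chunksize_mb == 0 or object_size == 0:
        return []  # no chunking: fetch with a single plain GET
    step = chunksize_mb * MB
    ranges = []
    for start in range(0, object_size, step):
        end = min(start + step, object_size) - 1  # Range ends are inclusive
        ranges.append("bytes=%d-%d" % (start, end))
    return ranges


for value in byte_ranges(25 * MB, 10):
    print(value)
```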
<p class="simpara">Parallel downloads can increase performance for services that
support ranged downloads. In a parallel download, the file to
be retrieved is split into N chunks and each chunk is downloaded
concurrently by one of M threads in a first-come, first-served
fashion. If the chunksize is 0, then parallel downloads are
disabled. If M &gt; N, then the actual number of threads used will
be reduced to N. The number of threads can be specified using the
--parallel argument. If --parallel is 1, then only a single
thread is used. The default value is 4. There is no maximum number
of threads, but it is likely that the link will be saturated by
4 to 8 threads.</p>
</dd>
<dt><span class="term">
<span class="strong"><strong>lsup</strong></span>
</span></dt>
<dd>
<p class="simpara">
The <span class="strong"><strong>lsup</strong></span> subcommand lists active multipart uploads. The URL
specified should point to a bucket. This command is only valid
if the site supports multipart uploads. The output of this command
is a list of keys and upload IDs.
</p>
<p class="simpara">This subcommand is used with <span class="strong"><strong>rmup</strong></span> to help recover from failures
of multipart uploads.</p>
</dd>
<dt><span class="term">
<span class="strong"><strong>rmup</strong></span>
</span></dt>
<dd>
<p class="simpara">
The <span class="strong"><strong>rmup</strong></span> subcommand cancels an active upload. The URL specified
should point to a bucket, and UPLOAD is the long, complicated upload
ID shown by the <span class="strong"><strong>lsup</strong></span> subcommand.
</p>
<p class="simpara">This subcommand is used with <span class="strong"><strong>lsup</strong></span> to recover from failures of
multipart uploads.</p>
</dd>
<dt><span class="term">
<span class="strong"><strong>cp</strong></span>
</span></dt>
<dd>
The <span class="strong"><strong>cp</strong></span> subcommand copies keys on the server. Keys cannot be copied
between accounts.
</dd>
</dl></div>
</div>
<div class="refsect1">
<a name="pegasus-s3_url_format"></a><h2>URL Format</h2>
<p>All URLs for objects stored in S3 should be specified in the
following format:</p>
<pre class="screen">s3[s]://USER@SITE[/BUCKET[/KEY]]</pre>
<p>The protocol part can be <span class="emphasis"><em>s3://</em></span> or <span class="emphasis"><em>s3s://</em></span>. If <span class="emphasis"><em>s3s://</em></span> is used, then <span class="strong"><strong>pegasus-s3</strong></span>
will force the connection to use SSL and override the setting in the
configuration file. If <span class="emphasis"><em>s3://</em></span> is used, then whether the connection uses SSL or
not is determined by the value of the <span class="emphasis"><em>endpoint</em></span> variable in the configuration
for the site.</p>
<p>The <span class="emphasis"><em>USER@SITE</em></span> part is required, but the <span class="emphasis"><em>BUCKET</em></span> and <span class="emphasis"><em>KEY</em></span> parts may be optional
depending on the context.</p>
<p>The <span class="emphasis"><em>USER@SITE</em></span> portion is referred to as the “identity”, and the <span class="emphasis"><em>SITE</em></span> portion
is referred to as the “site”. Both the identity and the site are looked up in
the configuration file (see <a class="link" href="cli-pegasus-s3.php#CONFIGURATION" title="Configuration"><span class="strong"><strong>CONFIGURATION</strong></span></a>) to determine
the parameters to use when establishing a connection to the service. The site
portion is used to find the host and port, whether to use SSL, and other
things. The identity portion is used to determine which authentication
tokens to use. This format is designed to enable users to easily use
multiple services with multiple authentication tokens. Note that neither
the <span class="emphasis"><em>USER</em></span> nor the <span class="emphasis"><em>SITE</em></span> portion of the URL has any meaning outside of
<span class="strong"><strong>pegasus-s3</strong></span>. They do not refer to real usernames or hostnames, but are
rather handles used to look up configuration values in the configuration
file.</p>
<p>The BUCKET portion of the URL is the part between the 3rd and 4th slashes.
Buckets are part of a global namespace that is shared with other users of
the storage service. As such, they should be unique.</p>
<p>The KEY portion of the URL is anything after the 4th slash. Keys can
include slashes, but S3-like storage services do not have the concept of
a directory like regular file systems. Instead, keys are treated like opaque
identifiers for individual objects. So, for example, the keys <span class="emphasis"><em>a/b</em></span> and <span class="emphasis"><em>a/c</em></span>
have a common prefix, but cannot be said to be in the same <span class="emphasis"><em>directory</em></span>.</p>
<p>Some example URLs are:</p>
<pre class="screen">s3://ewa@amazon
s3://juve@skynet/gideon.isi.edu
s3://juve@magellan/pegasus-images/centos-5.5-x86_64-20101101.part.1
s3s://ewa@amazon/pegasus-images/data.tar.gz</pre>
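<p>The URL layout can be illustrated with a small Python parser. This is a minimal sketch of the format described in this section, not the parser used by pegasus-s3; the function name <code class="literal">parse_url</code> and the returned field names are hypothetical.</p>

```python
def parse_url(url):
    """Split s3[s]://USER@SITE[/BUCKET[/KEY]] into its parts."""
    scheme, sep, rest = url.partition("://")
    if sep == "" or scheme not in ("s3", "s3s"):
        raise ValueError("URL must start with s3:// or s3s://: %r" % url)
    # Everything before the 3rd slash is the identity (USER@SITE).
    identity, _, path = rest.partition("/")
    user, at, site = identity.partition("@")
    if at == "" or user == "" or site == "":
        raise ValueError("URL is missing USER@SITE: %r" % url)
    # BUCKET sits between the 3rd and 4th slashes; KEY may contain slashes.
    bucket, _, key = path.partition("/")
    return {
        "force_ssl": scheme == "s3s",
        "identity": identity,
        "site": site,
        "bucket": bucket or None,
        "key": key or None,
    }


print(parse_url("s3://juve@magellan/pegasus-images/centos-5.5-x86_64-20101101.part.1"))
```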
</div>
<div class="refsect1">
<a name="CONFIGURATION"></a><h2>Configuration</h2>
<p>Each user should specify a configuration file that <span class="strong"><strong>pegasus-s3</strong></span> will use to
look up connection parameters and authentication tokens.</p>
<div class="refsect2">
<a name="pegasus-s3_search_path"></a><h3>Search Path</h3>
<p>This client will look in the following locations, in order, to locate the
user’s configuration file:</p>
<div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem">
The -C/--conf argument
</li>
<li class="listitem">
The S3CFG environment variable
</li>
<li class="listitem">
$HOME/.pegasus/s3cfg
</li>
<li class="listitem">
$HOME/.s3cfg
</li>
</ol></div>
<p>If it does not find the configuration file in one of these locations it
will fail with an error. The $HOME/.s3cfg location is only supported for
backward-compatibility. $HOME/.pegasus/s3cfg should be used instead.</p>
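<p>The search order can be sketched in Python as shown below. The function <code class="literal">find_config</code> is hypothetical. In particular, this sketch assumes that a location naming a missing file is skipped and the next one is tried; the text above does not say whether pegasus-s3 behaves that way for an explicit -C/--conf or S3CFG value.</p>

```python
import os


def find_config(conf_arg=None, environ=None, home=None):
    """Return the first existing configuration file in the documented order."""
    environ = os.environ if environ is None else environ
    home = os.path.expanduser("~") if home is None else home
    candidates = [
        conf_arg,                                  # 1. -C/--conf argument
        environ.get("S3CFG"),                      # 2. S3CFG environment variable
        os.path.join(home, ".pegasus", "s3cfg"),   # 3. preferred location
        os.path.join(home, ".s3cfg"),              # 4. backward compatibility only
    ]
    for path in candidates:
        if path and os.path.isfile(path):
            return path
    raise RuntimeError("no pegasus-s3 configuration file found")
```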
</div>
<div class="refsect2">
<a name="pegasus-s3_configuration_file_format"></a><h3>Configuration File Format</h3>
<p>The configuration file is in INI format and contains two types of entries.</p>
<p>The first type of entry is a site entry, which specifies the configuration
for a storage service. This entry specifies the service endpoint that
<span class="strong"><strong>pegasus-s3</strong></span> should connect to for the site, and some optional features
that the site may support. Here is an example of a site entry for Amazon S3:</p>
<pre class="screen">[amazon]
endpoint = http://s3.amazonaws.com/</pre>
<p>The other type of entry is an identity entry, which specifies the
authentication information for a user at a particular site. Here is an example
of an identity entry:</p>
<pre class="screen">[pegasus@amazon]
access_key = 90c4143642cb097c88fe2ec66ce4ad4e
secret_key = a0e3840e5baee6abb08be68e81674dca</pre>
<p>It is important to note that user names and site names used are only
logical—they do not correspond to actual hostnames or usernames, but
are simply used as a convenient way to refer to the services and
identities used by the client.</p>
<p>The configuration file should be saved with limited permissions. Only the
owner of the file should be able to read from it and write to it (i.e. it
should have permissions of 0600 or 0400). If the file has more liberal
permissions, then <span class="strong"><strong>pegasus-s3</strong></span> will fail with an error message. The purpose of
this is to prevent the authentication tokens stored in the configuration file
from being accessed by other users.</p>
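<p>A permission check of this kind can be sketched in Python with the standard <code class="literal">os</code> and <code class="literal">stat</code> modules. The function <code class="literal">permissions_ok</code> is hypothetical, and the exact test that pegasus-s3 applies is an assumption: this sketch simply rejects any file that grants access to the group or to other users.</p>

```python
import os
import stat


def permissions_ok(path):
    """Return True if neither group nor other users have any access."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # The low six permission bits belong to group and other; for modes
    # such as 0600 or 0400 they are all zero.
    return mode % 0o100 == 0
```

<p>On a POSIX system, a file that fails this check can be fixed with <code class="literal">chmod 600</code>.</p>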
</div>
<div class="refsect2">
<a name="pegasus-s3_configuration_variables"></a><h3>Configuration Variables</h3>
<div class="variablelist"><dl class="variablelist">
<dt><span class="term">
<span class="strong"><strong>endpoint</strong></span> (site)
</span></dt>
<dd>
The URL of the web service endpoint. If the URL begins with <span class="emphasis"><em>https</em></span>, then SSL will be used.
</dd>
<dt><span class="term">
<span class="strong"><strong>max_object_size</strong></span> (site)
</span></dt>
<dd>
The maximum size of an object in GB (default: 5GB)
</dd>
<dt><span class="term">
<span class="strong"><strong>multipart_uploads</strong></span> (site)
</span></dt>
<dd>
Does the service support multipart uploads? (True/False, default: False)
</dd>
<dt><span class="term">
<span class="strong"><strong>ranged_downloads</strong></span> (site)
</span></dt>
<dd>
Does the service support ranged downloads? (True/False, default: False)
</dd>
<dt><span class="term">
<span class="strong"><strong>access_key</strong></span> (identity)
</span></dt>
<dd>
The access key for the identity
</dd>
<dt><span class="term">
<span class="strong"><strong>secret_key</strong></span> (identity)
</span></dt>
<dd>
The secret key for the identity
</dd>
</dl></div>
</div>
<div class="refsect2">
<a name="pegasus-s3_example_configuration"></a><h3>Example Configuration</h3>
<p>This is an example configuration that specifies two sites (amazon and
magellan) and three identities (<code class="literal">pegasus@amazon</code>, <code class="literal">juve@magellan</code>, and
<code class="literal">voeckler@magellan</code>). For the amazon site the maximum object size
is 5 TB, and the site supports both multipart uploads and ranged downloads,
so both uploads and downloads can be done in parallel.</p>
<pre class="screen">[amazon]
endpoint = https://s3.amazonaws.com/
max_object_size = 5120
multipart_uploads = True
ranged_downloads = True

[pegasus@amazon]
access_key = 90c4143642cb097c88fe2ec66ce4ad4e
secret_key = a0e3840e5baee6abb08be68e81674dca

[magellan]
# NERSC Magellan is a Eucalyptus site. It doesn't support multipart uploads,
# or ranged downloads (the defaults), and the maximum object size is 5GB
# (also the default)
endpoint = https://128.55.69.235:8773/services/Walrus

[juve@magellan]
access_key = quwefahsdpfwlkewqjsdoijldsdf
secret_key = asdfa9wejalsdjfljasldjfasdfa

[voeckler@magellan]
# Each site can have multiple associated identities
access_key = asdkfaweasdfbaeiwhkjfbaqwhei
secret_key = asdhfuinakwjelfuhalsdflahsdl</pre>
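<p>Because the file is plain INI, a configuration like the one above can be read with Python's standard <code class="literal">configparser</code> module. The sketch below is illustrative only: <code class="literal">load_config</code> is a hypothetical helper, and telling identity entries from site entries by the presence of "@" in the section name is an assumption based on the naming shown in this section. The fallback values are the documented defaults.</p>

```python
import configparser


def load_config(text):
    """Split an s3cfg-style INI document into site and identity entries."""
    # Interpolation is disabled so that "%" in a key is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    sites, identities = {}, {}
    for name in parser.sections():
        section = parser[name]
        if "@" in name:  # assumption: USER@SITE names mark identity entries
            identities[name] = {
                "access_key": section["access_key"],
                "secret_key": section["secret_key"],
            }
        else:
            sites[name] = {
                "endpoint": section["endpoint"],
                "max_object_size": section.getint("max_object_size", fallback=5),
                "multipart_uploads": section.getboolean("multipart_uploads", fallback=False),
                "ranged_downloads": section.getboolean("ranged_downloads", fallback=False),
            }
    return sites, identities
```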
</div>
</div>
<div class="refsect1">
<a name="pegasus-s3_example"></a><h2>Example</h2>
<p>List all buckets owned by identity <span class="emphasis"><em>user@amazon</em></span>:</p>
<pre class="screen">$ pegasus-s3 ls s3://user@amazon</pre>
<p>List the contents of bucket <span class="emphasis"><em>bar</em></span> for identity <span class="emphasis"><em>user@amazon</em></span>:</p>
<pre class="screen">$ pegasus-s3 ls s3://user@amazon/bar</pre>
<p>List all objects in bucket <span class="emphasis"><em>bar</em></span> that start with <span class="emphasis"><em>hello</em></span>:</p>
<pre class="screen">$ pegasus-s3 ls s3://user@amazon/bar/hello</pre>
<p>Create a bucket called <span class="emphasis"><em>mybucket</em></span> for identity <span class="emphasis"><em>user@amazon</em></span>:</p>
<pre class="screen">$ pegasus-s3 mkdir s3://user@amazon/mybucket</pre>
<p>Delete a bucket called <span class="emphasis"><em>mybucket</em></span>:</p>
<pre class="screen">$ pegasus-s3 rmdir s3://user@amazon/mybucket</pre>
<p>Upload a file <span class="emphasis"><em>foo</em></span> to bucket <span class="emphasis"><em>bar</em></span>:</p>
<pre class="screen">$ pegasus-s3 put foo s3://user@amazon/bar/foo</pre>
<p>Download an object <span class="emphasis"><em>foo</em></span> in bucket <span class="emphasis"><em>bar</em></span>:</p>
<pre class="screen">$ pegasus-s3 get s3://user@amazon/bar/foo foo</pre>
<p>Upload a file in parallel with 4 threads and 100MB chunks:</p>
<pre class="screen">$ pegasus-s3 put --parallel 4 --chunksize 100 foo s3://user@amazon/bar/foo</pre>
<p>Download an object in parallel with 4 threads and 100MB chunks:</p>
<pre class="screen">$ pegasus-s3 get --parallel 4 --chunksize 100 s3://user@amazon/bar/foo foo</pre>
<p>List all partial uploads for bucket <span class="emphasis"><em>bar</em></span>:</p>
<pre class="screen">$ pegasus-s3 lsup s3://user@amazon/bar</pre>
<p>Remove all partial uploads for bucket <span class="emphasis"><em>bar</em></span>:</p>
<pre class="screen">$ pegasus-s3 rmup --all s3://user@amazon/bar</pre>
</div>
<div class="refsect1">
<a name="pegasus-s3_return_value"></a><h2>Return Value</h2>
<p><span class="strong"><strong>pegasus-s3</strong></span> returns a zero exit status if the operation is successful. A
non-zero exit status is returned in case of failure.</p>
</div>
<div class="refsect1">
<a name="pegasus-s3_author"></a><h2>Author</h2>
<p>Gideon Juve <code class="literal">&lt;gideon@isi.edu&gt;</code></p>
<p>Pegasus Team <a class="ulink" href="http://pegasus.isi.edu" target="_top">http://pegasus.isi.edu</a></p>
</div>
</div><?php  
            pegasus_footer();
        ?>
