
Sync files from Linux to Amazon S3 – CloudSavvy IT




AWS S3 is Amazon's cloud storage service, which allows you to store individual files as objects in a bucket. You can upload files from the command line on your Linux server or even sync entire folders with S3.

If you only want to share files between EC2 instances, you can use an EFS volume and mount it directly on multiple servers, cutting out the "cloud" entirely. But you shouldn't use it for everything, because it is much more expensive than S3, even with Infrequent Access enabled.

Restrict S3 access to an IAM user

Your server probably does not need full root access to your AWS account, so before doing any kind of file synchronization you should create a new IAM user for your server to use. With an IAM user, you can restrict your server to managing only your S3 buckets.

Create a new user from the IAM Management Console and enable "Programmatic Access".

You will be asked to choose permissions for this user. Create a new group and assign it the "AmazonS3FullAccess" permission.


You will be given an access key and a secret key. Note these down; you will need them to authenticate your server.

You can also manually assign more detailed S3 permissions, such as permission to use a specific bucket or only to upload files, but limiting access to S3 alone should be fine in most cases.
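If the full-access group is broader than you need, a bucket-scoped policy might look like the sketch below. The bucket name my-backup-bucket, the file name s3-sync-policy.json, and the user and policy names in the closing comment are placeholders, not values from this article. The action names are standard S3 IAM actions.

```shell
#!/bin/sh
# Hypothetical names: replace with your own bucket and output file.
BUCKET="my-backup-bucket"
POLICY_FILE="s3-sync-policy.json"

# Write a policy that allows listing one bucket and reading/writing its objects.
cat > "$POLICY_FILE" <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::${BUCKET}"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::${BUCKET}/*"
    }
  ]
}
EOF

echo "Wrote $POLICY_FILE"
# To attach it from an admin shell (not from the restricted server), something like:
#   aws iam put-user-policy --user-name s3-sync-user \
#     --policy-name s3-sync-only --policy-document "file://$POLICY_FILE"
```

Treat this as a starting point: a sync that deletes remote files would also need s3:DeleteObject, and some tools may request further actions.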

File synchronization with s3cmd

s3cmd is a utility designed to make working with S3 from the command line easier. It is not part of the AWS CLI, so you have to install it separately from your distro's package manager. For Debian-based systems like Ubuntu, that would be:

  sudo apt-get install s3cmd 

Once s3cmd is installed, you need to link it to the IAM user you created to manage S3. Run the configuration with:

  s3cmd --configure 

You will be asked for the access key and secret key that the IAM Management Console gave you. Paste them here. There are a few more options, such as changing the endpoints for S3 or enabling encryption, but you can leave them all at their defaults and just select "Y" at the end to save the configuration.

To upload a file, use:

  s3cmd put file s3://bucket

Replace "bucket" with your bucket name. To download a file again, run:

  s3cmd get s3://bucket/remotefile localfile

And if you want to sync across an entire directory, run:

  s3cmd sync directory s3://bucket/

This copies the entire directory to a folder in S3. The next time you run it, only the files that have changed since the last run are copied. It will not delete remote files unless you run it with the --delete-removed option.
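Because a sync that deletes remote files is hard to undo, it can help to preview it first. The sketch below wraps s3cmd's --dry-run and --delete-removed options in two small functions. The function names and the S3CMD override variable are conveniences of this example, not part of s3cmd.

```shell
#!/bin/sh
# S3CMD can be overridden to substitute another command for testing;
# that override belongs to this sketch, not to s3cmd itself.
S3CMD="${S3CMD:-s3cmd}"

preview_sync() {
    # $1 = local directory, $2 = destination such as s3://bucket/
    # --dry-run only lists what would be uploaded or deleted.
    "$S3CMD" sync --dry-run --delete-removed "$1" "$2"
}

apply_sync() {
    # Same arguments as preview_sync, but actually performs the sync.
    "$S3CMD" sync --delete-removed "$1" "$2"
}
```

After sourcing the file, you would run preview_sync directory s3://bucket/ and check the output, then apply_sync directory s3://bucket/ once it looks right.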

s3cmd sync does not run automatically, so if you want to keep this folder up to date, you need to run the command on a schedule. You can automate this with cron: open your crontab with crontab -e and add this line to the end:

  0 0 * * * s3cmd sync directory s3://bucket > /dev/null 2>&1

This will sync "directory" with "bucket" once a day. By the way, if you get stuck in vim after running crontab -e, you can change the default text editor with export VISUAL=nano, or whatever editor you prefer.
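The cron entry can also be installed without opening an editor. The sketch below is one possible variant: it builds the line in a variable, wraps the job in flock so that a slow sync cannot overlap the next run, and appends output to a log file instead of discarding it. The lock file path, the log path, and the function name are assumptions of this example.

```shell
#!/bin/sh
# Assumed paths: adjust the lock file and log location for your server.
SCHEDULE="0 0 * * *"    # every day at midnight
JOB="flock -n /tmp/s3sync.lock s3cmd sync directory s3://bucket/"
CRON_LINE="$SCHEDULE $JOB >> \$HOME/s3sync.log 2>&1"

install_cron_line() {
    # Keep existing entries, drop any earlier copy of this job, then append it.
    { crontab -l 2>/dev/null | grep -vF "$JOB"; echo "$CRON_LINE"; } | crontab -
}

# Print the line so it can be reviewed before installing it.
echo "$CRON_LINE"
```

Running the script only prints the line. Calling install_cron_line afterwards replaces the current user's crontab with the existing entries plus this one.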

s3cmd has many subcommands; you can copy between buckets with cp, move files with mv, and even create and delete buckets from the command line with mb and rb, respectively. For a complete list, run s3cmd -h.

Another option: AWS CLI

Beyond s3cmd, there are a few other command-line options for synchronizing files to S3. AWS provides its own tooling in the AWS CLI. You need Python 3, and you can install the CLI with pip3:

  pip3 install awscli --upgrade --user 

This installs the aws command, which you can use to interact with AWS services. You need to configure it just as you did s3cmd, which you can do with:

  aws configure 

You will be asked to enter the access key and secret key for your IAM user.

The syntax for the AWS CLI is similar to s3cmd. To upload a file, use:

  aws s3 cp file s3://bucket

To sync an entire folder, use:

  aws s3 sync folder s3://bucket

You can copy and even sync between buckets with the same commands. Run aws help for a complete list of commands, or read the command reference on the AWS website.
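As a rough illustration of how the same command covers several cases, the sketch below wraps aws s3 sync in a function that passes extra options through. The function name and the AWS override variable are inventions of this example. The options shown in the comments (--dryrun, --delete, --exclude) are standard aws s3 sync options.

```shell
#!/bin/sh
# AWS can be overridden to substitute another command for testing;
# the override belongs to this sketch, not to the AWS CLI.
AWS="${AWS:-aws}"

sync_folder() {
    # $1 = source, $2 = destination; either may be a local path or an s3:// URI.
    # Any further arguments are passed to aws s3 sync unchanged, for example:
    #   --dryrun            show what would happen without transferring anything
    #   --delete            remove destination files that no longer exist in the source
    #   --exclude "*.tmp"   skip files matching a pattern
    src=$1
    dest=$2
    shift 2
    "$AWS" s3 sync "$src" "$dest" "$@"
}
```

For example, sync_folder folder s3://bucket --dryrun previews an upload, and sync_folder s3://bucket s3://other-bucket syncs one bucket to another (other-bucket being a placeholder name).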

Full Backups: Restic, Duplicity

If you want to make large backups, you might want something more capable than a simple sync tool. If you sync to S3 with s3cmd or the AWS CLI, any changes you make will overwrite the files already in the bucket. Since the main concern with cloud file storage is usually not disk failure but accidental deletion with no revision history to fall back on, this is a problem.

S3 supports object versioning, which partly solves this problem, but you may still want to use a more powerful backup program to handle it yourself, especially if you're making full-drive backups.
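If you go the versioning route, it is switched on per bucket. The sketch below wraps the two relevant aws s3api calls in functions. The function names and the AWS override variable are made up for this example, and "bucket" stands for your own bucket name.

```shell
#!/bin/sh
# AWS can be overridden to substitute another command for testing (sketch-only convenience).
AWS="${AWS:-aws}"

enable_versioning() {
    # $1 = bucket name. After this, overwritten or deleted objects keep their older versions.
    "$AWS" s3api put-bucket-versioning --bucket "$1" \
        --versioning-configuration Status=Enabled
}

list_versions() {
    # $1 = bucket name, $2 = key prefix. Lists every stored version under that prefix.
    "$AWS" s3api list-object-versions --bucket "$1" --prefix "$2"
}
```

Old versions still count toward storage costs, so you may want to pair this with a lifecycle rule that expires them.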

Duplicity is a simple utility that backs up files in the form of encrypted TAR volumes. The first archive is a full backup, and all subsequent archives are incremental, saving only the changes made since the last archive.

This is very efficient, but restoring from a backup is less efficient, as the recovery process will have to follow the series of changes to arrive at the final state of the data. Restic solves this problem by storing data in deduplicated encrypted blocks and keeping a snapshot of each version for recovery. In this way, the current status of the files is easy to consult and any revision is still accessible.

Both tools can be configured to work with AWS S3, as well as several other storage providers. Alternatively, if you only want to back up EBS-backed EC2 instances, you can use incremental EBS snapshots, although they are more expensive than backing up to S3 manually.
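For Restic, a basic S3 workflow might look like the sketch below. The repository URL uses Restic's S3 backend format with "bucket" as a placeholder. The function names and the RESTIC override variable are inventions of this example, and the credentials are left for you to fill in.

```shell
#!/bin/sh
# RESTIC can be overridden to substitute another command for testing (sketch-only convenience).
RESTIC="${RESTIC:-restic}"
REPO="s3:s3.amazonaws.com/bucket"    # placeholder bucket name

# Restic reads its credentials from the environment. Set these before use:
#   export AWS_ACCESS_KEY_ID=<access key of the IAM user>
#   export AWS_SECRET_ACCESS_KEY=<secret key of the IAM user>
#   export RESTIC_PASSWORD=<encryption password for the repository>

init_repo()      { "$RESTIC" -r "$REPO" init; }                           # create the repository once
backup_dir()     { "$RESTIC" -r "$REPO" backup "$1"; }                    # snapshot a directory
list_snapshots() { "$RESTIC" -r "$REPO" snapshots; }                      # show stored snapshots
restore_latest() { "$RESTIC" -r "$REPO" restore latest --target "$1"; }   # restore newest snapshot
```

A typical sequence would be init_repo once, then backup_dir /path/to/data on a schedule, and restore_latest /tmp/restore when you need the files back. Both paths here are examples only.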

