Quick Start: Spring Security 5 OAuth2 Login

Social logins using OAuth2 have become an industry standard. OAuth2 revolutionized the way sites share data and has allowed users to quickly access new applications without having to create a new set of credentials. This article explains why OAuth2 was invented and provides a working example of a Spring Security 5 application integrated with Google.

The source code for this tutorial is available on Github.

But what did sites do before OAuth2?

Before OAuth2, sites did some pretty scary things to get users' data from other applications. For example, requiring your login credentials to pull your contacts from another application. The classic example of this is Yelp in 2008, which would ask for your login credentials to gather your contacts from MSN, Yahoo, AOL, and Gmail. This was a huge security risk because it meant you were giving your password to Yelp, and they were simply promising they wouldn't do anything bad with your account.

Example of Yelp before OAuth2

But how is OAuth2 going to solve this problem?

Have you ever created an account on a site using your Facebook, Google, or Microsoft account? These are examples of OAuth2 providers. Using one of these providers we can register our application to allow users to sign in with their existing accounts, while at the same time not compromising their credentials. Essentially we are offloading all of the authentication to the provider.

With Spring Security 5 this process could not be simpler to implement. In this quickstart tutorial we will see how to quickly set up a Spring Boot app that uses Spring Security 5 to authenticate users with Google.

Begin by creating a new Spring Boot project. Because we want this to have a RESTful interface I am including the "Spring Web" dependency, and of course the "Spring Security" dependency, as this gives us the OAuth2 client libraries. I am using version 2.2.6.RELEASE for the example:

<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-security</artifactId>
</dependency>

<!-- oauth2 -->
<dependency>
	<groupId>org.springframework.security</groupId>
	<artifactId>spring-security-oauth2-client</artifactId>
	<version>5.3.0.RELEASE</version>
</dependency>
<dependency>
	<groupId>org.springframework.security</groupId>
	<artifactId>spring-security-oauth2-jose</artifactId>
	<version>5.3.0.RELEASE</version>
</dependency>

<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-web</artifactId>
</dependency>

I also include the Swagger dependencies. Swagger is a tool that lets you document your API endpoints. It comes with a nice user interface, and you will see us use it later on to test the RESTful endpoints.

<dependency>
	<groupId>io.springfox</groupId>
	<artifactId>springfox-swagger2</artifactId>
	<version>2.9.2</version>
</dependency>
<dependency>
	<groupId>io.springfox</groupId>
	<artifactId>springfox-swagger-ui</artifactId>
	<version>2.9.2</version>
</dependency>

Creating an app in Google Cloud

Now that we have all our dependencies set up, we must register our application with the provider (Google Cloud). Google's OAuth2 implementation follows the OpenID Connect 1.0 specification, an identity layer added to the protocol that allows clients to verify the identity of the end user based on the authentication performed by the authorization server. It also provides basic profile information about the end user.

To authenticate users you must create a new application within the Google Cloud Platform. If you don't have an account yet, quickly sign up for one. Then navigate to the "API & Services" -> "Credentials" section to generate an "OAuth 2.0 Client ID". For this example I am making this an internal app, allowing only users within my organization to authenticate. I have also set the authorized redirect URI to "http://localhost:5000/login/oauth2/code/google".

Registering a new OAuth client within Google Cloud

After creation you should receive a client ID and client secret.

Using the newly generated client ID and secret, create an OAuth2LoginConfig class within our app.

@Configuration
public class OAuth2LoginConfig {

    @EnableWebSecurity
    public static class OAuth2LoginSecurityConfig extends WebSecurityConfigurerAdapter {

        @Override
        protected void configure(HttpSecurity http) throws Exception {
            http.authorizeRequests(authorize -> authorize.anyRequest().authenticated())
                    .oauth2Login(Customizer.withDefaults());
        }
    }

    @Bean
    public ClientRegistrationRepository clientRegistrationRepository() {
        return new InMemoryClientRegistrationRepository(this.googleClientRegistration());
    }

    @Bean
    public OAuth2AuthorizedClientService authorizedClientService(
            ClientRegistrationRepository clientRegistrationRepository) {
        return new InMemoryOAuth2AuthorizedClientService(clientRegistrationRepository);
    }

    @Bean
    public OAuth2AuthorizedClientRepository authorizedClientRepository(
            OAuth2AuthorizedClientService authorizedClientService) {
        return new AuthenticatedPrincipalOAuth2AuthorizedClientRepository(authorizedClientService);
    }

    private ClientRegistration googleClientRegistration() {
        return CommonOAuth2Provider.GOOGLE.getBuilder("google")
                .clientId("INSERT_CLIENT_ID")
                .clientSecret("INSERT_CLIENT_SECRET")
                .scope("email",
                        "profile",
                        "openid",
                        "https://www.googleapis.com/auth/user.addresses.read",
                        "https://www.googleapis.com/auth/user.phonenumbers.read",
                        "https://www.googleapis.com/auth/user.birthday.read",
                        "https://www.googleapis.com/auth/user.emails.read")
                .build();
    }
}

This class does several things:

  • Creates a WebSecurityConfigurerAdapter which secures all RESTful endpoints.
  • Registers a ClientRegistrationRepository and an OAuth2AuthorizedClientService, both using their respective in-memory repositories/services. This means the app will store all user data in memory once a user is logged in.
  • The highlight of this class is the Google client registration at the bottom. Using the client ID and client secret from our Google Cloud account, update the missing variables. This DSL also configures the scopes which we will request from Google. The user must approve these scopes to give us access to their profile information.

This configuration uses the DSL setup, but you can just as easily configure the Google client registration using a YAML file.
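
For reference, a minimal sketch of the equivalent application.yml registration (the placeholder client ID and secret are the same assumptions as in the DSL above, and the extra People API scopes could be added to the scope list the same way):

spring:
  security:
    oauth2:
      client:
        registration:
          google:
            client-id: INSERT_CLIENT_ID
            client-secret: INSERT_CLIENT_SECRET
            scope: openid, profile, email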

What is DSL?

DSL stands for domain-specific language, also referred to as a fluent interface based on the builder pattern. You can see this in how configuration options are chained together for the HTTP security and the client registration.

Creating a secure API

At this point the application is ready to run, but we want to see the results of our authentication with Google. Create a simple SecureController that outputs the details of the OAuth2User within the Spring application.

@RestController
public class SecureController {

    @ApiOperation(
            value = "Get the current logged-in user",
            notes = "This example only returns the user logged in from Google")
    @GetMapping("/")
    public HashMap<String, Object> index(@ApiIgnore @RegisteredOAuth2AuthorizedClient OAuth2AuthorizedClient authorizedClient,
                                         @ApiIgnore @AuthenticationPrincipal OAuth2User oauth2User) {
        HashMap<String, Object> results = new HashMap<>();
        results.put("username", oauth2User.getName());
        results.put("attributes", oauth2User.getAttributes());
        results.put("authorities", oauth2User.getAuthorities());
        results.put("clientScopes", authorizedClient.getClientRegistration().getScopes());
        results.put("clientName", authorizedClient.getClientRegistration().getClientName());
        return results;
    }
}

What is this RegisteredOAuth2AuthorizedClient and AuthenticationPrincipal?

The RegisteredOAuth2AuthorizedClient annotation resolves to an OAuth2AuthorizedClient, which we registered in our application using the OAuth2LoginConfig class above. In this case it resolves to our registered Google client, which holds our client ID and client secret. The OAuth2AuthorizedClient also contains the access token, refresh token, and the client registration with the scopes we have requested.

The AuthenticationPrincipal holds our currently authenticated user within the application. It has attributes pertaining to the OpenID Connect 1.0 specification such as name, profile, picture, email, birthdate, etc., which are all considered standard claims.

Swagger Config

Finally, create a SwaggerConfig class to register our REST controller location.

@Configuration
@EnableSwagger2
public class SwaggerConfig {

    @Bean
    public Docket api() {
        return new Docket(DocumentationType.SWAGGER_2)
                .select()
                .apis(RequestHandlerSelectors.basePackage("com.sixthpoint.spring.security.oauth2login.controller"))
                .paths(PathSelectors.any())
                .build();
    }
}

Running the application

Launch the Spring Boot app and go to http://localhost:8080/swagger-ui.html. You will then be redirected to Google for authentication.

After you have authenticated with your Google account credentials, the next page will ask you to consent to your app having access to the OAuth client registered in your Google Cloud account. Clicking allow will authorize our Spring Boot app to access the user's email and basic profile information, as identified in the scopes from the ClientRegistration.

Once consent has been granted you will be redirected back to the swagger-ui page. Execute the secure controller GET request to see your authenticated user.

The application currently sets a cookie for an authorized user using a default session called JSESSIONID. This session is tied to our authenticated user within the application. If I were to restart the application, the in-memory store would be lost, meaning I would have to log back into the application again.

Wrap Up

This article showed how to quickly get up and running with Spring Security 5 OAuth2. The app integrates with Google to allow for secure authentication and consent of users with a Google account. The application's API was secured using a session token generated with the Spring Security 5.3 OAuth2 libraries.

Source code can be found on Github.

Optimizing your application using Spring Boot 2.x and Amazon ElastiCache for Redis

Has your project gotten to the point where big data sets or time-consuming calculations begin to affect performance? Struggling to optimize your queries and needing to cache some information to avoid continually hitting your database? Then caching could be your solution.

For this article I will demonstrate how to utilize Amazon ElastiCache for Redis to speed up areas of your application. The example application we will build uses Spring Boot 2.x and is available on Github.

What is Amazon ElastiCache for Redis?

ElastiCache for Redis is a super fast, in-memory, key-value store. It supports many different kinds of abstract data structures, such as strings, lists, maps, sets, sorted sets, hyperloglogs, bitmaps, and spatial indexes. At its highest level, ElastiCache for Redis is a fully managed service for a standard Redis installation and uses all the standard Redis APIs. This means the app we are building can run not only against the Amazon ElastiCache flavor of Redis, but against any other environment matching the Redis version of your choosing.

Setting up ElastiCache for Redis

Begin by navigating to the ElastiCache dashboard, selecting Redis, and creating a new cluster. You will be prompted to define a cache name, description, node type (server size), and number of replicas.

I filled in the name, description, and changed the node type to a smaller instance.

VPC & Security Groups

To be able to access your Redis cluster, the instance(s) running our app must be in the same Virtual Private Cloud (VPC) and have the proper security groups. Your EC2 instances must be allowed to communicate on the port of your Redis cluster (6379). By default your Redis cluster is only accessible internally from the VPC selected. This is done purposely, as connecting an internet gateway would defeat the purpose of the high-efficiency, in-memory cache that Redis provides.

Our app will only be able to access Redis once it is deployed to AWS.

If you wish to run the app locally, consider installing Redis using Docker. The variables outlined in our application.properties file below can be modified to run locally.
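
For instance, a throwaway local Redis container can be started with Docker (the container name and image tag here are just examples):

docker run -d --name local-redis -p 6379:6379 redis:5

Pointing redis.hostname at localhost would then let the app talk to this container instead of ElastiCache.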

Launching your Redis Cluster

Once you have properly configured your security groups and VPC, click "create". ElastiCache will now provision and launch your new Redis cluster. When the status turns to "available", the cluster is ready to handle connections.

We need the primary endpoint for our new spring boot application.

Building our application

For this example application we will be using Spring Boot 2.x with Spring Data Redis and Jedis (a Redis client library). I begin by importing them into the project:

<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>
<dependency>
	<groupId>redis.clients</groupId>
	<artifactId>jedis</artifactId>
	<version>2.9.0</version>
</dependency>

These libraries allow us to set up our caching config in Spring. An important concept to understand is that the Spring Framework provides its own abstraction for transparently adding caching. You do not have to use Redis; the abstraction supports a list of providers: Couchbase, EhCache 2.x, Hazelcast, etc. As you will see, adding caching to a service method is as simple as providing the appropriate annotation.

Now that we have included the required libraries in our pom file, Spring will try to autoconfigure a RedisCacheManager. I personally do not like magic behind the scenes, so we are going to set up and configure our own annotation-based RedisCacheManager.

A RedisCacheManager is how we configure and build a cache manager for Spring to use. Notice that I have defined the redisHostName, redisPort, and redisPrefix for Jedis (the Redis client library) to use to connect to our cluster.

@Configuration
@EnableCaching
public class RedisConfig {

    @Value("${redis.hostname}")
    private String redisHostName;

    @Value("${redis.port}")
    private int redisPort;

    @Value("${redis.prefix}")
    private String redisPrefix;

    @Bean
    JedisConnectionFactory jedisConnectionFactory() {
        RedisStandaloneConfiguration redisStandaloneConfiguration = new RedisStandaloneConfiguration(redisHostName, redisPort);
        return new JedisConnectionFactory(redisStandaloneConfiguration);
    }

    @Bean(value = "redisTemplate")
    public RedisTemplate<String, Object> redisTemplate(RedisConnectionFactory redisConnectionFactory) {
        RedisTemplate<String, Object> redisTemplate = new RedisTemplate<>();
        redisTemplate.setConnectionFactory(redisConnectionFactory);
        return redisTemplate;
    }

    @Primary
    @Bean(name = "cacheManager") // Default cache manager is infinite
    public CacheManager cacheManager(RedisConnectionFactory redisConnectionFactory) {
        return RedisCacheManager.builder(redisConnectionFactory).cacheDefaults(RedisCacheConfiguration.defaultCacheConfig().prefixKeysWith(redisPrefix)).build();
    }

    @Bean(name = "cacheManager1Hour")
    public CacheManager cacheManager1Hour(RedisConnectionFactory redisConnectionFactory) {
        Duration expiration = Duration.ofHours(1);
        return RedisCacheManager.builder(redisConnectionFactory)
                .cacheDefaults(RedisCacheConfiguration.defaultCacheConfig().prefixKeysWith(redisPrefix).entryTtl(expiration)).build();
    }
}

I have defined two cache managers: one that is infinite (the default), and one called "cacheManager1Hour" that causes all keys to expire in 1 hour.

The cluster information is passed in from application.properties:

redis.hostname=URL_TO_ELASTIC_CACHE_REDIS_CLUSTER
redis.port=6379
redis.prefix=testing

Implementing a simple service

Now that our Redis cluster is configured in Spring, annotation-based caching is enabled. Let's assume you have a long-running task that takes 2 seconds to do its work. By annotating the service method with @Cacheable, the result of the method call will be cached. I have given this cacheable a value of "getLongRunningTaskResult", which will be used in its compound key, a key (generated for you by default), and a cacheManager ("cacheManager1Hour", configured above).

@Cacheable(value = "getLongRunningTaskResult", key = "{#seconds}", cacheManager = "cacheManager1Hour")
public Optional<TaskDTO> getLongRunningTaskResult(long seconds) {
    try {
        long randomMultiplier = new Random().nextLong();
        long calculatedResult = randomMultiplier * seconds;
        TaskDTO taskDTO = new TaskDTO();
        taskDTO.setCalculatedResult(calculatedResult);
        Thread.sleep(2000); // 2 second delay to simulate workload
        return Optional.of(taskDTO);
    } catch (InterruptedException e) {
        return Optional.empty(); // Optional.of(null) would throw a NullPointerException
    }
}

Note: It is important that the object returned by the method is serializable; otherwise the cacheManager will not be able to cache the result.
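
For reference, a minimal sketch of such a DTO (the calculatedResult field is taken from the service method above; everything else is assumed):

import java.io.Serializable;

public class TaskDTO implements Serializable {

    private static final long serialVersionUID = 1L;

    // Result of the long running calculation
    private long calculatedResult;

    public long getCalculatedResult() {
        return calculatedResult;
    }

    public void setCalculatedResult(long calculatedResult) {
        this.calculatedResult = calculatedResult;
    }
}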

Testing for performance improvements

To easily test the API, I have included swagger-ui, which makes it simple for developers to interact with the API we have built. I have also created a few simple endpoints to populate and flush the cache.

@ApiOperation(value = "Perform the long running task")
@RequestMapping(value = "/{seconds}", method = RequestMethod.GET)
public ResponseEntity<TaskDTO> longRunningTask(@PathVariable long seconds) {
    Optional<TaskDTO> user = taskService.getLongRunningTaskResult(seconds);
    return ResponseUtil.wrapOrNotFound(user);
}

@ApiOperation(value = "Resets the cache for a key")
@RequestMapping(value = "/reset/{seconds}", method = RequestMethod.DELETE)
public ResponseEntity<?> reset(@PathVariable long seconds) {
    taskService.resetLongRunningTaskResult(seconds);
    return new ResponseEntity<>(HttpStatus.ACCEPTED);
}

Once you deploy your app to EC2, navigate to the URL path /swagger-ui.html.

From this GUI you can easily test your API performance improvements. Calling the GET endpoint for the first time should take roughly 2 seconds to return the newly calculated result. Calling it subsequently will return an almost instant response, as the result of the long-running task is now cached in Redis.

Final thoughts

Today's applications demand a responsive user experience. Design your queries and calculations to be as performant as possible, but every once in a while, when you can sacrifice real-time data, just cache it.

Full project code is available at: https://github.com/sixthpoint/spring-boot-elasticache-redis-tutorial

Quickstart AWS SQS + Spring Boot Processing FIFO Queues

AWS SQS (Simple Queue Service) can provide developers with flexibility and scalability when building microservice applications. This quickstart tutorial demonstrates how to configure a FIFO queue for a fictional online marketplace.

The code for this application is available on Github.

What is a FIFO queue?

A FIFO (first-in, first-out) queue is used when the order of items or events is critical, or when duplicate items in the queue are not permitted. For example:

  • Prevent a user from buying an item that isn’t available on a marketplace.

Configuring AWS

This tutorial assumes you have an AWS account already created. Begin by creating an SQS queue called PendingOrders.fifo in a region of your choice with content-based deduplication enabled. Each queue has its own unique URL.

The format of this URL is https://sqs.<region>.amazonaws.com/<account_id>/<queue_name>. Take note of this URL, as you will need it to run the application (set it in a SIXTHPOINT_SQSURL environment variable). You can see the URL for your queue in the details screen of the SQS AWS console or by using the AWS CLI command aws sqs list-queues. Please see Figure 1 for reference.
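
For example, before starting the app you might export the variable like this (region, account id, and queue name are placeholders):

export SIXTHPOINT_SQSURL=https://sqs.<region>.amazonaws.com/<account_id>/PendingOrders.fifo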

Figure 1: PendingOrders.fifo created with Content-Based Deduplication enabled

Content-Based Deduplication:

FIFO queues do not allow duplicate messages. If you try to send an identical message within the 5-minute deduplication interval, the message will be tossed out. This ensures that your messages will be processed exactly once. With this feature enabled, AWS SQS uses a SHA-256 hash of the body of the message (not the attributes of the message) to generate the deduplication ID.

This is useful for our simple example because our message payload will contain only an itemCount, which is the total number of items being bought, and a requestTime, which is the epoch time the order was received.

The Application

The sample application is composed of a single OrderProcessingService which is stateful. This service handles submitting the order to the queue, as well as listening for orders via the @SqsListener annotation.

Creating an order

To create an order, a PendingOrder object is serialized to a JSON string. The default client from AmazonSQSClientBuilder is then used to send a message to the queue.

public void createOrder(int itemCount) {
    try {
        PendingOrder pendingOrder = new PendingOrder();
        pendingOrder.setItemCount(itemCount);
        pendingOrder.setRequestTime(LocalDateTime.now().toEpochSecond(ZoneOffset.UTC));
        String pendingOrderJson = configProperties.getObjectMapper().writeValueAsString(pendingOrder);

        final AmazonSQS sqs = AmazonSQSClientBuilder.defaultClient();
        sqs.sendMessage(new SendMessageRequest(configProperties.getSqsURL(), pendingOrderJson).withMessageGroupId("sixthpointGroupId"));

    } catch (final AmazonClientException | JsonProcessingException ase) {
        log.error("Error Message: " + ase.getMessage());
    }
}

The PendingOrder contains an itemCount, which is the total number of items the user is trying to purchase, and a requestTime. Since we are using content-based deduplication, this means that no more than one request can be made per second with the same itemCount.
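
For reference, a minimal sketch of that payload class, using only the two fields referenced in this article (getters and setters assumed):

public class PendingOrder {

    // Total number of items in the order
    private int itemCount;

    // Epoch second the order was received
    private long requestTime;

    public int getItemCount() {
        return itemCount;
    }

    public void setItemCount(int itemCount) {
        this.itemCount = itemCount;
    }

    public long getRequestTime() {
        return requestTime;
    }

    public void setRequestTime(long requestTime) {
        this.requestTime = requestTime;
    }
}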

Processing the queue

Using the @SqsListener annotation, the application checks PendingOrders.fifo periodically and processes pending items. An item is read in and mapped using the object mapper. The availableItems count is either decremented by the supplied pendingOrder count, or an error is logged because no more items remain.

private int availableItems = 5;

@SqsListener("PendingOrders.fifo")
public void process(String json) throws IOException {
    PendingOrder pendingOrder = configProperties.getObjectMapper().readValue(json, PendingOrder.class);
    if(availableItems > 0 && availableItems >= pendingOrder.getItemCount())
    {
        availableItems = availableItems - pendingOrder.getItemCount();
        log.info("Items purchased, now have {} items remaining", availableItems);
    } else {
         log.error("No more items are available");
    }
}

Wrap up

This article quickly shows how to set up and configure your first AWS SQS FIFO queue using Spring Boot.

Full code with configurable environment variables can be found on Github.

Using Docker + AWS to build, deploy and scale your application

I recently worked to develop a software platform that relied on Spring Boot and Docker to prop up an API. Being the only developer on the project, I needed to find a way to quickly and efficiently deploy new releases. However, I found many solutions overwhelming to set up.

That was until I discovered AWS has tools that allow any developer to quickly build and deploy their application.

In this 30-minute tutorial, you will discover how to utilize AWS CodeCommit, CodeBuild, Elastic Beanstalk (EBS), and CodePipeline, together with Docker.

Once finished, you will have a Docker application running that automatically builds your software on commit and deploys it to Elastic Beanstalk, sitting behind a load balancer for scalability. This continuous integration pipeline will allow you to worry less about your deployments and get back to focusing on feature development within your application.

Here is the order in which to configure services:

  1. Git repository initialization using CodeCommit
  2. CodeBuild Setup
  3. EBS Configuration
  4. CodePipeline Configuration
Background knowledge

I am using Docker for this tutorial application. However, AWS Elastic Beanstalk supports a wide range of configurable environments: .NET, Java, NodeJS, PHP, Python, and Ruby. Docker was chosen for this tutorial so that the reader can focus more on the build process and less on the project setup. With that being said, I will not be diving deeply into Docker. If you wish to learn more about Docker, start by reading the introduction on the Docker website.

The Application

The example Spring Boot source code that will be used can be found at: https://github.com/sixthpoint/Docker-AWS-CodePipeline

The application is a Spring Boot project configured to run on port 5000 and has a REST controller with a single endpoint.

The API REST controller is very basic. It maps the /api/ path to a method which returns a list of strings in JSON format. This is the endpoint we will use to verify that our application has successfully built and deployed on AWS EBS.

ApiController.java
@RestController
@RequestMapping( value = "/api" )
public class ApiController {

    @RequestMapping( value = "/", method = RequestMethod.GET )
    public List<String> index() {
        List<String> s = new ArrayList<>();
        s.add("Docker + AWS CodePipline Tutorial");
        s.add("Learn more at: https://github.com/sixthpoint/Docker-AWS-CodePipeline");
        return s;
    }
}

The application produces an example-1.0.0-SNAPSHOT.jar file when built using Maven. This file is important to reference in our Dockerfile.

Maven build:
mvn clean install

This produces target/example-1.0.0-SNAPSHOT.jar. The Dockerfile below uses a flavor of Alpine Linux to add, expose, and run the Spring Boot application.

Dockerfile
FROM openjdk:8-jdk-alpine
VOLUME /tmp
ADD target/example-1.0.0-SNAPSHOT.jar app.jar
EXPOSE 5000
ENV JAVA_OPTS=""
ENTRYPOINT [ "sh", "-c", "java $JAVA_OPTS -Djava.security.egd=file:/dev/./urandom -jar /app.jar" ]
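
Before wiring up the AWS pipeline, you can sanity-check the image locally; the image tag below is arbitrary:

mvn clean install
docker build -t docker-aws-codepipeline .
docker run -p 5000:5000 docker-aws-codepipeline
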
1. Git repository initialization using CodeCommit

First things first, we need a git repository to build our code from. AWS CodeCommit is cheap, reliable, and secure. It uses S3 which is a scalable storage solution subject to S3 storage pricing.

Begin by logging into your AWS console and creating a repository in CodeCommit. For the purpose of this tutorial, I have called the repository name the same name as the Spring Boot application. Once created, you will be presented with the standard HTTPS and SSH URLs of the repository.

The above example has generated the following repository location; notice that if I try to do a clone from this repository, access is denied.

git clone https://git-codecommit.us-east-1.amazonaws.com/v1/repos/DockerCodePipleline

1A. CONFIGURING IDENTITY AND ACCESS MANAGEMENT (IAM)

IAM, or Identity and Access Management, enables you to securely control access to AWS services and resources. To authorize a user to access our private git repository, navigate to the IAM services page. Begin by adding a user. I have named the user the same name as the project and git repository. Choose programmatic access, which will allow for policies to be added.

In order to allow this new user to fully administer our new git repository, attach the AWSCodeCommitFullAccess policy. Once added, click through to finish creating your user.

Now that a user has been created with the correct policies, Git credentials are needed to work with the new CodeCommit repository. Navigate to the new user and look for the “HTTPS Git credentials for AWS CodeCommit” section shown below. Generate a new username and password and download the credentials file once prompted. Inside that file is the information needed to access your repository.

Note: Only two keys are allowed per user at this time. If you lose your key, a new one will need to be generated to access the repository. For more in-depth information on setting up git credentials in AWS, check out the guide for setting up HTTPS users using Git credentials.

1B. MOVING THE CODE TO THE NEW CODECOMMIT REPOSITORY

With the new repository created, clone the Github repository holding our sample Spring Boot application. Change the remote to your new CodeCommit repository location, then finally push the master branch to master.

git clone https://github.com/sixthpoint/Docker-AWS-CodePipeline.git
git remote set-url origin https://git-codecommit.us-east-1.amazonaws.com/v1/repos/DockerCodePipleline
git push origin master
2. CodeBuild Setup

Now that the CodeCommit repository holds our sample Spring Boot application, the code needs to be built for deployment. Navigate to CodeBuild. CodeBuild is a pay-on-demand source code build service.

Start by creating a new build project and point the source to the AWS CodeCommit repository that was created in Step 1. You can see I have pointed this new build project to the AWS CodeCommit source provider, and specified the DockerCodePipeline repository.

Next it asks for environment information. The default system image is fine for this build process. The most important part is to tell CodeBuild to use the buildspec.yml. The buildspec contains the necessary commands to generate the artifacts needed to deploy to EBS.

Included in the sample Spring Boot application is a buildspec.yml. This file tells CodeBuild what commands to run in each phase, and what files to bundle up and save in the artifacts.

Additional configuration options can be found at: http://docs.aws.amazon.com/codebuild/latest/userguide/build-spec-ref.html.

Buildspec.yml
version: 0.2

phases:
  build:
    commands:
      - mvn clean
      - mvn install
artifacts:
  files:
    - 'Dockerfile'
    - 'target/example-1.0.0-SNAPSHOT.jar'

The final setup step for the build process is to specify the location where the artifact made from the buildspec.yml will be stored. In the example below, I put all artifacts in Amazon S3 under the name dockerAWSCodePipeline, in a bucket named irdb-builds. The bucket can be any bucket of your choice. You must go into S3 and create this bucket prior to creating the build project.



The build project is now configured and ready to use. Builds can manually be run from the console creating artifacts stored in S3 as defined above.

3. EBS Setup

Now that the code is in CodeCommit and the artifacts are built using CodeBuild, the final resource needed is a server to deploy the code. That is where Elastic Beanstalk comes in useful. EBS is a service that automatically handles provisioning, load balancing, auto-scaling, etc. It is a very powerful tool to help you manage and monitor your application's servers.

Let’s assume, for example, my API needs to have four servers due to the amount of requests I am receiving. The EBS makes the scaling of those servers simple with configuration options.

Begin by creating a new web server environment and giving it a name and domain name. This domain name is your AWS domain name; if you have a personal domain name, you can point it at this load balancer using Route53.

The last step of creating your web server environment is to tell EBS that we want to run Docker and to use the sample application code for now. Later, our code from CodeBuild will replace the AWS sample application.

The server and environment will take several minutes to start. Once complete, navigate to the configuration page of your new EBS environment.

By default the environment has a load balancer installed and auto scales. A scaling trigger can be set to adjust the number of instances to run given certain requirements. For example, I could set my minimum instances to 1 and maximum to 4, and tell the trigger to start a new instance each time CPUUtilization exceeds 75%. The load balancer would then spread requests across the instances currently running.
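
Those same settings can also be kept in source control as an .ebextensions config file instead of clicking through the console. A sketch of what that might look like, with sizes and thresholds mirroring the example above, is:

# .ebextensions/autoscaling.config (illustrative values)
option_settings:
  aws:autoscaling:asg:
    MinSize: 1
    MaxSize: 4
  aws:autoscaling:trigger:
    MeasureName: CPUUtilization
    Statistic: Average
    Unit: Percent
    UpperThreshold: 75
    LowerThreshold: 25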

4. CodePipeline Configuration

This is the final piece of the puzzle, which brings steps 1-4 above together. You will notice that up until now we have had to manually tell CodeBuild to run, and then go to EBS and manually specify the artifact for deployment. Wouldn't it be great if all of this could be done for us?

That is exactly what CodePipeline does. It fully automates the building and provisioning of the project. Once new code is checked in, the system magically takes care of the rest. Here is how to set it up.

Begin by creating a new CodePipeline. In each step, select the repository, build project, and EBS environment created in steps 1-4 above.

 

Once complete, CodePipeline will begin monitoring changes to your repository. When a change is detected, it will build the project and deploy it to the available servers in your EBS application. You can monitor the CodePipeline in real time from the pipeline's detail page.

A Final Word

When configured properly, the CodePipeline is a handy tool for the developer who wants to code more and spend less time on DevOps.

This pipeline gives a developer easy access to manage an application, big or small. It doesn't take a lot of time or money to set yourself up with a scalable application that utilizes a quick and efficient build and deployment process.

If you are in need of a solution to build, test, deploy, and scale your application, consider AWS CodePipeline as a great solution to get your project up and running quickly.

 

BackboneJS with Webpack: A lesson in optimization

Developing a large BackboneJS application presents a unique design problem. As developers, we like to organize our code so it is understandable, logical, and predictable. However, doing so can cause performance issues on the client side.

In this blog I will discuss a handy tool I like to use for this purpose: Webpack. I’ll show it in action, how to use it, and what it is good for. But first, let’s talk about how I came across Webpack.

An Example

On a previous project, I was building an audio and video streaming player. The frontend was developed using BackboneJS, with libraries such as jQuery and SocketIO. Using the RequireJS shim configuration, I ended up with my dependencies / exports organized as follows.

require.config({
    baseUrl: "js/",
    paths: {
        jquery: 'libs/jquery.js',
        underscore: '/libs/underscore.js/1.6.0/underscore-min',
        socketio: '/libs/socket.io/0.9.16/socket.io.min',
        backbone: '/libs/backbone.js/1.1.0/backbone-min',
        templates: 'templates'
    },
    shim: {
        'jquery': {
            exports: '$'
        },
        'backbone': {
            deps: ['underscore', 'jquery'],
            exports: 'Backbone'
        },
        'underscore': {
            exports: '_'
        },
        'socketio': {
            exports: 'io'
        }
    }
});

This worked great for loading all my libraries. For each of my files, I defined the libraries I wanted to use. Breaking each of my files down into modules is a great way to organize a large code base. I then used the RequireJS AMD text resource loader plugin to load all template files.

define([
    'jquery',
    'underscore',
    'backbone',
    'text!templates/recordings/recordingTemplate.html'
], function($, _, Backbone, recordingTemplate) {
   // Do some stuff
});

At the time this was a decent solution. My code base was organized, easy to understand, and predictable. However, as my application grew larger, a performance problem began to develop. For each template that was added, a new call to the server was made. This began to balloon the initial loading time of the application.

Loading of all templates

Wouldn't it be great if all necessary resources were compacted into a single file, optimizing the loading time, while still keeping our code organized?

Developing a BackboneJS app all in a single file would be a frustrating experience to manage. That's where Webpack comes to the rescue.

What is Webpack?

Webpack is a module bundler that takes your files, compacts them, and generates a static file. Think of it as your own personal secretary there to help keep your life organized. You provide the configuration, it supplies the optimization.

Webpack’s primary goal is to keep initial loading time down. It does this by code splitting, and loaders.

Code Splitting

Code splitting is ideal for large applications where it is not efficient to put all code into a single file. Some blocks of code may only be useful for certain pages of the site. Using this opt-in feature allows you to define the split points in your code base, and Webpack will optimize the dependencies required to generate the optimal bundle.

var a = require("a");
require.ensure(["b"], function(require) {
    require("a").dosomething();
    var c = require("c");
});

This example uses the CommonJS require.ensure to load resources on demand. The final output would contain two chunked files:

  • output.js – the primary entry point chunk containing
    • chunk loading logic
    • module A
  • 1.output.js – the additional chunk to be loaded on demand containing
    • module B
    • module C
[sixthpoint@sixthpoint webpack-code-splitting]$ webpack
Hash: e863fe1f9db99737fcd2
Version: webpack 1.12.2
Time: 564ms
      Asset       Size  Chunks             Chunk Names
  output.js     302 kB       0  [emitted]  main
1.output.js  172 bytes       1  [emitted]  
   [0] ./main.js 338 bytes {0} [built]
   [6] ./a.js 18 bytes {0} [built]
   [7] ./b.js 19 bytes {1} [built]
   [8] ./c.js 17 bytes {1} [built]

Loaders

Loaders preprocess files as you use the require() method. Using a loader you can easily require() resources such as CSS, images, or compile-to-JS languages (CoffeeScript or JSX).

By default, Webpack knows how to process your JavaScript files, minify them, and combine them. But it doesn't really know how to do much else. Loaders are the solution for processing different types of files and turning them into usable resources for your application.

Load bootstrap CSS file:

require('./styles.css');

Create an image element, and set the src to the image resource file:

var img = document.createElement('img');
img.src = require('./bootstrap-logo.png');

Using the CLI all loaders can be defined in the webpack.config.js file.

module.exports = {
  entry: './main.js',
  output: {
    filename: 'output.js'       
  },
  module: {
    loaders: [
      { test: /\.css$/, loader: "style!css" },
      { test: /\.png$/, loader: "url-loader?mimetype=image/png" }
    ]
  }
};

This example configuration file sets the application entry point, desired output file name for webpack to build, and a list of two loaders (CSS and PNG).

If you want to test this out for yourself, check out this github repo: https://github.com/sixthpoint/webpack-async-code-splitting

How do I use BackboneJS with Webpack?

Above I showed my starter application, which had an initial loading performance issue. Remember all those template calls? Webpack async loading and code splitting are going to significantly decrease load times. Let's assume my application needs only two entry points:

  • #/nowplaying – will be responsible for loading data from socket.io
  • #/schedules – will display all scheduling information

To start I have modified my Webpack config file and, using the ProvidePlugin, added jQuery, Backbone, and Underscore to the global scope of my application. I will no longer have to require these libraries throughout my app. This is similar to the shim config above.

var webpack = require('webpack');

module.exports = {
  entry: './main.js',
  output: {
    filename: 'output.js'       
  },
  module: {
    loaders: [
      { test: /\.css$/, loader: "style-loader!css-loader" },
      { test: /\.png$/, loader: "url-loader?mimetype=image/png" },
    ]
  },
  plugins : [ new webpack.ProvidePlugin({
			$ : "jquery",
			Backbone : "backbone",
			_ : "underscore"
		}) ]
};

The most important file of this app is the Backbone router. The router defines the code splitting points.

Notice that by using require.ensure, I will load the socket.io API resource only when navigating to the now playing page. This way, if somebody never goes to the now playing page, the resources for that page never have to be loaded. If the user does navigate to the now playing page, the chunk is then cached in case they return, for performance reasons.

var AppRouter = Backbone.Router.extend({
	routes : {
		'nowplaying' : 'nowplaying',
		'schedule' : 'schedule',
		'*actions' : 'home'
	}
});

var initialize = function() {
	var appRouter = new AppRouter;

	appRouter.on('route:home', function() {
		$('#content').text("Home Screen");
	});

	appRouter.on('route:nowplaying', function() {
		require.ensure([], function() {
                    // nowPlayingView contains socketIO resource
		    require('./nowplayingView');
		  });
	});

	appRouter.on('route:schedule', function() {
		require.ensure([], function() {
		    require('./scheduleView');
		  });
	});

	Backbone.history.start();
};

module.exports = initialize;

So how does Webpack organize this? Simple, both the now playing view (1.output.js) and the schedule view (2.output.js) get their respective files since they are async loaded.

Here is the output of the terminal, as expected:

[sixthpoint@sixthpoint webpack-backboneJS-socketIO-client]$ webpack
Hash: b29b2a6017bad0dd0577
Version: webpack 1.12.2
Time: 808ms
      Asset       Size  Chunks             Chunk Names
  output.js     388 kB       0  [emitted]  main
1.output.js     180 kB       1  [emitted]  
2.output.js  248 bytes       2  [emitted]  
   [0] ./main.js 54 bytes {0} [built]
   [1] ./router.js 604 bytes {0} [built]
   [5] ./nowplayingView.js 132 bytes {1} [built]
  [56] ./scheduleView.js 37 bytes {2} [built]
    + 53 hidden modules

Final Thoughts

What kind of project is Webpack good for? Webpack is great for any scale of project. It is simple to use and configure. Anyone who is developing a JavaScript application should consider using Webpack for its performance improvements and excellent set of plugins.

The complete source code of the optimized project can be found on github: https://github.com/sixthpoint/webpack-backbonejs-socketIO-client

Using the webpack dev server

A great feature of Webpack is that it has a built-in web server for testing your application. It will monitor your files for changes and rebuild. This is similar to the watch mode that can be enabled during configuration. However, the dev server expands on that by serving your app on localhost port 8080 and automatically refreshing the view when content changes.

First, install the webpack-dev-server globally

npm install webpack-dev-server -g

To start the server, navigate to your file directory and type the command:

webpack-dev-server

This will start the server, output should look similar to below.

http://localhost:8080/webpack-dev-server/
webpack result is served from /
content is served from /path/to/your/files
Hash: b9af11d6ad3b3743b572
Version: webpack 1.12.2
Time: 572ms
    Asset    Size  Chunks             Chunk Names
output.js  299 kB       0  [emitted]  main
chunk    {0} output.js (main) 297 kB [rendered]
    [0] ./main.js 197 bytes {0} [built]
    [1] ./styles.css 898 bytes {0} [built]

Now you can navigate to the running site: http://localhost:8080/webpack-dev-server/

http://localhost:8080/webpack-dev-server/

Notice the bar that says "app ready". This is the status bar that webpack has put into the browser. This HTML is placed on the page using an iFrame. At some point you will not want this in your application, but for simple scenarios it is fine.

To remove the status bar, navigate your browser to the base URL (http://localhost:8080/). The downside is that now the browser is not automatically refreshed when files are modified. To enable watch mode and auto refreshing on the dev server, specify the inline flag:

webpack-dev-server --inline

Now content will automatically refresh without that pesky status bar in the way. Happy web packing!

Inversion of Control (IoC) with JSF

Power of containers

Flash back to the early 2000s and this article would have focused on POJOs and how they were transforming the way we organize our logic. But luckily, it's 2015 and we don't need to concern ourselves with managing the state of our objects when developing server-side applications. Most likely you are already using a form of inversion of control in your application without knowing it.

Below is a simple example of JSF 2.2 using CDI for bean injection.

@SessionScoped
@Named
public class userProfile {

   public String username;

   public String getUsername() {
      return username;
   }

   public void setUsername(String username) {
      this.username = username;
   }
}

To understand the key concepts associated with IoC, consider the above example. The annotation @SessionScoped provides a length of time for this container-managed class to hang around. By definition, a session scoped bean maintains its state across more than one JSF view. Since this is the user that is logged onto the site, this bean must be accessible for the duration of their time browsing the application. CDI implements the definition of a session scoped bean using IoC facets.

There are 3 core facets of IoC.

  • Manages constructor injection of managed objects – The developer does not need to explicitly instantiate the object. The container uses a default constructor to invoke the object. It should be noted that overriding the default constructor in IoC is possible in certain situations.
  • Dependency handling – Certain objects can depend on each other to function. The container must have the logic to handle cyclical dependencies.
  • Life cycle and configuration – Customization of the lifecycle must be provided through annotations or configuration (XML).

Inversion of control (IoC) is a concept that has been implemented in various containers/frameworks such as Spring, JSF, Google Guice, and PicoContainer. These frameworks provide abilities similar to the above example. Using a framework eliminates the need to write large amounts of boilerplate code, for example a class to handle application, session, and view scoped logic.

What would it be like without IoC?

The simple answer is... a large headache. Imagine you have a web application, and you have to manage a single class that is used by the entire application; let's call it our applicationClazz. When each new user accesses the application, we need to store their current application context, and this user context would have to be stored in our applicationClazz. Then, to add functionality, let's assume the site has a login page and stores information in a loginClazz. This login page is specific to each individual user context. So for each user that is using the application, the applicationClazz would have to maintain a map of all the loginClazz's and an association to the current user context. To make things even more complicated, consider how difficult it would become to clean up and manage this application map if you had 20, 50, or 100 classes in your application with different lifecycles. This is why we use IoC: to do all our heavy lifting.

CDI or Managed Property with JSF?

Prior to JSF 2.0 the annotation @ManagedProperty was widely used. Now @Named, which comes from Contexts and Dependency Injection (CDI), is mostly used. Both support similar life cycles.

The following is a list of the most common CDI scopes used, their duration, and an example use case.

Session Scoped – The user's interaction lasts across multiple HTTP requests. Often used to store a logged-in user's information for the duration of their time on the site.

Request Scoped – The user's interaction lasts across a single HTTP request. This scope is best suited for pages that require little to no ajax/form interaction. A simple example would be a page that displays the date/time; if an ajax request were implemented to refresh the content, a new bean would be created for each ajax request since it is request scoped.

Application Scoped – Contents are shared across all users interacting with the web application. Let's assume you have a dropdown list that will always have the same values no matter the user. The best solution would be to put those values into an application scoped bean so that they are always in memory, improving performance.
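
A sketch of that dropdown example as an application scoped CDI bean (the class name and values are made up for illustration):

@ApplicationScoped
@Named
public class CountryOptions {

    private List<String> countries;

    @PostConstruct
    public void init() {
        // Loaded once and shared by every user of the application
        countries = Arrays.asList("Canada", "Mexico", "United States");
    }

    public List<String> getCountries() {
        return countries;
    }
}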

A Short Summary

The most important thing to take away from this article is: IoC is your friend. It does a lot of the heavy lifting of managing classes. CDI gives you the tools to quickly create applications using session, request, and application scoped beans. Without it, much of your time would be spent managing lifecycles.

Death to the back button in JSF

The browser back button is notorious for being the most hated browser feature among developers. It poses many design challenges and considerations. This article will cover a few approaches to handling the browser back button, as well as highlight a way to create your own within a JSF application.

Stateful vs Stateless

When laying out your application's workflow, it is smart to consider how you want the application to flow and appear to the end user. In a stateful application you attempt to store as much data as possible in the backing beans, whereas with a stateless approach you load data as pages change. In JSF you have access to different kinds of managed beans, and some types work better for different implementations: use view scoped and request scoped beans for a more stateless approach, and conversation scoped or session scoped beans for a more stateful approach. Each has its benefits and drawbacks.

Start by determining the application's purpose; this will help when selecting which type of bean to use. For example, when developing a search feature that spans multiple tables in various databases, it may be inefficient to load the search results all over again if the user presses the back button. Thus, a more stateful scoped bean (conversation scoped or session scoped) would be the smarter choice.

Consider the following workflow:

userSearch.xhtml -> finds a user and clicks -> userDetail.xhtml

In a typical stateful workflow we could manage this entire page navigation using a conversation or session scoped bean in JSF. The data returned from the user search limits the content to be shown on the user detail page. It requires only one backing bean shared between both pages.

Benefits of a Stateful approach:
  • Additional security requirements are not needed since the user id is hidden.
  • No need to load data again in between views
  • Routes can easily be managed in backing beans using explicit routing
Drawbacks of a Stateful approach:
  • Backing beans can become cluttered
  • Stale data may become an issue
  • Pages are not bookmarkable (important in modern web applications)
  • Relies heavily on POST commands to navigate which is not search engine friendly
  • Memory usage can become an issue with larger / high traffic sites
A better stateless approach

Let's continue to look at the same workflow, but with a different way to implement it. For this case I am going to assume that the userSearch is efficient.

userSearch.xhtml -> finds a user and clicks -> userDetail.xhtml?id=123

Notice the "?id=123" that has been added to the user detail page. This represents the id of the user that is expected to be loaded. With a stateless implementation the user search page and the user detail page have no knowledge of each other. They would in fact be completely separate backing beans, most likely view scoped. When the user is shown a list of search results, those links are generated using implicit routing and rendered to the DOM. Hovering over a link would show you the full URL path. There is no need to hit a backing bean to determine routes like in the stateful approach; the route is predetermined. This is one of the huge benefits of creating views that are stateless.
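
For example, the search results could render plain GET links that carry the id, and the detail page could read it with a view parameter; the names below are illustrative:

<!-- userSearch.xhtml: link generated on the page, no backing bean action needed -->
<h:link outcome="userDetail" value="#{user.name}">
    <f:param name="id" value="#{user.id}"/>
</h:link>

<!-- userDetail.xhtml: pull the id from the URL into the view scoped bean -->
<f:metadata>
    <f:viewParam name="id" value="#{userDetailController.userId}"/>
    <f:viewAction action="#{userDetailController.loadUser()}"/>
</f:metadata>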

Benefits of a Stateless approach:
  • Pages are bookmarkable
  • Data is never stale
  • Links do not have to rely on backing beans; they can be generated on the page, which is SEO friendly
  • Less of a memory hog per session
Drawbacks of a Stateless approach:
  • Have to consider security implications when exposing ids in the URL.
  • If a view performs heavy calculations, reloading them on every request could hurt server performance.
Stateless with the back button

But how do we handle the back button in JSF applications? Designing your application to use stateless beans makes it possible to bring the back button back into your JSF application.

A typical enterprise application built in JSF will cripple if the back button is pressed. In fact, a lot of developers have gone as far as building their own stateful back button to display on the page. This back button functions just like the browser back button, but has additional knowledge to control the stateful views. None of this is necessary if your views are stateless.

It is my opinion that you should never give JSF too much control over the browser. If you have to implement your own back button within your application, do so with stateless views. Stateless views by design should all have unique URLs which you can track. Simply add a preRender event to each JSF page which calls this BrowserHistoryController. The controller maintains a page stack of all URLs visited, and has a small amount of intelligence to handle users switching between an on-page back button and the browser back button.

On any of your xhtml pages that you want tracked

<f:metadata>
   <f:event type="preRenderView" listener="#{browserHistoryController.addPageUrl()}"/>
</f:metadata>

Creating your own back link

<h:outputLink value="#{browserHistoryController.backUrl}" rendered="#{browserHistoryController.hasPageBack()}">Back</h:outputLink>

BrowserHistoryController.java

@Named
@SessionScoped
public class BrowserHistoryController extends BaseController implements Serializable {

    private static final long serialVersionUID = 1L;

    private Stack<String> pageStack = new Stack<>();
    private Integer maxHistoryStackSize = 20;

    public void addPageUrl() {
        FacesContext facesContext = FacesContext.getCurrentInstance();
        if (!facesContext.isPostback() && !facesContext.isValidationFailed()) {
            HttpServletRequest servletRequest = (HttpServletRequest) facesContext.getExternalContext().getRequest();
            String fullUrl = servletRequest.getRequestURL() + "?" + servletRequest.getQueryString();
            updatePageStack(fullUrl);
        }
    }

    public String getBackUrl() {
        Integer stackSize = pageStack.size();
        if (stackSize > 1) {
            return pageStack.get(stackSize - 2);
        }
        // Just in case hasPageBack was not implemented (be safe)
        return pageStack.get(stackSize - 1);
    }

    public Stack<String> getPageStack() {
        return pageStack;
    }

    public boolean hasPageBack() {
        return pageStack.size() > 1;
    }

    public void setPageStack(Stack<String> pageStack) {
        this.pageStack = pageStack;
    }

    private void updatePageStack(String navigationCase) {

        Integer stackSize = pageStack.size();

        // If stack is full, then make room by removing the oldest item
        if (stackSize >= maxHistoryStackSize) {
            pageStack.remove(0);
            stackSize = pageStack.size();
        }

        // If the first page visiting, add to stack
        if (stackSize == 0) {
            pageStack.push(navigationCase);
            return;
        }

        // If it appears the back button has been pressed, in other words:
        // If the A -> B -> C, and user navigates from C -> B, then remove C
        if (stackSize > 1 && pageStack.get(stackSize - 2).equals(navigationCase)) {
            pageStack.remove(stackSize - 1);
            return;
        }

        // If we are on the same page
        // If A == A then ignore
        if (pageStack.get(stackSize - 1).equals(navigationCase)) {
            return;
        }

        // In a normal case, we add the item to the stack
        if (stackSize >= 1) {
            pageStack.push(navigationCase);
            return;
        }

    }
}

Use this controller in combination with stateless views, and the browser back button should no longer be an issue when coding your application.

Terminate a running CentOS program using xkill

When an application is unresponsive in CentOS, it sometimes requires that the task be terminated. Here is a neat trick for closing an application with minimal command line use.

Start by opening your terminal. Then type the following command:

[sixthpoint@new-host ~]$ xkill

Now click on the window you want to terminate. The program will automatically get the pid from the application window and terminate the process.

Output after click:

xkill: killing creator of resource 0x4a00034

 

Backup filesystem to Amazon S3

Every server needs to be backed up periodically. The trouble is finding an affordable place to store your filesystem if it contains large amounts of data. Amazon S3 is the solution with reasonably priced standard storage ($0.0300 per GB), as well as reduced redundancy storage ($0.0240 per GB) at the time of writing this article. Updated pricing can be seen at http://aws.amazon.com/s3/pricing/

This short tutorial will show how to back up a server's filesystem using s3cmd. S3cmd is a command line tool for uploading, retrieving, and managing data in Amazon S3. This implementation will use a cronjob to automate the backup process. The filesystem will be scheduled to sync nightly.

How to install s3cmd?

This example assumes you are using CentOS, or RHEL. The s3cmd library is included in the default rpm repositories.

yum install s3cmd

After installation the library will be ready to configure.

Configuring s3cmd

An Access Key and Secret Key are required from your AWS account. These credentials can be found on the IAM page.

Start by logging in to AWS and navigating to the Identity & Access Management (IAM) service. Here you will first create a new user. I have excluded my username below.

Next create a group. This group will hold the permission for our user to be able to access all of your S3 buckets. Notice under permissions that the group has been granted the “AmazonS3FullAccess” policy, which means any user in this group can modify any S3 bucket. To grant your new user access to the group, click “Add Users to Group” and select your new user from the list.

For s3cmd to connect to AWS, it requires a set of user security credentials. Generate an access key for the new user by navigating back to the user details page. Look to the bottom of the page for the “Security Credentials” tab. Under Access Key, click “Create Access Key”. It will generate an Access Key ID and Secret Access Key. Both are required for configuring s3cmd.

You now have a user set up with permissions to access the S3 API. Back on your server, you need to input the new access key into s3cmd. To begin configuration, type:

s3cmd --configure

You should now see the prompts below and be able to enter your Access Key ID and Secret Key.

Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3
Access Key: xxxxxxxxxxxxxxxxxxxxxx
Secret Key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password: xxxxxxxxxx
Path to GPG program [/usr/bin/gpg]:

When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP and can't be used if you're behind a proxy
Use HTTPS protocol [No]: Yes

New settings:
  Access Key: xxxxxxxxxxxxxxxxxxxxxx
  Secret Key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  Encryption password: xxxxxxxxxx
  Path to GPG program: /usr/bin/gpg
  Use HTTPS protocol: True
  HTTP Proxy server name:
  HTTP Proxy server port: 0

Test access with supplied credentials? [Y/n] Y
Please wait, attempting to list all buckets...
Success. Your access key and secret key worked fine :-)

Now verifying that encryption works...
Success. Encryption and decryption worked fine :-)

Save settings? [y/N] y
Configuration saved to '/root/.s3cfg'

At this point s3cmd is fully configured and ready to push data to S3. The final step is to create your own S3 bucket. This bucket will serve as the storage location for our filesystem.

Setting up your first S3 bucket

Navigate to the AWS S3 service and create a new bucket. You can give the bucket any name you want and pick the region for the data to be stored. This bucket name will be used in the s3cmd command.

Each file pushed to S3 is given a storage category of standard or reduced redundancy storage. This is configurable when syncing files. For the purpose of this tutorial all files will be stored in reduced redundancy storage.

Standard vs Reduced Redundancy Storage

The primary difference between the two options is durability, or how quickly you can get access to your data. Standard storage gives you nearly instant access to your data, whereas reduced redundancy storage (RRS) may take up to several hours to retrieve the file(s). For the use case of this tutorial, all files are stored in RRS. As noted previously, RRS is considerably cheaper than standard storage.

Configuring a simple cronjob

To enter the cronjob editor simply type

crontab -e

Once in the editor create the cronjob below which will run Monday – Friday at 3:30 a.m. every morning.

30      3       *       *       1-5     /usr/bin/s3cmd sync -rv --config /root/.s3cfg --delete-removed --reduced-redundancy /PATH/TO/FILESYSTEM/LOCATION/ s3://MYBUCKET/ >/dev/null 2>&1

This cronjob calls the s3cmd sync command and loads the default configuration which you entered above. The --delete-removed option tells s3cmd to scan for locally deleted files and remove them from the remote S3 bucket as well. The --reduced-redundancy option places all files in RRS for cost savings. Any folder location can be synced; just change the path to your desired location. Make sure to change MYBUCKET to the name of your S3 bucket.
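
Before relying on the schedule, it is worth running the same sync once by hand to confirm that the credentials, paths, and bucket name are correct:

/usr/bin/s3cmd sync -rv --config /root/.s3cfg --delete-removed --reduced-redundancy /PATH/TO/FILESYSTEM/LOCATION/ s3://MYBUCKET/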

The server has now been configured to do nightly backups of the filesystem to AWS S3  using s3cmd library. Enjoy!