Classic Solutions Architecture Discussion

Solution architecture

How do you make all these components work together in one architecture?

We will study solution architectures, and how to come up with them, through case studies.

WhatsTheTime.com

Let people know what time it is. We don't need a database because every EC2 instance knows the time.

We want to start small, where downtime is acceptable, and then scale vertically and horizontally with no downtime.

Initial solution

We start with a public t2.micro EC2 instance. When a user asks what the time is, it just returns the current time. We attach an Elastic IP address so that the instance's public IP address is static.

Now users start coming to our app and our t2.micro can't keep up with the load, so we scale vertically to an m5 instance. To do that we have to stop the instance and change its instance type to m5, which means downtime while we upgrade. This isn't great.

Now even more people come in, so we scale horizontally to three EC2 instances. But users then need to know the IP addresses of all three instances.

To fix this, we can leverage Route 53. We set an A record, with a TTL of 1 hour, that points to the three EC2 instances. Now users can access the time API through one common endpoint, api.whatisthetime.com.

Now if we take down one of the instances that the Route 53 record points to, for an upgrade for example, some users will suffer because the TTL is 1 hour. Their clients keep using the cached IP address of the instance that is gone until the TTL expires, so they are not routed to the instances that are still up. They will be unhappy!
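The TTL problem above can be sketched in a few lines of Python. This is a simplified model and not how a real resolver works: CachingClient, the example IP addresses, and the choice to always take the first record are assumptions made for illustration only.

```python
TTL_SECONDS = 3600  # the 1 hour TTL from the notes


class CachingClient:
    """Hypothetical client that reuses a DNS answer until its TTL expires."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.cached_ip = None
        self.cached_at = None

    def resolve(self, now, dns_records):
        # Query DNS again only if nothing is cached or the cached answer expired.
        if self.cached_ip is None or now - self.cached_at >= self.ttl:
            self.cached_ip = dns_records[0]  # simplification: take the first record
            self.cached_at = now
        return self.cached_ip


# Three instances behind one A record (example addresses, assumed).
records = ["203.0.113.1", "203.0.113.2", "203.0.113.3"]
client = CachingClient(TTL_SECONDS)
first = client.resolve(now=0, dns_records=records)

# After 10 minutes the first instance is taken down and removed from the record.
records = ["203.0.113.2", "203.0.113.3"]
stale = client.resolve(now=600, dns_records=records)   # still the dead instance
fresh = client.resolve(now=3600, dns_records=records)  # TTL expired, new answer
```

Until the TTL expires, the client keeps talking to an address that no longer serves traffic, which is exactly the window in which users are unhappy.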

How do we remediate this?

We make our EC2 instances private and front them with an Elastic Load Balancer (ELB) with health checks. Route 53 needs an Alias record that points to the ELB, because the ELB's IP addresses can change over time. Now it works properly: thanks to the health checks, traffic is only sent to healthy instances, so there is no downtime for any user.
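To see why health checks remove the downtime, here is a minimal sketch of a round-robin load balancer that skips unhealthy targets. It is a toy model and not the ELB implementation; the LoadBalancer class and the instance names are made up for this example.

```python
class LoadBalancer:
    """Toy round-robin balancer that only routes to healthy targets."""

    def __init__(self, targets):
        self.targets = list(targets)
        self.healthy = set(self.targets)
        self._next = 0

    def run_health_checks(self, check):
        # `check` is any function returning True when a target is healthy.
        self.healthy = {t for t in self.targets if check(t)}

    def route(self):
        # Try each registered target at most once, skipping unhealthy ones.
        for _ in range(len(self.targets)):
            target = self.targets[self._next % len(self.targets)]
            self._next += 1
            if target in self.healthy:
                return target
        raise RuntimeError("no healthy targets")


lb = LoadBalancer(["i-a", "i-b", "i-c"])
# Pretend instance i-b was taken down for an upgrade.
lb.run_health_checks(lambda target: target != "i-b")
routes = [lb.route() for _ in range(4)]
```

Users only ever see the load balancer's endpoint, so taking i-b out of rotation is invisible to them.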

Now, manually launching instances is tedious, so we use an Auto Scaling group (ASG) to scale on demand. We set the minimum, maximum, and desired number of instances.
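The relationship between the three numbers can be sketched as below. This is a simplified model of the idea, assuming the group always keeps its size between the minimum and the maximum; the function names are hypothetical and this is not the AWS API.

```python
def clamp_desired(desired, min_size, max_size):
    """Keep the desired instance count inside [min_size, max_size]."""
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    return max(min_size, min(desired, max_size))


def instances_to_launch(running, desired, min_size, max_size):
    """Positive result: launch that many instances. Negative: terminate that many."""
    return clamp_desired(desired, min_size, max_size) - running


# A traffic spike asks for 10 instances, but the group is capped at 6.
spike = instances_to_launch(running=3, desired=10, min_size=2, max_size=6)
# At night demand drops to 1, but the group never goes below 2.
night = instances_to_launch(running=5, desired=1, min_size=2, max_size=6)
```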

But what happens if the availability zone (AZ) that our ELB is in has just had an earthquake? Our application will still go down! To solve this, we deploy the ELB across multiple AZs (multi-AZ), say AZ 1-3. Our Auto Scaling group will also launch instances in different AZs. Now it is highly available. Great!
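A small sketch of why spreading instances across AZs helps. The even-spread rule and the AZ names are assumptions for illustration; the real placement logic of an Auto Scaling group is more involved.

```python
def spread_across_azs(instance_count, azs):
    """Distribute instances as evenly as possible over the given AZs."""
    base, extra = divmod(instance_count, len(azs))
    return {az: base + (1 if i < extra else 0) for i, az in enumerate(azs)}


def surviving_capacity(placement, failed_az):
    """Number of instances still running after one AZ is lost."""
    return sum(count for az, count in placement.items() if az != failed_az)


multi_az = spread_across_azs(7, ["az-1", "az-2", "az-3"])
single_az = spread_across_azs(7, ["az-1"])

left_multi = surviving_capacity(multi_az, "az-1")    # most capacity survives
left_single = surviving_capacity(single_az, "az-1")  # everything is gone
```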

After optimizing the architecture, we switch to thinking about cost savings. We can reserve capacity: the minimum capacity of our Auto Scaling group is always running, so reserving instances for it can save a lot of money.
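The arithmetic behind this can be sketched as follows. The rates are invented round numbers in cents per instance-hour, not real AWS prices, and the model assumes a reserved instance is paid for every hour whether it is used or not.

```python
def total_cost_cents(hourly_instance_counts, reserved_count,
                     on_demand_cents_per_hour, reserved_cents_per_hour):
    """Cost of a load profile when `reserved_count` instances are reserved."""
    hours = len(hourly_instance_counts)
    # Reserved instances are billed for every hour, used or not.
    reserved_cost = reserved_count * reserved_cents_per_hour * hours
    # Anything above the reserved baseline is billed on demand.
    on_demand_instance_hours = sum(
        max(0, running - reserved_count) for running in hourly_instance_counts
    )
    return reserved_cost + on_demand_instance_hours * on_demand_cents_per_hour


load = [2, 2, 4, 6]  # instances running in each hour; the ASG minimum is 2

all_on_demand = total_cost_cents(load, 0, 10, 6)
reserve_minimum = total_cost_cents(load, 2, 10, 6)
reserve_maximum = total_cost_cents(load, 6, 10, 6)
```

With these made-up numbers, reserving the minimum is cheaper than paying on demand for everything, while reserving the peak is more expensive, because most of the reserved hours would sit idle.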

Now this is a good architecture. We have considered the 5 pillars of a well-architected application: cost (reserved instances plus an ASG for optimized cost), performance (vertical scaling, ELB, the ability to adapt performance over time), reliability (Route 53, multi-AZ deployment), security (security groups linking the ELB to the EC2 instances), and operational excellence.

MyClothes.com (stateful app)

Now let's build a stateful app. This e-commerce web app has a shopping cart, and we need somewhere to store all this user information. We want to keep the web app itself as stateless as possible, and users should not lose their shopping cart when they refresh the page.

Details such as address should be in the database.

We start with the same setup as the WhatsTheTime app: ELB, ASG, multi-AZ, and Route 53. Now whenever a user adds an item to the cart and the page refreshes, they can lose their cart, because the ELB may send the new request to a different EC2 instance from the one that holds their data. How do we remediate this? We introduce stickiness (session affinity) so that a user always talks to the same EC2 instance.

But if that EC2 instance is terminated, the data is still lost, so stickiness isn't a complete solution.
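A toy model of stickiness and its weak point. Instance and StickyBalancer are hypothetical classes; the assignment dictionary stands in for the stickiness cookie that a real load balancer would set.

```python
class Instance:
    """Web server that keeps carts in its own memory."""

    def __init__(self, name):
        self.name = name
        self.carts = {}  # lost when the instance is terminated

    def add_to_cart(self, user, item):
        self.carts.setdefault(user, []).append(item)

    def get_cart(self, user):
        return list(self.carts.get(user, []))


class StickyBalancer:
    """Sends each user to the same instance for as long as it exists."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.assignment = {}  # user -> instance, stands in for the cookie
        self._next = 0

    def route(self, user):
        instance = self.assignment.get(user)
        if instance is None or instance not in self.instances:
            # New user, or the previous instance is gone: pick another one.
            instance = self.instances[self._next % len(self.instances)]
            self._next += 1
            self.assignment[user] = instance
        return instance

    def terminate(self, instance):
        self.instances.remove(instance)


a, b = Instance("a"), Instance("b")
balancer = StickyBalancer([a, b])

balancer.route("alice").add_to_cart("alice", "shirt")
cart_before = balancer.route("alice").get_cart("alice")  # same instance, cart kept

balancer.terminate(a)
cart_after = balancer.route("alice").get_cart("alice")   # new instance, cart empty
```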

We introduce server-side sessions: we set a cookie called session_id and use ElastiCache to store the session data. As long as the user sends the same session_id, any instance can retrieve the same data for that user.
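The session pattern can be sketched like this. SessionStore is a plain dictionary standing in for ElastiCache, and WebInstance and its method names are made up for the example; a real app would talk to the cache over the network through a client library.

```python
import uuid


class SessionStore:
    """Stand-in for ElastiCache: one store shared by every web instance."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id, {"cart": []})

    def put(self, session_id, session):
        self._data[session_id] = session


class WebInstance:
    """Stateless web server: all session state lives in the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        if session_id is None:
            session_id = uuid.uuid4().hex  # value for the session_id cookie
        session = self.store.get(session_id)
        session["cart"].append(item)
        self.store.put(session_id, session)
        return session_id

    def view_cart(self, session_id):
        return list(self.store.get(session_id)["cart"])


store = SessionStore()
first_instance = WebInstance("a", store)
second_instance = WebInstance("b", store)

# The first request creates the session; the next one lands on another instance.
session_id = first_instance.add_to_cart(None, "shirt")
second_instance.add_to_cart(session_id, "socks")
cart = second_instance.view_cart(session_id)
```

Because no instance holds the cart itself, any of them can be terminated without the user losing data.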

Now we can also introduce Amazon RDS to store user data in a database. But what do we do if there are too many reads? We add read replicas to RDS; we can have up to 5 read replicas, although the exact limit depends on the database engine. On top of that we can add lazy loading with ElastiCache. This pattern requires changes to the application code, but it is more efficient, since frequently accessed data will be in ElastiCache and reads for it no longer need to hit RDS.
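Lazy loading (also called cache-aside) is the part that needs code changes, so here is a minimal sketch of it. Two dictionaries stand in for RDS and ElastiCache, and the class and method names are assumptions made for this example.

```python
class LazyLoadingRepository:
    """Cache-aside reads: check the cache first, fall back to the database."""

    def __init__(self, database, cache=None):
        self.database = database                      # dict standing in for RDS
        self.cache = {} if cache is None else cache   # dict standing in for ElastiCache
        self.db_reads = 0

    def get_user(self, user_id):
        if user_id in self.cache:
            return self.cache[user_id]  # cache hit: no database read
        self.db_reads += 1              # cache miss: read from the database
        record = self.database.get(user_id)
        if record is not None:
            self.cache[user_id] = record  # populate the cache for next time
        return record

    def update_user(self, user_id, record):
        self.database[user_id] = record
        self.cache.pop(user_id, None)   # invalidate so the next read is fresh


repo = LazyLoadingRepository({"u1": {"address": "1 Main St"}})
repo.get_user("u1")  # miss: goes to the database
repo.get_user("u1")  # hit: served from the cache
reads_after_two_gets = repo.db_reads

repo.update_user("u1", {"address": "2 Side St"})
updated = repo.get_user("u1")  # miss again because the entry was invalidated
```

Invalidating on write is one possible choice; it keeps the sketch short but means the first read after every update goes to the database.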

Now how do we make it multi-AZ, in order to survive disasters? We don't have to worry about Route 53, since it is already highly available. We make the ELB and the ASG multi-AZ. ElastiCache also supports multi-AZ if you use Redis. For RDS, you can enable multi-AZ to have a standby instance that takes over if the master goes down.

For security groups, we restrict each resource so that it only accepts traffic from the resource in front of it.