  • Sathish Kumar

Understanding Public Cloud Infrastructure

In my previous post, I gave an overview of the cloud from a software standpoint. Companies spend millions of dollars running the data centers that host their enterprise applications. To understand public cloud infrastructure, we need to start from the beginning, i.e., understand data centers (the on-premise deployment model).




In the on-premise deployment model, all the components required for applications run in enterprise data centers. The following are considered the three pillars of any data center:

  • Compute (Including Virtualization)

  • Network

  • Storage

An extensive discussion of each of these pillars is beyond the scope of this post (billion-dollar companies build products for just one of these pillars and are wannabes in the others), but a few trends are worth mentioning.


Compute: Most compute resources are virtualized today. A server's physical resources, such as CPU, RAM, and network interfaces, are shared between multiple virtual machines, which can run different operating systems. Special software called a hypervisor runs on top of the hardware, "virtualizes" these resources, and presents them to the guest operating systems. Each guest operating system is limited to the amount of CPU, disk, and RAM that the hypervisor presents to it. Using hypervisors enables better utilization of hardware resources. VMware's ESXi is the market leader among hypervisors. There are various other hypervisor solutions, including many open-source ones.
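To make the "slicing" idea concrete, here is a minimal sketch of a host handing out CPU and RAM to guests. The `Host` class, the VM names, and the sizes are all made up for illustration. It also assumes no overcommit, which real hypervisors often relax, so treat it as a toy model rather than how any particular hypervisor works.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A physical server whose resources are shared out by a hypervisor."""
    cpus: int
    ram_gb: int
    vms: dict = field(default_factory=dict)  # name -> (cpus, ram_gb)

    def free(self):
        """Return the (cpus, ram_gb) not yet presented to any guest."""
        used_cpus = sum(c for c, _ in self.vms.values())
        used_ram = sum(r for _, r in self.vms.values())
        return self.cpus - used_cpus, self.ram_gb - used_ram

    def create_vm(self, name, cpus, ram_gb):
        """Carve out a slice for a guest; refuse if the host cannot fit it."""
        free_cpus, free_ram = self.free()
        if cpus > free_cpus or ram_gb > free_ram:
            return False
        self.vms[name] = (cpus, ram_gb)
        return True


host = Host(cpus=32, ram_gb=128)
print(host.create_vm("web-linux", 8, 32))    # fits on the host
print(host.create_vm("db-windows", 16, 64))  # fits on the host
print(host.create_vm("batch", 16, 64))       # refused: only 8 CPUs are left
print(host.free())
```

Each guest only ever sees the slice it was given, which is the property the paragraph above describes.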


A more recent trend in compute is the use of microservices and containers.


Network: In the past decade or so, with more applications using service-oriented architecture (or a variation thereof), traffic "within" the data center (east-west traffic) has increased compared to traffic coming from outside. Also, with compute getting virtualized, legacy DC networks (based on VLANs) do not scale very well. For the most part, data center networks use VxLAN overlays today. VxLAN enables a virtual machine connected to one switch to communicate with a virtual machine on a remote switch without the VLAN having to be stretched between these switches.


Overlays/VxLAN provide the ability to "slice" the network, just as hypervisors enable compute resources to be sliced. Overlays are a separate topic by themselves, and I will write about them sometime.
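As a rough illustration of how an overlay tags traffic, the sketch below builds the 8-byte VxLAN header defined in RFC 7348 and prepends it to an inner frame. The function names and the sample VNI are my own, and the outer UDP/IP/Ethernet headers are left out, so this shows only the header layout and is not a working tunnel endpoint.

```python
import struct

VXLAN_UDP_PORT = 4789  # IANA-assigned UDP destination port for VxLAN
VLAN_ID_BITS = 12      # size of the 802.1Q VLAN ID field
VNI_BITS = 24          # size of the VxLAN Network Identifier field


def vxlan_header(vni):
    """Build the 8-byte VxLAN header for a given VNI."""
    if not 0 <= vni < 2 ** VNI_BITS:
        raise ValueError("VNI must fit in 24 bits")
    flags = 0x08 << 24  # "I" flag set: the VNI field is valid
    # Second word: VNI in the top 24 bits, low 8 bits reserved (zero).
    return struct.pack("!II", flags, vni << 8)


def encapsulate(inner_frame, vni):
    """Prefix an inner Ethernet frame with a VxLAN header."""
    return vxlan_header(vni) + inner_frame


print(vxlan_header(5000).hex())
print("VLAN ID space: ", 2 ** VLAN_ID_BITS)
print("VxLAN VNI space:", 2 ** VNI_BITS)
```

The 24-bit identifier is one reason overlays scale better than VLAN-based designs: the segment ID space is several thousand times larger than the 12-bit VLAN ID space.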


Storage: Storage in a data center consists of the following categories:

  • Locally Attached Storage: These are hard disks installed in the servers themselves.

  • Remote Storage: These are hard disks or storage managed by a network-attached storage (NAS) system. Just like virtualized compute, the NAS system provides a mechanism to share storage resources between various clients. Many NAS systems support a protocol called iSCSI, which exposes slices of the hard disk array to clients as LUNs (logical unit numbers). Additionally, RAID protects against HDD failures in these arrays.
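To give a feel for how RAID can survive a disk failure, here is a minimal sketch of single-parity protection using XOR, the idea behind parity-based levels such as RAID 5. The three tiny "disks" and their contents are invented for the example, and real arrays stripe and rotate parity across disks, which this sketch ignores.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks holding one block each, plus one computed parity block.
data_disks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_disks)

# Simulate losing the second disk, then rebuild it from the survivors
# and the parity block.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_disks[1])
```

Because XOR is its own inverse, any single missing block can be recomputed from the remaining blocks and the parity.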

With this background, let's look at the tasks involved in setting up and running a data center:


  • Set up the physical data center: ensure physical/access security, temperature/humidity control, and fire detection.

  • Install/manage servers.

  • Install/manage hypervisor software.

  • Install/manage guest VMs.

  • Install/manage networking hardware.

  • Set up/manage the network, typically based on overlays or segment routing.

  • Set up/manage storage. Perform periodic backups.

  • Install/manage security components like firewalls and intrusion detection/prevention systems.

  • Perform periodic software updates to hypervisors, operating systems, and network/storage hardware.

  • Run periodic audits to ensure compliance (ISO).

Finally, plan for disaster recovery, with a DR site in a separate location or region.


If the enterprise consists of multiple sites, a data center would have to be set up at each of these sites to host the applications/content specific to that site. These data centers have to be interconnected by a high-speed WAN (Data Center Interconnects).




As you can imagine, running data centers involves huge capital expenditure (CAPEX) as well as operational expenditure (OPEX), in addition to personnel requirements.


Essentially, public cloud providers like AWS allow users to use a portion of their infrastructure (Infrastructure as a Service) for a fee. Cloud providers have multiple DCs in a region and are present in multiple sites around the world. All the DCs and sites are connected by a high-speed private network, so it is possible to serve a multi-site enterprise from the public cloud.


Now, let's look at the various infrastructure components of AWS.


AWS Regions, Availability Zones, and VPCs


AWS Regions



The above map shows the various AWS regions. At the time of writing, AWS has 25 regions around the world. The regions are connected by the AWS private network.


Each region is made up of Availability Zones.




An AZ, or Availability Zone, is a logical data center made up of compute, network, and storage resources with redundant power supply and physical security. The keyword here is "logical": a single Availability Zone might be made up of more than one data center. That said, no two Availability Zones (AZs) share a data center. Put another way, the failure of one AZ is not expected to affect another AZ. AZs within a region are connected by a high-speed, low-latency network.


For redundancy, services can be spun up across different AZs.
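Here is a minimal sketch of what spreading a service across AZs buys you. The zone names, instance names, and the round-robin placement rule are assumptions for illustration only. In practice AWS or your own tooling decides placement, so read this as a way to picture the idea, not as how AWS does it.

```python
from itertools import cycle


def place_instances(instance_names, zones):
    """Assign instances to zones round-robin so no zone holds them all."""
    return dict(zip(instance_names, cycle(zones)))


def surviving_instances(placement, failed_zone):
    """Return the instances still running when one zone is lost."""
    return sorted(name for name, zone in placement.items() if zone != failed_zone)


zones = ["zone-a", "zone-b", "zone-c"]
placement = place_instances(["web-1", "web-2", "web-3", "web-4"], zones)

print(placement)
print(surviving_instances(placement, "zone-a"))
```

Even with an entire zone gone, part of the service keeps running, which is the point of deploying across AZs.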


A VPC, or Virtual Private Cloud, is a logically isolated network mapped to a customer. It contains the compute (EC2), storage, and other services provisioned by that customer. A VPC isolates one customer's resources from another's.
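One way to picture a VPC is as a private address range that gets carved into one subnet per AZ. The sketch below does that with Python's standard `ipaddress` module. The 10.0.0.0/16 range, the /24 subnet size, and the zone names are assumptions I picked for the example, not values from AWS.

```python
import ipaddress

# Hypothetical private address range for one customer's VPC.
vpc = ipaddress.ip_network("10.0.0.0/16")
zones = ["zone-a", "zone-b", "zone-c"]

# Take the first /24 blocks of the VPC range, one per zone.
subnets = dict(zip(zones, vpc.subnets(new_prefix=24)))


def belongs_to(address, network):
    """Check whether an IP address falls inside a given network."""
    return ipaddress.ip_address(address) in network


for zone, subnet in subnets.items():
    print(zone, subnet)

print(belongs_to("10.0.1.25", vpc))    # inside the VPC range
print(belongs_to("192.168.1.5", vpc))  # outside the VPC range
```

Each subnet lives in one zone, while the VPC as a whole spans the zones of a region.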




Now that we know the different infrastructure pieces of AWS, let's look at the steps involved in provisioning and "running" enterprise applications in AWS:

  • Set up AWS accounts.

  • Provision a VPC (a default VPC is pre-provisioned).

  • Set up compute, storage, and network, either with the AWS console or with APIs.

  • Set up redundancy with multiple AZs.

  • Set up a multi-region point of presence by creating VPCs in various regions.

  • Maintain the infrastructure and software applications.
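The "with APIs" part of the steps above can be sketched as follows. The function expects an object that behaves like a boto3 EC2 client (`boto3.client("ec2", region_name=...)`) and uses its `create_vpc` and `create_subnet` calls. So that the example runs without an AWS account, it is exercised here against `FakeEC2`, a hypothetical in-memory stand-in that I wrote for this post. The CIDR ranges and zone names are illustrative assumptions too.

```python
import itertools


def provision_network(ec2, vpc_cidr, subnet_cidrs_by_zone):
    """Create one VPC and one subnet per Availability Zone.

    `ec2` is expected to behave like boto3.client("ec2", region_name=...).
    """
    vpc_id = ec2.create_vpc(CidrBlock=vpc_cidr)["Vpc"]["VpcId"]
    subnet_ids = {}
    for zone, cidr in subnet_cidrs_by_zone.items():
        response = ec2.create_subnet(
            VpcId=vpc_id, CidrBlock=cidr, AvailabilityZone=zone
        )
        subnet_ids[zone] = response["Subnet"]["SubnetId"]
    return vpc_id, subnet_ids


class FakeEC2:
    """Hypothetical in-memory stand-in so the sketch runs without AWS."""

    def __init__(self):
        self._ids = itertools.count(1)

    def create_vpc(self, CidrBlock):
        return {"Vpc": {"VpcId": f"vpc-{next(self._ids):04d}"}}

    def create_subnet(self, VpcId, CidrBlock, AvailabilityZone):
        return {"Subnet": {"SubnetId": f"subnet-{next(self._ids):04d}"}}


vpc_id, subnet_ids = provision_network(
    FakeEC2(),
    "10.0.0.0/16",
    {"us-east-1a": "10.0.0.0/24", "us-east-1b": "10.0.1.0/24"},
)
print(vpc_id, subnet_ids)
```

Against a real account you would pass an actual client instead of `FakeEC2()`, and you would still need to add routing, security groups, and the compute and storage resources themselves.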

As we can see, the cloud removes the need to set up and maintain a data center, saving capital expenditure. However, a public cloud might not be suitable for all enterprises. For example, financial institutions that do not allow data to leave their premises may be unable to use the public cloud for those workloads.


The choice to deploy in the cloud vs. running your own DC will boil down to a comparison of CAPEX vs. OPEX and to data privacy requirements.
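The cost side of that comparison can be sketched with some simple arithmetic. Every figure below is invented purely for illustration (they are not real AWS or data center prices), and the model ignores things like hardware refresh cycles and discounts.

```python
def on_prem_cost(months, capex, monthly_opex):
    """Cumulative cost of owning a data center: upfront spend plus running costs."""
    return capex + monthly_opex * months


def cloud_cost(months, monthly_fee):
    """Cumulative cost of renting from a cloud provider: running costs only."""
    return monthly_fee * months


def break_even_month(capex, on_prem_monthly, cloud_monthly):
    """First month at which on-prem is no more expensive than cloud, or None."""
    if cloud_monthly <= on_prem_monthly:
        return None  # cloud never costs more per month, so on-prem never catches up
    saving = cloud_monthly - on_prem_monthly
    return -(-capex // saving)  # ceiling division on integers


# Hypothetical inputs: 1.2M upfront, 40k/month on-prem, 70k/month in the cloud.
month = break_even_month(1_200_000, 40_000, 70_000)
print(month)
print(on_prem_cost(month, 1_200_000, 40_000), cloud_cost(month, 70_000))
```

If the application is expected to live well past the break-even month, owning the DC starts to look attractive on cost alone. Otherwise, the cloud's pay-as-you-go model wins.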


Hope this short intro to cloud infra was useful. Ciao and have a great week ahead!!
