leanesch
Bloggin' it out
leanesch · 3 years ago
EKS must know:
Amazon Elastic Kubernetes Service (aka Amazon EKS) is a managed container service to run and scale Kubernetes applications in the cloud or on-premises.
There is also EKS Anywhere, which I will be talking about in another article; it allows customers to create and operate Kubernetes clusters on-premises, deploying onto customer-managed infrastructure. Two options are supported: bare metal clusters and VMware vSphere.
EKS uses aws-iam-authenticator to generate tokens that are passed to the kube-apiserver in order to verify authentication.
The command is: aws eks get-token --cluster-name <cluster-name>
After authentication, authorization is performed by checking the user's access against the aws-auth ConfigMap. Here we are talking about the RBAC rules that were discussed in my previous article.
Make sure to grant least-privilege access to IAM users and roles in aws-auth.
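As a sketch, an aws-auth ConfigMap mapping IAM roles to Kubernetes groups might look like this (the account ID and role names are placeholders):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    # Node role: required so worker nodes can join the cluster
    - rolearn: arn:aws:iam::111122223333:role/eks-node-role
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
    # Example: map a developer role to a restricted, custom RBAC group
    - rolearn: arn:aws:iam::111122223333:role/eks-developers
      username: developer
      groups:
        - dev-readonly   # bind this group to a least-privilege Role via a RoleBinding
```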
The EKS cluster endpoint can be public, private, or both. If a public endpoint is needed, you can restrict access to a range of IPs (CIDR blocks).
When the cluster is created, the creating identity is granted system:masters permissions; however, this mapping does not appear in the aws-auth ConfigMap. Two things to note here: use of this identity should be limited to bootstrapping new permissions in the ConfigMap or to emergency cases, and you should avoid giving it any other RBAC groups in the ConfigMap, as such an entry overrides the implicit system:masters access.
The best way to allow pods certain permissions to call the Kubernetes API is to use a service account (the namespace default or a custom one).
This service account's token is mounted in the pod at /var/run/secrets/kubernetes.io/serviceaccount.
Please make sure to check IRSA (IAM Roles for Service Accounts), a feature that assigns IAM roles to service accounts through an IAM OIDC provider. The AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables are injected into the pod.
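With IRSA, the link between a service account and an IAM role is a single annotation; a minimal sketch (the role ARN is a placeholder):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: s3-reader
  namespace: default
  annotations:
    # IRSA: the IAM role this service account is allowed to assume
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/s3-read-only
```

Pods that use this service account get AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE injected automatically, and the AWS SDKs pick them up without further configuration.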
Blocking/limiting access to instance metadata from pods is also recommended.
Avoid running pods in privileged mode, as a privileged pod inherits all of the Linux capabilities associated with root on the host.
There are different tools to enforce requirements on pods before they are created, such as OPA Gatekeeper and Pod Security Admission (which offers three modes: audit, warn, and enforce).
You can disable service account token mounting if the pod doesn't need access to the Kubernetes API.
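Disabling the token mount is a one-line change on the pod spec (it can also be set on the ServiceAccount itself); a minimal sketch:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: no-api-access
spec:
  # No token is mounted at /var/run/secrets/kubernetes.io/serviceaccount
  automountServiceAccountToken: false
  containers:
    - name: app
      image: nginx:stable
```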
It is recommended to enable control plane logs, which include the API server, audit, authenticator, controller manager, and scheduler logs.
You can use CloudWatch Logs Insights to query your EKS cluster logs in more detail.
With EKS, you can use Kubernetes network policies through a policy engine such as Calico or Cilium.
Check AWS VPC Flow Logs for information about traffic going through your cluster to look for unusual activity.
When the cluster is created, a security group is created to allow traffic between the control plane and the worker nodes.
For volume provisioning and secrets, you can check the EBS CSI driver, the EFS CSI driver, and the Secrets Store CSI driver.
To enforce security and permission boundaries, you can use Bottlerocket OS, a Linux-based operating system purpose-built to run containers.
Make sure to always update your worker nodes with the latest patch/updates.
With EKS Fargate, AWS automatically patches and updates the nodes for you.
Make sure to always scan and sign your container images.
Install the Kubernetes Metrics Server to collect resource metrics from applications; these metrics can then be used to scale applications with HPA and VPA.
Make use of health checks such as the liveness, startup, and readiness probes. The kubelet is responsible for executing these health checks.
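The three probe types can live side by side on one container; a minimal sketch for an HTTP service:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probed-app
spec:
  containers:
    - name: web
      image: nginx:stable
      startupProbe:            # gives slow-starting apps time before the other probes kick in
        httpGet: { path: /, port: 80 }
        failureThreshold: 30
        periodSeconds: 2
      livenessProbe:           # the kubelet restarts the container when this fails
        httpGet: { path: /, port: 80 }
        periodSeconds: 10
      readinessProbe:          # failing pods are removed from Service endpoints
        httpGet: { path: /, port: 80 }
        periodSeconds: 5
```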
Use PodDisruptionBudgets (PDBs) and the AWS Node Termination Handler to control pod termination behavior when worker nodes are updated or crash.
Check AWS X-Ray or Jaeger for tracing to get detailed information on your applications' requests.
Check topology spread constraints for pods in order to reduce the impact of an AZ failure on your pods.
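A pod-spec fragment spreading replicas evenly across availability zones (assuming the pods carry an app: web label) might look like this:

```yaml
# Fragment of a pod template spec
topologySpreadConstraints:
  - maxSkew: 1                                   # zones may differ by at most one pod
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule             # refuse to schedule rather than skew
    labelSelector:
      matchLabels:
        app: web
```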
EKS supports the AWS VPC CNI for assigning IPs to pods. Please note that the number of IPs that can be allocated depends on the number of ENIs that can be attached to a worker node and how many IPs each ENI supports.
L-IPAMD is a local IP address management daemon that is responsible for assigning IPs to pods.
You can check CNI custom networking to avoid IP allocation/shortage issues. This can be done by setting AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG to true.
To calculate the maximum number of pods that can be placed on a worker node: max = (number of ENIs) * (max IPv4 addresses per ENI - 1) + 2. With CNI custom networking, the primary ENI is not used for pods, so substitute (number of ENIs - 1).
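The formula is easy to sanity-check in code; a small sketch (the +2 accounts for host-networked daemonset pods such as aws-node and kube-proxy):

```python
def max_pods(enis: int, ips_per_eni: int, custom_networking: bool = False) -> int:
    """Maximum pods schedulable on a node with the AWS VPC CNI.

    The primary IP of each ENI is reserved for the node itself, hence
    (ips_per_eni - 1); with CNI custom networking the primary ENI is
    not used for pods, hence (enis - 1).
    """
    usable_enis = enis - 1 if custom_networking else enis
    return usable_enis * (ips_per_eni - 1) + 2

# m5.large supports 3 ENIs with 10 IPv4 addresses each
print(max_pods(3, 10))                          # 29
print(max_pods(3, 10, custom_networking=True))  # 20
```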
If you are using IPv6 for your cluster, custom networking is no longer needed.
Please note that a security group is attached to an EC2 instance, meaning that all of the ENIs attached to that instance share the same security groups. However, you can use "security groups for pods", which applies networking security rules at the pod level to specific pods. This is done by creating and attaching a trunk interface to the nodes; the VPC Resource Controller then creates branch interfaces that are associated with pods.
Kubernetes must know:
First thing to know is that Kubernetes has several alternatives such as Docker Swarm, Apache Mesos, and Nomad, and Kubernetes is not the solution for every architecture. Please define your requirements and check other alternatives first before starting with Kubernetes, as it can be complex or not really that beneficial in your case, and a simpler orchestrator can do the job.
If you are using a cloud provider and you want a managed Kubernetes service, you can check EKS for AWS, GKE for Google Cloud, or AKS for Azure.
Make sure to have proper monitoring and alerting for your cluster, as this gives more visibility and eases the management of containerized infrastructure by tracking utilization of cluster resources, including memory, CPU, storage, and networking performance. It is also recommended to monitor pods and applications in the cluster. The most common tools used for Kubernetes monitoring are ELK/EFK, Datadog, and Prometheus with Grafana (which will be my topic for the next article).
Please make sure to backup your cluster’s etcd data regularly.
In order to ensure that your Kubernetes cluster resources are only accessed by certain people, it's recommended to use RBAC in your cluster to build roles with the right level of access.
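A minimal RBAC sketch: a namespaced Role that can only read pods, bound to a hypothetical user "jane":

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: dev
rules:
  - apiGroups: [""]            # "" means the core API group
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: dev
subjects:
  - kind: User
    name: jane                 # hypothetical user
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```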
Scalability is just as important: three autoscaling mechanisms we must know and include in our cluster architecture are the Cluster Autoscaler, the Horizontal Pod Autoscaler (HPA), and the Vertical Pod Autoscaler (VPA).
Resource management is important as well: setting and right-sizing resource requests and limits helps avoid issues like OOM kills and pod eviction, and saves you money!
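Requests and limits are set per container; a minimal sketch (the values are illustrative, not recommendations):

```yaml
# Fragment of a pod spec
containers:
  - name: app
    image: nginx:stable
    resources:
      requests:          # what the scheduler reserves on a node
        cpu: 250m
        memory: 256Mi
      limits:            # hard caps; exceeding the memory limit gets the container OOM-killed
        cpu: 500m
        memory: 512Mi
```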
You may want to check the Kubernetes CIS Benchmark, a set of recommendations for configuring Kubernetes to support a strong security posture; you can take a look at this article to learn more about it.
Try to always run the latest stable GA Kubernetes version for newer functionality and, if using the cloud, to stay on supported versions.
Scanning for security vulnerabilities is very important as well; here we can talk about tools like Kube Hunter (penetration testing) and Kube Bench (CIS benchmark checks).
Make use of admission controllers when possible: they intercept and process requests to the Kubernetes API after the request is authenticated and authorized but prior to persistence of the object. This is useful when you have a set of constraints or behaviors to check before a resource is deployed, and it can also block vulnerable images from being deployed.
Speaking of admission controllers, you can also enforce policies in Kubernetes using a tool like OPA, which lets you define sets of security and compliance policies as code.
Use a tool like Falco for auditing the cluster; this is a nice way to log and monitor real-time activities and interactions with the API.
Another thing to take a look at is how to handle logging of applications running in containers (I recommend checking logging agents such as Fluentd/Fluent Bit), and especially how to set up log rotation to reduce storage growth and avoid performance issues.
In case you have multiple microservices running in the cluster, you can also implement a service mesh solution in order to have a reliable and secure architecture and other features such as encryption, authentication, authorization, routing between services and versions and load balancing. One of the famous service mesh solutions is Istio. You can take a look at this article for more details about service mesh.
One of the most important production-readiness features is a backup & restore solution, and especially one that can take snapshots of your cluster's Persistent Volumes. There are multiple tools for this that you might check and benchmark, such as Velero and Portworx.
You can use resource quotas and limit ranges to control the amount of resources consumed in a namespace for multi-tenancy.
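A sketch of both mechanisms for a hypothetical team-a namespace — the quota caps the namespace total, while the LimitRange fills in defaults for containers that declare nothing:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: team-a-defaults
  namespace: team-a
spec:
  limits:
    - type: Container
      default:            # applied when a container sets no limits
        cpu: 500m
        memory: 512Mi
      defaultRequest:     # applied when a container sets no requests
        cpu: 100m
        memory: 128Mi
```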
For multi-cluster management, you can check Rancher, Weave Flux, Lens, etc.
Terraform must know:
Terraform has plenty of providers that you can use to create resources, such as cloud providers and others like Kubernetes, which will be the topic of my next article.
If you can't find the provider you are looking for, you can create your own provider. Here is an example that you can follow.
Please do not store passwords or critical information in Terraform files. You can instead use AWS Secrets Manager, GCP Secret Manager, or HashiCorp Vault, and then consume the secrets using data sources.
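A sketch using the AWS provider's Secrets Manager data source (the secret name is a placeholder):

```hcl
# The secret value lives in AWS Secrets Manager, never in .tf files
# or version control.
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "prod/db/password" # hypothetical secret name
}

resource "aws_db_instance" "example" {
  # ... other arguments ...
  password = data.aws_secretsmanager_secret_version.db_password.secret_string
}
```

Keep in mind that the resolved value still ends up in the Terraform state, so the state itself must be stored encrypted and access-controlled.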
You can use Terraform modules in order to increase the reusability of your code.
One of the most important things to do when working with Terraform state is to introduce locking, in order to make sure the state cannot be modified while it is in use by another user.
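A classic setup is an S3 backend with a DynamoDB table for locking; a sketch (bucket and table names are placeholders):

```hcl
terraform {
  backend "s3" {
    bucket         = "my-tf-state"       # hypothetical bucket holding the state
    key            = "prod/terraform.tfstate"
    region         = "eu-west-1"
    encrypt        = true                # encrypt state at rest
    dynamodb_table = "tf-state-lock"     # DynamoDB table used for state locking
  }
}
```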
Make use of the outputs.tf file, as it facilitates consuming values in other modules/code.
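For example, a module creating a VPC might expose its ID like this (assuming a resource named aws_vpc.main):

```hcl
output "vpc_id" {
  description = "ID of the VPC created by this module"
  value       = aws_vpc.main.id
}
```

A parent configuration can then reference it as module.<module_name>.vpc_id.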
If you have some resources that were created manually, you can use the terraform import command to bring existing infrastructure under Terraform management.
Two commands that are not used very often are terraform fmt and terraform validate, which are useful for checking formatting and configuration issues in your code.
Always make sure to use the latest release of Terraform in order to keep up to date with the new functionalities.
Most of the time, you will need more information to debug an error; running export TF_LOG="DEBUG" helps generate the debug output. You can also change the log verbosity by using any of these levels: TRACE, DEBUG, INFO, WARN, or ERROR.
Testing your terraform code is as important as writing the code itself as it saves you time and energy to test your code before applying it. You can find in here some popular tools for testing.
I highly recommend checking this GitHub link, where you'll find multiple resources and tooling related to Terraform.
In case you want a SaaS solution with free remote state storage, a stable run environment, version control system (VCS) driven plans and applies, a collaborative web GUI and, most of all, the ability to estimate costs before applying infrastructure changes and to control them using policy as code, you may want to check Terraform Cloud.
Useful commands : (This article explains well the following commands)
terraform -help
terraform fmt
terraform version
terraform init
terraform get
terraform validate
terraform plan
terraform apply
terraform destroy
terraform taint
terraform refresh
terraform show
terraform state
terraform import
terraform providers
terraform workspace 
terraform output