Using Spot instances with deployments
AKS Spot instances can significantly reduce costs for workloads that tolerate interruptions, e.g. non-production environments and well-designed application deployments. Applications can be deployed onto spot instances, and the templates and code required for this are available via the Chart-Library.
To make use of spot instances for an application, set the following value in values.yaml for the given Helm chart:
spotInstances:
  enabled: true
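The same value can also be supplied on the command line rather than edited into values.yaml. The sketch below is illustrative only: the helper names (`run`, `deploy_with_spot`, `list_spot_nodes`) and the `DRY_RUN` variable are made up for this example, and it assumes the chart consumes `spotInstances.enabled` through the Chart-Library templates. The node label used is the one AKS applies to spot node pools.

```bash
#!/bin/bash
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Install or upgrade a release with spot instances switched on.
# $1 = release name, $2 = chart reference (both supplied by the caller).
deploy_with_spot() {
  run helm upgrade --install "$1" "$2" --set spotInstances.enabled=true
}

# List the nodes that belong to spot node pools, using the label AKS sets on them.
list_spot_nodes() {
  run kubectl get nodes -l kubernetes.azure.com/scalesetpriority=spot
}

# Only act when a release name and chart reference are passed to the script.
if [ "$#" -ge 2 ]; then
  deploy_with_spot "$1" "$2"
  list_spot_nodes
fi
```

Running the script with `DRY_RUN=1` prints the `helm` and `kubectl` commands without touching the cluster, which is a cheap way to check the arguments first.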
Simulating VMSS evictions
If there are concerns about how an application will be affected by running on Spot instances, it is possible to test what happens when Azure reclaims capacity and a spot instance goes offline. You can manually evict an AKS node from the node pool in the same way that Azure does when a spot instance is removed. The following script is an example showing how to evict an instance from VMSS pools on multiple AKS clusters. The script simply picks the first node from the list, so you will need to ensure your apps are running on this node; otherwise the test is pointless. You can, however, change the script to be more targeted to achieve the desired testing.
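For more targeted testing, one possible approach is to evict the node that hosts a particular pod. The sketch below is a hypothetical illustration, not the script from this guide: the function names (`parse_provider_id`, `simulate_eviction`, `evict_node_for_pod`) and the `DRY_RUN` variable are made up for this example. It assumes the node's `spec.providerID` has the form `azure:///subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Compute/virtualMachineScaleSets/<vmss>/virtualMachines/<id>`, and it uses `az vmss simulate-eviction` to request the eviction of a single Spot instance.

```bash
#!/bin/bash
# Hypothetical sketch: evict the spot node that hosts a specific pod.

# Split a VMSS providerID into "resourceGroup vmssName instanceId".
# The expected providerID shape is an assumption; anything else is rejected.
parse_provider_id() {
  local rest rg vmss instance_id
  case "$1" in
    */resourceGroups/*/providers/Microsoft.Compute/virtualMachineScaleSets/*/virtualMachines/*) ;;
    *) echo "unrecognised providerID: $1" >&2; return 1 ;;
  esac
  rest=${1#*/resourceGroups/}               # <rg>/providers/.../<vmss>/virtualMachines/<id>
  rg=${rest%%/*}
  rest=${rest#*/virtualMachineScaleSets/}   # <vmss>/virtualMachines/<id>
  vmss=${rest%%/*}
  instance_id=${rest##*/}
  echo "$rg $vmss $instance_id"
}

# Ask Azure to evict one instance; with DRY_RUN=1 only print the command.
simulate_eviction() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "az vmss simulate-eviction --resource-group $1 --name $2 --instance-id $3"
  else
    az vmss simulate-eviction --resource-group "$1" --name "$2" --instance-id "$3"
  fi
}

# $1 = pod name, $2 = namespace (defaults to "default").
evict_node_for_pod() {
  local pod="$1" namespace="${2:-default}"
  local node provider_id parsed
  node=$(kubectl get pod "$pod" -n "$namespace" -o jsonpath='{.spec.nodeName}') || return 1
  provider_id=$(kubectl get node "$node" -o jsonpath='{.spec.providerID}') || return 1
  parsed=$(parse_provider_id "$provider_id") || return 1
  # Word-split the three fields into $1 $2 $3.
  # shellcheck disable=SC2086
  set -- $parsed
  echo "Evicting instance $3 of scale set $2 (node $node)"
  simulate_eviction "$1" "$2" "$3"
}

# Only act when a pod name is passed to the script.
if [ "$#" -ge 1 ]; then
  evict_node_for_pod "$@"
fi
```

Looking up the node from the pod avoids relying on the application happening to run on the first node in the list. Setting `DRY_RUN=1` prints the `az` command instead of evicting anything.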
eviction script
```bash
#!/bin/bash
resourceGroup00=
```
Further Reading
AKS Spot Instances
* https://learn.microsoft.com/en-us/azure/aks/spot-node-pool
* https://techcommunity.microsoft.com/t5/azure-architecture-blog/optimize-azure-kubernetes-service-node-cost-by-combining/ba-p/3751787
* https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-orchestration-modes
* https://learn.microsoft.com/en-us/azure/virtual-machines/spot-vms
* https://learn.microsoft.com/en-us/azure/architecture/guide/spot/spot-eviction#understand-interruptible-workloads
* https://cast.ai/blog/topology-spread-constraints-for-increased-cluster-availability-and-efficiency-and-a-much-better-cost/
* https://www.linkedin.com/pulse/running-critical-workloads-using-spot-instances-aks-deenar-toraskar-1e

Topology Spread Constraints
* https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/
* https://pauldally.medium.com/the-most-common-reason-your-topologyspreadconstraint-isnt-working-fb9ce25297cd
* https://mby.io/blog/topology-spread-constraints/
* https://medium.com/wise-engineering/avoiding-kubernetes-pod-topology-spread-constraint-pitfalls-d369bb04689e
* https://techcommunity.microsoft.com/t5/azure-architecture-blog/optimize-azure-kubernetes-service-node-cost-by-combining/ba-p/3751787