Focus of this chapter: analyzing the HPA calculation logic from the source code.

1. Introduction to HPA

1.1 What is HPA

HPA stands for Horizontal Pod Autoscaler, i.e. horizontal Pod autoscaling. It can automatically scale the number of Pods in a ReplicationController, Deployment, or ReplicaSet based on CPU utilization or other metrics.

Purpose: by configuring an HPA, users get automatic scaling of a Deployment's Pod count: when traffic is high, more Pods run; when traffic is low, the Pod count drops to avoid wasting resources.



1.2 How to use HPA

(1) You need a Deployment/Service, etc.; see the community docs for examples.

(2) You need a corresponding HPA.

Example:

(1) Create a Deployment. This example runs 2 replicas:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: zx-hpa-test
  name: zx-hpa
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
  replicas: 2
  selector:
    matchLabels:
      app: zx-hpa-test
  template:
    metadata:
      labels:
        app: zx-hpa-test
      name: zx-hpa-test
    spec:
      terminationGracePeriodSeconds: 5
      containers:
        - name: busybox
          image: busybox:latest
          imagePullPolicy: IfNotPresent
          command:
            - sleep
            - "3600"

(2) Create the corresponding HPA:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa-zx-1
  annotations:
    metric-containerName: zx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1   # must specify which object the HPA monitors
    kind: Deployment
    name: zx-hpa
  minReplicas: 1          # minimum Pod count for the Deployment
  maxReplicas: 3          # maximum Pod count for the Deployment
  metrics:
    - type: Pods
      pods:
        metricName: pod_cpu_1m
        targetAverageValue: 60

The HPA looks up the target Deployment in its own namespace, so do not specify a namespace in scaleTargetRef. This also means the HPA and the Deployment must live in the same namespace.


Here I use the pod_cpu_1m metric, which is a custom metric; the analysis below covers how it is consumed.

Once created, watch the HPA: as the Deployment's CPU usage changes, its replica count changes with it.
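
For example, you can watch it with kubectl (the output below is only illustrative; the actual values depend on your cluster):

kubectl get hpa nginx-hpa-zx-1 -w
NAME             REFERENCE           TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx-hpa-zx-1   Deployment/zx-hpa   52/60     1         3         2          5m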


2. HPA source code analysis

2.1 Startup flags

The HPA controller starts along with the controller manager's initialization. It registers the following flags with the controller manager, which exposes them to users through its CLI:

// AddFlags adds flags related to HPAController for controller manager to the specified FlagSet.
func (o *HPAControllerOptions) AddFlags(fs *pflag.FlagSet) {
	if o == nil {
		return
	}

	fs.DurationVar(&o.HorizontalPodAutoscalerSyncPeriod.Duration, "horizontal-pod-autoscaler-sync-period", o.HorizontalPodAutoscalerSyncPeriod.Duration, "The period for syncing the number of pods in horizontal pod autoscaler.")
	fs.DurationVar(&o.HorizontalPodAutoscalerUpscaleForbiddenWindow.Duration, "horizontal-pod-autoscaler-upscale-delay", o.HorizontalPodAutoscalerUpscaleForbiddenWindow.Duration, "The period since last upscale, before another upscale can be performed in horizontal pod autoscaler.")
	fs.MarkDeprecated("horizontal-pod-autoscaler-upscale-delay", "This flag is currently no-op and will be deleted.")
	fs.DurationVar(&o.HorizontalPodAutoscalerDownscaleStabilizationWindow.Duration, "horizontal-pod-autoscaler-downscale-stabilization", o.HorizontalPodAutoscalerDownscaleStabilizationWindow.Duration, "The period for which autoscaler will look backwards and not scale down below any recommendation it made during that period.")
	fs.DurationVar(&o.HorizontalPodAutoscalerDownscaleForbiddenWindow.Duration, "horizontal-pod-autoscaler-downscale-delay", o.HorizontalPodAutoscalerDownscaleForbiddenWindow.Duration, "The period since last downscale, before another downscale can be performed in horizontal pod autoscaler.")
	fs.MarkDeprecated("horizontal-pod-autoscaler-downscale-delay", "This flag is currently no-op and will be deleted.")
	fs.Float64Var(&o.HorizontalPodAutoscalerTolerance, "horizontal-pod-autoscaler-tolerance", o.HorizontalPodAutoscalerTolerance, "The minimum change (from 1.0) in the desired-to-actual metrics ratio for the horizontal pod autoscaler to consider scaling.")
	fs.BoolVar(&o.HorizontalPodAutoscalerUseRESTClients, "horizontal-pod-autoscaler-use-rest-clients", o.HorizontalPodAutoscalerUseRESTClients, "If set to true, causes the horizontal pod autoscaler controller to use REST clients through the kube-aggregator, instead of using the legacy metrics client through the API server proxy.  This is required for custom metrics support in the horizontal pod autoscaler.")
	fs.DurationVar(&o.HorizontalPodAutoscalerCPUInitializationPeriod.Duration, "horizontal-pod-autoscaler-cpu-initialization-period", o.HorizontalPodAutoscalerCPUInitializationPeriod.Duration, "The period after pod start when CPU samples might be skipped.")
	fs.MarkDeprecated("horizontal-pod-autoscaler-use-rest-clients", "Heapster is no longer supported as a source for Horizontal Pod Autoscaler metrics.")
	fs.DurationVar(&o.HorizontalPodAutoscalerInitialReadinessDelay.Duration, "horizontal-pod-autoscaler-initial-readiness-delay", o.HorizontalPodAutoscalerInitialReadinessDelay.Duration, "The period after pod start during which readiness changes will be treated as initial readiness.")
}
Flag                                                 Default  Description
horizontal-pod-autoscaler-sync-period                15s      Period at which the controller syncs HPA objects.
horizontal-pod-autoscaler-downscale-stabilization    5m       Downscale stabilization window, i.e. the minimum interval between scale-downs (supported since v1.12).
horizontal-pod-autoscaler-tolerance                  0.1      Minimum change (from 1.0) in the desired-to-actual metric ratio; smaller changes trigger no scaling.
horizontal-pod-autoscaler-cpu-initialization-period  5m       Period after Pod start during which CPU samples may be skipped.
horizontal-pod-autoscaler-initial-readiness-delay    30s      Period after Pod start during which readiness changes are treated as initial readiness (Pod readiness timing cannot be known exactly).

kube-controller-manager must be started with --horizontal-pod-autoscaler-use-rest-clients=true to enable the custom-metrics REST clients.
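
For illustration, the relevant part of a kube-controller-manager invocation might look like this (the values shown are just the defaults from the table above):

kube-controller-manager \
  --horizontal-pod-autoscaler-use-rest-clients=true \
  --horizontal-pod-autoscaler-sync-period=15s \
  --horizontal-pod-autoscaler-tolerance=0.1 \
  --horizontal-pod-autoscaler-downscale-stabilization=5m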


2.2 Startup flow

Code flow:

startHPAController -> startHPAControllerWithMetricsClient -> Run -> worker -> processNextWorkItem -> reconcileKey -> reconcileAutoscaler

func (a *HorizontalController) reconcileKey(key string) (deleted bool, err error) {
	namespace, name, err := cache.SplitMetaNamespaceKey(key)
	if err != nil {
		return true, err
	}

	hpa, err := a.hpaLister.HorizontalPodAutoscalers(namespace).Get(name)
	if errors.IsNotFound(err) {
		klog.Infof("Horizontal Pod Autoscaler %s has been deleted in %s", name, namespace)
		delete(a.recommendations, key)
		return true, nil
	}

	return false, a.reconcileAutoscaler(hpa, key)
}
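
Of these steps, worker/processNextWorkItem just drains the HPA work queue and re-queues live objects so each HPA is reconciled again after the resync period. A simplified sketch of that step (details vary across Kubernetes versions):

func (a *HorizontalController) processNextWorkItem() bool {
	key, quit := a.queue.Get()
	if quit {
		return false
	}
	defer a.queue.Done(key)

	deleted, err := a.reconcileKey(key.(string))
	if err != nil {
		utilruntime.HandleError(err)
	}
	// Re-queue the HPA unless it was deleted, so it is processed
	// again after the resync period.
	if !deleted {
		a.queue.AddRateLimited(key)
	}
	return true
}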


2.3 Core calculation logic

Metric definitions come in three types: resource, pods, and external. Here we only analyze the pods type.

reconcileAutoscaler is the core HPA function. Its main logic:

  • 1. Perform some type conversions used by the subsequent HPA calculation.
  • 2. Compute the HPA's desired replica count.
  • 3. Decide from the result whether the replica count must change; if so, call the API to change it, then handle errors.

func (a *HorizontalController) reconcileAutoscaler(hpav1Shared *autoscalingv1.HorizontalPodAutoscaler, key string) error {
	// 1. Convert the shared v1 HPA into the internal v2 type, resolve the scale
	// target, and fetch its scale subresource from the apiserver; errors along
	// the way are recorded as events/conditions on the HPA.
	// ... (code elided)

	// 2. Decide whether the replica count needs to be computed; if so, call
	// computeReplicasForMetrics to compute this HPA's desired replicas.
	desiredReplicas := int32(0)
	rescaleReason := ""

	var minReplicas int32

	if hpa.Spec.MinReplicas != nil {
		minReplicas = *hpa.Spec.MinReplicas
	} else {
		// Default value
		minReplicas = 1
	}

	rescale := true

	if scale.Spec.Replicas == 0 && minReplicas != 0 {
		// Autoscaling is disabled for this resource
		desiredReplicas = 0
		rescale = false
		setCondition(hpa, autoscalingv2.ScalingActive, v1.ConditionFalse, "ScalingDisabled", "scaling is disabled since the replica count of the target is zero")
	} else if currentReplicas > hpa.Spec.MaxReplicas {
		rescaleReason = "Current number of replicas above Spec.MaxReplicas"
		desiredReplicas = hpa.Spec.MaxReplicas
	} else if currentReplicas < minReplicas {
		rescaleReason = "Current number of replicas below Spec.MinReplicas"
		desiredReplicas = minReplicas
	} else {
		var metricTimestamp time.Time
		metricDesiredReplicas, metricName, metricStatuses, metricTimestamp, err = a.computeReplicasForMetrics(hpa, scale, hpa.Spec.Metrics)
		if err != nil {
			a.setCurrentReplicasInStatus(hpa, currentReplicas)
			if err := a.updateStatusIfNeeded(hpaStatusOriginal, hpa); err != nil {
				utilruntime.HandleError(err)
			}
			a.eventRecorder.Event(hpa, v1.EventTypeWarning, "FailedComputeMetricsReplicas", err.Error())
			return fmt.Errorf("failed to compute desired number of replicas based on listed metrics for %s: %v", reference, err)
		}

		klog.V(4).Infof("proposing %v desired replicas (based on %s from %s) for %s", metricDesiredReplicas, metricName, metricTimestamp, reference)

		rescaleMetric := ""
		if metricDesiredReplicas > desiredReplicas {
			desiredReplicas = metricDesiredReplicas
			rescaleMetric = metricName
		}
		if desiredReplicas > currentReplicas {
			rescaleReason = fmt.Sprintf("%s above target", rescaleMetric)
		}
		if desiredReplicas < currentReplicas {
			rescaleReason = "All metrics below target"
		}
		desiredReplicas = a.normalizeDesiredReplicas(hpa, key, currentReplicas, desiredReplicas, minReplicas)
		rescale = desiredReplicas != currentReplicas
	}

	// 3. Perform the scale operation and handle errors.
	if rescale {
		scale.Spec.Replicas = desiredReplicas
		_, err = a.scaleNamespacer.Scales(hpa.Namespace).Update(targetGR, scale)
		if err != nil {
			a.eventRecorder.Eventf(hpa, v1.EventTypeWarning, "FailedRescale", "New size: %d; reason: %s; error: %v", desiredReplicas, rescaleReason, err.Error())
			setCondition(hpa, autoscalingv2.AbleToScale, v1.ConditionFalse, "FailedUpdateScale", "the HPA controller was unable to update the target scale: %v", err)
			a.setCurrentReplicasInStatus(hpa, currentReplicas)
			if err := a.updateStatusIfNeeded(hpaStatusOriginal, hpa); err != nil {
				utilruntime.HandleError(err)
			}
			return fmt.Errorf("failed to rescale %s: %v", reference, err)
		}
		setCondition(hpa, autoscalingv2.AbleToScale, v1.ConditionTrue, "SucceededRescale", "the HPA controller was able to update the target scale to %d", desiredReplicas)
		a.eventRecorder.Eventf(hpa, v1.EventTypeNormal, "SuccessfulRescale", "New size: %d; reason: %s", desiredReplicas, rescaleReason)
		klog.Infof("Successful rescale of %s, old size: %d, new size: %d, reason: %s",
			hpa.Name, currentReplicas, desiredReplicas, rescaleReason)
	} else {
		klog.V(4).Infof("decided not to scale %s to %v (last scale time was %s)", reference, desiredReplicas, hpa.Status.LastScaleTime)
		desiredReplicas = currentReplicas
	}

	a.setStatus(hpa, currentReplicas, desiredReplicas, metricStatuses, rescale)
	return a.updateStatusIfNeeded(hpaStatusOriginal, hpa)
}


We mainly care about step 2: how the HPA computes the desired replica count.

2.4 Computing the desired replica count

Concepts:

Minimum: minReplicas. Set by the user in the HPA yaml. Optional; defaults to 1 if unset.

Maximum: maxReplicas. Set by the user in the HPA yaml. Required; if unset, creation fails with the error shown below.

Current: currentReplicas. The Deployment's current replica count as observed by the HPA.

Desired: desiredReplicas. The replica count the HPA wants the Deployment to have.

error: error validating "nginx-deployment-hpa-test.yaml": error validating data: ValidationError(HorizontalPodAutoscaler.spec): missing required field "maxReplicas" in io.k8s.api.autoscaling.v2beta1.HorizontalPodAutoscalerSpec; if you choose to ignore these errors, turn validation off with --validate=false

The calculation splits into two cases: in the first, no computation is needed and the desired value falls out directly; in the second, a function must be called to compute it.

Case 1: no computation needed

(1) currentReplicas == 0: desiredReplicas = 0. Autoscaling is treated as disabled; no scaling.

(2) currentReplicas > maxReplicas: no need to compute; desiredReplicas = maxReplicas, and a rescale is needed.

(3) currentReplicas < minReplicas: no need to compute; desiredReplicas = minReplicas, and a rescale is needed.


Case 2: minReplicas <= currentReplicas <= maxReplicas. The desired value must be computed.

The call chain here is computeReplicasForMetrics -> computeReplicasForMetric -> GetMetricReplicas.

One point worth noting about computeReplicasForMetrics: it handles multiple metrics, i.e. a single HPA may define several metric specs, for example:

  - type: Resource
    resource:
      name: cpu
      # Utilization-type target; Resource metrics only support
      # Utilization and AverageValue target types
      target:
        type: Utilization
        averageUtilization: 50
  # A Pods-type metric
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      # AverageValue-type target; Pods metrics only support AverageValue
      target:
        type: AverageValue
        averageValue: 1k

The HPA's rule here is: the largest wins. For example, suppose cpu Utilization says 4 Pods are needed but packets-per-second says 5; the HPA then uses 5. See the code below:

// computeReplicasForMetrics computes the desired number of replicas for the metric specifications listed in the HPA,
// returning the maximum  of the computed replica counts, a description of the associated metric, and the statuses of
// all metrics computed.
func (a *HorizontalController) computeReplicasForMetrics(hpa *autoscalingv2.HorizontalPodAutoscaler, scale *autoscalingv1.Scale,
	metricSpecs []autoscalingv2.MetricSpec) (replicas int32, metric string, statuses []autoscalingv2.MetricStatus, timestamp time.Time, err error) {

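	// (setup and some variable declarations are elided from this excerpt)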
	for i, metricSpec := range metricSpecs {
		replicaCountProposal, metricNameProposal, timestampProposal, condition, err := a.computeReplicasForMetric(hpa, metricSpec, specReplicas, statusReplicas, selector, &statuses[i])

		if err != nil {
			if invalidMetricsCount <= 0 {
				invalidMetricCondition = condition
				invalidMetricError = err
			}
			invalidMetricsCount++
		}
		if err == nil && (replicas == 0 || replicaCountProposal > replicas) {
			timestamp = timestampProposal
			replicas = replicaCountProposal
			metric = metricNameProposal
		}
	}

	// If all metrics are invalid return error and set condition on hpa based on first invalid metric.
	if invalidMetricsCount >= len(metricSpecs) {
		setCondition(hpa, invalidMetricCondition.Type, invalidMetricCondition.Status, invalidMetricCondition.Reason, invalidMetricCondition.Message)
		return 0, "", statuses, time.Time{}, fmt.Errorf("invalid metrics (%v invalid out of %v), first error is: %v", invalidMetricsCount, len(metricSpecs), invalidMetricError)
	}
	setCondition(hpa, autoscalingv2.ScalingActive, v1.ConditionTrue, "ValidMetricFound", "the HPA was able to successfully calculate a replica count from %s", metric)
	return replicas, metric, statuses, timestamp, nil
}


For a specific metric, the computation has two steps:

(1) GetRawMetric: fetch the concrete metric values.

(2) calcPlainMetricReplicas: compute the desired replica count.

One thing to note here: targetUtilization has already been converted to a milli-value, i.e. multiplied by 10^3.

// GetMetricReplicas calculates the desired replica count based on a target metric utilization
// (as a milli-value) for pods matching the given selector in the given namespace, and the
// current replica count
func (c *ReplicaCalculator) GetMetricReplicas(currentReplicas int32, targetUtilization int64, metricName string, namespace string, selector labels.Selector, metricSelector labels.Selector) (replicaCount int32, utilization int64, timestamp time.Time, err error) {
	metrics, timestamp, err := c.metricsClient.GetRawMetric(metricName, namespace, selector, metricSelector)
	if err != nil {
		return 0, 0, time.Time{}, fmt.Errorf("unable to get metric %s: %v", metricName, err)
	}

	replicaCount, utilization, err = c.calcPlainMetricReplicas(metrics, currentReplicas, targetUtilization, namespace, selector, v1.ResourceName(""))
	return replicaCount, utilization, timestamp, err
}


2.4.1 GetRawMetric: fetching the metric values
// GetRawMetric gets the given metric (and an associated oldest timestamp)
// for all pods matching the specified selector in the given namespace
func (c *customMetricsClient) GetRawMetric(metricName string, namespace string, selector labels.Selector, metricSelector labels.Selector) (PodMetricsInfo, time.Time, error) {
  // 1. Call GetForObjects directly, issuing a REST request to fetch the data
	metrics, err := c.client.NamespacedMetrics(namespace).GetForObjects(schema.GroupKind{Kind: "Pod"}, selector, metricName, metricSelector)
	if err != nil {
		return nil, time.Time{}, fmt.Errorf("unable to fetch metrics from custom metrics API: %v", err)
	}

	if len(metrics.Items) == 0 {
		return nil, time.Time{}, fmt.Errorf("no metrics returned from custom metrics API")
	}
  
  // 2. Post-process the fetched values; MilliValue() converts each value to milli-units, i.e. multiplies it by 10^3
	res := make(PodMetricsInfo, len(metrics.Items))
	for _, m := range metrics.Items {
		window := metricServerDefaultMetricWindow
		if m.WindowSeconds != nil {
			window = time.Duration(*m.WindowSeconds) * time.Second
		}
		res[m.DescribedObject.Name] = PodMetric{
			Timestamp: m.Timestamp.Time,
			Window:    window,
			Value:     int64(m.Value.MilliValue()),
		}
	}

	timestamp := metrics.Items[0].Timestamp.Time

	return res, timestamp, nil
}
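
A quick illustration of that conversion (a standalone snippet using the apimachinery resource package, not part of the HPA source): a target of 60 becomes 60000 after MilliValue(), which is why the target and the fetched values end up in the same milli-unit scale.

package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	q := resource.MustParse("60") // e.g. targetAverageValue: 60 from the HPA yaml
	fmt.Println(q.MilliValue())   // 60000: comparisons happen in milli-units
}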


2.4.2 calcPlainMetricReplicas: computing the desired replica count

The code is elided here; only the logic is described. (A consolidated sketch follows at the end of this subsection.)

Step 1: fetch all relevant Pods from the apiserver and split them into three groups:

a. missingPods: Pods that are running but do not report this metric.

b. ignoredPods: used to handle the startup delay of cpu-related resource metrics (i.e. Pods not yet ready); not discussed further here.

c. readyPodCount: the number of Pods that are running and do report the metric.

Step 2: call GetMetricUtilizationRatio to compare actual against target. For all Pods whose metric is available, take the average of their metric values: usageRatio = actual / target; utilization = actual (the average).

Step 3: compute desiredReplicas. When missingPods is empty, i.e. every target Pod is running and reports a metric value:

a. If usageRatio is within the tolerance, no scale operation is performed. By default c.tolerance = 0.1, so the Pod count stays unchanged while usageRatio is in [0.9, 1.1]:

if math.Abs(1.0-usageRatio) <= c.tolerance {
    // return the current replicas if the change would be too small
    return currentReplicas, utilization, nil
}

b. If usageRatio falls outside the tolerance, round up to get desiredReplicas: return int32(math.Ceil(usageRatio * float64(readyPodCount))), utilization, nil

When missingPods is non-empty, i.e. some target Pods' metric values could not be fetched: on scale-down, a Pod with a missing metric is treated as if it were using exactly the target value:

if usageRatio < 1.0 {
	// on a scale-down, treat missing pods as using 100% of the resource request
	for podName := range missingPods {
		metrics[podName] = metricsclient.PodMetric{Value: targetUtilization}
	}
}

On scale-up, a Pod with a missing metric is treated as using 0 of the metric:

for podName := range missingPods {
	metrics[podName] = metricsclient.PodMetric{Value: 0}
}

After this backfill, recompute the actual-to-target ratio as newUsageRatio.

No scale operation is performed in two situations: newUsageRatio is within the tolerance; or the backfill flipped the scale direction, i.e. before it the computation said scale up and after it scale down (or vice versa).

Otherwise, round up just as before:

if math.Abs(1.0-newUsageRatio) <= c.tolerance || (usageRatio < 1.0 && newUsageRatio > 1.0) || (usageRatio > 1.0 && newUsageRatio < 1.0) {
	// return the current replicas if the change would be too small,
	// or if the new usage ratio would cause a change in scale direction
	return currentReplicas, utilization, nil
}
return int32(math.Ceil(newUsageRatio * float64(len(metrics)))), utilization, nil
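
Putting the pieces together, here is a self-contained sketch of this logic. It is illustrative, not the upstream source: the name calcPlainMetricReplicas mirrors the real function, but the signature is simplified and ignoredPods handling is omitted.

package main

import (
	"fmt"
	"math"
)

const tolerance = 0.1 // --horizontal-pod-autoscaler-tolerance

// calcPlainMetricReplicas mirrors the logic described above: metrics holds the
// per-Pod metric values (milli-units) that were fetched, missingPods lists the
// running Pods whose metric is absent, target is the desired average (milli-units).
func calcPlainMetricReplicas(metrics map[string]int64, missingPods []string, currentReplicas int32, target int64) int32 {
	readyPodCount := len(metrics)
	var sum int64
	for _, v := range metrics {
		sum += v
	}
	usageRatio := float64(sum) / (float64(target) * float64(readyPodCount))

	if len(missingPods) == 0 {
		if math.Abs(1.0-usageRatio) <= tolerance {
			return currentReplicas // change too small: keep the current count
		}
		return int32(math.Ceil(usageRatio * float64(readyPodCount)))
	}

	// Backfill: missing Pods count as target on scale-down, 0 on scale-up.
	for _, pod := range missingPods {
		if usageRatio < 1.0 {
			metrics[pod] = target
		} else {
			metrics[pod] = 0
		}
	}
	sum = 0
	for _, v := range metrics {
		sum += v
	}
	newUsageRatio := float64(sum) / (float64(target) * float64(len(metrics)))

	// Keep the current count if the change is too small
	// or if the backfill flipped the scale direction.
	if math.Abs(1.0-newUsageRatio) <= tolerance ||
		(usageRatio < 1.0 && newUsageRatio > 1.0) ||
		(usageRatio > 1.0 && newUsageRatio < 1.0) {
		return currentReplicas
	}
	return int32(math.Ceil(newUsageRatio * float64(len(metrics))))
}

func main() {
	// Two Pods at 50 and 100 against a target of 60 -> 3 replicas.
	fmt.Println(calcPlainMetricReplicas(map[string]int64{"A1": 50, "A2": 100}, nil, 2, 60))
}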


Finally, the HPA writes desiredReplicas into scale.Spec.Replicas and calls a.scaleNamespacer.Scales(hpa.Namespace).Update(targetGR, scale) to send the update of the target's scale subresource to the apiserver; that completes one round of reconciliation for this HPA.


3. Worked examples

3.1 HPA scale-up logic

Key concept: tolerance (the HPA scaling tolerance), 0.1 by default.

Custom server: a custom metric service. This is an abstraction that supplies concrete metric values to the HPA; concretely, the custom server could be Prometheus or any other monitoring system. The next article covers how to wire a custom server up to the HPA.


3.2 Scenario 1

DeployA currently runs two Pods, A1 and A2. The Deployment has an HPA whose metric is memory usage, and the rule is: scale up when the average usage exceeds 60.


HPA scale-up steps:

Step 1: send a request to the monitor-adaptor asking for the metric values of all Pods under deployA. Here it receives A1=50 and A2=100.

Step 2: backfill metric values for any Pods whose metric is missing. The HPA inspects the cluster state, finds that deployA has two Pods, A1 and A2, and that both metric values were fetched, so no backfill is needed. (The next scenario shows the backfill case.)

Step 3: compute. (A numeric check in Go follows this walkthrough.)

(1) Compute the ratio of the average Pod metric value to the target, which you can call the scaling ratio:

  ratio = (A1+A2)/(2*target) = (50+100)/120 = 1.25

In principle there is no need to divide by the target: just compute (50+100)/2 = 75 and compare 75 with 60; 75 > 60, so scale up.

The ratio form is used for two main reasons:

  • with the tolerance concept, a ratio makes it easy to check whether the change exceeds the tolerance
  • it feeds directly into the scaling computation

(2) Check against the tolerance.

Here 1.25 - 1 > 0.1 (the default tolerance), so a scale-up is needed.

This is where the tolerance shows its effect: with it, the average metric must exceed 66 (60 * 1.1) before a scale-up happens.

(3) Compute the actual replica count.

Round up: scaling ratio * current replica count.

Here: 1.25 * 2 = 2.5, which rounds up to 3.
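
The arithmetic can be checked with a few lines of Go (an illustrative snippet, not HPA source):

package main

import (
	"fmt"
	"math"
)

func main() {
	const target, tolerance = 60.0, 0.1
	ratio := (50.0 + 100.0) / (2 * target)     // 1.25
	fmt.Println(math.Abs(1-ratio) > tolerance) // true: outside tolerance, rescale
	fmt.Println(int(math.Ceil(ratio * 2)))     // 3 desired replicas
}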


3.3 Scenario 2

The difference from scenario 1: for some reason, when the monitor-adaptor reports to the HPA, only A1=2 arrives; A2's data is lost.

HPA scaling steps:

Step 1: send a request to the monitor-adaptor asking for the metric values of all Pods under deployA. Here only A1=2 arrives.

Step 2: backfill metric values for Pods whose metric is missing. The HPA inspects the cluster state and finds that deployA has two Pods, A1 and A2, but only A1's value is present. The HPA therefore assumes A2 has data that simply failed to fetch, and assigns it a value itself: 0 or the target.

The backfill rule: when A1 > target (the available metrics point at a scale-up), A2 = 0; when A1 <= target (they point at a scale-down), A2 = target.

Since A1=2 is below the target (60), the HPA ends up computing with:

A1=2; A2=60; target=60

Step 3: compute. (Again, a numeric check in Go follows.)

(1) Compute the ratio of the average Pod metric value to the target (the scaling ratio):

  ratio = (A1+A2)/(2*target) = (2+60)/120 ≈ 0.517

(2) Check against the tolerance.

Here 1 - 0.517 > 0.1 (the default tolerance), so a scale-down is needed.

(3) Compute the actual replica count.

Round up: scaling ratio * current replica count (here, the number of metric entries: A1 and A2).

That is: 0.517 * 2 = 1.034, which rounds up to 2.
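
The same check in Go, including the backfill (illustrative snippet):

package main

import (
	"fmt"
	"math"
)

func main() {
	const target, tolerance = 60.0, 0.1

	// Only A1=2 was fetched; 2/60 < 1 points at a scale-down,
	// so the missing A2 is backfilled with the target value.
	a1, a2 := 2.0, target

	ratio := (a1 + a2) / (2 * target)          // ~0.517
	fmt.Println(math.Abs(1-ratio) > tolerance) // true: outside tolerance
	fmt.Println(int(math.Ceil(ratio * 2)))     // 2 desired replicas
}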


4. Summary

(1) An HPA can define multiple metrics. With multiple metrics, whichever yields the largest computed replica count wins.

(2) For a specific metric (the pods type here), the HPA first reads the user-defined parameters: maximum, minimum, target value, and so on. One detail: the target value is multiplied by 1000 (converted to milli-units) for the computation.

(3) The metric values are fetched from a custom REST service here: the HPA just sends REST requests and gets data back. This suits companies that want to autoscale on their own monitoring data. Note that each fetched value is also multiplied by 1000, so the conversion cancels against the target's.

(4) The desired value follows from the formula: the desired replica count is the smallest positive integer X such that X * target >= the sum of the Pods' metric values (subject to the tolerance check); e.g. a total of 150 against a target of 60 gives X = 3. See the computation walkthrough above for details.