
Scaler Configuration

This document describes the comprehensive configuration options for the ICC Scaler service, which implements intelligent autoscaling for Node.js watts using dual reactive and predictive algorithms.

The ICC Scaler service provides intelligent autoscaling for Node.js watts in Kubernetes environments. It combines a Reactive Autoscaler Algorithm for real-time scaling responses and a Trends Learning Algorithm for predictive scaling based on historical patterns. The system is designed to handle 30-50 scaling events per day with high efficiency and transparency.

Key components:

  1. Reactive Autoscaler: Responds in real-time to performance signals (ELU > 90%)
  2. Trends Learning Algorithm: Analyzes 30-day history (900-1500 events) for predictive scaling
  3. Dual Integration: Combines reactive and proactive scaling decisions (see the sketch after these lists)

Key features:

  • Event Loop Utilization (ELU) and Heap monitoring
  • Historical performance clustering
  • 3-day half-life decay for recent trend prioritization
  • Leader election for multi-instance deployments
  • Comprehensive validation and success scoring
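As a rough illustration of how the reactive and predictive paths can be combined, the sketch below takes the more demanding of the two suggested replica counts and clamps it to the configured pod limits. This is a minimal sketch under assumed semantics; the `ScalingDecision` type and `combineDecisions` function are hypothetical and not part of the ICC API.

```ts
// Hypothetical types; the real ICC Scaler internals may differ.
interface ScalingDecision {
  source: 'reactive' | 'predictive'
  targetPods: number
  confidence: number // 0..1
}

// Combine a reactive decision with an optional prediction, clamped to pod limits.
function combineDecisions (
  reactive: ScalingDecision | null,
  predictive: ScalingDecision | null,
  minPods: number,
  maxPods: number,
  confidenceThreshold = 0.8 // predictions below this confidence are ignored
): number | null {
  const candidates: number[] = []
  if (reactive) candidates.push(reactive.targetPods)
  if (predictive && predictive.confidence >= confidenceThreshold) {
    candidates.push(predictive.targetPods)
  }
  if (candidates.length === 0) return null

  // Scale to the most demanding suggestion, within the configured limits.
  const target = Math.max(...candidates)
  return Math.min(maxPods, Math.max(minPods, target))
}
```

Taking the maximum of the two suggestions is only one plausible policy; the actual service may weigh confidence or recency differently.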

These values are used when creating a default scale configuration for a new application:

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| `PLT_SCALER_MIN_PODS_DEFAULT` | Integer | 1 | Default minimum number of pods when creating new scale configs |
| `PLT_SCALER_MAX_PODS_DEFAULT` | Integer | 10 | Default maximum number of pods when creating new scale configs |
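For example, a consumer of these defaults could resolve them like this (a minimal sketch assuming a Node.js process; the parsing and error handling are illustrative, not the actual ICC code):

```ts
// Resolve default pod limits from the environment, falling back to the documented defaults.
const minPodsDefault = Number.parseInt(process.env.PLT_SCALER_MIN_PODS_DEFAULT ?? '1', 10)
const maxPodsDefault = Number.parseInt(process.env.PLT_SCALER_MAX_PODS_DEFAULT ?? '10', 10)

if (
  !Number.isInteger(minPodsDefault) || !Number.isInteger(maxPodsDefault) ||
  minPodsDefault < 1 || maxPodsDefault < minPodsDefault
) {
  throw new Error('Invalid scaler pod defaults: check PLT_SCALER_MIN_PODS_DEFAULT / PLT_SCALER_MAX_PODS_DEFAULT')
}
```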

Per-Application Limits via Kubernetes Labels


You can configure min/max pod limits for individual applications using Kubernetes labels on your Deployments or StatefulSets. These labels override the default values for specific applications.

Label Configuration:

These values are used when syncing scale configuration from Kubernetes labels:

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| `PLT_SCALER_POD_MIN_LABEL` | String | `icc.platformatic.dev/scaler-min` | Kubernetes label name for minimum pods |
| `PLT_SCALER_POD_MIN_DEFAULT_VALUE` | Integer | 1 | Default minimum pods when syncing from K8s labels and label is not set |
| `PLT_SCALER_POD_MAX_LABEL` | String | `icc.platformatic.dev/scaler-max` | Kubernetes label name for maximum pods |
| `PLT_SCALER_POD_MAX_DEFAULT_VALUE` | Integer | 10 | Default maximum pods when syncing from K8s labels and label is not set |

Example Kubernetes Deployment with scaler labels:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-application
  labels:
    icc.platformatic.dev/scaler-min: "2"
    icc.platformatic.dev/scaler-max: "20"
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-application
  template:
    metadata:
      labels:
        app: my-application
        icc.platformatic.dev/scaler-min: "2"
        icc.platformatic.dev/scaler-max: "20"
    spec:
      containers:
        - name: app
          image: my-app:latest
```

How it works:

  • The scaler syncs configuration from Kubernetes labels automatically during the periodic K8s sync (default: every 60 seconds)
  • Labels are read from the controller (Deployment/StatefulSet) via pod labels
  • You can set only scaler-min, only scaler-max, or both labels
  • Values must be valid integers greater than or equal to 1
  • scaler-min must not be greater than scaler-max
  • If validation fails, the sync is skipped and a warning is logged (see the validation sketch after this list)
  • To manually trigger a sync for all applications, use the syncScalerConfigFromLabels() function
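The validation rules above can be summarized in a short sketch (the `parseScalerLabels` helper is hypothetical; label names and defaults assume the environment variables from the table above):

```ts
// Parse and validate scaler-min / scaler-max from controller labels.
// Returns the limits to apply, or null if the sync should be skipped.
function parseScalerLabels (labels: Record<string, string>): { min: number, max: number } | null {
  const minLabel = process.env.PLT_SCALER_POD_MIN_LABEL ?? 'icc.platformatic.dev/scaler-min'
  const maxLabel = process.env.PLT_SCALER_POD_MAX_LABEL ?? 'icc.platformatic.dev/scaler-max'

  // Fall back to the configured defaults when a label is not set.
  const min = Number.parseInt(labels[minLabel] ?? process.env.PLT_SCALER_POD_MIN_DEFAULT_VALUE ?? '1', 10)
  const max = Number.parseInt(labels[maxLabel] ?? process.env.PLT_SCALER_POD_MAX_DEFAULT_VALUE ?? '10', 10)

  // Values must be integers >= 1, and scaler-min must not be greater than scaler-max.
  if (!Number.isInteger(min) || !Number.isInteger(max) || min < 1 || max < 1 || min > max) {
    console.warn(`Invalid scaler labels (min=${min}, max=${max}), skipping sync`)
    return null
  }

  return { min, max }
}
```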

Custom label names:

You can customize the label names used by setting PLT_SCALER_POD_MIN_LABEL and PLT_SCALER_POD_MAX_LABEL environment variables. This is useful if you want to use your own label naming conventions:

```sh
PLT_SCALER_POD_MIN_LABEL=custom.example.com/min-replicas
PLT_SCALER_POD_MAX_LABEL=custom.example.com/max-replicas
```
The following environment variables control the scaling behavior:

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| `PLT_SCALER_DEBOUNCE` | Integer | 10000 | Debounce time in milliseconds to prevent rapid scaling |
| `PLT_SCALER_MAX_HISTORY` | Integer | 10 | Maximum number of history events to consider |
| `PLT_SCALER_MAX_CLUSTERS` | Integer | 5 | Maximum number of clusters for historical analysis |
| `PLT_SCALER_ELU_THRESHOLD` | Float | 0.9 | Event Loop Utilization threshold (90%) |
| `PLT_SCALER_HEAP_THRESHOLD` | Float | 0.85 | Heap usage threshold (85%) |
| `PLT_SCALER_POST_EVAL_WINDOW` | Integer | 300 | Post-evaluation window in seconds |
| `PLT_SCALER_COOLDOWN` | Integer | 300 | Cooldown period between scaling operations (seconds) |

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| `PLT_SCALER_PERIODIC_TRIGGER` | Integer | 60 | Periodic trigger interval for metrics-based scaling (seconds) |
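One plausible reading of how `PLT_SCALER_DEBOUNCE` and `PLT_SCALER_COOLDOWN` interact is sketched below: signals arriving within the debounce window are ignored, and no scaling happens during the cooldown after an operation. The `ScalingGuard` class is hypothetical and only illustrates the guard logic, not the real implementation.

```ts
// Guards scaling actions behind a debounce window (ms) and a cooldown period (s).
class ScalingGuard {
  private lastSignalAt = 0
  private lastScaleAt = 0

  constructor (
    private readonly debounceMs = 10_000, // PLT_SCALER_DEBOUNCE default
    private readonly cooldownSec = 300 // PLT_SCALER_COOLDOWN default
  ) {}

  // Returns true if a scaling operation may proceed for a signal arriving now.
  canScale (now = Date.now()): boolean {
    const withinDebounce = now - this.lastSignalAt < this.debounceMs
    const coolingDown = now - this.lastScaleAt < this.cooldownSec * 1000
    this.lastSignalAt = now
    return !withinDebounce && !coolingDown
  }

  // Call after a scaling operation has been applied.
  recordScale (now = Date.now()): void {
    this.lastScaleAt = now
  }
}
```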

The reactive algorithm uses the following key thresholds (see the sketch after this list):

  • ELU Trigger: 90% (configurable via PLT_SCALER_ELU_THRESHOLD)
  • ELU Fallback: 95% (always triggers scaling under critical load)
  • Heap Trigger: 85% (configurable via PLT_SCALER_HEAP_THRESHOLD)
  • Signal Window: 60 seconds of metrics data
  • Cooldown Period: 300 seconds between scaling operations
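A sketch of the trigger check these thresholds imply (default values come from the tables above; the `shouldScaleUp` function and `PodMetrics` shape are illustrative assumptions, and `score` is the pod-level score described in the scaling flow below):

```ts
interface PodMetrics {
  elu: number // Event Loop Utilization, 0..1
  heapUsed: number
  heapTotal: number
}

// Decide whether the reactive path should request a scale-up for this pod.
function shouldScaleUp (metrics: PodMetrics, score: number): boolean {
  const eluThreshold = Number(process.env.PLT_SCALER_ELU_THRESHOLD ?? 0.9)
  const heapThreshold = Number(process.env.PLT_SCALER_HEAP_THRESHOLD ?? 0.85)
  const heapRatio = metrics.heapUsed / metrics.heapTotal

  // Fallback: ELU above 95% always triggers scaling, regardless of the score.
  if (metrics.elu > 0.95) return true

  // Normal path: a threshold is crossed and the pod-level score supports scaling.
  return (metrics.elu > eluThreshold || heapRatio > heapThreshold) && score > 0.5
}
```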

The predictive algorithm operates with:

  • History Window: 30 days (900-1500 events)
  • Half-life Decay: 3 days (λ ≈ 2.674 × 10⁻⁶ per second; see the decay sketch after this list)
  • Execution Frequency: Hourly scheduled runs
  • Confidence Threshold: 80% for prediction acceptance
  • Prediction Window: 30 seconds for proactive scaling
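The half-life decay can be written out directly: λ = ln(2) / (3 × 86,400 s) ≈ 2.674 × 10⁻⁶ s⁻¹, and each historical event is weighted by e^(−λ · age). A minimal sketch (the `decayWeight` helper is illustrative, not the actual ICC code):

```ts
// Exponential decay weight for a historical scaling event.
// A 3-day half-life gives lambda ≈ 2.674e-6 per second, matching the documented value.
function decayWeight (eventTimestampMs: number, nowMs: number = Date.now(), halfLifeDays = 3): number {
  const halfLifeSec = halfLifeDays * 86_400
  const lambda = Math.log(2) / halfLifeSec
  const ageSec = Math.max(0, (nowMs - eventTimestampMs) / 1000)
  return Math.exp(-lambda * ageSec)
}

// An event from exactly 3 days ago gets weight 0.5; from 6 days ago, 0.25.
```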

Success scoring combines the following weighted components (see the scoring sketch after this list):

  • Responsiveness: 70% weight (ELU/Heap < thresholds post-scaling)
  • Resource Optimization: 30% weight (optimal pod utilization)
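The weighted combination, assuming both component scores are normalized to the 0..1 range (the `successScore` function name is hypothetical):

```ts
// Combine post-scaling responsiveness and resource optimization into a success score.
function successScore (responsiveness: number, resourceOptimization: number): number {
  // 70% weight: ELU/Heap back under their thresholds after scaling.
  // 30% weight: pods neither starved nor left idle.
  return 0.7 * responsiveness + 0.3 * resourceOptimization
}
```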
The reactive scaling flow proceeds as follows:

  1. Signal Detection: Pod subprocess ELU > 90%
  2. Metrics Collection: 60 seconds of ELU/Heap data
  3. Performance Analysis: Compute pod-level scores using historical clusters
  4. Trigger Evaluation: Scale if score > 0.5 or ELU > 95%
  5. Prediction Integration: Query trends learning for proactive scaling
  6. Scaling Execution: Apply cooldown and pod limits
  7. Success Tracking: Monitor post-scaling performance

The trends learning flow proceeds as follows:

  1. Historical Analysis: Process 30-day event history (hourly)
  2. Pattern Detection: Identify time-slot probabilities with 3-day decay
  3. Sequence Modeling: Model multi-step scaling patterns
  4. Validation: Score predictions for responsiveness and optimization
  5. Schedule Generation: Create timestamped predictions with confidence scores (see the sketch after this list)
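The generated schedule can be pictured as a list of records like the following (a hypothetical shape, not the actual ICC data model), where entries below the 80% confidence threshold are discarded:

```ts
// One entry of the generated scaling schedule.
interface ScalingPrediction {
  timestamp: number // when the scaling action is expected to be needed (epoch ms)
  targetPods: number // predicted replica count
  confidence: number // 0..1
}

// Keep only predictions that meet the documented 80% confidence threshold.
function acceptedPredictions (predictions: ScalingPrediction[], threshold = 0.8): ScalingPrediction[] {
  return predictions.filter((p) => p.confidence >= threshold)
}
```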