# Scaler Configuration
This document describes the configuration options for the ICC Scaler service, which implements intelligent autoscaling for Node.js watts by combining reactive and predictive algorithms.
## Overview

The ICC Scaler service provides intelligent autoscaling for Node.js watts in Kubernetes environments. It combines a Reactive Autoscaler Algorithm for real-time scaling responses and a Trends Learning Algorithm for predictive scaling based on historical patterns. The system is designed to handle 30-50 scaling events per day with high efficiency and transparency.
## Core Architecture

### Scaling Algorithms

- Reactive Autoscaler: Responds in real-time to performance signals (ELU > 90%)
- Trends Learning Algorithm: Analyzes 30-day history (900-1500 events) for predictive scaling
- Dual Integration: Combines reactive and proactive scaling decisions
### Key Features

- Event Loop Utilization (ELU) and Heap monitoring
- Historical performance clustering
- 3-day half-life decay for recent trend prioritization
- Leader election for multi-instance deployments
- Comprehensive validation and success scoring
## Pod Scaling Limits

### Default Limits

These values are used when creating a default scale configuration for a new application:
| Variable | Type | Default | Description |
|---|---|---|---|
| `PLT_SCALER_MIN_PODS_DEFAULT` | Integer | 1 | Default minimum number of pods when creating new scale configs |
| `PLT_SCALER_MAX_PODS_DEFAULT` | Integer | 10 | Default maximum number of pods when creating new scale configs |
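For example, to give every newly created scale configuration a wider range, you might set (values here are illustrative):

```
PLT_SCALER_MIN_PODS_DEFAULT=2
PLT_SCALER_MAX_PODS_DEFAULT=20
```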
### Per-Application Limits via Kubernetes Labels

You can configure min/max pod limits for individual applications using Kubernetes labels on your Deployments or StatefulSets. These labels override the default values for specific applications.
**Label Configuration:**
These values are used when syncing scale configuration from Kubernetes labels:
| Variable | Type | Default | Description |
|---|---|---|---|
| `PLT_SCALER_POD_MIN_LABEL` | String | icc.platformatic.dev/scaler-min | Kubernetes label name for minimum pods |
| `PLT_SCALER_POD_MIN_DEFAULT_VALUE` | Integer | 1 | Default minimum pods when syncing from K8s labels and label is not set |
| `PLT_SCALER_POD_MAX_LABEL` | String | icc.platformatic.dev/scaler-max | Kubernetes label name for maximum pods |
| `PLT_SCALER_POD_MAX_DEFAULT_VALUE` | Integer | 10 | Default maximum pods when syncing from K8s labels and label is not set |
Example Kubernetes Deployment with scaler labels:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-application
  labels:
    icc.platformatic.dev/scaler-min: "2"
    icc.platformatic.dev/scaler-max: "20"
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-application
  template:
    metadata:
      labels:
        app: my-application
        icc.platformatic.dev/scaler-min: "2"
        icc.platformatic.dev/scaler-max: "20"
    spec:
      containers:
        - name: app
          image: my-app:latest
```
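You can also add or change these labels with `kubectl`. Note that `kubectl label deployment` sets labels on the Deployment object itself; as in the manifest above, you may also want them on the pod template (the deployment name here is illustrative):

```sh
kubectl label deployment my-application \
  icc.platformatic.dev/scaler-min=2 \
  icc.platformatic.dev/scaler-max=20 \
  --overwrite
```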
**How it works:**

- The scaler syncs configuration from Kubernetes labels automatically during the periodic K8s sync (default: every 60 seconds)
- Labels are read from the controller (Deployment/StatefulSet) via pod labels
- You can set only `scaler-min`, only `scaler-max`, or both labels
- Values must be valid integers greater than or equal to 1
- `scaler-min` must not be greater than `scaler-max`
- If validation fails, the sync is skipped and a warning is logged (the validation rules are sketched after this list)
- To manually trigger a sync for all applications, use the `syncScalerConfigFromLabels()` function
**Custom label names:**

You can customize the label names by setting the `PLT_SCALER_POD_MIN_LABEL` and `PLT_SCALER_POD_MAX_LABEL` environment variables. This is useful if you want to use your own label naming conventions:

```
PLT_SCALER_POD_MIN_LABEL=custom.example.com/min-replicas
PLT_SCALER_POD_MAX_LABEL=custom.example.com/max-replicas
```
## Advanced Scaling Algorithm Parameters

| Variable | Type | Default | Description |
|---|---|---|---|
| `PLT_SCALER_DEBOUNCE` | Integer | 10000 | Debounce time in milliseconds to prevent rapid scaling |
| `PLT_SCALER_MAX_HISTORY` | Integer | 10 | Maximum number of history events to consider |
| `PLT_SCALER_MAX_CLUSTERS` | Integer | 5 | Maximum number of clusters for historical analysis |
| `PLT_SCALER_ELU_THRESHOLD` | Float | 0.9 | Event Loop Utilization threshold (90%) |
| `PLT_SCALER_HEAP_THRESHOLD` | Float | 0.85 | Heap usage threshold (85%) |
| `PLT_SCALER_POST_EVAL_WINDOW` | Integer | 300 | Post-evaluation window in seconds |
| `PLT_SCALER_COOLDOWN` | Integer | 300 | Cooldown period between scaling operations (seconds) |
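For example, to make the scaler react less aggressively, you could lengthen the debounce and cooldown windows (values here are illustrative):

```
PLT_SCALER_DEBOUNCE=30000   # wait 30s between rapid signals instead of 10s
PLT_SCALER_COOLDOWN=600     # 10 minutes between scaling operations
```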
## Execution Control

| Variable | Type | Default | Description |
|---|---|---|---|
| `PLT_SCALER_PERIODIC_TRIGGER` | Integer | 60 | Periodic trigger interval for metrics-based scaling (seconds) |
## Algorithm Configuration

### Reactive Autoscaler Thresholds

The reactive algorithm uses the following key thresholds (a sketch combining them follows the list):

- ELU Trigger: 90% (configurable via `PLT_SCALER_ELU_THRESHOLD`)
- ELU Fallback: 95% (ensures critical scaling)
- Heap Trigger: 85% (configurable via `PLT_SCALER_HEAP_THRESHOLD`)
- Signal Window: 60 seconds of metrics data
- Cooldown Period: 300 seconds between scaling operations
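A minimal sketch of how these thresholds combine, assuming metric values normalized to the 0-1 range (function names are hypothetical, not the service's API):

```ts
// Hypothetical sketch of the reactive trigger thresholds described above.
const ELU_THRESHOLD = 0.9    // PLT_SCALER_ELU_THRESHOLD (90%)
const ELU_FALLBACK = 0.95    // critical fallback: always scale above 95% ELU
const HEAP_THRESHOLD = 0.85  // PLT_SCALER_HEAP_THRESHOLD (85%)

// A pod becomes a scaling candidate when ELU or heap crosses its threshold
function isScalingCandidate (elu: number, heapUsage: number): boolean {
  return elu > ELU_THRESHOLD || heapUsage > HEAP_THRESHOLD
}

// The fallback guarantees critically loaded pods scale even when the
// historical score alone would not trigger a scaling event
function mustScale (elu: number): boolean {
  return elu > ELU_FALLBACK
}
```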
### Trends Learning Parameters

The predictive algorithm operates with the following parameters (a worked decay example follows the list):
- History Window: 30 days (900-1500 events)
- Half-life Decay: 3 days (λ ≈ 2.674 × 10⁻⁶)
- Execution Frequency: Hourly scheduled runs
- Confidence Threshold: 80% for prediction acceptance
- Prediction Window: 30 seconds for proactive scaling
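The decay constant follows directly from the half-life: λ = ln(2) / (3 days in seconds) = 0.6931 / 259 200 ≈ 2.674 × 10⁻⁶ s⁻¹, and an event's weight is exp(−λ · age). A small sketch (the function name is hypothetical):

```ts
// Exponential decay weight for historical events (3-day half-life).
const HALF_LIFE_SECONDS = 3 * 24 * 60 * 60     // 259200 s
const LAMBDA = Math.LN2 / HALF_LIFE_SECONDS    // ≈ 2.674e-6 per second

function decayWeight (ageSeconds: number): number {
  return Math.exp(-LAMBDA * ageSeconds)
}

decayWeight(0)          // 1.0  — an event right now has full weight
decayWeight(3 * 86400)  // 0.5  — a 3-day-old event has half weight
decayWeight(6 * 86400)  // 0.25 — weight halves again every 3 days
```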
### Validation Metrics

Success scoring combines two weighted components (see the sketch after this list):
- Responsiveness: 70% weight (ELU/Heap < thresholds post-scaling)
- Resource Optimization: 30% weight (optimal pod utilization)
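As a formula, this is simply a weighted sum. A minimal sketch, assuming both components are normalized to [0, 1] (the function name is hypothetical):

```ts
// Hypothetical combination of the two validation components.
function successScore (responsiveness: number, resourceOptimization: number): number {
  return 0.7 * responsiveness + 0.3 * resourceOptimization
}
```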
## Scaling Decision Flow

### Reactive Scaling Process

1. Signal Detection: Pod subprocess ELU > 90%
2. Metrics Collection: 60 seconds of ELU/Heap data
3. Performance Analysis: Compute pod-level scores using historical clusters
4. Trigger Evaluation: Scale if score > 0.5 or ELU > 95% (steps 4 and 6 are sketched after this list)
5. Prediction Integration: Query trends learning for proactive scaling
6. Scaling Execution: Apply cooldown and pod limits
7. Success Tracking: Monitor post-scaling performance
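In outline, the trigger evaluation and scaling execution steps might look like the following. This is a sketch with hypothetical names, and it assumes a single-pod increment per scaling event, which the source does not specify:

```ts
// Hypothetical outline of the reactive scaling decision (steps 4 and 6 above).
interface ScalingContext {
  score: number         // pod-level score from historical clusters (step 3)
  elu: number           // current Event Loop Utilization, 0-1
  currentPods: number
  minPods: number       // per-application pod limits
  maxPods: number
  lastScaledAt: number  // epoch ms of the last scaling operation
  cooldownMs: number    // PLT_SCALER_COOLDOWN * 1000
}

function decideReplicas (ctx: ScalingContext, now = Date.now()): number {
  // Honor the cooldown period between scaling operations
  if (now - ctx.lastScaledAt < ctx.cooldownMs) return ctx.currentPods

  // Scale if the score exceeds 0.5, or unconditionally at ELU > 95%
  if (ctx.score <= 0.5 && ctx.elu <= 0.95) return ctx.currentPods

  // Clamp the new replica count to the configured pod limits
  return Math.min(ctx.maxPods, Math.max(ctx.minPods, ctx.currentPods + 1))
}
```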
### Predictive Scaling Process

1. Historical Analysis: Process 30-day event history (hourly)
2. Pattern Detection: Identify time-slot probabilities with 3-day decay
3. Sequence Modeling: Model multi-step scaling patterns
4. Validation: Score predictions for responsiveness and optimization
5. Schedule Generation: Create timestamped predictions with confidence scores (a possible schedule shape is sketched below)
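A generated schedule entry could be shaped roughly as follows; the interface is hypothetical and only illustrates the timestamped, confidence-scored predictions and the 80% acceptance threshold:

```ts
// Hypothetical shape of one entry in a generated scaling schedule.
interface ScalingPrediction {
  timestamp: number   // epoch ms when the scaling action is predicted
  targetPods: number  // predicted replica count
  confidence: number  // 0-1 confidence score from the trends model
}

// Only predictions meeting the 80% confidence threshold are accepted
function acceptPredictions (schedule: ScalingPrediction[]): ScalingPrediction[] {
  return schedule.filter(p => p.confidence >= 0.8)
}
```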