xStryk™

Decision Intelligence for AI in production — guardrails, traceability & evaluation.

SUPPLY CHAIN · SCALING

Intelligent agent ecosystem for supply chain

xSingular · Operations · 8 min read

How a supply chain operation reduced stockouts by 41% with coordinated agents for demand sensing, inventory, routing, and procurement.

  • -41% Stockouts
  • 4 Coordinated agents
  • +22% Service level

Intelligent Agent Ecosystem · Supply Chain · End-to-End Orchestration

[Diagram: supplier, factory, distribution, and retail nodes (SUP A-C, PLANT 01, DC NORTH/SOUTH, MKT A-D) with real-time agent decisions at each stage · xStryk orchestration layer]

Key data summary

  • xSingular achieved a 41% reduction in stockouts with the xStryk ecosystem.
  • xSingular deployed 4 coordinated agents with the xStryk ecosystem.
  • xSingular achieved a 22% improvement in service level with the xStryk ecosystem.

Operational context and challenge

A supply chain operation spanning three countries faced a structural problem: each function — demand planning, inventory, transport, and procurement — optimized its own metrics without visibility into systemic impact. The inventory team reduced stock to lower tied-up capital; the service team demanded high availability; transport optimized routes around its own costs, ignoring customer delivery windows that had already been committed.

The result was predictable: stockouts in high-turnover products, excess inventory in low-demand SKUs, and service levels oscillating between 71% and 89% week to week, with peak tension during high-demand seasons. Weekly coordination meetings consumed between 8 and 12 hours of management time without producing systemic decisions.

The central technical challenge was not improving any individual process — it was designing a system capable of making coordinated real-time decisions across four domains with partially conflicting objectives, heterogeneous data, and different planning horizons.

Local optimization without systemic coordination creates an operational prisoner's dilemma: each area acts rationally in its domain and the global result is suboptimal. Without an orchestration mechanism with explicit priority rules, agents compete instead of cooperating.

Data sources and feature engineering

The first month of the project was dedicated to auditing and cataloging all available data sources. The operation had a central ERP (SAP S/4HANA) but critical information was scattered across departmental spreadsheets, TMS systems from two different logistics providers, and a legacy WMS with manual daily exports.

A unified data lake was built with incremental ingestion from 11 distinct sources, with quality validation in each pipeline and data drift alerts. The consolidation process revealed that 23% of inventory records in the ERP had discrepancies with the physical WMS exceeding 5%, which would have invalidated any optimization model built on that data without prior correction.

  • ERP (SAP S/4HANA): purchase orders, receipts, stock movements, and supplier data with 15-minute latency
  • Physical WMS: real-time inventory by location, with automatic reconciliation against ERP to detect discrepancies
  • TMS from two providers: shipment status, historical transit times, and available capacity by route
  • External demand signals: point-of-sale data from key customers, sector activity indices, and weather data by zone
  • 36-month history of sales at SKU-location-week level for demand model calibration
  • Supplier catalog with historical lead times, delivery variability, and quality performance
  • Seasonality data, promotional events, and customer closing calendars
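The reconciliation step that surfaced the 23% discrepancy can be sketched as a small validation pass over the two inventory snapshots. This is an illustrative sketch, not the production pipeline; the field names (`sku`, `location`, `qty`) are assumptions, not the real ERP/WMS schema:

```python
def flag_inventory_discrepancies(erp, wms, threshold=0.05):
    """Join ERP and WMS stock by (sku, location) and flag records whose
    relative gap exceeds the threshold (the 5% cut used in the audit).
    Inputs are lists of dicts with keys: sku, location, qty."""
    wms_idx = {(r["sku"], r["location"]): r["qty"] for r in wms}
    flagged = []
    for r in erp:
        key = (r["sku"], r["location"])
        if key not in wms_idx:
            continue  # unmatched records are handled by a separate check
        erp_qty, wms_qty = r["qty"], wms_idx[key]
        base = max(erp_qty, wms_qty, 1)  # avoid division by zero
        gap = abs(erp_qty - wms_qty) / base
        if gap > threshold:
            flagged.append({"sku": r["sku"], "location": r["location"],
                            "gap": round(gap, 3)})
    return flagged
```

Running the same pass with a 3% threshold on critical SKUs yields the alerting rule mentioned later in the reconciliation process.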

Multi-agent ecosystem architecture

The ecosystem design started from a central principle: each agent must be autonomous in its domain but operate under a shared coordination protocol that resolves conflicts through explicit, business-configurable priority rules. The alternative of a single monolithic model was rejected for three technical reasons: the difference in planning horizons between agents (hours for routing, days for inventory, weeks for procurement), the need to update models for each domain independently, and the requirement for granular explainability by decision type.

Orchestration was implemented as a central service that receives decision proposals from each agent, evaluates conflicts against a pre-defined dependency graph, and applies resolution rules configured by the operations team. When a conflict cannot be resolved automatically — for example, when the inventory agent proposes an emergency purchase that the procurement agent evaluates as supplier risk — the system escalates to the human planner with the full decision context and available options.

The inter-agent communication protocol uses a standardized message format that includes: the proposed decision, estimated confidence, projected impact on shared metrics, and dependencies with other agents' decisions. This allows orchestration to evaluate trade-offs systemically rather than resolving conflicts ad hoc.
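The four message fields described above can be captured in a small schema. The shape below is a hypothetical reconstruction of that format, not the actual wire protocol; all field names and example values are illustrative:

```python
from dataclasses import dataclass, field, asdict

@dataclass
class AgentProposal:
    """Illustrative shape of the standardized inter-agent message."""
    agent: str                  # proposing domain, e.g. "inventory"
    decision: dict              # the proposed action (domain-specific payload)
    confidence: float           # calibrated confidence in [0, 1]
    impact: dict                # projected effect on shared metrics
    depends_on: list = field(default_factory=list)  # other agents' decisions

# Example: the inventory agent proposing an emergency purchase order
proposal = AgentProposal(
    agent="inventory",
    decision={"action": "emergency_po", "sku": "SKU-0042", "qty": 500},
    confidence=0.81,
    impact={"service_level": +0.6, "inventory_capital": +0.02},
    depends_on=["procurement:supplier_risk_check"],
)
```

Because every proposal carries its projected impact and dependencies, the orchestrator can compare trade-offs across domains on a common basis.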

  • Demand Sensing Agent: ensemble model (LightGBM + Prophet) with time series features, external signals, and business variables; generates 4, 12, and 26-week forecasts by SKU-location
  • Inventory Agent: dynamic (s, S) policy optimization by SKU with parameters based on demand forecast and supplier lead time variability; runs daily reoptimization
  • Routing Agent: Vehicle Routing Problem with Time Windows (VRPTW) solved with tabu search; intraday replanning on disruptions
  • Procurement Agent: multi-criteria supplier scoring (price, lead time, historical quality, concentration risk) and automatic purchase order generation
  • Central Orchestrator: dependency graph, conflict resolution engine with configurable rules, and escalation system to human planner
  • Messaging bus: Apache Kafka with topics by event type and exactly-once delivery guarantee for critical decisions

Key models and algorithms

The Demand Sensing agent combines three modeling layers: a time series layer with Prophet to capture multiple seasonalities and calendar effects, a gradient boosting layer with LightGBM incorporating 47 external features (weather, sector indices, competitor activity), and a hierarchical reconciliation layer that ensures consistency between SKU-level and category-level forecasts. The ensemble combines the three models with weights that vary by SKU according to its historical behavior.

The Inventory agent implements a dynamic (s, S) policy: reorder point s and target level S are recalculated daily using the Demand Sensing agent forecast and the lead time distribution of the corresponding supplier. For SKUs with high demand variability, a probabilistic safety stock calibrated to achieve the target service level with 95% confidence is applied.
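The daily (s, S) recalculation can be sketched with the standard demand-during-lead-time formula. This is a simplified version assuming daily demand and lead time are independent and approximately normal; the cover-days rule for S is an assumption, not the documented policy:

```python
from statistics import NormalDist

def reorder_parameters(mu_d, sigma_d, mu_lt, sigma_lt,
                       service=0.95, cover_days=14):
    """Recompute (s, S) from daily demand stats (mu_d, sigma_d) and supplier
    lead time stats in days (mu_lt, sigma_lt). Safety stock is set so the
    target service level is met with the given confidence."""
    z = NormalDist().inv_cdf(service)              # z-score for the target level
    mean_dlt = mu_d * mu_lt                        # mean demand during lead time
    var_dlt = mu_lt * sigma_d**2 + mu_d**2 * sigma_lt**2
    safety = z * var_dlt**0.5                      # probabilistic safety stock
    s = mean_dlt + safety                          # reorder point
    S = s + mu_d * cover_days                      # order-up-to level (simple cover rule)
    return round(s), round(S)
```

For a SKU with 40 units/day mean demand (sd 10) and a 5-day lead time (sd 1), this yields a reorder point of roughly 275 units, of which about 75 are safety stock.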

The Routing agent solves the VRPTW with a tabu search metaheuristic that explores the solution space for 90 seconds per replanning cycle. For routes with critical time window constraints, a deterministic repair phase is applied to guarantee feasibility. Average full-fleet replanning time is 4.2 minutes.

  • Demand Sensing: ensemble Prophet + LightGBM + hierarchical reconciliation with 47 external features
  • Inventory: dynamic (s, S) policy with 95%-confidence probabilistic safety stock per SKU
  • Routing: VRPTW with tabu search (90s per cycle) and repair phase for guaranteed feasibility
  • Procurement: multi-criteria scoring with 8 dimensions and supplier portfolio optimization
  • Anomaly detection: Isolation Forest on operational metrics for early disruption alerts
  • Confidence calibration: conformal prediction for calibrated prediction intervals in demand sensing
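The conformal prediction step in the last bullet can be illustrated with the split-conformal recipe: hold out calibration residuals, take their conformal quantile, and use it as a symmetric interval half-width. A minimal sketch, not the production implementation:

```python
import math

def conformal_interval(calib_residuals, y_pred, alpha=0.10):
    """Split conformal prediction: the ceil((n+1)(1-alpha))-th smallest
    absolute calibration residual gives a symmetric interval around the
    point forecast with ~(1-alpha) marginal coverage."""
    scores = sorted(abs(r) for r in calib_residuals)
    n = len(scores)
    k = min(math.ceil((n + 1) * (1 - alpha)), n)  # rank of the conformal quantile
    q = scores[k - 1]
    return y_pred - q, y_pred + q
```

The guarantee is distribution-free but marginal: intervals are calibrated on average across SKUs, not conditionally for each one, which is why the case still reports lower accuracy on sporadic-demand SKUs.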

Validation and deployment methodology

Deployment was structured in four phases of 6 weeks each. In the first phase, the Demand Sensing agent was deployed in shadow mode: it generated forecasts in parallel with the existing manual process but did not influence any operational decision. Over 6 weeks, agent forecasts were compared against manual forecasts and against real demand, documenting errors and biases. Only when the agent's MAPE was consistently more than 15% lower than the manual process was advancement to the next phase authorized.
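The authorization gate can be stated as a simple check on the two error series. A sketch of the comparison, assuming MAPE over nonzero actuals and a relative 15% margin:

```python
def mape(actual, forecast):
    """Mean absolute percentage error over periods with nonzero demand."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    return sum(abs(a - f) / a for a, f in pairs) / len(pairs)

def authorize_next_phase(actual, agent_fc, manual_fc, margin=0.15):
    """Advance only if the agent's MAPE is at least `margin` (relative)
    lower than the manual process, per the shadow-mode protocol."""
    return mape(actual, agent_fc) <= (1 - margin) * mape(actual, manual_fc)
```

In practice the check would run per SKU segment and over the full 6-week window, so a single noisy week cannot trigger the phase advance.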

The same protocol applies to each agent successively. The Inventory agent enters shadow mode only after the Demand Sensing agent is in active operation, because its stock recommendations depend on the forecasts. This cascading dependency defines the activation order and extends the total validation period, but significantly reduces the risk of activating an agent with deficient inputs.

Before the full system go-live, 240 extreme scenario simulations were executed: main supplier outage, demand spike 3x above historical average, fleet capacity restrictions at 60%, and multiple combinations of simultaneous disruptions. In all scenarios the system correctly escalated to the human planner when confidence fell below the configured threshold.

The shadow mode protocol per agent is not just a safety measure — it is a calibration tool. Discrepancies between agent recommendations and manual decisions reveal implicit business assumptions that were not documented, and that must be incorporated as business rules or additional features before activation.

Measured results

Results were measured by comparing the 6 months post-deployment against the equivalent 6 months of the prior year, controlling for volume and product mix variations to isolate the system effect from business changes.

The most significant indicator was the 41% reduction in stockouts, from an average of 8.3% of order lines stocked out per week to 4.9%. The improvement was concentrated in the 200 highest-turnover SKUs, where the Demand Sensing agent showed higher relative accuracy than the manual process. For low-turnover SKUs with sporadic demand, the improvement was marginal (11%), which aligns with the expected limitations of time series models for that type of pattern.

  • Stockout reduction of 41%: from 8.3% to 4.9% of order lines stocked out per week
  • Service level improvement of 22%: from 79% average to 96.4% in the 6 months post-deployment
  • Inventory capital reduction of 18%: elimination of excesses in low-turnover SKUs without service impact
  • Transport cost reduction of 14%: route consolidation and reduction of partial trips
  • Manual planning hours reduction of 67%: from 10 weekly meeting hours to 3.3 hours of supervision
  • Average response time to supplier disruption: from 3 days to 4.2 hours with automatic escalation

Infrastructure and observability architecture

The system is deployed in Docker containers orchestrated with Kubernetes, with a separate namespace per agent to isolate failures and allow independent updates. Each agent exposes performance metrics (inference latency, average prediction confidence, human escalation rate) and business metrics (forecast accuracy by SKU, deviation vs. service level target) to a centralized observability stack in Prometheus and Grafana.

The model registry uses MLflow with semantic versioning: each model version includes training data, hyperparameters, evaluation metrics, and the shadow test status that authorized its activation. This allows auditing exactly which version of each model made each decision — a mandatory condition for post-disruption analysis.

  • Infrastructure: Kubernetes with namespaces per agent, horizontal auto-scaling based on inference load
  • Messaging: Apache Kafka with topics by event type and 7-day retention for incident replay
  • Observability: Prometheus + Grafana with per-agent dashboards and automatic drift alerts
  • Model registry: MLflow with full data lineage, hyperparameters, and evaluation metrics
  • ERP integration: SAP connectors via RFC with schema validation and circuit breaker on failures
  • Security: data at rest encrypted, inter-agent communication via mTLS, access auditing
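The circuit breaker on the SAP connectors can be illustrated with the standard pattern: open after consecutive failures, reject calls while open, and allow a trial call after a cooldown. A minimal sketch with illustrative thresholds, not the production connector:

```python
import time

class CircuitBreaker:
    """Open after `max_failures` consecutive errors; reject calls while open;
    allow one trial call (half-open) after `reset_after` seconds."""
    def __init__(self, max_failures=3, reset_after=60.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: skipping ERP call")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # success resets the failure count
        return result
```

With the breaker open, agents fall back to their last validated snapshot instead of blocking on a degraded ERP, which is what keeps inference latency bounded during upstream incidents.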

Lessons learned and limitations

The most important lesson from the project was that the biggest obstacle was not technical but organizational: teams in each domain had developed implicit decision criteria over years that were not documented in any system. Externalizing those criteria to convert them into configurable business rules for the orchestrator required 8 weeks of structured work with senior planners, and was the critical factor determining the quality of the system's decisions.

The second relevant limitation was the quality of physical inventory data. The initial 23% discrepancy between ERP and WMS required a reconciliation process that runs every 4 hours and generates alerts when the discrepancy exceeds 3% in critical SKUs. Without this process, inventory and routing agents operate with incorrect information and their decisions deteriorate rather than improve service levels.

  • The design of the inter-agent coordination protocol is more critical than the accuracy of any individual model
  • Shadow mode per agent in cascade reveals implicit business assumptions that must be formalized before activation
  • Physical inventory data quality is the most frequent bottleneck: measure ERP-WMS discrepancy before committing timelines
  • Business guardrails configurable by operations allow the system to evolve without model retraining
  • Explainability of each orchestrator decision is a necessary condition for adoption by the operations team
  • Automatic rollback per agent with manual process restoration must be tested before go-live

Have a similar challenge?

Let's talk 30 minutes about your use case. No strings attached.

Schedule call