Digital twins for divisional productivity in mining
How a mining operator scaled operational visibility across 12 divisions with production scenario simulators and real-time KPIs.
Digital Twin Network · 12 Divisions · Real-time Simulation
Key data summary
- In the first 6 months of production operation, the mining operator recorded a net increase of +23% in average divisional productivity, with three divisions exceeding 30% improvement. Unplanned stoppages of critical equipment fell 31% versus the prior semester.
- xSingular achieved +23% in divisional productivity through the xStryk ecosystem.
- xSingular modeled 12 divisions through the xStryk ecosystem.
- xSingular runs 1,400+ simulations per month through the xStryk ecosystem.
Context and challenge
A large-scale mining operator with 12 active divisions faced a paradox typical of data-intensive industries: thousands of IoT sensors installed on critical equipment, full access to SAP ERP, and a PI System with years of historical records — yet the information remained siloed. Productivity reports were generated monthly in aggregated form, with no per-shift granularity or prospective simulation capability.
Planning teams could not answer essential operational questions: which division showed fleet underutilization over the past 30 days, what would happen to net productivity if 3 haul trucks were reassigned between operations, or what was the expected impact of a 6-hour conveyor belt stoppage on the processing chain. Strategic decisions relied on expert intuition and sector benchmarks rather than proprietary quantitative models.
The problem was not data scarcity — it was the absence of an integrating model capable of fusing IoT signals, ERP transactions, and dispatch records into a coherent, simulable, and real-time actionable operational representation.
Data landscape & sources
The project began with an exhaustive inventory of available data sources. Fourteen active operational sources were identified with very different characteristics in terms of frequency, format, and quality. Aggregate volume exceeded 2.8 TB of historical data over the prior 18 months, with a production ingestion rate of approximately 4.2 million records per day from sensors and transactional systems.
- OSIsoft PI System: 3,200 equipment sensor tags (vibration, temperature, pressure, power consumption) at 1 Hz per critical asset
- SAP PM/PP/MM: work orders, spare parts consumption, production schedules, and inventory transactions with 15-minute latency
- FMS dispatch system: GPS position, speed, payload, and truck cycle data in real time, updated every 30 seconds
- Meteorological data: local stations reporting temperature, humidity, wind, and visibility, all relevant variables for open-pit operations
- Laboratory records: ore grade analysis, particle size distribution, and moisture content with daily frequency per sampling point
- Market information: spot copper price, exchange rates, and electricity costs for economic value models
Methodology & analysis
The methodological approach combined feature engineering on industrial time series with probabilistic modeling of production processes. Before any predictive model, a data quality audit phase identified missing value rates of up to 18% in some PI System tags, and specific imputation strategies were designed per sensor type. A preprocessing pipeline was built to normalize, synchronize, and validate data from all 14 sources before feeding the models.
- Time series feature engineering: exponential moving averages (EMA), rolling standard deviations, anomaly detection with Isolation Forest on critical sensor residuals
- Net productivity modeling with XGBoost (500 estimators, max_depth=6, learning_rate=0.05) trained on 14 months and validated on 4 months out-of-sample, MAE=2.3% vs actual productivity
- Equipment availability prediction with LightGBM on failure history and telemetry, F1-score=0.81 for failures within a 72-hour window
- Scenario simulation with Monte Carlo: 1,400+ iterations per scenario using empirical distributions of cycle times and failure rates calibrated per division
- Sensitivity analysis to identify the 8-12 most influential variables per division using SHAP values on the XGBoost model
- Anomalous operating regime detection with LSTM autoencoders trained per division to capture multivariate patterns of normal behavior
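The feature engineering and point-anomaly steps above can be sketched as follows. This is a minimal illustration, not the production pipeline: the signal is synthetic, and the tag name, windows, and contamination rate are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

# Hypothetical 1 Hz vibration signal for one sensor tag (one hour of data),
# with an injected 10-second anomaly burst
rng = np.random.default_rng(42)
signal = pd.Series(rng.normal(4.0, 0.3, 3600))
signal.iloc[1800:1810] += 3.0

features = pd.DataFrame({
    "ema_60s": signal.ewm(span=60).mean(),       # exponential moving average
    "roll_std_300s": signal.rolling(300).std(),  # rolling standard deviation
})
# Residual of the raw signal against its EMA, as input to the anomaly detector
features["residual"] = signal - features["ema_60s"]

# Isolation Forest on the residual features (drop rolling-window warm-up NaNs)
clean = features.dropna().copy()
iso = IsolationForest(contamination=0.01, random_state=0)
clean["anomaly"] = iso.fit_predict(clean[["residual", "roll_std_300s"]])
flagged = clean.index[clean["anomaly"] == -1]  # indices flagged as anomalous
```

With a burst roughly ten standard deviations above the noise floor, the residual-based detector should flag samples inside the injected window; in production the same residuals feed the per-division LSTM autoencoders for longer-horizon patterns.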
Model architecture / technical design
Each digital twin is an ensemble of specialized models: a productivity predictor (XGBoost), an anomaly detector (Isolation Forest + LSTM Autoencoder), and a probabilistic simulator (Monte Carlo) that run in a coordinated fashion over an operational dependency graph between divisions.
- Ingestion layer: Kafka Streams for real-time ingestion from PI System and FMS; hourly batch ETL for SAP via RFC API; end-to-end ingestion latency < 45 seconds
- Feature store: Apache Hive partitioned by division and shift; 340 features computed per division including utilization ratios, cumulative failure indices, and environmental context variables
- Predictive models: XGBoost for productivity (RMSE=1.8%, R2=0.94 on test), LightGBM for equipment availability (AUC=0.89), Prophet for long-term trends with weekly and annual seasonality
- Simulation engine: Python implementation with NumPy/SciPy over empirical distributions; 1,400 simulations/month run in parallel with Ray; compute time < 8 minutes per full scenario
- Service API: FastAPI with REST endpoints for ad-hoc queries and WebSocket streaming for real-time dashboards; P95 latency < 120 ms
- Infrastructure: on-premise deployment in client datacenter with Kubernetes; native integration with SAP via RFC and PI System via PI Web API without modifying existing systems
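The simulation engine's core loop can be sketched in plain NumPy. This is an illustrative stand-in for the Ray-parallelized production engine: the empirical samples are synthetic here, and the fleet size, payload, and 2 h-per-failure downtime are assumed parameters, not client figures.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical empirical observations; in production these come from FMS
# cycle records and the PI System failure history, calibrated per division
cycle_times = rng.lognormal(mean=3.4, sigma=0.25, size=5000)  # minutes/cycle
failures_per_day = rng.poisson(0.8, size=540)                 # 18 months, daily

def simulate_shift(n_iter=1400, trucks=20, shift_hours=12, payload_t=220):
    """Monte Carlo estimate of tonnes moved per shift, resampling the
    empirical distributions rather than fitting parametric ones."""
    out = np.empty(n_iter)
    for i in range(n_iter):
        cycles = rng.choice(cycle_times, size=trucks * 50)  # bootstrap resample
        downtime_h = rng.choice(failures_per_day) * 2.0     # assumed 2 h/failure
        effective_h = max(shift_hours - downtime_h, 0.0)
        cycles_per_truck = effective_h * 60.0 / cycles.mean()
        out[i] = trucks * cycles_per_truck * payload_t
    return out

tonnes = simulate_shift()
p5, p50, p95 = np.percentile(tonnes, [5, 50, 95])  # scenario confidence band
```

Reassigning trucks between divisions or lengthening a stoppage becomes a one-parameter change to `simulate_shift`, which is what makes the 1,400-iteration scenario runs cheap enough for weekly planning.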
Implementation details
The main integration challenge was the heterogeneity of source systems. PI System exposes data via PI Web API with Kerberos authentication and proprietary compression schemes that required a custom connector to guarantee time series integrity. SAP PM has no standard data model across divisions — each had configured its own order types and work centers, requiring a semantic mapping process division by division, validated by maintenance superintendents before including those variables in the models.
The real-time data pipeline processes an average of 4.2 million events per day with a 99.5% availability SLA. A circuit breaker mechanism was implemented to handle temporary PI System outages (common during nightly maintenance windows), buffering events locally and reprocessing them when the system comes back online. Clock synchronization between sensors with different firmware versions was a non-trivial problem resolved with a temporal alignment process based on UTC timestamps and per-device drift compensation.
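The circuit-breaker-with-buffer pattern described above can be sketched as follows. Class and parameter names are illustrative, not the production implementation, and the downstream `send` callable stands in for the Kafka producer.

```python
import time
from collections import deque

class PIIngestBreaker:
    """Buffer events locally while the source is down; replay on recovery."""

    def __init__(self, send, max_failures=3, retry_after_s=30.0):
        self.send = send                  # pushes one event downstream
        self.max_failures = max_failures  # consecutive failures before opening
        self.retry_after_s = retry_after_s
        self.failures = 0
        self.opened_at = None             # breaker open => stop hitting source
        self.buffer = deque()             # local buffer during the outage

    def publish(self, event):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.retry_after_s:
                self.buffer.append(event)  # still open: buffer locally
                return
            self.opened_at = None          # half-open: try again
        try:
            while self.buffer:             # replay the backlog first, in order
                self.send(self.buffer[0])
                self.buffer.popleft()
            self.send(event)
            self.failures = 0
        except ConnectionError:
            self.buffer.append(event)
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
```

Replaying the backlog before the new event preserves event ordering, which matters for the per-device drift compensation step downstream.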
Validation & testing
Before production deployment, each digital twin operated for 6 weeks in shadow mode: the model generated productivity predictions and deviation alerts in parallel with real planner decisions, without intervening in the operational process. During this period, 847 predicted vs actual deviation events were recorded, with a precision rate of 79% for underutilization alerts (threshold > 12% below benchmark) and a false positive rate of 8%. Monte Carlo simulators were backtested against 18 months of historical data, reproducing the 23 unplanned stoppage events recorded in that period with a mean production impact error of 6.4%.
The validation process included structured sessions with operations superintendents from each division, presenting the most critical use cases and adjusting alert thresholds to reduce alarm fatigue. A stress test of the simulator was also conducted by injecting catastrophic failure scenarios (simultaneous shutdown of three critical assets) to verify that Monte Carlo distributions produced conservative estimates with appropriate confidence intervals. The system entered production with a formal agreement for monthly model quality KPI monitoring.
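Scoring shadow-mode alerts against realized deviations reduces to set arithmetic over (division, shift) keys. A minimal sketch, with illustrative data rather than the 847 recorded events:

```python
def score_alerts(alerts, actual_deviations):
    """Precision/recall of alert keys vs realized deviation keys.

    Both arguments are sets of hashable keys, e.g. (division, shift_id)."""
    tp = len(alerts & actual_deviations)       # alerts that materialized
    fp = len(alerts - actual_deviations)       # false alarms
    fn = len(actual_deviations - alerts)       # missed deviations
    precision = tp / (tp + fp) if alerts else 0.0
    recall = tp / (tp + fn) if actual_deviations else 0.0
    return precision, recall

# Hypothetical example: two of three alerts matched real deviations
p, r = score_alerts({("D1", 10), ("D1", 11), ("D2", 5)},
                    {("D1", 10), ("D2", 5), ("D3", 2)})
```

In shadow mode this scoring runs per division, which is what makes per-division threshold tuning against alarm fatigue possible.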
Results & business impact
In the first 6 months of production operation, the mining operator recorded a net increase of +23% in average divisional productivity, with three divisions exceeding 30% improvement. Unplanned stoppages of critical equipment fell 31% versus the prior semester, and mean response time to operational deviations dropped from 4.2 hours to 38 minutes.
- +23% average divisional productivity in the first 6 months of operation
- -31% unplanned critical equipment stoppages thanks to LightGBM predictive alerts with 72-hour lead time
- 1,400+ scenario simulations executed per month, enabling weekly prospective planning instead of monthly
- Response time to operational deviations reduced from 4.2 hours to 38 minutes (alert model MAE: 11 minutes)
- ROI calculated at 8.4x on investment over 12 months, accounting for reduced corrective maintenance costs and increased throughput
- 100% adoption by planners across all 12 divisions within the first 90 days, with no existing system replaced
Key lessons learned
- Data quality auditing is the most underestimated phase: 40% of project time was invested in understanding, cleaning, and aligning the 14 sources before training any model
- Six weeks of shadow mode is non-negotiable in industrial environments — it allows threshold tuning without operational risk and generates objective evidence that accelerates organizational adoption
- Digital twins must model interdependencies between divisions, not each division in isolation: stoppages in upstream operations have cascading effects that per-division models miss without the dependency graph
- Isolation Forest and LSTM Autoencoder are complementary: Isolation Forest detects point statistical anomalies with low latency, while the autoencoder captures temporal pattern deviations that require long-window context
- SAP integration without modifying existing configuration required manual semantic mapping — investing in this process with domain experts prevents feature engineering errors that only surface months later in production
- Monte Carlo simulators must be calibrated with empirical distributions, not parametric ones: using normal distributions for heavy-tailed processes (equipment failures) produces optimistic confidence intervals that erode operational trust
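The last lesson is easy to demonstrate numerically. The sketch below (synthetic data, illustrative parameters) fits a normal to heavy-tailed repair times and compares its 99th percentile with the empirical one:

```python
import numpy as np

rng = np.random.default_rng(0)
# Heavy-tailed repair durations in hours (lognormal as a stand-in for
# real failure data; parameters are illustrative)
repair_hours = rng.lognormal(mean=1.0, sigma=1.0, size=2000)

# Parametric route: normal with matched mean and std
mu, sigma = repair_hours.mean(), repair_hours.std()
normal_p99 = mu + 2.326 * sigma              # analytic normal 99th percentile

# Empirical route: take the percentile of the observed sample directly
empirical_p99 = np.percentile(repair_hours, 99)
```

The normal approximation understates the tail, so scenario bands built from it look tighter than reality, exactly the optimistic confidence intervals that erode operational trust.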
Have a similar challenge?
Let's talk for 30 minutes about your use case. No strings attached.
Schedule call