xStryk™

Decision Intelligence for AI in production — guardrails, traceability & evaluation.

BANKING · IN PRODUCTION

Unified intelligent agent platform for banking

xSingular · Top 3 · 9 min read

How a Top 3 bank reduced credit decision time by 58% with risk scoring, next-best-action, and anomaly detection agents.

  • -58% · Credit decision time
  • 100% · Regulatory traceability
  • 3 · Integrated domains

Unified AI Platform · Banking · 3-Layer Architecture · In Production

[Architecture diagram] Data layer (core banking, CRM, credit bureau, transactions, market data) → Intelligence layer (credit model, fraud detection, risk engine, compliance AI) → Decision layer (approval flow, limit calc, rate engine, monitoring). In production · Top-3 bank · Real-time decisions · xStryk platform.

Key data summary

  • xSingular achieved -58% in credit decision time through the xStryk ecosystem.
  • xSingular achieved 100% regulatory traceability through the xStryk ecosystem.
  • xSingular achieved 3 integrated domains through the xStryk ecosystem.

Context and regulatory pressure

A Top 3 bank in the financial system operated its credit risk models, dynamic pricing, and customer intelligence in completely independent silos. Scoring models had been developed by three different vendors at different points in time, and their outputs were consumed manually by analysts who integrated them in spreadsheets to produce credit decisions. The full cycle — from application to decision — averaged 4.2 days for SME credit and 11 days for corporate credit.

Each domain had its own model update cycle: the risk team recalibrated scoring every 6 months, the pricing team adjusted rates quarterly, and the customer intelligence team updated its segmentations annually. This asynchrony created regulatory inconsistencies: a customer could receive a product offer whose spread was based on a risk score recalibrated 5 months after the pricing had been set, so the price no longer reflected the current risk assessment.

The catalyzing event for the project was an observation from the financial regulator during a model review: the bank could not reconstruct the reasoning chain of 43% of credit decisions from the previous quarter because individual model logs were not synchronized with the analyst's decision logs. The regulator set an 8-month deadline to demonstrate full traceability or face operational restrictions.

In regulated banking, traceability between a credit decision, the underlying model version, and a snapshot of input data at the moment of decision is not a technical improvement — it is a regulatory requirement whose non-compliance can lead to operational restrictions, fines, or limitations on portfolio growth.

Existing model audit and gap mapping

The first phase of the project was a comprehensive technical audit of existing models. Three active credit scoring models were found in production, two of which had been developed externally and delivered as binary artifacts without source code or feature documentation. Reconstructing their logic required input-output analysis with 24 months of historical decision data.

The dependency mapping revealed that the pricing engine consumed scores from risk model A via a nightly batch job, but the customer intelligence next-best-action engine used risk model B — a different model with an 11% difference in predictions for customers with scores between 550 and 650. This inconsistency meant the bank could simultaneously offer a customer a conservative product (based on risk A) and trigger an additional credit offer (based on risk B) with a different approval threshold.

  • Inventory of 7 active production models: 3 risk scoring, 2 pricing, 1 product propensity, 1 fraud detection
  • Identification of 5 undocumented dependencies between models across different domains
  • Quantification of inconsistency between parallel risk models: 11% difference in scores for the 550-650 segment
  • Mapping of 23 traceability gaps: decisions without model version, input data, or precise timestamp records
  • Classification of models by regulatory impact: 4 high-risk (credit decisions), 3 medium-risk (pricing and propensity)
  • Demographic fairness analysis on scoring outputs: disparate impact detected in 2 protected segments
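The disparate-impact screening mentioned above can be illustrated with the four-fifths rule, a common heuristic that flags a protected group when its approval rate falls below 80% of the reference group's. The group names and rates below are illustrative, not the bank's actual data:

```python
# Hypothetical sketch of a four-fifths-rule disparate impact check.
# Decision samples and group compositions are illustrative only.

def approval_rate(decisions):
    """Fraction of approved decisions in a list of booleans."""
    return sum(decisions) / len(decisions)

def disparate_impact_ratio(protected, reference):
    """Ratio of protected-group approval rate to reference-group rate.
    Values below 0.8 (the four-fifths rule) flag potential disparate impact."""
    return approval_rate(protected) / approval_rate(reference)

# Illustrative decision samples per segment
reference_group = [True] * 80 + [False] * 20   # 80% approval
protected_group = [True] * 55 + [False] * 45   # 55% approval

ratio = disparate_impact_ratio(protected_group, reference_group)
flagged = ratio < 0.8  # 0.6875 < 0.8 -> flagged
```

In practice this check would run per protected attribute at every retraining cycle, alongside the equal-opportunity and calibration metrics listed later in the compliance section.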

Unified platform architecture

The solution was designed as an agent platform with three layers: a shared ingestion and feature store layer, a specialized agent layer by domain with a standardized communication protocol, and a traceability layer that records each decision with its complete context before sending it to the source system.

The most critical architectural decision was the design of the shared feature store. Rather than having each agent build its own features from raw data — which generates inconsistencies when two agents use slightly different definitions of the same variable — a centralized feature store was implemented with versioned semantic definitions. The risk agent, pricing agent, and next-best-action agent consume exactly the same features calculated with exactly the same logic, guaranteeing consistency between decisions from different domains for the same customer at the same time.

The shared feature store resolves the root inconsistency: when two agents calculate the same feature (for example, "debt-to-income ratio") with slightly different logic, their decisions are neither comparable nor auditable. Semantic feature consistency is the foundation of regulatory traceability.
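The idea of versioned semantic definitions can be sketched as a small registry: each feature has exactly one computation per version, and every agent requests its snapshot from the same registry. This is a minimal illustration with hypothetical names, not the platform's actual API:

```python
# Minimal sketch of a versioned feature registry: one semantic
# definition per (name, version), shared by all agents.
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class FeatureDef:
    name: str
    version: str
    compute: Callable[[dict], float]  # raw customer record -> feature value

class FeatureStore:
    def __init__(self):
        self._registry = {}

    def register(self, feature: FeatureDef):
        self._registry[(feature.name, feature.version)] = feature

    def snapshot(self, record: dict, features: list) -> dict:
        """Compute the versioned feature snapshot every agent consumes."""
        return {
            f"{name}@{version}": self._registry[(name, version)].compute(record)
            for name, version in features
        }

store = FeatureStore()
store.register(FeatureDef(
    name="debt_to_income",
    version="v2",
    compute=lambda r: r["total_debt"] / r["annual_income"],
))

snap = store.snapshot(
    {"total_debt": 30_000, "annual_income": 120_000},
    [("debt_to_income", "v2")],
)
# snap == {"debt_to_income@v2": 0.25}
```

Because the risk, pricing, and next-best-action agents all call `snapshot` with the same versioned keys, the "debt-to-income ratio" example above can only ever be computed one way per version.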

  • Centralized feature store with versioned semantic definitions and consistent calculation across all agents
  • Risk Scoring Agent: calibrated XGBoost ensemble + logistic model for explainability, with SHAP values per decision
  • Dynamic Pricing Agent: margin optimization model with regulatory constraints on maximum spread and real-time cost of funds
  • Next-Best-Action Agent: multi-class propensity model with demographic fairness constraints and contactability policies
  • Anomaly Detection Agent: unsupervised model on transactional behavior patterns with real-time alerts to the risk committee
  • Traceability layer: immutable snapshot of inputs, model version, outputs, and final decision with precise timestamp before each communication to core banking
  • Human-in-the-loop engine: automatic routing of decisions to human analyst when agent confidence falls below threshold configured per segment

Models, algorithms, and technical decisions

The Risk Scoring agent implements an ensemble of two models with complementary roles. The primary model is XGBoost with 180 features, calibrated to produce accurate default probabilities (Brier score < 0.08 on the validation dataset). The secondary model is a logistic regression with the 25 most important features identified by SHAP, trained to produce the same decision with greater explainability. For each application, the primary model score is generated and the secondary model is used to produce the explanation in terms of the most influential factors, expressed in the business language used by the credit analyst.
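The two-model pattern can be sketched as follows: a primary scorer produces the probability, and a simpler logistic model ranks the most influential factors for the analyst. The coefficients, feature names, and stand-in score below are illustrative, not the bank's models:

```python
# Sketch of the primary/secondary pattern: score from the primary
# model, explanation from a transparent logistic model. All values
# here are hypothetical.
import math

def primary_score(features: dict) -> float:
    """Stand-in for the calibrated XGBoost ensemble."""
    # In production this would call the gradient-boosted model.
    return 0.12  # illustrative default probability

LOGISTIC_COEFS = {"debt_to_income": 2.1, "months_delinquent": 0.9, "tenure_years": -0.3}
INTERCEPT = -2.5

def explain(features: dict, top_k: int = 2):
    """Rank features by |coefficient * value| to surface the factors
    the analyst sees, alongside the logistic model's own probability."""
    contribs = {f: LOGISTIC_COEFS[f] * v for f, v in features.items()}
    prob = 1 / (1 + math.exp(-(INTERCEPT + sum(contribs.values()))))
    ranked = sorted(contribs.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return prob, ranked[:top_k]

features = {"debt_to_income": 0.4, "months_delinquent": 2, "tenure_years": 5}
score = primary_score(features)   # decision score
prob, top = explain(features)     # analyst-facing explanation
```

The decision is taken on `score`; `top` is what gets translated into the business language shown to the credit analyst.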

The Anomaly Detection agent uses a Variational Autoencoder (VAE) trained on 18 months of normal transactional behavior patterns. Per-customer reconstruction error is monitored in real time and compared against a reference distribution by customer segment. When the error exceeds the 99.5th percentile of the segment's reference distribution, an alert is generated that includes the list of transactions that most contributed to the anomaly score, calculated via input perturbation.
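Leaving the VAE itself aside, the alerting rule reduces to a percentile comparison against the segment's reference distribution. A minimal sketch, with an illustrative error distribution standing in for real reconstruction errors:

```python
# Sketch of the alerting rule only: compare a customer's reconstruction
# error against the 99.5th percentile of their segment's reference
# distribution. The VAE and the data are assumed to exist upstream.
import math

def percentile(values, q):
    """Nearest-rank percentile (q in [0, 100])."""
    ordered = sorted(values)
    idx = max(0, math.ceil(q / 100 * len(ordered)) - 1)
    return ordered[idx]

def should_alert(error, segment_reference, q=99.5):
    """True when the error exceeds the segment's q-th percentile."""
    return error > percentile(segment_reference, q)

# Illustrative per-segment reference distribution of reconstruction errors
reference = [i / 1000 for i in range(1000)]

should_alert(1.2, reference)   # above the 99.5th percentile -> alert
should_alert(0.5, reference)   # within normal range -> no alert
```

The production version would additionally attach the per-transaction attribution (via input perturbation) before routing the alert to the risk committee.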

A relevant technical decision was the design of the human-in-the-loop mechanism. Rather than a fixed confidence threshold, a dynamic calibration system was implemented: the human escalation threshold is adjusted weekly based on the observed error rate in automatic decisions from the prior period. If the agent had an above-target error rate in the SME segment, the confidence threshold required for automatic approval is increased, raising the proportion of cases going to human review until the error rate returns to the target range.
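The weekly calibration loop can be sketched as a bounded adjustment rule: raise the confidence threshold when the observed automatic-decision error rate exceeds target, lower it when below. The step size and bounds below are illustrative assumptions:

```python
# Sketch of the dynamic human-in-the-loop threshold. Step size and
# clamping bounds are hypothetical; the bank tunes these per segment.

def recalibrate(threshold, observed_error, target_error,
                step=0.02, lo=0.50, hi=0.99):
    """Weekly adjustment of the confidence threshold for automatic
    approval, clamped to [lo, hi]."""
    if observed_error > target_error:
        threshold += step   # send more cases to human review
    elif observed_error < target_error:
        threshold -= step   # allow more automatic approvals
    return min(hi, max(lo, threshold))

# SME segment had a 6% error rate against a 4% target:
t = recalibrate(0.80, observed_error=0.06, target_error=0.04)  # -> 0.82
```

Repeated weekly, this pushes the automatic/manual split toward the target error rate without anyone hand-tuning thresholds per segment.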

  • Risk Scoring: XGBoost ensemble (180 features) + logistic (25 features) with per-decision SHAP values and Platt calibration
  • Dynamic Pricing: margin optimization with regulatory maximum spread constraints and real-time cost of funds via Central Bank API
  • Next-Best-Action: multi-class propensity model (7 products) with fairness post-processing via probability reweighting
  • Anomaly Detection: Variational Autoencoder (VAE) with per-segment reconstruction error and input perturbation attribution
  • Human-in-the-loop: dynamic confidence threshold with weekly calibration based on observed error rate per segment
  • Fairness: disparate impact evaluation across 4 protected demographic attributes at every retraining cycle

Regulatory compliance and traceability

The traceability module was the most critical component of the project from a regulatory standpoint. Each credit decision generates an immutable record that includes: the version identifier of each participating model, the snapshot of features calculated at the time of decision (not raw data, but the calculated values the model received as input), the model score and confidence, the applied decision threshold, and the justification for whether the decision was automatic or escalated to a human.

To meet regulatory requirements, the bank needed to be able to reconstruct any decision from the past 5 years in less than 2 hours of querying. The traceability record design uses a partitioning scheme by date and segment that enables efficient queries without scanning the entire dataset. Audit tests conducted with the bank's compliance team demonstrated an average decision reconstruction time of 4.3 minutes.
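The shape of the immutable record can be sketched as a serialized payload whose hash serves as a tamper-evident identifier. Field names are illustrative, not the platform's actual schema:

```python
# Sketch of an immutable decision record: hashing the serialized
# payload yields a tamper-evident record_id. Fields are hypothetical.
import hashlib
import json
from datetime import datetime, timezone

def build_decision_record(model_versions, feature_snapshot, score,
                          confidence, threshold, decision, automatic):
    payload = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_versions": model_versions,      # version of each agent's model
        "feature_snapshot": feature_snapshot,  # calculated inputs, not raw data
        "score": score,
        "confidence": confidence,
        "threshold": threshold,
        "decision": decision,
        "automatic": automatic,                # automatic vs escalated to human
    }
    serialized = json.dumps(payload, sort_keys=True)
    payload["record_id"] = hashlib.sha256(serialized.encode()).hexdigest()
    return payload

rec = build_decision_record(
    model_versions={"risk_scoring": "v4.2", "pricing": "v1.9"},
    feature_snapshot={"debt_to_income@v2": 0.25},
    score=0.12, confidence=0.91, threshold=0.85,
    decision="approve", automatic=True,
)
```

Partitioning these records by date and segment (as the article describes) is what keeps a 5-year reconstruction query in the minutes range rather than a full-table scan.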

  • Immutable record per decision: model version, feature snapshot, score, confidence, threshold, and decision type (automatic or human)
  • Average decision reconstruction time for audit: 4.3 minutes against 2-hour regulatory target
  • CI/CD pipeline with regulatory approval gates: no model reaches production without Chief Risk Officer sign-off and automated fairness validation
  • Fairness evaluation at every retraining: disparate impact, equal opportunity, and calibration by demographic group
  • Portfolio stress testing: impact simulation of each model on the portfolio risk distribution under 12 macroeconomic scenarios
  • Annual external audit: independent validation process with full access to version history and decision records

Deployment and change management

Deployment was performed in the bank's on-premise datacenter under a blue-green deployment model with automatic rollback. The infrastructure uses Kubernetes with separate namespaces per agent domain, allowing the pricing agent to be updated without affecting risk agent availability. Each namespace has network policies preventing direct inter-agent communication — all communication passes through the central messaging bus to guarantee the traceability record.

The change management program was as important as the technical deployment. Credit analysts were the most critical stakeholder: they needed to trust the system's recommendations to process automatic decisions, but also needed to know exactly how and when to exercise their override judgment. A 16-hour internal certification program was designed covering the interpretation of SHAP values, use of the audit panel, and escalation protocols. The adoption rate at 90 days post go-live was 94% of active analysts.

Override design is a governance decision as important as model design. If override is too easy, analysts use it to avoid the system without learning to interpret it. If too restrictive, it generates resistance and distrust. The right balance requires data on actual override patterns, not intuition.

Quantified results

Results were measured in two dimensions: operational efficiency and decision quality. In efficiency, average credit decision time was reduced from 4.2 days to 1.8 days for SME credit (-58%) and from 11 days to 5.3 days for corporate credit (-52%), because high-confidence model cases are processed automatically and only complex cases reach the analyst with pre-analyzed context.

In decision quality, the 12-month default rate of the approved portfolio decreased 17% in the first year of operation, controlling for product mix and the macroeconomic environment. This result was validated by the bank's risk team comparing pre and post-deployment cohorts with equivalent observed risk characteristics.

  • SME credit decision time reduction: from 4.2 to 1.8 days (-58%)
  • Corporate credit decision time reduction: from 11 to 5.3 days (-52%)
  • Reduction in 12-month default rate in approved portfolio: -17% adjusted for mix
  • 100% regulatory traceability: zero decisions without complete records since go-live
  • Reduction in cross-domain inconsistencies: from 11% to 0.3% difference between risk scores consumed by different agents
  • Response time to fraud incident with VAE alert: from 6.8 hours (historical average) to 23 minutes

Lessons learned

The most important lesson was that regulatory traceability cannot be a module added at the end of development — it must be the design principle that determines the architecture from the start. When traceability is added to an already-built system, prior technical compromises (asynchronous logs, absence of feature snapshots, informal model versioning) generate technical debt that is extraordinarily costly to resolve.

The second lesson was about semantic feature consistency. The time invested in designing and maintaining a shared feature store with versioned definitions pays back many times over in the reduction of inter-agent inconsistency incidents. In the first quarter post-deployment, the team recorded zero cross-domain inconsistency incidents, compared to an average of 3.4 incidents per quarter with the prior system.

  • Regulatory traceability must be the central design principle, not a module added at the end
  • The shared feature store with versioned semantic definitions eliminates inter-agent inconsistency at the root
  • The design of the human override mechanism is as critical as model design — it requires real behavior data to calibrate correctly
  • Binary models without source code are a regulatory risk: the bank must be able to audit and reconstruct the logic of every production model
  • The analyst certification program is a necessary condition for adoption: the rate of poorly justified overrides correlates inversely with training hours received
  • Blue-green deployment with automatic rollback by business metric (not just technical) is the correct pattern for credit models in production

Have a similar challenge?

Let's talk 30 minutes about your use case. No strings attached.

Schedule call