ARF Risk Scoring Model
A Bayesian risk scoring model for AI system reliability and failure prediction.
This model implements the core risk assessment logic from the Agentic Reliability Framework (ARF).
Problem
AI-driven systems fail silently in production. Without a calibrated measure of failure probability, operations teams cannot decide whether to approve, deny, or escalate infrastructure changes.
Mathematical Formulation
Given an input x made up of signals (telemetry) and context, the risk score is defined as:
$$ \text{Risk}(x) = P(\text{Failure} \mid \text{Signals}, \text{Context}) $$
Internally, ARF combines:
- Conjugate Beta priors for per-category online updates.
- Hyperpriors that share statistical strength across categories.
- Hamiltonian Monte Carlo (HMC) to capture complex patterns (time-of-day, user role, environment).
The final risk score is a weighted average of these three components, with weights determined by data availability.
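To make the combination concrete, here is a minimal sketch of a conjugate Beta update followed by a weighted average of the three component estimates. It is an illustration under stated assumptions, not ARF's implementation: the function names (`beta_update`, `beta_mean`, `blend_risk`), the uniform Beta(1, 1) prior, and the example weights and component values are all hypothetical.

```python
from typing import Dict, Tuple


def beta_update(alpha: float, beta: float, failures: int, successes: int) -> Tuple[float, float]:
    """Conjugate update of a Beta(alpha, beta) prior with Bernoulli outcomes."""
    return alpha + failures, beta + successes


def beta_mean(alpha: float, beta: float) -> float:
    """Posterior mean of the failure probability."""
    return alpha / (alpha + beta)


def blend_risk(estimates: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of component estimates; weights are normalized here."""
    total = sum(weights[name] for name in estimates)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(estimates[name] * weights[name] for name in estimates) / total


# Hypothetical category: uniform Beta(1, 1) prior, then 3 failures in 20 events.
alpha, beta = beta_update(1.0, 1.0, failures=3, successes=17)

# Hypothetical component estimates and weights (assumed values, not ARF output).
estimates = {"conjugate": beta_mean(alpha, beta), "hyperprior": 0.10, "hmc": 0.25}
weights = {"conjugate": 2.0, "hyperprior": 1.0, "hmc": 1.0}

print(f"Conjugate posterior mean: {estimates['conjugate']:.3f}")
print(f"Blended risk: {blend_risk(estimates, weights):.3f}")
```

How the weights depend on data availability is not specified here, so the sketch takes them as inputs. One plausible scheme would give the conjugate component more weight as a category accumulates observations.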
Usage
You can use this model directly via the ARF API, or integrate the underlying Python library.
Example with ARF API (Python)
import requests

# Submit an incident to the ARF API for evaluation.
response = requests.post(
    "https://a-r-f-agentic-reliability-framework-api.hf.space/api/v1/incidents/evaluate",
    json={
        "service_name": "payment-gateway",
        "event_type": "latency_spike",
        "severity": "high",
        "metrics": {"latency_p99": 350, "error_rate": 0.12},
    },
    timeout=30,  # avoid hanging indefinitely if the endpoint is unreachable
)
response.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body

result = response.json()
print(f"Risk score: {result['risk_score']:.3f}")
print(f"Risk factors: {result['risk_factors']}")
print(f"Recommended action: {result['recommended_action']}")
Example using the ARF Python package
from agentic_reliability_framework.core.governance.risk_engine import RiskEngine

engine = RiskEngine()

# `some_intent` is a placeholder: define the intent to evaluate before this call.
risk, explanation, contributions = engine.calculate_risk(
    intent=some_intent,
    cost_estimate=100.0,
    policy_violations=[],
)
print(f"Risk: {risk}")
Links
ARF Space: Agentic Reliability Framework (ARF) v4 API
GitHub Repository: arf-foundation/agentic-reliability-framework
Documentation: API Docs
Input / Output

| Input | Type | Description |
|---|---|---|
| service_name | string | Name of the service being evaluated |
| event_type | string | Type of incident (e.g., latency_spike) |
| severity | string | low / medium / high / critical |
| metrics | dict | Telemetry values (latency, error rate, CPU, etc.) |

| Output | Type | Description |
|---|---|---|
| risk_score | float | Calibrated failure probability (0 to 1) |
| risk_factors | dict | Additive contributions from conjugate, hyperprior, HMC |
| recommended_action | string | approve / deny / escalate |
| decision_trace | object | Expected losses and variance |
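The output fields listed above can be checked on the client side before acting on a recommendation. The following is a minimal sketch, not part of the ARF package: `RiskEvaluation`, `parse_evaluation`, and the sample payload are hypothetical, and it assumes `risk_factors` and `decision_trace` deserialize to plain dictionaries.

```python
from dataclasses import dataclass
from typing import Any, Dict

# Actions listed for the recommended_action output field.
ALLOWED_ACTIONS = {"approve", "deny", "escalate"}


@dataclass(frozen=True)
class RiskEvaluation:
    """Hypothetical typed view of an evaluation response."""
    risk_score: float
    risk_factors: Dict[str, Any]
    recommended_action: str
    decision_trace: Dict[str, Any]


def parse_evaluation(payload: Dict[str, Any]) -> RiskEvaluation:
    """Validate a decoded JSON payload and wrap it in a RiskEvaluation."""
    score = float(payload["risk_score"])
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"risk_score out of range [0, 1]: {score}")

    action = payload["recommended_action"]
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unexpected recommended_action: {action!r}")

    return RiskEvaluation(
        risk_score=score,
        risk_factors=dict(payload.get("risk_factors", {})),
        recommended_action=action,
        decision_trace=dict(payload.get("decision_trace", {})),
    )


# Hypothetical payload for illustration only; the values are made up.
sample = {
    "risk_score": 0.42,
    "risk_factors": {"conjugate": 0.20, "hyperprior": 0.10, "hmc": 0.12},
    "recommended_action": "escalate",
    "decision_trace": {},
}
evaluation = parse_evaluation(sample)
print(evaluation.recommended_action, evaluation.risk_score)
```

Rejecting out-of-range scores and unknown actions early keeps a malformed response from being treated as an approval.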
License
Apache 2.0. See LICENSE for details.
Contributing
Contributions are welcome! Please refer to the contribution guidelines in the main repository.