notes · 21

Explainability, Feature Effects & Model Interpretability

A model can perform well and still remain hard to trust if nobody can explain what drives its decisions. Explainability methods try to connect predictions back to feature effects, local decision logic, and business intuition.

Start with global versus local interpretation, then move into feature effect curves, monotonicity, and SHAP-style contribution logic. The goal is not just to open the black box, but to understand what kind of box you are dealing with in the first place.


the core distinction

Performance answers “how good”; explainability answers “why”

Predictive view

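Metrics like AUC, KS, Brier score, or log loss tell you whether the model is useful. AUC and KS describe how well the model rank-orders cases; the Brier score and log loss describe how accurate the predicted probabilities are, including their calibration.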

But they do not tell you what features are driving the predictions or whether the decision logic aligns with domain intuition.

Performance tells you whether the model works.
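To make the predictive view concrete, here is a minimal sketch of the four metrics, assuming binary outcomes (1 = default) and predicted probabilities. The function names and the six-case sample are illustrative only, not taken from any library or dataset.

```python
import math

def auc(y, p):
    """Share of (bad, good) pairs where the bad case gets the higher score; ties count half."""
    pos = [s for s, t in zip(p, y) if t == 1]
    neg = [s for s, t in zip(p, y) if t == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def ks(y, p):
    """Largest gap between the cumulative score distributions of goods and bads."""
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    best = 0.0
    for thr in sorted(set(p)):
        cdf_pos = sum(1 for s, t in zip(p, y) if t == 1 and s <= thr) / n_pos
        cdf_neg = sum(1 for s, t in zip(p, y) if t == 0 and s <= thr) / n_neg
        best = max(best, abs(cdf_neg - cdf_pos))
    return best

def brier(y, p):
    """Mean squared difference between predicted probability and outcome."""
    return sum((s - t) ** 2 for s, t in zip(p, y)) / len(y)

def log_loss(y, p, eps=1e-15):
    """Average negative log-likelihood, with probabilities clipped away from 0 and 1."""
    total = 0.0
    for s, t in zip(p, y):
        s = min(max(s, eps), 1 - eps)
        total += -(t * math.log(s) + (1 - t) * math.log(1 - s))
    return total / len(y)

# Made-up outcomes and predicted probabilities.
y = [0, 0, 1, 0, 1, 1]
p = [0.1, 0.3, 0.35, 0.4, 0.8, 0.9]

metrics = {"auc": auc(y, p), "ks": ks(y, p), "brier": brier(y, p), "log_loss": log_loss(y, p)}
print(metrics)
```

Note that every input here is an outcome or a score. No feature appears anywhere, which is exactly why these numbers cannot say what drives the predictions.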

Interpretability view

Explainability asks which variables matter, how they matter, whether their effect is monotonic or unstable, and why one specific case received a certain prediction.

That is especially important in regulated environments where decisions need to be communicated and challenged.

Explainability tells you how the model thinks.

Credit-risk relevance: a model may be statistically strong but still unacceptable if its effects are economically implausible, non-monotonic without justification, or impossible to defend to governance stakeholders.

learning sequence

A useful order for learning explainability

Start with model type

Interpretability depends on the model family. Scorecards, logistic regression, trees, and ensembles are not explainable in the same way.

Then separate global from local logic

Global interpretation asks what usually matters. Local interpretation asks why this particular borrower got this particular prediction.

Then inspect feature effect shape

Direction, monotonicity, saturation, and interactions often matter more than simple feature ranking.

Then challenge the explanation itself

Not every explanation is stable, faithful, or causally meaningful. Some are only approximations of the original model.

interactive · global vs local

The same model can be explained at two different levels

Global importance tells you what usually matters across the portfolio. Local importance tells you what mattered for one selected case. Those two views can differ sharply.

Interactive panel (not reproduced here): global feature importance at portfolio level next to a local explanation for one selected case, with a scenario selector and readouts for top global driver, top local driver, and consistency.

Main lesson: the most important feature in the portfolio is not always the feature that drove a specific individual prediction.
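The lesson above can be sketched with a simple linear log-odds model. Everything here is an assumption for illustration: the coefficients, the feature names, and the five-row portfolio are made up, and "contribution" is taken to mean coefficient times deviation from the portfolio mean.

```python
# Hypothetical coefficients on the log-odds scale (not from a fitted model).
coef = {"utilization": 2.0, "delinquencies": 0.3, "tenure_years": -0.1}

# Illustrative portfolio; the last borrower is the only one with delinquencies.
portfolio = [
    {"utilization": 0.10, "delinquencies": 0, "tenure_years": 5},
    {"utilization": 0.90, "delinquencies": 0, "tenure_years": 6},
    {"utilization": 0.30, "delinquencies": 0, "tenure_years": 4},
    {"utilization": 0.70, "delinquencies": 0, "tenure_years": 5},
    {"utilization": 0.50, "delinquencies": 4, "tenure_years": 5},
]

features = list(coef)
mean = {f: sum(row[f] for row in portfolio) / len(portfolio) for f in features}

def contributions(row):
    """Log-odds contribution of each feature relative to the portfolio average."""
    return {f: coef[f] * (row[f] - mean[f]) for f in features}

# Global view: average absolute contribution across the portfolio.
all_contrib = [contributions(row) for row in portfolio]
global_importance = {
    f: sum(abs(c[f]) for c in all_contrib) / len(all_contrib) for f in features
}
top_global = max(global_importance, key=global_importance.get)

# Local view: contributions for one selected borrower (the last row).
local = contributions(portfolio[4])
top_local = max(local, key=lambda f: abs(local[f]))

print("top global driver:", top_global)
print("top local driver:", top_local)
```

In this toy portfolio utilization dominates on average, while the selected borrower sits at average utilization and is driven almost entirely by delinquencies.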

interactive · feature effect curves

How a feature changes predicted risk

Partial-dependence-style views show the average relationship between a feature and the model output. This is where shape matters: linearity, threshold effects, flattening, or unexpected reversals.

Interactive panel (not reproduced here): feature effect curve showing the average model response plus local slices around a selected value, with controls for feature type and selected feature value, and readouts for effect direction, nonlinearity, saturation, and interpretation.

Important nuance: a feature effect curve is associative, not causal. It describes how the model responds, not necessarily how the world truly works.
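A partial dependence curve can be computed by hand: fix the feature at each grid value for every row, keep the other features as observed, and average the predictions. The sketch below assumes a hypothetical stand-in model (`model_pd`) whose utilization effect flattens above 0.8, plus a made-up five-row dataset.

```python
import math

def model_pd(utilization, delinquencies):
    """Hypothetical stand-in for a fitted model; returns a probability of default."""
    # The cap at 0.8 is an assumed saturation effect, chosen only for illustration.
    z = -3.0 + 2.5 * min(utilization, 0.8) + 0.6 * delinquencies
    return 1.0 / (1.0 + math.exp(-z))

# Made-up (utilization, delinquencies) pairs.
data = [(0.2, 0), (0.5, 1), (0.9, 0), (0.4, 2), (0.7, 0)]

def partial_dependence(grid):
    """Average prediction when utilization is forced to each grid value for every row."""
    curve = []
    for u in grid:
        avg = sum(model_pd(u, d) for _, d in data) / len(data)
        curve.append(avg)
    return curve

grid = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
curve = partial_dependence(grid)

# Read the shape off the curve: direction and flattening.
steps = [b - a for a, b in zip(curve, curve[1:])]
is_monotone_up = all(s >= 0 for s in steps)
saturates = abs(steps[-1]) < 1e-12

for u, v in zip(grid, curve):
    print(f"utilization={u:.1f}  average PD={v:.4f}")
print("monotone increasing:", is_monotone_up, "| flat at the top:", saturates)
```

Forcing a feature to a grid value for every row can create combinations that never occur in the data, which is one reason these curves can mislead when features are strongly correlated.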

interactive · contribution logic

SHAP-style thinking: from base score to final prediction

A local explanation often starts from a base rate and then adds or subtracts feature contributions until the final prediction is reached. This section gives the intuition without pretending the additive decomposition is the same thing as causal truth.

Interactive panel (not reproduced here): contribution waterfall for a selected case, with readouts for base PD, final PD, largest upward push, and largest downward push.

Key idea: local additive explanations are useful for narrative and diagnosis, but they are still a representation of the model logic, not a proof of real-world causality.
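The base-to-final decomposition can be sketched with exact Shapley values on a tiny model. This is not the `shap` library: it enumerates all feature coalitions directly, which is only feasible for a handful of features. The model, the background rows, and the case are hypothetical, and a coalition's value is assumed to be the average prediction when its features take the case's values while the rest come from each background row.

```python
import math
from itertools import combinations

def model_pd(x):
    """Hypothetical model with one interaction term; returns a probability of default."""
    z = (-3.0 + 2.0 * x["utilization"] + 0.5 * x["delinquencies"]
         + 1.0 * x["utilization"] * x["delinquencies"] - 0.05 * x["tenure_years"])
    return 1.0 / (1.0 + math.exp(-z))

# Made-up background sample that defines the base PD.
background = [
    {"utilization": 0.2, "delinquencies": 0, "tenure_years": 8},
    {"utilization": 0.5, "delinquencies": 1, "tenure_years": 3},
    {"utilization": 0.8, "delinquencies": 0, "tenure_years": 5},
    {"utilization": 0.4, "delinquencies": 2, "tenure_years": 2},
]
case = {"utilization": 0.9, "delinquencies": 3, "tenure_years": 1}
features = list(case)

def value(coalition):
    """Average prediction with coalition features set to the case's values."""
    total = 0.0
    for row in background:
        mixed = {f: (case[f] if f in coalition else row[f]) for f in features}
        total += model_pd(mixed)
    return total / len(background)

def shapley(feature):
    """Weighted average of the feature's marginal effect over all coalitions of the others."""
    others = [f for f in features if f != feature]
    n = len(features)
    phi = 0.0
    for k in range(len(others) + 1):
        weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
        for subset in combinations(others, k):
            phi += weight * (value(set(subset) | {feature}) - value(set(subset)))
    return phi

base_pd = value(set())
final_pd = model_pd(case)
phi = {f: shapley(f) for f in features}
largest_up = max(phi, key=phi.get)

print(f"base PD {base_pd:.4f} -> final PD {final_pd:.4f}")
for f in features:
    print(f"  {f}: {phi[f]:+.4f}")
```

The contributions add up to the gap between base PD and final PD by construction. That additivity is a property of the decomposition, not evidence that any feature causes default.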

interactive · monotonicity and governance

Should feature effects be monotonic?

In many regulated settings, monotonic effects are easier to defend. But forcing monotonicity may reduce fit. This section shows the tradeoff between stability, interpretability, and local predictive gain.

Interactive panel (not reproduced here): unconstrained versus monotonic fit, with a scenario selector and a tradeoff summary covering unconstrained fit, monotonic fit, interpretability gain, and governance stance.

Credit-risk intuition: features like delinquency burden or utilization often have expected monotone relationships with risk. Large reversals may need stronger justification than simple predictive lift.

Important caveat: forcing monotonicity can hide real interactions or regime changes. Interpretability gains should not become an excuse for oversimplification.
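The fit-versus-monotonicity tradeoff can be sketched with the pool-adjacent-violators algorithm (isotonic regression). The assumptions here: the "unconstrained fit" is simply the observed default rate per utilization bin, the rates and counts are made up, and one bin contains a small reversal.

```python
def pava_increasing(values, weights):
    """Weighted least-squares fit constrained to be non-decreasing (pool adjacent violators)."""
    blocks = []  # each block holds [mean, total weight, number of bins pooled]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Merge backwards while the last two blocks are out of order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    fitted = []
    for m, _, c in blocks:
        fitted.extend([m] * c)
    return fitted

# Made-up default rates per utilization bin; the third bin reverses the trend.
observed = [0.02, 0.04, 0.03, 0.07, 0.10]
counts = [100, 100, 100, 100, 100]

monotone = pava_increasing(observed, counts)

def weighted_sse(fit):
    """Weighted squared error of a fit against the observed bin rates."""
    return sum(w * (o - f) ** 2 for o, f, w in zip(observed, fit, counts))

unconstrained_sse = weighted_sse(observed)  # bin means reproduce themselves
monotone_sse = weighted_sse(monotone)

print("monotone fit:", monotone)
print("fit cost of the constraint:", monotone_sse - unconstrained_sse)
```

The constraint pools the two offending bins into one shared rate. Whether the small loss of fit is acceptable depends on whether the reversal is noise or a real effect, which the algorithm cannot decide for you.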

reference

Explainability tools compared

Tool / view	Main question	Strength	Main caution
Coefficients / scorecard points	How does the model move risk globally?	Direct and transparent	Mostly linear logic
Feature importance	Which variables matter most overall?	Fast global summary	Can hide effect direction and interactions
Partial dependence	How does average prediction move with one feature?	Good effect-shape intuition	Can mislead under strong correlations
Local contributions / SHAP-style	Why this specific prediction?	Good case-level narrative	Still an approximation / representation
Monotonic constraints	Can the model be made more defendable?	Improves governance and stability	May reduce fit or oversimplify
Surrogate explanation	Can a simpler model mimic the complex one?	Useful for communication	Explains the surrogate, not necessarily the original exactly

deeper concepts

Concepts every validator should keep

global vs local

These are different questions

Global importance asks what matters on average. Local explanation asks what mattered for one specific case.

effect shape

Direction is not enough

Two models can rank a feature similarly while implying very different shapes: linear, saturating, threshold-like, or non-monotonic.

correlation

Correlated features complicate interpretation

When variables overlap heavily, contribution allocation can become unstable and importance rankings can be misleading.
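A minimal illustration of that instability, assuming an extreme case where one feature exactly duplicates another. The two scoring functions are hypothetical: they give identical predictions on this data but tell different stories about which feature matters.

```python
# Made-up rows where x2 is an exact copy of x1.
rows = [(0.1, 0.1), (0.4, 0.4), (0.9, 0.9)]

def score_a(x1, x2):
    """Puts all the weight on x1."""
    return 1.0 * x1 + 0.0 * x2

def score_b(x1, x2):
    """Splits the same weight evenly across x1 and x2."""
    return 0.5 * x1 + 0.5 * x2

# Both models predict the same score for every row in the data.
same_predictions = all(abs(score_a(*r) - score_b(*r)) < 1e-12 for r in rows)

# Yet the per-feature attributions for the same case differ.
x1, x2 = rows[2]
attribution_a = {"x1": 1.0 * x1, "x2": 0.0 * x2}
attribution_b = {"x1": 0.5 * x1, "x2": 0.5 * x2}

print("same predictions:", same_predictions)
print("model A attribution:", attribution_a)
print("model B attribution:", attribution_b)
```

Nothing in the data can tell these two models apart, so any ranking of x1 against x2 reflects a fitting accident rather than a real difference in importance.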

faithfulness

An explanation can be intuitive but incomplete

Some explanation tools are excellent for communication but do not perfectly represent the original model’s internal mechanics.
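One way to put a number on faithfulness is to fit a simple surrogate to the complex model's own predictions and measure how much of their variation it reproduces. The sketch below assumes a hypothetical nonlinear `black_box` with a single feature and a straight-line surrogate fitted by ordinary least squares.

```python
import math

def black_box(u):
    """Hypothetical complex model: PD as a nonlinear function of utilization."""
    return 1.0 / (1.0 + math.exp(-(-4.0 + 6.0 * u * u)))

xs = [i / 10 for i in range(11)]
ys = [black_box(u) for u in xs]  # the surrogate is trained on predictions, not outcomes

# Ordinary least squares for a straight line through the black-box predictions.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x
surrogate = [intercept + slope * x for x in xs]

# Fidelity: share of the black box's variation that the surrogate reproduces.
ss_res = sum((y - s) ** 2 for y, s in zip(ys, surrogate))
ss_tot = sum((y - mean_y) ** 2 for y in ys)
fidelity_r2 = 1.0 - ss_res / ss_tot

print(f"surrogate: PD ~ {intercept:.3f} + {slope:.3f} * utilization")
print(f"fidelity R^2: {fidelity_r2:.3f}")
```

A fidelity below 1 means the line is easy to communicate but leaves part of the model's behavior unexplained; the gap is what the simple story omits.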

governance

Explainability is partly a governance problem

The right level of explanation depends on model type, regulatory expectations, decision materiality, and stakeholder needs.

causality

Explanation is not causation

A feature can be influential in the model without being the true causal driver in the underlying economy or borrower behavior.

summary

What to leave this page with

Explainability is not one thing. It includes global understanding, local reasoning, effect shape analysis, and governance-friendly constraints.

The useful order is: first identify the model family, then separate global from local interpretation, then inspect feature effect shapes, then evaluate whether the explanation is stable, faithful, and defendable enough for the use case.

Once that structure is clear, model interpretation stops being a cosmetic add-on and becomes part of how the model is actually trusted.