What methodologies do you cover in the book?
This book covers many important interpretability methods that you can use to make your machine learning models more robust, transparent, and fair. These include methods for explaining popular white-box models, from linear regression to decision trees. For black-box models, you'll find a wide range of model-agnostic methods, including permutation feature importance, partial dependence plots, SHAP, accumulated local effects plots, LIME, sensitivity analysis, anchors, and counterfactuals.
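To give a flavor of the model-agnostic methods listed above, here is a minimal sketch of permutation feature importance using scikit-learn's built-in implementation. The dataset and model here are illustrative choices, not necessarily the ones used in the book.

```python
# Sketch: permutation feature importance with scikit-learn.
# Shuffle each feature in turn and measure the drop in model score;
# a large drop means the model relies heavily on that feature.
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

result = permutation_importance(model, X, y, n_repeats=5, random_state=0)

# Print the three most important features by mean importance
ranked = sorted(zip(X.columns, result.importances_mean),
                key=lambda pair: -pair[1])
for name, importance in ranked[:3]:
    print(f"{name}: {importance:.3f}")
```

Because the method only needs predictions and a score, the same few lines work unchanged for any fitted estimator, which is exactly what "model-agnostic" means.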
You'll delve into specific methods for deep learning models in the domains of vision, text, and time series. For LLMs, you'll visualize the attention mechanism to understand the relationships between tokens (usually words) and use attribution methods (like integrated gradients or LIME) to see which tokens influence the model's prediction. Then, you'll see how to assess and mitigate bias for fairness, and how to improve the reliability and consistency of models with techniques ranging from monotonic constraints to adversarial robustness.