The Captum 0.7.0 release adds new functionalities for language model attribution, dataset level attribution, and a few improvements and bug fixes for existing methods.
Language Model Attribution
Captum 0.7.0 adds new APIs for language model attribution, making it substantially easier to define interpretable text features with corresponding baselines and masks. These new wrappers are compatible with most attribution methods in Captum and make it substantially easier to understand how aspects of a prompt impact an LLM’s predicted response. More details can also be found in our paper:
Using Captum to Explain Generative Language Models
Example:
from captum.attr import ShapleyValueSampling, LLMAttribution, TextTemplateInput
shapley_values = ShapleyValueSampling(model)
llm_attr = LLMAttribution(shapley_values, tokenizer)
inp = TextTemplateInput(
# the text template
"{} lives in {}, {} and is a {}. {} personal interests include",
# the values of the features
["Dave", "Palm Coast", "FL", "lawyer", "His"],
# the reference baseline values of the features
baselines=["Sarah", "Seattle", "WA", "doctor", "Her"],
)
res = llm_attr.attribute(inp)
DataLoader Attribution
DataLoader Attribution is a new wrapper which provides an easy-to-use approach for obtaining attribution on a full dataset by providing a data loader rather than a single input (PR #1155, #1158).
Attribution Improvements
Captum 0.7.0 has added a few improvements to existing attribution methods including:
- Multi-task attribution for Shapley Values and Shapley Value Sampling is now supported, allowing users to get attributions for multiple target outputs simultaneously (PR #1173)
- LayerGradCam now supports returning attributions for each channel independently without summing across channels (PR #1086, thanks to @dzenanz for this contribution)
Bug Fixes
- Visualization utilities were updated to use the new keyword argument visible to ensure compatibility with Matplotlib 3.7 (PR #1118)
- The default visualization mode in visualize_timeseries_attr has been fixed to appropriately utilize overlay_individual (PR #1152, thanks to @teddykoker for this contribution)