This function uses the SHAP library to interpret a random forest model's prediction of heart disease risk. It loads a heart disease dataset, trains a random forest classifier, and computes SHAP values for the model with shap.TreeExplainer. It then queries the model for the given age and cholesterol level. SHAP values attribute a prediction to the input features: they explain the model's output rather than produce it.
Technology Stack : numpy, shap, sklearn
Code Type : Machine learning prediction function
Code Difficulty : Intermediate
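def predict_heart_disease(age, cholesterol):
    import numpy as np
    import shap
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    # Load the data. sklearn.datasets has no load_heart(), so this ASSUMES a local
    # "heart.csv" (hypothetical path) with numeric columns "age", "chol", and a
    # binary "target" column (1 = heart disease).
    data = np.genfromtxt("heart.csv", delimiter=",", names=True)
    X = np.column_stack([data["age"], data["chol"]])
    y = data["target"].astype(int)
    # Split the dataset (the held-out 20% is not used below; keep it for evaluation)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    # Train the model on the same two features that are passed in at prediction time
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)
    # Create the SHAP explainer
    explainer = shap.TreeExplainer(model)
    # Compute SHAP values for the input sample
    sample = np.array([[age, cholesterol]], dtype=float)
    shap_values = explainer.shap_values(sample)
    # Older shap versions return one array per class; newer ones return a single
    # (samples, features, classes) array. Keep the values for the positive class.
    if isinstance(shap_values, list):
        shap_values = shap_values[1]
    else:
        shap_values = np.asarray(shap_values)
        if shap_values.ndim == 3:
            shap_values = shap_values[..., 1]
    # Predict the class. The SHAP values explain this prediction; they do not produce it.
    prediction = model.predict(sample)
    # Return the predicted class and the per-feature contributions [age, cholesterol]
    return prediction[0], shap_values[0]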
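
To make the idea of a SHAP value concrete, the sketch below computes exact Shapley values by hand for a model with only two inputs, age and cholesterol. It illustrates the definition only and is not how shap.TreeExplainer works internally. The scoring function toy_risk and the background rows are hypothetical stand-ins, not real patient data or a trained model.

```python
# Exact Shapley values for a two-feature model (age, cholesterol).
# toy_risk and the background rows are hypothetical, for illustration only.

def toy_risk(age, chol):
    """Hypothetical risk score with an age-cholesterol interaction term."""
    return 0.01 * age + 0.002 * chol + 0.00005 * age * chol

# Hypothetical background sample used to "average out" unknown features.
background = [(40, 180), (50, 220), (60, 260), (70, 300)]

def value(x, known):
    """Average model output when only the features in `known` are fixed to x."""
    total = 0.0
    for b_age, b_chol in background:
        age = x[0] if "age" in known else b_age
        chol = x[1] if "chol" in known else b_chol
        total += toy_risk(age, chol)
    return total / len(background)

def shapley_two_features(x):
    """Average each feature's marginal contribution over both feature orderings."""
    v_none = value(x, set())
    v_age = value(x, {"age"})
    v_chol = value(x, {"chol"})
    v_both = value(x, {"age", "chol"})
    phi_age = 0.5 * ((v_age - v_none) + (v_both - v_chol))
    phi_chol = 0.5 * ((v_chol - v_none) + (v_both - v_age))
    return v_none, phi_age, phi_chol

x = (65, 280)
baseline, phi_age, phi_chol = shapley_two_features(x)
print(f"baseline={baseline:.4f} age={phi_age:+.4f} chol={phi_chol:+.4f}")
```

The baseline plus the two contributions adds up to the model's output for x. This additivity property is what allows SHAP values to be read as a per-feature breakdown of a single prediction.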