Review: Implementing Precision and Recall


I'm 安静魔镜, a blogger on 靠谱客. This post reviews how to implement precision and recall; I'm sharing it here as a reference.

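Precision: suppose we predict that 100 people have cancer. Of those predictions, how many are correct? \(\text{precision} = \frac{TP}{TP + FP}\)

What matters here is how accurate the positive predictions are.

Recall: suppose 100 people actually have cancer. How many of them does our prediction algorithm correctly pick out? \(\text{recall} = \frac{TP}{P} = \frac{TP}{TP + FN}\), where \(P\) is the number of actual positives.

What matters here is coverage: how many of the actual positives the predictions capture.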
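To make the two formulas concrete, here is a minimal sketch in plain Python. The counts (80 true positives, 20 false positives, 40 false negatives) are hypothetical numbers chosen for illustration, and the helper names `precision` and `recall` are not from the original post.

```python
def precision(tp, fp):
    """Fraction of positive predictions that are correct: TP / (TP + FP)."""
    denom = tp + fp
    return tp / denom if denom else 0.0  # no positive predictions: report 0.0


def recall(tp, fn):
    """Fraction of actual positives that were found: TP / (TP + FN)."""
    denom = tp + fn
    return tp / denom if denom else 0.0  # no actual positives: report 0.0


# Hypothetical counts: 100 people predicted to have cancer (80 correctly),
# and 120 people who actually have it (40 of them missed).
tp, fp, fn = 80, 20, 40

p = precision(tp, fp)
r = recall(tp, fn)
print("precision:", p)
print("recall:", r)
```

With these counts the predictions are fairly reliable (precision 0.8), but a third of the actual cases are missed (recall about 0.67), which is why the two metrics are reported separately.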

Confusion matrix, precision, and recall in scikit-learn

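# y_test holds the true labels and y_log_predict the model's predictions;
# both are built in the hand-written example later in this post.

# Confusion matrix
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test, y_log_predict)

# Precision
from sklearn.metrics import precision_score
precision_score(y_test, y_log_predict)

# Recall
from sklearn.metrics import recall_score
recall_score(y_test, y_log_predict)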
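The calls above can be tried on a small, self-contained example. The labels below are hypothetical toy data, not the digits data used in this post. For binary labels 0 and 1, `sklearn.metrics.confusion_matrix` puts true labels on the rows and predicted labels on the columns, so `ravel()` yields the counts in the order TN, FP, FN, TP.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical toy labels: 1 = positive, 0 = negative
y_true = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 1, 1, 1, 0, 0, 1, 0]

# Rows are true labels, columns are predicted labels: [[TN, FP], [FN, TP]]
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()
print(cm)

precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
print("precision:", precision)
print("recall:", recall)
```

Here TP = 3, FP = 2, and FN = 1, so precision is 3/5 and recall is 3/4.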

Implementing it by hand

import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the data
digits = datasets.load_digits()        # handwritten digit recognition
X = digits.data
y = digits.target.copy()     # copy so that digits.target itself is not modified

y[digits.target == 9] = 1      # digit 9 becomes the positive class (1)
y[digits.target != 9] = 0      # every other digit becomes the negative class (0)

# Split the dataset into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=666)

# Logistic regression
log_reg = LogisticRegression()
log_reg.fit(X_train, y_train)
print(log_reg.score(X_test, y_test))      # this is the accuracy

# Note: this data is heavily skewed (only about one sample in ten is a 9),
# so accuracy alone is not enough and other metrics need to be examined.
# Predictions of the logistic regression model
y_log_predict = log_reg.predict(X_test)

# Entries of the confusion matrix
# TN: predicted 0, and the true label is 0 (correct)
def TN(y_true, y_predict):
    assert len(y_true) == len(y_predict)
    return np.sum((y_true == 0) & (y_predict == 0))

print(TN(y_test, y_log_predict))

# FP: predicted 1 (digit 9), but the true label is 0 (wrong)
def FP(y_true, y_predict):
    assert len(y_true) == len(y_predict)
    return np.sum((y_true == 0) & (y_predict == 1))

print(FP(y_test, y_log_predict))

# FN: predicted 0, but the true label is 1 (wrong)
def FN(y_true, y_predict):
    assert len(y_true) == len(y_predict)
    return np.sum((y_true == 1) & (y_predict == 0))

print(FN(y_test, y_log_predict))

# TP: predicted 1 (digit 9), and the true label is 1 (correct)
def TP(y_true, y_predict):
    assert len(y_true) == len(y_predict)
    return np.sum((y_true == 1) & (y_predict == 1))

print(TP(y_test, y_log_predict))

# Layout used here: [[TP, FN], [FP, TN]].
# Note that sklearn.metrics.confusion_matrix returns [[TN, FP], [FN, TP]] instead.
def confusion_matrix(y_true, y_predict):
    return np.array([
        [TP(y_true, y_predict), FN(y_true, y_predict)],
        [FP(y_true, y_predict), TN(y_true, y_predict)]
    ])

print(confusion_matrix(y_test, y_log_predict))

# Precision
def precision_score(y_true, y_predict):
    tp = TP(y_true, y_predict)
    fp = FP(y_true, y_predict)
    # NumPy integers do not raise on division by zero (the result is nan plus
    # a warning), so check the denominator explicitly instead of using try/except.
    if tp + fp == 0:
        return 0.0
    return tp / (tp + fp)

print("Precision:", precision_score(y_test, y_log_predict))

# Recall
def recall_score(y_true, y_predict):
    tp = TP(y_true, y_predict)
    fn = FN(y_true, y_predict)
    if tp + fn == 0:
        return 0.0
    return tp / (tp + fn)

print("Recall:", recall_score(y_test, y_log_predict))

Output:

0.9755555555555555
403
2
9
36
[[ 36   9]
 [  2 403]]
Precision: 0.9473684210526315
Recall: 0.8
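As a quick cross-check, the reported metrics can be recomputed from the four counts printed above. The sketch below uses plain Python and copies the counts from the output; the "always predict 0" baseline is an added illustration of why accuracy is a weak metric on skewed data, not something computed in the original post.

```python
# Counts copied from the output above
tp, fn, fp, tn = 36, 9, 2, 403
total = tp + fn + fp + tn  # size of the test set

accuracy = (tp + tn) / total   # fraction of all predictions that are correct
precision = tp / (tp + fp)     # TP / (TP + FP)
recall = tp / (tp + fn)        # TP / (TP + FN)

# Illustrative baseline: a classifier that always predicts 0 is correct
# on every actual negative (TN + FP) and wrong on every actual positive.
baseline_accuracy = (tn + fp) / total

print("accuracy:", accuracy)
print("precision:", precision)
print("recall:", recall)
print("always-0 baseline accuracy:", baseline_accuracy)
```

The baseline reaches 0.9 accuracy without ever finding a 9, so the model's accuracy of about 0.976 looks less impressive than it sounds. Recall of 0.8 shows that one in five of the actual 9s is still missed.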

Published: 2023-10-06 09:50:31
Link: https://www.kaopuke.com/article/k-p-k_14_uzogf5_13__7_g3.html
