Running a Deep Neural Network on BSV



Overview

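We have implemented a deep neural network that classifies handwritten digits. The trained model runs entirely on-chain. It was trained offline on the MNIST dataset of handwritten digits. The model takes a 28x28 grayscale image as input and outputs a digit from 0 to 9.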


Introduction to Deep Neural Networks

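Artificial neural networks are inspired by biological neural networks. A network learns by being exposed to a large number of labeled data examples. This process is also called supervised learning.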

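The network is made up of several components: neurons (nodes), connections, biases, and activation functions. These components are grouped into successive layers. The first layer is called the "input layer", through which data enters the network, and the last layer is called the "output layer", through which the network returns its output. A very simple neural network contains only these two layers. To improve performance, we can add one or more "hidden layers" between them. A network with hidden layers is called a "deep neural network" (DNN).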


Illustration of a deep neural network

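Each connection between neurons in the network is weighted with a specific value. Each neuron also has a value called a "bias", which is added to the weighted sum of its inputs. Learning is the process of finding a set of weights and biases such that the network returns a meaningful output for a given input.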
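To make the weighted sum concrete, here is a minimal Python sketch of what a single neuron computes before any activation function is applied. The function name `neuron_output` and the sample numbers are hypothetical and chosen only for illustration.

```python
def neuron_output(inputs, weights, bias):
    """Return the weighted sum of the inputs plus the bias (no activation)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(w * x for w, x in zip(weights, inputs)) + bias


# Hypothetical values, for illustration only:
# 0.5*1.0 + (-1.0)*2.0 + 0.25*3.0 + 0.5 = -0.25
example = neuron_output([1.0, 2.0, 3.0], [0.5, -1.0, 0.25], bias=0.5)
print(example)
```

Training adjusts `weights` and `bias` for every neuron so that the outputs of the final layer match the labels of the training examples as closely as possible.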

For an intuitive understanding of how deep neural networks work, we recommend watching a short video series on the topic.

Network Architecture

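The DNN for MNIST handwritten digits consists of an input layer of 784 (28 x 28) nodes, a hidden layer of 64 nodes, and an output layer of 10 nodes (the number of possible classes/digits). The layers are fully connected, which gives the network a total of 50,816 connections (784 * 64 + 64 * 10).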
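In a fully connected network, every node in one layer connects to every node in the next, so the connection count is the sum of the products of adjacent layer sizes. The following sketch works through that arithmetic for the 784-64-10 layout; the helper names are hypothetical.

```python
def count_connections(layer_sizes):
    """Number of weights in a fully connected network with these layer sizes."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


def count_parameters(layer_sizes):
    """Weights plus one bias per non-input node."""
    return count_connections(layer_sizes) + sum(layer_sizes[1:])


layer_sizes = [784, 64, 10]
print(count_connections(layer_sizes))  # 784*64 + 64*10 = 50816
print(count_parameters(layer_sizes))   # 50816 + 64 + 10 = 50890
```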


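The nodes in the hidden layer use the ReLU activation function. Argmax is applied to the output nodes to obtain the result, i.e. the classified digit.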
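Both operations are simple enough to write out by hand. The sketch below shows plain-Python versions: ReLU replaces negative values with zero, and argmax returns the index of the largest value (the first one in case of a tie). The function names are illustrative.

```python
def relu(values):
    """Replace every negative value with 0 and keep the rest unchanged."""
    return [v if v > 0 else 0 for v in values]


def argmax(values):
    """Return the index of the largest value (first index on ties)."""
    best = 0
    for i, v in enumerate(values):
        if v > values[best]:
            best = i
    return best


print(relu([-2, 0, 3]))         # [0, 0, 3]
print(argmax([0.1, 0.7, 0.2]))  # 1
```

For the digit classifier, the index returned by argmax over the 10 output nodes is the predicted digit.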

Training the Model

The DNN is trained with Keras. With the network architecture outlined above and the RMSprop optimizer, the model reaches a classification accuracy of 98% after 50 epochs.


import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
# Model / data parameters
num_pixels = 28 * 28
num_nodes_hl = 64
num_classes = 10
batch_size = 469
epochs = 50
# Load the data and split it between train and test sets.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
# Flatten each image to a 784-element vector and scale pixels to the [0, 1] range.
x_train = x_train.reshape(x_train.shape[0], num_pixels).astype("float32") / 255
x_test = x_test.reshape(x_test.shape[0], num_pixels).astype("float32") / 255
# Model
model = keras.Sequential([
    layers.Dense(num_nodes_hl, activation="relu"),
    layers.Dense(num_classes, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="rmsprop", metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)

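After training, the weights and biases must be exported in a format that can be used in an sCrypt smart contract. For performance reasons, we encode these values as bytes rather than as arrays.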
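The export step is not shown in the post, so the following is only a sketch of one way floating-point parameters could be turned into a byte string. It assumes a fixed-point scale of 10^8 (inferred from the `/ 100000000` divisor in the contract code) and a hypothetical fixed width of 8 bytes per value in little-endian two's complement. The actual byte layout used in the repository may differ.

```python
SCALE = 10**8          # assumed fixed-point scale, inferred from the contract's divisor
BYTES_PER_VALUE = 8    # hypothetical width, not taken from the original project


def to_fixed_point(value, scale=SCALE):
    """Convert a float to a scaled integer."""
    return round(value * scale)


def encode_values(values):
    """Concatenate the fixed-point values into one byte string."""
    return b"".join(
        to_fixed_point(v).to_bytes(BYTES_PER_VALUE, "little", signed=True)
        for v in values
    )


def decode_value(blob, index):
    """Read the fixed-point integer stored at position `index`."""
    start = index * BYTES_PER_VALUE
    return int.from_bytes(blob[start:start + BYTES_PER_VALUE], "little", signed=True)


blob = encode_values([0.5, -0.25])
print(len(blob))              # 16
print(decode_value(blob, 0))  # 50000000
print(decode_value(blob, 1))  # -25000000
```

With a trained Keras model, the floats would come from `model.get_weights()`, flattened into a list before encoding.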

Implementation

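We implemented the DNN above in much the same way as the single-layer neural network (a.k.a. perceptron) we implemented previously. The full code is available on GitHub.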

import "../train/modelParams.scrypt";

library Model {

    // Weighted sums for the hidden layer. Each product is divided by 10^8,
    // which rescales the fixed-point values back down.
    static function applyWeights0(int[ModelParams.N_INPUTS] in) : int[ModelParams.N_NODES_HL] {
        int[ModelParams.N_NODES_HL] res = repeat(0, ModelParams.N_NODES_HL);
        loop (ModelParams.N_NODES_HL) : i {
            int sum = 0;
            loop (ModelParams.N_INPUTS) : j {
                sum += (ModelParams.getWeight0(i, j) * in[j]) / 100000000;
            }
            res[i] = sum;
        }
        return res;
    }

    // Weighted sums for the output layer.
    static function applyWeights1(int[ModelParams.N_NODES_HL] in) : int[ModelParams.N_NODES_OUT] {
        int[ModelParams.N_NODES_OUT] res = repeat(0, ModelParams.N_NODES_OUT);
        loop (ModelParams.N_NODES_OUT) : i {
            int sum = 0;
            loop (ModelParams.N_NODES_HL) : j {
                sum += (ModelParams.getWeight1(i, j) * in[j]) / 100000000;
            }
            res[i] = sum;
        }
        return res;
    }

    // Add the hidden-layer biases.
    static function addBiases0(int[ModelParams.N_NODES_HL] in) : int[ModelParams.N_NODES_HL] {
        int[ModelParams.N_NODES_HL] res = repeat(0, ModelParams.N_NODES_HL);
        loop (ModelParams.N_NODES_HL) : i {
            res[i] = in[i] + ModelParams.getBias0(i);
        }
        return res;
    }

    // Add the output-layer biases.
    static function addBiases1(int[ModelParams.N_NODES_OUT] in) : int[ModelParams.N_NODES_OUT] {
        int[ModelParams.N_NODES_OUT] res = repeat(0, ModelParams.N_NODES_OUT);
        loop (ModelParams.N_NODES_OUT) : i {
            res[i] = in[i] + ModelParams.getBias1(i);
        }
        return res;
    }

    // ReLU: keep positive values, leave everything else at 0.
    static function applyReLU(int[ModelParams.N_NODES_HL] in) : int[ModelParams.N_NODES_HL] {
        int[ModelParams.N_NODES_HL] res = repeat(0, ModelParams.N_NODES_HL);
        loop (ModelParams.N_NODES_HL) : i {
            int inVal = in[i];
            if (inVal > 0) {
                res[i] = inVal;
            }
        }
        return res;
    }

    // Full forward pass followed by argmax over the output nodes.
    static function predict(int[ModelParams.N_INPUTS] inputs) : int {
        int[ModelParams.N_NODES_HL] step0 = applyWeights0(inputs);
        int[ModelParams.N_NODES_HL] step1 = addBiases0(step0);
        int[ModelParams.N_NODES_HL] step2 = applyReLU(step1);
        int[ModelParams.N_NODES_OUT] step3 = applyWeights1(step2);
        int[ModelParams.N_NODES_OUT] step4 = addBiases1(step3);

        // Argmax: index of the largest output value.
        int idxMaxVal = 0;
        int maxVal = step4[0];
        loop (ModelParams.N_NODES_OUT) : i {
            int outVal = step4[i];
            if (outVal > maxVal) {
                idxMaxVal = i;
                maxVal = outVal;
            }
        }
        return idxMaxVal;
    }
}
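To sanity-check the contract logic off-chain, the same integer pipeline can be mirrored in Python. This is a sketch under stated assumptions, not the project's own tooling: it assumes the 10^8 divisor is a fixed-point scale, that integer division in the contract truncates toward zero (Python's `//` floors, hence the `trunc_div` helper), and it takes weights and biases as plain nested lists instead of the `ModelParams` accessors. All function names are hypothetical.

```python
SCALE = 10**8  # assumed fixed-point scale, matching the contract's divisor


def trunc_div(a, b):
    """Integer division that truncates toward zero (assumed contract behavior)."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def dense(inputs, weights, biases):
    """One fully connected layer: rescaled weighted sums plus biases."""
    out = []
    for row, bias in zip(weights, biases):
        total = 0
        for w, x in zip(row, inputs):
            total += trunc_div(w * x, SCALE)
        out.append(total + bias)
    return out


def relu(values):
    """Keep positive values, replace everything else with 0."""
    return [v if v > 0 else 0 for v in values]


def predict(inputs, w0, b0, w1, b1):
    """Hidden layer, ReLU, output layer, then argmax (first index on ties)."""
    hidden = relu(dense(inputs, w0, b0))
    scores = dense(hidden, w1, b1)
    best = 0
    for i, s in enumerate(scores):
        if s > scores[best]:
            best = i
    return best


# Tiny hypothetical 2-2-2 network, for illustration only.
w0 = [[SCALE, 0], [0, -SCALE]]
b0 = [0, 0]
w1 = [[-SCALE, 0], [SCALE, 0]]
b1 = [0, 0]
print(predict([2, 3], w0, b0, w1, b1))  # 1
```

Feeding the same inputs and exported parameters to both this reference and the contract is one way to spot rounding or indexing discrepancies before deploying.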