Overview
There are two main types of models available in Keras: the Sequential model and the tf.keras.Model class. In this post, we will focus on the latter. Model is a class that groups layers into an object with training and inference features. There are two ways to instantiate a Model:
1. With the basic “functional API”
We start from the input and chain layer calls to specify the model’s forward pass; the resulting model will include all layers required to compute the outputs from the given inputs.
from keras.models import Model
from keras.layers import Input, Dense
inputs = Input(shape=(32,))
outputs = Dense(32)(inputs)
model = Model(inputs=inputs, outputs=outputs)
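As a quick illustration of the training and inference features mentioned above, here is a minimal sketch of compiling and fitting this functional model; the optimizer, loss, and random regression data are illustrative assumptions, not part of the original example:
import numpy as np

model.compile(optimizer='adam', loss='mse')   # assumed optimizer and loss
x = np.random.random((64, 32))                # 64 random samples with 32 features
y = np.random.random((64, 32))                # random targets matching the Dense(32) output
model.fit(x, y, epochs=1, batch_size=16)      # training
pred = model.predict(x[:5])                   # inference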
2. By subclassing the Model class
We can also build fully customizable models by subclassing the Model class. In this case, we define the layers in __init__ and implement the model’s forward pass in call.
from keras.models import Model
from keras.layers import Dense

class MyModel(Model):
    def __init__(self):
        super(MyModel, self).__init__()
        # Layers are created once here and reused on every forward pass.
        self.dense_1 = Dense(32, activation='relu')
        self.dense_2 = Dense(1, activation='sigmoid')

    def call(self, inputs):
        # Forward pass: chain the layers defined in __init__.
        h1 = self.dense_1(inputs)
        return self.dense_2(h1)
A class describes a collection of objects that share the same attributes and methods: it defines the attributes and methods common to every object in the collection. An object is an instance of a class.
Key properties:
1. Encapsulation: the constructor packs content into an object, and the encapsulated content is then accessed either directly through the object or indirectly through self.
2. Inheritance: methods shared by several classes are factored out into a superclass, so a subclass only needs to inherit from the superclass instead of implementing every method itself. For example:
(1) In class MyModel(tf.keras.Model):, the subclass MyModel inherits the attributes and methods of its superclass tf.keras.Model.
(2) super(MyModel, self).__init__() calls the __init__() of MyModel’s superclass, i.e. the class tf.keras.Model, so that MyModel inherits its parent class’s attributes.
3. Polymorphism: a subclass can override methods defined by its superclass, as the sketch below illustrates.
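As a toy illustration of these three properties (the class names Layer and DenseLayer here are hypothetical and unrelated to Keras):
class Layer:                                  # superclass
    def __init__(self, units):
        self.units = units                    # encapsulation: state is stored on the object

    def describe(self):
        return "Layer with %d units" % self.units

class DenseLayer(Layer):                      # inheritance: reuses Layer's __init__
    def describe(self):                       # polymorphism: overrides the superclass method
        return "Dense layer with %d units" % self.units

for layer in [Layer(32), DenseLayer(64)]:
    print(layer.describe())                   # each object answers with its own version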
Example: build an LSTM model
Our model starts with an embedding layer, followed by an LSTM layer and an output layer. Let’s assume the inputs to the Embedding layer are batches of integer-encoded sequences of length 100.
import tensorflow as tf

class Model(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, hidden_size):
        super(Model, self).__init__()
        self.hidden_size = hidden_size
        # Maps integer token ids to dense vectors of size embedding_dim.
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        # GPU-only cuDNN LSTM (TF 1.x); returns the hidden state at every time step.
        self.lstm = tf.keras.layers.CuDNNLSTM(self.hidden_size,
                                              return_sequences=True,
                                              recurrent_initializer='glorot_uniform',
                                              stateful=True)
        # Projects each hidden state back onto the vocabulary.
        self.output_layer = tf.keras.layers.Dense(vocab_size)

    def call(self, x):
        embedding = self.embedding(x)
        h = self.lstm(embedding)
        prediction = self.output_layer(h)
        return prediction
First, in the __init__ part, we need to specify the hyperparameters that define the structure of each layer:
- For the embedding layer, we need to set the size of the vocabulary (vocab_size) and the dimension of the embedding vector (embedding_dim).
- For the LSTM layer, we need to set the size of the hidden unit (hidden_size).
Notice that the sequence length and batch size will be specified in the following training part.
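To make the shapes concrete, here is a minimal usage sketch; it is not part of the lecture example, the vocabulary size, embedding dimension, hidden size, and batch size are illustrative assumptions, and CuDNNLSTM requires a TF 1.x environment with a GPU:
import numpy as np

model = Model(vocab_size=10000, embedding_dim=256, hidden_size=1024)  # assumed hyperparameters
batch = np.random.randint(0, 10000, size=(64, 100))                   # 64 integer-encoded sequences of length 100
logits = model(batch)                                                 # forward pass runs call()
print(logits.shape)                                                   # (64, 100, 10000): a score for every vocabulary word at each step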