Music has long been considered one of the most influential and powerful forms of art. As such, it has been used to express the raw emotion of the artist and transfer it to the listener. Being a fan of music myself, it was only natural to wonder how difficult it would be to generate lyrics using recurrent neural networks (RNNs). I really enjoy rap and hip hop music, so I chose to work off of artists in those genres. It was also a good fit since there is existing research on rap lyric generation.

Recurrent neural networks can be used for many language modeling tasks, such as chat bots, predictive keyboards, and language translation. Recurrent neural networks work well for text generation because of their ability to work with sequential data. This is beneficial as we need to preserve the context of a sentence or, in this case, a verse.

A simple way to explain how an RNN works is that it looks at the previous elements of a sequence to predict the next element. Let's say we have an RNN trained to perform text prediction on your phone's keyboard (you know, the word predictions that pop up as you type). Based on previous messages I've typed, I could input something like "Wezley is super …" and the neural network would take that sequence and give a set of predicted words to go off of, such as "cool", "smart", and "funny".

Overview of Architectures

To add to this experiment, I wanted to train different recurrent neural network architectures to perform the rap lyric generation. I chose to go with SimpleRNN, Gated Recurrent Unit, Long Short Term Memory, and Convolutional Neural Network + Long Short Term Memory based architectures. I chose these so we could test each architecture against one another and determine which performs best on the task. We don't know if one model will outperform the others unless we try, right?

The SimpleRNN architecture was included mostly as a baseline to see how the other architectures perform. A SimpleRNN is not very good for this specific task because of the vanishing gradient problem. This means the SimpleRNN won't be very useful for remembering context throughout a bar or verse, because it loses information from early in the sequence the further along we go. This leads to the incoherent verses that you'll see later on in the article. If you are curious and want a TL;DR of how the model performed: we get verses such as "I am, what stone private bedroom now" or "And how the low changed up last gas guitar thing." Both of these verses were generated from a dataset of Drake lyrics. Neither of them makes much sense. However, I'd argue that they're still fire bars.
The Gated Recurrent Unit architecture was the next architecture I tested. The gated recurrent unit differs from the SimpleRNN by being able to remember a little further back in the sequence. It accomplishes this by utilizing two gates, a reset gate and an update gate. These gates control whether information from earlier in the sequence carries on through the network or gets overwritten by the most recent step. I'll go a little more in-depth on this further into the article.

The Long Short Term Memory architecture was another architecture tested for this project. The LSTM differs from the SimpleRNN by, again, being able to remember further down the sequence. The LSTM also has an advantage over the GRU: being a little more complex, it is able to remember even longer sequences. The LSTM has three gates, instead of two, that control the information it forgets, carries on in the sequence, and updates from the latest step. Again, the LSTM will be covered a little more in-depth later on in the article.

The final architecture I tested was a mixture of a convolutional neural network and a long short term memory RNN. I threw this one in as a thought experiment based off of a paper I read that used a C-LSTM architecture for text classification (reference in the Colab notebook). I wondered if the CNN would allow the LSTM to generalize a bar and better understand the stylistic elements of an artist. While it was fun to see a CNN in a text generation problem, I didn't notice much of a difference between this and the LSTM model.

Obtaining the Dataset

With a defined set of architectures created, I set out to find the dataset I wanted to use for this problem. The dataset didn't really matter to me, so long as it contained lyrics from prominent artists. I wanted to generate lyrics based off of artists I listen to often, so I could recognize whether the model was able to generate similar lyrics. Don't worry though! I didn't determine a model's performance solely off of what I thought sounded good. I also used a set of metrics that have been described in recent literature on the subject.

The dataset I found was on Kaggle and was provided by Paul Mooney. This dataset was great because it contained lyrics from many of the rap/hip hop artists that I listen to. It also didn't have any weird characters, and it took care of some of the censoring of explicit lyrics.

Preparing the Data

With the dataset in hand, I set out to load and prepare the data for training. The first thing I did was load in the data and finish censoring it. I used a preexisting Python library to perform the censorship so that I didn't have to create a "naughty words" list manually. Unfortunately the library didn't censor every word, so I apologize if you stumble across something explicit in the published notebook for this article.
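Here is a minimal sketch of what this loading and censoring step might look like, assuming the lyrics are stored as one plain-text file per artist (the file path is hypothetical) and using the better_profanity package as the censoring library; the original notebook may load and censor the Kaggle files differently.

```python
# A minimal sketch of loading and censoring the lyrics. The file layout and
# path are assumptions; the published notebook may differ.
from better_profanity import profanity

def load_lyrics(path):
    """Read an artist's lyrics from a plain-text file."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

raw_lyrics = load_lyrics("data/drake-lyrics.txt")  # hypothetical path

# Censor any remaining explicit words (better_profanity swaps them for asterisks).
censored_lyrics = profanity.censor(raw_lyrics)
```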
With the lyrics read in and censored, I went ahead and split them into an array of bars. I didn't do any other processing to the bars, but in the future I may try this again and add <start> and <end> tags to each bar. That way the model could possibly learn when to end a sequence. For now, I had it generate bars of randomized lengths, and the results were good enough for the initial experiment.

Once I finished splitting the data, I created a Markov model utilizing the markovify Python library. The Markov model is used to generate the beginning sequence for each bar. This helps ensure that the beginning of the sequence is somewhat coherent before passing it to the trained models. The models then take the sequence and finish generating the lyrics for the bar.

The next step was to tokenize the lyrics so that they would be in a format the models could understand. Tokenization is actually a pretty cool process: it splits the words up into a dictionary that maps each word to an ID, and changes each bar into an array of the corresponding word IDs. There is an example of this in the published notebook, but here's another example of it in action.

Let's say we were to tokenize the following sentences:

"Wezley is cool"
"You are cool"
"TensorFlow is very cool"

The following sequences would be produced:

[1, 2, 3]
[4, 5, 3]
[6, 2, 7, 3]

where the word dictionary is:

['Wezley': 1, 'is': 2, 'cool': 3, 'You': 4, 'are': 5, 'TensorFlow': 6, 'very': 7]

As-is, these sequences can't be fed into a model since they are of different lengths. To fix this, we add padding to the front of the arrays. With padding we get:

[0, 1, 2, 3]
[0, 4, 5, 3]
[6, 2, 7, 3]

With the bars tokenized, I was finally able to create my X and y data for training. The train_X data consists of an entire bar minus the last word, and the train_y data is the last word in the bar.
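Here is a rough sketch of these preparation steps using markovify and the Keras preprocessing utilities. The variable names are my own, and the exact preprocessing in the published notebook may differ (for example, the targets could be left as integer IDs with a sparse loss instead of being one-hot encoded as shown here).

```python
import numpy as np
import markovify
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical

# Split the censored lyrics into individual bars (one line per bar).
bars = [line.strip() for line in censored_lyrics.split("\n") if line.strip()]

# Markov model used later to produce the seed phrase for each bar.
markov_model = markovify.NewlineText("\n".join(bars))

# Tokenize: build the word -> ID dictionary and convert bars to ID arrays.
tokenizer = Tokenizer()
tokenizer.fit_on_texts(bars)
vocab_size = len(tokenizer.word_index) + 1
sequences = [s for s in tokenizer.texts_to_sequences(bars) if len(s) > 1]

# Pad the front of each sequence so they all share the same length, then
# split into inputs (everything but the last word) and targets (the last word).
max_len = max(len(s) for s in sequences)
seq_len = max_len - 1
train_X = pad_sequences([s[:-1] for s in sequences], maxlen=seq_len)
train_y = to_categorical([s[-1] for s in sequences], num_classes=vocab_size)
```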
Looking to the future, along with adding the <start> and <end> tags to the bars, I want to try changing up the way I split the training data. Maybe the next version of this will predict an entire bar based off of the previous bar. That'll be a project for another day though.

Defining the Models

With the data imported and split into the train_X and train_y sets, it's time to define the model architectures and begin training.

First up is the SimpleRNN architecture! The SimpleRNN will give a good baseline against the GRU, LSTM, and CNN+LSTM architectures. The SimpleRNN unit can be expressed arithmetically as:

h(t) = tanh(W_x · x(t) + W_h · h(t-1) + b)

where h(t) is the hidden state at a given point in time t. As you can see in the equation, the SimpleRNN relies on the previous hidden state h(t-1) and the current input x(t) to give us the current hidden state.

The SimpleRNN is great because of its ability to work with sequence data. The shortfall is in its simplicity. The SimpleRNN is unable to remember data from further back in the sequence and thus suffers from the vanishing gradient problem. The vanishing gradient problem occurs as we get further down the sequence, when earlier states have a harder time being expressed. There is no mechanism in a SimpleRNN to help it keep track of previous states.

In code, the SimpleRNN network is a small Keras model (a sketch of all four networks is shown at the end of this section). The data being fed into the network is only expressed as an N*T vector, where the SimpleRNN is expecting an N*T*D vector. We correct this by adding an embedding layer to give the vector the D dimension. The embedding layer allows the inputs to be transformed into a dense vector that can be fed into the SimpleRNN cells. For more information on the embedding layer, see the TensorFlow documentation.

I'm utilizing the Adam optimizer with a learning rate of 0.001, and categorical cross-entropy as my loss function. Categorical cross-entropy is used because we are trying to classify the next word in the sequence given the previous steps.

Next up is the network utilizing the Gated Recurrent Unit. The GRU improves upon the SimpleRNN cell by introducing a reset gate and an update gate. At a high level, these gates are used to decide which information we want to retain or discard from previous states. The GRU is expressed as:

z(t) = σ(W_z · x(t) + U_z · h(t-1) + b_z)
r(t) = σ(W_r · x(t) + U_r · h(t-1) + b_r)
h'(t) = tanh(W_h · x(t) + U_h · (r(t) ⊙ h(t-1)) + b_h)
h(t) = (1 - z(t)) ⊙ h(t-1) + z(t) ⊙ h'(t)

where z(t) is the update gate, r(t) is the reset gate, and h(t) is the hidden cell state (σ is the sigmoid function, ⊙ is element-wise multiplication, and h'(t) is the candidate hidden state). The GRU network is constructed in TensorFlow the same way as the SimpleRNN network, swapping in a GRU layer, and again I'm utilizing Adam for the optimizer and categorical cross-entropy as the loss function.

The Long Short Term Memory architecture was the next to be utilized. The long short term memory cell has an advantage over the SimpleRNN and GRU cells in being able to retain even more information from further down the sequence. The LSTM utilizes three different gates, as opposed to the GRU's two, and retains a cell state throughout the network. The GRU is known to have a speed advantage over the LSTM, in that it is able to generalize faster and uses fewer parameters. However, the LSTM tends to take the cake when it comes to retaining more contextual data throughout a sequence. The LSTM cell can be expressed as:

f(t) = σ(W_f · x(t) + U_f · h(t-1) + b_f)
i(t) = σ(W_i · x(t) + U_i · h(t-1) + b_i)
o(t) = σ(W_o · x(t) + U_o · h(t-1) + b_o)
c'(t) = tanh(W_c · x(t) + U_c · h(t-1) + b_c)
c(t) = f(t) ⊙ c(t-1) + i(t) ⊙ c'(t)
h(t) = o(t) ⊙ tanh(c(t))

where f(t) represents the forget gate and determines how much of the previous state to forget, i(t) represents the input gate, which determines how much of the new information we add to the cell state, and o(t) is the output gate, which determines which information progresses to the next hidden state. The cell state is represented by c(t), and the hidden state is h(t).

The final architecture I wanted to test was a combination of a convolutional neural network and an LSTM. This network was a thought experiment to see how the results would differ from the LSTM, GRU, and SimpleRNN. I was actually surprised at some of the verses it put out. The code for this architecture, along with the other three networks, is shown below.
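Below is a minimal sketch of how the four networks might be defined in Keras. The layer widths and embedding dimension are illustrative placeholders rather than the exact values from the original notebook; vocab_size and seq_len come from the tokenization sketch above.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Embedding, SimpleRNN, GRU, LSTM,
                                     Conv1D, MaxPooling1D, Dense)
from tensorflow.keras.optimizers import Adam

embed_dim = 128  # illustrative embedding dimension

def compile_model(model):
    # Adam with a learning rate of 0.001 and categorical cross-entropy,
    # since we are classifying the next word in the sequence.
    model.compile(optimizer=Adam(learning_rate=0.001),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Baseline: an embedding layer to give the input its D dimension, then a SimpleRNN.
simple_rnn_model = compile_model(Sequential([
    Embedding(vocab_size, embed_dim),
    SimpleRNN(128),
    Dense(vocab_size, activation="softmax"),
]))

# Same structure with a GRU cell in place of the SimpleRNN.
gru_model = compile_model(Sequential([
    Embedding(vocab_size, embed_dim),
    GRU(128),
    Dense(vocab_size, activation="softmax"),
]))

# Same structure with an LSTM cell.
lstm_model = compile_model(Sequential([
    Embedding(vocab_size, embed_dim),
    LSTM(128),
    Dense(vocab_size, activation="softmax"),
]))

# CNN + LSTM: a 1D convolution and pooling step before the LSTM,
# in the spirit of the C-LSTM text-classification architecture.
cnn_lstm_model = compile_model(Sequential([
    Embedding(vocab_size, embed_dim),
    Conv1D(filters=64, kernel_size=3, padding="same", activation="relu"),
    MaxPooling1D(pool_size=2),
    LSTM(128),
    Dense(vocab_size, activation="softmax"),
]))
```

Each model can then be trained with something like model.fit(train_X, train_y, epochs=..., batch_size=...), with the number of epochs chosen per architecture.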
Generating Fire with the Models

Creating the models for this project was only about half of the work. The other half was generating song lyrics utilizing the trained models. In my opinion, this is where the project became really fun: I was able to take the models I trained and utilize them for a non-trivial task.

This project was heavily inspired by "Evaluating Creative Language Generation: The Case of Rap Lyric Ghostwriting" by Peter Potash, Alexey Romanov, and Anna Rumshisky. With that, I'm going to utilize some of the methods outlined in their paper for evaluating the output of the models against the original lyrics from the artist. The methods I'm utilizing to evaluate bars and generate raps are comprehension score, rhyme index, and lyrical uniqueness. I'll discuss how I calculated these shortly.

A high level overview of how I'm generating songs can be described as:

1. Use the Markov model to generate a short seed phrase for the bar.
2. Feed the seed phrase into the trained model until the bar reaches the desired length.
3. Score the bar's comprehension, rhyme index, and uniqueness against the artist's original lyrics.
4. If the bar scores within the threshold, add it to the song; otherwise, try again up to a maximum number of tries and keep the best attempt.
5. Repeat until the song has the desired number of bars.

Fairly simple, right? Let's jump into the code of how this is done.

First, I have a function named generate_rap. This function handles the main functionality of generating a rap song. generate_rap takes in the model I want to use to generate the rap (SimpleRNN, GRU, LSTM, or CNN+LSTM), the max bar length, how many bars we want in the rap, the score thresholds, and how many tries we want for generating a fire bar. The score thresholds define how well a bar has to score before it is considered fire: in this case, the closer to 0 the bar's score is, the more fire it is. Here is roughly how the function looks in code.
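The sketch below is my reconstruction rather than the notebook's exact code. It relies on the markov_model built in the preparation step, the generate_bar and score_bar helpers described in the next few paragraphs, and precomputed values for the artist's bars and average scores (artist_bars, artist_avg_comprehension, artist_avg_rhyme_index; the names are my own).

```python
import random

def generate_rap(model, max_bar_length, num_bars, min_score, max_score, max_tries):
    """Build a song bar by bar, keeping bars whose score lands within the
    [min_score, max_score] window (the closer to 0, the more fire the bar)."""
    song = []
    for _ in range(num_bars):
        best_bar, best_score = "", float("inf")
        for _ in range(max_tries):
            # Seed the bar with a short Markov-generated phrase so that the
            # start of the sequence is somewhat coherent.
            seed = markov_model.make_short_sentence(80, tries=100)
            if seed is None:
                continue
            bar = generate_bar(seed, model, random.randint(4, max_bar_length))
            score = score_bar(bar, artist_bars,
                              artist_avg_comprehension, artist_avg_rhyme_index)
            # Remember the best attempt in case nothing meets the threshold.
            if abs(score) < abs(best_score):
                best_bar, best_score = bar, score
            # A score inside the threshold window counts as fire: keep it.
            if min_score <= score <= max_score:
                best_bar = bar
                break
        song.append(best_bar)
    return "\n".join(song)
```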
As you can see, we generate a random bar, score it based on the artist's average rhyme index, average comprehension, and the uniqueness of the bar, and if the bar meets the score threshold it graduates into the final song. If the algorithm fails to generate a fire bar within the defined max tries, it puts the best-scoring bar in the song and moves on.

Within generate_rap I'm utilizing another function named generate_bar. This function takes in a seed phrase, the model we are using to generate the sequence, and the sequence's length. generate_bar tokenizes the seed phrase and feeds it into the provided model until the sequence hits the desired length, then returns the output.

To score the bars, I'm utilizing a function named score_bar. This function takes in the bar we want to score, the artist's original lyrics, the artist's average comprehension score, and the artist's average rhyme index. score_bar calculates the input bar's comprehension score, rhyme index, and uniqueness index and then scores the bar. The bar's score can be positive or negative, with 0 being the best score a bar can achieve. A score of 0 means that the bar has the same rhyme index and comprehension score as the artist while remaining completely unique from the original artist's lyrics. A perfect score of 0 is impossible to achieve, which is why we define min and max thresholds.

To calculate the rhyme index of a bar, I'm utilizing the method described in "Evaluating Creative Language Generation: The Case of Rap Lyric Ghostwriting." The rhyme index is calculated by taking the number of rhymed syllables and dividing it by the total number of syllables in the bar or song.

For comparing the uniqueness of the generated bar, I'm computing the cosine distance between the generated bar and all of the artist's bars, then taking the average distance as the uniqueness score. Sketches of generate_bar, score_bar, and these two metrics are shown below.
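Below is a minimal sketch of these helpers. The readability metric (textstat's Flesch-Kincaid grade), the rhyme check (a rough approximation built on the pronouncing library rather than the paper's exact algorithm), and the way the three components are combined in score_bar are all assumptions on my part; tokenizer and seq_len come from the earlier preparation sketch.

```python
import numpy as np
import pronouncing   # CMU dictionary helpers used for the rough rhyme check
import textstat      # assumed readability/syllable library
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from tensorflow.keras.preprocessing.sequence import pad_sequences

def generate_bar(seed_text, model, bar_length):
    """Tokenize the seed phrase and repeatedly ask the model for the next
    word until the bar reaches the desired length."""
    bar = seed_text
    while len(bar.split()) < bar_length:
        seq = tokenizer.texts_to_sequences([bar])[0]
        seq = pad_sequences([seq], maxlen=seq_len)
        next_id = int(np.argmax(model.predict(seq, verbose=0), axis=-1)[0])
        next_word = tokenizer.index_word.get(next_id, "")
        if not next_word:
            break
        bar += " " + next_word
    return bar

def rhyme_index(text):
    """Rough stand-in for the paper's rhyme index: rhymed syllables divided
    by total syllables, counting a word as rhymed if any other word in the
    text appears in its rhyme list."""
    words = [w.lower().strip(".,!?*\"'") for w in text.split()]
    words = [w for w in words if w]
    total_syllables = sum(textstat.syllable_count(w) for w in words) or 1
    rhymed_syllables = 0
    for i, word in enumerate(words):
        others = set(words[:i] + words[i + 1:])
        if others & set(pronouncing.rhymes(word)):
            rhymed_syllables += textstat.syllable_count(word)
    return rhymed_syllables / total_syllables

def uniqueness(bar, artist_bars):
    """Average cosine distance between the generated bar and every bar from
    the artist (1.0 means completely unlike anything the artist wrote)."""
    vectorizer = CountVectorizer().fit(artist_bars + [bar])
    bar_vec = vectorizer.transform([bar])
    artist_vecs = vectorizer.transform(artist_bars)
    similarities = cosine_similarity(bar_vec, artist_vecs)[0]
    return float(np.mean(1.0 - similarities))

def score_bar(bar, artist_bars, artist_avg_comprehension, artist_avg_rhyme_index):
    """Score a bar against the artist's averages, with 0 as the ideal score.
    The exact weighting in the original notebook may differ; this is one
    plausible combination of the three components described above."""
    comprehension = textstat.flesch_kincaid_grade(bar)   # assumed metric
    rhyme_diff = rhyme_index(bar) - artist_avg_rhyme_index
    comprehension_diff = comprehension - artist_avg_comprehension
    similarity_penalty = 1.0 - uniqueness(bar, artist_bars)
    return rhyme_diff + comprehension_diff + similarity_penalty
```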
The Results

With all of this, I was finally able to generate a full rap utilizing the four models I trained. After generating each rap, I took the generated song and calculated its rhyme index and comprehension score. Surprisingly, the full songs still remained fairly close to the original artist's rhyme index and comprehension score. Here are some of the outputs when training off of Drake lyrics.

The SimpleRNN (avg rhyme density: 0.5030674846625767, avg readability: 2.0599999999999996):

Now you're throwing me baby know it know
Look I gotta started with you hook drake
I swear it happened no tellin' yeah yeah
....

The GRU (avg rhyme density: 0.5176470588235295, avg readability: 1.9449999999999998):

That's why I died everything big crazy on me
Who keepin' score up yeah yeah yeah yeah
I've loved and you everything big crazy on me on
....

The LSTM (avg rhyme density: 0.3684210526315789, avg readability: 1.9749999999999996):

Get the **** lick alone same that wait now
up ****, see what uh huh heart thing up yeah
Despite the things though up up up up yeah yeah
....

The CNN+LSTM (avg rhyme density: 0.33519553072625696, avg readability: 2.2599999999999993):

They still out know play through now out out
I got it dedicate dedicate you yeah
I've been waiting much much aye aye days aye aye
....

For the full lyrics and the list of references, take a look at the Google Colab notebook. Also feel free to try it yourself and change the artist to the style you want to mimic.

As far as the SimpleRNN vs GRU vs LSTM vs CNN+LSTM experiment goes, I would say that the LSTM tended to have the best results. The CNN+LSTM had too many repetitive words in a bar, and I think this has to do with the CNN generalizing the sequence as a whole. The SimpleRNN and GRU produced pretty incoherent bars, and their rhyme densities were really far off from the original artist's.

That's it! Let me know what you think in the comments. I'd love to build upon this project in the future. If you have any suggestions for things I should change to get better results, let me know! Thank you for reading.

Check out my GitHub for the code to this project and other cool projects!

Source: https://towardsdatascience.com/ghost-writing-with-tensorflow-49e77e26978f