Notes on "Deep Learning with Python" --- 8.1 Generating text with an LSTM



I. Summary

One-sentence summary:

The principle is very simple: a single-layer LSTM learns the statistical regularities of the words and characters in the training data, and the softmax layer then acts as a classifier that maps the LSTM's output to a probability for each character in the vocabulary.
from tensorflow import keras
from tensorflow.keras import layers
model = keras.models.Sequential()
model.add(layers.LSTM(128, input_shape=(maxlen, len(chars))))
model.add(layers.Dense(len(chars), activation='softmax'))

 

 

1. What is the purpose of artificial intelligence?

【AI is not meant to replace our intelligence】: Admittedly, the artificial works of art we have seen so far are still of low quality; AI is far from rivaling human screenwriters, painters, and composers. But replacing humans was never the point. Artificial intelligence will not replace our own intelligence,
【but will bring more intelligence to our lives and work】: rather, it will bring more intelligence to our lives and work: intelligence of a different kind. In many fields, especially creative ones, people will use AI as a tool to augment their own abilities, achieving intelligence more powerful than AI alone.

 

 

2. Where does artificial intelligence make a difference?

【Simple pattern recognition and technical skill】: A large share of artistic creation consists of simple pattern recognition and technical skill, precisely the part of the process that many people consider unattractive or even dispensable.
【Our perceptual modalities, our language, and our artworks all have statistical structure】: Learning that structure is what deep learning algorithms excel at.

 

 

3. Is a machine learning model just a mathematical operation?

【A machine learning model can learn the statistical latent space of images, music, and stories, and then sample from that space】: creating new works with characteristics similar to those the model saw in its training data.
【A machine learning model is just a mathematical operation】: Of course, such sampling is not in itself an act of artistic creation. It is merely a mathematical operation: the algorithm has no grounding in human life, human emotion, or our experience of the world; instead, it learns from an experience very different from ours.

 

 

4. How is sequence data generated in the LSTM text-generation example?

【Use the previous tokens as input and train a network to predict the next one or more tokens in the sequence】: The universal way to generate sequence data with deep learning is to train a network (usually a recurrent or convolutional neural network) to predict the next token or next few tokens in a sequence, using the previous tokens as input.
【For example, given the input "the cat is on the ma", the network is trained to predict the target "t", the next character.】
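A minimal sketch of this framing (the 20-character window here is illustrative; the notebook below uses maxlen = 60):

# Slide a window over a string to build (input, next-character) pairs.
text = "the cat is on the mat"
window = 20
pairs = [(text[i:i + window], text[i + window])
         for i in range(len(text) - window)]
print(pairs[0])  # ('the cat is on the ma', 't')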

 

 

5. What is a language model?

【Any network that can model the probability of the next token given the previous ones】: As in earlier text-processing examples, tokens are typically words or characters, and any network that can model the probability of the next token given the previous tokens is called a language model.
【The latent space of language, i.e. its statistical structure】: A language model captures the latent space of language: its statistical structure.
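In symbols (a standard formulation, not spelled out in the book): a language model estimates $P(x_t \mid x_1, \dots, x_{t-1})$ at each position, which by the chain rule defines a distribution over whole sequences:

$$P(x_1, \dots, x_T) = \prod_{t=1}^{T} P(x_t \mid x_1, \dots, x_{t-1})$$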

 

 

6. What are sampling and conditioning data in the LSTM text-generation example?

【Sampling (i.e. generating new sequences)】: Once such a language model has been trained, you can sample from it, that is, generate new sequences.
【An initial text string, the conditioning data】: You feed the model an initial text string (the conditioning data), ask it to generate the next character or word (you can even generate several tokens at once), append the generated output to the input data, and repeat the process many times, as sketched below.
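A schematic of that loop. Everything here is a placeholder: `predict_next_char` is a hypothetical stand-in for "encode the seed, run the model, sample from its output distribution"; the runnable version appears in section II below.

import random
import string

# Hypothetical stand-in: the real version would one-hot encode the seed,
# run the LSTM, and sample from its softmax output. Here we pick a random
# lowercase letter just so the loop runs end to end.
def predict_next_char(model, conditioning_text):
    return random.choice(string.ascii_lowercase + ' ')

model = None  # placeholder; the trained model would go here

seed = "the cat is on the ma"  # conditioning data
generated = seed
for _ in range(400):
    next_char = predict_next_char(model, generated[-60:])  # condition on the last 60 chars
    generated += next_char  # append the output to the input and repeat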

 

 

7. Why is choosing the next character crucial when generating text?

【Greedy sampling】: A naive approach is greedy sampling (always choosing the most likely next character). But this yields repetitive, predictable strings that do not look like coherent language.
【Stochastic sampling】: A more interesting approach makes slightly more surprising choices by introducing randomness into the sampling process: drawing from the probability distribution over the next character. This is called stochastic sampling (stochasticity simply means randomness in this context). Under this scheme, if the model assigns the character e a probability of 0.3 of coming next, you will pick it 30% of the time. The two strategies are contrasted in the sketch below.
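A minimal contrast between the two strategies (the four-character distribution is invented for illustration):

import numpy as np

chars = ['a', 'e', 'o', 't']
probs = np.array([0.2, 0.3, 0.4, 0.1])  # toy model output, not real predictions

# Greedy sampling: always take the argmax -- deterministic and repetitive.
greedy_char = chars[int(np.argmax(probs))]  # always 'o'

# Stochastic sampling: draw from the distribution itself,
# so 'e' is chosen 30% of the time, 'o' 40%, and so on.
stochastic_char = chars[np.random.choice(len(chars), p=probs)]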

 

 

8. Why does sampling (generating new sequences) need some randomness?

【Pure random sampling has maximum entropy, i.e. maximum randomness】: Consider one extreme: pure random sampling, where the next character is drawn from a uniform probability distribution, every character being equally likely. This scheme has maximum randomness; in other words, the probability distribution has maximum entropy. Naturally, it produces nothing interesting.
【Greedy sampling has minimum entropy, i.e. no randomness】: Now consider the other extreme, greedy sampling. It produces nothing interesting either: it has no randomness at all, and the corresponding probability distribution has minimum entropy.
【Less entropy gives the generated sequences a more predictable structure (so they may look more realistic), while more entropy yields more surprising and creative sequences】: Between these extremes lie many intermediate points with more or less entropy, all worth exploring. Less entropy makes the generated sequences more predictable in structure (and thus potentially more realistic-looking), whereas more entropy produces more surprising and creative sequences.
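A quick way to see the two extremes numerically, using Shannon entropy (a sketch; the example distributions are invented):

import numpy as np

def entropy(p):
    # Shannon entropy in nats; zero-probability entries are ignored.
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

print(entropy([0.25, 0.25, 0.25, 0.25]))  # ~1.386: uniform, maximum entropy
print(entropy([1.0, 0.0, 0.0, 0.0]))      # 0.0: greedy (one-hot), minimum entropy
print(entropy([0.7, 0.2, 0.08, 0.02]))    # ~0.85: an intermediate point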

 

 

9. What is the softmax temperature?

【To control the amount of randomness during sampling】: we introduce a parameter called the softmax temperature,
【which characterizes the entropy of the sampling distribution, i.e. how surprising or predictable the choice of the next character will be.】
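The reweighting works roughly like this (a sketch following the book's reweight_distribution idea: take the log of the original distribution, scale by 1/temperature, and renormalize with softmax):

import numpy as np

def reweight_distribution(original_distribution, temperature=0.5):
    # Log, scale by 1/temperature, then renormalize via softmax.
    distribution = np.log(original_distribution) / temperature
    distribution = np.exp(distribution)
    return distribution / np.sum(distribution)

p = np.array([0.5, 0.3, 0.15, 0.05])
print(reweight_distribution(p, 0.2))  # sharper: closer to greedy sampling
print(reweight_distribution(p, 1.0))  # unchanged
print(reweight_distribution(p, 1.5))  # flatter: more surprising samples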

 

 

10. A single-layer LSTM model for predicting the next character?

The principle is very simple: a single-layer LSTM learns the statistical regularities of the words and characters in the training data, and the softmax layer then acts as a classifier that maps the LSTM's output to a probability for each character in the vocabulary.
from tensorflow import keras
from tensorflow.keras import layers
model = keras.models.Sequential()
model.add(layers.LSTM(128, input_shape=(maxlen, len(chars))))
model.add(layers.Dense(len(chars), activation='softmax'))

 

11. Key points for generating text with an LSTM?

We can generate discrete sequence data by training a model to predict the next token or tokens, given the previous tokens.
For text, such a model is called a language model. It can be word-level or character-level.
Sampling the next token requires balancing the model's judgment against a dose of randomness.
One way to handle this is the softmax temperature. Always try a range of temperatures to find the right one.

 

 

 

 

II. 8.1 Generating text with an LSTM


[...]

Implementing character-level LSTM text generation

Let's put these ideas in practice in a Keras implementation. The first thing we need is a lot of text data that we can use to learn a language model. You could use any sufficiently large text file or set of text files -- Wikipedia, the Lord of the Rings, etc. In this example we will use some of the writings of Nietzsche, the late-19th century German philosopher (translated to English). The language model we will learn will thus be specifically a model of Nietzsche's writing style and topics of choice, rather than a more generic model of the English language.

Preparing the data

Let's start by downloading the corpus and converting it to lowercase:

In [1]:
from tensorflow import keras
import numpy as np

path = keras.utils.get_file(
    'nietzsche.txt',
    origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
text = open(path).read().lower()
print('Corpus length:', len(text))
Corpus length: 600893
In [2]:
print(text[0:400]) 
preface


supposing that truth is a woman--what then? is there not ground
for suspecting that all philosophers, in so far as they have been
dogmatists, have failed to understand women--that the terrible
seriousness and clumsy importunity with which they have usually paid
their addresses to truth, have been unskilled and unseemly methods for
winning a woman? certainly she has never allowed herself 

Next, we will extract partially-overlapping sequences of length maxlen, one-hot encode them, and pack them into a 3D Numpy array x of shape (sequences, maxlen, unique_characters). Simultaneously, we prepare an array y containing the corresponding targets: the one-hot-encoded characters that come right after each extracted sequence.

In [3]:
# Length of extracted character sequences
maxlen = 60

# We sample a new sequence every `step` characters
step = 3

# This holds our extracted sequences
sentences = []

# This holds the targets (the follow-up characters)
next_chars = []

for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('Number of sequences:', len(sentences))

# List of unique characters in the corpus
chars = sorted(list(set(text)))
print('Unique characters:', len(chars))

# Dictionary mapping unique characters to their index in `chars`
char_indices = dict((char, chars.index(char)) for char in chars)

# Next, one-hot encode the characters into binary arrays.
# (NumPy removed the `np.bool` alias; use the builtin `bool`.)
print('Vectorization...')
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1
Number of sequences: 200278
Unique characters: 58
Vectorization...
In [4]:
print(chars) 
['\n', ' ', '!', '"', "'", '(', ')', ',', '-', '.', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '=', '?', '[', ']', '_', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'æ', 'ë', 'ä', 'Æ', 'é']

Building the network

Our network is a single LSTM layer followed by a Dense classifier with a softmax over all possible characters. Note that recurrent neural networks are not the only way to generate sequence data; 1D convnets have also proven extremely successful at it in recent times.

In [5]:
from tensorflow.keras import layers

model = keras.models.Sequential()
model.add(layers.LSTM(128, input_shape=(maxlen, len(chars))))
model.add(layers.Dense(len(chars), activation='softmax'))

Since our targets are one-hot encoded, we will use categorical_crossentropy as the loss to train the model:

In [6]:
# `learning_rate` replaces the older `lr` argument in recent Keras versions
optimizer = keras.optimizers.RMSprop(learning_rate=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

Training the language model and sampling from it

Given a trained model and a seed text snippet, we generate new text by repeatedly:

  1) Drawing from the model a probability distribution over the next character, given the text available so far
  2) Reweighting the distribution to a certain "temperature"
  3) Sampling the next character at random according to the reweighted distribution
  4) Adding the new character at the end of the available text

This is the code we use to reweight the original probability distribution coming out of the model, and draw a character index from it (the "sampling function"):

In [7]:
def sample(preds, temperature=1.0):
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)

Finally, this is the loop in which we repeatedly train the model and generate text. After every epoch, we generate text using a range of different temperatures. This lets us see how the generated text evolves as the model begins to converge, and how temperature affects the sampling strategy.

In [8]:
import random
import sys

for epoch in range(1, 60):
    print('epoch', epoch)
    # Fit the model for 1 epoch on the available training data
    model.fit(x, y, batch_size=128, epochs=1)

    # Select a text seed at random
    start_index = random.randint(0, len(text) - maxlen - 1)
    generated_text = text[start_index: start_index + maxlen]
    print('--- Generating with seed: "' + generated_text + '"')

    for temperature in [0.2, 0.5, 1.0, 1.2]:
        print('------ temperature:', temperature)
        sys.stdout.write(generated_text)

        # We generate 400 characters
        for i in range(400):
            sampled = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(generated_text):
                sampled[0, t, char_indices[char]] = 1.

            preds = model.predict(sampled, verbose=0)[0]
            next_index = sample(preds, temperature)
            next_char = chars[next_index]

            generated_text += next_char
            generated_text = generated_text[1:]

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()
epoch 1
1565/1565 [==============================] - 14s 9ms/step - loss: 1.9697
--- Generating with seed: "let it be
permitted to designate by this expression the beli"
------ temperature: 0.2
let it be
permitted to designate by this expression the belies and some the self-contiment in the any the spirit of the say for the say the man and stan and stand and still one man are some and still of the say for the more the should the experience of the self-conscience of the self-in the and the exist of the more the should the exist of the say the self-conscience of the and the feases and condicious and states of the acts the sense the self-curtemption
------ temperature: 0.5
icious and states of the acts the sense the self-curtemption, the instinction of the morality one weal a case conduction of the earthing there should se for the all and finder conception of the more the some the formere. the at a moraling for the into a the despection of deep and self-dection of the sigie shound in them to and consigness, and man are they any
will general the more that is it is a sense of the were manger of the indesing the enters, exarmed
------ temperature: 1.0
sense of the were manger of the indesing the enters, exarmed ithe will light scient spacting in to fe sign to meare, liked encerstand: are with the shirally
foon caussive finhel to he, "p6ession, onitate of impasion of but in bloud; a man be
an--morality fow mabinebres and post whethers(_orders--is the
shild a more
have:--in present hatestageialequally bod; it from "say best by a false,
may a pe brail,
"
myasio "contuping within it
egom:, kant of "sympatio
------ temperature: 1.2
rail,
"
myasio "contuping within it
egom:, kant of "sympation; ragtlering bhacteezt., ti luec and nopence to that is thore le im-sphils asing, and mekeucimem of retodrancm tos.
a2u once surffar spivis anding fuint onciaveariace by coleres ouces-le, eremy virowe bamide have
fehemenesss. in
the yid thus
regral cladting"
-ipeible for pait
frumccordmal isps, love to geent expousies
theseupratt free which has alsophingd: in dumter onithes, it
is great sthen for
epoch 2
1565/1565 [==============================] - 14s 9ms/step - loss: 1.6167
--- Generating with seed: "about its being the best or the worst) and
that these ideas "
------ temperature: 0.2
about its being the best or the worst) and
that these ideas the consequently and the more the consequently and souls the consequently and such an accortion of the consequently the the content and the art an accounted the morality of the subtle of the latter the man in the consequently and attain the consequently the existence of the subtle the subtle and serition of the act of the constance of the not the consequently and self the consequently and such an 
------ temperature: 0.5
 not the consequently and self the consequently and such an existences of the strong the intimplesses or the ads under the sense, the will as the delicate feeling and sacriferent of the subtle explianly the soulter of the prenocle the to be a the time of but the strong the heart the nature, or the superity sight of the more there is any what is purit of the consequently and its friends with even the free simple, and in the will as the wart of the suberess 
------ temperature: 1.0
he free simple, and in the will as the wart of the suberess of they ancimal to arfiner enterlexised not with furdiamentvances act of huthed cormes it contannation, howeverintwame fear, an action and self,d the stalfn-ganes of the dogatey conis is the polity virtue the god and invented excliend the, one courtetacy the inslange of all iowned, a laffer
furreptions the cates harn byings open of the a delicate ancinch fcledo,
morality) that which an oppeenate l
------ temperature: 1.2
delicate ancinch fcledo,
morality) that which an oppeenate lithletxinces. it simply difficially could (bachigear-of europe of agriligy,. a parculalireavi.)
 be"-all, noteven, inverco-manial, the
gort ut and find that therress whochy
haught feeld heir, at sief hance, 
is fvensamentate horder by the suppossionware walt was, libe, partarisispssfy of a mind has to its"licy. it id, what cured. lotes basue: whihe taken time bvtward every
here abadmins thues who 
epoch 3
1565/1565 [==============================] - 15s 9ms/step - loss: 1.5280
--- Generating with seed: "the "otherwise"), nor does it address
itself to the individu"
------ temperature: 0.2
the "otherwise"), nor does it address
itself to the individual soul and should to the same that the same that the self-contemplation of the self-contiture of the strong and something of the self-call that when the self-calling to the same of the same that the self-contemplation of the same to man that it is that the self-corrant of the self-corrant and strong and stronger property of the self-contradict of the self-constance of the same soul and strong to 
------ temperature: 0.5
radict of the self-constance of the same soul and strong to obering of the self-cortion and the man as in not god that the times of the spicit that there are
later of the learny of the prople to the strength and something with a contended and as the most as one always, who seem of the same to methouther and later of the strength to be self-contully that, what we more for the ladding and contractlibuted in that the distrustion of every one his possible to d
------ temperature: 1.0
buted in that the distrustion of every one his possible to diffect of
the
searly for etrist his world--as being traupant that all impro?tion of
this qualt must undeemnes, more reterate of which with which shoride truths and wirds and pofitic origin and
anther ternian owing thas
engless prowly hid who outisgune of the
instruncing for a modest--he is
rightly and akindc ear honting,
done in prompuss, which
squirge--we charced, a think for dispirst of one has 
------ temperature: 1.2
 which
squirge--we charced, a think for dispirst of one has is flan to !

11. i man ourpved in
hound unto the are, recograiistly, in a bodiotes in invervients.---ashah tirlibal opinion; it is character powesty--skoul, labe was found and gencivication of the
bleaker aachoig;--who masks remathes philosophyriagesty objift of ects
no loveroginiresty, from noh feelration of perhondovantant and man anxurimabligion, an ax ambigionisifantion" doisism in fadt (w! i
epoch 4
1565/1565 [==============================] - 15s 9ms/step - loss: 1.4808
--- Generating with seed: "tire of "perfecting" ourselves in our
virtue, which alone re"
------ temperature: 0.2
tire of "perfecting" ourselves in our
virtue, which alone really of the present of the present and the pressing and suffering the same the predoce of a suffering to the present to the present and an entailed to the suffering of the present and the pressions of the present to the pressing of the conception of the pressing and suffer the subtle the pression of the present to our proper and the present to the predicious in the present and from the same as the
------ temperature: 0.5
nt to the predicious in the present and from the same as the "feeling and the tarnes when wishing of the misunderstand of morality of a desires of
the strong to sould and existence of all immoralist and puritical condition of the spirit of causes and also to the cheres and there is also the comprehend the soulse of his the sufferon, which the man his have deprehing in the last prode and cause and to of the preceams to auboug the philosophers as only in the
------ temperature: 1.0
to of the preceams to auboug the philosophers as only in the suproral an e signiain, in which ameribeness
of say exagonged after old already to their their virtues, for the treat our new dogenession. 
 sanism.--and the old reason, wishes he who hellance, that is not moral bad to
the germany emorism. sust they will if is not firmoure wishes
heid, as it has also to us as which they are pride. there, in chanberriris
he could quand,, who
referring
with the, ea
------ temperature: 1.2
 in chanberriris
he could quand,, who
referring
with the, easpeg,nes to according very have all eleble culture, as the are itself-grows is reffired tuen sphild, only out of the whoflearness" not this justme
toforehichespardly tread many not,
what it
boughr: ethic sapas;
by which
conduth that
acreng

super, timerowings phuloncenced, of fotuinlian, are all the
vided, if least stated too
cistment-nanma-change.--the
sencely truiss of which one different and
su
epoch 5
1565/1565 [==============================] - 15s 9ms/step - loss: 1.4511
--- Generating with seed: "f port royal, sainte-beuve, in spite of all
his hostility to"
------ temperature: 0.2
f port royal, sainte-beuve, in spite of all
his hostility to the same and something and something and best the precisely the subtle spirit of the states of the same distance and the subtle present and such a conscipuse and the distrust and the present and such a man and the fact of the same the contradicting and states which has a conscipuse and distance of the present of the same and such a man and such a man and mankind and precisely the subtle spiriture
------ temperature: 0.5
nd such a man and mankind and precisely the subtle spiriture the fact the best attermant looks and such as the part of the part of the bedom and possess, the dange
...................

As you can see, a low temperature results in extremely repetitive and predictable text, but where local structure is highly realistic: in particular, all words (a word being a local pattern of characters) are real English words. With higher temperatures, the generated text becomes more interesting, surprising, even creative; it may sometimes invent completely new words that sound somewhat plausible (such as "eterned" or "troveration"). With a high temperature, the local structure starts breaking down and most words look like semi-random strings of characters. Without a doubt, here 0.5 is the most interesting temperature for text generation in this specific setup. Always experiment with multiple sampling strategies! A clever balance between learned structure and randomness is what makes generation interesting.

Note that by training a bigger model, longer, on more data, you can achieve generated samples that will look much more coherent and realistic than ours. But of course, don't expect to ever generate any meaningful text, other than by random chance: all we are doing is sampling data from a statistical model of which characters come after which characters. Language is a communication channel, and there is a distinction between what communications are about, and the statistical structure of the messages in which communications are encoded. To evidence this distinction, here is a thought experiment: what if human language did a better job at compressing communications, much like our computers do with most of our digital communications? Then language would be no less meaningful, yet it would lack any intrinsic statistical structure, thus making it impossible to learn a language model like we just did.

Takeaways

  • We can generate discrete sequence data by training a model to predict the next token(s) given previous tokens.
  • In the case of text, such a model is called a "language model" and could be based on either words or characters.
  • Sampling the next token requires balance between adhering to what the model judges likely, and introducing randomness.
  • One way to handle this is the notion of softmax temperature. Always experiment with different temperatures to find the "right" one.
 





 