Jay Chou's songs were part of the youth of pretty much everyone born in the 90s. So what would it look like if an AI wrote lyrics in his style?
First, of course, we need Jay Chou's lyrics. The corpus used here covers more than a dozen of his albums, roughly 5,000 lines of lyrics in total.
Original file format:
Step 1: Data preprocessing
import jieba

def preprocess(data):
    """
    Replace characters in the text: spaces become commas, newlines become full stops.
    """
    data = data.replace(' ', ',')
    data = data.replace('\n', '。')
    words = jieba.lcut(data, cut_all=False)  # precise-mode word segmentation
    return words
Result after processing:
First 10 tokens: ['想要', '有', '直升機', '。', '想要', '和', '你', '飛到', '宇宙', '去']
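For context, here is a minimal sketch of how the raw lyric file might be read and passed through preprocess. The filename lyrics.txt and the variable name text are assumptions for illustration, not details from the original post:

# Hypothetical driver code; 'lyrics.txt' is an assumed filename.
with open('lyrics.txt', encoding='utf-8') as f:
    raw = f.read()

text = preprocess(raw)
print('First 10 tokens:', text[:10])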
Next, keep the processed data in memory and map the text to integer IDs.
# Build the vocabulary and the word <-> integer mappings
vocab = set(text)
vocab_to_int = {w: idx for idx, w in enumerate(vocab)}
int_to_vocab = {idx: w for idx, w in enumerate(vocab)}

# Convert the text into integer IDs
int_text = [vocab_to_int[w] for w in text]
Building the neural network
a. Build the input layer
import tensorflow as tf

def get_inputs():
    # Placeholders for the input word IDs, the target word IDs and the learning rate
    inputs = tf.placeholder(tf.int32, [None, None], name='inputs')
    targets = tf.placeholder(tf.int32, [None, None], name='targets')
    learning_rate = tf.placeholder(tf.float32, name='learning_rate')
    return inputs, targets, learning_rate
b. Build the stacked RNN cell
Here rnn_size is the number of hidden units in the RNN layer.
def get_init_cell(batch_size, rnn_size):
    lstm = tf.contrib.rnn.BasicLSTMCell(rnn_size)
    cell = tf.contrib.rnn.MultiRNNCell([lstm])
    # Zero-initialise the state and name it so it can be fetched after reloading the graph
    initial_state = cell.zero_state(batch_size, tf.float32)
    initial_state = tf.identity(initial_state, 'initial_state')
    return cell, initial_state
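Note that although the heading says "stacked", MultiRNNCell here wraps only a single LSTM layer. A hedged variant that actually stacks several layers could look like the sketch below; the num_layers parameter and the function name are assumptions, not something the original post uses:

def get_init_cell_stacked(batch_size, rnn_size, num_layers=2):
    # Create a separate LSTM cell per layer and stack them
    cells = [tf.contrib.rnn.BasicLSTMCell(rnn_size) for _ in range(num_layers)]
    cell = tf.contrib.rnn.MultiRNNCell(cells)
    initial_state = tf.identity(cell.zero_state(batch_size, tf.float32), 'initial_state')
    return cell, initial_state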
c. Word Embedding
Because the vocabulary is large, one-hot encoding the words would make the input far too high-dimensional, so we add an embedding layer to the model to map each input word to a lower-dimensional dense vector.
def get_embed(input_data, vocab_size, embed_dim):
    # Embedding matrix initialised uniformly in [-1, 1)
    embedding = tf.Variable(tf.random_uniform([vocab_size, embed_dim], -1, 1))
    # Look up the embedding vector for each word ID
    embed = tf.nn.embedding_lookup(embedding, input_data)
    return embed
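To make the dimensions concrete, a small usage example of the lookup is shown below. The numbers (a 6,000-word vocabulary, 300-dimensional embeddings, a batch of 64 sequences of 20 word IDs) are illustrative assumptions, not values from the post:

# Illustrative shapes only; relies on get_embed defined above.
ids = tf.zeros([64, 20], dtype=tf.int32)              # [batch_size, seq_length] word IDs
vectors = get_embed(ids, vocab_size=6000, embed_dim=300)
print(vectors.shape)                                   # (64, 20, 300)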
d. Build the network by connecting the RNN layer to a fully connected output layer
Here cell is the RNN cell; rnn_size is the number of hidden units; input_data is the input tensor; vocab_size is the vocabulary size; and embed_dim is the embedding dimension.
def build_nn(cell, rnn_size, input_data, vocab_size, embed_dim):
    embed = get_embed(input_data, vocab_size, embed_dim)
    outputs, final_state = build_rnn(cell, embed)
    # Project the RNN outputs to vocabulary-sized logits (no activation; softmax comes later)
    logits = tf.contrib.layers.fully_connected(outputs, vocab_size, activation_fn=None)
    return logits, final_state
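build_nn calls a build_rnn helper that the post never shows. A minimal sketch of what it presumably does, unrolling the cell with tf.nn.dynamic_rnn and naming the final state so it can be retrieved from the saved graph, is given below; treat it as an assumption about the missing code rather than the author's exact implementation:

def build_rnn(cell, inputs):
    # Unroll the RNN over the embedded input sequence
    outputs, final_state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)
    final_state = tf.identity(final_state, name='final_state')
    return outputs, final_state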
e. Construct batches
Here we split the data into len // (batch_size * seq_length) batches based on batch_size and seq_length; each batch contains the inputs and the corresponding targets (the inputs shifted by one word), as in the worked example after the code below.
import numpy as np

def get_batches(int_text, batch_size, seq_length):
    '''Construct batches of (input, target) sequences.'''
    batch = batch_size * seq_length
    n_batch = len(int_text) // batch
    # Keep only as many words as fill complete batches
    int_text = np.array(int_text[:batch * n_batch])
    # Targets are the inputs shifted left by one word; the last target wraps around to the first word
    int_text_targets = np.zeros_like(int_text)
    int_text_targets[:-1], int_text_targets[-1] = int_text[1:], int_text[0]
    # Split into batches along the sequence dimension
    x = np.split(int_text.reshape(batch_size, -1), n_batch, -1)
    y = np.split(int_text_targets.reshape(batch_size, -1), n_batch, -1)
    # Combine inputs and targets: shape (n_batch, 2, batch_size, seq_length)
    return np.stack((x, y), axis=1)
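As a sanity check, here is a tiny worked example with made-up numbers (twelve word IDs, batch_size=2, seq_length=3, chosen purely for illustration):

batches = get_batches(list(range(1, 13)), batch_size=2, seq_length=3)
print(batches.shape)   # (2, 2, 2, 3): 2 batches x (input, target) x batch_size x seq_length
print(batches[0][0])   # inputs:  [[1 2 3] [7 8 9]]
print(batches[0][1])   # targets: [[2 3 4] [8 9 10]]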
Model training
from tensorflow.contrib import seq2seq

train_graph = tf.Graph()
with train_graph.as_default():
    vocab_size = len(int_to_vocab)
    input_text, targets, lr = get_inputs()          # input tensors
    input_data_shape = tf.shape(input_text)

    # Initialise the RNN and build the network
    cell, initial_state = get_init_cell(input_data_shape[0], rnn_size)
    logits, final_state = build_nn(cell, rnn_size, input_text, vocab_size, embed_dim)

    # Softmax probabilities over the vocabulary
    probs = tf.nn.softmax(logits, name='probs')

    # Loss function
    cost = seq2seq.sequence_loss(
        logits,
        targets,
        tf.ones([input_data_shape[0], input_data_shape[1]]))

    # Optimiser
    optimizer = tf.train.AdamOptimizer(lr)

    # Gradient clipping
    gradients = optimizer.compute_gradients(cost)
    capped_gradients = [(tf.clip_by_value(grad, -1., 1.), var)
                        for grad, var in gradients if grad is not None]
    train_op = optimizer.apply_gradients(capped_gradients)
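The post jumps straight from building the graph to the training log, so the actual training loop is not shown. A hedged sketch of what it likely looks like follows; the hyperparameter values (num_epochs, batch_size, seq_length, learning rate, save_dir) are assumptions, and rnn_size and embed_dim must already have been set (e.g. 512 and 300, also assumptions) before the graph-building code above:

# Assumed hyperparameters, not the author's actual values
num_epochs, batch_size, seq_length = 100, 64, 20
learning_rate_value, save_dir = 0.01, './save'

batches = get_batches(int_text, batch_size, seq_length)

with tf.Session(graph=train_graph) as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(num_epochs):
        state = sess.run(initial_state, {input_text: batches[0][0]})
        for batch_i, (x, y) in enumerate(batches):
            feed = {input_text: x, targets: y, initial_state: state, lr: learning_rate_value}
            train_loss, state, _ = sess.run([cost, final_state, train_op], feed)
        print('Epoch {} train_loss = {:.3f}'.format(epoch, train_loss))
    # Save the trained graph so it can be reloaded for generation
    tf.train.Saver().save(sess, save_dir)
print('Model Trained and Saved')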
Training results
Epoch 72 Batch 24/33 train_loss = 0.108
Epoch 75 Batch 25/33 train_loss = 0.104
Epoch 78 Batch 26/33 train_loss = 0.096
Epoch 81 Batch 27/33 train_loss = 0.111
Epoch 84 Batch 28/33 train_loss = 0.119
Epoch 87 Batch 29/33 train_loss = 0.130
Epoch 90 Batch 30/33 train_loss = 0.141
Epoch 93 Batch 31/33 train_loss = 0.138
Epoch 96 Batch 32/33 train_loss = 0.153
Model Trained and Saved
The train_loss looks decent, although with a loss this low on such a small corpus the model has probably overfit.
Finally, let's load the model and see what it generates.
# Load the saved model
loaded_graph = tf.Graph()
with tf.Session(graph=loaded_graph) as sess:
    loader = tf.train.import_meta_graph(save_dir + '.meta')
    loader.restore(sess, save_dir)

    # Fetch the trained tensors from the graph
    input_text, initial_state, final_state, probs = get_tensors(loaded_graph)

    # Sentence generation setup: start from a prime word
    gen_sentences = [prime_word]
    prev_state = sess.run(initial_state, {input_text: np.array([[1]])})

    # Generate the lyrics word by word
    for n in range(gen_length):
        dyn_input = [[vocab_to_int[word] for word in gen_sentences[-seq_length:]]]
        dyn_seq_length = len(dyn_input[0])

        # Predict the probability distribution over the next word
        probabilities, prev_state = sess.run(
            [probs, final_state],
            {input_text: dyn_input, initial_state: prev_state})

        # Sample the next word from that distribution and append it
        pred_word = pick_word(probabilities[0][dyn_seq_length - 1], int_to_vocab)
        gen_sentences.append(pred_word)
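The get_tensors and pick_word helpers are not shown in the post either. Minimal sketches follow, under the assumption that get_tensors simply looks up the tensors by the names given when the training graph was built, and that pick_word samples the next word according to the predicted probabilities:

def get_tensors(loaded_graph):
    # Look up the named tensors saved during training
    input_text = loaded_graph.get_tensor_by_name('inputs:0')
    initial_state = loaded_graph.get_tensor_by_name('initial_state:0')
    final_state = loaded_graph.get_tensor_by_name('final_state:0')
    probs = loaded_graph.get_tensor_by_name('probs:0')
    return input_text, initial_state, final_state, probs

def pick_word(probabilities, int_to_vocab):
    # Sample a word ID in proportion to its predicted probability
    idx = np.random.choice(len(probabilities), p=probabilities)
    return int_to_vocab[idx]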
Aiyo, not bad!
At the very end I also enlarged the lyric corpus, this time bringing in more pop artists. Let's see the result.
It seems even better!
If you like Jay Chou too, please like this post and share the generated lyrics.