Abstract: This article walks you through, from scratch, an application of quantum neural networks in natural language processing.
This article is shared from the Huawei Cloud community post 《體驗量子神經網絡在自然語言處理中的應用》 ("Experiencing the Application of Quantum Neural Networks in Natural Language Processing"), original author: JeffDing.
I. Runtime Environment
CPU: Intel(R) Core(TM) i7-4712MQ CPU @ 2.30GHz
Memory: 4 GB
Operating system: Ubuntu 20.10
MindSpore version: 1.2
II. Installing MindSpore
Refer to the official installation guide: https://www.mindspore.cn/install/
To install MindQuantum, follow: https://gitee.com/mindspore/mindquantum/blob/r0.1/README_CN.md
You can check the installed version via mindspore.__version__.
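For example:

import mindspore
print(mindspore.__version__)  # prints the installed version, e.g. 1.2.0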
III. Experiencing Quantum Neural Networks in Natural Language Processing
1. Environment preparation
# Import required packages
import numpy as np
import time
from projectq.ops import QubitOperator
import mindspore.ops as ops
import mindspore.dataset as ds
from mindspore import nn
from mindspore.train.callback import LossMonitor
from mindspore import Model
from mindquantum.nn import MindQuantumLayer
from mindquantum import Hamiltonian, Circuit, RX, RY, X, H, UN

# Data preprocessing: build a word dictionary and CBOW samples from a corpus
def GenerateWordDictAndSample(corpus, window=2):
    all_words = corpus.split()
    word_set = list(set(all_words))
    word_set.sort()
    word_dict = {w: i for i, w in enumerate(word_set)}
    sampling = []
    for index, word in enumerate(all_words[window:-window]):
        around = []
        for i in range(index, index + 2 * window + 1):
            if i != index + window:
                around.append(all_words[i])
        sampling.append([around, all_words[index + window]])
    return word_dict, sampling

word_dict, sample = GenerateWordDictAndSample("I love natural language processing")
print(word_dict)
print('word dict size: ', len(word_dict))
print('samples: ', sample)
print('number of samples: ', len(sample))
Output:
[WARNING] The simulator is currently running with 1 thread. If the simulation is slow, set OMP_NUM_THREADS to an appropriate number according to your model.
{'I': 0, 'language': 1, 'love': 2, 'natural': 3, 'processing': 4}
word dict size: 5
samples: [[['I', 'love', 'language', 'processing'], 'natural']]
number of samples: 1
As the output shows, the dictionary built from this sentence has size 5 and yields exactly one sample point.
2. Encoder circuit
def GenerateEncoderCircuit(n_qubits, prefix=''):
    if len(prefix) != 0 and prefix[-1] != '_':
        prefix += '_'
    circ = Circuit()
    for i in range(n_qubits):
        circ += RX(prefix + str(i)).on(i)
    return circ

GenerateEncoderCircuit(3, prefix='e')
Output:
RX(e_0|0)
RX(e_1|1)
RX(e_2|2)
We usually denote the two states of a two-level qubit by |0⟩ and |1⟩. By the superposition principle, a qubit can also be in a superposition of these two states:

|ψ⟩ = α|0⟩ + β|1⟩

An n-qubit quantum state lives in a 2ⁿ-dimensional Hilbert space. For the dictionary of 5 words above, we therefore need only ⌈log₂5⌉ = 3 qubits to complete the encoding, which illustrates one advantage of quantum computing.

For example, the word "love" in the dictionary above has label 2, whose binary form in the little-endian convention used here (qubit 0 is the least-significant bit) is 010. We simply set the encoder parameters e_0, e_1 and e_2 to 0, π and 0, respectively.
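Before verifying this with MindQuantum, here is a minimal NumPy sanity check (my own illustration, not part of the original tutorial) showing why the parameters (0, π, 0) prepare the basis state |010⟩: RX(π) maps |0⟩ to |1⟩ up to a global phase, while RX(0) leaves |0⟩ unchanged.

import numpy as np

def rx(theta):
    # RX(theta) = exp(-i * theta * X / 2) as a 2x2 matrix
    return np.array([[np.cos(theta / 2), -1j * np.sin(theta / 2)],
                     [-1j * np.sin(theta / 2), np.cos(theta / 2)]])

ket0 = np.array([1, 0])
q0, q1, q2 = rx(0) @ ket0, rx(np.pi) @ ket0, rx(0) @ ket0
# Little-endian ordering: index = q2*4 + q1*2 + q0, so |010> sits at index 2
full_state = np.kron(q2, np.kron(q1, q0))
print(np.round(np.abs(full_state) ** 2, 3))
# [0. 0. 1. 0. 0. 0. 0. 0.] -- all probability on |010>, i.e. label 2

The same check is done with MindQuantum's Evolution operator below.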
# Verify the encoding with the Evolution operator
from mindquantum.nn import generate_evolution_operator
from mindspore import context
from mindspore import Tensor

n_qubits = 3  # number of qubits of this quantum circuit
label = 2     # label to encode
label_bin = bin(label)[-1:1:-1].ljust(n_qubits, '0')  # binary form of label
label_array = np.array([int(i) * np.pi for i in label_bin]).astype(np.float32)  # parameter values of encoder
encoder = GenerateEncoderCircuit(n_qubits, prefix='e')       # encoder circuit
encoder_para_names = encoder.parameter_resolver().para_name  # parameter names of encoder

print("Label is: ", label)
print("Binary label is: ", label_bin)
print("Parameters of encoder is: \n", np.round(label_array, 5))
print("Encoder circuit is: \n", encoder)
print("Encoder parameter names are: \n", encoder_para_names)

context.set_context(mode=context.GRAPH_MODE, device_target="CPU")
# quantum state evolution operator
evol = generate_evolution_operator(param_names=encoder_para_names, circuit=encoder)
state = evol(Tensor(label_array))
state = state.asnumpy()
quantum_state = state[:, 0] + 1j * state[:, 1]
amp = np.round(np.abs(quantum_state) ** 2, 3)

print("Amplitude of quantum state is: \n", amp)
print("Label in quantum state is: ", np.argmax(amp))
Output:
Label is: 2
Binary label is: 010
Parameters of encoder is:
[0. 3.14159 0. ]
Encoder circuit is:
RX(e_0|0)
RX(e_1|1)
RX(e_2|2)
Encoder parameter names are:
['e_0', 'e_1', 'e_2']
Amplitude of quantum state is:
[0. 0. 1. 0. 0. 0. 0. 0.]
Label in quantum state is: 2
This verification shows that, for data with label 2, the position of the largest amplitude in the resulting quantum state is also 2, so the prepared quantum state is indeed an encoding of the input label. We wrap the process of generating parameter values from data into the following function.
def GenerateTrainData(sample, word_dict):
    n_qubits = int(np.ceil(np.log2(1 + max(word_dict.values()))))
    data_x = []
    data_y = []
    for around, center in sample:
        data_x.append([])
        for word in around:
            label = word_dict[word]
            label_bin = bin(label)[-1:1:-1].ljust(n_qubits, '0')
            label_array = [int(i) * np.pi for i in label_bin]
            data_x[-1].extend(label_array)
        data_y.append(word_dict[center])
    return np.array(data_x).astype(np.float32), np.array(data_y).astype(np.int32)

GenerateTrainData(sample, word_dict)
Output:
(array([[0. , 0. , 0. , 0. , 3.1415927, 0. ,
3.1415927, 0. , 0. , 0. , 0. , 3.1415927]],
dtype=float32),
array([3], dtype=int32))
As the result shows, the encodings of the 4 context words are concatenated into one longer vector, which is convenient for the downstream neural network to consume.
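To illustrate that layout, here is a small sketch (assuming the little-endian bit order used by GenerateTrainData above) that reshapes the 12-dimensional vector back into 4 words × 3 qubits and recovers each word's label:

x, y = GenerateTrainData(sample, word_dict)
params = x[0].reshape(4, 3)  # 4 context words, 3 encoder parameters each
# A parameter of pi marks a 1-bit; reverse the little-endian bit string to get the label
labels = [int(''.join('1' if v > 0 else '0' for v in row)[::-1], 2) for row in params]
print(labels)  # [0, 2, 1, 4] -> 'I', 'love', 'language', 'processing'
print(y)       # [3]         -> 'natural'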
3. Ansatz circuit
# Define the following function to generate the Ansatz circuit:
# each layer applies an RY rotation on every qubit, followed by CNOTs
# arranged in a brick-wall pattern between neighbouring qubits.
def GenerateAnsatzCircuit(n_qubits, layers, prefix=''):
    if len(prefix) != 0 and prefix[-1] != '_':
        prefix += '_'
    circ = Circuit()
    for l in range(layers):
        for i in range(n_qubits):
            circ += RY(prefix + str(l) + '_' + str(i)).on(i)
        for i in range(l % 2, n_qubits, 2):
            if i < n_qubits and i + 1 < n_qubits:
                circ += X.on(i + 1, i)
    return circ

GenerateAnsatzCircuit(5, 2, 'a')
Output:
RY(a_0_0|0)
RY(a_0_1|1)
RY(a_0_2|2)
RY(a_0_3|3)
RY(a_0_4|4)
X(1 <-: 0)
X(3 <-: 2)
RY(a_1_0|0)
RY(a_1_1|1)
RY(a_1_2|2)
RY(a_1_3|3)
RY(a_1_4|4)
X(2 <-: 1)
X(4 <-: 3)
4. Measurement

Each embedding dimension is read out as the expectation value of a string of Pauli-Z operators on the final state; the qubits included in the string for dimension i follow the binary form of i + 1 (starting from i + 1 skips the empty, identity string).
def GenerateEmbeddingHamiltonian(dims, n_qubits):
    hams = []
    for i in range(dims):
        s = ''
        for j, k in enumerate(bin(i + 1)[-1:1:-1]):
            if k == '1':
                s = s + 'Z' + str(j) + ' '
        hams.append(Hamiltonian(QubitOperator(s)))
    return hams

GenerateEmbeddingHamiltonian(5, 5)
Output:
[1.0 Z0, 1.0 Z1, 1.0 Z0 Z1, 1.0 Z2, 1.0 Z0 Z2]
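To make the readout concrete, here is a small NumPy sketch (my own illustration, not part of the original tutorial) of what measuring such Z-strings means: on the basis state |010⟩ from section 2, ⟨Z0⟩ is +1 because qubit 0 is |0⟩, while ⟨Z0 Z1⟩ is -1 because qubit 1 is |1⟩.

import numpy as np

Z = np.diag([1.0, -1.0])
I2 = np.eye(2)
state = np.zeros(8)
state[2] = 1.0                      # |010> in little-endian ordering

Z0 = np.kron(I2, np.kron(I2, Z))    # Z on qubit 0
Z0Z1 = np.kron(I2, np.kron(Z, Z))   # Z on qubits 0 and 1
print(state @ Z0 @ state)           # 1.0
print(state @ Z0Z1 @ state)         # -1.0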
5. Quantum word-embedding layer
Before running, execute export OMP_NUM_THREADS=4 in your terminal.
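If you prefer to set the thread count from Python rather than the shell, the following sketch should also work, assuming it runs before the quantum simulator is first used (an assumption, not verified against every MindQuantum version):

import os
os.environ['OMP_NUM_THREADS'] = '4'  # equivalent to the shell export above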
def QEmbedding(num_embedding, embedding_dim, window, layers, n_threads):
    n_qubits = int(np.ceil(np.log2(num_embedding)))
    hams = GenerateEmbeddingHamiltonian(embedding_dim, n_qubits)
    circ = UN(H, n_qubits)  # start with a layer of Hadamards
    encoder_param_name = []
    ansatz_param_name = []
    for w in range(2 * window):
        encoder = GenerateEncoderCircuit(n_qubits, 'Encoder_' + str(w))
        ansatz = GenerateAnsatzCircuit(n_qubits, layers, 'Ansatz_' + str(w))
        encoder.no_grad()
        circ += encoder
        circ += ansatz
        encoder_param_name.extend(list(encoder.parameter_resolver()))
        ansatz_param_name.extend(list(ansatz.parameter_resolver()))
    net = MindQuantumLayer(encoder_param_name,
                           ansatz_param_name,
                           circ,
                           hams,
                           n_threads=n_threads)
    return net

class CBOW(nn.Cell):
    def __init__(self, num_embedding, embedding_dim, window, layers, n_threads, hidden_dim):
        super(CBOW, self).__init__()
        self.embedding = QEmbedding(num_embedding, embedding_dim, window,
                                    layers, n_threads)
        self.dense1 = nn.Dense(embedding_dim, hidden_dim)
        self.dense2 = nn.Dense(hidden_dim, num_embedding)
        self.relu = ops.ReLU()

    def construct(self, x):
        embed = self.embedding(x)
        out = self.dense1(embed)
        out = self.relu(out)
        out = self.dense2(out)
        return out

class LossMonitorWithCollection(LossMonitor):
    def __init__(self, per_print_times=1):
        super(LossMonitorWithCollection, self).__init__(per_print_times)
        self.loss = []

    def begin(self, run_context):
        self.begin_time = time.time()

    def end(self, run_context):
        self.end_time = time.time()
        print('Total time used: {}'.format(self.end_time - self.begin_time))

    def epoch_begin(self, run_context):
        self.epoch_begin_time = time.time()

    def epoch_end(self, run_context):
        cb_params = run_context.original_args()
        self.epoch_end_time = time.time()
        if self._per_print_times != 0 and cb_params.cur_step_num % self._per_print_times == 0:
            print('')

    def step_end(self, run_context):
        cb_params = run_context.original_args()
        loss = cb_params.net_outputs
        if isinstance(loss, (tuple, list)):
            if isinstance(loss[0], Tensor) and isinstance(loss[0].asnumpy(), np.ndarray):
                loss = loss[0]
        if isinstance(loss, Tensor) and isinstance(loss.asnumpy(), np.ndarray):
            loss = np.mean(loss.asnumpy())
        cur_step_in_epoch = (cb_params.cur_step_num - 1) % cb_params.batch_num + 1
        if isinstance(loss, float) and (np.isnan(loss) or np.isinf(loss)):
            raise ValueError("epoch: {} step: {}. Invalid loss, terminating training.".format(
                cb_params.cur_epoch_num, cur_step_in_epoch))
        self.loss.append(loss)
        if self._per_print_times != 0 and cb_params.cur_step_num % self._per_print_times == 0:
            print("\repoch: %+3s step: %+3s time: %5.5s, loss is %5.5s" % (
                cb_params.cur_epoch_num, cur_step_in_epoch,
                time.time() - self.epoch_begin_time, loss), flush=True, end='')

import mindspore as ms
from mindspore import context
from mindspore import Tensor
context.set_context(mode=context.GRAPH_MODE, device_target="CPU")

corpus = """We are about to study the idea of a computational process.
Computational processes are abstract beings that inhabit computers.
As they evolve, processes manipulate other abstract things called data.
The evolution of a process is directed by a pattern of rules
called a program. People create programs to direct processes.
In effect, we conjure the spirits of the computer with our spells."""

ms.set_seed(42)
window_size = 2
embedding_dim = 10
hidden_dim = 128
word_dict, sample = GenerateWordDictAndSample(corpus, window=window_size)
train_x, train_y = GenerateTrainData(sample, word_dict)

train_loader = ds.NumpySlicesDataset({
    "around": train_x,
    "center": train_y
}, shuffle=False).batch(3)
net = CBOW(len(word_dict), embedding_dim, window_size, 3, 4, hidden_dim)
net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
net_opt = nn.Momentum(net.trainable_params(), 0.01, 0.9)
loss_monitor = LossMonitorWithCollection(500)
model = Model(net, net_loss, net_opt)
model.train(350, train_loader, callbacks=[loss_monitor], dataset_sink_mode=False)
Output:
epoch: 25 step: 20 time: 36.14, loss is 3.154
epoch: 50 step: 20 time: 36.51, loss is 2.945
epoch: 75 step: 20 time: 36.71, loss is 0.226
epoch: 100 step: 20 time: 36.56, loss is 0.016
Total time used: 3668.7517251968384
Plot the loss values collected during training:
import matplotlib.pyplot as plt

plt.plot(loss_monitor.loss, '.')
plt.xlabel('Steps')
plt.ylabel('Loss')
plt.show()
Print the trained parameters of the quantum circuit in the quantum embedding layer:
net.embedding.weight.asnumpy()
array([-6.4384632e-02, -1.2658586e-01, 1.0083634e-01, -1.3011757e-01,
1.4005195e-03, -1.9296107e-04, -7.9315618e-02, -2.9339856e-01,
7.6259784e-02, 2.9878360e-01, -1.3091319e-04, 6.8271365e-03,
-8.5563213e-02, -2.4168481e-01, -8.2548901e-02, 3.0743122e-01,
-7.8157615e-04, -3.2907310e-03, -1.4412615e-01, -1.9241245e-01,
-7.5561814e-02, -3.1189525e-03, 3.8330450e-03, -1.4486053e-04,
-4.8195502e-01, 5.3657538e-01, 3.8986996e-02, 1.7286544e-01,
-3.4090234e-03, -9.5573599e-03, -4.8208281e-01, 5.9604627e-01,
-9.7009525e-02, 1.8312852e-01, 9.5267012e-04, -1.2261710e-03,
3.4219343e-02, 8.0031365e-02, -4.5349425e-01, 3.7360430e-01,
8.9665735e-03, 2.1575980e-03, -2.3871836e-01, -2.4819574e-01,
-6.2781256e-01, 4.3640310e-01, -9.7688911e-03, -3.9542126e-03,
-2.4010721e-01, 4.8120108e-02, -5.6876510e-01, 4.3773583e-01,
4.7241263e-03, 1.4138421e-02, -1.2472854e-03, 1.1096644e-01,
7.1980711e-03, 7.3047012e-02, 2.0803964e-02, 1.1490706e-02,
8.6638138e-02, 2.0503466e-01, 4.7177267e-03, -1.8399477e-02,
1.1631225e-02, 2.0587114e-03, 7.6739892e-02, -6.3548386e-02,
1.7298019e-01, -1.9143591e-02, 4.1606693e-04, -9.2881303e-03],
dtype=float32)
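As a consistency check (a sketch based on the construction above, assuming the corpus yields a dictionary of 33 to 64 distinct words so that n_qubits = ⌈log₂|dict|⌉ = 6): the weight vector holds one value per trainable RY gate, i.e. 2·window blocks × layers × n_qubits gates.

layers = 3  # as passed to CBOW above
n_qubits = int(np.ceil(np.log2(len(word_dict))))
n_ansatz_params = 2 * window_size * layers * n_qubits
print(n_qubits, n_ansatz_params)  # 6 72 -- matching the 72 weights printed above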
6. Classical word-embedding layer
class CBOWClassical(nn.Cell):
    def __init__(self, num_embedding, embedding_dim, window, hidden_dim):
        super(CBOWClassical, self).__init__()
        self.dim = 2 * window * embedding_dim
        self.embedding = nn.Embedding(num_embedding, embedding_dim, True)
        self.dense1 = nn.Dense(self.dim, hidden_dim)
        self.dense2 = nn.Dense(hidden_dim, num_embedding)
        self.relu = ops.ReLU()
        self.reshape = ops.Reshape()

    def construct(self, x):
        embed = self.embedding(x)
        embed = self.reshape(embed, (-1, self.dim))
        out = self.dense1(embed)
        out = self.relu(out)
        out = self.dense2(out)
        return out

train_x = []
train_y = []
for i in sample:
    around, center = i
    train_y.append(word_dict[center])
    train_x.append([])
    for j in around:
        train_x[-1].append(word_dict[j])
train_x = np.array(train_x).astype(np.int32)
train_y = np.array(train_y).astype(np.int32)
print("train_x shape: ", train_x.shape)
print("train_y shape: ", train_y.shape)

train_loader = ds.NumpySlicesDataset({
    "around": train_x,
    "center": train_y
}, shuffle=False).batch(3)
net = CBOWClassical(len(word_dict), embedding_dim, window_size, hidden_dim)
net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
net_opt = nn.Momentum(net.trainable_params(), 0.01, 0.9)
loss_monitor = LossMonitorWithCollection(500)
model = Model(net, net_loss, net_opt)
model.train(350, train_loader, callbacks=[loss_monitor], dataset_sink_mode=False)
Output:
train_x shape: (58, 4)
train_y shape: (58,)
epoch: 25 step: 20 time: 0.077, loss is 3.156
epoch: 50 step: 20 time: 0.095, loss is 3.025
epoch: 75 step: 20 time: 0.115, loss is 2.996
epoch: 100 step: 20 time: 0.088, loss is 1.773
epoch: 125 step: 20 time: 0.083, loss is 0.172
epoch: 150 step: 20 time: 0.110, loss is 0.008
epoch: 175 step: 20 time: 0.086, loss is 0.003
epoch: 200 step: 20 time: 0.081, loss is 0.001
epoch: 225 step: 20 time: 0.081, loss is 0.000
epoch: 250 step: 20 time: 0.078, loss is 0.000
epoch: 275 step: 20 time: 0.079, loss is 0.000
epoch: 300 step: 20 time: 0.080, loss is 0.000
epoch: 325 step: 20 time: 0.078, loss is 0.000
epoch: 350 step: 20 time: 0.081, loss is 0.000
Total time used: 30.569124698638916
Convergence plot:
As the results show, the quantum word-embedding model obtained through quantum simulation also completes the embedding task well. When a dataset grows so large that it strains classical computing power, a quantum computer would be able to handle such problems with ease.