Building a bidirectional LSTM

import tensorflow as tf

dims, layers = 32, 2
# state_below (the 3D input tensor) is assumed to be defined already; see Parameters below
# Create the forward and backward cells
lstm_fw_cell = tf.nn.rnn_cell.BasicLSTMCell(dims, forget_bias=1.0)
lstm_bw_cell = tf.nn.rnn_cell.BasicLSTMCell(dims, forget_bias=1.0)
# Pass lstm_fw_cell / lstm_bw_cell directly to tf.nn.bidirectional_rnn
# if only a single layer is needed
lstm_fw_multicell = tf.nn.rnn_cell.MultiRNNCell([lstm_fw_cell]*layers)
lstm_bw_multicell = tf.nn.rnn_cell.MultiRNNCell([lstm_bw_cell]*layers)

# tf.nn.bidirectional_rnn takes a list of tensors with shape
# [batch_size, input_size], so split the input into discrete timesteps.
_X = tf.unpack(state_below, axis=1)
# state_fw and state_bw are the final states of the forwards/backwards LSTM, respectively
outputs, state_fw, state_bw = tf.nn.bidirectional_rnn(lstm_fw_multicell, lstm_bw_multicell, _X, dtype='float32')
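
Each element of outputs has shape [batch_size, 2 * dims], because the forward and backward activations are concatenated at every timestep. The following is a minimal, hypothetical sketch of one way to consume them for classification; n_classes, W_out and b_out are illustrative names that are not part of the original example:

# Hypothetical classification head on top of the bidirectional outputs.
# n_classes, W_out and b_out are illustrative; they are not defined above.
n_classes = 10
W_out = tf.get_variable('W_out', [2 * dims, n_classes])
b_out = tf.get_variable('b_out', [n_classes])
# outputs is a Python list of [batch_size, 2 * dims] tensors, one per timestep;
# take the last timestep and project it to class logits
logits = tf.matmul(outputs[-1], W_out) + b_out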

Parameters

  • state_below is a 3D tensor with dimensions [batch_size, maximum sequence index, dims]. It comes from a previous operation, such as a word-embedding lookup (a minimal sketch of such a lookup follows this list).
  • dims is the number of hidden units.
  • layers can be set to a value greater than 1 to create a stacked LSTM network.
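
For concreteness, state_below could be produced by a word-embedding lookup along the following lines; vocab_size, max_seq_len, the word_ids placeholder and the embeddings variable are illustrative assumptions rather than part of the original example (here the embedding width doubles as dims):

# Hypothetical construction of state_below via an embedding lookup.
# vocab_size, max_seq_len and word_ids are illustrative assumptions.
vocab_size, max_seq_len = 10000, 50
word_ids = tf.placeholder(tf.int32, [None, max_seq_len])
embeddings = tf.get_variable('embeddings', [vocab_size, dims])
# Result has shape [batch_size, max_seq_len, dims]
state_below = tf.nn.embedding_lookup(embeddings, word_ids)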

Notes

  • tf.unpack may not be able to determine the size of the given axis (if that is the case, use the num argument).
  • It can help to add an extra weight + bias multiplication beneath the LSTM (e.g. tf.matmul(state_below, U) + b); a sketch of this, together with the num argument, follows this list.
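
As a hedged sketch of both notes, the extra projection can be applied by folding the time dimension into the batch dimension, and num can be passed explicitly when tf.unpack cannot infer the sequence length. input_dim, max_seq_len, U and b are illustrative names that are not defined in the original example, and state_below is assumed here to have width input_dim before the projection:

# Hypothetical weight + bias projection beneath the LSTM (second note).
# input_dim, max_seq_len, U and b are illustrative assumptions.
input_dim, max_seq_len = 128, 50
U = tf.get_variable('U', [input_dim, dims])
b = tf.get_variable('b', [dims])
# Fold time into the batch dimension so one matmul covers every timestep
flat = tf.reshape(state_below, [-1, input_dim])
projected = tf.reshape(tf.matmul(flat, U) + b, [-1, max_seq_len, dims])
# First note: pass num explicitly when the axis size cannot be inferred
_X = tf.unpack(projected, num=max_seq_len, axis=1)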