LSTMCell

class braintrace.nn.LSTMCell(in_size, out_size, w_init=XavierNormal(scale=1.0, unit=1), b_init=ZeroInit(unit=1), state_init=ZeroInit(unit=1), activation='tanh', name=None)

Long short-term memory (LSTM) RNN core.

The implementation is based on (Zaremba et al., 2014) [1]. Given \(x_t\) and the previous state \((h_{t-1}, c_{t-1})\), the core computes

\[\begin{split}\begin{array}{ll} i_t = \sigma(W_{ii} x_t + W_{hi} h_{t-1} + b_i) \\ f_t = \sigma(W_{if} x_t + W_{hf} h_{t-1} + b_f) \\ g_t = \tanh(W_{ig} x_t + W_{hg} h_{t-1} + b_g) \\ o_t = \sigma(W_{io} x_t + W_{ho} h_{t-1} + b_o) \\ c_t = f_t c_{t-1} + i_t g_t \\ h_t = o_t \tanh(c_t) \end{array}\end{split}\]

where \(i_t\), \(f_t\), \(o_t\) are input, forget and output gate activations, and \(g_t\) is a vector of cell updates.

The output is equal to the new hidden state, \(h_t\).
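The equations above can be sketched in plain NumPy. This is a minimal illustration, not braintrace's implementation: the helper name `lstm_step`, the fused weight layout (`W_x`, `W_h`, and `b` stacking the four gates in the order i, f, g, o), and the initialization scale are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W_x, W_h, b):
    """One LSTM step (hypothetical helper; gates stacked as [i, f, g, o])."""
    z = x @ W_x + h_prev @ W_h + b           # shape (batch, 4 * hidden)
    i, f, g, o = np.split(z, 4, axis=-1)     # one slice per gate
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)                           # candidate cell update
    c = f * c_prev + i * g                   # new cell state c_t
    h = o * np.tanh(c)                       # new hidden state h_t (the output)
    return h, c

rng = np.random.default_rng(0)
batch, in_size, hidden = 20, 256, 512
W_x = rng.normal(scale=0.05, size=(in_size, 4 * hidden))
W_h = rng.normal(scale=0.05, size=(hidden, 4 * hidden))
b = np.zeros(4 * hidden)

x = rng.normal(size=(batch, in_size))
h0 = np.zeros((batch, hidden))
c0 = np.zeros((batch, hidden))

h1, c1 = lstm_step(x, h0, c0, W_x, W_h, b)
print(h1.shape)  # (20, 512)
```

Because \(o_t\) lies in (0, 1) and \(\tanh\) is bounded by 1 in magnitude, every entry of \(h_t\) stays within [-1, 1] regardless of the input scale.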

Parameters:

in_size (Union[int, Sequence[int], integer, Sequence[integer]]) – The dimension of the input vector.
out_size (Union[int, Sequence[int], integer, Sequence[integer]]) – The number of hidden units in the node.
w_init (Union[Array, ndarray, bool, number, bool, int, float, complex, Quantity, Callable]) – The input weight initializer. Default is XavierNormal().
b_init (Union[Array, ndarray, bool, number, bool, int, float, complex, Quantity, Callable]) – The bias weight initializer. Default is ZeroInit().
state_init (Union[Array, ndarray, bool, number, bool, int, float, complex, Quantity, Callable]) – The state initializer. Default is ZeroInit().
activation (Union[str, Callable]) – The activation function. It can be a string or a callable function. Default is 'tanh'.
name (str) – The name of the module. Default is None.

Notes

Forget gate initialization: Following (Jozefowicz et al., 2015) [2], we add 1.0 to \(b_f\) after initialization in order to reduce the scale of forgetting at the beginning of training.
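To see why the offset helps, consider the forget gate at the very start of training, when its pre-activation is close to zero (zero bias, small weights). The sketch below uses only the standard library; treating the pre-activation as exactly zero and holding the gate constant for 10 steps are simplifying assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Assumed: W_if x_t + W_hf h_{t-1} is approximately 0 early in training.
pre_activation = 0.0

f_plain = sigmoid(pre_activation + 0.0)    # b_f left at its zero init
f_offset = sigmoid(pre_activation + 1.0)   # b_f after adding 1.0

# Fraction of c_{t-1} that survives 10 steps if the gate stays at this value.
keep_plain = f_plain ** 10
keep_offset = f_offset ** 10

print(round(f_plain, 3), round(f_offset, 3))  # 0.5 0.731
```

With a zero bias the gate discards half of the cell state at every step, so almost nothing survives 10 steps (about 0.001). With the offset, roughly 0.04 survives, which gives gradients a longer path through time before the network has learned when to forget.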

References

Examples

>>> import braintrace
>>> import brainstate
>>>
>>> # Create an LSTM cell
>>> lstm_cell = braintrace.nn.LSTMCell(in_size=256, out_size=512)
>>> lstm_cell.init_state(batch_size=20)
>>>
>>> # Process one time step for a batch of 20 inputs
>>> x = brainstate.random.randn(20, 256)
>>> h = lstm_cell(x)
>>> print(h.shape)
(20, 512)

init_state(batch_size=None, **kwargs): State initialization function.

reset_state(batch_size=None, **kwargs): State resetting function.

update(x)

Advance the cell by one time step.

Updates both the cell state c and the hidden state h in place.

Parameters: x (ArrayLike) – Input for the current step, of shape (..., in_size).
Returns: The updated hidden state h, of shape (..., out_size).
Return type: ArrayLike
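Since each call advances the stored state by one step, processing a sequence amounts to calling the update once per time step. The sketch below shows that pattern with `ToyLSTMCell`, a hypothetical NumPy stand-in written for this example; it mimics the `init_state` / `update` calling convention described here but is not braintrace code, and its weight layout and initialization are assumptions.

```python
import numpy as np

class ToyLSTMCell:
    """Hypothetical stand-in for a stateful LSTM cell; not braintrace code."""

    def __init__(self, in_size, out_size, seed=0):
        rng = np.random.default_rng(seed)
        self.out_size = out_size
        # Assumed layout: one fused matrix, gates stacked as [i, f, g, o].
        self.W = rng.normal(scale=0.1, size=(in_size + out_size, 4 * out_size))
        self.b = np.zeros(4 * out_size)
        self.b[out_size:2 * out_size] += 1.0  # forget-gate offset of 1.0
        self.h = None
        self.c = None

    def init_state(self, batch_size):
        self.h = np.zeros((batch_size, self.out_size))
        self.c = np.zeros((batch_size, self.out_size))

    def update(self, x):
        z = np.concatenate([x, self.h], axis=-1) @ self.W + self.b
        i, f, g, o = np.split(z, 4, axis=-1)
        sig = lambda v: 1.0 / (1.0 + np.exp(-v))
        self.c = sig(f) * self.c + sig(i) * np.tanh(g)  # overwrite stored c
        self.h = sig(o) * np.tanh(self.c)               # overwrite stored h
        return self.h

cell = ToyLSTMCell(in_size=8, out_size=16)
cell.init_state(batch_size=4)

xs = np.random.default_rng(1).normal(size=(5, 4, 8))   # (time, batch, in_size)
outputs = np.stack([cell.update(x_t) for x_t in xs])   # one update per step
print(outputs.shape)  # (5, 4, 16)
```

After the loop, the cell's stored hidden state equals the last output, so a new sequence should start from a fresh or reset state rather than continuing from the old one.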
