
The comment in the Bi-LSTM (Attention) model has an issue. #84


Description

@tmracy

The comment `# output : [batch_size, len_seq, n_hidden]` should be corrected to `# output : [batch_size, len_seq, n_hidden * 2]` because the model is bidirectional. A bidirectional LSTM concatenates the forward and backward hidden states, so the hidden dimension of its output is doubled. The correct shape after the permutation is therefore `[batch_size, len_seq, n_hidden * 2]`.
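For reference, a minimal sketch demonstrating the doubling with PyTorch's `nn.LSTM` (the dimensions below are illustrative, not the tutorial's actual hyperparameters):

```python
import torch
import torch.nn as nn

batch_size, len_seq, embedding_dim, n_hidden = 4, 7, 10, 5

# bidirectional=True doubles the last dimension of the output
lstm = nn.LSTM(embedding_dim, n_hidden, bidirectional=True)

# nn.LSTM defaults to [len_seq, batch_size, embedding_dim] input
x = torch.randn(len_seq, batch_size, embedding_dim)
output, (h_n, c_n) = lstm(x)

print(output.shape)  # torch.Size([7, 4, 10]) = [len_seq, batch_size, n_hidden * 2]

# after permuting to batch-first, as the tutorial's comment describes:
output = output.permute(1, 0, 2)
print(output.shape)  # torch.Size([4, 7, 10]) = [batch_size, len_seq, n_hidden * 2]
```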
