[Solved] Pytorch-transformers Error: AttributeError: 'str' object has no attribute 'shape'

Error Message:
AttributeError: 'str' object has no attribute 'shape'

"""Encoding Chinese text using bert-chinese pre-training model """
# Introduce torch model
import torch
# Introduce the neural network model in the torch model
import torch.nn as nn

# 1. Load Google's pre-trained bert-base-chinese model (trained on Chinese text) via torch.hub (PyTorch's hub for pre-trained models)
# These parameters are fixed
model = torch.hub.load('huggingface/pytorch-transformers', 'model', 'bert-base-chinese')

# 2. Load the matching tokenizer, which maps each Chinese character to an integer id
tokenizer = torch.hub.load('huggingface/pytorch-transformers', 'tokenizer', 'bert-base-chinese')


# 3. Map Chinese text to its BERT-encoded tensor representation
def get_bert_encode_for_single(text):
    """
    description: encode Chinese text with the bert-base-chinese pre-trained model,
                 i.e. map the text to its BERT-encoded tensor representation
    :param text: the text to encode
    :return: the BERT-encoded tensor representation of the text
    """
    # 3.1 Map each character to an integer id with the tokenizer.
    # Note: bert's tokenizer wraps the result in start and end markers, 101 ([CLS]) and 102 ([SEP]).
    # Those markers matter when encoding pairs of texts, but not here, so slice them off with [1:-1]
    indexed_tokens = tokenizer.encode(text)[1:-1]
    # 3.2 Wrap the id list in a batch dimension and convert it to a tensor
    tokens_tensor = torch.tensor([indexed_tokens])
    print(tokens_tensor)
    # 3.3 Run the model without tracking gradients (inference only, no autograd needed)
    with torch.no_grad():
        # 3.4 Call the model and unpack the hidden output -- this line fails
        # with recent transformers releases (see the reason below)
        encoded_layers, _ = model(tokens_tensor)
    print(f"encoded_layers.shape={encoded_layers.shape}")
    return encoded_layers

Reason:

The return type of the model call changed with the 3.x releases of transformers: by default the model now returns a dict-like ModelOutput object instead of a plain tuple of tensors. Tuple-unpacking that object, as in encoded_layers, _ = model(tokens_tensor), assigns its string field names rather than the tensors, and calling .shape on a string raises the AttributeError.

To get an actual tuple of tensors back, whose first element is the last hidden state, pass the additional parameter return_dict=False when calling model().
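To see this directly, inspect the raw output. A minimal sketch, assuming a transformers release where the dict-style output is the default; the field names shown are the standard ones for BertModel:

tokens_tensor = torch.tensor([tokenizer.encode("你好")[1:-1]])
with torch.no_grad():
    output = model(tokens_tensor)  # a dict-like ModelOutput, not a tuple
print(list(output))                # ['last_hidden_state', 'pooler_output']
# so `encoded_layers, _ = model(...)` assigns the string 'last_hidden_state'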

Solution:

 encoded_layers, _ = model(tokens_tensor, return_dict=False)
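Alternatively, keep the default dict-style output and read the tensor by its named attribute instead of unpacking; for BertModel this is equivalent:

 with torch.no_grad():
     output = model(tokens_tensor)
 encoded_layers = output.last_hidden_state  # tensor of shape (1, sequence_length, hidden_size)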
