
Key Error While Fine-Tuning T5 For Summarization With Huggingface

I am trying to fine-tune the T5 transformer for summarization, but I am receiving a key error message: KeyError: 'Indexing with integers (to access backend Encoding for a given batch index) is not available when using Python based tokenizers'

Solution 1:

This is because the tokenizer does not return a plain dict of lists; it returns a BatchEncoding object whose data lives under keys such as input_ids and attention_mask, so indexing it with an integer fails for Python-based tokenizers.
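As a minimal sketch of the structure being referred to (the t5-small checkpoint and the example text are placeholders, not from the question):

import torch
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
batch = tokenizer(["summarize: some article text"], padding=True, truncation=True)

print(type(batch))   # <class 'transformers.tokenization_utils_base.BatchEncoding'>
print(batch.keys())  # dict_keys(['input_ids', 'attention_mask'])
print(batch.data)    # the underlying plain dict of lists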

You have to amend the __getitem__ method of your dataset class along the lines of

import torch


class ForT5Dataset(torch.utils.data.Dataset):
    def __init__(self, inputs, targets):
        self.inputs = inputs
        self.targets = targets

    def __len__(self):
        return len(self.targets)

    def __getitem__(self, index):
        # Index into the plain lists held by the tokenizer output
        input_ids = torch.tensor(self.inputs["input_ids"][index]).squeeze()
        target_ids = torch.tensor(self.targets["input_ids"][index]).squeeze()

        return {"input_ids": input_ids, "labels": target_ids}

and pass the .data attribute when initializing, like: train_ds = ForT5Dataset(train_in.data, train_out.data).
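As a rough usage sketch of how the pieces fit together (the example articles, summaries, max lengths, and the t5-small checkpoint are assumptions, not from the question):

import torch
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")

# Hypothetical raw data; replace with your own articles and reference summaries.
articles = ["summarize: " + t for t in ["first article text", "second article text"]]
summaries = ["first summary", "second summary"]

train_in = tokenizer(articles, padding="max_length", truncation=True, max_length=512)
train_out = tokenizer(summaries, padding="max_length", truncation=True, max_length=64)

# Pass the underlying dicts so __getitem__ indexes plain lists, not a BatchEncoding.
train_ds = ForT5Dataset(train_in.data, train_out.data)

print(train_ds[0]["input_ids"].shape, train_ds[0]["labels"].shape)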
