Key Error While Fine Tuning T5 For Summarization With Huggingface
I am trying to fine-tune the T5 transformer for summarization, but I am receiving a key error message: KeyError: 'Indexing with integers (to access backend Encoding for a given batch index) is not available when using Python based tokenizers'
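For reference, the error typically comes from a Dataset whose __getitem__ indexes the tokenizer output with an integer. The question's own code is not shown here, so the sketch below is only a reconstruction with placeholder names:

import torch
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")  # Python-based (slow) tokenizer
train_in = tokenizer(["summarize: some article text"], padding=True, truncation=True)

class SummaryDataset(torch.utils.data.Dataset):
    def __init__(self, encodings):
        self.encodings = encodings

    def __len__(self):
        return len(self.encodings["input_ids"])

    def __getitem__(self, index):
        # Indexing the BatchEncoding with an integer raises the KeyError
        # when the tokenizer has no fast (Rust) backend.
        return self.encodings[index]

SummaryDataset(train_in)[0]  # KeyError: 'Indexing with integers ...'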
Solution 1:
This is because the tokenizer returns a BatchEncoding object rather than a plain dictionary, and indexing a BatchEncoding with an integer is only supported when a fast (Rust-based) tokenizer backend is available. You have to index by key first and then by position.
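As a rough illustration (the t5-small checkpoint and the example texts are placeholders, not taken from the original post), the tokenizer output looks like this:

from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")  # Python-based (slow) tokenizer
enc = tokenizer(["summarize: first text", "summarize: second text"], padding=True)

type(enc)            # transformers.tokenization_utils_base.BatchEncoding
enc.data.keys()      # dict_keys(['input_ids', 'attention_mask'])
enc["input_ids"][0]  # works: index by key first, then by position
# enc[0]             # raises the KeyError above, because there is no fast backend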
You have to amend the __getitem__ method of your dataset class along the lines of:
import torch

class ForT5Dataset(torch.utils.data.Dataset):
    def __init__(self, inputs, targets):
        self.inputs = inputs
        self.targets = targets

    def __len__(self):
        return len(self.targets)

    def __getitem__(self, index):
        # Index by key ("input_ids") first, then by position, instead of
        # indexing the tokenizer output with an integer directly.
        input_ids = torch.tensor(self.inputs["input_ids"][index]).squeeze()
        target_ids = torch.tensor(self.targets["input_ids"][index]).squeeze()
        return {"input_ids": input_ids, "labels": target_ids}
and pass the tokenizer output's .data property when initializing, like:
train_ds = ForT5Dataset(train_in.data, train_out.data)
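For context, here is a minimal sketch of how this dataset might be wired into training. The variable names (train_articles, train_summaries), the t5-small checkpoint, the sequence lengths, and the training arguments are assumptions for illustration, not taken from the original post; a manual training loop would work just as well as Trainer:

import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration, Trainer, TrainingArguments

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Hypothetical source/target texts for summarization.
train_articles = ["summarize: The quick brown fox jumps over the lazy dog."]
train_summaries = ["A fox jumps over a dog."]

train_in = tokenizer(train_articles, padding="max_length", truncation=True, max_length=512)
train_out = tokenizer(train_summaries, padding="max_length", truncation=True, max_length=64)

# Pass the underlying dicts (.data) so __getitem__ indexes by key, then position.
train_ds = ForT5Dataset(train_in.data, train_out.data)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="t5-summarization", num_train_epochs=1),
    train_dataset=train_ds,
)
trainer.train()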