Using Python To Remove Duplicated Contents In Cells In Excel
I am trying to use Python to remove the duplicated contents in the cells of an Excel Spreadsheet. The data is in 1 column, in the original file. (names separated by “, ” in eac
Solution 1:
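dict.fromkeys() takes a sequence of items. A string is treated as a sequence of characters, so passing the cell text directly removes duplicate characters rather than duplicate names. Split the text into a list of names first.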
Try this:
for row_index in range(old_sheet.nrows):
    column_con = old_sheet.cell(row_index, 0).value
    # First split the cell text into a sequence of names
    column_con = tuple(column_con.split(', '))
    # dict keys are unique, so duplicate names are dropped here
    aaa = dict.fromkeys(column_con).keys()
    # Since aaa is a collection of keys, join them back into a string
    aaa = ', '.join(aaa)
    new_sheet.write(row_index, 0, aaa)
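To try the idea without a workbook, here is a minimal sketch of the same split / dict.fromkeys() / join steps as a standalone function. The name dedupe_names and the sample names are made up for illustration, and it assumes Python 3.7 or later, where dict keeps insertion order, so the first occurrence of each name stays in place. It also strips whitespace around each name, which the loop above does not do.

```python
def dedupe_names(cell_text):
    """Return cell_text with repeated names removed, keeping first-seen order."""
    # Strip each piece so "Ann" and " Ann" count as the same name.
    names = [name.strip() for name in cell_text.split(",")]
    # Drop empty pieces left by trailing or doubled commas.
    names = [name for name in names if name]
    # dict keys are unique and keep insertion order (Python 3.7+).
    return ", ".join(dict.fromkeys(names))


# Passing the raw string deduplicates characters, not names.
chars = list(dict.fromkeys("Ann, Ann"))
print(chars)

# Splitting first deduplicates whole names.
deduped = dedupe_names("Ann, Bob, Ann, Cid, Bob")
print(deduped)
```

Inside the loop, the cell value would be passed through this function before calling new_sheet.write().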
Solution 2:
Use a set to store the names you read from each Excel cell. A set drops duplicates, but it does not keep the original order of the names.
import xlrd
import xlwt

# Note: xlrd 2.0 and later reads only .xls files; .xlsx needs xlrd < 2.0
data = xlrd.open_workbook("C:\\Users\\I307658\\Desktop\\test.xlsx")
old_sheet = data.sheet_by_index(0)
new_file = xlwt.Workbook(encoding='utf-8', style_compression=0)
new_sheet = new_file.add_sheet('Result', cell_overwrite_ok=True)
for row_index in range(old_sheet.nrows):
    column_con = old_sheet.cell(row_index, 0).value
    print(column_con)
    # Split on ", " so names do not keep a leading space
    aaa = set(column_con.split(", "))
    print(', '.join(aaa))
    new_sheet.write(row_index, 0, ', '.join(aaa))
# Save once, after all rows have been written
new_file.save("C:\\Users\\I307658\\Desktop\\Book New 1.xls")
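Because a set has no defined order, the names can come out in a different order from one cell to the next. If a predictable order matters more than the original order, one option is to sort the set before joining. This is only a sketch: the function name dedupe_names_sorted and the sample names are hypothetical, not part of the answer above.

```python
def dedupe_names_sorted(cell_text):
    """Return the unique names in cell_text, joined in alphabetical order."""
    # A set comprehension drops duplicates; strip() removes stray spaces.
    names = {name.strip() for name in cell_text.split(",")}
    # Ignore the empty string produced by an empty cell or a trailing comma.
    names.discard("")
    # sorted() gives a repeatable order, which a bare set does not.
    return ", ".join(sorted(names))


result = dedupe_names_sorted("Cid, Ann, Bob, Ann, Cid")
print(result)
```

The trade-off is that the original order of the names in the cell is lost.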