Counting Columns in a CSV File with Python
I want to count the email accounts of males and females separately. The code I wrote is not working properly, so can anyone help me with this, please? Thank you in advance.
Solution 1:
Use pandas:
import pandas as pd
df = pd.read_csv('your_csv_file.csv')  # read in the CSV
df['domain'] = df['email'].apply(lambda x: x[x.index('@')+1:])  # column with just the domain
male = {}  # set up male dictionary
female = {}  # set up female dictionary
# iterate over the unique domains and store the male/female counts in the dictionaries
for domain in df['domain'].unique():
    male[domain] = df[(df['gender']=='M') & (df['domain']==domain)].shape[0]
    female[domain] = df[(df['gender']=='F') & (df['domain']==domain)].shape[0]
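As a possible alternative to the explicit loop, the same per-domain counts can be produced in one step with pd.crosstab. This is only a sketch: the small DataFrame below is made-up sample data standing in for the CSV, and the column names 'gender' and 'email' are assumed to match the ones used above.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV file; the column names
# 'gender' and 'email' are assumptions carried over from the solution above.
df = pd.DataFrame({
    'gender': ['M', 'F', 'F', 'M', 'F'],
    'email': ['a@gmail.com', 'b@gmail.com', 'c@yahoo.com',
              'd@yahoo.com', 'e@gmail.com'],
})

# Treat everything after the last '@' as the domain.
df['domain'] = df['email'].str.split('@').str[-1]

# One row per domain, one column per gender; each cell is a count of accounts.
counts = pd.crosstab(df['domain'], df['gender'])
print(counts)

# Plain dictionaries keyed by domain, comparable to `male` and `female` above.
male = counts['M'].to_dict()
female = counts['F'].to_dict()
```

The crosstab result is itself a DataFrame, so it can also be written back out with to_csv if a table is more useful than two dictionaries.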
Solution 2:
This can be done in pandas. As your columns are unnamed, use header=None when reading your CSV and access the columns by number:
import pandas as pd
df = pd.read_csv('1000 Records.csv', header=None)
df['mailhosts'] = df[6].str.split('@').str[-1]
gp = df.groupby(5)
# count e-mail accounts per gender:
print('Female Email Accounts:', gp.get_group('F')['mailhosts'].value_counts())
print('Male Email Accounts:', gp.get_group('M')['mailhosts'].value_counts())
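To see how header=None and the numeric column labels behave, here is a minimal self-contained sketch. The four rows are invented sample data, and the layout (gender in column 5, email in column 6) is assumed from the answer above. It also counts both genders in a single groupby call instead of calling get_group once per gender.

```python
import io
import pandas as pd

# Invented sample rows; the real file is assumed to have the gender in
# column 5 and the email address in column 6, with no header row.
text = (
    "1,Ann,Lee,x,y,F,ann@gmail.com\n"
    "2,Bob,Ray,x,y,M,bob@yahoo.com\n"
    "3,Cat,Fox,x,y,F,cat@gmail.com\n"
    "4,Dan,Kim,x,y,M,dan@gmail.com\n"
)

# header=None makes pandas label the columns 0, 1, 2, ... instead of
# consuming the first row as column names.
df = pd.read_csv(io.StringIO(text), header=None)
df['mailhosts'] = df[6].str.split('@').str[-1]

# Count mail hosts within each gender; the result is indexed by (gender, host).
counts = df.groupby(5)['mailhosts'].value_counts()
print(counts)

# Optional: pivot into a gender-by-host table, filling missing pairs with 0.
table = counts.unstack(fill_value=0)
print(table)
```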
Solution 3:
Here is a solution that counts male and female accounts by domain using just standard Python modules:
import csv
from collections import Counter
males = Counter()
females = Counter()
# newline='' is what the csv module documentation recommends when opening files
with open('1000 Records.csv', newline='') as f:
    records = csv.reader(f)
    for record in records:
        # column 6 holds the email address, column 5 the gender
        _, domain = record[6].split('@')
        gender = record[5]
        if gender.lower() == 'm':
            males.update((domain.lower(),))
        else:
            # every row that is not 'm' is counted as female
            females.update((domain.lower(),))
print('Total male accounts:', sum(males.values()))
print('Total male accounts by domain')
for k, v in males.items():
    print(k, v)
print('Total female accounts:', sum(females.values()))
print('Total female accounts by domain')
for k, v in females.items():
    print(k, v)
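The loop above assumes that every row has exactly one '@' in column 6 and that every gender other than 'm' is female. If the file might contain a header row, blank genders, or malformed addresses, a slightly more defensive variant could look like the sketch below. The function name count_by_gender, its parameters, and the sample rows are hypothetical and are not part of the original answer.

```python
import csv
import io
from collections import Counter

def count_by_gender(lines, gender_col=5, email_col=6):
    """Return {gender: Counter of domains}; rows without a usable email are skipped."""
    counts = {}
    for record in csv.reader(lines):
        if len(record) <= max(gender_col, email_col):
            continue  # short or blank row
        # rpartition splits on the last '@' and does not raise if it is missing
        _, at, domain = record[email_col].rpartition('@')
        if not at or not domain:
            continue  # not an email address (for example, a header cell)
        gender = record[gender_col].strip().upper()
        counts.setdefault(gender, Counter())[domain.lower()] += 1
    return counts

# Invented sample rows, including mixed case and a row with no gender.
sample = io.StringIO(
    "1,Ann,Lee,x,y,F,ann@Gmail.com\n"
    "2,Bob,Ray,x,y,M,bob@yahoo.com\n"
    "3,Cat,Fox,x,y,f,cat@gmail.com\n"
    "4,Dan,Kim,x,y,,dan@gmail.com\n"
)
result = count_by_gender(sample)
for gender, domains in result.items():
    print(repr(gender), sum(domains.values()), dict(domains))
```

Keeping one Counter per gender value, rather than an if/else, means unexpected values end up under their own key instead of being silently counted as female.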