
Selecting Columns From A Pandas Dataframe Based On Columns Conditions

I want to select, into a new DataFrame, only the columns that contain the value 'C' in at least one row:

protein  1  2  3  4  5
prot1    C  M  D  F  A
prot2    C  D  A  M  A
prot3    C  C  D  F  A
prot4

Solution 1:

In [22]: df[['protein']].join(df[df.columns[df.eq('C').any()]])
Out[22]:
  protein  1  2  4
0   prot1  C  M  F
1   prot2  C  D  M
2   prot3  C  C  F
3   prot4  S  D  C
4   prot5  S  D  I
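
The one-liner above can be read as three steps: build a boolean mask per column, turn it into column labels, then join those columns back onto the `protein` column. The following is a minimal sketch of that breakdown. It assumes string column labels and uses only the three complete rows shown in the question, since the remaining rows are truncated there; the names `mask`, `cols` and `result` are illustrative.

```python
import pandas as pd

# Assumption: only the three complete rows from the question, string column labels.
df = pd.DataFrame({
    "protein": ["prot1", "prot2", "prot3"],
    "1": ["C", "C", "C"],
    "2": ["M", "D", "C"],
    "3": ["D", "A", "D"],
    "4": ["F", "M", "F"],
    "5": ["A", "A", "A"],
})

# Step 1: True for each column that holds 'C' in at least one row.
mask = df.eq("C").any()

# Step 2: the labels of the columns where the mask is True.
cols = df.columns[mask]

# Step 3: keep the label column and join the matching columns onto it.
result = df[["protein"]].join(df[cols])
print(result)
```

With this reduced data only columns `1` and `2` contain a 'C', so the result differs from the output shown above, which was produced from the full five-row frame.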

Solution 2:

Use:

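import numpy as np
import pandas as pd

np.random.seed(123)
# random 3x10 grid of single-character values
n = np.random.choice(['C','M','D', '-'], size=(3,10))
# overwrite the first column with row labels
n[:,0] = ['a','b','w']
foo = pd.DataFrame(n)
print(foo)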
   0  1  2  3  4  5  6  7  8  9
0  a  M  D  D  C  D  D  M  -  D
1  b  M  D  M  C  M  D  -  M  C
2  w  C  -  M  -  D  M  C  C  C

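# True for every column that contains 'C' in at least one row
mask = foo.eq('C').any()
# always keep column 0 (the row labels), even though it contains no 'C'
mask.loc[0] = True

# select the columns where the mask is True
print(foo.loc[:, mask])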
   0  1  4  7  8  9
0  a  M  C  M  -  D
1  b  M  C  -  M  C
2  w  C  -  C  C  C
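
If the same filter is needed for different values, the mask approach can be wrapped in a small helper. This is only a sketch: `columns_containing` and its `keep` parameter are hypothetical names, not part of pandas, and the sample frame reuses just the first five columns of the `foo` output printed above.

```python
import pandas as pd

def columns_containing(df, value, keep=()):
    """Return the columns of df that hold `value` in any row, plus those listed in `keep`."""
    # True for each column with at least one matching cell.
    mask = df.eq(value).any()
    # Force-keep the requested columns (for example a label column).
    mask = mask | mask.index.isin(list(keep))
    return df.loc[:, mask]

# First five columns of the frame printed above, typed in by hand.
foo = pd.DataFrame([
    ["a", "M", "D", "D", "C"],
    ["b", "M", "D", "M", "C"],
    ["w", "C", "-", "M", "-"],
])

out = columns_containing(foo, "C", keep=[0])
print(out)
```

Passing `keep=[0]` plays the same role as `mask.loc[0] = True` in the answer; leaving it out returns only the columns that actually contain the value.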
