Comparing/extracting Data From Matrices Using Python (2.6.1)
Solution 1:
To read the data, you should be able to use numpy.genfromtxt. See the documentation; there is a ton of functionality within this function. To read your example above, you might do:
from numpy import genfromtxt
rdata = genfromtxt('AllcorrR.csv', skip_header=1)[:,1:]
Pdata = genfromtxt('AllcorrP.csv', skip_header=1)[:,1:]
The [:,1:] is to ignore the first column of data when read in. The function doesn't have an input to "ignore the first x columns" like it does for rows (via skip_header); the closest is usecols, which takes the list of columns to keep. Not sure why they didn't implement this; it always bugged me.
This reads the data for both r and P. Then you can filter the data pretty easily. You could read in the first row and column separately to get the headings. Or, as the genfromtxt documentation describes, you could also name the columns (which creates a structured array).
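Here is a minimal sketch of reading the headings separately from the numbers. The sample text is made up and stands in for AllcorrP.csv. It assumes whitespace-separated values and a header row with no label above the row-name column, so adjust the splitting if your file differs.

```python
import numpy as np
from io import StringIO

# Made-up stand-in for AllcorrP.csv (assumed layout: header row of column
# names, then one row name followed by the values on each line).
text = u"""A B C
A 0.00 0.03 0.40
B 0.03 0.00 0.01
C 0.40 0.01 0.00
"""

lines = text.splitlines()

# First row holds the column headings.
column_headers = lines[0].split()

# First token of every remaining row is the row heading.
row_headers = [line.split()[0] for line in lines[1:]]

# Numeric part: skip the header row, then drop the (non-numeric) first column.
Pdata = np.genfromtxt(StringIO(text), skip_header=1)[:, 1:]
```

With the headings in plain lists, an index pair (i, j) into Pdata maps back to row_headers[i] and column_headers[j].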
To find the indices (values) where P is less than 0.05, you can simply do a comparison and numpy automagically creates a boolean array for you:
print Pdata < 0.05
This can be used as an index into rdata (make sure there are the same number of rows/columns):
print rdata[Pdata < 0.05]
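If you also want to know which cells passed the test, not just their values, the boolean array can be turned into row/column positions. A small sketch with made-up 3x3 matrices in place of the real files:

```python
import numpy as np

# Made-up matrices standing in for the r and P tables (same shape).
rdata = np.array([[1.00, 0.80, 0.10],
                  [0.80, 1.00, -0.60],
                  [0.10, -0.60, 1.00]])
Pdata = np.array([[0.00, 0.03, 0.40],
                  [0.03, 0.00, 0.01],
                  [0.40, 0.01, 0.00]])

# Boolean array: True wherever the P value is below the threshold.
mask = Pdata < 0.05

# Row and column positions of the True entries, in row-major order.
rows, cols = np.nonzero(mask)
pairs = list(zip(rows.tolist(), cols.tolist()))

# The matching r values, in the same order as the pairs.
significant_r = rdata[mask]
```

Here pairs and significant_r line up element by element, so pairs[k] is the position of significant_r[k].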
Solution 2:
You can do something like this to get a list of tuples, containing the row and column headers, and the r and P values of the data elements you're interested in:
infile_r = open('AllcorrR.csv', 'r')
infile_p = open('AllcorrP.csv', 'r')
# Read the first line of each file.
line_r = infile_r.readline()
line_p = infile_p.readline()
# Set the separator depending on the file format.
SEPARATOR = None  # Elements separated by whitespace.
column_headers = line_r.split(SEPARATOR)
significant = []
# Read the rest of the lines.
for line_r in infile_r:
    line_p = infile_p.readline()
    tokens_r = line_r.split(SEPARATOR)
    tokens_p = line_p.split(SEPARATOR)
    row_header = tokens_r[0]
    values_r = [float(v) for v in tokens_r[1:]]
    values_p = [float(v) for v in tokens_p[1:]]
    significant.extend([(row_header, column_header, r, p) for column_header, r, p in zip(column_headers, values_r, values_p) if p < 0.05])
print significant
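To try this without the real files, the same steps can be wrapped in a function that takes any two file-like objects. This is only a sketch: the function name find_significant, its threshold parameter and the sample data are all made up for illustration, and it assumes both files share the same layout (header row, then row name plus values).

```python
from io import StringIO

def find_significant(file_r, file_p, threshold=0.05, separator=None):
    """Return (row_header, column_header, r, p) tuples where p < threshold."""
    # The header line of the r file supplies the column names.
    column_headers = next(file_r).split(separator)
    next(file_p)  # Skip the header line of the P file.
    significant = []
    # Walk both files in step, one data row at a time.
    for line_r, line_p in zip(file_r, file_p):
        tokens_r = line_r.split(separator)
        tokens_p = line_p.split(separator)
        if not tokens_r:
            continue  # Ignore blank lines.
        row_header = tokens_r[0]
        for column_header, r, p in zip(column_headers, tokens_r[1:], tokens_p[1:]):
            if float(p) < threshold:
                significant.append((row_header, column_header, float(r), float(p)))
    return significant

# Made-up in-memory data in place of AllcorrR.csv and AllcorrP.csv.
sample_r = StringIO(u"A B C\nA 1.00 0.80 0.10\nB 0.80 1.00 -0.60\nC 0.10 -0.60 1.00\n")
sample_p = StringIO(u"A B C\nA 0.00 0.03 0.40\nB 0.03 0.00 0.01\nC 0.40 0.01 0.00\n")
result = find_significant(sample_r, sample_p)
```

With real files, pass the two open file objects instead of the StringIO samples and close them afterwards.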