
Choose R Outcomes From N Possibilities Efficiently In Pandas

I have 50 years of data. I need to choose a combination of 30 years out of it such that the values corresponding to them reach a particular threshold value but the possible number

Solution 1:

My previous answer was off base so I'm going to try again. From re-reading your question it looks like you are looking for one result of 30 years where the mean of Prs_100 values is greater than 460.

The following code can do this, but when I ran it, I started running into difficulties once the mean threshold went above about 415.

After running, you get a list of years 'years_list' and a list of values 'Prs_100_list' meeting the mean criterion. The target is mean > 460; the code as posted below tests 391 < mean < 400 instead.

Here is my code, hope this is in the area of what you are looking for.

from math import factorial
import numpy as np
import pandas as pd
from itertools import combinations
import time

# start a timer
start = time.time()

# array of values to work with, corresponding to the years 2012 - 2061 (50 values)
prs_100 = np.array([
       425.189729, 256.382494, 363.309507, 578.728535, 309.311562,
       476.388839, 441.47957 , 342.267756, 388.133403, 405.007245,
       316.108551, 392.193322, 296.545395, 467.38819 , 644.588971,
       301.086631, 478.492618, 435.868944, 467.464995, 323.465049,
       391.201598, 548.911349, 381.252838, 451.175339, 281.921215,
       403.840004, 460.51425 , 409.134409, 312.182576, 320.246886,
       290.163454, 381.432168, 259.228592, 393.841815, 342.999972,
       337.491898, 486.13901 , 318.278012, 385.919542, 309.472316,
       307.756455, 338.596315, 322.508536, 385.428138, 339.379743,
       420.428529, 417.143175, 361.643381, 459.861622, 374.359335])

# build dataframe with prs_100 as index and years as values, so that years can be returned easily.
df = pd.DataFrame(list(range(2012, 2062)), index=prs_100, columns=['years'])

df.index.name = 'Prs_100'

# set combination parameters
r = 30
n = len(prs_100)

Prs_100_list = []
years_list = []
count = 0

for p in combinations(prs_100, r):
    m = np.mean(p)  # compute the mean once per combination
    if 391 < m < 400:
        Prs_100_list.append(p)
        # pass a list of labels; a tuple would be read as a single key
        years_list.append(df.loc[list(p), 'years'].values.tolist())
        # build in some exit
        count += 1
        if count > 100:
            break
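
One likely reason for the slowdown: choosing 30 of 50 values gives roughly 4.7e13 combinations, and itertools.combinations walks them in a fixed order, so the first qualifying subset can be a very long way in. Before brute-forcing, it may help to check whether the threshold is reachable at all, since the 30 largest values give the highest mean any 30-year subset can have. The sketch below is not part of the original answer; the names n_combinations, max_mean and best_years are made up for illustration, and it assumes the goal is a mean above the threshold.

```python
from math import comb

import numpy as np

# Same sample values as in the answer above (years 2012 to 2061).
prs_100 = np.array([
    425.189729, 256.382494, 363.309507, 578.728535, 309.311562,
    476.388839, 441.47957, 342.267756, 388.133403, 405.007245,
    316.108551, 392.193322, 296.545395, 467.38819, 644.588971,
    301.086631, 478.492618, 435.868944, 467.464995, 323.465049,
    391.201598, 548.911349, 381.252838, 451.175339, 281.921215,
    403.840004, 460.51425, 409.134409, 312.182576, 320.246886,
    290.163454, 381.432168, 259.228592, 393.841815, 342.999972,
    337.491898, 486.13901, 318.278012, 385.919542, 309.472316,
    307.756455, 338.596315, 322.508536, 385.428138, 339.379743,
    420.428529, 417.143175, 361.643381, 459.861622, 374.359335])
years = np.arange(2012, 2012 + len(prs_100))
r = 30

# Size of the search space the brute-force loop has to walk through.
n_combinations = comb(len(prs_100), r)

# Indices of the r largest values; their mean is an upper bound
# on the mean of any r-element subset.
top_idx = np.argsort(prs_100)[::-1][:r]
max_mean = prs_100[top_idx].mean()
best_years = sorted(years[top_idx].tolist())

print(f"{n_combinations:.2e} combinations, best achievable mean: {max_mean:.2f}")
print(best_years)
```

For the sample array above the bound comes out below 460, so no 30-year subset of these particular values can average more than 460. The real data may of course differ.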

Solution 2:

You can use numpy's random.choice:

In [11]: df.iloc[np.random.choice(np.arange(len(df)), 3)]
Out[11]:
         Prs_100
Yrs
2023  392.193322
2047  337.491898
2026  644.588971
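
Note that np.random.choice samples with replacement by default, so the same year can be drawn twice. A rough sketch of how the same idea might be stretched to the 30-year question: draw 30 distinct rows at random and retry until the mean clears a threshold. The frame layout (index Yrs, column Prs_100) is inferred from the output above, and threshold, max_tries and the seed are hypothetical values chosen for illustration.

```python
import numpy as np
import pandas as pd

# Sample values from Solution 1, laid out like the frame printed above
# (index 'Yrs', column 'Prs_100'); this layout is an assumption.
prs_100 = [
    425.189729, 256.382494, 363.309507, 578.728535, 309.311562,
    476.388839, 441.47957, 342.267756, 388.133403, 405.007245,
    316.108551, 392.193322, 296.545395, 467.38819, 644.588971,
    301.086631, 478.492618, 435.868944, 467.464995, 323.465049,
    391.201598, 548.911349, 381.252838, 451.175339, 281.921215,
    403.840004, 460.51425, 409.134409, 312.182576, 320.246886,
    290.163454, 381.432168, 259.228592, 393.841815, 342.999972,
    337.491898, 486.13901, 318.278012, 385.919542, 309.472316,
    307.756455, 338.596315, 322.508536, 385.428138, 339.379743,
    420.428529, 417.143175, 361.643381, 459.861622, 374.359335]
df = pd.DataFrame({'Prs_100': prs_100},
                  index=pd.Index(range(2012, 2062), name='Yrs'))

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
r = 30                          # number of years to pick
threshold = 400.0               # hypothetical target for the mean
max_tries = 10_000              # give up after this many draws

sample = None
for _ in range(max_tries):
    # replace=False guarantees 30 distinct rows
    rows = rng.choice(len(df), size=r, replace=False)
    candidate = df.iloc[rows]
    if candidate['Prs_100'].mean() > threshold:
        sample = candidate.sort_index()
        break

if sample is not None:
    print(sample.index.tolist())
    print(sample['Prs_100'].mean())
```

Random retries only work well when qualifying subsets are reasonably common; for thresholds close to the maximum achievable mean, the loop will usually run out of tries.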
