How To Random Sample Lognormal Data In Python Using The Inverse Cdf And Specify Target Percentiles?
Solution 1:
First, I'm not sure about "Use the PDF for a normal distribution centred around 2.5". After all, log-normal is defined in terms of the base-e logarithm (aka natural log), which means 320 ≈ 10^2.5 ≈ e^5.77.
Second, I would approach the problem in a different way. You need m and s to sample from a log-normal distribution.
If you look at the Wikipedia article on the log-normal distribution, you can see that it is a two-parameter distribution. And you have exactly two conditions:
Mode = exp(m - s*s) = 320
80% of samples in [100,1000] => CDF(1000,m,s) - CDF(100,m,s) = 0.8
where the CDF is expressed via the error function (a common function found in pretty much any library).
So you have two non-linear equations for two parameters. Solve them to find m and s, and plug those into any standard log-normal sampler.
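Written out, the two conditions look roughly like this. This is a sketch using the standard log-normal CDF, with F as my own shorthand for the CDF mentioned above:

```latex
\begin{align}
\text{Mode} &= e^{m - s^2} = 320 \;\Longrightarrow\; m = \ln 320 + s^2, \\
F(x; m, s) &= \tfrac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{\ln x - m}{s\sqrt{2}}\right)\right], \\
F(1000; m, s) - F(100; m, s) &= 0.8 .
\end{align}
```

Substituting the first line into the third eliminates m, leaving a single non-linear equation in s that can be handed to a one-dimensional root finder.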
Solution 2:
Severin's approach is much leaner than my original attempt using the Smirnov transform. This is the code that worked for me (using fsolve to find s, although it's quite trivial to do it manually):
# Find lognormal distribution, with mode at 320 and 80% of probability mass between 100 and 1000
# Use fsolve to find the roots of the non-linear equation
%matplotlib inline
import matplotlib
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import fsolve
from scipy.stats import lognorm
import math
target_modal_value = 320

# Define function to find roots of
def equation(s):
    # From Wikipedia: Mode = exp(m - s*s) = 320
    m = math.log(target_modal_value) + s**2
    # Get probability mass from CDF at 100 and 1000, should equal to 0.8.
    # Rearrange equation so that =0, to find root (value of s)
    # np.exp is used because fsolve passes s (and hence m) as a NumPy array
    return (lognorm.cdf(1000, s=s, scale=np.exp(m)) - lognorm.cdf(100, s=s, scale=np.exp(m)) - 0.8)

# Solve non-linear equation to find s
s_initial_guess = 1
s = fsolve(equation, s_initial_guess)[0]  # fsolve returns an array; take the scalar root
# From s, find m
m = math.log(target_modal_value) + s**2
print('m='+str(m)+', s='+str(s))
# Plot
x = np.arange(0,2000,1)
y = lognorm.pdf(x,s=s, scale=math.exp(m))
fig, ax = plt.subplots()
ax.plot(x, y, 'r-', lw=5, alpha=0.6, label='lognorm pdf')
plt.plot((100,100), (0,1), 'k--')
plt.plot((320,320), (0,1), 'k-.')
plt.plot((1000,1000), (0,1), 'k--')
plt.ylim(0,0.0014)
plt.savefig('lognormal_100_320_1000.png')
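The code above only plots the fitted density. To actually draw random samples through the inverse CDF, something along these lines should work. It is a minimal sketch, not the code from the answer: it swaps fsolve for brentq with a bracket of [0.1, 1.5] that I picked by inspection, and the variable names, seed and sample size are my own choices.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import lognorm

target_mode = 320.0
lo, hi, target_mass = 100.0, 1000.0, 0.8

def mass_error(s):
    # Mode = exp(m - s^2)  =>  m = ln(mode) + s^2
    m = np.log(target_mode) + s**2
    scale = np.exp(m)
    # Probability mass in [lo, hi] minus the target; zero at the solution
    return lognorm.cdf(hi, s=s, scale=scale) - lognorm.cdf(lo, s=s, scale=scale) - target_mass

# Assumed bracket: the error is positive at s=0.1 and negative at s=1.5
s = brentq(mass_error, 0.1, 1.5)
m = np.log(target_mode) + s**2

# Inverse-CDF sampling: push uniform draws through the quantile function (ppf)
rng = np.random.default_rng(0)
u = rng.uniform(size=200_000)
samples = lognorm.ppf(u, s=s, scale=np.exp(m))

# Empirical check of the 80% condition
fraction_inside = np.mean((samples >= lo) & (samples <= hi))
print(f"m={m:.4f}, s={s:.4f}, fraction in [100, 1000]={fraction_inside:.3f}")
```

Calling lognorm.rvs(s=s, scale=np.exp(m), size=...) should give equivalent samples; the explicit ppf step is only there to show the inverse-CDF transform.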