How To Solve MemoryError In Sklearn When Fitting Huge Data To A GMM?
I am trying to generate a Universal Background Model (UBM) from a huge array of extracted MFCC features, but I keep getting a MemoryError when I fit my data to the model.
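A minimal sketch of the kind of call that triggers this (file name, frame count, and GMM settings are hypothetical, not from the question): loading the entire MFCC matrix and fitting it in one go means the data plus scikit-learn's per-frame responsibilities must all fit in RAM at once.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical dump of all MFCC frames, e.g. tens of millions of rows x 39 dims.
features = np.load("all_mfcc_frames.npy")

# A typical UBM configuration; on a large enough array this raises MemoryError.
ubm = GaussianMixture(n_components=512, covariance_type="diag", max_iter=100)
ubm.fit(features)
```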
Solution 1:
That's fairly straightforward with Dask. Just use Dask's DataFrame instead of pandas', and everything else should work without changes. As an alternative to scikit-learn, you can use Turi's GraphLab Create, which can handle arbitrarily large datasets (though I'm not sure it supports GMMs).
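A minimal sketch in the spirit of this answer, not the answerer's exact recipe: whether scikit-learn's GaussianMixture accepts a Dask collection directly depends on the versions involved, so this keeps the features in a chunked Dask array (so they never sit in RAM all at once) and materialises only a subsample that fits in memory for the fit. The file name, chunk size, and sample size are assumptions.

```python
import dask.array as da
import numpy as np
from sklearn.mixture import GaussianMixture

# Memory-mapped MFCC matrix wrapped in a chunked Dask array (hypothetical path/shape).
mfcc = np.load("mfcc_features.npy", mmap_mode="r")          # (n_frames, 39), for example
features = da.from_array(mfcc, chunks=(100_000, mfcc.shape[1]))

# Draw a random subsample small enough for scikit-learn to handle in RAM.
rng = np.random.default_rng(0)
n_sample = min(500_000, features.shape[0])
idx = np.sort(rng.choice(features.shape[0], size=n_sample, replace=False))
sample = features[idx].compute()                             # materialise only the sample

ubm = GaussianMixture(n_components=512, covariance_type="diag", max_iter=100)
ubm.fit(sample)
```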
Solution 2:
For those who have the same issue, I recommend the Bob library, which supports big-data processing and even offers parallel processing.
In my use case, Bob was a great fit for developing GMM-UBM systems, as all the relevant functionality is already implemented.
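A rough sketch of what UBM training looks like with Bob, assuming the classic bob.learn.em API (k-means initialisation followed by maximum-likelihood GMM training); the exact class names and signatures have changed between Bob releases, and the sizes here are placeholders.

```python
import numpy as np
import bob.learn.em

# Hypothetical MFCC matrix: (n_frames, 39), float64 as Bob expects.
data = np.load("all_mfcc_frames.npy").astype(np.float64)
n_gaussians, n_dims = 512, data.shape[1]

# Initialise the UBM means with k-means, the usual Bob recipe.
kmeans = bob.learn.em.KMeansMachine(n_gaussians, n_dims)
bob.learn.em.train(bob.learn.em.KMeansTrainer(), kmeans, data,
                   max_iterations=25, convergence_threshold=1e-5)

# Maximum-likelihood GMM (UBM) training starting from the k-means centres.
ubm = bob.learn.em.GMMMachine(n_gaussians, n_dims)
ubm.means = kmeans.means
trainer = bob.learn.em.ML_GMMTrainer(update_means=True,
                                     update_variances=True,
                                     update_weights=True)
bob.learn.em.train(trainer, ubm, data,
                   max_iterations=25, convergence_threshold=1e-5)
```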