
How to Solve MemoryError in Sklearn When Fitting Huge Data to a GMM?

I am trying to generate a Universal Background Model (UBM) from a huge array of extracted MFCC features, but I keep getting a MemoryError when fitting my data to the model.

Solution 1:

That's fairly straightforward with Dask: use Dask's DataFrame in place of pandas', and the rest of your pipeline should work largely unchanged. As an alternative to scikit-learn, you could look at Turi's GraphLab Create, which can handle arbitrarily large datasets (though I'm not sure it supports GMMs).
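A minimal sketch of the Dask side of this, assuming the MFCC features are stored as CSV files under a hypothetical features/*.csv path; Dask builds a lazy task graph and processes the data chunk by chunk instead of loading the whole array into RAM:

    # Minimal sketch, assuming MFCC features live in CSV files matching the
    # hypothetical glob "features/*.csv". Dask reads the partitions lazily
    # rather than holding everything in memory at once.
    import dask.dataframe as dd

    # Lazy read: no data is loaded into RAM at this point.
    features = dd.read_csv("features/*.csv")

    # Computations stream over the partitions, e.g. per-coefficient means:
    print(features.mean().compute())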

Solution 2:

For those with the same issue, I recommend the Bob library, which supports large-scale data processing and even offers parallel processing.

In my use case, Bob was a great fit for developing GMM-UBM systems, since all the relevant functionality is already implemented.
