The following warning appears when clustering with the KMeans method. I set the number of clusters to 100, and there are more than 100 data points. The message reported during clustering is:
ConvergenceWarning: Number of distinct clusters (99) found smaller than n_clusters (100). Possibly due to duplicate points in X.
First, locate the code that triggers it:
kmeans = KMeans(n_clusters=k, random_state=2018)
kmeans.fit(XData)
pre_y = kmeans.predict(XData)
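The message has to come from one of these three lines. Stepping through with a breakpoint shows that it appears in the fit() function. Note that ConvergenceWarning is a warning, not an exception, so a plain try/except around fit() does not catch it by default. To grab it, record the warnings issued during fit() and check for ConvergenceWarning:
import warnings
from sklearn.cluster import KMeans
from sklearn.exceptions import ConvergenceWarning

# Build the clustering model object
kmeans = KMeans(n_clusters=k, random_state=2018)
# Train the clustering model, recording any warnings issued by fit()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # If the data has duplicates, the number of distinct points can be smaller than K, so the K value needs to be updated
    kmeans.fit(XData)
if any(issubclass(w.category, ConvergenceWarning) for w in caught):
    print("Catch ConvergenceWarning,k={}\n".format(k))
# Predict cluster labels with the trained model
pre_y = kmeans.predict(XData)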
After checking, the warning is caused by duplicate rows in XData. The K value is determined at an earlier stage, so the problem can be solved by de-duplicating the data there, before K is adjusted.
The previous code for determining the K value:
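To see the cause in isolation, the warning can be reproduced on a tiny data set. The sketch below is not the original project code: the toy array `X_dup`, the helper name `fit_and_collect`, and the explicit `n_init=10` are assumptions made for illustration. It fits KMeans on six rows that contain only two distinct points and collects any ConvergenceWarning messages.

```python
import warnings

import numpy as np
from sklearn.cluster import KMeans
from sklearn.exceptions import ConvergenceWarning


def fit_and_collect(X, k):
    """Fit KMeans and return the model plus any ConvergenceWarning messages."""
    model = KMeans(n_clusters=k, n_init=10, random_state=2018)
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure no warning is suppressed
        model.fit(X)
    messages = [str(w.message) for w in caught
                if issubclass(w.category, ConvergenceWarning)]
    return model, messages


# Toy data (made up for illustration): 6 rows, but only 2 distinct points
X_dup = np.array([[0.0, 0.0]] * 3 + [[5.0, 5.0]] * 3)
n_distinct = np.unique(X_dup, axis=0).shape[0]

# Ask for more clusters than there are distinct points
model, messages = fit_and_collect(X_dup, k=3)
print("distinct rows:", n_distinct)
for msg in messages:
    print(msg)
```

The row count (6) is larger than k (3), so a check on `shape[0]` alone passes, yet the number of distinct rows (2) is smaller than k. That is the situation the warning describes.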
XData = np.array(X)
if XData.shape[0] <= k and XData.shape[0] != 0:
    print("XData size:", XData.shape[0])
    k = XData.shape[0] // 2
    print("K:", k)
if k <= 0:
    return result
The changed code de-duplicates the data with set before K is determined:
# 2022.3.22 zph: set() removes duplicates (the elements of X must be hashable, e.g. tuples)
XData = np.array(list(set(X)))
if XData.shape[0] <= k and XData.shape[0] != 0:
    print("XData size:", XData.shape[0])
    k = XData.shape[0] // 2
    print("K:", k)
if k <= 0:
    return result
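If the rows of X are lists or NumPy arrays rather than tuples, set() cannot hash them. A possible alternative, sketched here under that assumption, is `np.unique` with `axis=0`, which removes duplicate rows directly. The helper name `dedupe_and_adjust_k` and the sample input are hypothetical; the rule for shrinking K (`k = n // 2`) simply mirrors the snippet above.

```python
import numpy as np


def dedupe_and_adjust_k(X, k):
    """Drop duplicate rows, then shrink k if too few distinct rows remain.

    Hypothetical helper; the shrink rule (k = n // 2) mirrors the snippet above.
    """
    XData = np.asarray(X)
    if XData.shape[0] != 0:
        # axis=0 compares whole rows; the result is sorted, so row order changes
        XData = np.unique(XData, axis=0)
    n = XData.shape[0]
    if n != 0 and n <= k:
        k = n // 2
    return XData, k


# Made-up input: nested lists are unhashable, so set() would not accept them
X = [[1, 2], [1, 2], [3, 4], [5, 6]]
XData, k = dedupe_and_adjust_k(X, k=5)
print("XData size:", XData.shape[0])
print("K:", k)
```

Like set(), `np.unique` does not keep the original row order, so any labels stored alongside X need to be re-aligned after de-duplication.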
The above is recorded here for anyone who runs into the same problem.