OpenAI embeddings: The Key to Powerful Text Clustering

How OpenAI embeddings can give you high-quality text clusters

Pranay Dave
4 min read · Apr 15, 2023
Image by author

OpenAI recently released an API for text embeddings. I tried using it for text clustering, and the results are magical. I will demonstrate the high-quality clustering results through the following:

  • Visual comparison with a non-embedding (TF-IDF) technique
  • Clarity of cluster interpretation obtained using OpenAI embeddings

The dataset I will use to illustrate this post is the Amazon fine food reviews dataset.

Amazon fine food review dataset (image by author)

Text clustering without embedding

To appreciate something good, sometimes you need to try out something bad! So before I show you text clustering with OpenAI embeddings, here is text clustering with the TF-IDF approach.

Text clustering with TF-IDF (image by author)

Each dot in the scatterplot above represents a review. The color of each dot indicates the cluster to which the review is assigned.
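For readers who want to see what a TF-IDF clustering baseline involves, here is a minimal sketch in plain Python. It is not the code behind the plot above: the four toy reviews, the function names, the smoothed IDF formula, and the deterministic "first k reviews as starting centroids" initialisation are all assumptions chosen for illustration. In practice you would more likely reach for a library such as scikit-learn (TfidfVectorizer and KMeans) and then project the vectors to 2D for plotting.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return L2-normalised TF-IDF vectors (term -> weight dicts)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many reviews does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumed, commonly used variant).
        vec = {
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({term: w / norm for term, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def kmeans_cosine(vectors, k, n_iter=10):
    """Tiny k-means on cosine similarity (spherical k-means).

    Assumption: the first k vectors are used as starting centroids so
    the result is reproducible; real implementations use smarter seeding.
    """
    centroids = [dict(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: each review goes to its most similar centroid.
        labels = [
            max(range(k), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the normalised mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if not members:
                continue  # keep the old centroid if a cluster is empty
            mean = Counter()
            for v in members:
                for term, w in v.items():
                    mean[term] += w / len(members)
            norm = math.sqrt(sum(w * w for w in mean.values()))
            centroids[c] = {term: w / norm for term, w in mean.items()}
    return labels


# Hypothetical toy reviews, not taken from the Amazon dataset.
reviews = [
    "great coffee strong flavor",
    "dog food my dog loves",
    "coffee flavor is bold",
    "healthy food for my dog",
]
vectors = tfidf_vectors(reviews)
labels = kmeans_cosine(vectors, k=2)
print(labels)  # [0, 1, 0, 1]
```

Note that TF-IDF only links reviews that share exact words. Two reviews about the same topic with no vocabulary in common have a similarity of zero, which is one reason a TF-IDF baseline can produce less clearly separated clusters on real review text.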
