Abstract: Big data clustering on Spark is a practical method that makes use of Apache Spark’s distributed computing capabilities to handle clustering tasks on massive datasets such as big data sets.