Spark reduceByKey Function

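In Spark, reduceByKey is a frequently used transformation that aggregates data by key. It takes a dataset of key-value pairs (K, V) together with a reduce function of type (V, V) => V, merges all values that share a key using that function, and returns a dataset of (K, V) pairs with one entry per distinct key. Because Spark combines values within each partition before merging the partial results, the reduce function must be associative and commutative.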
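To make the aggregation concrete, here is a minimal sketch that models the same result with plain Scala collections, so no Spark installation is needed. The helper name reduceByKeyLocal and the sample pairs are assumptions made for illustration; this is not how Spark implements the operation, only a local model of what it computes.

```scala
// Plain Scala collections: a local model of the result reduceByKey produces.
// The sample pairs and the helper name are hypothetical, chosen for illustration.
val pairs: Seq[(String, Int)] =
  Seq(("C", 3), ("A", 1), ("B", 4), ("A", 2), ("B", 5))

// Group the pairs by key, then merge each group's values with the reduce function f.
def reduceByKeyLocal[K, V](data: Seq[(K, V)])(f: (V, V) => V): Map[K, V] =
  data.groupBy(_._1).map { case (key, kvs) => key -> kvs.map(_._2).reduce(f) }

// Sum the values of each key.
val summed: Map[String, Int] = reduceByKeyLocal(pairs)(_ + _)
// summed == Map("A" -> 3, "B" -> 9, "C" -> 3)
```

Addition is associative and commutative, so the result does not depend on the order in which the values of a key are merged.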

Example of reduceByKey Function

In this example, we aggregate the values by key.

  • Open Spark in Scala mode by starting the Spark shell with the spark-shell command.
  • Create an RDD of key-value pairs from a parallelized collection, using sc.parallelize.
  • Read the contents of the newly created RDD, for example with the collect action.
  • Apply the reduceByKey() function to aggregate the values of each key.
  • Read the aggregated result, for example with the collect action.

The result contains one pair per distinct key, holding the aggregated value for that key.
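The steps above can be sketched as a spark-shell session. This is a minimal sketch under stated assumptions: the sample data, the variable names data and reduced, and the choice of addition as the reduce function are illustrative and not taken from the original example. It relies on sc, the SparkContext that spark-shell creates at startup.

```scala
// Run inside spark-shell, which predefines `sc` (the SparkContext).
// The sample data and the names `data` and `reduced` are illustrative assumptions.

// Create an RDD of (key, value) pairs from a local collection.
val data = sc.parallelize(Seq(("C", 3), ("A", 1), ("B", 4), ("A", 2), ("B", 5)))

// Read the RDD back to the driver to inspect its contents.
data.collect().foreach(println)

// Merge the values of each key with an associative, commutative function (here: addition).
val reduced = data.reduceByKey(_ + _)

// collect() does not guarantee any key order, so sort by key before printing.
reduced.collect().sortBy(_._1).foreach(println)
// (A,3)
// (B,9)
// (C,3)
```

reduceByKey is a lazy transformation, so nothing is computed until an action such as collect runs. Since collect brings every element to the driver, it is best suited to small results like this one.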