Word count with spark-shell

Step 1: Upload word.txt to HDFS

$ hadoop fs -put word.txt .
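The contents of word.txt are not shown in the original. A hypothetical input file, chosen so that its word frequencies match the result printed in Step 3, could be created like this:

```shell
# Hypothetical word.txt: aaa and bbb appear twice, ccc and eee once,
# matching the counts shown in the Step 3 output.
cat > word.txt <<'EOF'
aaa bbb
bbb ccc
aaa eee
EOF

# Sanity check: the file should contain 6 words in total.
wc -w word.txt
```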

Step 2: Start spark-shell

$ spark-shell

Step 3: Run word count in spark-shell

scala> val textFile = sc.textFile("word.txt")
textFile: spark.RDD[String] = spark.MappedRDD@2ee9b6e3

scala> val wordCounts = textFile.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey((a, b) => a + b)

scala> wordCounts.collect()
res6: Array[(String, Int)] = Array((bbb,2), (eee,1), (ccc,1), (aaa,2))
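To see what each transformation contributes, the same pipeline can be sketched on a plain Scala collection, with no Spark required. This is only an illustration of the logic: `groupBy` plus a per-group sum stands in for `reduceByKey`, and the input lines are hypothetical.

```scala
// Plain-Scala sketch of the word-count pipeline above (no Spark needed).
val lines = Seq("aaa bbb", "bbb ccc", "aaa eee")  // hypothetical input lines

val counts = lines
  .flatMap(line => line.split(" "))   // split every line into words
  .map(word => (word, 1))             // pair each word with a count of 1
  .groupBy { case (word, _) => word } // group the pairs by word
  .map { case (word, pairs) =>        // sum the 1s per word, like reduceByKey
    (word, pairs.map(_._2).sum)
  }

println(counts)  // counts each word: aaa -> 2, bbb -> 2, ccc -> 1, eee -> 1
```

The key difference in real Spark is that `reduceByKey` combines values locally on each partition before shuffling, which is why it is preferred over `groupByKey` followed by a sum.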
