Running Word Count with IPython Notebook

Step 1: Install Xming by following "How to install X Window on Windows"

Step 2: After logging in over ssh, create an IPython profile (only needed the first time)

$ ipython profile create pyspark
$ vim ~/.ipython/profile_pyspark/startup/00-pyspark-setup.py

Fill in the following content:

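import os
import sys

# SPARK_HOME must be exported before starting IPython (see Step 4).
spark_home = os.environ.get('SPARK_HOME', None)
if not spark_home:
    raise ValueError('SPARK_HOME environment variable is not set')
# Make the pyspark package and the bundled py4j library importable.
sys.path.insert(0, os.path.join(spark_home, 'python'))
sys.path.insert(0, os.path.join(spark_home, 'python/lib/py4j-0.8.2.1-src.zip'))
# Run the PySpark shell bootstrap, which creates the SparkContext `sc`.
# exec(compile(...)) works on Python 2 and 3; execfile() is Python 2 only.
shell_path = os.path.join(spark_home, 'python/pyspark/shell.py')
with open(shell_path) as f:
    exec(compile(f.read(), shell_path, 'exec'))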
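The py4j file name in the startup script is tied to one Spark release, so a different installation may ship another version. As a rough sketch (the helper name `find_py4j_zip` is hypothetical, and the `python/lib/py4j-*-src.zip` layout is assumed from the path used above), the bundled zip could be located like this before editing the script:

```python
import glob
import os


def find_py4j_zip(spark_home):
    """Return the path of the bundled py4j source zip, or None if none is found."""
    pattern = os.path.join(spark_home, 'python', 'lib', 'py4j-*-src.zip')
    matches = sorted(glob.glob(pattern))
    # If several versions are present, take the last one in sorted order.
    return matches[-1] if matches else None


spark_home = os.environ.get('SPARK_HOME')
if spark_home:
    # Prints the zip path to put into 00-pyspark-setup.py, or None.
    print(find_py4j_zip(spark_home))
```

If this prints a different file name than the one in the startup script, adjust the `sys.path.insert` line accordingly.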

Step 3: Upload word.txt to HDFS

$ hadoop fs -put word.txt

Step 4: Run the ipython command; the browser opens automatically through X Window

$ export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark/
$ ipython notebook --profile=pyspark

Step 5:

Create a new Notebook

Run PySpark code
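A minimal word-count sketch for a notebook cell follows. It assumes that `sc` is the SparkContext created by the profile startup script from Step 2, that the cluster's default filesystem is HDFS, and that word.txt was uploaded to your HDFS home directory in Step 3. The helper names `split_words` and `word_count` are illustrative and not part of Spark.

```python
from operator import add


def split_words(line):
    """Split one line of text into whitespace-separated words."""
    return line.split()


def word_count(lines_rdd):
    """Return an RDD of (word, count) pairs from an RDD of text lines."""
    return (lines_rdd.flatMap(split_words)
                     .map(lambda word: (word, 1))
                     .reduceByKey(add))


# `sc` only exists inside a session started with the pyspark profile,
# so the Spark part is skipped when this cell runs anywhere else.
if 'sc' in globals():
    # Assumption: the relative path resolves to the HDFS home directory.
    counts = word_count(sc.textFile('word.txt'))
    for word, count in counts.collect():
        print('%s: %d' % (word, count))
```

`collect()` brings every pair back to the driver, which is fine for a small sample file; for large inputs, `counts.take(20)` or `counts.saveAsTextFile(...)` may be a safer choice.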
