Build a Spark environment for the Windows command line.
The paths added in this section go into the Windows environment variables by default.
1. Install the Java JDK to ..\Java\jdk1.8.0_74
Add to PATH: ..\Java\jdk1.8.0_74;..\Java\jdk1.8.0_74\bin
2. Install Scala to ..\scala
Add to PATH: ..\scala\bin
3. Do NOT install the standalone Python 3! Install Anaconda3 instead, and add its path at the very front of PATH so its interpreter is found first (see the interpreter check after this list).
4. Download the latest prebuilt Spark release to ..\Spark\spark-1.6.1
Add to PATH: ..\Spark\spark-1.6.1;..\Spark\spark-1.6.1\bin
5. Set SPARK_HOME = ..\Spark\spark-1.6.1
Download the Hadoop support files for Windows (winutils.exe etc.) and set
HADOOP_HOME = ..\hadoop\hadoop-common-2.2.0-bin-master
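
To confirm that the Anaconda3 interpreter from step 3 is the one picked up first on PATH, open a fresh command prompt, start python, and run this quick check:

    # Both values should point into the Anaconda3 install directory,
    # not a standalone Python 3 installation.
    import sys
    print(sys.executable)
    print(sys.version)
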
Now java, scala, and spark can all be run from the Windows command line.
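
As a sketch of how to verify this (assuming the PATH, SPARK_HOME, and HADOOP_HOME settings above; the file name smoke_test.py is just an example), save the following and run it with spark-submit smoke_test.py:

    # Minimal PySpark job: sum the integers 0..99 on a local master.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setMaster("local[*]").setAppName("smoke-test")
    sc = SparkContext(conf=conf)
    print(sc.parallelize(range(100)).sum())  # should print 4950
    sc.stop()
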
======================================================================
Next, go further and set up the PyCharm IDE.
The paths added in this section go into the current project's settings under Run > Edit Configurations.
6. Install PyCharm
7. Add these environment variables in the PyCharm run configuration:
PYTHONPATH = ..\Spark\spark-1.6.1\python;..\Spark\spark-1.6.1\python\lib\py4j-0.9-src.zip;..\hadoop\hadoop-common-2.2.0-bin-master\bin
HADOOP_HOME = ..\hadoop\hadoop-common-2.2.0-bin-master
SPARK_HOME = ..\Spark\spark-1.6.1
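
Alternatively, the same wiring can be done from inside the script itself, so each run configuration does not have to be edited. This is only a sketch; the absolute paths below are hypothetical stand-ins for wherever the ..\ locations above actually live on your machine:

    # Point Python at the Spark 1.6.1 distribution before importing pyspark.
    # The absolute paths are example assumptions; substitute your own.
    import os
    import sys

    os.environ.setdefault("SPARK_HOME", r"C:\Spark\spark-1.6.1")
    os.environ.setdefault("HADOOP_HOME", r"C:\hadoop\hadoop-common-2.2.0-bin-master")
    sys.path.insert(0, os.path.join(os.environ["SPARK_HOME"], "python", "lib", "py4j-0.9-src.zip"))
    sys.path.insert(0, os.path.join(os.environ["SPARK_HOME"], "python"))

    from pyspark import SparkContext

    sc = SparkContext("local[*]", "pycharm-test")
    print(sc.parallelize([1, 2, 3]).count())  # should print 3
    sc.stop()
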
=======================================================================
To be further tested and simplified!