I have downloaded Spark, Hadoop, etc., but I cannot link PySpark to Jupyter Notebook/Lab.
In Jupyter Notebook I have entered:
- pip install findspark (no error)
- the following code:
import findspark
findspark.init()
import pyspark
sc = pyspark.SparkContext()
f = sc.textFile('recent-grads.csv')
data = f.map(lambda line: line.split('\n'))
data.take(10)
This fails with an error: findspark cannot be found.
Please assist.
Hello.
This sounds like a setup problem.
First, let's read the findspark documentation on GitHub.
Since Jupyter's Python is IPython, the steps written below should be executed.
Findspark can add a startup file to the current IPython profile so that the environment variables will be properly set and pyspark will be imported upon IPython startup. This file is created when edit_profile is set to true.
ipython --profile=myprofile
findspark.init('/path/to/spark_home', edit_profile=True)
Findspark can also add to the .bashrc configuration file if it is present so that the environment variables will be properly set whenever a new shell is opened. This is enabled by setting the optional argument edit_rc to true.
findspark.init('/path/to/spark_home', edit_rc=True)
If changes are persisted, findspark will not need to be called again unless the spark installation is moved.
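As a concrete follow-up, here is a minimal sketch of what the first notebook cell could look like once findspark is installed in the same environment as the Jupyter kernel. The '/path/to/spark_home' path and the 'local[*]' master URL below are assumptions to adapt to your own setup; the CSV filename comes from the original post.

import findspark

# '/path/to/spark_home' is a placeholder; point it at your Spark installation,
# or call findspark.init() with no argument if SPARK_HOME is already set.
findspark.init('/path/to/spark_home')

import pyspark

# 'local[*]' runs Spark locally on all cores; adjust the master URL as needed.
sc = pyspark.SparkContext('local[*]')

# textFile already yields one record per line, so split each line on commas
# rather than on '\n' to get the CSV fields.
f = sc.textFile('recent-grads.csv')
data = f.map(lambda line: line.split(','))
print(data.take(10))

sc.stop()  # stop the context so the cell can be re-run without a conflict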
Sincerely yours,
ktsh.tanaka.2020
Hi
Thanks. I had IT help with the path, etc., and it is working now. I have tested it in Jupyter Notebook.
Have a nice day.
Kind Regards
Liana
from ktsh.tanaka.2020 to Isabella_Ahrens_Teix
Thank you for reading the advice.
Please have a nice day.
Regards,
ktsh.tanaka.2020