Open-source Languages & Tools for z/OS

View Only

Back to discussions

Expand all | Collapse all

Spark R on z/OS

1. Spark R on z/OS

0 Like
Archive User
Posted 02-07-2017 05:07

Reply Reply Privately
Hello,

I found the following description in a Redbook.

Apache Spark Implementation on IBM z/OS
http://www.redbooks.ibm.com/redbooks/pdfs/sg248325.pdf

2.7 Spark R
Spark R is an R package that provides a light-weight front-end to use Apache Spark from R.
Spark R exposes the Spark API through the RDD class, and enables users to interactively run jobs from the R shell on a cluster. Rocket Software is in progress with a port of R to z/OS, and interested clients should contact either Rocket directly or IBM.

I’d like to use R with Apache Spark on z/OS.
Can we get Spark R library for z/OS?
Do you have any documentation about how to use SparkR on z/OS?

regards.
Tomohiro Taguchi
2. RE: Spark R on z/OS

0 Like
Archive User
Posted 02-07-2017 17:23

Reply Reply Privately
Hello,
You will need Rocket’s latest R distribution, as announced here on the Forum. I think you already have it. You will need to install both the R tar file and the Devel tar file.

You will need IBM’s Spark distribution. There are two of these, Spark 1.5.2 and Spark 2.0.2. I myself have only tested with 1.5.2.

You will also need the Spark R code. You can get this from Apache’s 1.5.2 distribution (from the source or from any binary distribution), or from Apache’s 2.0.2 distribution (I could not find R in their source distribution, it is somewhere else now, but it is in any binary distribution). Copy the R code from the Apache distribution into the IBM distribution. This is the R directory at the top level.

I prefer to give file tags to all the files, to allow both ascii and ebcdic files to be handled well by programs such as vi or emacs. Set _BPXK_AUTOCVT to ON in your init file. Run the “autotag” program that is in the bin directory, like this: “autotag -R -s -L 12 ibm_spark_directory”. This recursively tags all files that are not already tagged, based on their contents, and allows a small number of unusual characters; any more than that number, and the file will be tagged as binary.

One of our Spark R demos began with these lines:
.libPaths(c(file.path(Sys.getenv(“SPARK_HOME”), “R”, “lib”), .libPaths()))
library(SparkR)
sc <- sparkR.init(master = “local”, # try local[*] to use all cores
appName = “Analyzer”,
sparkEnvir = list(spark.driver.memory=“2g”))

regards,
Rick Harris
Rocket Software
3. RE: Spark R on z/OS

0 Like
Archive User
Posted 02-14-2017 07:36

Reply Reply Privately
Thank you for your support!

I have already installed Spark on z/OS V2.0.2
I get apache’s 2.0.2 distribution and copy “R” directory to IBM distribution (/usr/lpp/IBM/Spark/R).
Then I run “autotag -R -s -L 12 /usr/lpp/IBM/Spark/R” command.
After that, I can create session to spark from R console like following.

> Sys.getenv("SPARK_HOME") [1] "/usr/lpp/IBM/Spark" > library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"))) Attaching package: 'SparkR' The following objects are masked from 'package:stats': cov, filter, lag, na.omit, predict, sd, var, window The following objects are masked from 'package:base': as.data.frame, colnames, colnames<-, drop, endsWith, intersect, rank, rbind, sample, startsWith, subset, summary, transform, union > sparkR.session(master = "local[*]", sparkConfig = list(spark.driver.memory = "2g")) Spark package found in SPARK_HOME: /usr/lpp/IBM/Spark Launching java with spark-submit command /usr/lpp/IBM/Spark/bin/spark-submit --driver-memory "2g" sparkr-shell /tmp/RtmpE6u1I0/backend_port50100327c7b2670 Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties Setting default log level to "WARN". To adjust logging level use sc.setLogLevel(newLevel). 17/02/13 15:55:41 WARN NetUtil: Failed to find the loopback interface 17/02/13 15:55:42 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Java ref type org.apache.spark.sql.SparkSession id 1

I will be able to try next test using SparkR.
Thank you so much!

Could you please answer more questions?

(1) I’d like to know the detail of “autotag” command.
What do the “-R -s -L 12” options mean?
How can I get the reference of “autotag” command?

(2) Do you have any offical guide for using SparkR on z/OS?

Regards
Tomohiro Taguchi
4. RE: Spark R on z/OS

0 Like
Archive User
Posted 02-15-2018 07:30

Reply Reply Privately
Hi, Here iam providing the link. This link provides official guide for using Spark R on z/OS.

visit here! http://www.redbooks.ibm.com/abstracts/sg248325.html?Open

Open-source Languages & Tools for z/OS

Spark R on z/OS

Archive User02-07-2017 05:07

Archive User02-07-2017 17:23

Archive User02-14-2017 07:36

Archive User02-15-2018 07:30

1. Spark R on z/OS

Apache Spark Implementation on IBM z/OS
http://www.redbooks.ibm.com/redbooks/pdfs/sg248325.pdf

2. RE: Spark R on z/OS

3. RE: Spark R on z/OS

4. RE: Spark R on z/OS

Contact Us

Quick Links

Open-source Languages & Tools for z/OS

Spark R on z/OS

Archive User02-07-2017 05:07

Archive User02-07-2017 17:23

Archive User02-14-2017 07:36

Archive User02-15-2018 07:30

1. Spark R on z/OS

Apache Spark Implementation on IBM z/OS http://www.redbooks.ibm.com/redbooks/pdfs/sg248325.pdf

2. RE: Spark R on z/OS

3. RE: Spark R on z/OS

4. RE: Spark R on z/OS

Contact Us

Quick Links

Apache Spark Implementation on IBM z/OS
http://www.redbooks.ibm.com/redbooks/pdfs/sg248325.pdf