Amazon S3
Accessing S3 Bucket through Spark
- Edit the spark-defaults.conf file
Add the following three lines, which set your S3 access key, your secret key, and the S3A file system implementation:
spark.hadoop.fs.s3a.access.key your-access-key
spark.hadoop.fs.s3a.secret.key your-secret-key
spark.hadoop.fs.s3a.impl org.apache.hadoop.fs.s3a.S3AFileSystem
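If you would rather not edit spark-defaults.conf, the same three properties can be set programmatically when building the session. A minimal sketch, assuming Spark is on the classpath; the key values are placeholders, not real credentials:

```scala
import org.apache.spark.sql.SparkSession

// Apply the S3A settings in code instead of spark-defaults.conf.
// "your-access-key" and "your-secret-key" are placeholders.
val spark = SparkSession.builder()
  .appName("s3-read-example")
  .config("spark.hadoop.fs.s3a.access.key", "your-access-key")
  .config("spark.hadoop.fs.s3a.secret.key", "your-secret-key")
  .config("spark.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
  .getOrCreate()
```

Settings passed via .config() take effect only for this session, which is handy when different jobs use different buckets or credentials.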
Then start spark-shell with the AWS SDK and hadoop-aws packages (the hadoop-aws version must match your Hadoop build; 2.7.x is paired with aws-java-sdk 1.7.4 here):
./spark-shell --packages com.amazonaws:aws-java-sdk:1.7.4,org.apache.hadoop:hadoop-aws:2.7.3
Now you can read Parquet data by passing the bucket URL with the s3a:// scheme:
spark.read.parquet("s3a://your_path_to_bucket/")
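Putting the steps together, a short sketch of loading the data and taking a first look at it (the bucket path is a placeholder, assuming the configuration above is in place):

```scala
// Read Parquet data from S3 over the S3A connector and inspect it.
val df = spark.read.parquet("s3a://your_path_to_bucket/")
df.printSchema()  // column names and types inferred from the Parquet footer
df.show(5)        // preview the first five rows
```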
Also published on Medium.