Thursday, August 2, 2018

Talend and Apache Spark: A Technical Primer and Overview

In my years at Talend as a Support Engineer, before I moved into the Customer Success Architect team, customers often asked about Talend's capabilities with Apache Spark. When we talk about Spark, the first thing that comes to mind is spark-submit, the command we use to submit Spark jobs. So the question of how a Talend Spark job equates to a regular spark-submit naturally comes up. In this blog, we cover the different Apache Spark modes offered, the ones used by Talend, and how Talend works with Apache Spark.
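For readers who have not seen a regular spark-submit, here is a minimal sketch of how such a command line is typically put together. It only assembles the argument list and does not launch anything. The options --class, --master, --deploy-mode and --conf are standard spark-submit options; the helper name build_spark_submit, the jar name my_job.jar and the class com.example.MyJob are hypothetical placeholders, and nothing here is meant to describe what Talend generates internally.

```python
import shlex


def build_spark_submit(app_jar, main_class, master="yarn",
                       deploy_mode="cluster", conf=None, app_args=None):
    """Assemble the argument list for a spark-submit invocation.

    Hypothetical helper for illustration only: it builds the command
    but does not run it.
    """
    cmd = [
        "spark-submit",
        "--class", main_class,         # entry point of the application
        "--master", master,            # cluster manager, e.g. yarn or local[*]
        "--deploy-mode", deploy_mode,  # where the driver runs: client or cluster
    ]
    # Each Spark property is passed as a separate --conf key=value pair.
    for key, value in sorted((conf or {}).items()):
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app_jar)                # the application jar comes after the options
    cmd += list(app_args or [])        # anything after the jar goes to the application
    return cmd


if __name__ == "__main__":
    # my_job.jar and com.example.MyJob are placeholder names.
    argv = build_spark_submit(
        "my_job.jar",
        "com.example.MyJob",
        conf={"spark.executor.memory": "2g"},
    )
    print(shlex.join(argv))
```

Keeping the command as a list rather than a single string avoids quoting problems if it is later handed to a process launcher.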

An Intro to Apache Spark Jobs

Apache Spark has two types of jobs that you can submit: Spark Batch and Spark Streaming. Spark Batch operates under a batch processing model, where a data set is collected over a period of time and then sent to a Spark engine for processing.
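The batch model described above can be sketched in a few lines of plain Python. This is an illustration of the idea only and uses no Spark API: the function names collect_then_process and process_batch are hypothetical, and a word count stands in for whatever processing a real job would do.

```python
from collections import Counter


def process_batch(records):
    """Process a complete, already-collected data set in one pass.

    Stand-in for the work a batch engine would do; here, a word count.
    """
    return Counter(word for line in records for word in line.split())


def collect_then_process(source):
    """Batch model: gather every record first, then process them together."""
    batch = list(source)         # collection phase: nothing is processed yet
    return process_batch(batch)  # processing phase: runs once over the full set


if __name__ == "__main__":
    lines = ["spark batch", "spark streaming", "batch job"]
    for word, count in sorted(collect_then_process(lines).items()):
        print(word, count)
```

The point of the sketch is the ordering: processing starts only after the whole data set has been collected.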



from DZone.com Feed https://ift.tt/2vnp1e4
