Spark – Iam a Software Engineer

Spark 2.0

12 Jul 2016

What is new in Spark 2.0. Easier: SQL and streamlined APIs. One thing we are proud of in Spark is creating APIs that are simple, intuitive, and expressive. Spark 2.0 continues this tradition, with a focus on two areas: (1) standard SQL support and (2) unifying the DataFrame/Dataset API. On the SQL side, we have significantly expanded …

Hadoop vs Spark

15 Dec 2015

Listen in on any conversation about big data, and you’ll probably hear mention of Hadoop or Apache Spark. Here’s a brief look at what they do and how they compare. 1: They do different things. Hadoop and Apache Spark are both big-data frameworks, but they don’t really serve the same purposes. Hadoop is essentially a …

A Few Minutes' Guide to Understanding the Significance of Apache Spark

28 Jul 2015

So what is Spark? Spark is another execution framework. Like MapReduce, it works with the filesystem to distribute your data across the cluster and process that data in parallel. Like MapReduce, it also takes a set of instructions from an application written by a developer. MapReduce was generally coded in Java; Spark supports not only …

Apache Spark vs. MapReduce

28 Jul 2015

How Spark is different from MapReduce – Whiteboard Walkthrough by Anoop Dawar. If you look back, you will see that MapReduce has been the mainstay on Hadoop for batch jobs for a long, long time. However, two very promising technologies have emerged over the last year: Apache Drill, which is a low-latency SQL engine …

Apache Spark

2 Jul 2015

Apache Spark is a fast and general engine for large-scale data processing. Although MapReduce is great for large-scale data processing, it is not well suited to iterative algorithms or interactive analytics, because the data has to be reloaded for each iteration or materialized and replicated on the distributed file system between successive jobs. Apache Spark is designed …