Please click the link to download the file accelerating_realtime_analytics_with_spark_10082015.
Apache Spark brings the productivity of functional programming and the speed of in-memory data processing to Apache Hadoop. It has much to offer IT developers and data scientists alike, including support for popular languages like Java, Python and R, as well as its own distributed machine-learning library.
As powerful as Spark can be, it remains a complex creature. To deliver its full value, Spark needs to be an integrated part of a broader, Hadoop-based data management platform, such as the one provided by Cloudera and Talend. Using open tools to connect, ingest, govern, secure and manage data makes successful data science and real-time analytics projects a reality.
Please join Talend and Cloudera to learn:
- Apache Spark, its architecture and benefits
- Spark’s architecture, deployment strategies and use cases
- Spark’s impact on data science, analytics and machine learning
- How to move data scientists’ work to IT production
- Best practices for large Spark deployments
- Mastering Spark’s complexity
About Sean Owen, Director of Data Science, Cloudera EMEA
Sean is Director of Data Science at Cloudera in London. Before Cloudera, he founded Myrrix Ltd (now the Oryx project) to commercialize large-scale real-time recommender systems on Apache Hadoop. He is an Apache Spark committer and co-author of Advanced Analytics with Spark. He was a committer and VP for Apache Mahout, and is a co-author of Mahout in Action. Previously, Sean was a senior engineer at Google.
About Yann Delacourt, Director of Product Management for Big Data, Talend
Yann Delacourt is Director of Product Management at Talend. His field covers Data Integration, Big Data and Analytics. Yann brings more than 15 years of experience in the software industry, having held various leadership positions in product management and engineering at SAP and Business Objects.