Narrow your search

Library

UGent (9)

KU Leuven (6)

Odisee (6)

Thomas More Kempen (6)

Thomas More Mechelen (6)

UCLL (6)

ULB (6)

ULiège (6)

VIVES (6)

KBC (4)


Resource type

book (9)


Language

English (9)


Year

2019 (1)

2018 (4)

2017 (1)

2016 (2)

2015 (1)

Listing 1 - 9 of 9

Book
Spark : the definitive guide : big data processing made simple
Authors: ---
ISBN: 9781491912218 Year: 2018 Publisher: Beijing : O'Reilly,


Book
High performance spark : best practices for scaling and optimizing Apache Spark
Authors: ---
ISBN: 9781491943205 Year: 2017 Publisher: Beijing : O'Reilly,


Book
Learning spark
Authors: --- --- ---
ISBN: 9781449358624 1449358624 Year: 2015 Publisher: Beijing : O'Reilly,

Abstract

This book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. You'll learn how to express parallel jobs with just a few lines of code, and you'll cover applications ranging from simple batch jobs to stream processing and machine learning.
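
As a rough illustration of the "parallel jobs in a few lines of code" idea mentioned above, here is a minimal Scala sketch of an RDD word count. It is not taken from the book; the application name and input path are placeholders.

    import org.apache.spark.{SparkConf, SparkContext}

    object WordCount {
      def main(args: Array[String]): Unit = {
        // Run locally; "input.txt" is a placeholder path
        val sc = new SparkContext(new SparkConf().setAppName("WordCount").setMaster("local[*]"))
        val counts = sc.textFile("input.txt")   // lines as an RDD
          .flatMap(_.split("\\s+"))             // split lines into words
          .map(word => (word, 1))               // pair each word with a count of 1
          .reduceByKey(_ + _)                   // sum the counts per word in parallel
        counts.take(10).foreach(println)
        sc.stop()
      }
    }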


Book
Big Data SMACK : A Guide to Apache Spark, Mesos, Akka, Cassandra, and Kafka
Authors: ---
ISBN: 1484221745 1484221753 Year: 2016 Publisher: Berkeley, CA : Apress : Imprint: Apress,

Abstract

Integrate full-stack open-source fast data pipeline architecture and choose the correct technology, Spark, Mesos, Akka, Cassandra, and Kafka (SMACK), in every layer. Fast data is becoming a requirement for many enterprises. So far, however, the focus has largely been on collecting, aggregating, and crunching large data sets in a timely manner. In many cases organizations need more than one paradigm to perform efficient analyses. Big Data SMACK explains each technology and, more importantly, how to integrate them. It provides detailed coverage of the practical benefits of these technologies and incorporates real-world examples. The book focuses on the problems and scenarios solved by the architecture, as well as the solutions provided by each technology. This book covers the five main concepts of data pipeline architecture and how to integrate, replace, and reinforce every layer:
The engine: Apache Spark
The container: Apache Mesos
The model: Akka
The storage: Apache Cassandra
The broker: Apache Kafka
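
To make the layer roles above concrete, here is a hedged Scala sketch of one possible broker-to-engine hookup: Spark Structured Streaming subscribing to a Kafka topic. It is not from the book; it assumes the spark-sql-kafka connector is on the classpath, and the broker address ("localhost:9092") and topic name ("events") are placeholders.

    import org.apache.spark.sql.SparkSession

    object SmackIngestSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder.appName("SmackIngestSketch").master("local[*]").getOrCreate()

        // Broker layer -> engine layer: subscribe to a (hypothetical) Kafka topic "events"
        val events = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092")
          .option("subscribe", "events")
          .load()
          .selectExpr("CAST(value AS STRING) AS payload")

        // Printed to the console here for illustration; a full SMACK pipeline would
        // persist to the storage layer (Cassandra) instead, e.g. via the DataStax connector.
        val query = events.writeStream.outputMode("append").format("console").start()
        query.awaitTermination()
      }
    }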


Book
Pro Spark Streaming : The Zen of Real-Time Analytics Using Apache Spark
Author: Zubair Nabi
ISBN: 1484214803 148421479X Year: 2016 Publisher: Berkeley, CA : Apress : Imprint: Apress,

Abstract

Learn the right cutting-edge skills and knowledge to leverage Spark Streaming to implement a wide array of real-time, streaming applications. This book walks you through end-to-end real-time application development using real-world applications, data, and code. Taking an application-first approach, each chapter introduces use cases from a specific industry and uses publicly available datasets from that domain to unravel the intricacies of production-grade design and implementation. The domains covered in Pro Spark Streaming include social media, the sharing economy, finance, online advertising, telecommunication, and IoT. In the last few years, Spark has become synonymous with big data processing. DStreams enhance the underlying Spark processing engine to support streaming analysis with a novel micro-batch processing model. Pro Spark Streaming by Zubair Nabi will enable you to become a specialist in latency-sensitive applications by leveraging the key features of DStreams, micro-batch processing, and functional programming. To this end, the book includes ready-to-deploy examples and actual code. Pro Spark Streaming will act as the bible of Spark Streaming.
What You'll Learn:
Discover Spark Streaming application development and best practices
Work with the low-level details of discretized streams
Optimize production-grade deployments of Spark Streaming via configuration recipes and instrumentation using Graphite, collectd, and Nagios
Ingest data from disparate sources including MQTT, Flume, Kafka, Twitter, and a custom HTTP receiver
Integrate and couple with HBase, Cassandra, and Redis
Take advantage of design patterns for side effects and maintaining state across the Spark Streaming micro-batch model
Implement real-time and scalable ETL using data frames, SparkSQL, Hive, and SparkR
Use streaming machine learning, predictive analytics, and recommendations
Mesh batch processing with stream processing via the Lambda architecture
Who This Book Is For: Data scientists, big data experts, BI analysts, and data architects.
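
As a small, generic illustration of the DStream micro-batch model described above (not an example from the book), the following Scala sketch counts words arriving on a local text socket in one-second batches; the host and port are placeholders.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object DStreamWordCount {
      def main(args: Array[String]): Unit = {
        // local[2]: one thread for the receiver, one for processing
        val conf = new SparkConf().setAppName("DStreamWordCount").setMaster("local[2]")
        // Each micro-batch covers one second of incoming data
        val ssc = new StreamingContext(conf, Seconds(1))

        // DStream over a plain text socket (feed it locally with: nc -lk 9999)
        val lines = ssc.socketTextStream("localhost", 9999)
        val counts = lines.flatMap(_.split(" "))
          .map(word => (word, 1))
          .reduceByKey(_ + _)
        counts.print()

        ssc.start()
        ssc.awaitTermination()
      }
    }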


Book
Practical Apache Spark : Using the Scala API
Authors: ---
ISBN: 1484236521 1484236513 Year: 2018 Publisher: Berkeley, CA : Apress : Imprint: Apress,

Abstract

Work with Apache Spark using Scala to deploy and set up single-node, multi-node, and high-availability clusters. This book discusses various components of Spark such as Spark Core, DataFrames, Datasets and SQL, Spark Streaming, Spark MLlib, and R on Spark with the help of practical code snippets for each topic. Practical Apache Spark also covers the integration of Apache Spark with Kafka with examples. You’ll follow a learn-to-do-by-yourself approach to learning – learn the concepts, practice the code snippets in Scala, and complete the assignments given to get an overall exposure. On completion, you’ll have knowledge of the functional programming aspects of Scala, and hands-on expertise in various Spark components. You’ll also become familiar with machine learning algorithms with real-time usage.
You will:
Discover the functional programming features of Scala
Understand the complete architecture of Spark and its components
Integrate Apache Spark with Hive and Kafka
Use Spark SQL, DataFrames, and Datasets to process data using traditional SQL queries
Work with different machine learning concepts and libraries using Spark's MLlib packages.
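
The "traditional SQL queries over DataFrames and Datasets" point above can be illustrated with a short, self-contained Scala sketch; it is not from the book, and the Sale schema and sample rows are made up for the example.

    import org.apache.spark.sql.SparkSession

    // Hypothetical record type used only for this example
    case class Sale(region: String, amount: Double)

    object SqlOnDataFrames {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder.appName("SqlOnDataFrames").master("local[*]").getOrCreate()
        import spark.implicits._

        // A small in-memory Dataset stands in for a real table
        val sales = Seq(Sale("EU", 120.0), Sale("US", 80.0), Sale("EU", 60.0)).toDS()

        // Register it as a temporary view and query it with plain SQL
        sales.createOrReplaceTempView("sales")
        spark.sql("SELECT region, SUM(amount) AS total FROM sales GROUP BY region").show()

        spark.stop()
      }
    }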


Book
Next-Generation Big Data : A Practical Guide to Apache Kudu, Impala, and Spark
Author:
ISBN: 9781484231470 1484231473 1484231465 Year: 2018 Publisher: Berkeley, CA : Apress,

Abstract

Utilize this practical and easy-to-follow guide to modernize traditional enterprise data warehouse and business intelligence environments with next-generation big data technologies. Next-Generation Big Data takes a holistic approach, covering the most important aspects of modern enterprise big data. The book covers not only the main technology stack but also the next-generation tools and applications used for big data warehousing, data warehouse optimization, real-time and batch data ingestion and processing, real-time data visualization, big data governance, data wrangling, big data cloud deployments, and distributed in-memory big data computing. Finally, the book has extensive and detailed coverage of big data case studies from Navistar, Cerner, British Telecom, Shopzilla, Thomson Reuters, and Mastercard.
What You'll Learn:
Install Apache Kudu, Impala, and Spark to modernize enterprise data warehouse and business intelligence environments, complete with real-world, easy-to-follow examples, and practical advice
Integrate HBase, Solr, Oracle, SQL Server, MySQL, Flume, Kafka, HDFS, and Amazon S3 with Apache Kudu, Impala, and Spark
Use StreamSets, Talend, Pentaho, and CDAP for real-time and batch data ingestion and processing
Utilize Trifacta, Alteryx, and Datameer for data wrangling and interactive data processing
Turbocharge Spark with Alluxio, a distributed in-memory storage platform
Deploy big data in the cloud using Cloudera Director
Perform real-time data visualization and time series analysis using Zoomdata, Apache Kudu, Impala, and Spark
Understand enterprise big data topics such as big data governance, metadata management, data lineage, impact analysis, and policy enforcement, and how to use Cloudera Navigator to perform common data governance tasks
Implement big data use cases such as big data warehousing, data warehouse optimization, Internet of Things, real-time data ingestion and analytics, complex event processing, and scalable predictive modeling
Study real-world big data case studies from innovative companies, including Navistar, Cerner, British Telecom, Shopzilla, Thomson Reuters, and Mastercard
Who This Book Is For: BI and big data warehouse professionals interested in gaining practical and real-world insight into next-generation big data processing and analytics using Apache Kudu, Impala, and Spark; and those who want to learn more about other advanced enterprise topics.
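
As a hedged illustration of the Kudu-plus-Spark combination discussed above (not an example from the book), the sketch below reads a Kudu table into a Spark DataFrame and queries it with Spark SQL. It assumes the kudu-spark connector is on the classpath; the Kudu master address and table name are placeholders.

    import org.apache.spark.sql.SparkSession

    object KuduReadSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder.appName("KuduReadSketch").master("local[*]").getOrCreate()

        // Placeholder Kudu master address and table name
        val events = spark.read
          .options(Map("kudu.master" -> "kudu-master:7051",
                       "kudu.table"  -> "impala::default.events"))
          .format("org.apache.kudu.spark.kudu")
          .load()

        // Once loaded, the Kudu table is an ordinary DataFrame and can be queried with Spark SQL
        events.createOrReplaceTempView("events")
        spark.sql("SELECT COUNT(*) AS n FROM events").show()

        spark.stop()
      }
    }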


Book
Beginning Apache Spark 2 : With Resilient Distributed Datasets, Spark SQL, Structured Streaming and Spark Machine Learning library
Author:
ISBN: 1484235797 1484235789 Year: 2018 Publisher: Berkeley, CA : Apress : Imprint: Apress,

Abstract

Develop applications for the big data landscape with Spark and Hadoop. This book also explains the role of Spark in developing scalable machine learning and analytics applications with Cloud technologies. Beginning Apache Spark 2 gives you an introduction to Apache Spark and shows you how to work with it. Along the way, you’ll discover resilient distributed datasets (RDDs); use Spark SQL for structured data; and learn stream processing and build real-time applications with Spark Structured Streaming. Furthermore, you’ll learn the fundamentals of Spark ML for machine learning and much more. After you read this book, you will have the fundamentals to become proficient in using Apache Spark and know when and how to apply it to your big data applications.
You will:
Understand Spark’s unified data processing platform
Use and manipulate RDDs
Deal with structured data using Spark SQL
Build real-time applications using Spark Structured Streaming
Develop intelligent applications with the Spark Machine Learning library.
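
For a flavor of the Structured Streaming material described above, here is a minimal Scala sketch (a generic example, not taken from the book) that treats a local text socket as an unbounded table and maintains running word counts; host and port are placeholders.

    import org.apache.spark.sql.SparkSession

    object StructuredWordCount {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder.appName("StructuredWordCount").master("local[*]").getOrCreate()
        import spark.implicits._

        // Unbounded DataFrame over a local text socket (feed it with: nc -lk 9999)
        val lines = spark.readStream
          .format("socket")
          .option("host", "localhost")
          .option("port", 9999)
          .load()

        // The same Dataset operations used for batch jobs apply to the stream
        val counts = lines.as[String]
          .flatMap(_.split(" "))
          .groupBy("value")
          .count()

        val query = counts.writeStream.outputMode("complete").format("console").start()
        query.awaitTermination()
      }
    }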


Book
Scala Programming for Big Data Analytics : Get Started With Big Data Analytics Using Apache Spark
Author:
ISBN: 1484248104 1484248090 Year: 2019 Publisher: Berkeley, CA : Apress : Imprint: Apress,

Abstract

Gain the key language concepts and programming techniques of Scala in the context of big data analytics and Apache Spark. The book begins by introducing you to Scala and establishes a firm contextual understanding of why you should learn this language, how it stands in comparison to Java, and how Scala is related to Apache Spark for big data analytics. Next, you’ll set up the Scala environment and examine your first Scala programs. You’ll start with code blocks that allow you to group and execute related statements together as a block and see the implications for Scala’s type system. The author discusses functions at length and highlights a number of associated concepts such as zero-arity functions, single-line functions, and anonymous functions. Along the way you’ll see the development life cycle of a Scala program. This involves compiling and building programs using the industry-standard Scala Build Tool (SBT). You’ll cover guidelines related to dependency management with SBT, as this is critical for building large Apache Spark applications. Scala Programming for Big Data Analytics concludes by demonstrating how you can use these concepts to write programs that run on the Apache Spark framework. These programs provide distributed and parallel computing, which is critical for big data analytics.
You will:
See the fundamentals of Scala as a general-purpose programming language
Understand functional programming and object-oriented programming constructs in Scala
Comprehend the use and various features of the Scala REPL (shell)
Use Scala collections and functions
Employ functional programming constructs.
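
A few of the Scala constructs named above (code blocks as expressions, single-line functions, zero-arity functions, anonymous functions) can be shown in one small, self-contained sketch; it is illustrative only and not drawn from the book.

    object ScalaBasics extends App {
      // A code block groups statements; its value is the value of its last expression
      val area: Double = {
        val radius = 2.0
        math.Pi * radius * radius
      }

      // A single-line (single-expression) function
      def square(x: Int): Int = x * x

      // A zero-arity function: no parameters, called only for its result
      def greeting(): String = "hello"

      // An anonymous function passed to a higher-order collection method
      val doubled = List(1, 2, 3).map(n => n * 2)

      println(s"$area ${square(4)} ${greeting()} $doubled")
    }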
