Listing 1 - 10 of 21

Multi
Next-Generation Machine Learning with Spark
Authors: ---
ISBN: 9781484256695 1484256697 Year: 2020 Publisher: Berkeley, CA : Apress : Imprint: Apress

Abstract

Access real-world documentation and examples for the Spark platform for building large-scale, enterprise-grade machine learning applications. The past decade has seen an astonishing series of advances in machine learning. These breakthroughs are disrupting our everyday life and making an impact across every industry. Next-Generation Machine Learning with Spark provides a gentle introduction to Spark and Spark MLlib and advances to more powerful, third-party machine learning algorithms and libraries beyond what is available in the standard Spark MLlib library. By the end of this book, you will be able to apply your knowledge to real-world use cases through dozens of practical examples and insightful explanations.
You will:
- Be introduced to machine learning, Spark, and Spark MLlib 2.4.x
- Achieve lightning-fast gradient boosting on Spark with the XGBoost4J-Spark and LightGBM libraries
- Detect anomalies with the Isolation Forest algorithm for Spark
- Use the Spark NLP and Stanford CoreNLP libraries that support multiple languages
- Optimize your ML workload with the Alluxio in-memory data accelerator for Spark
- Use GraphX and GraphFrames for Graph Analysis
- Perform image recognition using convolutional neural networks
- Utilize the Keras framework and distributed deep learning libraries with Spark


Book
Modern data engineering with Apache Spark : a hands-on guide for building mission-critical streaming applications
Author:
ISBN: 1484274512 1484274520 Year: 2022 Publisher: [Place of publication not identified] : Apress,

Abstract

Leverage Apache Spark within a modern data engineering ecosystem. This hands-on guide will teach you how to write fully functional applications, follow industry best practices, and learn the rationale behind these decisions. With Apache Spark as the foundation, you will follow a step-by-step journey beginning with the basics of data ingestion, processing, and transformation, and ending up with an entire local data platform running Apache Spark, Apache Zeppelin, Apache Kafka, Redis, MySQL, Minio (S3), and Apache Airflow.


Multi
Beginning Apache Spark 3
Authors: ---
ISBN: 9781484273838 9781484273845 9781484273821 1484273834 Year: 2021 Publisher: Berkeley, CA : Apress : Imprint: Apress

Abstract

Take a journey toward discovering, learning, and using Apache Spark 3.0. In this book, you will gain expertise on the powerful and efficient distributed data processing engine inside of Apache Spark; its user-friendly, comprehensive, and flexible programming model for processing data in batch and streaming; and the scalable machine learning algorithms and practical utilities to build machine learning applications. Beginning Apache Spark 3 begins by explaining different ways of interacting with Apache Spark, such as Spark Concepts and Architecture, and Spark Unified Stack. Next, it offers an overview of Spark SQL before moving on to its advanced features. It covers tips and techniques for dealing with performance issues, followed by an overview of the structured streaming processing engine. It concludes with a demonstration of how to develop machine learning applications using Spark MLlib and how to manage the machine learning development lifecycle. This book is packed with practical examples and code snippets to help you master concepts and features immediately after they are covered in each section. After reading this book, you will have the knowledge required to build your own big data pipelines, applications, and machine learning applications.
What You Will Learn
- Master the Spark unified data analytics engine and its various components
- Work in tandem to provide a scalable, fault tolerant and performant data processing engine
- Leverage the user-friendly and flexible programming model to perform simple to complex data analytics using dataframe and Spark SQL
- Develop machine learning applications using Spark MLlib
- Manage the machine learning development lifecycle using MLflow
Who This Book Is For
Data scientists, data engineers and software developers.


Book
Next-Generation Machine Learning with Spark : Covers XGBoost, LightGBM, Spark NLP, Distributed Deep Learning with Keras, and More
Author:
ISBN: 1484256697 1484256689 Year: 2020 Publisher: Berkeley, CA : Apress : Imprint: Apress,

Abstract

Access real-world documentation and examples for the Spark platform for building large-scale, enterprise-grade machine learning applications. The past decade has seen an astonishing series of advances in machine learning. These breakthroughs are disrupting our everyday life and making an impact across every industry. Next-Generation Machine Learning with Spark provides a gentle introduction to Spark and Spark MLlib and advances to more powerful, third-party machine learning algorithms and libraries beyond what is available in the standard Spark MLlib library. By the end of this book, you will be able to apply your knowledge to real-world use cases through dozens of practical examples and insightful explanations.
You will:
- Be introduced to machine learning, Spark, and Spark MLlib 2.4.x
- Achieve lightning-fast gradient boosting on Spark with the XGBoost4J-Spark and LightGBM libraries
- Detect anomalies with the Isolation Forest algorithm for Spark
- Use the Spark NLP and Stanford CoreNLP libraries that support multiple languages
- Optimize your ML workload with the Alluxio in-memory data accelerator for Spark
- Use GraphX and GraphFrames for Graph Analysis
- Perform image recognition using convolutional neural networks
- Utilize the Keras framework and distributed deep learning libraries with Spark


Book
Scala Programming for Big Data Analytics
Authors: ---
ISBN: 9781484248102 1484248104 Year: 2019 Publisher: Berkeley, CA : Apress : Imprint: Apress

Abstract

Gain the key language concepts and programming techniques of Scala in the context of big data analytics and Apache Spark. The book begins by introducing you to Scala and establishes a firm contextual understanding of why you should learn this language, how it stands in comparison to Java, and how Scala is related to Apache Spark for big data analytics. Next, you’ll set up the Scala environment ready for examining your first Scala programs. You’ll start with code blocks that allow you to group and execute related statements together as a block and see the implications for Scala’s type system. The author discusses functions at length and highlights a number of associated concepts such as zero-parity functions, single-line functions, and anonymous functions. Along the way you’ll see the development life cycle of a Scala program. This involves compiling and building programs using the industry-standard Scala Build Tool (SBT). You’ll cover guidelines related to dependency management using SBT as this is critical for building large Apache Spark applications. Scala Programming for Big Data Analytics concludes by demonstrating how you can make use of the concepts to write programs that run on the Apache Spark framework. These programs will provide distributed and parallel computing, which is critical for big data analytics.
You will:
- See the fundamentals of Scala as a general-purpose programming language
- Understand functional programming and object-oriented programming constructs in Scala
- Comprehend the use and various features of Scala REPL (shell)
- Use Scala collections and functions
- Employ functional programming constructs


Multi
Modern Data Engineering with Apache Spark
Authors: ---
ISBN: 9781484274521 9781484274514 9781484274538 9781484284841 1484274520 Year: 2022 Publisher: Berkeley, CA : Apress : Imprint: Apress

Abstract

Leverage Apache Spark within a modern data engineering ecosystem. This hands-on guide will teach you how to write fully functional applications, follow industry best practices, and learn the rationale behind these decisions. With Apache Spark as the foundation, you will follow a step-by-step journey beginning with the basics of data ingestion, processing, and transformation, and ending up with an entire local data platform running Apache Spark, Apache Zeppelin, Apache Kafka, Redis, MySQL, Minio (S3), and Apache Airflow.


Book
Beginning Apache Spark 3 : with DataFrame, Spark SQL, structured streaming, and Spark machine learning library
Author:
ISBN: 1484273834 1484273826 Year: 2021 Publisher: New York, New York : Apress,

Abstract

Take a journey toward discovering, learning, and using Apache Spark 3.0. In this book, you will gain expertise on the powerful and efficient distributed data processing engine inside of Apache Spark; its user-friendly, comprehensive, and flexible programming model for processing data in batch and streaming; and the scalable machine learning algorithms and practical utilities to build machine learning applications. Beginning Apache Spark 3 begins by explaining different ways of interacting with Apache Spark, such as Spark Concepts and Architecture, and Spark Unified Stack. Next, it offers an overview of Spark SQL before moving on to its advanced features. It covers tips and techniques for dealing with performance issues, followed by an overview of the structured streaming processing engine. It concludes with a demonstration of how to develop machine learning applications using Spark MLlib and how to manage the machine learning development lifecycle. This book is packed with practical examples and code snippets to help you master concepts and features immediately after they are covered in each section. After reading this book, you will have the knowledge required to build your own big data pipelines, applications, and machine learning applications.
What You Will Learn
- Master the Spark unified data analytics engine and its various components
- Work in tandem to provide a scalable, fault tolerant and performant data processing engine
- Leverage the user-friendly and flexible programming model to perform simple to complex data analytics using dataframe and Spark SQL
- Develop machine learning applications using Spark MLlib
- Manage the machine learning development lifecycle using MLflow
Who This Book Is For
Data scientists, data engineers and software developers.


Multi
PolyBase Revealed
Authors: ---
ISBN: 9781484254615 1484254619 Year: 2020 Publisher: Berkeley, CA : Apress : Imprint: Apress

Abstract

Harness the power of PolyBase data virtualization software to make data from a variety of sources easily accessible through SQL queries while using the T-SQL skills you already know and have mastered. PolyBase Revealed shows you how to use the PolyBase feature of SQL Server 2019 to integrate SQL Server with Azure Blob Storage, Apache Hadoop, other SQL Server instances, Oracle, Cosmos DB, Apache Spark, and more. You will learn how PolyBase can help you reduce storage and other costs by avoiding the need for ETL processes that duplicate data in order to make it accessible from one source. PolyBase makes SQL Server into that one source, and T-SQL is your golden ticket. The book also covers PolyBase scale-out clusters, allowing you to distribute PolyBase queries among several SQL Server instances, thus improving performance. With great flexibility comes great complexity, and this book shows you where to look when queries fail, complete with coverage of internals, troubleshooting techniques, and where to find more information on obscure cross-platform errors. Data virtualization is a key target for Microsoft with SQL Server 2019. This book will help you keep your skills current, remain relevant, and build new business and career opportunities around Microsoft’s product direction.
You will:
- Install and configure PolyBase as a stand-alone service, or unlock its capabilities with a scale-out cluster
- Understand how PolyBase interacts with outside data sources while presenting their data as regular SQL Server tables
- Write queries combining data from SQL Server, Apache Hadoop, Oracle, Cosmos DB, Apache Spark, and more
- Troubleshoot PolyBase queries using SQL Server Dynamic Management Views
- Tune PolyBase queries using statistics and execution plans
- Solve common business problems, including "cold storage" of infrequently accessed data and simplifying ETL jobs


Multi
Beginning Apache Spark Using Azure Databricks
Authors: ---
ISBN: 9781484257814 9781484257807 1484257812 Year: 2020 Publisher: Berkeley, CA : Apress : Imprint: Apress

Abstract

Analyze vast amounts of data in record time using Apache Spark with Databricks in the Cloud. Learn the fundamentals, and more, of running analytics on large clusters in Azure and AWS, using Apache Spark with Databricks on top. Discover how to squeeze the most value out of your data at a mere fraction of what classical analytics solutions cost, while at the same time getting the results you need, incrementally faster. This book explains how the confluence of these pivotal technologies gives you enormous power, and cheaply, when it comes to huge datasets. You will begin by learning how cloud infrastructure makes it possible to scale your code to large amounts of processing units, without having to pay for the machinery in advance. From there you will learn how Apache Spark, an open source framework, can enable all those CPUs for data analytics use. Finally, you will see how services such as Databricks provide the power of Apache Spark, without you having to know anything about configuring hardware or software. By removing the need for expensive experts and hardware, your resources can instead be allocated to actually finding business value in the data. This book guides you through some advanced topics such as analytics in the cloud, data lakes, data ingestion, architecture, machine learning, and tools, including Apache Spark, Apache Hadoop, Apache Hive, Python, and SQL. Valuable exercises help reinforce what you have learned.
What You Will Learn
- Discover the value of big data analytics that leverage the power of the cloud
- Get started with Databricks using SQL and Python in either Microsoft Azure or AWS
- Understand the underlying technology, and how the cloud and Spark fit into the bigger picture
- See how these tools are used in the real world
- Run basic analytics, including machine learning, on billions of rows at a fraction of a cost or free
This book is for data engineers, data scientists, and cloud architects who want or need to run advanced analytics in the cloud. It is assumed that the reader has data experience, but perhaps minimal exposure to Apache Spark and Azure Databricks. The book is also recommended for people who want to get started in the analytics field, as it provides a strong foundation.
Robert Ilijason is a 20-year veteran in the business intelligence (BI) segment. He has worked as a contractor for some of Europe’s biggest companies and has conducted large-scale analytics projects within the areas of retail, telecom, banking, government, and more. Robert has seen his share of analytic trends come and go over the years, but unlike most of them, he strongly believes that Apache Spark in the cloud, especially with Azure Databricks, is a game changer.


Book
Beginning Apache Spark Using Azure Databricks : Unleashing Large Cluster Analytics in the Cloud
Author:
ISBN: 1484257812 1484257804 Year: 2020 Publisher: Berkeley, CA : Apress : Imprint: Apress,

Abstract

Analyze vast amounts of data in record time using Apache Spark with Databricks in the Cloud. Learn the fundamentals, and more, of running analytics on large clusters in Azure and AWS, using Apache Spark with Databricks on top. Discover how to squeeze the most value out of your data at a mere fraction of what classical analytics solutions cost, while at the same time getting the results you need, incrementally faster. This book explains how the confluence of these pivotal technologies gives you enormous power, and cheaply, when it comes to huge datasets. You will begin by learning how cloud infrastructure makes it possible to scale your code to large amounts of processing units, without having to pay for the machinery in advance. From there you will learn how Apache Spark, an open source framework, can enable all those CPUs for data analytics use. Finally, you will see how services such as Databricks provide the power of Apache Spark, without you having to know anything about configuring hardware or software. By removing the need for expensive experts and hardware, your resources can instead be allocated to actually finding business value in the data. This book guides you through some advanced topics such as analytics in the cloud, data lakes, data ingestion, architecture, machine learning, and tools, including Apache Spark, Apache Hadoop, Apache Hive, Python, and SQL. Valuable exercises help reinforce what you have learned.
What You Will Learn
- Discover the value of big data analytics that leverage the power of the cloud
- Get started with Databricks using SQL and Python in either Microsoft Azure or AWS
- Understand the underlying technology, and how the cloud and Spark fit into the bigger picture
- See how these tools are used in the real world
- Run basic analytics, including machine learning, on billions of rows at a fraction of a cost or free
This book is for data engineers, data scientists, and cloud architects who want or need to run advanced analytics in the cloud. It is assumed that the reader has data experience, but perhaps minimal exposure to Apache Spark and Azure Databricks. The book is also recommended for people who want to get started in the analytics field, as it provides a strong foundation.
Robert Ilijason is a 20-year veteran in the business intelligence (BI) segment. He has worked as a contractor for some of Europe’s biggest companies and has conducted large-scale analytics projects within the areas of retail, telecom, banking, government, and more. Robert has seen his share of analytic trends come and go over the years, but unlike most of them, he strongly believes that Apache Spark in the cloud, especially with Azure Databricks, is a game changer.
