جزییات کتاب
Start your AWS data engineering journey with this easy-to-follow, hands-on guide and get to grips with foundational concepts through to building data engineering pipelines using AWSKey FeaturesLearn about common data architectures and modern approaches to generating value from big dataExplore AWS tools for ingesting, transforming, and consuming data, and for orchestrating pipelinesLearn how to architect and implement data lakes and data lakehouses for big data analyticsBook DescriptionKnowing how to architect and implement complex data pipelines is a highly sought-after skill. Data engineers are responsible for building these pipelines that ingest, transform, and join raw datasets - creating new value from the data in the process.Amazon Web Services (AWS) offers a range of tools to simplify a data engineer's job, making it the preferred platform for performing data engineering tasks. This book will take you through the services and the skills you need to architect and implement data pipelines on AWS. You'll begin by reviewing important data engineering concepts and some of the core AWS services that form a part of the data engineer's toolkit. You'll then architect a data pipeline, review raw data sources, transform the data, and learn how the transformed data is used by various data consumers. The book also teaches you about populating data marts and data warehouses along with how a data lakehouse fits into the picture. Later, you'll be introduced to AWS tools for analyzing data, including those for ad-hoc SQL queries and creating visualizations. In the final chapters, you'll understand how the power of machine learning and artificial intelligence can be used to draw new insights from data.By the end of this AWS book, you'll be able to carry out data engineering tasks and implement a data pipeline on AWS independently.What you will learnUnderstand data engineering concepts and emerging technologiesIngest streaming data with Amazon Kinesis Data FirehoseOptimize, denormalize, and join datasets with AWS Glue StudioUse Amazon S3 events to trigger a Lambda process to transform a fileRun complex SQL queries on data lake data using Amazon AthenaLoad data into a Redshift data warehouse and run queriesCreate a visualization of your data using Amazon QuickSightExtract sentiment data from a dataset using Amazon ComprehendWho this book is forThis book is for data engineers, data analysts, and data architects who are new to AWS and looking to extend their skills to the AWS cloud. Anyone who is new to data engineering and wants to learn about the foundational concepts while gaining practical experience with common data engineering services on AWS will also find this book useful.A basic understanding of big data-related topics and Python coding will help you get the most out of this book but is not needed. Familiarity with the AWS console and core services is also useful but not necessary.Table of ContentsAn Introduction to Data EngineeringData Management Architectures for AnalyticsThe AWS Data Engineer's ToolkitData Cataloging, Security and GovernanceArchitecting Data Engineering PipelinesIngesting Batch and Streaming DataTransforming Data to Optimize for AnalyticsIdentifying and Enabling Data ConsumersLoading Data into a Data MartOrchestrating the Data PipelineAd Hoc Queries with Amazon AthenaVisualizing Data with Amazon QuickSightEnabling Artificial Intelligence and Machine LearningWrapping Up the First Part of Your Learning Journey