Amazon Web Services (AWS) Certification Online Learning

Course: Teaching

You are: Offering Professional Course

Locality: Aarey Milk Colony

Mobile: +91 76940 95404

Description for "Amazon Web Services (AWS) Certification Online Learning"

Overview of Amazon Redshift: The Data Warehouse on Cloud

Data warehouses give businesses the ability to slice and dice data and extract valuable insights from it to make better business decisions. Used for reporting and data analysis, data warehouses act as a central repository for all, or portions of, the data collected by an enterprise's various systems. Data warehouses are "fed" data from different sources, such as relational or NoSQL databases and third-party APIs. All these sources need to be combined into a coherent data set that is optimized for fast database queries.
Amazon Redshift is a fully managed, petabyte-scale cloud data warehouse service offered by Amazon Web Services. It removes the overhead of the months of effort required to set up a data warehouse and manage the hardware and software associated with it. On-premises data warehouses are appliance-based, making them difficult to expand, while cloud data warehouses offer elasticity, scalability, and the ability to handle big data volumes while still using familiar models (such as the SQL/relational model used by Redshift).
Redshift also offers advanced options, such as reading the query plan to improve performance, workload management, cluster resizing, and integration with other AWS services.

Redshift based Cloud Data Warehouse Architecture
Let's begin with a brief introduction to the Redshift architecture:
Leader Node:
The leader node parses the query, develops the query execution plan, and distributes it to the compute nodes. The leader node is provisioned automatically by the service and is not billed separately.
Compute Node:
This is the node that stores data and executes the query. Each compute node has its own compute, memory, and storage.
Client Applications:
Client applications can be standard ETL, BI, and analytics tools.
Internal Networking:
All the nodes are connected internally through a 10 Gigabit network, enabling fast data transfer between the nodes. The compute nodes are not exposed to client applications; client applications always talk to the leader node.
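
The division of work between the leader node and the compute nodes can be illustrated with a toy model. This is only a sketch in plain Python, not how Redshift is implemented: the class names, the round-robin row distribution, and the single "sum" query are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


class ComputeNode:
    """Toy compute node: stores its own slice of rows and works only on that slice."""

    def __init__(self, rows):
        self.rows = rows

    def partial_sum(self, column):
        # Each node aggregates only the data it holds locally.
        return sum(row[column] for row in self.rows)


class LeaderNode:
    """Toy leader node: the only component a client application talks to."""

    def __init__(self, compute_nodes):
        self.compute_nodes = compute_nodes

    def total(self, column):
        # "Plan": ask every compute node for a partial sum in parallel, then combine.
        with ThreadPoolExecutor() as pool:
            partials = list(pool.map(lambda node: node.partial_sum(column), self.compute_nodes))
        return sum(partials)


def distribute(rows, node_count):
    # Hypothetical round-robin distribution of rows across compute nodes.
    return [ComputeNode(rows[i::node_count]) for i in range(node_count)]


sales = [{"amount": a} for a in (10, 20, 30, 40, 50)]
leader = LeaderNode(distribute(sales, node_count=2))
print(leader.total("amount"))  # 150
```

The client code only ever calls the leader; the compute nodes stay hidden behind it, mirroring the description above.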
Key features of Amazon Redshift:
Faster performance
Machine learning
Result caching
Scalable
Compression

Faster performance: Amazon Redshift delivers fast query performance on data sets ranging in size from gigabytes to exabytes. Redshift uses columnar storage, data compression, and zone maps to reduce the amount of I/O needed to perform queries. The underlying hardware is designed for high-performance data processing, using locally attached storage to maximize throughput between the CPUs and drives, and a high-bandwidth mesh network to maximize throughput between nodes.
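
To see why zone maps reduce I/O, consider a simplified sketch: a column is split into blocks, the minimum and maximum of each block are recorded, and a scan skips any block whose range cannot match the filter. The block size of 4 and the function names below are assumptions for illustration; real Redshift blocks are far larger and managed internally.

```python
BLOCK_SIZE = 4  # assumed tiny block size, for illustration only


def build_zone_map(column_values, block_size=BLOCK_SIZE):
    # Split one column into fixed-size blocks and record (min, max) per block.
    blocks = [column_values[i:i + block_size] for i in range(0, len(column_values), block_size)]
    zone_map = [(min(block), max(block)) for block in blocks]
    return blocks, zone_map


def scan_greater_than(blocks, zone_map, threshold):
    # Return values > threshold, reading only blocks whose max exceeds the threshold.
    matches, blocks_read = [], 0
    for block, (_low, high) in zip(blocks, zone_map):
        if high <= threshold:
            continue  # the zone map proves no value in this block can match
        blocks_read += 1
        matches.extend(value for value in block if value > threshold)
    return matches, blocks_read


order_years = [2015, 2015, 2016, 2016, 2017, 2017, 2018, 2018, 2019, 2019, 2020, 2020]
blocks, zone_map = build_zone_map(order_years)
matches, blocks_read = scan_greater_than(blocks, zone_map, 2018)
print(matches)                   # [2019, 2019, 2020, 2020]
print(blocks_read, len(blocks))  # 1 3
```

Only one of the three blocks is read here because the column happens to be sorted; on unsorted data the block ranges overlap and fewer blocks can be skipped.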

Machine learning:
Amazon Redshift uses machine learning to deliver high throughput based on your workloads. Redshift uses sophisticated algorithms to predict incoming query run times and assigns each query to the optimal queue for the fastest processing.

Result caching:
Amazon Redshift uses result caching to deliver sub-second response times for repeat queries. Dashboard, visualization, and business intelligence tools that execute repeat queries experience a significant performance boost. When a query executes, Redshift searches the cache to see if there is a cached result from a prior run. If a valid cached result is found, Redshift returns it instead of running the query again.
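
The idea behind result caching can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `ResultCache` class, keying the cache on the query text, and the explicit `invalidate` call are all hypothetical and stand in for logic that Redshift handles internally.

```python
class ResultCache:
    """Toy result cache: repeat queries are served without re-running them."""

    def __init__(self, run_query):
        self._run_query = run_query  # function that actually executes the SQL
        self._cache = {}
        self.executions = 0          # how many times the query really ran

    def execute(self, sql):
        key = sql.strip()
        if key in self._cache:
            return self._cache[key]  # cache hit: no execution needed
        self.executions += 1
        result = self._run_query(sql)
        self._cache[key] = result
        return result

    def invalidate(self):
        # Assumed to be called when the underlying data changes.
        self._cache.clear()


def slow_warehouse_query(sql):
    # Stand-in for real query execution.
    return [("total_sales", 150)]


cache = ResultCache(slow_warehouse_query)
first = cache.execute("SELECT SUM(amount) FROM sales")
second = cache.execute("SELECT SUM(amount) FROM sales")
print(cache.executions)  # 1

cache.invalidate()
cache.execute("SELECT SUM(amount) FROM sales")
print(cache.executions)  # 2
```

A dashboard that refreshes the same query every few seconds would hit the cache on every refresh until the data changes.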

Scalable:
The number of nodes in a Redshift cluster can be changed dynamically through the AWS Management Console or the API. We can add more nodes to the cluster for increased performance or when we need more storage. During the resize, the cluster is placed in read-only mode and all the data is copied to a new cluster. Once the new cluster is fully operational, the old cluster is terminated; the process is entirely transparent to clients.
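
As a rough sketch of resizing through the API, the Python below builds the parameters for the `resize_cluster` operation of the boto3 Redshift client. The cluster name `example-cluster` and the helper function names are hypothetical, and `FakeRedshiftClient` is a stand-in so the sketch runs without AWS credentials; with real access you would pass `boto3.client("redshift")` instead.

```python
def build_resize_request(cluster_id, number_of_nodes, classic=True):
    # Classic=True requests a classic resize, the copy-to-a-new-cluster
    # behaviour described above.
    if number_of_nodes < 1:
        raise ValueError("number_of_nodes must be at least 1")
    return {
        "ClusterIdentifier": cluster_id,
        "NumberOfNodes": number_of_nodes,
        "Classic": classic,
    }


def resize_cluster(client, cluster_id, number_of_nodes):
    # `client` is expected to behave like boto3.client("redshift").
    return client.resize_cluster(**build_resize_request(cluster_id, number_of_nodes))


class FakeRedshiftClient:
    """Stand-in client that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def resize_cluster(self, **kwargs):
        self.calls.append(kwargs)
        return kwargs


client = FakeRedshiftClient()
resize_cluster(client, "example-cluster", 4)  # hypothetical cluster name
print(client.calls[0]["NumberOfNodes"])  # 4
```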

Compression:
Compression reduces disk usage; data is decompressed after it is loaded into memory during query execution. Because Redshift uses columnar storage, it can apply a compression encoding suited to each column's data type.
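
Columnar storage helps compression because the values in a single column tend to repeat. The sketch below uses run-length encoding, one of the simplest encodings, on a single column; the function names and sample data are assumptions for illustration and do not reflect Redshift's on-disk format.

```python
def run_length_encode(column):
    # Collapse consecutive repeated values into (value, count) pairs.
    encoded = []
    for value in column:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return [tuple(pair) for pair in encoded]


def run_length_decode(encoded):
    # Expand (value, count) pairs back into the original column.
    return [value for value, count in encoded for _ in range(count)]


country_column = ["IN", "IN", "IN", "US", "US", "IN"]
encoded = run_length_encode(country_column)
print(encoded)  # [('IN', 3), ('US', 2), ('IN', 1)]
print(run_length_decode(encoded) == country_column)  # True
```

Stored row by row, the same values would be interleaved with other columns and the runs would disappear, which is why this kind of encoding pairs naturally with columnar storage.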

Security:
Virtual Private Cloud:
You can launch Redshift within a VPC and control access to the cluster through the virtual networking environment.

Encryption: Data stored in Redshift can be encrypted. Encryption is configured at the cluster level, for example when the cluster is created, rather than per table.

SSL: SSL encryption can be used to encrypt connections between clients and Redshift.

Data in transit encryption: Redshift uses hardware-accelerated SSL when communicating with Amazon S3 or DynamoDB.
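
For the client-side SSL item above, a minimal sketch is to build a libpq-style connection string that requests SSL. The host name, database, and credentials below are hypothetical placeholders; the port 5439 is Redshift's default. The resulting string could be passed to a PostgreSQL-compatible driver, for example `psycopg2.connect(dsn)`, which is not imported here so that the sketch runs without extra packages.

```python
def build_redshift_dsn(host, dbname, user, password, port=5439, sslmode="require"):
    # sslmode="require" asks the driver to refuse unencrypted connections.
    # Note: values containing spaces would need quoting; not handled in this sketch.
    params = {
        "host": host,
        "port": port,
        "dbname": dbname,
        "user": user,
        "password": password,
        "sslmode": sslmode,
    }
    return " ".join(f"{key}={value}" for key, value in params.items())


dsn = build_redshift_dsn(
    host="example-cluster.example.us-east-1.redshift.amazonaws.com",  # hypothetical endpoint
    dbname="dev",
    user="analyst",
    password="not-a-real-password",
)
print("sslmode=require" in dsn)  # True
```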

Query your data lake:

Amazon S3 data lake:

Amazon Redshift can extend your queries to your Amazon S3 data lake without loading the data first. You can query open file formats you already use, such as Avro, CSV, Grok, JSON, ORC, Parquet, and more, directly in S3. This gives you the flexibility to store highly structured, frequently accessed data on Redshift local disks while leaving other data in S3.

If you found this post valuable, take a look at our online learning courses for more tips, tricks, and methods for uncovering unique insights from your data.

#amazon kinesis #aws redshift training #aws training #data warehouse architecture #data warehousing on aws #data warehousing services