Data Warehousing with AWS Redshift
With AWS Glue, the data that was initially in a flat model can now be represented with a more fitting star schema in a data warehouse.
The cloud data warehouse for this data will be created with AWS Redshift Serverless. This entails creating a namespace named flights-namespace as well as a database named dev. In addition, it requires a workgroup named flights-workgroup, which will be used to write SQL queries.
Note: The workgroup has been configured to allow devices outside of the VPC to access the database. This will be useful when creating the visualization with Power BI.
Now, we can open the query editor in Redshift and start creating the fact and dimension tables in the dev database.
First, the 4 tables in the schema must be created in the warehouse using the following commands:
The 4 tables are now in the data warehouse, but they are all empty since the data is still in the flights-data-processed bucket.
The data can be copied into this data warehouse using the COPY command.
For instance, the data in flights.csv can be copied into the flights table using the following command syntax:
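As a minimal sketch of what such a COPY statement can look like, the small Python helper below assembles one from its parts. The helper name build_copy_statement, the placeholder role ARN, and the assumptions that flights.csv sits at the root of the bucket and starts with a header row are illustrative only and should be adjusted to the actual setup.

```python
def build_copy_statement(table: str, bucket: str, key: str, iam_role: str) -> str:
    """Return a Redshift COPY statement that loads one CSV object from S3 into a table."""
    # "default" is a Redshift keyword (use the default role attached to the
    # namespace), so it is written unquoted; a role ARN is written as a string.
    if iam_role == "default":
        role_clause = "IAM_ROLE default"
    else:
        role_clause = f"IAM_ROLE '{iam_role}'"
    return (
        f"COPY {table}\n"
        f"FROM 's3://{bucket}/{key}'\n"
        f"{role_clause}\n"
        "FORMAT AS CSV\n"
        "IGNOREHEADER 1;"  # assumption: the first row of the file is a header
    )


if __name__ == "__main__":
    # The ARN below is a placeholder, not a real role.
    print(
        build_copy_statement(
            table="flights",
            bucket="flights-data-processed",
            key="flights.csv",
            iam_role="arn:aws:iam::123456789012:role/example-redshift-role",
        )
    )
```

The printed statement can be pasted into the query editor. If the CSV files have no header row, the IGNOREHEADER line should be dropped.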
Note: the iam_role variable should be assigned whichever IAM role was chosen when creating the workgroup.
By executing the COPY command for each of the CSV files in the flights-data-processed bucket, the 4 tables should be filled with the required data.
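Running the statements one by one in the query editor works fine. As an alternative, the sketch below shows how a list of SQL statements could be submitted through the Redshift Data API and awaited in order. The function name run_statements is hypothetical, and the client argument is assumed to behave like boto3.client("redshift-data"), whose execute_statement and describe_statement calls are used here.

```python
import time


def run_statements(client, workgroup, database, statements, poll_seconds=2.0):
    """Submit each SQL statement in order and wait until it has finished."""
    for sql in statements:
        # WorkgroupName targets a Redshift Serverless workgroup.
        statement_id = client.execute_statement(
            WorkgroupName=workgroup, Database=database, Sql=sql
        )["Id"]
        while True:
            description = client.describe_statement(Id=statement_id)
            status = description["Status"]
            if status == "FINISHED":
                break
            if status in ("FAILED", "ABORTED"):
                error = description.get("Error", "no error message")
                raise RuntimeError(f"{status}: {error}\n{sql}")
            time.sleep(poll_seconds)


# Example usage (requires boto3 and AWS credentials, so it is left commented out):
#
# import boto3
# client = boto3.client("redshift-data")
# run_statements(
#     client,
#     workgroup="flights-workgroup",
#     database="dev",
#     statements=["COPY flights FROM 's3://flights-data-processed/flights.csv' "
#                 "IAM_ROLE default FORMAT AS CSV IGNOREHEADER 1;"],
# )
```

Waiting for each statement before sending the next keeps failures easy to trace back to a single file, at the cost of loading the tables sequentially.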
For example, here is a preview of the airport table: