Data Warehousing with AWS Redshift
With AWS Glue, the data that was initially in a flat model can now be represented with a more fitting star schema in a data warehouse.
The cloud data warehouse for this data will be created with AWS Redshift Serverless. This entails creating a namespace named flights-namespace as well as a database named dev. In addition, it requires a workgroup named flights-workgroup, which will be used to run SQL queries.
Note: The workgroup has been configured to allow devices outside of the VPC to access the database. This will be useful when creating the visualization with Power BI.
Now, we can open the query editor in Redshift and start creating the fact and dimension tables in the dev database.
First, the four tables in the schema must be created in the warehouse using the following commands:
The four tables are now in the data warehouse, but they are all empty since the data is still in the flights-data-processed bucket.
The data can be copied into this data warehouse using the COPY command.
For instance, the data in flights.csv can be copied into the flights table using the following command syntax:
The iam_role variable should be assigned whichever IAM role was chosen when creating the workgroup.
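As a rough illustration of what such a command can look like, the sketch below assembles a COPY statement as a plain string in Python. It is not the exact command used in this project: the helper name build_copy_statement is hypothetical, the role ARN is a placeholder, and the sketch assumes that flights.csv sits at the root of the flights-data-processed bucket and has a header row (hence IGNOREHEADER 1).

```python
def build_copy_statement(table: str, s3_uri: str, iam_role: str) -> str:
    """Return a Redshift COPY statement that loads one CSV file into one table."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS CSV\n"
        "IGNOREHEADER 1;"  # assumption: the CSV file starts with a header row
    )


# Placeholder only: substitute the ARN of the IAM role chosen for the workgroup.
iam_role = "arn:aws:iam::<account-id>:role/<role-name>"

statement = build_copy_statement(
    table="flights",
    # Assumed object location: flights.csv at the root of the bucket.
    s3_uri="s3://flights-data-processed/flights.csv",
    iam_role=iam_role,
)
print(statement)
```

The printed statement can be pasted into the Redshift query editor once the placeholder ARN has been replaced with a real one.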
By executing the COPY command for each of the csv files in the flights-data-processed bucket, the four tables should be filled with the required data.
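Repeating the command per file can be sketched as a small loop that derives each table name from its file name. This is only a sketch under stated assumptions: the function copy_statements is hypothetical, the role ARN is a placeholder, and only flights.csv is named in the text. The second file name, airport.csv, is assumed from the airport table, and the remaining two files would have to be added with their real names.

```python
from pathlib import PurePosixPath

BUCKET = "flights-data-processed"
# Placeholder only: substitute the ARN of the IAM role chosen for the workgroup.
IAM_ROLE = "arn:aws:iam::<account-id>:role/<role-name>"

# flights.csv comes from the text; airport.csv is an assumed file name.
# Add the remaining csv files from the bucket here.
csv_files = ["flights.csv", "airport.csv"]


def copy_statements(files, bucket, iam_role):
    """Build one COPY statement per CSV file, naming the table after the file."""
    statements = []
    for name in files:
        table = PurePosixPath(name).stem  # "flights.csv" -> "flights"
        statements.append(
            f"COPY {table} FROM 's3://{bucket}/{name}' "
            f"IAM_ROLE '{iam_role}' FORMAT AS CSV IGNOREHEADER 1;"
        )
    return statements


statements = copy_statements(csv_files, BUCKET, IAM_ROLE)
for sql in statements:
    print(sql)
```

Deriving the table name from the file name only works if the two match, as they do for flights.csv and the flights table. Otherwise an explicit mapping from file to table would be the safer choice.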
For instance, here’s a preview of the airport desk: