Unit testing dbt models has always been one of the most critical missing pieces of the dbt ecosystem. This article proposes a new unit testing approach that relies on standards and dbt best practices.
Ever since dbt introduced software engineering best practices to the realm of data engineering, its functionalities and the ecosystem around it have kept expanding to cover yet more areas of the data transformation space.
However, one important piece of the “data engineering with software engineering best practices” puzzle remains elusive and unsolved: unit testing.
Justifying the importance of unit tests, why they’re essential for any line of code before it can be called “production-ready”, and why they’re different from dbt Tests or data quality tests is something that has already been brilliantly tackled and explained. But if we had to summarize their importance in a one-minute elevator pitch, it’d be the following:
In data engineering there are typically two different components that we want to test: the data and our code. dbt Tests (and other data quality systems/tools) allow us to test the data, whereas unit tests allow us to test our code.
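To make the distinction concrete, here is a minimal sketch in plain Python rather than dbt. Every name in it (`net_amount`, `check_not_null`, `test_net_amount`) is hypothetical and used only for illustration: the first function stands in for the logic of a model, the second mimics a data test that inspects whatever rows happen to exist, and the third is a unit test that checks the logic against a fixed input with a known expected output.

```python
# Hypothetical transformation logic, standing in for the SQL of a dbt model.
def net_amount(gross: float, tax_rate: float) -> float:
    """Return the amount net of tax, rounded to two decimals."""
    return round(gross / (1 + tax_rate), 2)


def check_not_null(rows: list[dict], column: str) -> list[dict]:
    """Data test: return the rows that violate a not-null rule on `column`.

    It says something about the data we received, not about our code.
    """
    return [row for row in rows if row.get(column) is None]


def test_net_amount() -> None:
    """Unit test: fixed input, known expected output, no real data involved.

    It says something about our code, whatever data arrives later.
    """
    assert net_amount(120.0, 0.2) == 100.0
    assert net_amount(0.0, 0.2) == 0.0
```

The data test can pass while the logic is wrong, and the unit test can pass while the incoming data is broken, which is why the two are complementary rather than interchangeable.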
With the above in mind, it’s only natural that there have been multiple initiatives by the community to enhance dbt with an open-source unit testing capability (like Equal Experts’ dbt Unit Testing package or GoDataDriven’s dbt-focused Pytest plugin). However, these packages remain limited in functionality and have a steep learning curve.
This article introduces a different approach that’s much simpler yet more elegant, relying on standards and dbt best practices to implement a scalable and reliable unit testing process.
Before diving into the approach, let’s first define the level at which we want to run our unit tests. The question to answer…