January 10, 2021
This week I’ve mainly focused on data quality, through tools like Great Expectations, and on data modeling with the help of Airflow and DBT. The articles below show how themes like data quality are becoming mainstream!
“Why is DBT so important?” is a good article on the main selling points of DBT, a framework that brings good software practices to data modeling. For teams that can’t use DBT Cloud, frameworks like Airflow are increasingly important, and the trend toward better integration between these tools is great, as presented in “Building a Scalable Analytics Architecture with Airflow and DBT: Part 1” and its follow-up.
After running our models, I’m used to thinking that the data will only be seen in a BI tool or Excel. However, after reading “Making your DBT models more useful with Census”, I’ve become interested in the idea of sending the results of DBT models to other services like Salesforce.
Although DBT is great by itself, we can integrate further good software practices, such as observability, as explained in “Data Observability: The Next Frontier of Data Engineering”. Another good example of how to improve our pipelines is “Data Quality”, which makes the case for always validating our models with data tests.
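To make the idea of data tests concrete, here is a minimal sketch in plain Python. It assumes rows are simple dictionaries, and the helper names `check_not_null` and `check_unique` are hypothetical: they only mirror the spirit of DBT's built-in `not_null` and `unique` tests, not how DBT or Great Expectations implement them. The sample data is made up for illustration.

```python
from collections import Counter
from typing import Any, Dict, List

Row = Dict[str, Any]


def check_not_null(rows: List[Row], column: str) -> List[Row]:
    """Return the failing rows: those where `column` is missing or None."""
    return [row for row in rows if row.get(column) is None]


def check_unique(rows: List[Row], column: str) -> List[Any]:
    """Return the non-null values of `column` that appear more than once."""
    counts = Counter(
        row.get(column) for row in rows if row.get(column) is not None
    )
    return [value for value, n in counts.items() if n > 1]


# Hypothetical output of a model, with one null and one duplicate key.
orders = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 2, "customer_id": None},
    {"order_id": 2, "customer_id": 12},
]

null_failures = check_not_null(orders, "customer_id")
duplicate_ids = check_unique(orders, "order_id")

print(f"not_null failures: {len(null_failures)}")
print(f"duplicate order_id values: {duplicate_ids}")
```

In a real pipeline these checks would run right after the model is built, and any non-empty result would fail the run before the data reaches a dashboard.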
Finally, I’m taking some notes from “Building a robust data pipeline with DBT, Airflow, and Great Expectations” to apply in future data pipelines.
As always, have a good week :-)
I'm Jose Cabeda, a data engineer focused on improving data systems and educating on how to use them. I also do a lot of planning and read as much as I can.