April 03, 2021
This was a relatively calm week. I’ve read a lot and coded less than I wanted. I’ll try to focus more on development and less on reading over the next few weeks.
Data isn’t a panacea, and I found What Data Can’t Do a great overview of how the wrong data leads to the wrong insights. Because SQL is a declarative language, the query planner is responsible for finding the most performant way to retrieve data. In Speeding up SQL queries by orders of magnitude using UNION, Ben Levy and Christian Charukiewicz describe a case where the planner’s choice isn’t optimal and show how a UNION can greatly improve the query. How Data Discovery Tools Enable Data Democratization is an article by DataCamp on why and how these tools improve trust in data and data fluency.
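To make the UNION idea a bit more concrete, here is a minimal sketch in Python with an in-memory SQLite database. The tables, columns and rows are hypothetical and made up for illustration; they are not the schema or queries from the article. It only shows that an OR spanning two joined tables can be rewritten as a UNION of two simpler queries that return the same rows. Whether the UNION form is actually faster depends on the database, its planner and the available indexes, so treat it as something to measure rather than a rule.

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers (id),
        contact_email TEXT
    );
    CREATE INDEX idx_customers_email ON customers (email);
    CREATE INDEX idx_orders_contact_email ON orders (contact_email);

    INSERT INTO customers VALUES (1, 'ana@example.com'), (2, 'bob@example.com');
    INSERT INTO orders VALUES
        (10, 1, 'ana@example.com'),
        (11, 2, 'ana@example.com'),
        (12, 2, 'bob@example.com');
""")

target = "ana@example.com"

# One query with an OR that touches columns from both tables.
or_query = """
    SELECT o.id
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE c.email = ? OR o.contact_email = ?
"""

# The same question split in two: each branch filters on a single
# indexed column, and UNION removes rows matched by both branches.
union_query = """
    SELECT o.id
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE c.email = ?
    UNION
    SELECT o.id
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE o.contact_email = ?
"""

or_ids = sorted(row[0] for row in conn.execute(or_query, (target, target)))
union_ids = sorted(row[0] for row in conn.execute(union_query, (target, target)))

print(or_ids, union_ids)
```

Prefixing either query with EXPLAIN QUERY PLAN in SQLite (or EXPLAIN in other databases) is a reasonable way to check what the planner actually does with each form.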
Kafka has been working on removing its ZooKeeper dependency and has finally released a preview, as shown in Apache Kafka Made Simple: A First Glimpse of a Kafka Without ZooKeeper. This is really exciting, as it simplifies cluster deployment and should increase the reliability and performance of Kafka itself (especially when scaling horizontally). Why Kafka Is so Fast gives some insight into how Kafka’s architecture makes it “fast” (not the best term, but it makes the point).
I’ve been paying more attention to Scala as a good entry point into functional programming. For all the good things it brings, it also has some challenges to solve, as explained in Scala is a Maintenance Nightmare.
Good week, stay safe :-)
I'm José Cabeda, a data engineer focused on improving data systems and educating on how to use them. I also do a lot of planning and read as much as I can.