For far too long, data catalogs have been overly focused on data users while shunning the needs of software engineers and, specifically, data engineers. The core features in all data catalogs (metadata capture, tagging, lineage, to name a few) are skewed toward a UI-based search and discovery paradigm. Fundamentally, these capabilities support data users but offer relatively little value for data creators, which has led to two main problems with the data catalog:
It’s time for…
AWS Lambda is generally one of the easiest ways to deploy and execute code in the cloud, especially when deploying code with the sam CLI. The simplicity of serverless resource definitions coupled with the ability to package resources locally and ensure they run on AWS provides a beautiful development experience.
Except sometimes this beautiful process can turn into a beast of a deployment when the build and package steps grow to ten, fifteen or (gasp) more minutes. There are a few places where the sam deploy paradigm breaks down and starts to cause runaway deployment times:
Keeping your data secure should always be a top priority when creating, storing or using your data. One of the methods you should employ to keep your data safe is to put your databases into a private network where they cannot be reached from the public internet. Access to your databases can then be limited to a select few resources that sit within your network, carry elevated security controls, and accept inbound connections from the public internet.
Data engineers and data scientists are two peas in a pod, right? Both roles share the same purpose: to extract more value out of data. With such a clear purpose, why do challenges so often arise when these two roles come together to solve a problem?
Data scientists and data engineers have very different backgrounds that drive the fundamental ways in which they approach problem solving. The context in which they learn and grow to become either a data engineer or a data scientist is what makes each strong at their respective positions. …