Designing Partition Strategy for Analytical Workloads
The syllabus of the DP-203 exam is heavily skewed towards analytical workloads rather than transactional workloads. Transactional workloads create, update, or delete live production data, whereas analytical workloads extract insights from historical data, which is mostly read-only. That extraction is supported by data refresh and transformation activities that prepare the data for analysis. In this context, the term “historical” could mean data from just last month or even last week.
Note
This section primarily focuses on the “Implement a partition strategy for analytical workloads” concept of the DP-203: Data Engineering on Microsoft Azure exam.
There are three main types of partition strategies for analytical workloads. These are as follows:
- Horizontal partitioning (also known as sharding)
- Vertical...