I keep running into an issue where Dataform highlights an error pertaining to the use of "MERGE" within the workspace:
config {
  type: "incremental",
  database: '**********',
  schema: 'test_incremntal',
  name: 'incremental'
}

-- Step 1: Aggregate data from the source to get the latest date for each content_id
WITH latest_data AS (
  SELECT
    content_id,
    MAX(date) AS latest_date -- Get the latest date for each content_id
  FROM `**********.test_incremntal.source_incremental`
  GROUP BY content_id
),
-- Join the aggregated data with the source to get the full row for the latest date
latest_full_data AS (
  SELECT
    src.content_id,
    src.date AS update_date,
    src.content_name
  FROM `**********.test_incremntal.source_incremental` src
  JOIN latest_data ld
    ON src.content_id = ld.content_id AND src.date = ld.latest_date
)
-- Step 2: Perform the merge operation
MERGE INTO `**********.test_incremntal.incremental` AS tgt
USING latest_full_data AS src
ON tgt.content_id = src.content_id
WHEN MATCHED THEN
  UPDATE SET
    tgt.update_date = src.update_date,
    tgt.content_name = src.content_name
WHEN NOT MATCHED THEN
  INSERT (content_id, update_date, content_name)
  VALUES (src.content_id, src.update_date, src.content_name)
The whole idea is to update the info in my target table whenever the script is run: for each content_id, it checks for the latest date and keeps that row.

Example:
Source Table:
content_id | date       | content_name
123        | 2024-09-18 | batman returns
456        | 2024-09-18 | spider-man
789        | 2024-09-18 | game of thrones
123        | 2023-12-13 | batman
123        | 2022-06-16 | bat-man
752        | 2024-09-14 | ghost-busters
Target Table:
content_id | date       | content_name
123        | 2024-09-18 | batman returns
456        | 2024-09-18 | spider-man
789        | 2024-09-18 | game of thrones
752        | 2024-09-14 | ghost-busters
If the source table then gets new rows:
Source Table:
content_id | date       | content_name
123        | 2024-09-18 | batman returns
456        | 2024-09-18 | spider-man
789        | 2024-09-18 | game of thrones
123        | 2023-12-13 | batman
123        | 2022-06-16 | bat-man
752        | 2024-09-14 | ghost-busters
123        | 2024-09-19 | batman: The scary cut
456        | 2024-09-19 | spider-man: No way here
789        | 2024-09-19 | game of thrones: Bald Eagle
752        | 2024-09-19 | ghost-busters: alien in the house
Expected Target Table:
content_id | date       | content_name
123        | 2024-09-19 | batman: The scary cut
456        | 2024-09-19 | spider-man: No way here
789        | 2024-09-19 | game of thrones: Bald Eagle
752        | 2024-09-19 | ghost-busters: alien in the house
In other words, the target keeps one row per content_id: a content_id that doesn't exist yet is added, and one that already exists is simply updated with the latest row.
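
For reference, here is a minimal sketch of how this upsert might be written without a hand-written MERGE. It rests on two assumptions: that an incremental SQLX file is expected to contain only a SELECT, with Dataform generating the surrounding statement itself, and that setting uniqueKey in the config block makes Dataform merge on that key in BigQuery. The CTEs are the same as in the script above. The uniqueKey line and the final SELECT are the only additions, and this has not been run against this project.

```sql
config {
  type: "incremental",
  database: '**********',
  schema: 'test_incremntal',
  name: 'incremental',
  -- Assumption: with uniqueKey set, Dataform updates rows whose content_id
  -- already exists in the target and inserts the rest.
  uniqueKey: ["content_id"]
}

-- Latest date per content_id (same as Step 1 above)
WITH latest_data AS (
  SELECT
    content_id,
    MAX(date) AS latest_date
  FROM `**********.test_incremntal.source_incremental`
  GROUP BY content_id
),
-- Full row for that latest date
latest_full_data AS (
  SELECT
    src.content_id,
    src.date AS update_date,
    src.content_name
  FROM `**********.test_incremntal.source_incremental` src
  JOIN latest_data ld
    ON src.content_id = ld.content_id AND src.date = ld.latest_date
)
-- No MERGE here: the file ends in a plain SELECT
SELECT
  content_id,
  update_date,
  content_name
FROM latest_full_data
```

With the second source table above, this SELECT would return the four 2024-09-19 rows, which match the four existing content_id values in the target, so they would be updates rather than inserts.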