OiO.lk Blog SQL Speeding up Snowflake Cursors for Parsing Transactional Table into Summary Rows
SQL

Speeding up Snowflake Cursors for Parsing Transactional Table into Summary Rows


I have tag data that comes in individual rows with 3 types of events: Created, Modified and Deleted.

Tag data has the action date as well as the tag_name and if there is a tag_value. Modified events are when the tag is simply changing values and not deleted. The other 2 events show creation and deleted dates.

I want each "tag" to be one row with a CreatedDT, ModifiedDT and DeletedDT. From there I can have a isActive flag based on if there is a DeletedDT. I want to accomplish this without using massive windows/partitions as this dataset can get large (100 million+). Tags CAN be created and deleted multiple times, and this is where some of the logic gets complicated since you need to match the created and deleteddt with each other for that specific version of the tag.

I was experimenting with Snowflake Cursors to iterate through the events in sequential order. This option tends to be quite slow but it does work. Takes about a minute to get through 150-200 tags which is way too slow. I am more looking for an interesting way to do this, because there has to be a fun way to do this without window functions partitioned by each player and tag.

Example transactional data:

PlayerID PROPERTY TAGNAME EVENTDATE TAGACTION LOGCODE NEWVALUE OLDVALUE
1 7 testtag1 10/25/24 14:10 added 68529384303 null
2 4 Testtag2 10/25/24 14:31 changed 51717448005 0.9 0.93
3 7 testtag3 10/25/24 14:04 added 68530352103 casino null
4 3 testtag4 10/25/24 14:02 added 156526846142 Yes null
5 4 testtag5 10/25/24 14:08 removed 51714702840 null null

Example summary data:

PLAYERID TAGNAME TAGVALUE CREATEDDT UPDATEDDT DELETEDDT ISACTIVE
1 Testtag1 test1 8/7/24 12:00 null null TRUE
2 Testtag2 test1 7/22/24 11:52 null null TRUE
3 Testtag3 test1 7/22/24 11:52 null null TRUE
4 Testtag4 test1 8/25/24 0:03 null 9/1/24 0:02 FALSE
5 Testtag5 test1 9/24/24 0:06 null 10/1/24 0:02 FALSE
6 Testtag6 test1 10/25/24 0:06 null 9/1/24 0:02 FALSE

Example destination table:

CREATE OR REPLACE TABLE Tag_Tracking (
    TagID INT AUTOINCREMENT PRIMARY KEY,
    PLAYERID STRING,
    TAGNAME STRING,
    TagValue STRING,
    CreatedDT TIMESTAMP_NTZ,
    UpdatedDT TIMESTAMP_NTZ,
    DeletedDT TIMESTAMP_NTZ,
    IsActive BOOLEAN DEFAULT TRUE
);

Cursor code I have now:

DECLARE
    v_cursor CURSOR FOR
        SELECT 
            PLAYERID, 
            TAGNAME, 
            TAGACTION, 
            NewValue, 
            EVENTDATE
        from raw_tag_changes
        where eventdate BETWEEN '2024-10-24 18:00:00' AND '2024-10-24 19:00:00'
        ORDER BY PLAYERID, TAGNAME, EVENTDATE ASC;

BEGIN
    FOR record IN v_cursor DO
        -- Assign cursor values to helper variables with explicit types
        LET v_playerid STRING := record.PLAYERID;
        LET v_tagname STRING := record.TAGNAME;
        LET v_tagaction STRING := record.TAGACTION;
        LET v_tagvalue STRING := record.NEWVALUE;
        LET v_eventdate TIMESTAMP := record.EVENTDATE;
        
        CASE
            WHEN :v_tagaction = 'added' THEN
                -- Use MERGE to insert a new row only if no active instance exists
                MERGE INTO Tag_Tracking AS tt
                USING (
                    SELECT :v_playerid AS PLAYERID, :v_tagname AS TAGNAME, :v_eventdate AS EVENTDATE
                ) AS src
                ON tt.PLAYERID = src.PLAYERID
                   AND tt.TAGNAME = src.TAGNAME
                   AND tt.CreatedDT = src.EVENTDATE
                WHEN NOT MATCHED THEN
                    INSERT (PLAYERID, TAGNAME, TagValue, CreatedDT, IsActive)
                    VALUES (:v_playerid, :v_tagname, :v_tagvalue, :v_eventdate, TRUE);

            WHEN :v_tagaction = 'changed' THEN
                -- Update the TagValue of the active instance if "changed"
                UPDATE Tag_Tracking
                SET TagValue = :v_tagvalue
                WHERE PLAYERID = :v_playerid
                  AND TAGNAME = :v_tagname
                  AND IsActive = TRUE;

            WHEN :v_tagaction = 'removed' THEN
                -- Set DeletedDT and deactivate the tag instance if "removed"
                UPDATE Tag_Tracking
                SET DeletedDT = :v_eventdate, IsActive = FALSE
                WHERE PLAYERID = :v_playerid
                  AND TAGNAME = :v_tagname
                  AND IsActive = TRUE;

        END CASE;
    END FOR;



You need to sign in to view this answers

Exit mobile version