Step 4. Update Data

So far in this tutorial, you have loaded a batch of JSON event data into a RAW_SOURCE table, queried it, and created an EVENTS table composed of event key values. In the scenario described earlier in this tutorial, the JSON data flows from multiple sources through a web endpoint to your S3 bucket. You can write a script to pull new data files into the RAW_SOURCE table. How do you then insert only new event data into the EVENTS table?

There are numerous ways to identify which JSON data in the RAW_SOURCE table is not yet present as flattened data in the EVENTS table. This section suggests two options:

Comparison of Primary Key Data

  1. Examine your JSON data for any values that are naturally unique and would be good candidates for a primary key. In our current example, let’s suppose that the combination of src:device_type and value:rv would make a good composite primary key. These two JSON keys correspond to the DEVICE_TYPE and RV columns in the EVENTS table.

    Note

    Snowflake does not enforce the primary key constraint. Rather, the constraint serves as metadata identifying the natural key in the Information Schema.

  2. Add the primary key constraint to the EVENTS table:

    ALTER TABLE events ADD CONSTRAINT pk_DeviceType PRIMARY KEY (device_type, rv);
    
  3. Insert a new JSON event record into the RAW_SOURCE table:

    insert into raw_source
      select
      PARSE_JSON ('{
        "device_type": "cell_phone",
        "events": [
          {
            "f": 79,
            "rv": "786954.67,492.68,3577.48,40.11,343.00,345.8,0.22,8765.22",
            "t": 5769784730576,
            "v": {
              "ACHZ": 75846,
              "ACV": 98355,
              "DCA": 789,
              "DCV": 62287,
              "ENJR": 2234,
              "ERRS": 578,
              "MXEC": 999,
              "TMPI": 9
            },
            "vd": 54,
            "z": 1437644222811
          }
        ],
        "version": 3.2
      }');
    
  4. Insert the new record from the RAW_SOURCE table into the EVENTS table based on a comparison of the primary key values:

    insert into events
    select
          src:device_type::string
        , src:version::string
        , value:f::number
        , value:rv::variant
        , value:t::number
        , value:v.ACHZ::number
        , value:v.ACV::number
        , value:v.DCA::number
        , value:v.DCV::number
        , value:v.ENJR::number
        , value:v.ERRS::number
        , value:v.MXEC::number
        , value:v.TMPI::number
        , value:vd::number
        , value:z::number
        from
          raw_source
        , lateral flatten( input => src:events )
        where not exists
        (select 'x'
         from events
         where events.device_type = src:device_type
         and events.rv = value:rv);
    

    Querying the EVENTS table shows the added row:

    select * from EVENTS;
    
    +-------------+---------+---------+----------------------------------------------------------------------+---------------+-------+--------+-----+-------+------+------+------+------+-----+---------------+
    | DEVICE_TYPE | VERSION |       F | RV                                                                   |             T |  ACHZ |    ACV | DCA |   DCV | ENJR | ERRS | MXEC | TMPI |  VD |             Z |
    |-------------+---------+---------+----------------------------------------------------------------------+---------------+-------+--------+-----+-------+------+------+------+------+-----+---------------|
    | server      | 2.6     |      83 | "15219.64,783.63,48674.48,84679.52,27499.78,2178.83,0.42,74900.19"   | 1437560931139 | 42869 | 709489 | 232 | 62287 | 2599 |  205 |  487 |    9 |  54 | 1437644222811 |
    | server      | 2.6     | 1000083 | "8070.52,54470.71,85331.27,9.10,70825.85,65191.82,46564.53,29422.22" | 1437036965027 |  6953 | 346795 | 250 | 46066 | 9033 |  615 |    0 |  112 | 626 | 1437660796958 |
    | cell_phone  | 3.2     |      79 | "786954.67,492.68,3577.48,40.11,343.00,345.8,0.22,8765.22"           | 5769784730576 | 75846 |  98355 | 789 | 62287 | 2234 |  578 |  999 |    9 |  54 | 1437644222811 |
    +-------------+---------+---------+----------------------------------------------------------------------+---------------+-------+--------+-----+-------+------+------+------+------+-----+---------------+
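
Because Snowflake does not enforce the primary key constraint, it can be worth confirming that the chosen columns really are unique in the data loaded so far. The following is a minimal sketch, assuming the EVENTS table and the (DEVICE_TYPE, RV) composite key from this section. The rv_text and row_count aliases are illustrative only, and RV is cast to a string purely for grouping:

```sql
-- List any (device_type, rv) pairs that occur more than once in EVENTS.
-- An empty result means the composite key is unique for the rows loaded so far.
select
      device_type
    , rv::string as rv_text
    , count(*) as row_count
    from events
    group by device_type, rv::string
    having count(*) > 1;
```

If this query returns rows, the comparison in step 4 can still run, but it may treat distinct events that share the same key values as already loaded.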
    

Comparison of All Keys/Columns

If your JSON data doesn’t include any key values that are natural primary key candidates, you could compare all repeating JSON keys in the RAW_SOURCE table with the corresponding column values in the EVENTS table.

No changes to your existing EVENTS table are required.

  1. Insert a new JSON event record into the RAW_SOURCE table:

    insert into raw_source
      select
      parse_json ('{
        "device_type": "web_browser",
        "events": [
          {
            "f": 79,
            "rv": "122375.99,744.89,386.99,12.45,78.08,43.7,9.22,8765.43",
            "t": 5769784730576,
            "v": {
              "ACHZ": 768436,
              "ACV": 9475,
              "DCA": 94835,
              "DCV": 88845,
              "ENJR": 8754,
              "ERRS": 567,
              "MXEC": 823,
              "TMPI": 0
            },
            "vd": 55,
            "z": 8745598047355
          }
        ],
        "version": 8.7
      }');
    
  2. Insert the new record from the RAW_SOURCE table into the EVENTS table based on a comparison of all repeating key values:

    insert into events
    select
          src:device_type::string
        , src:version::string
        , value:f::number
        , value:rv::variant
        , value:t::number
        , value:v.ACHZ::number
        , value:v.ACV::number
        , value:v.DCA::number
        , value:v.DCV::number
        , value:v.ENJR::number
        , value:v.ERRS::number
        , value:v.MXEC::number
        , value:v.TMPI::number
        , value:vd::number
        , value:z::number
        from
          raw_source
        , lateral flatten( input => src:events )
        where not exists
        (select 'x'
         from events
         where events.device_type = src:device_type
         and events.version = src:version
         and events.f = value:f
         and events.rv = value:rv
         and events.t = value:t
         and events.achz = value:v.ACHZ
         and events.acv = value:v.ACV
         and events.dca = value:v.DCA
         and events.dcv = value:v.DCV
         and events.enjr = value:v.ENJR
         and events.errs = value:v.ERRS
         and events.mxec = value:v.MXEC
         and events.tmpi = value:v.TMPI
         and events.vd = value:vd
         and events.z = value:z);
    

    Querying the EVENTS table shows the added row:

    select * from EVENTS;
    
    +-------------+---------+---------+----------------------------------------------------------------------+---------------+--------+--------+-------+-------+------+------+------+------+-----+---------------+
    | DEVICE_TYPE | VERSION |       F | RV                                                                   |             T |   ACHZ |    ACV |   DCA |   DCV | ENJR | ERRS | MXEC | TMPI |  VD |             Z |
    |-------------+---------+---------+----------------------------------------------------------------------+---------------+--------+--------+-------+-------+------+------+------+------+-----+---------------|
    | server      | 2.6     |      83 | "15219.64,783.63,48674.48,84679.52,27499.78,2178.83,0.42,74900.19"   | 1437560931139 |  42869 | 709489 |   232 | 62287 | 2599 |  205 |  487 |    9 |  54 | 1437644222811 |
    | server      | 2.6     | 1000083 | "8070.52,54470.71,85331.27,9.10,70825.85,65191.82,46564.53,29422.22" | 1437036965027 |   6953 | 346795 |   250 | 46066 | 9033 |  615 |    0 |  112 | 626 | 1437660796958 |
    | cell_phone  | 3.2     |      79 | "786954.67,492.68,3577.48,40.11,343.00,345.8,0.22,8765.22"           | 5769784730576 |  75846 |  98355 |   789 | 62287 | 2234 |  578 |  999 |    9 |  54 | 1437644222811 |
    | web_browser | 8.7     |      79 | "122375.99,744.89,386.99,12.45,78.08,43.7,9.22,8765.43"              | 5769784730576 | 768436 |   9475 | 94835 | 88845 | 8754 |  567 |  823 |    0 |  55 | 8745598047355 |
    +-------------+---------+---------+----------------------------------------------------------------------+---------------+--------+--------+-------+-------+------+------+------+------+-----+---------------+
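
One caveat when comparing every column with =: if a key is missing from an event, the corresponding EVENTS column is NULL, and a comparison against NULL never evaluates to true, so that event would be inserted again on each run. The following is a hedged sketch of a NULL-safe variant that uses Snowflake's EQUAL_NULL function. It assumes the same RAW_SOURCE and EVENTS tables and column types used in this tutorial, and casts RV to a string for the comparison. It is an alternative formulation, not part of the original steps:

```sql
-- NULL-safe version of the all-columns comparison (a sketch).
-- equal_null(a, b) is TRUE when a = b or when both a and b are NULL.
insert into events
select
      src:device_type::string
    , src:version::string
    , value:f::number
    , value:rv::variant
    , value:t::number
    , value:v.ACHZ::number
    , value:v.ACV::number
    , value:v.DCA::number
    , value:v.DCV::number
    , value:v.ENJR::number
    , value:v.ERRS::number
    , value:v.MXEC::number
    , value:v.TMPI::number
    , value:vd::number
    , value:z::number
    from
      raw_source
    , lateral flatten( input => src:events )
    where not exists
    (select 'x'
     from events
     where equal_null(events.device_type, src:device_type::string)
     and equal_null(events.version, src:version::string)
     and equal_null(events.f, value:f::number)
     and equal_null(events.rv::string, value:rv::string)
     and equal_null(events.t, value:t::number)
     and equal_null(events.achz, value:v.ACHZ::number)
     and equal_null(events.acv, value:v.ACV::number)
     and equal_null(events.dca, value:v.DCA::number)
     and equal_null(events.dcv, value:v.DCV::number)
     and equal_null(events.enjr, value:v.ENJR::number)
     and equal_null(events.errs, value:v.ERRS::number)
     and equal_null(events.mxec, value:v.MXEC::number)
     and equal_null(events.tmpi, value:v.TMPI::number)
     and equal_null(events.vd, value:vd::number)
     and equal_null(events.z, value:z::number));
```

When no column is NULL, this statement selects the same rows as the = comparison in step 2.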
    
