Incremental Update 4 at Feldera

Incremental Update 4 at Feldera

| August 14, 2024

We released v0.24 today. This release came earlier than originally planned, we couldn't wait to share some of the exciting new features with you.

The highlight of this release is the introduction of As-Of Joins—a powerful feature for those working with time-series data.

An As-Of Join is a type of temporal join that allows you to match records from two tables based on the closest preceding timestamp. This is incredibly useful in scenarios where you need to retrieve the most recent event in one table that occurred before a specific time in another table. If that explanation was a bit complicated, fear not. Here is a simple example.

-- This snippet has two tables and one view:
--
-- * `Stocks` Table: Each row contains a price for a stock at a given time. 
--   A new price for one of the stocks is inserted every second.
--   We configure the datagen connector so that `price_time` is monotonically 
--   increasing (default setting)
-- * `Orders` Table: Each row represents an order for a stock. A new order is inserted every second.
--   We configure the timestamp in `order_time` to be uniformly distributed to show 
--   the workings of the As-Of join.
-- * `Order_With_Stock_Price` view: Matches each order with the most recent stock price based on the
--   order’s timestamp. The As-Of Join finds the latest `price_time` in the Stocks table that is
--   less than or equal to the `order_time`.

CREATE TABLE Stocks (
    symbol VARCHAR NOT NULL,
    price_time BIGINT NOT NULL,  -- UNIX timestamp
    price DOUBLE NOT NULL
) with (
  'connectors' = '[{
    "transport": {
      "name": "datagen",
      "config": {
        "plan": [{ 
            "limit": 1000,
            "rate": 1,
            "fields": {
                "symbol": { "values": ["AAPL", "GOOGL", "SPY", "NVDA"] },
                "price": { "strategy": "uniform", "range": [100, 10000] }
            }
        }]
      }
    }
  }]'
);

CREATE TABLE Orders (
    order_id BIGINT NOT NULL,
    order_time BIGINT NOT NULL,  -- UNIX timestamp
    symbol VARCHAR NOT NULL,
    quantity INT NOT NULL
) with (
  'connectors' = '[{
    "transport": {
      "name": "datagen",
      "config": {
        "plan": [{ 
            "limit": 1000,
            "rate": 1,
            "fields": {
                "order_time": { "strategy": "uniform", "range": [0, 1000] },
                "symbol": { "values": ["AAPL", "GOOGL", "SPY", "NVDA"] },
                "quantity": { "strategy": "zipf", "range": [1, 10000] }
            }
        }]
      }
    }
  }]'
);

CREATE VIEW Order_With_Stock_Price AS (
SELECT
    orders.order_id,
    orders.order_time,
    orders.symbol,
    orders.quantity,
    stocks.price,
    TIMESTAMPADD(SECOND, orders.order_time, TIMESTAMP '1970-01-01') as order_timestamp
FROM
    Orders
LEFT ASOF JOIN Stocks
    MATCH_CONDITION(Stocks.price_time <= Orders.order_time)
    ON Stocks.symbol = Orders.symbol
);

Here is what the As-Of Join from above would produce when inspecting the change stream in the Feldera WebConsole:

Changestream for ASOF join example

Want to try it yourself? That's great, because we also added a new feature that makes it easier to create and share pipelines. With this update, you can quickly run sample code in our sandbox environment. Click here to run this code in our sandbox with the code pre-filled.

We'll have more blog posts on As-Of Joins soon! These joins are incredibly powerful, especially when it comes to feature engineering pipelines for machine learning. There we'll dive deeper into how you can leverage them in your projects.

Other articles you may like

Incremental Update 6 at Feldera

We’re excited to announce the release of v0.26, which represents a significant step forward for Feldera. This release includes over 200 commits, adding 25,000 lines of new code and documentation. Let's dive into the highlights!

Database computations on Z-sets

How can Z-sets be used to implement database computations

Incremental Update 5 at Feldera

A quick overview of what's new in v0.25.