Incremental Update 17 at Feldera

We've just shipped Feldera v0.37, which focuses on higher throughput when backfilling pipelines with large datasets. This release brings major storage upgrades, compiler optimizations, and runtime improvements.

Storage

We've shipped some major improvements in our storage layer this week, including:

Bloom filters in our file format to reduce unnecessary disk reads.
File compression to lower I/O overhead (enabled by default).
Smarter LSM compaction, which now prioritizes the busiest LSM tree instead of cycling through the trees in round-robin order.
Configurable in-memory cache (default: 256 MiB, tunable via cache_mib).
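To illustrate why Bloom filters cut down on disk reads, here is a minimal sketch in Python. It is not Feldera's file format or implementation: the `BloomFilter` and `StorageFile` classes, the filter size, and the hashing scheme are all assumptions chosen for illustration. The point is only that a negative answer from the filter is always correct, so the reader can skip the file without touching the disk.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive several bit positions by salting one hash function with an index.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(key, digest_size=8, salt=bytes([i])).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


class StorageFile:
    """Hypothetical stand-in for an on-disk file with an attached Bloom filter."""

    def __init__(self, rows):
        self.rows = dict(rows)  # pretend this lives on disk
        self.filter = BloomFilter()
        for key in self.rows:
            self.filter.add(key)
        self.disk_reads = 0  # counts simulated disk accesses

    def get(self, key):
        if not self.filter.might_contain(key):
            return None  # definitely absent: no disk read needed
        self.disk_reads += 1  # possible hit (or false positive): read the file
        return self.rows.get(key)
```

In this sketch, looking up a key that was never written usually returns `None` without incrementing `disk_reads`; only keys that are present, plus the occasional false positive, pay for a read.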

Cache size and compression are configurable in the storage section of the pipeline settings.
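The compaction change listed above can be pictured with a toy scheduler. The sketch below is an assumption-laden illustration, not Feldera's scheduler: it takes "busiest" to mean "most batches waiting to be merged", and `LsmTree`, `compact_once`, and both policy functions are hypothetical names. It compares a round-robin policy with one that always works on the tree with the largest backlog.

```python
from dataclasses import dataclass


@dataclass
class LsmTree:
    name: str
    pending_batches: int  # assumed busyness metric: batches waiting to be merged


def compact_once(tree):
    """Stand-in for one unit of merge work: fold two batches into one."""
    if tree.pending_batches > 1:
        tree.pending_batches -= 1


def round_robin(trees, last_index):
    """Visit trees in a fixed order, regardless of backlog."""
    return (last_index + 1) % len(trees)


def busiest(trees, last_index):
    """Pick the tree with the largest backlog (ties go to the first one)."""
    return max(range(len(trees)), key=lambda i: trees[i].pending_batches)


def run(trees, steps, policy):
    """Spend `steps` units of compaction work and return the remaining backlogs."""
    last = -1
    for _ in range(steps):
        last = policy(trees, last)
        compact_once(trees[last])
    return [t.pending_batches for t in trees]
```

With backlogs of 2, 9, and 3 batches and six units of work, the round-robin policy in this sketch leaves the hot tree with 7 pending batches, while the busiest-first policy brings it down to 3. Round-robin wastes a turn on a tree that has nothing left to merge.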

Compiler Optimizations

The compiler now generates faster code by avoiding cloning values in many cases. It has also become better at optimizing outer joins, turning them into more efficient antijoins when possible.
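One common case where an outer join can be replaced by an antijoin is a left join that keeps only the rows with no match on the right side. The sketch below shows that equivalence on plain Python lists. It is an illustration of the general rewrite, not a description of the exact rules the Feldera compiler applies, and the function names are hypothetical.

```python
def left_outer_join(left, right):
    """Join (key, value) rows; unmatched left rows get None for the right side."""
    out = []
    for lk, lv in left:
        matches = [(rk, rv) for rk, rv in right if rk == lk]
        if matches:
            out.extend((lk, lv, rk, rv) for rk, rv in matches)
        else:
            out.append((lk, lv, None, None))
    return out


def unmatched_via_outer_join(left, right):
    """LEFT JOIN ... WHERE right.key IS NULL: builds all matches, then discards them."""
    return [
        (lk, lv) for lk, lv, rk, _ in left_outer_join(left, right) if rk is None
    ]


def antijoin(left, right):
    """Same result, computed directly: keep left rows whose key is absent on the right."""
    right_keys = {rk for rk, _ in right}
    return [(lk, lv) for lk, lv in left if lk not in right_keys]
```

Both functions return the same rows, but the antijoin never materializes the matched pairs, which is where the savings come from when the right side has many matches per key.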

Runtime Optimizations

Strings stored as ArcStr: Faster cloning, fewer reallocations.
ARRAY and MAP types wrapped in Arc: Less memory overhead, better performance.

What’s Next?

These improvements have already delivered major performance gains for enterprise users. Try them out and let us know how they work for your pipelines. Join the discussion on Slack or Discord!
