Incremental Update 17

Gerd Zellweger, Head of Engineering / Co-Founder
February 12, 2025

We've just shipped Feldera v0.37, which focuses on higher throughput when backfilling pipelines with large datasets. This release brings major storage upgrades, compiler optimizations, and runtime improvements.

Storage

Our storage layer received several major improvements this week:

  • Bloom filters in our file format to reduce unnecessary disk reads.
  • File compression to lower I/O overhead (enabled by default).
  • Smarter LSM compaction, which now prioritizes the busiest LSM tree instead of cycling through the trees in round-robin fashion.
  • Configurable in-memory cache (default: 256 MiB, tunable via cache_mib).
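
To make the first bullet concrete, here is a minimal sketch of how a Bloom filter lets a reader skip a file without touching the disk. It is an illustration only, not Feldera's file format: the `BloomFilter` type, its sizes, and the use of the standard library's `DefaultHasher` are all assumptions made for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A tiny Bloom filter: `num_hashes` probes into a fixed-size bit array.
/// (Hypothetical type for illustration; not Feldera's on-disk structure.)
struct BloomFilter {
    bits: Vec<bool>,
    num_hashes: u64,
}

impl BloomFilter {
    fn new(num_bits: usize, num_hashes: u64) -> Self {
        BloomFilter { bits: vec![false; num_bits], num_hashes }
    }

    /// Bit position for `key` under the `seed`-th hash function.
    fn index<K: Hash>(&self, key: &K, seed: u64) -> usize {
        let mut hasher = DefaultHasher::new();
        seed.hash(&mut hasher);
        key.hash(&mut hasher);
        (hasher.finish() % self.bits.len() as u64) as usize
    }

    fn insert<K: Hash>(&mut self, key: &K) {
        for seed in 0..self.num_hashes {
            let i = self.index(key, seed);
            self.bits[i] = true;
        }
    }

    /// `false` means the key is definitely absent; `true` means it may be present.
    fn may_contain<K: Hash>(&self, key: &K) -> bool {
        (0..self.num_hashes).all(|seed| self.bits[self.index(key, seed)])
    }
}

fn main() {
    let mut filter = BloomFilter::new(1024, 3);
    for key in ["alice", "bob", "carol"] {
        filter.insert(&key);
    }
    // Keys that were inserted always pass the filter (no false negatives).
    assert!(filter.may_contain(&"alice"));
    // An absent key usually fails the filter, so the file read can be skipped.
    if !filter.may_contain(&"dave") {
        println!("dave: definitely not in this file, skipping disk read");
    }
}
```

The filter can report a false positive, in which case the reader still has to look in the file, but it never reports a false negative, so skipping a read on a `false` answer is always safe.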

Cache size and compression are configurable in the storage section of the pipeline settings.
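
Returning to the compaction change in the list above, the difference between round-robin and busiest-first scheduling can be sketched as follows. This is a simplified model, not the actual scheduler: the `LsmTree` struct and the use of a `pending_batches` counter as the measure of "busiest" are assumptions made for this example.

```rust
/// Hypothetical per-tree bookkeeping; `pending_batches` stands in for
/// whatever load metric the real scheduler uses.
struct LsmTree {
    name: &'static str,
    pending_batches: usize,
}

/// Round-robin: visit trees in a fixed order regardless of load.
fn pick_round_robin(trees: &[LsmTree], step: usize) -> Option<usize> {
    if trees.is_empty() {
        None
    } else {
        Some(step % trees.len())
    }
}

/// Busiest-first: pick the tree with the most pending work, if any has work.
fn pick_busiest(trees: &[LsmTree]) -> Option<usize> {
    trees
        .iter()
        .enumerate()
        .filter(|(_, t)| t.pending_batches > 0)
        .max_by_key(|(_, t)| t.pending_batches)
        .map(|(i, _)| i)
}

fn main() {
    let mut trees = vec![
        LsmTree { name: "orders", pending_batches: 9 },
        LsmTree { name: "customers", pending_batches: 1 },
        LsmTree { name: "regions", pending_batches: 0 },
    ];

    // Round-robin would spend a turn on a tree that has nothing to merge.
    if let Some(i) = pick_round_robin(&trees, 2) {
        println!("round-robin at step 2 picks {}", trees[i].name);
    }

    // Busiest-first keeps working where the backlog is largest.
    while let Some(i) = pick_busiest(&trees) {
        trees[i].pending_batches -= 1; // model merging one batch
        println!("compacting {}", trees[i].name);
    }
}
```

In this model the round-robin policy gives the idle tree a turn at step 2, while the busiest-first policy never picks it and drains the largest backlog first.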

Compiler Optimizations

The compiler now generates faster code by avoiding clones of values in many cases. It has also become better at optimizing outer joins, turning them into more efficient antijoins when possible.
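
One well-known case where an outer join can be replaced by an antijoin is a LEFT JOIN whose result is then filtered to rows with no match on the right. The sketch below shows that equivalence on in-memory data. It is an assumption that this pattern is among those the compiler handles, and the function names and the customers/orders data are made up for the example.

```rust
use std::collections::HashSet;

type Order = (u32, &'static str); // (customer_id, item)

/// LEFT JOIN customers with orders, then keep rows whose right side is NULL.
fn left_join_then_filter_null(customers: &[u32], orders: &[Order]) -> Vec<u32> {
    let mut joined: Vec<(u32, Option<&'static str>)> = Vec::new();
    for &c in customers {
        let matches: Vec<_> = orders.iter().filter(|(id, _)| *id == c).collect();
        if matches.is_empty() {
            joined.push((c, None)); // unmatched left row padded with NULL
        } else {
            for (_, item) in matches {
                joined.push((c, Some(*item)));
            }
        }
    }
    joined
        .into_iter()
        .filter(|(_, item)| item.is_none())
        .map(|(c, _)| c)
        .collect()
}

/// Antijoin: keep customers with no matching order, without building joined rows.
fn antijoin(customers: &[u32], orders: &[Order]) -> Vec<u32> {
    let with_orders: HashSet<u32> = orders.iter().map(|(id, _)| *id).collect();
    customers
        .iter()
        .copied()
        .filter(|c| !with_orders.contains(c))
        .collect()
}

fn main() {
    let customers = [1, 2, 3, 4];
    let orders = [(1, "book"), (1, "pen"), (3, "lamp")];

    let via_outer_join = left_join_then_filter_null(&customers, &orders);
    let via_antijoin = antijoin(&customers, &orders);

    // Both plans return the customers without orders.
    assert_eq!(via_outer_join, via_antijoin);
    println!("customers without orders: {:?}", via_antijoin);
}
```

The antijoin version never materializes the matched rows that the filter would throw away, which is where the savings come from.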

Runtime Optimizations

  • Strings stored as ArcStr: Faster cloning, fewer reallocations.
  • ARRAY and MAP types wrapped in Arc: Less memory overhead, better performance.
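
The effect of both changes can be seen with plain reference counting from the Rust standard library. In the sketch below, `Arc<str>` stands in for an ArcStr-like string and `Arc<Vec<i64>>` for an ARRAY value. Both are stand-ins chosen for this example, not the runtime's actual types, and `share_string` is a hypothetical helper.

```rust
use std::sync::Arc;

/// Cloning an `Arc` copies a pointer and bumps a counter; the bytes are shared.
fn share_string(s: &Arc<str>) -> Arc<str> {
    Arc::clone(s)
}

fn main() {
    // `Arc<str>` is used here as a stand-in for an ArcStr-like type.
    let original: Arc<str> = Arc::from("a fairly long string value in a row");
    let copy = share_string(&original);

    // Both handles point at the same heap allocation; only a counter changed.
    assert!(Arc::ptr_eq(&original, &copy));
    assert_eq!(Arc::strong_count(&original), 2);

    // By contrast, cloning a `String` allocates and copies the bytes.
    let owned: String = original.to_string();
    let owned_copy = owned.clone();
    assert_ne!(owned.as_ptr(), owned_copy.as_ptr());

    // The same idea applies to a collection wrapped in `Arc`.
    let array: Arc<Vec<i64>> = Arc::new(vec![1, 2, 3]);
    let array_copy = Arc::clone(&array);
    assert!(Arc::ptr_eq(&array, &array_copy));
    println!("shared array has {} owners", Arc::strong_count(&array));
}
```

Because a clone is a pointer copy, passing a large string or array through several operators no longer duplicates its contents each time.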

What’s Next?

These improvements have already delivered major performance gains for enterprise users. Try them out and let us know how they work for your pipelines. Join the discussion on Slack or Discord!
