MDEV-39834: Add TidesDB 9 storage engine #5166
The libtidesdb C library is vendored under storage/tidesdb/libtidesdb/ and compiled into a static archive that is statically linked into the loadable plugin module. Mirroring storage/rocksdb, the engine is self-contained and needs no system-installed libtidesdb. include/providers is removed from the engine's include dirs so the bundled zstd/lz4/snappy link directly instead of routing through MariaDB's compression-provider plugins.
For cross-version portability, spatial- and fulltext-index detection uses the HA_SPATIAL / HA_FULLTEXT key flags (with a HA_SPATIAL/HA_FULLTEXT -> *_legacy rename shim) rather than KEY::algorithm, which older servers leave unset; my_global.h is included before handler.h so the server typedefs it needs are defined.
Adds the tidesdb mysql-test suite covering CRUD, MVCC and pessimistic locking, transactions/savepoints, isolation levels, compression, encryption, TTL, full-text, spatial, partitioning, object-store offload and crash/durability. The vector test auto-skips where the VECTOR type is unavailable (before 11.7).
The build and the full mysql-test suite are green on MariaDB 11.4 (LTS) through 13.0.2.
Code Review
This pull request integrates the TidesDB storage engine into MariaDB, adding the core handler implementation in ha_tidesdb.h, build configuration in CMakeLists.txt, vendored libtidesdb source files, and an extensive test suite. The reviewer provided several constructive suggestions, including moving the public header copy operation in CMakeLists.txt from configure time to build time to prevent stale headers during incremental builds. Additionally, the reviewer recommended using C++11 in-class member initializers for various member variables in ha_tidesdb.h to avoid uninitialized memory issues, and explicitly marking the ha_tidesdb destructor with override.
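The two header suggestions in the review summary can be illustrated with a small sketch. It is not taken from ha_tidesdb.h: `handler_base`, `share_sketch` and `ha_sketch` are hypothetical stand-ins, and the member types are assumptions; only the member names encrypted, encryption_key_id and encryption_key_version come from this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the server's handler base class.
struct handler_base {
  virtual ~handler_base() = default;
};

// Illustrative share object. Member names follow the PR text; the types are
// assumptions. In-class initializers give every member a defined value no
// matter which constructor runs.
struct share_sketch {
  bool encrypted = false;
  std::uint32_t encryption_key_id = 0;
  std::uint32_t encryption_key_version = 0;
};

// Illustrative handler. `override` on the destructor makes the compiler
// verify that the base class destructor is virtual.
struct ha_sketch : handler_base {
  share_sketch *share = nullptr;
  bool scan_active = false;
  ~ha_sketch() override {}
};

// Sanity check: a default-constructed share has defined, zeroed members.
inline void check_share_defaults() {
  share_sketch s;
  assert(!s.encrypted);
  assert(s.encryption_key_id == 0 && s.encryption_key_version == 0);
}
```

Both features are available from C++11, so they fit the engine's stated language floor.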
|
Base (11.4) >> 11.5 >> 12.0 >> … >> 12.x >> 13.0 >> main. Can propagate forward to every newer branch automatically. |
|
Full TideSQL reference: https://tidesdb.com/reference/tidesql/ |
|
I see. Having a look, as I'm going to guess we need to pass that wf. |
|
I am going to assume C++11 is the baseline; 13 seems to be C++17. I'll keep the engine at C++11, as the library aligns with C11. |
Build fixes for MariaDB 11.4-11.7 (C++11; buildbot builds with -Werror):
- Replace C++17 structured bindings with C++11 iterator access in the FTS write_row/update_row/delete_row and ft_init_ext paths.
- Guard TABLE::part_info with WITH_PARTITION_STORAGE_ENGINE, since that member is only defined when partitioning is compiled in.
Initialization hardening (PR review feedback):
- Value-initialize TidesDB_share encryption members (encrypted, encryption_key_id, encryption_key_version) in the constructor.
- Mark ~ha_tidesdb() override.
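As a rough illustration of the first build fix, the sketch below shows the same loop written with C++11 iterator access instead of a C++17 structured binding. The `term_counts` map and `total_frequency` function are hypothetical; the actual containers used in the FTS paths are not shown in this thread.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a per-row term -> frequency map in an FTS path.
typedef std::map<std::string, unsigned> term_counts;

// C++17 form, which a C++11 build rejects:
//   for (const auto &[term, freq] : counts) total += freq;

// C++11 form: iterate explicitly and read the pair through the iterator.
inline unsigned total_frequency(const term_counts &counts) {
  unsigned total = 0;
  for (term_counts::const_iterator it = counts.begin(); it != counts.end();
       ++it) {
    const std::string &term = it->first;  // what the binding named `term`
    const unsigned freq = it->second;     // what the binding named `freq`
    assert(!term.empty());
    total += freq;
  }
  return total;
}
```

The rewrite is mechanical: each name introduced by the binding becomes a local reference to `it->first` or `it->second`.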
|
Something to have a look at: https://buildbot.mariadb.org/#/builders/554/builds/22266 and https://buildbot.mariadb.org/#/builders/554/builds/22266/steps/3/logs/stdio. Not sure if this is a CI issue or something I need to address at the plugin level. |
|
Caught it, had to create mariadb-plugin-tidesdb.install |
|
https://buildbot.mariadb.org/#/builders/534/builds/39281/steps/6/logs/stdio The buildbot/amd64-ubuntu-2204-debug-ps failure does not seem to be due to the TidesDB implementation. v9.3.3 of TidesDB and v4.5.4 of the TideSQL plugin are implemented. Ready for review. |
Right, there is an open ticket MDEV-38195 about failures of the test |
gkodinov
left a comment
Thank you for your contribution! This is a preliminary review.
I'm approving it as it is, as there're no major violations. Or, if there are any, they're in the competency of the final reviewer. Nevertheless, I'll mention what I see that can be improved.
First of all: please consider merging some of the commits. Ideally there should be one commit per distinct "feature". I'd guess that makes 1 in your case.
Next: this technically is a new feature. As such, I'd suggest considering rebasing to the main branch. Unless, of course, there is some special consideration why the target should be as is.
Finally, there should be a Jira for house-keeping. I'll start one for you, using the PR explanation. But this needs to be improved significantly I'd guess, until it evolves into a proper design specification that would help with the final review.
There's also one small thing I've noticed below.
Please stand by for the final review.
I do not think you need this file.
Thank you @gkodinov, understood for 1. For 2, the aim is to be able to propagate forward to newer branches (selectively or not) and to main, with 11.4 as the floor of support. For 3, understood; I appreciate the help with that and I can work off it from there.
You're right it's a fossil.
Have you tried just placing it on top of the main branch and recompiling? You're not changing anything in the server itself, so it might work just as well. You might need to alter some of the calls back to the server, but I do not think it has changed that much. Then it's a matter of running the regression tests successfully (and debugging any failures).
I (and the final reviewer) can guide you through that.
All conditionals are handled from 11.4 to 13.0.2, though verified sparingly; there is a full list. Main sounds OK; if the consensus is main, we can do that.
|
AFAIK 11.4 is closed for new features; why 11.4 and not main? Especially taking into account the required dependencies zstd, lz4, and snappy. |
|
target_compile_options(tidesdb_embedded PRIVATE -w) Why are all warnings suppressed in such a large body of code? (We have enough problems with warnings in existing code; why add more?) |
Well, I originally thought that getting it into a lower branch and merging upwards for visibility was the goal, per discussion with the foundation, which I've brought up a few times. If the goal is newer versions only, you can tell me and I will correct accordingly. TideSQL was written for 11.4 onwards, up to 13.0.2 in testing, hitting many granular versions. We have a list if you require it. The versions I've supported rely on a lot of conditionals because things move around quite a bit, as you probably know. I'm certain we want the same thing: a result where users and everyone else are happy. Cheers |
Can you explain further? We build and develop with all sanitizers and debug flags enabled, running on 16 different platforms, x86 and x64, in continuous integration. If you have a moment, please look at the TidesDB and TideSQL main repositories. If you have a concern specifically with this PR's build, let me know how you'd prefer it, because I am not quite following. I highly doubt the plugin itself has any warnings, as we would catch them in engine and plugin testing and continuous integration, but I can't remember right now why it's set the way it is. We can modify the compile option for inclusion; is this what you're pointing out? Thank you kindly |
|
Hm, so @sanja-byelkin, do you prefer that vendored third-party code not ignore compiler warnings? I can do Which is review friendly I believe. Do let me know, I shall make the change ASAP. Thank you for your review. Cheers. |
|
Upstream, on the TidesDB library, I am doing a scrub based on what you've asked for to ensure all is well. The flag you brought up is known to have false positives, so I will take my time and be diligent there. |
libtidesdb uses C11 stdatomic.h, which MSVC only enables under /experimental:c11atomics; without it the Windows build fails with C1189 "C atomic support is not enabled". Add an MSVC branch to the tidesdb_embedded target setting /std:c17 /experimental:c11atomics plus the same targeted /wd warning disables the library uses upstream (the Windows build compiles with /WX). The GCC and Clang warning handling is unchanged.
|
Reviewing buildbot, there is an MSVC pthreads issue; we handle this differently at the library level. Investigating whether another patch is required at the library level to fit MariaDB. Edit: yeah, I'm going to have to adjust the compat layer a bit to comply. Working on that now. |
…ds backend Re-vendor the libtidesdb 9.3.6 sources. 9.3.6 adds a native Win32 threading backend in compat.h (SRWLOCK mutexes and rwlocks, CONDITION_VARIABLE condition variables, _beginthreadex threads, and fiber-local storage so thread-local destructors still run), so on MSVC it no longer includes pthread.h and needs no pthreads-win32 library. This fixes the MariaDB MSVC build, which has no pthreads-win32 and was failing with C1083 "Cannot open include file pthread.h". POSIX, macOS and MinGW keep using pthreads as before. Only compat.h, tidesdb.c and tidesdb.h change from 9.3.5; the C11 atomics flag added earlier (/experimental:c11atomics) is still required on MSVC.
decode_varint returns negative without writing its output when max_bytes is zero, which happens on a truncated or corrupt block. Several indexed read and iterator decode sites used the result without checking the return, so a handful of locals (ks vs vo seq_val vlog_off) could be read uninitialized. Zero-initialize those locals so the failure path yields a defined zero rather than garbage. Also add a TIDESDB_WARN_MAYBE_UNINIT option, GCC only and off by default, that turns the warning on for an optimized dev build where the analysis actually runs. Analyzed for the MariaDB engine addition review, MariaDB/server#5166.
Correct an iterator scan miss under reaper eviction and stop the cross cf atomicity test flaking on commit backpressure.
iter_new pinned the level array once, then skipped any sstable it could not immediately ref, treating it as dead when in fact the reaper had it in a transient evicting window with the descriptor still live in the level. A reader iterator snapshots only once, so that skip dropped a live sstable for the whole life of the scan, and a key in it read back as not found. Now we spin on try_ref while the level array is unchanged and only bail to retry when the array actually changed under us, the same shape as the point get reaper evict skip we already corrected.
The cross cf atomicity test counted a transient busy commit as a hard error, which tripped the errors assert on slow, loaded CI boxes. Commit through tdb_test_commit_with_retry like the sibling writers so a backpressure stall retries instead of failing the run.
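The decode_varint failure mode described above can be sketched as follows. This is an illustration only: `decode_varint_sketch` assumes a LEB128-style encoding, which may not match libtidesdb's real format, and `read_seq_or_zero` is a hypothetical caller. The sketch also checks the return value, which goes one step beyond the zero-initialization the commit describes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed LEB128-style decoder. Returns the number of bytes consumed, or -1
// on failure. On failure *out is left untouched, which is the hazard: a
// caller that ignores the return value reads whatever *out held before.
inline int decode_varint_sketch(const std::uint8_t *buf, std::size_t max_bytes,
                                std::uint64_t *out) {
  assert(out != nullptr);
  std::uint64_t value = 0;
  for (std::size_t i = 0; i < max_bytes && i < 10; ++i) {
    value |= static_cast<std::uint64_t>(buf[i] & 0x7F) << (7 * i);
    if ((buf[i] & 0x80) == 0) {  // no continuation bit: value is complete
      *out = value;
      return static_cast<int>(i + 1);
    }
  }
  return -1;  // truncated input, including max_bytes == 0
}

// Caller pattern: the local is zero-initialized, so even a failed decode
// leaves it holding a defined value.
inline std::uint64_t read_seq_or_zero(const std::uint8_t *buf,
                                      std::size_t len) {
  std::uint64_t seq_val = 0;
  if (decode_varint_sketch(buf, len, &seq_val) < 0) return 0;
  return seq_val;
}
```

With the local initialized, a corrupt or truncated block produces a predictable zero instead of stack garbage, which is easier to detect and reason about downstream.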
Re-vendor the current libtidesdb sources, picking up the latest fixes made upstream to pass the Windows build. Changes are confined to btree.c, compat.h and tidesdb.c; the plugin and the test suite are unchanged.
Mirror libtidesdb's own MSVC flags by adding /wd4267 next to the existing /wd4244 narrowing-warning disable on the tidesdb_embedded target. Under the MariaDB Windows build's /WX, the library's size_t to int narrowings (e.g. "const int total = queue_size(...)") were promoted to C2220 errors. The library accepts this warning class (it already disables the sibling C4244), so disable C4267 as well; the library handles explicit checks where applicable. GCC, Clang and the Linux build are unaffected.
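For context, an explicit check around a size_t to int narrowing might look like the sketch below. `narrow_to_int` is a hypothetical helper written for illustration, not a libtidesdb function.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical helper: narrow a size_t to int only when the value fits.
// Returns false and leaves *out untouched when it does not.
inline bool narrow_to_int(std::size_t n, int *out) {
  assert(out != nullptr);
  if (n > static_cast<std::size_t>(INT_MAX)) return false;
  *out = static_cast<int>(n);  // explicit cast after the range check
  return true;
}
```

An implicit `int total = some_size_t_value;` is the pattern that MSVC reports as C4267; an explicit cast guarded by a range check states the intent and keeps the conversion value-preserving.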
…fset Field::val_int_offset() takes a uint row offset, so cast the record-buffer offset to uint instead of my_ptrdiff_t. Under the MariaDB Windows build's /WX the my_ptrdiff_t to uint conversion was promoted to a C2220 error. The offset is record[1] - record[0] (rec_buff_length), which always fits in uint, so the cast is value-preserving. GCC and the Linux build are unaffected.
|
I just tried to build it on my system (Fedora 44).
Compilation options were
|
Working on that. The OpenSSL dependency was removed from the library to play nicely; we were only using it in 2 methods for object store mode. We ended up writing the SHA256 and HMAC algorithms ourselves, as they were small enough. The next commit will pass.
Understood, and no, it's not default; it's just available, similar to RocksDB. |
|
We discussed the version issue. The final decision has not been made yet. My opinion is that it is OK to include it in 11.4 but not OK to build it by default, because that would disturb distributions and users who build from source. A stable version should be stable. |
|
For clarity, by "default" I mean built by default without enabling it through cmake options. |
|
No problem, thank you @sanja-byelkin |
|
@sanja-byelkin currently the plugin is built but not loaded by default. Reading the code, I'll add DISABLED to the MYSQL_ADD_PLUGIN call; actually building the plugin will then require enabling it explicitly. |
|
Before I do this though, will this affect testing TidesDB builds and MTR? Do let me know how you'd like to approach this, if possible, so I can line the next commit up properly. |
|
About tests: it would be much better if the tests were placed under storage/tidesdb (see rocksdb, columnstore or connect for examples). The regression test suite looks in the mysql-test directories of the storage engines. The old tests are in the main mysql-test directory mostly for historical reasons. Another general observation (I have not looked deeply yet): I see stress tests (at least by name) mixed with regression tests. It is better to divide them into different suites, because one should always be run while the other (again, at least by name) is something time consuming. Also, just to note, for really heavy, long-running tests (not stress tests, just something huge) we have the --big option, and it can be checked inside such tests to avoid running them each time. It is not for stress testing, just for really big tests; I mention it here to give a full impression of our test suite's abilities. |
No problem at all, I will move them out of the historical location and into the storage directory.
The 300+ tests are all short and they're all primarily integration tests. I wouldn't consider them real "stress" tests per se; we test isolation under concurrent load and autocommit stress, but the stress is mostly on concurrency. Naming can be refactored; I'll go through and ensure the names describe the running tests a bit better. No problem for me. |
|
tidesdb_object_store.test should somehow check that the host has S3 set up. (I do not have S3 set up on my computer but still want to run the test suite without S3; the same goes for the buildbot builders, at least some of them.) It would be nice to detect the absence of S3 by asking the server/plugin or, in the worst case, through an environment variable. For inspiration you can check mysql-test/include/have_* |
In TidesDB, object store mode can be used with the FS by default, so the test exercises that path. But yes, variables for, say, minio, rust-fs or s3 could be valuable to configure for that specifically. |
|
An update about versions: we are just now accepting DuckDB. It will set a precedent and we will simply follow it with TideDB. It should not take more than several days, and we have things to fix during this time :) |
|
Tide**s**DB, also understood. I saw the engine from the PLC, I understand. One thing I do want to state: on the repository side we are sticking to gamma, as we have progressed through that hierarchy on our own side, though it may differ internally. I've brought up the version confusion with the foundation; I went by what was in the code, as there was no firm documentation, and went with alpha, beta, preprod. After this, I'm sure we will have documentation on it!! :D |
This pulls in the bundled SHA256 and HMAC SHA256 implementation (src/sha256.c, src/hmac_sha256.c) used for AWS SigV4 request signing, so the S3 object store connector no longer links OpenSSL and depends on libcurl alone. That removes the OpenSSL versus bundled wolfSSL conflict that broke the Windows buildbot runner, while leaving S3 fully functional on every platform.
storage/tidesdb/CMakeLists.txt now compiles the new crypto sources into the static archive, links the S3 connector against libcurl only, and generates libtidesdb's public tidesdb_version.h from its template (as the upstream build does) carrying TIDESDB_VERSION and the TIDESDB_HAS_S3 and TIDESDB_HAVE_{ZSTD,LZ4,SNAPPY} feature defines.
Exposed the vendored library version through the engine. ha_tidesdb.cc includes the generated header and publishes a tidesdb_library_version status variable, so SHOW GLOBAL STATUS reports which libtidesdb release the engine was built against, independent of the plugin version. The status variable test is updated accordingly.
Relocated the test suite from mysql-test/suite/tidesdb to storage/tidesdb/mysql-test/tidesdb so the engine owns its tests (the layout storage/rocksdb uses), and renamed the misnamed tidesdb_stress test to tidesdb_concurrency since it is a deterministic transaction and concurrency correctness test rather than a load test.
Made the suite preserve global state for mtr check-testcase. The tests run the test database as utf8mb4 (set in have_tidesdb.inc); cleanup_tidesdb.inc now restores it to the server default instead of forcing utf8mb4, the two force restart fulltext tests source cleanup, and tidesdb_vector runs its VECTOR skip check before the charset change so a skipped run leaves the database unchanged. Without this every test reported a check-testcase side effect.
Tested from 11.4 through 13.0: full tidesdb suite green, check-testcase clean across repeated parallel runs.
Include winsock2.h before curl pulls it in on MSVC so struct timeval is defined exactly once. winsock2.h sets _WINSOCK2API_, so compat.h skips its own timeval definition and curl's later include does not collide with it. MinGW continues to resolve timeval through its POSIX sys/time.h.
|
@sanja-byelkin OK for review on your side. I've described the changes quite extensively in each commit and, furthermore, tied the @tidesdb patch to it. The Windows packages buildbot is now failing on something unrelated to either TidesDB or TideSQL. |
|
There is: This looks related primarily to the install executable for Windows, a core code change as opposed to a plugin one. I can make the adjustment, but before touching MariaDB code I'd obviously like confirmation on that aspect. In the interim, at TidesDB we will be merging v9.3.6 for release; it's all good on our side. What's next will be contained to the MariaDB PR; the TideSQL repo and all documentation will also be updated shortly to reflect it. Thank you!!! |
The lz4 provider service exposed LZ4_compressBound, LZ4_compress_default and LZ4_decompress_safe. Add LZ4_compress_fast so storage engines that use liblz4's acceleration parameter can route their lz4 compression through the provider instead of linking liblz4 directly. The new function pointer is appended to provider_service_lz4_st and wired up in the provider plugin init, with a not-loaded warning stub in sql_plugin_services.inl. VERSION_provider_lz4 is bumped to 0x0101 for the additive service change.
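The shape of such an additive service change can be sketched roughly as below. Everything here is hypothetical and simplified: `lz4_service_sketch`, the stub and fake functions, and `provider_init_sketch` are illustration names, the real provider_service_lz4_st has different members, and the "codec" only copies bytes rather than calling liblz4.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical, simplified service table. The real provider_service_lz4_st
// has more members and different field names.
struct lz4_service_sketch {
  int (*compress_bound)(int input_size);
  // Appended last so existing members keep their offsets (additive change).
  int (*compress_fast)(const char *src, char *dst, int src_size,
                       int dst_capacity, int acceleration);
  bool is_loaded;
};

// Not-loaded stubs: warn and report failure (0) instead of crashing.
static int stub_compress_bound(int) {
  std::fprintf(stderr, "lz4 provider is not loaded\n");
  return 0;
}
static int stub_compress_fast(const char *, char *, int, int, int) {
  std::fprintf(stderr, "lz4 provider is not loaded\n");
  return 0;
}

static lz4_service_sketch lz4_service = {stub_compress_bound,
                                         stub_compress_fast, false};

// Fake codec standing in for liblz4: it only copies bytes.
static int fake_compress_bound(int input_size) { return input_size; }
static int fake_compress_fast(const char *src, char *dst, int src_size,
                              int dst_capacity, int /*acceleration*/) {
  assert(src != nullptr && dst != nullptr);
  if (src_size < 0 || src_size > dst_capacity) return 0;
  std::memcpy(dst, src, static_cast<std::size_t>(src_size));
  return src_size;
}

// What a provider plugin's init step would do: swap the stubs for real calls.
static void provider_init_sketch() {
  lz4_service.compress_bound = fake_compress_bound;
  lz4_service.compress_fast = fake_compress_fast;
  lz4_service.is_loaded = true;
}
```

A caller always goes through the table, so before the provider loads it gets a warning and a failure code, and after loading it reaches the real implementation without relinking.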
The tidesdb suite only exercised the NONE and ZSTD per-table COMPRESSION options. Add a dedicated test that round-trips a compressible payload through every codec (NONE, SNAPPY, LZ4, LZ4_FAST, ZSTD), verifying the option persists in SHOW CREATE TABLE and that data survives a write and read cycle. LZ4_FAST additionally covers UPDATE and DELETE so the LZ4_compress_fast acceleration path is rewritten and read back.
Add provider_zstd, mirroring the existing lz4 and snappy compression providers, so storage engines can route Zstandard compression through the loadable provider service instead of linking libzstd directly. The provider exposes ZSTD_compressBound, ZSTD_compress, ZSTD_decompress and ZSTD_isError, the surface TidesDB uses. This adds include/providers/zstd.h (the service shim), the provider_service_zstd libservice, the provider_zstd daemon plugin and its .cnf (plugin_load_add plus force_plus_permanent like the other providers), the VERSION_provider_zstd service version, and the handler registration in sql_plugin_services.inl with not-loaded warning stubs. InnoDB is left unchanged; it has no zstd page-compression algorithm. The provider is exercised by the TidesDB engine, whose compression test covers the ZSTD codec and others.
Switch the engine from linking libzstd, liblz4 and libsnappy directly to resolving them through MariaDB's compression provider services. Keep include/providers on the engine include path so libtidesdb's compress.c compiles against the provider shims, compile the vendored archive with MYSQL_DYNAMIC_PLUGIN so the codec calls bind to the loaded provider, and link mysqlservices so the provider_service_{zstd,lz4,snappy} symbols resolve. The plugin no longer has a direct dependency on any compression library.
All three code paths are always compiled in; a codec is usable at run time when its provider plugin is loaded and fails cleanly (engine stays up, other codecs unaffected) when it is not. The tidesdb suite loads provider_lz4, provider_snappy and provider_zstd in suite.opt because the default table compression is LZ4.
Tested on the full suite with the providers loaded (all pass, check-testcase clean) and verified graceful degradation with a provider absent.