Binary sv2 cleanup #2200

Open
Shourya742 wants to merge 20 commits into stratum-mining:main from Shourya742:2026-06-16-binary-sv2-cleanup

Conversation

@Shourya742 Shourya742 commented Jun 16, 2026

Member

closes: #2171 #2173 #2175 #2176 #2177 #2178 #2179 #2213 #2214 #2215 #2216

Apart from the issues mentioned above, this PR introduces several additional improvements:

  1. Removes a significant amount of redundant code.
  2. Expands the API surface. In particular, conversions for copy types are now lossless, while conversions for variable-sized types are performed through TryFrom. (More details can be found in the companion sv2-apps PR. I'm open to suggestions for alternative APIs; these additions are primarily based on my experience working with the sv2-apps codebase.)
  3. Introduces a dedicated MAC type, bringing us into full compliance with the specification.
  4. Replaces a large number of handwritten From implementations with MBE macros, significantly reducing duplication.
  5. Fixes issues in the streaming APIs. Previously, certain code paths could lead to unbounded reads, creating potential attack vectors. Reads are now properly bounded, and recursion-related bounds have been removed from the writer implementation.
  6. Removes U32Ref and U8Owned, as they do not provide any meaningful structural advantages.
  7. Adds iterator support for Seq types, allowing users to iterate over elements and perform in-place modifications more ergonomically.
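
As a rough illustration of point 2, here is a minimal sketch of the conversion split. `Counter`, `SmallBytes`, and `TooLong` are hypothetical stand-ins invented for this example and are not the actual binary_sv2 types or error variants; the 32-byte cap is only an assumption modelled on B032.

```rust
use std::convert::TryFrom;

/// Hypothetical fixed-size copy type (not a real binary_sv2 type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Counter(u32);

/// Widening from u16 can never lose information, so a plain `From` is enough.
impl From<u16> for Counter {
    fn from(v: u16) -> Self {
        Counter(u32::from(v))
    }
}

/// Hypothetical variable-sized type capped at 32 bytes (assumption, B032-like).
#[derive(Debug, PartialEq)]
struct SmallBytes(Vec<u8>);

/// Hypothetical error returned when the length invariant would be violated.
#[derive(Debug, PartialEq)]
struct TooLong {
    len: usize,
    max: usize,
}

/// The input may be longer than the cap, so the conversion is fallible.
impl TryFrom<Vec<u8>> for SmallBytes {
    type Error = TooLong;

    fn try_from(v: Vec<u8>) -> Result<Self, Self::Error> {
        const MAX: usize = 32;
        if v.len() <= MAX {
            Ok(SmallBytes(v))
        } else {
            Err(TooLong { len: v.len(), max: MAX })
        }
    }
}
```

The idea is that a caller cannot construct a value that violates the length invariant without handling the error, while fixed-size conversions stay infallible.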

companion stratum-mining/sv2-apps#576

@Shourya742

Member Author

Awesome, the tests are passing, so at least I have not broken any invariants. I need to improve the API ergonomics, and then it should be good to go.

@Shourya742 Shourya742 force-pushed the 2026-06-16-binary-sv2-cleanup branch 4 times, most recently from 4a79675 to 07b0f31 Compare June 22, 2026 15:28
@Shourya742 Shourya742 marked this pull request as ready for review June 22, 2026 15:39
@Shourya742 Shourya742 requested review from GitGab19 and plebhash June 22, 2026 15:39
@Shourya742

Member Author

I will open issues for all the extra points mentioned in the description.

@GitGab19

Member

My clanker found this:

High: the new bounded from_reader can still block/read until EOF on an invalid B032 length. [decodable.rs (line 70)](/tmp/stratum-pr2200/sv2/binary-sv2/src/codec/decodable.rs:70) treats every ReadError as “need one more byte”, but [inner.rs (line 109)](/tmp/stratum-pr2200/sv2/binary-sv2/src/datatypes/non_copy_data_types/inner.rs:109) also returns ReadError when a variable-sized type declares more than its max, e.g. B032 header 33. For messages with B032 fields, a peer can send an oversized length and keep the socket open; the loop keeps reading instead of rejecting the message. I’d split “insufficient bytes” from “declared size exceeds max”, or make from_reader retry only on the former.

@Shourya742 Shourya742 force-pushed the 2026-06-16-binary-sv2-cleanup branch from 07b0f31 to a1f2b6a Compare June 24, 2026 17:10
issue: stratum-mining#2214. In binary_sv2, the try_from and from impls
from all primitive types to the Encodable and Decodable variants were very repetitive. This commit
adds macros to remove the repetition and make the file easier to read.
Here we add slices as an encodable field when buffer_sv2 is enabled; this is not tracked by an issue.
size_hint reports how many bytes a value is expected to occupy.
Initially it would just return the size without checking whether
the buffer could even satisfy that requirement. Now we make sure the buffer
length is adequate; otherwise we return a read error.
…esn't leak internal implementation

This one fixes two of the loupe issues and one of the ergonomics issues:
1. stratum-mining#2176 -> Making sure the vector
conversion doesn't break the length invariant.
2. stratum-mining#2179 -> Making sure sv2 primitives
don't leak their internal representation.
3. stratum-mining#2216 -> Adding iterators to seq
primitives.

This commit also updates sv2_to_sv1 to not use the list type's internal representation.
…rait

This is more of a cleanup commit, where we remove the repetitive hex conversion
code in the primitives' Display trait implementations.
This is also a loupe issue: stratum-mining#2171

The loupe report says that we were using the unchecked variant, which would
bypass the length invariant. It is also not something
we require, so we are removing it.
This solves two loupe issues:
1. stratum-mining#2173 -> Here we fix the size of the FIXED
primitive to be equal to SIZE and not one.
2. stratum-mining#2177 -> Make sure the header is appended
to the writer.
…olicy

This solves a loupe issue: stratum-mining#2178.
We no longer do unbounded reads with with_reader.
Decode now returns an error. Previously it relied heavily on the
unchecked variants, meaning there were no constraints on the addition. This
now forces us to return an error if some invariants are not met.
This also adds an owned decode variant, where decode consumes the
buffer instead of just referencing it.
In this commit we remove the unchecked variants from_bytes_ and to_slice_
across the crate, as we no longer use them and there is no point in keeping unsafe
methods.
…y_sv2

This solves the issue: stratum-mining#2213. Here we
introduce a new custom API and deprecate the old rigid APIs.
Part of the removal of U32AsRef: stratum-mining#2215;
we don't really need this.
Here, we remove error type variants across the crate which are not
really needed.
Solves issue: stratum-mining#2175.
With this we fully adhere to the spec and support all primitives
mentioned.
Completes the issue: stratum-mining#2215; removes
u8Owned and uses u8 instead.
Removes into_owned, as internally it just calls into_static, so there is no point
in having it.
Previously, the writer method would lead to unbounded recursion;
this commit solves that by directly calling the encodable field's writer method.
@Shourya742 Shourya742 force-pushed the 2026-06-16-binary-sv2-cleanup branch from a1f2b6a to fe3c973 Compare June 24, 2026 17:10
@GitGab19

Member

@Shourya742 just a suggestion about the point raised by the clanker:

Yes. The core fix is to stop using ReadError for two different meanings:

  1. “I need more bytes to know the size”
  2. “The size is known and invalid”

Right now both flow through ReadError, and decodable.rs treats every ReadError as recoverable by reading one more byte.

The best fix is in Inner::expected_length / expected_length_for_reader:

// pseudo-shape
if payload_len <= MAXSIZE {
    Ok(payload_len + HEADERSIZE)
} else {
    Err(Error::ValueExceedsMaxSize(
        ISFIXED,
        SIZE,
        HEADERSIZE,
        MAXSIZE,
        data.to_vec(),
        payload_len,
    ))
}

For the reader path:

if expected_length <= MAXSIZE {
    Ok(expected_length)
} else {
    Err(Error::ValueExceedsMaxSize(
        ISFIXED,
        SIZE,
        HEADERSIZE,
        MAXSIZE,
        header.to_vec(),
        expected_length,
    ))
}

Then Decodable::from_reader can keep retrying only on true partial-read errors:

Err(Error::OutOfBound | Error::ReadError(_, _)) => read_one_more_byte()
Err(error) => return Err(error)

With that change, B032 input [33] would return immediately because the header is complete and declares a payload larger than 32. It would no longer read/block waiting for 33 bytes.

A slightly cleaner design would add a dedicated error like Error::InvalidDeclaredLength { declared, max }, because ValueExceedsMaxSize carries a value vector and is a bit awkward for “length header is invalid”. But if the PR wants a small surgical fix, returning ValueExceedsMaxSize from the known-invalid length paths is enough.
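
To make the suggestion concrete, here is a minimal, self-contained sketch of the split, assuming a B032-like field with a 1-byte length header and a 32-byte maximum. `DecodeError`, `expected_length`, and `read_field` are hypothetical names invented for this example; they are not the crate's actual `Error`, `Inner::expected_length`, or `Decodable::from_reader`.

```rust
use std::io::Read;

/// Hypothetical error type that separates "need more bytes" from "invalid length".
#[derive(Debug, PartialEq)]
enum DecodeError {
    /// The header is incomplete; reading more bytes may fix this.
    NeedMoreBytes,
    /// The header is complete and declares more than the type allows.
    InvalidDeclaredLength { declared: usize, max: usize },
    /// The reader ended before the declared bytes arrived.
    UnexpectedEof,
}

const HEADER_SIZE: usize = 1; // assumption: 1-byte length prefix, as for B032
const MAX_SIZE: usize = 32; // assumption: payload capped at 32 bytes

/// Total encoded length (header + payload), or the reason it cannot be known.
fn expected_length(data: &[u8]) -> Result<usize, DecodeError> {
    match data.first() {
        None => Err(DecodeError::NeedMoreBytes),
        Some(&len) if usize::from(len) > MAX_SIZE => {
            Err(DecodeError::InvalidDeclaredLength {
                declared: usize::from(len),
                max: MAX_SIZE,
            })
        }
        Some(&len) => Ok(HEADER_SIZE + usize::from(len)),
    }
}

/// Reads one field, retrying only while the header is incomplete.
fn read_field<R: Read>(reader: &mut R) -> Result<Vec<u8>, DecodeError> {
    let mut buf = Vec::new();
    let total = loop {
        match expected_length(&buf) {
            Ok(total) => break total,
            // Only a truly partial header justifies reading one more byte.
            Err(DecodeError::NeedMoreBytes) => {
                let mut byte = [0u8; 1];
                reader
                    .read_exact(&mut byte)
                    .map_err(|_| DecodeError::UnexpectedEof)?;
                buf.push(byte[0]);
            }
            // A known-invalid length is rejected without touching the payload.
            Err(other) => return Err(other),
        }
    };
    // The remaining read is bounded by the validated length.
    let mut payload = vec![0u8; total - buf.len()];
    reader
        .read_exact(&mut payload)
        .map_err(|_| DecodeError::UnexpectedEof)?;
    buf.extend_from_slice(&payload);
    Ok(buf)
}
```

In this sketch, a header byte of 33 is rejected after exactly one byte has been consumed, so a peer holding the socket open cannot keep the loop reading.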

}
}

// IMPL TRY_FROM DECODEC FIELD FOR PRIMITIVES

@GitGab19 GitGab19 Jun 25, 2026

Member


Why did you remove the TryFrom implementations for EncodableField, without replacing them with a macro?

Comment thread sv2/binary-sv2/src/lib.rs
#[cfg(feature = "with_buffer_pool")]
impl From<buffer_sv2::Slice> for EncodableField<'_> {
fn from(_v: buffer_sv2::Slice) -> Self {
unreachable!()

Member


Do you know why this was labeled as unreachable earlier?

@GitGab19

Member

Given that in 012536a you're fixing two issues found by Loupe, can't you add the tests it reported as well?

//
// ### `Sv2DataType`
// The `Sv2DataType` trait is implemented for these data types, providing methods for encoding and
// decoding operations such as `from_bytes_unchecked`, `from_vec_`, `from_reader_` (if `std` is

Member


Remove the mention of from_vec_.

@GitGab19

Member

I would add tests mentioned by Loupe on commit 173e162 as well.

@GitGab19

Member

Add Loupe's tests on 5a0d4bb as well.

Comment on lines 383 to 405
fn decode_owned(
&self,
data: &[u8],
offset: usize,
) -> Result<DecodablePrimitive<'static>, Error> {
macro_rules! decode_owned_copy {
($ty:ty, $variant:ident) => {{
let mut owned = data[offset..].to_vec();
Ok(DecodablePrimitive::$variant(<$ty>::from_bytes_(
&mut owned,
)?))
}};
}

macro_rules! decode_owned_fixed_inner {
($ty:ty, $variant:ident) => {{
let data = &data[offset..];
let size = <$ty>::size_hint(data, 0)?;
Ok(DecodablePrimitive::$variant(<$ty>::try_from(
data[..size].to_vec(),
)?))
}};
}

Member


Why are you still keeping this one?

Member


In e1720b3 you added a new fn decode_owned(), which you then removed in 09d67d2?

What's going on?

@GitGab19

Member

Why isn't b4aebeb right after cf697b2?

They are both tackling #2215, one removing U32Ref and the other removing OwnedU8.
