compact-u16 (shortvec)
The 1-to-3-byte variable-length integer that prefixes every array in a Solana transaction. Seven value bits per byte plus a continuation flag — the source of most off-by-one transaction-parsing bugs.
What it is
compact-u16 (also called shortvec) is a variable-length encoding for the length prefix of every array inside a Solana transaction — the signature count, account-key count, instruction count, and the per-instruction account and data lengths. It encodes a value from 0 to 65,535 in 1 to 3 bytes.
Why it exists
Transactions are size-constrained (1,232 bytes on the wire), so spending a fixed 2 or 4 bytes on every array length is wasteful when most arrays are short. compact-u16 spends a single byte for lengths under 128 — which covers the overwhelming majority of real transactions — and only grows when it has to.
Byte layout
Each byte carries 7 bits of value in its low bits; the high bit (0x80) is a continuation flag meaning “another byte follows.”
| Bytes used | Value range | Encoding |
|---|---|---|
| 1 | 0 – 127 | 0vvvvvvv — high bit clear, value in low 7 bits. |
| 2 | 128 – 16,383 | 1vvvvvvv 0vvvvvvv — first byte’s high bit set; next 7 bits in the second byte. |
| 3 | 16,384 – 65,535 | 1vvvvvvv 1vvvvvvv 000000vv — third byte uses only its low 2 bits. |
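The table can be reproduced with a small encoder. This is a minimal sketch, not Solana's own implementation; the function name `encode_compact_u16` is made up for illustration.

```python
def encode_compact_u16(value: int) -> bytes:
    """Encode an integer in 0..65535 as 1 to 3 compact-u16 bytes."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("compact-u16 only holds values 0..65535")
    out = bytearray()
    while True:
        low7 = value & 0x7F          # next 7 value bits, least significant first
        value >>= 7
        if value == 0:
            out.append(low7)         # last byte: continuation flag clear
            return bytes(out)
        out.append(low7 | 0x80)      # more bytes follow: set the continuation flag


# Boundary values from the table above.
for n in (0, 127, 128, 16383, 16384, 65535):
    print(n, encode_compact_u16(n).hex())
```

The loop emits the low 7 bits first, so the byte order matches the layout in the table: 128 becomes `80 01`, and 65,535 becomes `ff ff 03`.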
Decoding accumulates 7 bits at a time, shifting left by 7 for each subsequent byte, stopping at the first byte whose high bit is clear.
    value = 0
    shift = 0
    loop:
        byte = next_byte()
        value |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0: break
        shift += 7
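The pseudocode above could be written as runnable Python roughly as follows. This is a lenient sketch: the name `decode_compact_u16` and the `(value, bytes_consumed)` return shape are assumptions for illustration, and the only check added beyond the pseudocode is a 3-byte cap.

```python
from __future__ import annotations


def decode_compact_u16(buf: bytes, offset: int = 0) -> tuple[int, int]:
    """Return (value, bytes_consumed) for the compact-u16 starting at offset."""
    value = 0
    for i in range(3):                       # a compact-u16 is at most 3 bytes
        byte = buf[offset + i]               # raises IndexError on truncated input
        value |= (byte & 0x7F) << (7 * i)    # 7 value bits, shifted by 0, 7, 14
        if (byte & 0x80) == 0:               # continuation flag clear: done
            return value, i + 1
    raise ValueError("continuation flag still set on the third byte")


print(decode_compact_u16(bytes.fromhex("8001")))
```

Returning the number of bytes consumed lets the caller advance its offset by exactly the prefix width, which is where hand-written parsers tend to drift.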
Where you see it
compact-u16 precedes every array in both legacy and v0 transactions: the signatures vector, the account-keys vector, the instructions vector, each instruction’s account-index vector and data vector, and (in v0) the address-table-lookup vectors. If you’re hand-parsing a transaction and your offsets drift, a misread compact-u16 is the usual culprit.
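As an illustration of how these prefixes drive offset tracking, here is a hedged sketch that skips a signatures vector and parses one instruction. It assumes 64-byte signatures and an instruction laid out as a one-byte program index, a compact-u16-prefixed list of one-byte account indexes, and a compact-u16-prefixed data blob. The helper names are hypothetical, the input is synthetic, and there is no bounds checking.

```python
def read_compact_u16(buf: bytes, pos: int):
    """Decode a compact-u16 at pos; return (value, position after the prefix)."""
    value = 0
    for i in range(3):
        byte = buf[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if (byte & 0x80) == 0:
            return value, pos + i + 1
    raise ValueError("compact-u16 longer than 3 bytes")


def skip_signatures(buf: bytes, pos: int) -> int:
    """Skip the signatures vector: a count, then count * 64-byte signatures."""
    count, pos = read_compact_u16(buf, pos)
    return pos + 64 * count


def read_instruction(buf: bytes, pos: int):
    """Parse one instruction; return (fields, position after it)."""
    program_id_index = buf[pos]
    pos += 1
    n_accounts, pos = read_compact_u16(buf, pos)   # count of u8 account indexes
    accounts = list(buf[pos:pos + n_accounts])
    pos += n_accounts
    data_len, pos = read_compact_u16(buf, pos)     # count of data bytes
    data = buf[pos:pos + data_len]
    pos += data_len
    return (program_id_index, accounts, data), pos


# Synthetic bytes, not a real transaction:
# program index 2, accounts [0, 1], data aa bb cc.
raw = bytes([0x02, 0x02, 0x00, 0x01, 0x03, 0xAA, 0xBB, 0xCC])
fields, end = read_instruction(raw, 0)
print(fields, end)
```

Every reader returns the updated position, so a misread prefix shows up immediately as a wrong `end` value rather than as corrupted fields several arrays later.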
Common gotchas
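- It is not a fixed-width integer, and not unrestricted LEB128. compact-u16 uses the same 7-bits-per-byte, least-significant-group-first layout as unsigned LEB128, but it is capped at 3 bytes and 65,535, so a generic LEB128 decoder can accept input that a shortvec decoder must reject. Reading the prefix as a single u8 works by accident for values 0–127 and then silently breaks at 128; reading it as a fixed 2-byte little-endian u16 is wrong even for small values, because it swallows the first byte of the array. Use a real shortvec decoder.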
- The third byte only has 2 usable value bits. Because the maximum value is that of a u16 (65,535), the third byte tops out at 0b00000011. Decoders must reject a larger third byte, and encoders must reject values above 65,535.
- Length, then elements. The prefix is the element count, not a byte length (the two coincide only for byte arrays such as instruction data). To skip an array you must decode the count and then walk each element, whose size you know from context; you can’t just jump N bytes.
- Non-canonical encodings are invalid. Encoding 5 as two bytes (0x85 0x00) instead of one (0x05) is malformed; strict decoders reject it. Don’t pad.
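The checks in this list can be folded into a strict decoder. The sketch below is one possible arrangement, not the reference implementation; the name `decode_compact_u16_strict` and the error messages are made up. It relies on the observation that a padded encoding always ends in a 0x00 byte that is not the first byte.

```python
from __future__ import annotations


def decode_compact_u16_strict(buf: bytes, offset: int = 0) -> tuple[int, int]:
    """Return (value, bytes_consumed), rejecting malformed encodings."""
    value = 0
    for i in range(3):
        if offset + i >= len(buf):
            raise ValueError("truncated compact-u16")
        byte = buf[offset + i]
        if i == 2 and byte > 0x03:
            # Covers both an overflowing value and a fourth-byte continuation.
            raise ValueError("third byte exceeds 0x03")
        if i > 0 and byte == 0x00:
            # A trailing zero group adds nothing: the value had a shorter form.
            raise ValueError("non-canonical encoding")
        value |= (byte & 0x7F) << (7 * i)
        if (byte & 0x80) == 0:
            return value, i + 1
    raise AssertionError("unreachable: a valid third byte never continues")


print(decode_compact_u16_strict(b"\x05"))       # canonical encoding of 5
try:
    decode_compact_u16_strict(b"\x85\x00")      # padded encoding of 5
except ValueError as err:
    print("rejected:", err)
```

Raising on malformed input, rather than returning a best-effort value, keeps a parser from silently accepting bytes that a strict decoder would reject.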
Last verified: 2026-05-20