Address and Amount - Simple Compression?

Just thinking out loud here.

Would these suggestions for reduced transaction size work?
Are they relevant for CIP9 and CIP10?

  • Address
      • Currently the only size optimization is removing the checksum, right?
      • Each character takes up 1 byte, i.e. 8 bits?
      • Would it be possible to reduce each character to 6 bits? There are 58 characters, and 6 bits give room for 2^6 = 64 values.
  • Amounts
      • With multi-sends, would it save space to let the sender specify a number format?
      • E.g. the sender may only need 5 digits and 2 decimals, because each send is something like 332.21 or 65.53.
      • The format could be Decimal(M,N), where M is the max number of digits and N is the max number of decimals (in the encoded send, which is converted back to satoshis when decoded).

In the current test implementation for CIP10 (and I think CIP9 does the same), addresses are base58-decoded and then the last 4 bytes are stripped to remove the checksum.
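As a rough illustration of that step (not the actual CIP9/CIP10 code), here is a minimal sketch in Python. The helper names `b58decode` and `pack_address` are made up for this post, and verifying the checksum before dropping it is my own assumption about what a careful implementation would do.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def b58decode(s: str) -> bytes:
    """Decode a base58 string into bytes (hypothetical helper)."""
    n = 0
    for ch in s:
        n = n * 58 + B58_ALPHABET.index(ch)  # raises ValueError on a bad character
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    # Each leading '1' in base58 stands for one leading zero byte.
    pad = len(s) - len(s.lstrip("1"))
    return b"\x00" * pad + body

def pack_address(address: str) -> bytes:
    """Return the prefix byte plus 20-byte hash, with the 4-byte checksum removed."""
    raw = b58decode(address)
    payload, checksum = raw[:-4], raw[-4:]
    # Assumption: check the checksum once here, since it is not stored afterwards.
    expected = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    if checksum != expected:
        raise ValueError("bad address checksum")
    return payload

packed = pack_address("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa")
print(len(packed), packed.hex())
```

For a regular address this leaves 21 bytes: 1 prefix byte followed by the 20-byte hash.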

Regarding the amounts, that’s similar to an idea I had called CoInt (compressed int), but as robby and devon pointed out, the implementation details could introduce bugs if the scheme hasn’t been “tried and true”.

For the time being, I’m dropping CoInt from the CIP10 spec. It could certainly save a lot of “0” bytes, but the potential for consensus-breaking bugs is kinda scary.

The address is already as compressed as it will get. We’ll be using the 20 bytes of raw data that come from the base58-decoded address, plus the 1-byte network prefix.
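To put numbers on that, here is a quick back-of-the-envelope check in Python comparing the 6-bits-per-character idea with the raw payload. The function name is hypothetical, and the 33 and 34 character lengths are just the typical lengths of a regular address.

```python
RAW_PAYLOAD_BYTES = 1 + 20  # network prefix + 20-byte hash

def six_bit_packed_bytes(address: str) -> int:
    """Bytes needed if every base58 character were stored in 6 bits."""
    return (len(address) * 6 + 7) // 8  # round up to whole bytes

for length in (33, 34):
    placeholder = "x" * length  # only the length matters here
    print(length, six_bit_packed_bytes(placeholder), RAW_PAYLOAD_BYTES)
```

A 34-character address needs 26 bytes at 6 bits per character (25 bytes for 33 characters), versus 21 bytes for the decoded payload. The 6-bit version is larger because it still carries the 4-byte checksum and spends 6 bits on characters that only hold about 5.86 bits each.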

The number format idea is interesting. @chiguireitor was experimenting with a compressed integer format, but it may prove too complex.

@chiguireitor - what do you think about specifying a decimal format for CIP10? Or a number of bits per send?

Say I wanted to send 2000.5 of a divisible asset, which is 200050000000 satoshis (2000.5 * 10^8). I could use a FORMAT of 15,7. That would provide 15 bits per quantity, and each quantity would be multiplied by 10^7. So 0b100111000100101 (20005) would translate to 200050000000 (20005 * 10^7).
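Here is a minimal sketch of how a FORMAT of bits,exponent could be applied, written in Python. All function names are hypothetical and nothing here is part of the CIP10 spec. Packing the fields big-endian and zero-padding the last byte is just one possible layout.

```python
def encode_quantity(satoshis: int, bits: int, exponent: int) -> int:
    """Scale a satoshi amount down by 10**exponent and check that it fits in `bits`."""
    scale = 10 ** exponent
    if satoshis < 0 or satoshis % scale:
        raise ValueError("quantity is not a non-negative multiple of 10**exponent")
    value = satoshis // scale
    if value >= 1 << bits:
        raise ValueError("quantity does not fit in the given bit width")
    return value

def decode_quantity(value: int, exponent: int) -> int:
    """Convert an encoded value back to satoshis."""
    return value * 10 ** exponent

def pack_quantities(satoshi_amounts, bits: int, exponent: int) -> bytes:
    """Concatenate fixed-width fields; the final byte is zero-padded on the right."""
    acc = 0
    for amount in satoshi_amounts:
        acc = (acc << bits) | encode_quantity(amount, bits, exponent)
    total_bits = bits * len(satoshi_amounts)
    nbytes = (total_bits + 7) // 8
    acc <<= nbytes * 8 - total_bits
    return acc.to_bytes(nbytes, "big")

encoded = encode_quantity(200050000000, bits=15, exponent=7)
print(bin(encoded), decode_quantity(encoded, 7))
print(len(pack_quantities([200050000000] * 3, bits=15, exponent=7)))
```

With this layout, three sends at 15 bits each take 6 bytes, compared with 24 bytes if every quantity were stored as an 8-byte integer. The FORMAT itself would also need to be stored somewhere in the message, which this sketch ignores.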

I don’t think the user should specify this number format, but the compose API could try to pick a format that saves the most space. We could even have lookup tables and multiple formats.
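A rough sketch of what "pick a format" could look like for a single shared format, in Python: take the largest power of ten that divides every quantity, then use the bit width of the largest scaled value. `choose_format` and `MAX_EXPONENT` are hypothetical names, the cap of 18 is an arbitrary assumption, and lookup tables or multiple formats are not covered.

```python
MAX_EXPONENT = 18  # assumed cap so that all-zero inputs terminate

def choose_format(satoshi_amounts):
    """Return (bits, exponent) for one shared format covering all amounts."""
    if not satoshi_amounts or any(a < 0 for a in satoshi_amounts):
        raise ValueError("need at least one non-negative amount")
    exponent = 0
    # Grow the exponent while every amount is still divisible by the next power of ten.
    while exponent < MAX_EXPONENT and all(
        a % 10 ** (exponent + 1) == 0 for a in satoshi_amounts
    ):
        exponent += 1
    largest = max(satoshi_amounts) // 10 ** exponent
    bits = max(1, largest.bit_length())  # use at least 1 bit, even for zero
    return bits, exponent

print(choose_format([200050000000]))
print(choose_format([33221000000, 6553000000]))  # 332.21 and 65.53 of a divisible asset
```

For the 2000.5 example this picks 15,7. For sends of 332.21 and 65.53 it picks 16,6. A larger common exponent never makes the scaled values bigger, so for a single shared format this choice gives the smallest bit width per quantity.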

Of course this adds a lot of complexity. It may or may not be worth it.