BOLT 12 offer encoding #1597


Closed · wants to merge 12 commits

Conversation

jkczyz
Contributor

@jkczyz jkczyz commented Jul 6, 2022

WIP on encoding and parsing offer messages.

TODO:

/// For variable-length values within TLV record where the length is encoded as part of the record.
/// Used to prevent encoding the length twice.
#[derive(Clone)]
pub(crate) struct WithoutLength<T>(pub T);
Collaborator

We already have the vec_type TLV macro type for this - is there a reason we need this?

Contributor Author

@jkczyz jkczyz Jul 6, 2022

Hmmm... if I use vec_type with impl_writeable_msg!, the Writeable implementation wants the field to be Vec<T> but the Readable implementation wants it to be Option<Vec<T>>.

If I use impl_writeable_tlv_based!, the variable is correctly unwrapped in the Readable implementation when initializing the struct. However, this macro won't work because it serializes the total length of the TLV stream, which isn't compliant with the spec.
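To illustrate the duplicated-length concern that WithoutLength is meant to avoid, here is a minimal standalone sketch. The byte-level helpers are simplified stand-ins, not LDK's actual Writeable/Writer traits: the TLV record header already carries the value's length, so an inner prefix would encode it twice.

```rust
// Simplified sketch: a serializer that writes an explicit length prefix,
// versus a WithoutLength wrapper that omits it because the enclosing TLV
// record's own length field already delimits the value.

fn write_with_length(out: &mut Vec<u8>, bytes: &[u8]) {
    // Real TLV uses a BigSize length; a single byte suffices for the sketch.
    out.push(bytes.len() as u8);
    out.extend_from_slice(bytes);
}

struct WithoutLength<'a>(&'a [u8]);

impl<'a> WithoutLength<'a> {
    fn write(&self, out: &mut Vec<u8>) {
        // No inner prefix: the record header supplies the length.
        out.extend_from_slice(self.0);
    }
}

fn main() {
    let value = b"offer";
    let mut prefixed = Vec::new();
    write_with_length(&mut prefixed, value);
    let mut bare = Vec::new();
    WithoutLength(value).write(&mut bare);
    // The prefixed form is exactly one length byte longer here.
    assert_eq!(prefixed.len(), bare.len() + 1);
    assert_eq!(&prefixed[1..], &bare[..]);
}
```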

Contributor Author

Ok, so I whipped together a macro that is compliant with the spec. However, if a vec_type field is missing when parsing, it will be later serialized as present but with no values (i.e., a parsing-serialization roundtrip results in a different serialization).

Collaborator

Hmmm... if I use vec_type with impl_writeable_msg!, the Writeable implementation wants the field to be Vec&lt;T&gt; but the Readable implementation wants it to be Option&lt;Vec&lt;T&gt;&gt;.

Yes, the way the init_tlv_field_var stuff works it creates an option, but it unwraps it once it gets to the struct creation phase, so that should be invisible from the pov of the user writing the macro? Not sure I quite understand the issue here.

Ok, so I whipped together a macro that is compliant with the spec. However, if a vec_type field is missing when parsing, it will be later serialized as present but with no values (i.e., a parsing-serialization roundtrip results in a different serialization).

Right, yea, we could do an optional_vec for that, I guess?

Contributor Author

Yes, the way the init_tlv_field_var stuff works it creates an option, but it unwraps it once it gets to the struct creation phase, so that should be invisible from the pov of the user writing the macro? Not sure I quite understand the issue here.

The unwrap doesn't occur in impl_writeable_msg since init_tlv_based_struct_field is not used.

Maybe this was missed? Anyhow, need to make a new macro in order to remove the _empty field.

Right, yea, we could do an optional_vec for that, I guess?

We'd still have a problem with a parsing-serialization roundtrip if given a TLV record with length 0, FWIW. Not sure if such a thing would ever be written, but I've seen the presence of a TLV record have meaning elsewhere.
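The roundtrip ambiguity can be shown in isolation: a bare Vec collapses "record absent" and "record present with zero length" into the same state, while an Option<Vec<_>> keeps them distinct. The helper below is illustrative, not LDK code.

```rust
// Hypothetical serializer for an optional vec TLV field: `None` writes no
// record at all, `Some(v)` writes a record whose payload is v (possibly
// empty). Flattening to a bare Vec loses the distinction.

fn serialize_vec_field(field: &Option<Vec<u8>>) -> Option<Vec<u8>> {
    field.clone()
}

fn main() {
    let absent: Option<Vec<u8>> = None;
    let empty: Option<Vec<u8>> = Some(Vec::new());
    // With Option, absent and zero-length roundtrip differently.
    assert_ne!(serialize_vec_field(&absent), serialize_vec_field(&empty));
    // Flattening both to Vec makes them indistinguishable.
    let flat_absent: Vec<u8> = absent.unwrap_or_default();
    let flat_empty: Vec<u8> = empty.unwrap_or_default();
    assert_eq!(flat_absent, flat_empty);
}
```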

Collaborator

Maybe this was missed? Anyhow, need to make a new macro in order to remove the _empty field.

Ah, yes, that is an oversight in impl_writeable_msg, it should be using init_tlv_based_struct_field

We'd still have a problem with a parsing-serialization roundtrip if given a TLV record with length 0, FWIW. Not sure if such a thing would ever be written, but I've seen the presence of a TLV record have meaning elsewhere.

Then we shouldn't write something like that in an optional_vec and reject it on the read end?

Contributor Author

Then we shouldn't write something like that in an optional_vec and reject it on the read end?

I guess I'm just not sure if it is technically invalid.

Collaborator

Hmm, we should probably get it specified in BOLT 12 then, no?

/// An `offer` TLV stream without any semantic checks, apart from any checks performed when parsing
/// the underlying types.
struct OfferTlvStream {
_empty: (),
Collaborator

What's the reason for this?

Contributor Author

impl_writeable_msg! doesn't like an empty set of non-TLV fields, and it seems impl_writeable_tlv_based! isn't compliant with the spec since it writes a length prefix.

Collaborator

Heh, right, then let's add a new macro that doesn't require more stuff?

Contributor Author

Indeed, though note without using OfferTlvStream we'd be handwriting Readable and Writeable with read_tlv_fields! and write_tlv_fields! instead.


use prelude::*;

/// An `offer` TLV stream without any semantic checks, apart from any checks performed when parsing
Collaborator

Hmmmm, is it not possible to have any of the fields be required? In general I'd prefer to do the required checks during deserialization so that we end up with a struct that we can expose that looks reasonable, rather than do the checks afterwards and convert a TlvStream struct into a ReasonableForHumans struct, copying it around.

Contributor Author

Hmmmm, is it not possible to have any of the fields be required?

Looks like the only required field is description. Not sure if we should necessarily make the field required here as the error would just be InvalidValue without indicating why.

In general I'd prefer to do the required checks during deserialization so that we end up with a struct that we can expose that looks reasonable, rather than do the checks afterwards and convert a TlvStream struct into a ReasonableForHumans struct, copying it around.

My plan was to wrap it in an Offer struct rather than copying it, and have accessors into the underlying OfferTlvStream. So the user would do something like "..".parse::<Offer>(), which would delegate to parsing with OfferTlvStream (which is private) before performing semantic checks.
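A minimal sketch of the wrapping approach described here, with hypothetical names, only two fields, and the bech32/TLV parsing layer stubbed out. The point is the shape: Offer can only be constructed from an OfferTlvStream that passed the semantic checks, so the accessor's unwrap is justified by construction.

```rust
// Illustrative only -- not LDK's actual types or checks.

#[derive(Debug)]
struct OfferTlvStream {
    description: Option<String>,
    node_id: Option<[u8; 33]>,
}

#[derive(Debug)]
struct Offer {
    tlv_stream: OfferTlvStream,
}

#[derive(Debug, PartialEq)]
enum SemanticError {
    MissingDescription,
    MissingNodeId,
}

impl Offer {
    // Semantic checks happen once, at construction; a failing stream
    // never becomes an Offer.
    fn from_tlv_stream(tlv_stream: OfferTlvStream) -> Result<Self, SemanticError> {
        if tlv_stream.description.is_none() {
            return Err(SemanticError::MissingDescription);
        }
        if tlv_stream.node_id.is_none() {
            return Err(SemanticError::MissingNodeId);
        }
        Ok(Offer { tlv_stream })
    }

    fn description(&self) -> &str {
        // Safe: presence was checked at construction time.
        self.tlv_stream.description.as_ref().unwrap()
    }
}

fn main() {
    let stream = OfferTlvStream {
        description: Some("coffee".into()),
        node_id: Some([2u8; 33]),
    };
    let offer = Offer::from_tlv_stream(stream).unwrap();
    assert_eq!(offer.description(), "coffee");
}
```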

Collaborator

@TheBlueMatt TheBlueMatt Jul 7, 2022

Looks like the only required field is description. Not sure if we should necessarily make the field required here as the error would just be InvalidValue without indicating why.

That's a separate issue though? If we want more errors out of the decoder, we can create those, let's not avoid using our decoder just because we want to add more features to the decoder.

My plan was to wrap it in an Offer struct rather than copying it, and have accessors into the underlying OfferTlvStream. So the user would do something like "..".parse::&lt;Offer&gt;(), which would delegate to parsing with OfferTlvStream (which is private) before performing semantic checks.

I'm generally very much against this kind of approach - I'm strongly in the "write your datastructures right and the code writes itself" camp - the datastructures should thus generally not be able to represent an invalid state to the greatest extent possible. That implies doing checks during deserialization and refusing to create an object that is invalid, not deserializing and then checking later, leading to tons of unwraps in the accessors assuming the preconditions were checked correctly.

Contributor Author

That's a separate issue though? If we want more errors out of the decoder, we can create those, let's not avoid using our decoder just because we want to add more features to the decoder.

I guess this is related to discussion below about code organization. If OfferTlvStream is thought to be an underlying implementation of Offer, then there are a few semantic checks related to field presence that would be required:

  • description is present
  • either node_id or paths is present but not both
  • refund_for is present only if send_invoice is present

I wouldn't want to split these semantic checks across two layers. But if there is only an Offer struct, then all these checks would be written in a custom Readable, IIUC, where the first is performed by a read_tlv_fields! and the other two are handwritten.

I'm generally very much against this kind of approach - I'm strongly in the "write your datastructures right and the code writes itself" camp - the datastructures should thus generally not be able to represent an invalid state to the greatest extent possible. That implies doing checks during deserialization and refusing to create an object that is invalid, not deserializing and then checking later, leading to tons of unwraps in the accessors assuming the preconditions were checked correctly.

Hmmm... I think there's a tradeoff here. The code won't write itself in this case because we are required to adhere to the encoding as defined in BOLT 12, which doesn't have concepts of enums or structs with optional values, for instance. So the resulting Readable and Writeable implementations would be very procedural and prone to have inconsistencies.

On the other hand, the OfferTlvStream approach leads to a very declarative format, semantic checks are separated from type checks, and serialization is free and therefore less error prone (i.e., doesn't require deconstructing types into component TLV records). Given only one field is required, there should be only one unwrap; the optional field accessors are mostly of the nature: self.tlv_stream.field.as_ref(). Additionally, the wrapper Offer struct could only be created from a semantically valid OfferTlvStream since Offer's decoder would be responsible for performing semantic checks on OfferTlvStream. The drawback is accessors for complex types (i.e., enums, structs) may need to access more than one TLV record.

Collaborator

Right, not using a wrapper means the semantic checks end up in the read method, rather than in a semantic_checks method. That said, the datastructure still forces the read code to do all relevant checks: by refusing to represent (as many) invalid states as possible, it becomes much harder to write incorrect read code. Further, and more generally, it ensures the semantic check code is all in one place - in the read method - instead of spread between a semantic checks method, assumptions about its behavior in accessors fetching fields, and checks exposed to users, instead of simply exposing to users a datastructure that they can instantiate and that only works for correct states.

Contributor Author

Will push a couple WIP commits showing the accessor API with wrapped types and separated semantic checks. I think I'd prefer to at least have the intermediary struct to enforce the TLV encoding, even if forgoing the wrapper approach. It gives a much better separation between type checks and semantic checks. The OfferTlvStream can always be deconstructed into parts when converting to Offer. There's nothing performance critical here.

impl Offer {
///
pub fn description(&self) -> &String {
&self.tlv_stream.description.as_ref().unwrap().0
Collaborator

It's exactly this kind of thing that I find really gross. If there's a field that's required, it should be required, not an Option. It pushes the complexity into the implementation instead of the struct. I'm further not really sure why we can't just make it required in the TlvStream - why does it have to be an option and then unwrapped?

Contributor Author

It's exactly this kind of thing that I find really gross. If there's a field that's required, it should be required, not an Option. It pushes the complexity into the implementation instead of the struct.

Agreed. Converting the TLV stream instead of wrapping would avoid this and is preferable.

I'm further not really sure why we can't just make it required in the TlvStream - why does it have to be an option and then unwrapped?

With the converting approach, I would prefer to keep it optional in the TLV stream (but required in Offer) so as to not mix error types across layers. Otherwise, missing a description would be a DecodeError while missing both node_id and paths would be a SemanticError.
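The two-layer error split being proposed can be sketched as follows: the TLV layer yields a DecodeError for malformed records, conversion to Offer yields a SemanticError for rule violations such as missing both node_id and paths, and a top-level error unifies them for callers. All names and the decode stub here are illustrative, not LDK's actual types.

```rust
#[derive(Debug, PartialEq)]
enum DecodeError { InvalidValue }

#[derive(Debug, PartialEq)]
enum SemanticError { MissingDestination }

#[derive(Debug, PartialEq)]
enum ParseError {
    Decode(DecodeError),
    Semantics(SemanticError),
}

#[derive(Debug)]
struct OfferTlvStream { node_id: Option<u8>, paths: Option<Vec<u8>> }

// Stub for the type-level TLV decoding layer.
fn decode(bytes: &[u8]) -> Result<OfferTlvStream, DecodeError> {
    match bytes.first() {
        Some(&n) => Ok(OfferTlvStream { node_id: Some(n), paths: None }),
        None => Err(DecodeError::InvalidValue),
    }
}

// Semantic layer: either node_id or paths must be present.
fn check_semantics(tlv: &OfferTlvStream) -> Result<(), SemanticError> {
    if tlv.node_id.is_none() && tlv.paths.is_none() {
        return Err(SemanticError::MissingDestination);
    }
    Ok(())
}

fn parse_offer(bytes: &[u8]) -> Result<OfferTlvStream, ParseError> {
    let tlv = decode(bytes).map_err(ParseError::Decode)?;
    check_semantics(&tlv).map_err(ParseError::Semantics)?;
    Ok(tlv)
}

fn main() {
    assert_eq!(parse_offer(&[]).unwrap_err(), ParseError::Decode(DecodeError::InvalidValue));
    assert!(parse_offer(&[2]).is_ok());
}
```

Callers see one entry point but can still match on which layer rejected the input.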

Collaborator

With the converting approach, I would prefer to keep it optional in the TLV stream (but required in Offer) so as to not mix error types across layers. Otherwise, missing a description would be a DecodeError while missing both node_id and paths would be a SemanticError.

Right, but there you're trying to work around limitations in the serialization framework by adding complexity further up...instead of improving the serialization framework to do what you want. If there's something you want from the serialization framework, lets add it, not work around it.

Contributor Author

Right, but there you're trying to work around limitations in the serialization framework by adding complexity further up...instead of improving the serialization framework to do what you want. If there's something you want from the serialization framework, lets add it, not work around it.

Hmmm... I'm not trying to get around any limitations of the serialization framework nor do I want anything more from it. It's working exactly as I expect it should: decoding a stream of TLV records as defined by each record's type. I don't see a compelling reason why decoding should include semantic checks. This is a standard separation of concerns.

Are you arguing that all semantic checks should be handled by the serialization framework? If not, what is the limiting principle as to what should be included there? It doesn't take long before business logic finds its way in (e.g., quantity_min <= quantity_max -- assuming both are set -- or signature is valid for the offer given node_id, or offer has not expired -- assuming std). There are more complex semantics across amount, currency, and chains, which I haven't touched on yet. Excluding all those, supporting even simple semantics would add considerable complexity and maintenance burden to the serialization macros (e.g., if field A is set, field B must not be set; field A must only be set if field B is set; etc.).

Collaborator

@TheBlueMatt TheBlueMatt Jul 13, 2022

Yes, any semantic checks we want to do should happen at deserialization-time, at least from the perspective of the user. They can, of course, happen in separate functions if that makes more sense, but the user should call one public method and it should deserialize and do semantic checks. I don't think we really disagree here - you were already planning on doing this, AFAIU, just in a separate function that is invisible to the user, which is fine.

Separately, as much as possible without making things much too complicated, the datastructure should enforce as many semantic checks as possible, and once it does so, the deserialization logic will need to do the same in the same method, as it will be required to construct the object itself.

By the time we've finished reading an object we should be very confident that it's sane and usable; that way we massively reduce the risk of downstream code hitting unexpected cases, though it can/should also check. E.g., look at the 21-million-limit that was added to the BOLTs recently - this explicitly allows us to treat numbers greater than 21 million BTC as a hard deserialization failure, and generally we should.

In any case, the "limitations" I was referring to there is that you made reference to different error types, which I read as "I want to have more specific error types for users to see more of what's going on", which I found to be reasonable, but isn't something that can currently be captured in the serialization framework - you generally just get an InvalidValue.
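The 21-million-BTC example above amounts to rejecting out-of-range values at read time rather than carrying them into the object. A small sketch, where the constant is the real protocol bound (21 million BTC in millisatoshis) but the read function and error name are illustrative:

```rust
// Hard deserialization failure for amounts above the protocol's maximum.
const MAX_VALUE_MSAT: u64 = 21_000_000 * 100_000_000 * 1_000;

#[derive(Debug, PartialEq)]
enum DecodeError { InvalidValue }

fn read_amount_msat(raw: u64) -> Result<u64, DecodeError> {
    if raw > MAX_VALUE_MSAT {
        // Reject at read time: downstream code never sees an invalid amount.
        return Err(DecodeError::InvalidValue);
    }
    Ok(raw)
}

fn main() {
    assert!(read_amount_msat(1_000).is_ok());
    assert_eq!(read_amount_msat(u64::MAX), Err(DecodeError::InvalidValue));
}
```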

#[derive(Debug)]
struct BlindedPath {
blinding: PublicKey,
path: WithLength<Vec<OnionMessagePath>, u8>,
Contributor

IIUC, I'd prefer to have custom serialization methods and avoid WithLength leaking into the struct itself. However, I guess this PR should soon be based on 1503 (+ maybe the serialization commit I have locally) and re-use the blinded paths code from there (possibly moving it to its own module)?

Contributor Author

Yeah, I figured there may be some sort of consolidation here such that the interface is shared.


///
#[derive(Clone, Debug)]
pub struct Offer {
Contributor

Do you plan to move code out of mod?

Contributor Author

Yeah, I think so. Otherwise, mod.rs will get pretty big. I'll move it after squashing the fixups.

/// An `offer` TLV stream without any semantic checks, apart from any checks performed when parsing
/// the underlying types.
#[derive(Debug)]
struct OfferTlvStream {
Contributor

On initial pass, it does seem like it would be sufficient to separate the semantic checks into a separate method rather than a whole separate struct and allows us to get rid of the various conversion methods and other code?

Contributor Author

It may be possible to consolidate some of the conversions now that the code has evolved a bit. Let me play with that a bit. I kinda like having OfferTlvStream though as it removes duplication of type numbers across the read and write functions and enforces consistency with how integers and vectors are read/written.


///
#[derive(Clone, Debug)]
pub struct Offer {
Contributor

Do we plan to add a new method for this struct in this PR?

Contributor Author

Not sure yet. Originally, the idea was to add a Builder in a follow-up, but maybe that's not necessary -- though the syntax for it would be nice -- if any construction is semantically valid. That seems to be the case now outside of the signature, which itself would need to change as fields are updated. Maybe that's a good argument for keeping the signature separate and parsing as (Offer, Signature). Thoughts?

Contributor

Ah yeah, we might want mutators. I like parsing as (Offer, Signature) and having an Offer.compute_signature() method, but not an actual struct field. Can discuss in the dev sync. May be useful to start a follow-up issue.

Contributor Author

Although note that this may require computing the offer id on the fly... hrmm

Contributor

Hmm, that's a good point about offer_id. Maybe it would be best to make Offers immutable and have a modifiable OfferConfig that can be used for easy mutation.

Contributor Author

Ah yeah, we might want mutators. I like parsing as (Offer, Signature) and having an Offer.compute_signature() method, but not an actual struct field. Can discuss in the dev sync. May be useful to start a follow-up issue.

Assuming you have the private key, which wouldn't be the case for a parsed offer. So, FWIW, you'd have to hang on to the signature if you want to re-serialize which is kinda yuck.

Contributor

Talked to Matt, I think my vote would be to go with the lightning-invoice builder pattern of constructing an UnsignedOffer then signing it off at the end to create an immutable Offer. Don't feel very strongly though if you have a different idea.
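The lightning-invoice-style flow suggested here can be sketched as a two-stage type: build an UnsignedOffer, then sign it to obtain an immutable Offer. The signing closure below stands in for real Schnorr signing over the TLV merkle root; all names are hypothetical.

```rust
// Illustrative builder-then-sign pattern, not LDK's actual API.

struct UnsignedOffer { description: String }

struct Offer { description: String, signature: [u8; 64] }

impl UnsignedOffer {
    fn new(description: &str) -> Self {
        UnsignedOffer { description: description.to_string() }
    }

    // Consumes self: once signed, the Offer cannot be mutated without
    // invalidating the signature, which the type system now prevents.
    fn sign<F>(self, sign_fn: F) -> Offer
    where F: FnOnce(&[u8]) -> [u8; 64] {
        let signature = sign_fn(self.description.as_bytes());
        Offer { description: self.description, signature }
    }
}

impl Offer {
    fn description(&self) -> &str { &self.description }
    fn signature(&self) -> &[u8; 64] { &self.signature }
}

fn main() {
    let offer = UnsignedOffer::new("coffee").sign(|_msg| [0u8; 64]);
    assert_eq!(offer.description(), "coffee");
}
```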

Contributor Author

Hmmm... when parsing, how would we know which type to use?

Contributor Author

Or are you saying to parse as (UnsignedOffer, Option<Signature>)?

Contributor

Ah right. I think that points toward having signatures be optional in Offer. Talked to Matt, seems it's acceptable for offer_string.parse::&lt;Offer&gt;() to return a read-only Offer with an optional signature. So we'll be giving up the mutability in favor of having an always-valid Offer.

),
};

let (paths, node_id) = match self.destination.clone() {
Contributor

@valentinewallace valentinewallace Aug 1, 2022

@TheBlueMatt may be put off by Vec clones like these so we can do our best for devices that can't take too many memory allocations. In 1503 I found that disallowing extra vec allocations could sometimes change the code materially, so may be worth hashing out if any vecs need to go early on.

Contributor Author

That was an advantage of (previously) having Offer wrap OfferTlvStream (no conversion needed), whereas the current approach trades that for having Offer always being in a correct state. I think I'm convinced that the non-wrapping version is better. But it would be nice to get rid of this drawback while still having the advantages mentioned in https://github.com/lightningdevkit/rust-lightning/pull/1597/files#r934961472.

Let's consider the use cases:

  1. User scans QR code which gets parsed as an Offer via OfferTlvStream (no duplicate allocs)
  2. User constructs an Offer programmatically to serialize as a string for a QR code (duplicate allocs)

The allocs could be removed if instead of using Display (which takes &self), we have a From conversion (which takes self) to OfferTlvStream or some other user-exposable type like OfferString or something. Though they may need to store the Offer as well (?), meaning they would need to copy it prior to serialization. Maybe serialization for QR code and in-memory storage would ultimately be different processes? We may need to put some more thought into how this may relate to offer_ids and to responding to invoice_requests. Let me re-read that part of the spec, but do let me know if you have any thoughts on this already.
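The Display-versus-From tradeoff described above can be made concrete: Display borrows (&self), so owned fields must be cloned into the serialization form, while a From conversion consumes the Offer and can move its fields with no extra allocation. The types below are illustrative stand-ins.

```rust
use std::fmt;

struct Offer { description: String }

struct OfferTlvStream { description: String }

// Consuming path: fields are moved, not cloned.
impl From<Offer> for OfferTlvStream {
    fn from(offer: Offer) -> Self {
        OfferTlvStream { description: offer.description }
    }
}

// Borrowing path: a real impl would have to clone owned fields into a
// TLV stream before encoding, since it only has &self.
impl fmt::Display for Offer {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.description)
    }
}

fn main() {
    let offer = Offer { description: "coffee".into() };
    assert_eq!(offer.to_string(), "coffee");
    let tlv: OfferTlvStream = offer.into();
    assert_eq!(tlv.description, "coffee");
}
```

The catch noted above remains: if the caller also wants to keep the Offer in memory, the consuming conversion forces a clone anyway.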

Contributor

Hmm, to me it seems like it'd be nice to be able to keep it in-memory and be able to still serialize it (so kinda not a huge fan of From).

We may need to put some more thought into how this may relate to offer_ids and to responding to invoice_requests.

Not sure I'm getting how this could relate to offer_ids, is there some way to clarify?

For invoice_requests, I think it could make sense to have something like Offer.get_invoice_request(amount, quantity, etc) -> Result<InvoiceRequest, Error>. I was originally thinking this would live in OnionMessenger but it probably makes more sense in this module?

Contributor Author

@jkczyz jkczyz Aug 3, 2022

Pushed a couple commits at the end for signature verification and (related) offer_id computation. Another point to consider is that -- while the offer_id is computed as part of signature verification and can be stored in Offer when parsed -- for programmatic constructions, the offer_id still needs to be computed by converting to the serialized TLV encoding either on-demand or at construction time. So we're going to get hit by allocations there it would seem regardless. Probably could reduce the number of allocations in merkle root computation, though.

Contributor Author

Not sure I'm getting how this could relate to offer_ids, is there some way to clarify?

Mostly pointing out when responding to an invoice_request, we need to know if we actually know about an Offer with the corresponding offer_id from the request. So we'll need some way of determining this and responding with an appropriate invoice. I guess only tangentially related to allocs.

For invoice_requests, I think it could make sense to have something like Offer.get_invoice_request(amount, quantity, etc) -> Result<InvoiceRequest, Error>. I was originally thinking this would live in OnionMessenger but it probably makes more sense in this module?

Yeah, that was going to be the next step here.

Contributor

Pushed a couple commits at the end for signature verification and (related) offer_id computation. Another point to consider is that -- while the offer_id is computed as part of signature verification and can be stored in Offer when parsed -- for programmatic constructions, the offer_id still needs to be computed by converting to the serialized TLV encoding either on-demand or at construction time. So we're going to get hit by allocations there it would seem regardless. Probably could reduce the number of allocations in merkle root computation, though.

Would have to check it out more closely, mostly just wanted to note this as something to consider.

Mostly pointing out when responding to an invoice_request, we need to know if we actually know about an Offer with the corresponding offer_id from the request. So we'll need some way of determining this and responding with an appropriate invoice. I guess only tangentially related to allocs.

I think we'll need to add something similar to the inbound_payment module but for inbound invoices. Then we can do Offer::verify("payment_secret", &invoice, &inbound_invoice_secret). To be clear, the "payment_secret" comes in the encrypted_data we encoded for ourselves when we constructed the blinded route, so not 100% on where verify lives.

#[derive(Clone, Debug)]
pub struct Offer {
id: sha256::Hash,
chains: Option<Vec<BlockHash>>,
Collaborator

IIRC rusty had said he wanted to drop the vec and just have one chain.

Contributor Author

I do recall that but it seems supporting both the main chain and any side chains with the same offer is a use case he wants to support as per: lightning/bolts#798 (comment)

Collaborator

Responded there.

},
/// An amount of currency specified using ISO 4217.
Currency {
/// An ISO 4217 three-letter currency code (e.g., USD).
Collaborator

If it's three letters, why is it on the heap? :p

Contributor Author

Note that it's already been allocated when the TLV stream was parsed. Extracting characters from strings in Rust without an intermediary Vec is kinda gross, too. Either:

let mut chars = iso4217_code.chars();
let iso4217_code = [chars.next().unwrap(), chars.next().unwrap(), chars.next().unwrap()];

Or in 1.55:

let mut chars = iso4217_code.chars();
[(); ISO4217_CODE_LEN].map(|_| chars.next().unwrap())

Happy to change it if you feel strongly. Though note that crates like iso_4217 work in terms of str if we ever need to interface with it.

Collaborator

Can't we store it as a [u8; 3] and use https://doc.rust-lang.org/std/str/fn.from_utf8.html in the getter? Is the string not guaranteed to be ASCII?

Contributor Author

Yeah, good point. Though it's inside of an enum variant. Not too important at the moment, but changed to [u8; 3] anyway.
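For reference, a minimal sketch of the resolution (helper names hypothetical, not the PR's actual API): validate the code into a `[u8; 3]` once, then expose it as a `&str` via `core::str::from_utf8` with no heap allocation in the getter.

```rust
// Hypothetical sketch: validate on construction, then hand out a borrowed
// &str from the fixed-size array.
fn parse_iso4217_code(s: &str) -> Option<[u8; 3]> {
    let bytes = s.as_bytes();
    if bytes.len() == 3 && bytes.iter().all(|b| b.is_ascii_uppercase()) {
        Some([bytes[0], bytes[1], bytes[2]])
    } else {
        None
    }
}

fn iso4217_str(code: &[u8; 3]) -> &str {
    // Cannot fail: construction guarantees the bytes are ASCII.
    core::str::from_utf8(code).unwrap()
}

fn main() {
    let code = parse_iso4217_code("USD").unwrap();
    assert_eq!(iso4217_str(&code), "USD");
    assert!(parse_iso4217_code("usd").is_none()); // assumes uppercase-only codes
    assert!(parse_iso4217_code("USDT").is_none());
}
```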


/// An offer parsed from a bech32-encoded string as a TLV stream and the corresponding bytes. The
/// latter is used for signature verification.
struct ParsedOffer(OfferTlvStream, Vec<u8>);
Collaborator

I'm really confused here now - so when we're parsing we first read some TLV fields from bytes, then we copy the bytes into this, with the TLV fields moved into it, then we do some analysis of the TLV fields and copy them into a higher-level struct? I'd prefer we just copy the 10 lines of TLV-read code into the top of the Offer read method and call it a day, all the copying around of things just seems to add more code and boilerplate and I'm still not sure how it makes anything more readable. Can you help me understand why all the boilerplate here is a simpler design?

Contributor Author

I'm really confused here now - so when we're parsing we first read some TLV fields from bytes, then we copy the bytes into this, with the TLV fields moved into it,

The bytes are not copied -- they are moved just as the TLV fields are.

then we do some analysis of the TLV fields and copy them into a higher-level struct?

I believe they are moved into the higher level struct since OfferTlvStream is a value type -- I'd get compiler errors if I tried to reuse any part of the destructured OfferTlvStream after it had been moved.

I'd prefer we just copy the 10 lines of TLV-read code into the top of the Offer read method and call it a day, all the copying around of things just seems to add more code and boilerplate and I'm still not sure how it makes anything more readable. Can you help me understand why all the boilerplate here is a simpler design?

I'm not sure how that is materially any different. We'd still need to declare variables prior to those lines, at least some with explicit types. And then repeat the types in the write code, ensuring the vector and integer types match and are serialized correctly rather than relying on the compiler to enforce it.

Additionally, that would require changing the return type to ParseError (in order to wrap DecodeError) even though all the other errors are SemanticError. And then all the error returns would need to be wrapped or littered with into(). It would also be possible to return ParseError variants which shouldn't be possible at that level.

I pushed a version to show what this looks like. I personally find it much less readable in addition to the other problems that it introduces as mentioned above. Moreover, the OfferTlvStream version is much more compact and self-describing, IMHO. I don't see how it introduces boilerplate and more code compared to the alternative.

Contributor Author

Discussed this offline yesterday. I pushed a new HEAD commit after reverting the non-OfferTlvStream version. The new version avoids any clones when writing. It also hides the WithoutLength and HighZeroBytesDroppedVarInt types from the OfferTlvStream struct definition and consolidates the field and TLV type definitions.

The conversions still need to be aware of the wrapper types, though. Maybe there's a neat way to avoid that with type coercions?

Contributor Author

Added some From implementations to hide all the WithoutLength and HighZeroBytesDroppedVarInt wrappers.
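As a rough illustration of that last point (the `WithoutLength` newtype below is a stand-in, not the PR's actual definition in `util::ser`), the `From` impls let conversion code move values in and out of the wrapper without ever naming it:

```rust
// Stand-in for the serialization newtype described above.
pub struct WithoutLength<T>(pub T);

// With these impls, wrapping and unwrapping are implicit at conversion
// boundaries, so the TLV-stream conversion code never mentions the wrapper.
impl From<String> for WithoutLength<String> {
    fn from(s: String) -> Self { WithoutLength(s) }
}

impl From<WithoutLength<String>> for String {
    fn from(w: WithoutLength<String>) -> Self { w.0 }
}

fn wrap(description: String) -> WithoutLength<String> {
    description.into() // no WithoutLength(...) at the call site
}

fn main() {
    let wrapped = wrap(String::from("coffee"));
    let unwrapped: String = wrapped.into();
    assert_eq!(unwrapped, "coffee");
}
```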

#[derive(Clone, Debug)]
pub enum Destination {
///
NodeId(XOnlyPublicKey),
Collaborator

IIUC this is going away, maybe we need to sync with rusty before we move forward too much here.

}
}

impl_feature_tlv_value_write!(ChannelTypeFeatures);
Contributor

s/_value//, otherwise reads as "type length value value"


///
#[derive(Clone, Debug)]
pub struct Offer {
Contributor

Ah yeah we might want mutators. I like parsing as (Offer, Signature) and having a Offer.compute_signature() method, but not an actual struct field. Can discuss in the dev sync. May be useful to start a follow-up issue.

),
};

let (paths, node_id) = match self.destination.clone() {
Contributor

Hmm, to me it seems like it'd be nice to be able to keep it in-memory and be able to still serialize it (so kinda not a huge fan of From).

We may need to put some more thought into how this may relate to offer_ids and to responding to invoice_requests.

Not sure I'm getting how this could relate to offer_ids, is there some way to clarify?

For invoice_requests, I think it could make sense to have something like Offer.get_invoice_request(amount, quantity, etc) -> Result<InvoiceRequest, Error>. I was originally thinking this would live in OnionMessenger but it probably makes more sense in this module?

@TheBlueMatt
Collaborator

Had some offline conversation about this one and I'm happy with where we ended up here. I'm not sure there's a lot more to comment on specific about the PR, though there's some upstream stuff to work out with Rusty on formatting and structure. Let me know if you want more review on this PR specifically until we resolve those.

@jkczyz jkczyz force-pushed the 2022-06-offers branch 2 times, most recently from 53024ae to 21c51c5 Compare August 15, 2022 03:49

pub(super) fn tagged_hash<T: AsRef<[u8]>>(tag: sha256::Hash, msg: T) -> sha256::Hash {
let mut engine = sha256::Hash::engine();
engine.input(tag.as_ref());
Collaborator

Rather than re-hashing the tag twice for each leaf, we should just have a SHA256 engine in root_hash for each of the tags, clone and then write the extra msg bytes and finish.
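The pattern being suggested is the usual BIP 340 tagged-hash midstate trick. A sketch of the shape, using std's `DefaultHasher` as a stand-in for `sha256::HashEngine` (both are cheap to `Clone`; the real code would of course use SHA-256):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Build the per-tag engine once: the tag is fed twice (BIP 340 style)
// before any message bytes.
fn tagged_engine(tag: &[u8]) -> DefaultHasher {
    let mut engine = DefaultHasher::new();
    engine.write(tag);
    engine.write(tag);
    engine
}

// Per leaf: clone the partially-fed engine instead of re-hashing the tag.
fn tagged_hash(base: &DefaultHasher, msg: &[u8]) -> u64 {
    let mut engine = base.clone();
    engine.write(msg);
    engine.finish()
}

fn main() {
    let base = tagged_engine(b"LnLeaf");
    let a = tagged_hash(&base, b"leaf-1");
    let b = tagged_hash(&base, b"leaf-1");
    let c = tagged_hash(&base, b"leaf-2");
    assert_eq!(a, b); // cloning preserves the tag prefix
    assert_ne!(a, c);
}
```

With many leaves sharing a handful of tags, this turns two tag hashes per leaf into two per tag.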

@jkczyz jkczyz force-pushed the 2022-06-offers branch 3 times, most recently from 0ca166e to 75b8574 Compare August 19, 2022 03:05
@jkczyz
Contributor Author

jkczyz commented Aug 23, 2022

@TheBlueMatt @valentinewallace The latest few commits implement invoice_request parsing with the offer reflected inside. Would be nice to have some early feedback on the approach. Note that each message has a bytes field because we need to reflect unknown fields when responding to the message.

@valentinewallace valentinewallace self-requested a review August 24, 2022 00:22

///
pub struct InvoiceRequest {
bytes: Vec<u8>,
Collaborator

Wait, do we need to store the original bytes in the InvoiceRequest as well? I thought that was only going to be required in the offer to echo it back in the InvoiceRequest?

Contributor Author

When responding to an invoice_request with an invoice, fields from both the offer and the invoice_request must be reflected.

Contributor Author

We could avoid duplicating the field by always storing the bytes in Offer instead of using an empty Vec there. I wasn't too thrilled about having it empty depending on which is the outermost message. Likewise, for when parsing an Invoice. And then implement AsRef<[u8]> on all messages to pull it from the Offer.
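A sketch of that `AsRef<[u8]>` idea (struct shapes simplified and hypothetical): the bytes live only in `Offer`, and enclosing messages delegate.

```rust
// Simplified stand-ins for the message types discussed above.
struct Offer { bytes: Vec<u8> }
struct InvoiceRequest { offer: Offer }

impl AsRef<[u8]> for Offer {
    fn as_ref(&self) -> &[u8] { &self.bytes }
}

// The enclosing message exposes the same bytes without storing a second copy.
impl AsRef<[u8]> for InvoiceRequest {
    fn as_ref(&self) -> &[u8] { self.offer.as_ref() }
}

fn main() {
    let request = InvoiceRequest { offer: Offer { bytes: vec![1, 2, 3] } };
    assert_eq!(request.as_ref(), &[1, 2, 3]);
}
```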

(89, payer_note: String),
});

impl Bech32Encode for InvoiceRequest {
Collaborator

Wait, why would one want to bech32-encode an InvoiceRequest? Isn't it basically only ever an onion message?

Contributor Author

The most recent proposal to the draft is to drop send_invoice from offer and instead bech32 encode an invoice_request. See rustyrussell/bolts#11.

Collaborator

Ah, so we won't really have an offer at all? Its just an invoice request with some reduced fields?

@jkczyz jkczyz force-pushed the 2022-06-offers branch 2 times, most recently from f4e17b4 to 1b2e9be Compare September 1, 2022 03:52
@jkczyz
Contributor Author

jkczyz commented Sep 1, 2022

Ok, so most of the recent changes have been to keep up with rustyrussell/bolts#11 and rustyrussell/bolts#7. Probably best to look at all files at once.

In particular, changes include reflecting fields properly when building an InvoiceRequest from an Offer that may have unknown fields (using the parsed bytes). Would be interested in feedback specifically around that, as it may get even more complex when building an Invoice from an InvoiceRequest with unknown fields, since we'd also need to strip off the signature bytes.

@jkczyz
Contributor Author

jkczyz commented Sep 1, 2022

Also, note that this doesn't yet have a builder for the "send_invoice" InvoiceRequest case (i.e., refunds).

// unknown TLV records, which are not stored in `OfferContents`.
let (payer_tlv_stream, _offer_tlv_stream, invoice_request_tlv_stream, _) =
self.invoice_request.as_tlv_stream();
let offer_bytes = WithoutLength(&self.offer.bytes);
Contributor Author
@jkczyz jkczyz Sep 1, 2022

Note, this may contain custom records after the signature range. So we actually need to break this into two sequences of bytes. When building an invoice from an invoice_request, we'll need three sequences: 0-160 (offer and invoice_request), 240-1000 (signatures to strip), and 1000+ (custom records).

Contributor

and 1000+ (custom records).

Hm I can't find this in the spec, am I missing it? Commented on the PR: rustyrussell/bolts#7 (comment)

Contributor Author
@jkczyz jkczyz Sep 2, 2022

Ah, right. As written it does not support it. I suppose unused types in the 0-240 range could be custom, but presumably we may want a dedicated range.
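The byte-splitting discussed above amounts to walking the raw TLV stream and keeping only records whose type falls in a given range. A hypothetical sketch (assuming single-byte BigSize types and lengths, i.e. values below 253, for brevity; a real parser must handle multi-byte BigSize and error on truncation):

```rust
// Keep only the TLV records whose type is in `range`, preserving their
// exact serialized bytes so unknown records survive round-tripping.
fn filter_tlv_records(stream: &[u8], range: std::ops::Range<u64>) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i + 2 <= stream.len() {
        let record_type = stream[i] as u64; // assumes BigSize < 253
        let len = stream[i + 1] as usize;   // assumes BigSize < 253
        let end = i + 2 + len;
        if end > stream.len() { break; } // malformed; a real parser would error
        if range.contains(&record_type) {
            out.extend_from_slice(&stream[i..end]);
        }
        i = end;
    }
    out
}

fn main() {
    // A type-10 record ("offer" range) followed by a type-240 (signature) record.
    let stream = [10u8, 2, 0xaa, 0xbb, 240, 1, 0xcc];
    assert_eq!(filter_tlv_records(&stream, 0..160), vec![10, 2, 0xaa, 0xbb]);
    assert_eq!(filter_tlv_records(&stream, 240..1000), vec![240, 1, 0xcc]);
}
```

Stripping signatures before re-serializing is then just keeping everything outside the 240-1000 range.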

@jkczyz jkczyz force-pushed the 2022-06-offers branch 2 times, most recently from 48a1a11 to 8d8f554 Compare September 12, 2022 18:27
@codecov-commenter

codecov-commenter commented Sep 23, 2022

Codecov Report

Base: 90.80% // Head: 90.48% // Decreases project coverage by -0.32% ⚠️

Coverage data is based on head (5200268) compared to base (f1428fd).
Patch coverage: 70.93% of modified lines in pull request are covered.

❗ Current head 5200268 differs from pull request most recent head cd8cafb. Consider uploading reports for the commit cd8cafb to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1597      +/-   ##
==========================================
- Coverage   90.80%   90.48%   -0.33%     
==========================================
  Files          89       94       +5     
  Lines       47963    50227    +2264     
  Branches    47963    50227    +2264     
==========================================
+ Hits        43554    45448    +1894     
- Misses       4409     4779     +370     
Impacted Files Coverage Δ
lightning/src/lib.rs 100.00% <ø> (ø)
lightning/src/ln/channelmanager.rs 85.16% <ø> (-0.24%) ⬇️
lightning/src/offers/invoice.rs 0.00% <0.00%> (ø)
lightning/src/onion_message/packet.rs 76.03% <ø> (ø)
lightning/src/offers/payer.rs 33.33% <33.33%> (ø)
lightning/src/offers/invoice_request.rs 36.07% <36.07%> (ø)
lightning/src/util/ser.rs 89.92% <73.58%> (-1.89%) ⬇️
lightning/src/util/ser_macros.rs 87.66% <77.14%> (-0.74%) ⬇️
lightning/src/offers/parse.rs 87.87% <87.87%> (ø)
lightning/src/offers/offer.rs 88.79% <88.79%> (-5.77%) ⬇️
... and 36 more


@jkczyz jkczyz force-pushed the 2022-06-offers branch 3 times, most recently from 9cb50ae to 5200268 Compare November 1, 2022 19:10
@jkczyz jkczyz force-pushed the 2022-06-offers branch 2 times, most recently from 20b2705 to 53f9e89 Compare November 8, 2022 19:48
Add common bech32 parsing for BOLT 12 messages. The encoding is similar
to bech32, only without a checksum and with support for continuing
messages across multiple parts.

Messages implementing Bech32Encode are parsed into a TLV stream, which
is converted to the desired message content while performing semantic
checks. Checking after conversion allows for more elaborate checks of
data composed of multiple TLV records and for more meaningful error
messages.

The parsed bytes are also saved to allow creating messages with mirrored
data, even if TLV records are unknown.
Test semantic errors when parsing offer bytes.
BOLT 12 messages are limited to a range of TLV record types. Refactor
decode_tlv_stream into a decode_tlv_stream_range macro for limiting
which types are parsed. Requires a SeekReadable trait for rewinding when
a type outside of the range is seen. This allows for composing TLV
streams of different ranges.

Updates offer parsing accordingly and adds a test demonstrating failure
if a type outside of the range is included.
Define an interface for BOLT 12 `invoice_request` messages. The
underlying format consists of the original bytes and the parsed
contents.

The bytes are later needed when constructing an `invoice` message. This
is because it must mirror all the `offer` and `invoice_request` TLV
records, including unknown ones, which aren't represented in the
contents.

The contents will be used in `invoice` messages to avoid duplication.
Some fields while required in a typical user-pays-merchant flow may not
be necessary in the merchant-pays-user flow (e.g., refund, ATM).
BOLT 12 uses Schnorr signatures for signing offer messages, which need
to be serialized.
Offers uses a merkle root hash construction for signature calculation
and verification. Add a submodule implementing this so that it can be
used when parsing and signing invoice_request and invoice messages.
When responding to an offer, an `invoice_request` message is sent over the
wire. Implement Writeable for encoding the message and TryFrom for
decoding it, defining both in terms of TLV streams. These streams represent
content for the payer metadata (0), reflected `offer` (1-79),
`invoice_request` (80-159), and signature (240).
Implement Bech32Encode for InvoiceRequest, which supports creating and
parsing QR codes for the merchant-pays-user (e.g., refund, ATM) flow.
@TheBlueMatt
Collaborator

I assume this should be closed now?

@jkczyz
Contributor Author

jkczyz commented Jul 10, 2023

Yeah, everything made it in other PRs. Closing.

@jkczyz jkczyz closed this Jul 10, 2023