[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [tlaplus] Send/receive message passing or atomic communication in distributed systems



Jack,

The abstraction level of those specs was influenced both by our intuition and by pragmatic concerns. Overall, I would say that our work was originally driven by a desire to come up with the most abstract model of the underlying algorithm that was also "useful" from an engineering standpoint. In our context, that meant having a model that gave us a good understanding of the abstract behavior of the algorithm, and one for which model checking was mostly feasible for small/medium sized protocol instances, and also efficient enough to provide prompt feedback while iterating on candidate designs.

More concretely, I think our approach was also influenced by the fact that many of the bug scenarios described in the original Raft reconfig algorithm [6] can essentially be represented even in this abstract model. So, we felt it was a good starting place to express our protocol at this abstraction level so that we could clearly understand these types of bugs at a high level, and work towards a design that would avoid similar issues in MongoDB's system/protocol.

As a general design methodology, I think this is a relatively effective/efficient approach, especially when you are using a spec to iterate on the design of a new protocol (as was the case in our situation). It is roughly analogous to a kind of refinement driven approach i.e. designing a protocol at increasingly lower levels of abstraction until you get what you need. In our case, we didn't really do any explicit refinement steps, but I think it was still useful to execute most of our design work at a "well chosen" level of the abstraction hierarchy first, before moving to levels that are more complex and/or may make model checking infeasible.

As an additional side note, formalizing the intuition around how to reduce asynchronous message protocols to equivalent synchronous/sequentialized versions is doable, but in my view is tricky/nontrivial [1,2,3]. The relevant ideas are, arguably, straightforward, and old/well-known [4,5], but I view their formalizations as subtle. I think Giuliano's post [7] presents another good example of this type of reasoning, again based on some amount of intuition to justify a type of reduction/sequentialization.

Will

[1] https://dl.acm.org/doi/10.1145/3385412.3385980
[2] https://www.di.ens.fr/~cezarad/cav19.pdf
[3] https://members.loria.fr/SMerz/papers/rp2009.pdf
[4] https://dl.acm.org/doi/10.1145/361227.361234
[5] https://core.ac.uk/download/pdf/82311765.pdf
[6] https://groups.google.com/g/raft-dev/c/t4xj6dJTP6E/m/d2D9LrWRza8J
[7] https://www.losa.fr/blog/streamlet-in-tla+
[8] https://dl.acm.org/doi/abs/10.1145/3497775.3503688

On Friday, February 23, 2024 at 2:26:40 AM UTC-5 divyanshu ranjan wrote:
Hi Jack,

Lamport has written a paper when such an assumption is valid one in terms of five-six conditions.

[1] https://lamport.azurewebsites.net/pubs/lamport-theorem.pdf

On Fri, Feb 23, 2024 at 12:41 PM Jack Vanlightly <vanli...@xxxxxxxxx> wrote:
Hi all,

I was going over the MongoDB logless reconfiguration spec today (https://github.com/will62794/logless-reconfig) and I saw how much more compact it was compared to my own various specs of Raft-based systems with reconfiguration. The principle reason is that this spec does not model message passing where send and receive are separate actions,  communications between nodes is atomic. I have seen a few other specifications use atomic communication as well.

For my part, I have always modeled send/receive message passing as it seemed possible to miss certain edge cases when communication was atomic. However, message passing does make specifications larger, more complex with a much larger state space so there is a real cost.

I'd love to hear others opinions on whether to model send/receive message passing or use atomic comms in distributed systems. 

Thanks
Jack


--
You received this message because you are subscribed to the Google Groups "tlaplus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+u...@xxxxxxxxxxxxxxxx.
To view this discussion on the web visit https://groups.google.com/d/msgid/tlaplus/6c76fd7c-6355-48ae-9e0a-480ff8d4a797n%40googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "tlaplus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+unsubscribe@xxxxxxxxxxxxxxxx.
To view this discussion on the web visit https://groups.google.com/d/msgid/tlaplus/50068744-a596-4f37-a46c-09fd143c3aebn%40googlegroups.com.