[tlaplus] Re: Specification as part of an agile software delivery process

On Monday, November 23, 2020 at 1:16:43 PM UTC-5 recepient recepient wrote:

I am a TLA+ novice ... however ... am plowing into TLA full-force on a distributed KV store. It'll eventually include crash recovery, membership, and what not. On the other hand I've done programming for a long time. Putting the two together in a reasonable way is something I'm keeping my eye on too. As to TLA+ appropriateness to a larger workflow

- It makes one get the correctness spec first and showing it works under a wide or widest range of legal behaviors rather than fixing bugs as customers drive them to your front door often the more expensive vector .... We target the model checker at the main risk areas only. With professional programmers I wouldn't tend to lean on TLA for multi-threaded code but for distributed code ... definitely yes.

- Incorporating TLA+ into CD/CI presupposes the ability to abstract up from code to the spec or refine down from the spec to code (a two way refinement) in order to have something to re-test without excessive labor. That's beyond the scope (perhaps main intent) of TLA+. However building in pre/post/static analysis into the code ala Frama-C or similar is indicated for the CD part of CDCI. That's not model checking per se but the things Frama might check for you would fall out of the TLA+ spec, which is a plus e.g. invariants, asserts.

- I initially viewed Verdi and Daphny as doing more than TLA not because TLA can't but more so because Verdi and Daphny wanted that more granular two-way refinement ... but I can only battle one formal language at a time. Verdi is based on COQ, which is another huge learning curve. Verdi doesn't do liveness. Daphny seems much better than Verdi in tool power; Daphny does do liveness but the GIT repo with the Daphny toolbox is a complete mess ... a rat's nest of files. I don't have that kind of time. Verdi is proof-oriented; TLA+ is model checker oriented which is a different approach to the same end. Daphny I believe is model checker oriented too. On the whole then I plan on sticking with TLA+ as the most practical choice unless I have or can be given a reason to look at Daphny later.

- To the extent you can model the serverless model as one basic architecture you can check it once then you're done so to speak unless you revisit the code or the model.

- In summary I intend to use TLA+ as removing project risk (system never becomes prod ready) and institutional risk (company X sucks 'cause their product sucks). For short-medium term risk in CD/CI I hope to bring out key invariants and asserts into code with unit testing and/or state paths from initial to terminal states lifted from TLA's .dot file or something else together with conventional unit and integration testing.

Regards
On Monday, November 23, 2020 at 11:49:09 AM UTC-5 georgio...@xxxxxxxxx wrote:
Hello tlaplus community,

I am Georgios, I am a developer and lead consultant at Thoughtworks.
One of the new trends in the industry is building systems using a "serverless architecture".

This usually means designing systems as small Lambda functions connected with queues and API gateways that your cloud provider takes care of running for you.

The traditional testing pyramid doesn't work very well for those systems. Partially because running many of those components locally is hard or even impossible and also because those systems tend to be highly asynchronous.

I was wondering if TLA+ can have a role to play in increasing our confidence that those systems are working as expected. Is there maybe a competitive value proposition for using TLA+ specifications with serverless. Since, testing is now harder and it is already challenging to test for bugs arising from asynchronicity?

I am curious if anyone has any experience to share with using TLA+ in an agile team. Has anyone incorporated TLA+ and the act of writing specification in their team processes, and CI/CD pipeline?

Additionally, I recently wrote a blog post to demonstrate that specifying serverless systems in PlusCal is relatively easy and will catch nasty bugs.

Happy to hear your thoughts about the post too[1].

[1] https://blog.gchinis.com/posts/serverless-specification-with-tla/

Best regards,
Georgios Chinis