[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [tlaplus] Check a concurrent algoithm using a large number of concurrent threads



Hello Elliot,

TLA+ is most useful for finding bugs in algorithms written at a relatively high level of abstraction. The model checker will typically run out of steam beyond a few threads, but this is usually enough to find concurrency errors, unless you expect that instances involving many more threads show significantly different types of errors. Quite obviously, a model with 4 processes will not give you any information on how scalable your algorithm is. It will however allow to catch possible race conditions between the participating processes, even if these are extreme outliers and do not show up in simulations. You can run TLC in distributed mode on a cluster to push the limit to some extent, but the state space intrinsically grows exponentially with the number of processes. 64 processes will certainly be out of reach.

Any reduction that you can think of will be useful. TLC offers symmetry reduction: for example, if your algorithm is independent of process IDs, it will come handy to declare processes as a symmetry set.

The basic technique that TLC uses for model checking is quite naive, based on an explicit enumeration of reachable states. There are many different model checking techniques that use symbolic representations (e.g., SAT or SMT encodings) and can be much more efficient for certain classes of algorithms. There is also the approach of statistical model checking, based on sampling executions and applying statistical tests in order to gain confidence on the probabilities of errors: these scale much further but by design cannot give you full guarantees. In principle, you could also attempt deductive system verification using theorem proving for establishing correctness for an arbitrary number of threads, but this will be much harder work.

Hope this helps,

Stephan


> On 08 Jul 2015, at 17:24, Elliot <elliot....@xxxxxxxxx> wrote:
> 
> Hello,
> 
> I'm thinking of verifying a complex multi-threaded lockless tree data structure. Reading through the related papers I have found so far gave me the impression that most TLA+/+CAL checking for multi-threaded algorithms use only a small number of processes, like around 3 or 4. We will be using our tree structure with a higher degree of concurrency, like with 64 or even more threads (on real machines with 64 physical cores so the concurrency will be real). Is it possible to use TLA+ for our situation? Is using a gigantic cluster of computers to run lengthy simulations that involve 64 or more processes the only option? Or is there some sort of way to reduce the state space?
> 
> Any advice is welcome. I know there may be some related publications that I miss. So pointing me to the right paper would also be very helpful.
> 
> Thanks!
> 
> --
> Elliot
> 
> -- 
> You received this message because you are subscribed to the Google Groups "tlaplus" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+u...@xxxxxxxxxxxxxxxx.
> To post to this group, send email to tla...@xxxxxxxxxxxxxxxx.
> Visit this group at http://groups.google.com/group/tlaplus.
> For more options, visit https://groups.google.com/d/optout.