We need more than anecdotes to determine what optimizations are worth making.
It is not a question of optimization. I thought of cases where A is not a singleton.
In particular, changes to the costs of various parts of the computation have made expression evaluation an ever smaller part of the execution time of a TLC execution. I believe that the biggest gains per programming hour are now obtainable by optimizing parallel execution and maintenance of the fingerprint graph.
I just wonder how many programmers have computers with parallel microprocessors at their fingertips :-) But I can understand that this aspect
can excite an expert in parallel programming.
--
FL