Comments (2)

operator-name 89 days ago
This looks similar to https://github.com/chengzeyi/ParaAttention or https://github.com/ali-vilab/TeaCache.

It’s a shame they don’t compare against or mention them.

transformi 89 days ago
Interesting approach! It reminds me of the early insights that neurons in DNNs capture similar concepts.