canvas_engineering.looped_block

Looped attention wrappers.

canvas_engineering.looped_block.LoopedBlockWrapper

Bases: Module

Wrap a transformer block for looped execution.

The original block is called L times. At each iteration:

    h = original(h + loop_emb[l], ...) * gate[l] + h * (1 - gate[l])

where loop_emb[l] is a learned per-iteration embedding and gate[l] is a per-iteration sigmoid gate.
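The blend above is a plain convex combination of the block's output and the incoming hidden state. A toy NumPy illustration (the function name and values here are made up for the example):

```python
import numpy as np

def gated_update(h, block_out, gate):
    # gate -> 1 trusts the block's output; gate -> 0 keeps the input unchanged.
    return block_out * gate + h * (1.0 - gate)

h = np.array([1.0, 2.0])
block_out = np.array([3.0, 4.0])

# At the default gate_init_bias of 0.0, sigmoid(0) = 0.5: an even blend.
print(gated_update(h, block_out, 0.5))  # [2. 3.]
```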

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| original | Module | The transformer block to wrap (kept frozen or trainable). | required |
| block_idx | int | Index of this block in the model (used for logging). | 0 |
| max_loops | int | Maximum number of loop iterations. | 4 |
| embed_dim | int | Dimension of the loop embeddings (usually d_model or inner_dim). | 256 |
| gate_init_bias | float | Initial bias for the sigmoid gate (0.0 gives a 50/50 blend). | 0.0 |
| use_gradient_checkpointing | bool | Recompute activations on the backward pass to save memory. | True |

forward(hidden_states, *args, **kwargs)

Run the original block current_loops times with learned iteration signals.

The first positional argument is assumed to be the hidden states tensor. All other args/kwargs are passed through to the original block unchanged.
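A minimal functional sketch of this loop, assuming per-iteration embeddings and gate logits as described above (this is an illustration of the documented behavior, not the actual implementation):

```python
import torch

def looped_forward(original, hidden_states, loop_emb, gate_logits,
                   current_loops, *args, **kwargs):
    # hidden_states is the first positional argument; all other args/kwargs
    # are forwarded to the wrapped block unchanged.
    h = hidden_states
    for l in range(current_loops):
        gate = torch.sigmoid(gate_logits[l])              # scalar gate per loop
        out = original(h + loop_emb[l], *args, **kwargs)
        h = out * gate + h * (1.0 - gate)                 # gated residual blend
    return h

# Tiny check with a stand-in "block" that doubles its input.
double = lambda x: x * 2
h0 = torch.tensor([1.0])
emb = torch.zeros(2, 1)
gates = torch.zeros(2)                                    # sigmoid(0) = 0.5
print(looped_forward(double, h0, emb, gates, current_loops=2))  # tensor([2.2500])
```

With zero gate logits each iteration averages the doubled output with the current state, so two loops take 1.0 to 1.5 and then to 2.25.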

set_loops(n)

Set the current number of loop iterations (for curriculum).

trainable_params()

Count only the loop-specific trainable parameters.
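To illustrate how `set_loops` and `trainable_params` fit together, here is a toy stand-in that mirrors the documented interface; the attribute names (`loop_emb`, `gate_logits`, `current_loops`) and the clamping in `set_loops` are assumptions for the sketch, not the real class:

```python
import torch
from torch import nn

class MiniLoopedWrapper(nn.Module):
    # Toy stand-in mirroring the documented interface (not the real class).
    def __init__(self, original, max_loops=4, embed_dim=256):
        super().__init__()
        self.original = original
        self.max_loops = max_loops
        self.current_loops = max_loops
        # Loop-specific parameters: per-iteration embeddings and gate logits.
        self.loop_emb = nn.Parameter(torch.zeros(max_loops, embed_dim))
        self.gate_logits = nn.Parameter(torch.zeros(max_loops))

    def set_loops(self, n):
        # Curriculum hook: raise the loop count as training progresses.
        self.current_loops = min(n, self.max_loops)

    def trainable_params(self):
        # Count only the loop-specific parameters, not the wrapped block's.
        return self.loop_emb.numel() + self.gate_logits.numel()

wrapper = MiniLoopedWrapper(nn.Linear(256, 256), max_loops=4, embed_dim=256)
wrapper.set_loops(2)
print(wrapper.current_loops)       # 2
print(wrapper.trainable_params())  # 4 * 256 + 4 = 1028
```

A curriculum would call `set_loops` at chosen training steps, e.g. starting at 1 and ramping toward `max_loops`.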