canvas_engineering.looped_block¶
Looped attention wrappers.
canvas_engineering.looped_block.LoopedBlockWrapper
¶
Bases: Module
Wrap a transformer block for looped execution.
The original block is called up to `max_loops` times. At iteration `l`:

`h = original(h + loop_emb[l], ...) * gate[l] + h * (1 - gate[l])`
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `original` | `Module` | The transformer block to wrap (kept frozen or trainable). | *required* |
| `block_idx` | `int` | Index of this block in the model (for logging). | `0` |
| `max_loops` | `int` | Maximum number of loop iterations. | `4` |
| `embed_dim` | `int` | Dimension of the loop embeddings (usually `d_model` or `inner_dim`). | `256` |
| `gate_init_bias` | `float` | Initial bias for the sigmoid gate (`0.0` gives 50/50 blending). | `0.0` |
| `use_gradient_checkpointing` | `bool` | Recompute activations on backward to save memory. | `True` |
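The gated loop update above can be sketched as follows. This is a minimal illustration, not the actual implementation: the attribute names (`loop_emb`, `gate_logits`, `current_loops`) are assumptions inferred from the documented formula, and gradient checkpointing is omitted for brevity.

```python
import torch
import torch.nn as nn


class LoopedBlockWrapperSketch(nn.Module):
    """Sketch of a looped-execution wrapper around a transformer block."""

    def __init__(self, original: nn.Module, max_loops: int = 4,
                 embed_dim: int = 256, gate_init_bias: float = 0.0):
        super().__init__()
        self.original = original
        self.max_loops = max_loops
        self.current_loops = max_loops
        # One additive iteration embedding per loop, broadcast over batch/sequence.
        self.loop_emb = nn.Parameter(torch.zeros(max_loops, embed_dim))
        # One scalar gate logit per loop; sigmoid(0.0) = 0.5, i.e. 50/50 blending.
        self.gate_logits = nn.Parameter(torch.full((max_loops,), gate_init_bias))

    def forward(self, hidden_states, *args, **kwargs):
        h = hidden_states
        for l in range(self.current_loops):
            gate = torch.sigmoid(self.gate_logits[l])
            # h = original(h + loop_emb[l], ...) * gate[l] + h * (1 - gate[l])
            out = self.original(h + self.loop_emb[l], *args, **kwargs)
            h = out * gate + h * (1 - gate)
        return h
```

With `gate_init_bias = 0.0` each iteration starts as an even blend of the block's output and the residual input, so looping is close to a no-op at initialization and the gates can learn how much each extra iteration should contribute.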
forward(hidden_states, *args, **kwargs)
¶
Run the original block `current_loops` times with learned iteration signals.
The first positional argument is assumed to be the hidden states tensor. All other args/kwargs are passed through to the original block unchanged.
set_loops(n)
¶
Set the current number of loop iterations (for curriculum).
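A loop-count curriculum could drive `set_loops` with a schedule like the sketch below. The schedule itself (step-based warmup) and its parameter names are assumptions for illustration, not part of this API.

```python
def loop_schedule(step: int, warmup_steps: int = 1000, max_loops: int = 4) -> int:
    """Start at 1 iteration, add one every `warmup_steps` steps, cap at `max_loops`."""
    return min(1 + step // warmup_steps, max_loops)


# During training, something like:
#   wrapper.set_loops(loop_schedule(global_step))
```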
trainable_params()
¶
Count only the loop-specific trainable parameters.
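Counting only the loop-specific parameters could be done along these lines. The attribute names `loop_emb` and `gate_logits` are assumptions; the point is simply to exclude the wrapped block's own parameters from the count.

```python
import torch
import torch.nn as nn


def count_loop_params(wrapper: nn.Module) -> int:
    """Count trainable parameters added by the wrapper, excluding `original`."""
    loop_names = ("loop_emb", "gate_logits")  # assumed attribute names
    return sum(
        p.numel()
        for name, p in wrapper.named_parameters()
        if p.requires_grad and any(name.startswith(k) for k in loop_names)
    )
```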