canvas_engineering.looped_block¶
Looped attention wrappers.
canvas_engineering.looped_block.LoopedBlockWrapper
¶
Bases: Module
Wrap a transformer block for looped execution.
The original block is called up to `max_loops` times. At iteration `l`:

`h = original(h + loop_emb[l], ...) * gate[l] + h * (1 - gate[l])`
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `original` | `Module` | The transformer block to wrap (kept frozen or trainable). | *required* |
| `block_idx` | `int` | Index of this block in the model (for logging). | `0` |
| `max_loops` | `int` | Maximum number of loop iterations. | `4` |
| `embed_dim` | `int` | Dimension of the loop embeddings (usually `d_model` or `inner_dim`). | `256` |
| `gate_init_bias` | `float` | Initial bias for the sigmoid gate (`0.0` gives 50/50 blending). | `0.0` |
| `use_gradient_checkpointing` | `bool` | Recompute activations on backward to save memory. | `True` |
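The gated loop update above can be sketched as follows. This is a minimal illustration, not the actual implementation: the attribute names (`loop_emb`, `gate_logits`, `current_loops`) are assumptions inferred from the documented formula, and gradient checkpointing is omitted for brevity.

```python
import torch
import torch.nn as nn


class LoopedBlockWrapperSketch(nn.Module):
    """Sketch of a looped-execution wrapper around a transformer block."""

    def __init__(self, original: nn.Module, max_loops: int = 4,
                 embed_dim: int = 256, gate_init_bias: float = 0.0):
        super().__init__()
        self.original = original
        self.max_loops = max_loops
        self.current_loops = max_loops
        # One additive iteration embedding per loop, broadcast over batch/sequence.
        self.loop_emb = nn.Parameter(torch.zeros(max_loops, embed_dim))
        # One scalar gate logit per loop; sigmoid(0.0) = 0.5, i.e. 50/50 blending.
        self.gate_logits = nn.Parameter(torch.full((max_loops,), gate_init_bias))

    def forward(self, hidden_states, *args, **kwargs):
        h = hidden_states
        for l in range(self.current_loops):
            gate = torch.sigmoid(self.gate_logits[l])
            # h = original(h + loop_emb[l], ...) * gate[l] + h * (1 - gate[l])
            out = self.original(h + self.loop_emb[l], *args, **kwargs)
            h = out * gate + h * (1 - gate)
        return h
```

With `gate_init_bias = 0.0` each iteration starts as an even blend of the block's output and the residual input, so looping is close to a no-op at initialization and the gates can learn how much each extra iteration should contribute.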
forward(hidden_states, *args, **kwargs)
¶
Run the original block `current_loops` times with learned iteration signals.
The first positional argument is assumed to be the hidden states tensor. All other args/kwargs are passed through to the original block unchanged.
set_loops(n)
¶
Set the current number of loop iterations (for curriculum).
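A loop-count curriculum could drive `set_loops` with a schedule like the sketch below. The schedule itself (step-based warmup) and its parameter names are assumptions for illustration, not part of this API.

```python
def loop_schedule(step: int, warmup_steps: int = 1000, max_loops: int = 4) -> int:
    """Start at 1 iteration, add one every `warmup_steps` steps, cap at `max_loops`."""
    return min(1 + step // warmup_steps, max_loops)


# During training, something like:
#   wrapper.set_loops(loop_schedule(global_step))
```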
trainable_params()
¶
Count only the loop-specific trainable parameters.
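Counting only the loop-specific parameters could be done along these lines. The attribute names `loop_emb` and `gate_logits` are assumptions; the point is simply to exclude the wrapped block's own parameters from the count.

```python
import torch
import torch.nn as nn


def count_loop_params(wrapper: nn.Module) -> int:
    """Count trainable parameters added by the wrapper, excluding `original`."""
    loop_names = ("loop_emb", "gate_logits")  # assumed attribute names
    return sum(
        p.numel()
        for name, p in wrapper.named_parameters()
        if p.requires_grad and any(name.startswith(k) for k in loop_names)
    )
```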