canvas_engineering.graft
One-line grafting onto pretrained backbones.
canvas_engineering.graft.graft_looped_blocks(transformer, max_loops=3, block_attr='transformer_blocks', wrapper_class=None, inner_dim=None, action_dim=7, latent_channels=16, freeze='full')
Graft looped attention onto any transformer model. Replaces each block in the model's block list (the attribute named by `block_attr`) with a looped wrapper and returns the wrapped blocks together with an `ActionHead` action decoder.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `transformer` | `Module` | The pretrained transformer model. | *required* |
| `max_loops` | `int` | Maximum loop iterations per block. | `3` |
| `block_attr` | `str` | Attribute name of the block list (e.g., `"transformer_blocks"`). | `'transformer_blocks'` |
| `wrapper_class` | `Optional[Type]` | Custom wrapper class. If `None`, auto-detects CogVideoX vs. generic. | `None` |
| `inner_dim` | `Optional[int]` | Embedding dimension for loop parameters. Auto-detected if `None`. | `None` |
| `action_dim` | `int` | Output action dimension for the `ActionHead`. | `7` |
| `latent_channels` | `int` | Input channels for the `ActionHead` (e.g., 16 for CogVideoX latents). | `16` |
| `freeze` | `str` | Freeze strategy: `"full"`, `"half"`, or `"none"`. | `'full'` |
Returns:

| Type | Description |
|---|---|
| `(looped_blocks, action_head)` | The list of wrapped blocks and the action decoder. |
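A minimal usage sketch. Only `graft_looped_blocks` and its arguments come from the signature above; the diffusers checkpoint name and the trainable-parameter check are illustrative:

```python
import torch
from diffusers import CogVideoXTransformer3DModel

from canvas_engineering.graft import graft_looped_blocks

# Load a pretrained backbone (checkpoint name is illustrative).
transformer = CogVideoXTransformer3DModel.from_pretrained(
    "THUDM/CogVideoX-2b", subfolder="transformer", torch_dtype=torch.float32
)

# One-line graft: wrap every block in transformer.transformer_blocks with
# looped attention and build an action decoder for 7-DoF outputs over
# 16-channel CogVideoX latents, freezing the non-loop backbone weights.
looped_blocks, action_head = graft_looped_blocks(
    transformer,
    max_loops=3,
    block_attr="transformer_blocks",
    action_dim=7,
    latent_channels=16,
    freeze="full",
)

# With freeze="full", only the loop parameters (plus the new action head)
# should remain trainable.
n_trainable = sum(p.numel() for p in transformer.parameters() if p.requires_grad)
print(f"trainable backbone params: {n_trainable:,}")
```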
canvas_engineering.graft.freeze_full(transformer)
Freeze all non-loop parameters (patch_embed, time_embed, norm_out, proj_out).
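A minimal sketch of the "full" strategy, assuming the four modules listed above exist as attributes on the transformer; this is not the library's implementation:

```python
import torch.nn as nn

def freeze_full_sketch(transformer: nn.Module) -> None:
    """Illustrative only: freeze the non-loop modules named in the docstring."""
    for attr in ("patch_embed", "time_embed", "norm_out", "proj_out"):
        module = getattr(transformer, attr, None)  # skip modules that are absent
        if module is not None:
            for param in module.parameters():
                param.requires_grad = False
```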
canvas_engineering.graft.freeze_half(transformer)
Freeze only patch_embed (the largest input module), leaving time_embed and the norms trainable.
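A corresponding sketch of the "half" strategy, under the same assumptions as above:

```python
import torch.nn as nn

def freeze_half_sketch(transformer: nn.Module) -> None:
    """Illustrative only: freeze just patch_embed; time_embed and the
    norms stay trainable."""
    patch_embed = getattr(transformer, "patch_embed", None)
    if patch_embed is not None:
        for param in patch_embed.parameters():
            param.requires_grad = False
```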