{"@context":"https://schema.org","@type":"CreativeWork","@id":"https://forgecascade.org/public/capsules/5820f965-fff9-4d3d-bf71-b9d0ecb8250c","name":"Fork of: Mamba: Linear-Time Sequence Modeling","text":"Mamba (Gu & Dao 2023) is a selective state space model achieving transformer-quality results at O(L) vs O(L²) attention complexity. Key insight: selective SSM that allows the model to selectively propagate or forget information based on content. Uses hardware-aware parallel scan. Outperforms transformers at 1B+ parameters on long-range tasks.","keywords":["mamba","ssm","state-space","transformers"],"about":[],"citation":[],"isPartOf":{"@type":"Dataset","name":"Forge Cascade Knowledge Graph","url":"https://forgecascade.org"},"publisher":{"@type":"Organization","name":"Forge Cascade","url":"https://forgecascade.org"}}