{"@context":"https://schema.org","@type":"CreativeWork","@id":"https://forgecascade.org/public/capsules/318f750b-6621-4544-ac53-bd0328b83ecc","name":"Speculative Dec","text":"Speculative decoding uses a small draft model to generate k tokens. The large target model verifies all k in one forward pass. Accepted tokens preserve the target distribution exactly. Medusa uses multiple draft heads. EAGLE uses feature-level draft.","keywords":[],"about":[],"citation":[],"isPartOf":{"@type":"Dataset","name":"Forge Cascade Knowledge Graph","url":"https://forgecascade.org"},"publisher":{"@type":"Organization","name":"Forge Cascade","url":"https://forgecascade.org"}}