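inference

what
An inference optimization technique that makes educated guesses about future tokens while decoding the current token. A cheap draft model proposes several tokens ahead, and the large target model verifies them together, keeping the guesses it agrees with.

resources
paper: [2211.17192] Fast Inference from Transformers via Speculative Decoding
A Hitchhiker’s Guide to Speculative Decoding | PyTorch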
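To make the "educated guess" idea concrete, here is a minimal sketch of the greedy variant of speculative decoding. It is not the paper's exact algorithm, which accepts or rejects draft tokens by sampling from probability distributions; this version only checks whether the draft's guess equals the target's greedy choice. `target_model`, `draft_model`, `VOCAB`, and the lookahead `k` are hypothetical toy stand-ins, not real model APIs.

```python
from typing import Callable, List, Tuple

# A "model" here is any function mapping a token context to its greedy next token.
Model = Callable[[List[int]], int]

VOCAB = 50  # hypothetical toy vocabulary size


def target_model(ctx: List[int]) -> int:
    """Hypothetical 'large' model: a deterministic toy rule."""
    return (ctx[-1] * 3 + len(ctx)) % VOCAB


def draft_model(ctx: List[int]) -> int:
    """Hypothetical 'small' model: agrees with the target except
    when the context length is a multiple of 4."""
    guess = target_model(ctx)
    return (guess + 1) % VOCAB if len(ctx) % 4 == 0 else guess


def greedy_decode(model: Model, prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one model call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(model(out))
    return out


def speculative_decode(
    target: Model, draft: Model, prompt: List[int], n_new: int, k: int = 4
) -> Tuple[List[int], int]:
    """Greedy speculative decoding. Returns (tokens, number of verify passes)."""
    out = list(prompt)
    goal = len(prompt) + n_new
    verify_passes = 0
    while len(out) < goal:
        # 1. The draft model guesses k future tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))

        # 2. The target checks every guessed position. A real system does this
        #    in one batched forward pass; the loop below only simulates it.
        verify_passes += 1
        accepted: List[int] = []
        for guess in proposal:
            truth = target(out + accepted)
            accepted.append(truth)  # always keep the target's own token
            if truth != guess:
                break  # first wrong guess: discard the rest of the proposal
        else:
            # All k guesses matched, so the same pass yields one bonus token.
            accepted.append(target(out + accepted))
        out.extend(accepted)
    return out[:goal], verify_passes


prompt = [1, 2, 3]
baseline = greedy_decode(target_model, prompt, 20)
speculative, passes = speculative_decode(target_model, draft_model, prompt, 20)
print(speculative == baseline, passes)
```

Every token that ends up in the output is one the target model would have chosen itself, so the result matches plain greedy decoding with the target. The saving comes from needing fewer target passes than generated tokens whenever the draft guesses well.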