read, post-training

* Refusal in LLMs is mediated by a single direction - LessWrong
* Uncensor any LLM with abliteration
* GitHub - FailSpy/abliterator: Simple Python library/structure to ablate features in LLMs which are supported by TransformerLens