Large language models are supposed to refuse when users ask for dangerous help, from building weapons to writing malware. A new wave of research suggests those guardrails can be sidestepped not ...