Constitutional Classifiers: Defending against Universal Jailbreaks...
Who's Harry Potter? Approximate Unlearning in LLMs
Distinguishing academic science writing from humans or ChatGPT with over 99% accuracy using off-the-shelf machine learning tools
DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text
Paraphrase Detection: Human vs. Machine Content
ChatGPT or Human? Detect and Explain. Explaining Decisions of Machine Learning Model for Detecting Short ChatGPT-generated Text