Constitutional Classifiers: Defending against Universal Jailbreaks... · #Anthropic #Classification #Safety #Large Language Models #Paper #PDF · arxiv.org · Feb 3, 2025
Who's Harry Potter? Approximate Unlearning in LLMs · #Machine Learning #Large Language Models #Paper #PDF #Fine-Tuning #Microsoft #Classification · arxiv.org · Dec 27, 2023
Distinguishing academic science writing from humans or ChatGPT with over 99% accuracy using off-the-shelf machine learning tools · #ChatGPT #Classification #Paper #PDF · sciencedirect.com · Jun 15, 2023
DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text · #Large Language Models #Classification #Paper #PDF · arxiv.org · Jun 12, 2023
Distinguishing academic science writing from humans or ChatGPT with over 99% accuracy using off-the-shelf machine learning tools · #Generative Models #Classification #Writing #Paper #PDF #ChatGPT · cell.com · Jun 8, 2023
Paraphrase Detection: Human vs. Machine Content · #Paraphrase #Quora #Paper #PDF #Classification · arxiv.org · May 7, 2023
ChatGPT or Human? Detect and Explain. Explaining Decisions of Machine Learning Model for Detecting Short ChatGPT-generated Text · #ChatGPT #Classification #Political Science #Paper #PDF · arxiv.org · May 1, 2023