Statement from Dario Amodei on the Paris AI Action Summit \ Anthropic
(The degree of transparency is tricky because it is supposed to maintain privacy at the individual level while making operations visible at the government level. Someone also needs the ability to respond quickly when results are off-kilter. It is not completely automated like a real factory, nor is it a matter of open and closed cases. Another issue may be that everything gets overclassified again to keep it out of the hands of meddling executives.)
·anthropic.com·
Anil, C., Durmus, E., Sharma, M., Benton, J., Kundu, S., Batson, J., ... & Duvenaud, D. (2024). Many-shot Jailbreaking.

Long contexts represent a new front in the struggle to control LLMs. We explored a family of attacks that are newly feasible due to longer context lengths, as well as candidate mitigations. We found that the effectiveness of attacks, and of in-context learning more generally, could be characterized by simple power laws. This provides a richer source of feedback for mitigating long-context attacks than the standard approach of measuring frequency of success.

·www-cdn.anthropic.com·