SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation
Offered By: Center for Language & Speech Processing (CLSP), JHU via YouTube
Course Description
Overview
Watch a 15-minute conference talk on SemStamp, a sentence-level semantic watermarking algorithm for text generation, presented by Jack Zhang of the Center for Language & Speech Processing at Johns Hopkins University. Existing watermarking algorithms are vulnerable to paraphrase attacks; SemStamp addresses this weakness with a design based on locality-sensitive hashing (LSH). The talk covers the algorithm's encoding and hashing process, its rejection sampling technique, and a margin-based constraint that improves robustness. It also introduces the "bigram" paraphrase attack and examines its effectiveness against token-level watermarking methods. Experimental results show that SemStamp is more robust than previous state-of-the-art methods against both common and bigram paraphrase attacks, and that it better preserves generation quality.
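The pipeline named in the overview (encode a sentence, hash it with LSH, reject candidates that land in the wrong region or too close to a boundary) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: toy_embed is a hypothetical hashed bag-of-words stand-in for a real sentence encoder, and every function name, the valid-region ratio, the seeding scheme, and the margin value are made up for the example.

```python
import math
import random
import zlib


def toy_embed(sentence, dim=16):
    """Hypothetical stand-in for a sentence encoder: signed hashed bag of words."""
    vec = [0.0] * dim
    for word in sentence.lower().split():
        h = zlib.crc32(word.encode("utf-8"))
        vec[h % dim] += 1.0 if (h >> 16) & 1 else -1.0
    return vec


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian normal vectors; each one defines an LSH bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(vec, hyperplanes):
    """Pack the sign of each projection into an integer region id."""
    sig = 0
    for plane in hyperplanes:
        sig = (sig << 1) | (1 if dot(vec, plane) > 0 else 0)
    return sig


def min_abs_cosine(vec, hyperplanes):
    """Smallest |cosine| between vec and any hyperplane normal (0.0 for a zero vector)."""
    norm = math.sqrt(dot(vec, vec))
    if norm == 0.0:
        return 0.0
    return min(abs(dot(vec, p)) / (norm * math.sqrt(dot(p, p))) for p in hyperplanes)


def valid_regions(prev_signature, n_bits, valid_ratio=0.25):
    """Assumed scheme: the previous sentence's signature seeds the valid/blocked split."""
    rng = random.Random(prev_signature)
    regions = list(range(2 ** n_bits))
    rng.shuffle(regions)
    keep = max(1, int(valid_ratio * len(regions)))
    return set(regions[:keep])


def sample_watermarked(prev_sentence, propose, embed, hyperplanes, margin=0.0, max_tries=50):
    """Rejection sampling: redraw until a candidate hashes into a valid region
    and stays at least `margin` (in |cosine|) away from every hyperplane."""
    prev_sig = lsh_signature(embed(prev_sentence), hyperplanes)
    allowed = valid_regions(prev_sig, len(hyperplanes))
    candidate = None
    for _ in range(max_tries):
        candidate = propose()
        vec = embed(candidate)
        in_region = lsh_signature(vec, hyperplanes) in allowed
        if in_region and min_abs_cosine(vec, hyperplanes) >= margin:
            return candidate, True
    return candidate, False  # sampling budget exhausted; last draw returned as-is


if __name__ == "__main__":
    planes = make_hyperplanes(dim=16, n_bits=3, seed=0)
    pool = [
        "The model writes a short answer.",
        "A brief reply is produced by the system.",
        "Watermarks should survive rewording.",
        "Paraphrasing changes tokens but keeps meaning.",
    ]
    rng = random.Random(1)
    sentence, accepted = sample_watermarked(
        "The talk begins now.", lambda: rng.choice(pool), toy_embed, planes, margin=0.05
    )
    print(sentence, accepted)
```

In a real system, propose would draw the next sentence from a language model and embed would be a trained sentence encoder. The intuition behind the margin check is that a candidate lying close to a hyperplane can flip its signature under a small semantic shift such as a paraphrase, so rejecting near-boundary candidates trades extra sampling for robustness.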
Syllabus
SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation -- Jack Zhang (JHU)
Taught by
Center for Language & Speech Processing (CLSP), JHU
Related Courses
Intro to Deep Learning with PyTorch (Facebook via Udacity)
Natural Language Processing with Sequence Models (DeepLearning.AI via Coursera)
Deep Learning (Universidad Anáhuac via edX)
Create a Superhero Name Generator with TensorFlow (Coursera Project Network via Coursera)
Natural Language Generation in Python (DataCamp)