VIDEO DOI: https://doi.org/10.48448/h1t8-2389

technical paper

EMNLP 2021

November 08, 2021

Live on Underline

Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus

Next from EMNLP 2021

Pushing on Text Readability Assessment: A Transformer Meets Handcrafted Linguistic Features
poster

EMNLP 2021

Bruce W. Lee and 2 other authors

08 November 2021

Similar lecture

Pre-train or Annotate? Domain Adaptation with a Constrained Budget
poster

EMNLP 2021

Fan Bai, Alan Ritter, and Wei Xu

08 November 2021