VIDEO DOI: https://doi.org/10.48448/2yr8-q466

technical paper

EMNLP 2021

November 08, 2021

Live on Underline

Effects of Parameter Norm Growth During Transformer Training: Inductive Bias from Gradient Descent

Next from EMNLP 2021

Fast, Effective, and Self-Supervised: Transforming Masked Language Models into Universal Lexical and Sentence Encoders
technical paper

EMNLP 2021

Fangyu Liu, Nigel Collier, Anna Korhonen, and 1 other author

08 November 2021

Similar lecture

RuleBERT: Teaching Soft Rules to Pre-Trained Language Models
technical paper

EMNLP 2021

Mohammed Saeed, Preslav Nakov, and 2 other authors

08 November 2021