mmBERT is trained on 3T tokens from over 1,800 languages and shows state-of-the-art benchmark scores and exceptional low-resource performance.
Papers
DAR: Deontic Reasoning with Agentic Harnesses
Trust Functions: Near-Lossless Weak-to-Strong Generalization by Learning When to Trust the Weak Teacher
Models (53)
jhu-clsp/mmBERT-small
Fill-Mask • Updated • 22k • 76
jhu-clsp/mmBERT-base
Fill-Mask • Updated • 346k • 217
jhu-clsp/mmBERT-checkpoints
Updated • 4
jhu-clsp/ettin-decoder-1b
Fill-Mask • Updated • 21 • 5
jhu-clsp/ettin-decoder-32m
Text Generation • Updated • 310
jhu-clsp/ettin-encoder-1b
Feature Extraction • Updated • 1.94k • 23
jhu-clsp/ettin-encoder-68m
Fill-Mask • Updated • 66.6k • 5
jhu-clsp/ettin-dec-from-enc-32m
Text Generation • Updated • 4
jhu-clsp/ettin-encoder-150m
Fill-Mask • Updated • 6.02k • 13
jhu-clsp/ettin-decoder-400m
Text Generation • Updated • 6.95k • 4
Datasets (40)
jhu-clsp/ManyIH-Bench
Preview • Updated • 47 • 3
jhu-clsp/robust04-instructions
Viewer • Updated • 136k • 1.53k • 2
jhu-clsp/core17-instructions
Viewer • Updated • 49.4k • 1.64k • 2
jhu-clsp/news21-instructions
Viewer • Updated • 71.5k • 1.43k • 1
jhu-clsp/SciTaRC
Viewer • Updated • 371 • 52 • 1
jhu-clsp/megawika-2
Updated • 100 • 4
jhu-clsp/mmBERT-decay-data
Updated • 33.1k • 6
jhu-clsp/mmBERT-midtraining-data
Updated • 2.24k • 1
jhu-clsp/ettin-pretraining-data
Updated • 129k • 9
jhu-clsp/ettin-decay-data
Updated • 973 • 1