PhoBERT: Pre-trained language models for Vietnamese
Paper: arXiv:2003.00744
Pre-trained PhoBERT models are the state-of-the-art language models for Vietnamese (Pho, i.e. "Phở", is a popular food in Vietnam).

How to use vinai/phobert-base with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("fill-mask", model="vinai/phobert-base")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-base")
model = AutoModelForMaskedLM.from_pretrained("vinai/phobert-base")
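The loading snippet above can be extended into a short fill-mask query. The following is a minimal sketch, not the authors' reference usage. It assumes that the input is already word-segmented (syllables of a multi-syllable word joined by underscores, as PhoBERT is generally documented to expect) and that the mask token is the RoBERTa-style `<mask>`; at run time the sketch reads the actual mask token from the tokenizer instead of relying on that default. The helper names `mask_word`, `top_predictions`, and `demo`, and the sample sentence, are hypothetical and exist only for illustration.

```python
from typing import List, Tuple


def mask_word(segmented_sentence: str, position: int, mask_token: str = "<mask>") -> str:
    """Replace the word at `position` in a whitespace-separated sentence with `mask_token`.

    The default "<mask>" is an assumption; pass tokenizer.mask_token to be safe.
    """
    words: List[str] = segmented_sentence.split()
    if not 0 <= position < len(words):
        raise IndexError(f"position {position} is outside a sentence of {len(words)} words")
    words[position] = mask_token
    return " ".join(words)


def top_predictions(pipe, masked_sentence: str, k: int = 5) -> List[Tuple[str, float]]:
    """Return the k most likely fillers for a sentence containing exactly one mask."""
    results = pipe(masked_sentence, top_k=k)
    return [(r["token_str"], r["score"]) for r in results]


def demo() -> None:
    """Download vinai/phobert-base (on first call) and print candidate fillers."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    pipe = pipeline("fill-mask", model="vinai/phobert-base")
    # Hypothetical, already word-segmented input sentence.
    sentence = "Tôi là sinh_viên trường đại_học Công_nghệ ."
    masked = mask_word(sentence, 2, mask_token=pipe.tokenizer.mask_token)
    for token, score in top_predictions(pipe, masked):
        print(f"{token}\t{score:.3f}")
```

Calling `demo()` requires network access the first time, since the model weights are fetched from the Hugging Face Hub. Raw, unsegmented text would first need a Vietnamese word segmenter, which this sketch leaves out.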
The general architecture and experimental results of PhoBERT can be found in our EMNLP-2020 Findings paper:
@article{phobert,
title = {{PhoBERT: Pre-trained language models for Vietnamese}},
author = {Dat Quoc Nguyen and Anh Tuan Nguyen},
journal = {Findings of EMNLP},
year = {2020}
}
Please CITE our paper when PhoBERT is used to help produce published results or is incorporated into other software.
For further information or requests, please go to PhoBERT's homepage!