How to use LeverageX/scibert-wechsel-korean with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("fill-mask", model="LeverageX/scibert-wechsel-korean")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("LeverageX/scibert-wechsel-korean")
model = AutoModelForMaskedLM.from_pretrained("LeverageX/scibert-wechsel-korean")
```
scibert-wechsel-korean
SciBERT (🇺🇸) converted into Korean (🇰🇷) using the WECHSEL technique.
Description
- SciBERT is trained on papers from the semanticscholar.org corpus. The corpus contains 1.14M papers (3.1B tokens).
- WECHSEL transfers a model to a new language by converting the embedding layer's subword token embeddings from the source language to the target language.
- Here, SciBERT trained on English text is converted to Korean using the WECHSEL technique.
- The Korean tokenizer is taken from the KLUE PLMs, chosen for its similar vocabulary size (32,000) and its performance.
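
To make the embedding conversion more concrete, here is a minimal sketch of a WECHSEL-style initialization: each target-language subword embedding is built as a similarity-weighted average of source-language subword embeddings, with the similarities measured in a shared static embedding space. This is an illustration under stated assumptions, not the code used to build this model: the function name `wechsel_style_init`, the parameters `k` and `temperature`, and the random toy vectors (standing in for aligned static word embeddings) are all hypothetical.

```python
import numpy as np


def wechsel_style_init(src_emb, src_static, tgt_static, k=2, temperature=0.1):
    """Build target subword embeddings from source ones (illustrative only).

    src_emb:    (n_src, d_model)  source model's subword embeddings
    src_static: (n_src, d_static) static vectors for source subwords
    tgt_static: (n_tgt, d_static) static vectors for target subwords,
                assumed to live in the same aligned space as src_static
    """
    # Cosine similarity between every target and every source subword.
    src_n = src_static / np.linalg.norm(src_static, axis=1, keepdims=True)
    tgt_n = tgt_static / np.linalg.norm(tgt_static, axis=1, keepdims=True)
    sim = tgt_n @ src_n.T  # (n_tgt, n_src)

    # Keep the k most similar source subwords for each target subword.
    top = np.argsort(-sim, axis=1)[:, :k]            # (n_tgt, k)
    top_sim = np.take_along_axis(sim, top, axis=1)   # (n_tgt, k)

    # Softmax with temperature turns similarities into mixing weights.
    logits = top_sim / temperature
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Weighted average of the selected source embeddings.
    return np.einsum("tk,tkd->td", weights, src_emb[top])


# Toy data: 5 source subwords, 3 target subwords (random, for shape only).
rng = np.random.default_rng(0)
src_emb = rng.normal(size=(5, 8))
src_static = rng.normal(size=(5, 4))
tgt_static = rng.normal(size=(3, 4))

tgt_emb = wechsel_style_init(src_emb, src_static, tgt_static)
```

The resulting `tgt_emb` has one row per target subword and the same width as the source model's embeddings, so in this sketch it could replace the embedding matrix while the remaining Transformer layers are kept unchanged.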