AI & ML interests

Language models, reasoning, robustness, question answering, evaluation, theorem proving, knowledge graphs, mechanistic interpretability, adversarial training, dynamic adversarial data collection, in-context learning, natural language explanations, safety and security, self-training, knowledge distillation, natural language processing

yaolu updated a model 2 months ago
yaolu published a model 3 months ago