new

Get trending papers in your email inbox!

Subscribe

Daily Papers

byAK and the research community

Jan 7

DictAS: A Framework for Class-Generalizable Few-Shot Anomaly Segmentation via Dictionary Lookup

Recent vision-language models (e.g., CLIP) have demonstrated remarkable class-generalizable ability to unseen classes in few-shot anomaly segmentation (FSAS), leveraging supervised prompt learning or fine-tuning on seen classes. However, their cross-category generalization largely depends on prior knowledge of real seen anomaly samples. In this paper, we propose a novel framework, namely DictAS, which enables a unified model to detect visual anomalies in unseen object categories without any retraining on the target data, only employing a few normal reference images as visual prompts. The insight behind DictAS is to transfer dictionary lookup capabilities to the FSAS task for unseen classes via self-supervised learning, instead of merely memorizing the normal and abnormal feature patterns from the training set. Specifically, DictAS mainly consists of three components: (1) **Dictionary Construction** - to simulate the index and content of a real dictionary using features from normal reference images. (2) **Dictionary Lookup** - to retrieve queried region features from the dictionary via a sparse lookup strategy. When a query feature cannot be retrieved, it is classified as an anomaly. (3) **Query Discrimination Regularization**- to enhance anomaly discrimination by making abnormal features harder to retrieve from the dictionary. To achieve this, Contrastive Query Constraint and Text Alignment Constraint are further proposed. Extensive experiments on seven public industrial and medical datasets demonstrate that DictAS consistently outperforms state-of-the-art FSAS methods.

  • 10 authors
·
Aug 19, 2025

The Gold Medals in an Empty Room: Diagnosing Metalinguistic Reasoning in LLMs with Camlang

Large Language Models (LLMs) achieve gold-medal performance across many benchmarks, yet it remains unclear whether such success reflects genuine reasoning or pattern matching. From a cognitive science perspective, an informative test is whether models can master an unfamiliar language through explicit metalinguistic deductive learning, a paradigm where human learners can reliably internalise grammatical systems through metalinguistic reasoning. We address this question with Camlang, a novel constructed language that exhibits naturalistic yet unattested feature combinations. Camlang consists of two explicit resources, a grammar book and a bilingual dictionary, which mirror adult second-language learning via explicit grammar rules and lexical lookup, and enable us to disentangle errors in morpho-syntax, lexical semantics, and sentence-level reasoning. Human experiments show that these resources are sufficient for participants to acquire Camlang and successfully solve Camlang tasks. To operationalise evaluation, we adapt CommonsenseQA into Camlang, creating Camlang-CSQA-v0, the first task in a broader suite where solving questions requires applying grammar rules and lexical mappings. Experimental results show that GPT-5 achieves 98\% EM accuracy in English but only 47\% in Camlang, far below human performance at 87\%, while other state-of-the-art reasoning LLMs perform even worse. Human verification further reveals that most model successes stem from shallow lexical alignment while GPT-5 shows emerging metalinguistic awareness to a limited extent but not systematic grammatical mastery as humans. Camlang establishes a cognitively grounded evaluation paradigm that exposes fundamental gaps between current models and human metalinguistic competence.

  • 6 authors
·
Aug 30, 2025 1