SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-mpnet-base-v2. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-mpnet-base-v2
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False, 'architecture': 'MPNetModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
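
As a rough illustration of what modules (1) and (2) do, the sketch below mean-pools token vectors while skipping padding, then L2-normalizes the result. It is a toy version in plain Python under stated assumptions: the helper names `mean_pool` and `l2_normalize` and the 2-dimensional vectors are hypothetical, and the real model works on 768-dimensional tensors.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention-mask entry is 1 (hypothetical helper)."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (hypothetical helper)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 3.0], [5.0, 5.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)            # average of the first two vectors
sentence_embedding = l2_normalize(pooled)   # unit-length sentence vector
print(pooled, sentence_embedding)
```

Because the final vectors have unit length, the cosine similarity between two embeddings equals their dot product.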

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Ambika14/sbert_grievance_classifier-code-A2")
# Run inference
sentences = [
    'loanagreement no isbl00910729978dated26- <NUM> - <NUM> loan payment is pending since <NUM> sep <NUM> . hdfc bank has returned the cheque stating it as alteration under rbi guidelines. pli and nodal agencies contact numbers are found out of service. i am unable to connect with them. i have attached the loan agreement pdf for reference. please support to get the resolution as it is pending since <NUM> sep <NUM> . issue non-receipt of loan payment under dcmsme scheme context the user is reporting non-receipt of loan payment since <NUM> <NUM> <NUM> citing hdfc bank s return of cheque as alteration under rbi guidelines and requesting assistance in resolving the issue. details - loan agreement no isbl00910729978 loan agreement date <NUM> <NUM> <NUM> cheque return reason alteration under rbi guidelines attached with application',
    'Policy and Schemes. Related to DCMSME Scheme. this category is related to grievances under the dcmsme scheme specifically focusing on issues related to access to credit from banks for micro small and medium enterprises msmes . the category applies to commercial banks regional rural banks rrbs and cooperative banks and covers cases where the bottleneck lies entirely at the bank level. it excludes issues related to rbi policy government scheme design credit guarantee mechanisms or buyer default but rather addresses bank-side processing conditions or conduct in extending credit to msmes. the category includes cases where msmes have applied for loans submitted required documents and followed up through branches or digital portals but the loan application remains pending without a formal sanction or rejection decision. it captures administrative stalling such as prolonged under process or pending for verification status absence of deficiency letters or timelines repeated demands for already-submitted documents and failure of branch offices to forward eligible applications to regional or head offices for approval. additionally the category covers situations where loans have been formally sanctioned but disbursement is delayed or withheld by the bank without valid or documented reasons. it includes cases of prolonged non-disbursement despite fulfilment of sanction conditions partial disbursement with unexplained withholding of the balance amount delays citing internal audits or reviews and imposition of additional post-sanction conditions that were not mentioned in the original sanction letter. the category also includes grievances related to excessive or unreasonable collateral demands by banks where security requirements exceed applicable msme rbi or cgtmse guidelines. '
    'this includes insistence on collateral despite eligibility for credit guarantee coverage demands for disproportionate collateral value rejection of loan applications solely due to refusal to provide personal or residential property as security and requirements for subcategories <NUM> . tcec division for implementation of the scheme establishement of new technology centres extension centres <NUM> . economic analysis <NUM> . statistics data division <NUM> . national awards <NUM> . entrepreneurship skill development programmes esdp <NUM> . vendor development programme for ancillarisation <NUM> . export promotion wto <NUM> . msme policy industry associations related issues <NUM> .software related <NUM> . zero defect zero effect zed <NUM> .technology center system program tcsp <NUM> . north east region cell ner promotion of msmes in ner and sikkim <NUM> .international trade fair itf and international cooperation ic <NUM> .support for entrepreneurial and managerial development of smes through incubators- an nmcp scheme <NUM> .building awareness on intellectual property rights ipr for the micro small medium enterprises- an nmcp scheme <NUM> .lean manufacturing competitiveness scheme lmcs <NUM> . design clinic scheme - an nmcp scheme <NUM> . pms scheme <NUM> . technology and quality upgradation tequp support to msmes- an nmcp scheme <NUM> . digital msme - an nmcp scheme <NUM> .micro small enterprises cluster development programme mse-cdp <NUM> .credit linked capital subsidy for technology upgradation clcs- tu special clcs for sc st <NUM> .credit guarantee fund for micro and smali enterprises cgtmse <NUM> . market development assistance mda to msmes',
    'Technology, Quality and Institutions. Related to Scheme of KVIC. this category encompasses grievances related to schemes subsidies certifications and implementation processes administered by the khadi village industries commission kvic and its implementing authorities including state kvic and district industries centre dic offices. it specifically addresses issues that originate from kvic or its field-level offices excluding problems solely with banks generic msme schemes or non-kvic authorities. the category covers a range of issues including <NUM> . delays or failures in the release of pmegp margin money subsidies where loans have already been sanctioned and units have been set up but kvic has not credited the subsidy to the bank due to pending portal actions physical verification delays repeated document objections or prolonged under process status without timelines. <NUM> . grievances related to khadi subsidies including non-release partial release or unexplained reduction of admissible subsidy amounts stoppage of subsidy citing non-compliance without sharing inspection reports deviations from prescribed scheme norms in determining subsidy eligibility or quantum <NUM> . issues related to kvic certification and registration including pending or delayed issuance of khadi certificates cancellation of certification without prior notice or stated reasons inspection-related delays without clarification delayed renewal of certificates that directly affect eligibility for subsidies tenders and market access subcategories <NUM> . providing financial assistance to set up new enterprises under pmegp <NUM> . providing insurance cover to khadi artisans under aam admi bima yojana <NUM> . providing financial assistance to khadi institutions under mda <NUM> . workshed scheme for khadi artisans <NUM> . loans under interest subsidy eligibility certificate scheme isec <NUM> . mission solar charkha',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5738, 0.4289],
#         [0.5738, 1.0000, 0.5811],
#         [0.4289, 0.5811, 1.0000]])
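
One way to read the matrix above, given that the first sentence is a grievance and the other two are category descriptions, is to assign the grievance to whichever category scores highest. The sketch below does this with the printed values copied into a plain list. The names `grievance_index`, `category_indices`, and `best_category` are illustrative, and treating the top score as the predicted label is an assumption about how the model is meant to be used.

```python
# Similarity values copied from the printed output above.
similarities = [
    [1.0000, 0.5738, 0.4289],
    [0.5738, 1.0000, 0.5811],
    [0.4289, 0.5811, 1.0000],
]

grievance_index = 0          # the loan-payment grievance
category_indices = [1, 2]    # the two category descriptions

# Pick the category whose embedding is most similar to the grievance.
best_category = max(category_indices, key=lambda j: similarities[grievance_index][j])
print(best_category)
```

With these values the grievance is closest to the DCMSME category description (index 1), which agrees with the DCMSME scheme mentioned in the grievance text.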

Evaluation

Metrics

Semantic Similarity

Metric Value
pearson_cosine nan
spearman_cosine nan
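
The card does not say why both correlations are nan. One general cause, offered here only as a possibility and not as a diagnosis of this run, is that a correlation coefficient is undefined when one of the two series has zero variance (for example, when every gold score is identical). The sketch below shows that behaviour with a hand-written Pearson function; the `scores` and label lists are made-up values.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns nan when either series has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom > 0 else float("nan")

scores = [0.57, 0.43, 0.58]          # made-up cosine similarities
constant_labels = [1.0, 1.0, 1.0]    # every pair labelled the same
varying_labels = [0.9, 0.1, 0.8]     # labels that actually vary

print(pearson(scores, constant_labels))  # undefined, reported as nan
print(pearson(scores, varying_labels))   # a finite value
```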

Training Details

Training Dataset

Unnamed Dataset

  • Size: 90 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 90 samples:
    • sentence_0 (type: string): min 33 tokens, mean 116.86 tokens, max 128 tokens
    • sentence_1 (type: string): min 128 tokens, mean 128.0 tokens, max 128 tokens
  • Samples:
    Sample 1
    • sentence_0: the msme portal software keeps crashing during udyam registration renewal and scheme applications with error messages and failed uploads every time i try. support team gives no help and i can t access my digital certificates or track status. this software glitch blocks my business from government benefits and loans. please fix the bugs improve server speed and add better error guides right away. issue software glitch in msme portal during udyam registration renewal and scheme applications context the user is reporting frequent crashes of the msme portal software during udyam registration renewal and scheme applications resulting in failed uploads error messages and inability to access digital certificates or track status which is hindering business access to government benefits and loans. details - software msme portal software issue frequent crashes during udyam registration renewal and scheme applications error messages failed uploads and error messages impact inability to access dig...
    • sentence_1: Technology, Quality and Institutions. Software Related. software-related initiatives for msmes mainly center on the digital msme scheme under the national manufacturing competitiveness programme which promotes adoption of information and communication technologies through cloud-based erp crm and accounting software to digitalize day-to-day business operations. the scheme combines awareness workshops needs assessment and financial support in the form of subsidies covering about of eligible costs subject to a ceiling of lakh over two years specifically targeting micro and small enterprises. these initiatives are reinforced by complementary efforts such as software-enabled facilities under technology centre programmes for electronics and esdm sectors digital quality and process parameters under zed certification and software-focused modules within entrepreneurship and skill development programmes. together these measures aim to standardize workflows automate inventory fi...
    Sample 2
    • sentence_0: msme scheme guidelines and forms under official language policy are only in hindi or poorly translated english making it hard for me to understand eligibility and apply correctly. i keep making errors in submissions because of confusing language and staff reject them without clear explanations. please provide all msme documents in simple english or bilingual format to help non-hindi speakers like me access schemes easily. issue non-availability of msme scheme guidelines and forms in simple english context the user is reporting difficulty in understanding the eligibility and applying for msme schemes due to the availability of guidelines and forms only in hindi or poorly translated english and is requesting provision of these documents in simple english or bilingual format to facilitate access for non-hindi speakers. details - language issue msme scheme guidelines and forms available only in hindi or poorly translated english request provision of documents in simple english or bilingual...
    • sentence_1: Technology, Quality and Institutions. Official Language Related Issues. official language related issues in msme administration concern the implementation of hindi rajbhasha in accordance with the official languages act as amended across the ministry of msme its development institutes field offices and attached organizations. this framework mandates progressive use of hindi in official work bilingual hindi english documentation replies in hindi to communications received in hindi availability of hindi-enabled software on computers and regular training in hindi typing and computing for officials. the ministry monitors compliance through official language implementation committees quarterly progress reviews rajbhasha inspections and conferences while ensuring that citizens charters schemes portals and public-facing information are available bilingually. these measures aim to improve accessibility for hindi-speaking msmes enhance transparency and inclusiveness strengthen regional ou...
    Sample 3
    • sentence_0: dear sir my uam has already cancelled but unable to register new firm through my aadhar number - - kindly delete my aadhar number or suggest to how register new firm with same aadhar number issue deletion of aadhar number from udyam registration system context the user is requesting deletion of the aadhar number from the udyam registration system as it is associated with a cancelled udyam registration number and is unable to register a new firm using the same aadhar number. details - udyam registration number udyam-ap- - aadhar number - -
    • sentence_1: UAM/Udyam Registration/Certificate related issues. After Cancellation, Unable to Register with PAN Details (Technical). this category refers to grievances where an entrepreneur is unable to create a new udyam registration using their pan after an earlier registration has already been cancelled. in such situations the system may continue to recognize the pan as already associated with an existing registration preventing the user from completing a new registration. grievances under this category generally occur when an enterprise previously cancelled its registration due to closure incorrect details or duplication and later attempts to register again using the same pan. users may report that the system still displays a message indicating that a registration already exists for that pan even though the earlier registration was cancelled. some entrepreneurs also encounter errors where the portal does not allow them to proceed with registration because the pan remains linked to the previous ...
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "mini_batch_size": 32,
        "gather_across_devices": false
    }
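
To make the loss parameters concrete, the sketch below computes a multiple-negatives ranking loss on a small cosine-similarity matrix. Row i holds the similarities between sentence_0 of pair i and every sentence_1 in the batch, so the diagonal entries are the positives and the rest act as in-batch negatives. The similarities are multiplied by the scale (20.0) and passed through a softmax cross-entropy. This is a simplified plain-Python illustration, not the library's implementation: the cached variant additionally processes the batch in chunks of `mini_batch_size` to reduce memory use, which is omitted here, and the function name and toy matrices are hypothetical.

```python
import math

def multiple_negatives_ranking_loss(similarity, scale=20.0):
    """Mean cross-entropy where the correct column for row i is column i."""
    losses = []
    for i, row in enumerate(similarity):
        logits = [scale * s for s in row]
        top = max(logits)  # subtract the max for numerical stability
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy cosine-similarity matrices for a batch of two pairs.
well_separated = [[1.0, 0.0], [0.0, 1.0]]   # positives clearly win
uninformative = [[0.5, 0.5], [0.5, 0.5]]    # positives indistinguishable

low_loss = multiple_negatives_ranking_loss(well_separated)
high_loss = multiple_negatives_ranking_loss(uninformative)
print(low_loss, high_loss)
```

When the model cannot tell positives from negatives, the loss equals the log of the batch size (log 2 in this toy case); it approaches zero as the diagonal dominates.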
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 5
  • fp16: True
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_ratio: None
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • enable_jit_checkpoint: False
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • use_cpu: False
  • seed: 42
  • data_seed: None
  • bf16: False
  • fp16: True
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: -1
  • ddp_backend: None
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • auto_find_batch_size: False
  • full_determinism: False
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • use_cache: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
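
For orientation, the combination of `learning_rate: 5e-05`, `lr_scheduler_type: linear`, and `warmup_steps: 0` implies a learning rate that decays in a straight line from its peak to zero over the run. The sketch below reproduces that shape in plain Python. The function `linear_lr` is a hypothetical helper rather than a library call, and the total of 15 optimizer steps is taken from the Training Logs section.

```python
def linear_lr(step, base_lr=5e-05, total_steps=15, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at the start and at the end of each of the 5 epochs.
schedule = [linear_lr(step) for step in range(0, 16, 3)]
print(schedule)
```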

Training Logs

Epoch Step spearman_cosine
1.0 3 nan
2.0 6 nan
3.0 9 nan
4.0 12 nan
5.0 15 nan
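
The step counts in this table follow from the dataset size and batch size reported earlier. A quick check, assuming a single device and using `gradient_accumulation_steps: 1` and `dataloader_drop_last: False` from the hyperparameter list:

```python
import math

train_samples = 90   # from the Training Dataset section
batch_size = 32      # per_device_train_batch_size
epochs = 5           # num_train_epochs

# The last, smaller batch is kept because dataloader_drop_last is False.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
logged_steps = [steps_per_epoch * epoch for epoch in range(1, epochs + 1)]
print(steps_per_epoch, total_steps, logged_steps)
```

Each epoch therefore consists of two full batches of 32 pairs and one final batch of 26.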

Framework Versions

  • Python: 3.12.12
  • Sentence Transformers: 5.2.3
  • Transformers: 5.0.0
  • PyTorch: 2.10.0+cu128
  • Accelerate: 1.13.0
  • Datasets: 4.0.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Model repository: Ambika14/sbert_grievance_classifier-code-A2