	---
	# For reference on model card metadata, see the spec: https://github.com/netgvarun2012/VirtualTherapist
	# Doc / guide: https://huggingface.co/docs/hub/model-cards
	{}
	---

	# Model Card for Model ID

	<!-- Provide a quick summary of what the model is/does. -->

	This modelcard aims to be a base template for new models. It has been generated using [this raw template](https://github.com/netgvarun2012/VirtualTherapist).

	## Model Details

	### Model Description

	<!-- Provide a longer summary of what this model is. -->
	A multimodal model that recognizes emotions from speech and text. It was created and fine-tuned jointly by concatenating Hubert (audio) and BERT (text) embeddings.
	The Hubert model was fine-tuned with a classification head on preprocessed audio and emotion labels in a supervised manner.
	BERT was trained on embeddings of the text transcriptions.

	The model recognizes seven emotion classes (Angry, Sad, Fearful, Happy, Disgusted, Surprised, Calm) with ~80% accuracy.
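
The fusion step described above can be illustrated without any deep-learning dependency. The following is a minimal sketch, not the model's actual implementation: `mean_pool` and `fuse` are hypothetical helper names, and the tiny hand-written vectors stand in for real Hubert and BERT hidden states. It shows that the two sequences may have different lengths and that the fused vector's width is the sum of the two hidden sizes.

```python
from statistics import fmean


def mean_pool(hidden_states):
    """Average a [seq_len][hidden_size] list of vectors over the sequence axis."""
    return [fmean(column) for column in zip(*hidden_states)]


def fuse(audio_states, text_states):
    """Mean-pool each modality, then concatenate the two pooled vectors."""
    return mean_pool(audio_states) + mean_pool(text_states)


# Toy stand-ins (assumed values): 2 audio frames with hidden size 2,
# and 3 text tokens with hidden size 3.
audio_states = [[1.0, 3.0], [3.0, 5.0]]
text_states = [[0.0, 2.0, 4.0], [2.0, 4.0, 6.0], [4.0, 6.0, 8.0]]

fused = fuse(audio_states, text_states)
print(len(fused), fused)
```

A linear classifier over such a fused vector therefore needs an input width equal to the audio hidden size plus the text hidden size.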


	- Developed by: [https://www.linkedin.com/in/sharmavaruncs/]
	- Model type: [MultiModal - Text and Audio based]
	- Language(s) (NLP): [NLP, Speech processing]
	- Finetuned from model [optional]: [https://huggingface.co/docs/transformers/model_doc/hubert]

	### Model Sources [optional]

	<!-- Provide the basic links for the model. -->

	- Repository: [https://github.com/netgvarun2012/VirtualTherapist/]
	- Paper [optional]: [https://github.com/netgvarun2012/VirtualTherapist/blob/main/documentation/Speech_and_Text_based_MultiModal_Emotion_Recognizer.pdf]
	- Demo [optional]: [https://huggingface.co/spaces/netgvarun2005/VirtualTherapist]

	## Uses

	<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
	Intended for the 'Virtual Therapist' app, an assistant that takes speech and text input, detects the user's emotion, and generates therapeutic messages based on that emotional state.

	Emotions recognized: Angry, Sad, Fearful, Happy, Disgusted, Surprised, Calm (~80% accuracy).

	Use the code below to get started with the model:


	```python
	import torch
	import torch.nn as nn
	from transformers import (
	    AutoModel,
	    AutoModelForCausalLM,
	    AutoTokenizer,
	    HubertForSequenceClassification,
	)


	class MultimodalModel(nn.Module):
	    """
	    Custom PyTorch model that takes both the audio features and the tokenized text as input,
	    and concatenates the mean-pooled last hidden states from the Hubert and BERT models.
	    """

	    def __init__(self, bert_model_name, num_labels):
	        super().__init__()
	        # Keep only the Hubert encoder of the fine-tuned standalone classifier
	        self.hubert = HubertForSequenceClassification.from_pretrained(
	            "netgvarun2005/HubertStandaloneEmoDetector", num_labels=num_labels
	        ).hubert
	        self.bert = AutoModel.from_pretrained(bert_model_name)
	        # The classifier sees the two pooled vectors side by side
	        self.classifier = nn.Linear(
	            self.hubert.config.hidden_size + self.bert.config.hidden_size, num_labels
	        )

	    def forward(self, input_values, text):
	        # input_values: preprocessed audio; text: BERT input IDs
	        hubert_output = self.hubert(input_values).last_hidden_state
	        bert_output = self.bert(text).last_hidden_state

	        # Apply mean pooling along the sequence dimension
	        hubert_output = hubert_output.mean(dim=1)
	        bert_output = bert_output.mean(dim=1)

	        concat_output = torch.cat((hubert_output, bert_output), dim=-1)
	        logits = self.classifier(concat_output)
	        return logits


	def load_model(bert_model_name, num_labels, model_weights_path, device):
	    """
	    Load and configure the models and tokenizers for the multimodal application.

	    Loads the multimodal model and its weights, initializes the tokenizers for that
	    model and for the text-generation language model, and returns all four components.

	    The four arguments were referenced but not defined in the original snippet,
	    so the caller must supply them.

	    Returns:
	        tuple: A tuple containing the following components:
	            - multiModel (MultimodalModel): The multimodal model.
	            - tokenizer (AutoTokenizer): Tokenizer for the multimodal model.
	            - model_gpt (AutoModelForCausalLM): Language model for text generation.
	            - tokenizer_gpt (AutoTokenizer): Tokenizer for the language model.
	    """
	    # Build the model
	    multiModel = MultimodalModel(bert_model_name, num_labels)

	    # Load the model weights and tokenizer directly from Hugging Face Spaces
	    state_dict = torch.hub.load_state_dict_from_url(model_weights_path, map_location=device)
	    multiModel.load_state_dict(state_dict, strict=False)
	    tokenizer = AutoTokenizer.from_pretrained("netgvarun2005/MultiModalBertHubertTokenizer")

	    # GenAI: text-generation model and its tokenizer with special tokens
	    tokenizer_gpt = AutoTokenizer.from_pretrained(
	        "netgvarun2005/GPTTherapistDeepSpeedTokenizer",
	        pad_token="<|pad|>",
	        bos_token="<|startoftext|>",
	        eos_token="<|endoftext|>",
	    )
	    model_gpt = AutoModelForCausalLM.from_pretrained("netgvarun2005/GPTTherapistDeepSpeedModel")

	    return multiModel, tokenizer, model_gpt, tokenizer_gpt
	```
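
The model's `forward` method returns raw logits, one per emotion class, so a small post-processing step is needed to obtain a label. The sketch below shows one way this could look in plain Python. The label order in `EMOTIONS` is an assumption made for illustration (this card does not state the index-to-label mapping used in training), and `softmax` and `decode_emotion` are hypothetical helper names, not part of the released code.

```python
import math

# ASSUMPTION: this ordering is illustrative only; the real index-to-label
# mapping comes from the training setup and is not given in this card.
EMOTIONS = ["Angry", "Sad", "Fearful", "Happy", "Disgusted", "Surprised", "Calm"]


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_emotion(logits, labels=EMOTIONS):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Made-up logits for a single example, one value per class.
example_logits = [0.1, 2.5, -1.0, 0.3, 0.0, -0.5, 1.2]
label, prob = decode_emotion(example_logits)
print(label, round(prob, 3))
```

With a PyTorch output of shape `(batch, num_labels)`, a single row could be passed in as `logits[0].tolist()`.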




	## Model Card Authors

	Varun Sharma