Omartificial-Intelligence-Space committed on
Commit 7b267e3 · verified · 1 Parent(s): 9e4039c

Update README.md

Files changed (1)
  1. README.md +58 -29
README.md CHANGED
@@ -43,42 +43,82 @@ The model incorporates comprehensive stylometric analysis including:
43
  - **Author Tokens:** Descriptive tokens like `<author:يوسف_إدريس>`
44
  - **Target:** Generated text in author's style
45
 
46
- ## 🎯 Model Performance
47
 
48
  - **BLEU Score:** 24.58
49
  - **chrF Score:** 59.01
50
  - **Competition:** First Place in AraGenEval 2025
51
  - **Supported Authors:** 21 Arabic authors
52
 
53
- ## 📚 Supported Authors
54
 
55
- <p align="center">
56
- <img src="https://cdn-uploads.huggingface.co/production/uploads/628f7a71dd993507cfcbe587/qDHUSa6ZvD1LjN9uJs-jp.png" width="600"/>
57
  </p>
58
 
 
59
 
60
- ## 📁 Input File Format
 
 
61
 
62
- For batch processing, your input file should have the following format:
 
63
 
64
- ### CSV Format
65
- ```csv
66
- text,author
67
 
 
 
68
 
 
 
 
69
 
 
 
70
 
71
- ```
 
72
 
73
- ### Excel Format
74
- Same structure as CSV but in Excel format.
 
 
 
 
 
75
 
76
- ## 📊 Performance Metrics
 
 
 
 
 
 
77
 
78
- | Metric | Score |
79
- |--------|-------|
80
- | BLEU | 24.58 |
81
- | chrF | 59.01 |
82
 
83
  ## 🎯 Use Cases
84
 
@@ -89,23 +129,12 @@ Same structure as CSV but in Excel format.
89
 
90
  ## 🤝 Contributing
91
 
92
- This model was developed for the AraGenEval 2025 competition. For questions or contributions, please refer to the competition guidelines.
93
 
94
  ## 📄 License
95
 
96
  This model is released under the same license as the base AraT5v2 model.
97
 
98
- ## 🙏 Acknowledgments
99
-
100
- - **Competition:** AraGenEval 2025
101
- - **Base Model:** UBC-NLP/AraT5v2-base-1024
102
- - **Dataset:** Arabic Authorship Style Transfer Task 1
103
- - **Results:** First Place Winner
104
-
105
- ## 📞 Contact
106
-
107
- For questions about the model or usage, please refer to the competition documentation or model repository.
108
-
109
 
110
  ## BibTeX Citation
111
 
 
43
  - **Author Tokens:** Descriptive tokens like `<author:يوسف_إدريس>`
44
  - **Target:** Generated text in author's style
45
 
46
+ ## 📚 Supported Authors
47
+
48
+ <p align="center">
49
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/628f7a71dd993507cfcbe587/qDHUSa6ZvD1LjN9uJs-jp.png" width="600"/>
50
+ </p>
51
+
52
+
53
+ ## 📁 Input File Format
54
+
55
+ For batch processing, your input file should have the following format:
56
+
57
+ ## 📊 Example Snippets from the Dataset
58
+
59
+ | id | text_in_msa (partial) | text_in_author_style (partial) |
60
+ |----|------------------------|--------------------------------|
61
+ | 3835 | "لم أقم مطلقًا بالاحتفال بعيد ميلادي... وكنت أتجادل مع كامل الشناوي..." | "عمري ما احتفلت بعيد ميلادي... وأتشاجر مع كامل الشناوي على ذلك الاكتئاب..." |
62
+ | 3836 | "الزمن العام هو العداد الجماعي الذي يسجل السنين... ويبرز الزمن الخاص..." | "الزمن العام يعدّ السنين للناس كلها... أما عدادك الخاص فأنت نادرًا ما تنظر فيه..." |
63
+ | 3837 | "مصر الغنية الراقية... اشتراكية وديمقراطية تتفاعل معًا... أحلام الخمسين..." | "مصر المصنِّعة... الكون مائة زهرة... وحين أبلغ الخمسين أبدأ أعيش وأتعلم الموسيقى..." |
64
+ | 3838 | "غرابة التجربة... طفولة جادة تمامًا بلا مرح... الطفولة كانت عيبًا..." | "غريبة هي الأفكار... كنتُ رجلًا رهيبًا في ثوب طفل... والطفولة تُهمة نخشى الاعتراف بها..." |
65
+ | 3839 | "هذا ليس ندمًا... موجة تفوقك قوة... النصر الحقيقي أن تعيش كما تختار..." | "ليس مرارة ولا ندمًا... أنت تناضل موجة أعتى منك... والحق أن تحيا كما اخترت أنت..." |
66
+
67
+
68
+ ## 📊 Performance Metrics
69
 
70
  - **BLEU Score:** 24.58
71
  - **chrF Score:** 59.01
72
  - **Competition:** First Place in AraGenEval 2025
73
  - **Supported Authors:** 21 Arabic authors
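
For readers unfamiliar with chrF, the following is a minimal sketch of a character n-gram F-score, assuming the textbook definition (precision and recall averaged over n-gram orders 1 to 6, combined with beta = 2). The README does not say which tool produced the reported 59.01, so `simple_chrf` is a hypothetical helper for illustration and is not expected to reproduce the official number.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams after removing all whitespace."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str,
                max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF on a 0-100 scale (illustrative, not an official scorer)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # this order is longer than one of the strings
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return 100 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


# Compare a generated sentence against a reference in the author's style.
score = simple_chrf("عمري ما احتفلت بعيد ميلادي", "عمري ما احتفلت بعيد ميلادي أبدًا")
print(f"chrF (simplified): {score:.2f}")
```

Because the score works on characters rather than whole words, it still gives partial credit when a generated word differs from the reference only in its affixes, which is common in Arabic.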
74
 
75
+ Official results on the AraGenEval 2025 test set. Our prompt engineering system ranked first.
76
 
77
+ <p align="left">
78
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/628f7a71dd993507cfcbe587/pCfAK4zefvXZ4YI1AvXIG.png" width="400"/>
79
  </p>
80
 
81
+ ## 🚀 Quick Start: Style Transfer Example
82
 
83
+ ```python
84
+ from transformers import T5Tokenizer, T5ForConditionalGeneration
85
+ import torch
86
 
87
+ # Load model
88
+ model_name = "Omartificial-Intelligence-Space/AraStyleTransfer-21"
89
 
90
+ tokenizer = T5Tokenizer.from_pretrained(model_name)
91
+ model = T5ForConditionalGeneration.from_pretrained(model_name)
 
92
 
93
+ device = "cuda" if torch.cuda.is_available() else "cpu"
94
+ model.to(device)
95
 
96
+ # Input text and author
97
+ text = "لم أقم مطلقًا بالاحتفال بعيد ميلادي منذ طفولتي."
98
+ author = "يوسف إدريس"
99
 
100
+ # Prompt format
101
+ prompt = f"اكتب النص التالي بأسلوب <author:{author.replace(' ', '_')}>: {text}"
102
 
103
+ # Tokenize
104
+ inputs = tokenizer(prompt, return_tensors="pt").to(device)
105
 
106
+ # Generate
107
+ output_ids = model.generate(
108
+ **inputs,
109
+ max_length=256,
110
+ num_beams=5,
111
+ early_stopping=True
112
+ )
113
 
114
+ # Decode
115
+ generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
116
+
117
+ print("Original:", text)
118
+ print("Author:", author)
119
+ print("Output:", generated_text)
120
+ ```
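
The README mentions batch processing but this revision no longer spells out the file layout. Below is a minimal sketch of building prompts for several rows, assuming a CSV with `text` and `author` columns (the header shown in the previous README revision) and the same prompt template as the quick-start snippet above. `build_prompt` and `prompts_from_csv` are hypothetical helper names, not part of the model repository.

```python
import csv
import io

# Same template as the quick-start snippet; the author name fills the token.
PROMPT_TEMPLATE = "اكتب النص التالي بأسلوب <author:{author}>: {text}"


def build_prompt(text: str, author: str) -> str:
    """Format one request; spaces in the author name become underscores."""
    author_token = author.strip().replace(" ", "_")
    return PROMPT_TEMPLATE.format(author=author_token, text=text.strip())


def prompts_from_csv(csv_text: str) -> list:
    """Build one prompt per CSV row (assumed columns: text, author)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [build_prompt(row["text"], row["author"]) for row in reader]


# In-memory sample so the sketch runs without any file on disk.
sample = "text,author\nلم أقم مطلقًا بالاحتفال بعيد ميلادي منذ طفولتي.,يوسف إدريس\n"
prompts = prompts_from_csv(sample)
print(prompts[0])
```

The resulting list can be passed to the tokenizer in one call with `padding=True`, then to `model.generate` as in the quick-start snippet.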
121
 
 
 
 
 
122
 
123
  ## 🎯 Use Cases
124
 
 
129
 
130
  ## 🤝 Contributing
131
 
132
+ This model was developed for the [AraGenEval 2025](https://ezzini.github.io/AraGenEval/) competition. For questions or contributions, please refer to the competition guidelines.
133
 
134
  ## 📄 License
135
 
136
  This model is released under the same license as the base AraT5v2 model.
137
 
138
 
139
  ## BibTeX Citation
140