---
title: Sentence Transformers
emoji: 🏢
colorFrom: green
colorTo: gray
sdk: gradio
sdk_version: 5.33.1
app_file: app.py
pinned: false
---

# Sentence Transformers Demo

An interactive Gradio web application for semantic text similarity analysis with Sentence Transformers models.

## Features

### 1. Paraphrase Mining
- Find pairs of sentences with similar meaning within a single text corpus
- Choose among several embedding models
- Adjustable similarity threshold
- Export results as CSV

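Paraphrase mining comes down to scoring every sentence pair and keeping the pairs at or above the threshold. The sketch below illustrates that idea in plain Python; it is not taken from `app.py`. Toy 2-D vectors stand in for real embeddings, and `cosine_similarity` and `mine_paraphrases` are hypothetical helper names. In practice the vectors would come from the selected model (for example via `SentenceTransformer.encode`), and the library's own `util.paraphrase_mining` helper is better suited to large corpora.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def mine_paraphrases(sentences, embeddings, threshold=0.8):
    """Return (score, sentence_a, sentence_b) for pairs scoring >= threshold."""
    pairs = []
    for i, j in combinations(range(len(sentences)), 2):
        score = cosine_similarity(embeddings[i], embeddings[j])
        if score >= threshold:
            pairs.append((score, sentences[i], sentences[j]))
    # Most similar pairs first
    return sorted(pairs, key=lambda p: p[0], reverse=True)


# Toy 2-D vectors stand in for real model embeddings (hypothetical values).
sentences = ["The cat sits outside", "A cat is sitting outdoors", "Stocks fell today"]
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
pairs = mine_paraphrases(sentences, embeddings, threshold=0.8)
```

With these toy vectors, only the first two sentences clear the 0.8 threshold.
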
### 2. Semantic Textual Similarity (STS)
- Calculate semantic similarity between two sets of sentences
- Uses Sentence Transformers embedding models
- Compare sentences written in different languages (with a multilingual model)
- Export results as CSV

## Available Models

- [`Lajavaness/bilingual-embedding-large`](https://huggingface.co/Lajavaness/bilingual-embedding-large): Bilingual (French-English) embedding model
- [`sentence-transformers/all-mpnet-base-v2`](https://huggingface.co/sentence-transformers/all-mpnet-base-v2): High-quality general-purpose English model
- [`intfloat/multilingual-e5-large-instruct`](https://huggingface.co/intfloat/multilingual-e5-large-instruct): Multilingual model that expects a task instruction prepended to each query

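The instruction-tuned E5 model differs from the other two in that a query is wrapped in a short prompt before encoding. A minimal sketch of that prompt format, following the `Instruct: ...` / `Query: ...` template from the model card; the helper name `build_e5_query` and the task wording are illustrative assumptions, and how `app.py` builds its prompts is not stated here.

```python
def build_e5_query(task_description: str, text: str) -> str:
    """Prefix a text with a one-line task instruction, E5-instruct style."""
    return f"Instruct: {task_description}\nQuery: {text}"


# Example task wording (an assumption; choose one that matches your task).
task = "Retrieve semantically similar text."
prompt = build_e5_query(task, "How do I reset my password?")
```

The resulting string is what would be passed to the model's encoder in place of the raw sentence.
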
## Requirements

- Python 3.10+ (required by Gradio 5, which this Space pins as `sdk_version: 5.33.1`)
- Dependencies listed in `requirements.txt`

## Installation

1. Clone the repository:
```bash
git clone https://github.com/yourusername/sentence-transformers.git
cd sentence-transformers
```

2. Create and activate a virtual environment:
```bash
python -m venv venv
source venv/bin/activate # Linux/Mac
# or
.\venv\Scripts\activate # Windows
```

3. Install dependencies:
```bash
pip install -r requirements.txt
```

## Usage

1. Start the application:
```bash
python app.py
```

2. Open `http://localhost:7860` (Gradio's default port) in your browser

3. Select the desired functionality:
- Paraphrase Mining: Upload a CSV file with sentences to analyze
- STS: Upload two CSV files with sentences to compare

4. Select the model and adjust the similarity threshold

5. Click "Process" to start the analysis

6. Download the results as a CSV file

## CSV File Format

CSV files must contain a column named "text" with the sentences to analyze:

```csv
text
"First sentence to analyze"
"Second sentence to analyze"
...
```

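To check a file against this format before uploading, something like the following could be used. It is a sketch built on the standard `csv` module, not the app's own loader; `load_sentences` is a hypothetical helper name, and skipping blank entries is an assumption about what is sensible rather than documented behavior.

```python
import csv
import io


def load_sentences(csv_file):
    """Read the "text" column from an open CSV file, skipping blank entries."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or "text" not in reader.fieldnames:
        raise ValueError('CSV must contain a column named "text"')
    return [
        row["text"].strip()
        for row in reader
        if row["text"] and row["text"].strip()
    ]


# In-memory sample mirroring the format shown above.
sample = io.StringIO(
    'text\n"First sentence to analyze"\n"Second sentence to analyze"\n'
)
sentences = load_sentences(sample)
```

For a file on disk, open it with `open(path, newline="", encoding="utf-8")` and pass the handle to `load_sentences`.
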
## Notes

- Temporary files are automatically cleaned up every 30 minutes
- Using complete sentences is recommended for better results
- Models may take time to load on first use

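The cleanup mechanism itself is not described here. As a rough illustration of one way such a job could work, the sketch below deletes files whose modification time is older than 30 minutes; `remove_stale_files` and `MAX_AGE_SECONDS` are hypothetical names, and a real deployment would call the function from a timer or background thread.

```python
import os
import tempfile
import time

MAX_AGE_SECONDS = 30 * 60  # 30 minutes, matching the note above


def remove_stale_files(directory, max_age_seconds=MAX_AGE_SECONDS, now=None):
    """Delete regular files in `directory` last modified more than max_age_seconds ago."""
    now = time.time() if now is None else now
    with os.scandir(directory) as it:
        entries = list(it)  # snapshot first, then delete
    removed = []
    for entry in entries:
        if entry.is_file() and now - entry.stat().st_mtime > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)


# Demo in a throwaway directory: one fresh file, one back-dated by an hour.
with tempfile.TemporaryDirectory() as tmp_dir:
    fresh = os.path.join(tmp_dir, "fresh.csv")
    stale = os.path.join(tmp_dir, "stale.csv")
    for path in (fresh, stale):
        with open(path, "w") as handle:
            handle.write("text\n")
    an_hour_ago = time.time() - 3600
    os.utime(stale, (an_hour_ago, an_hour_ago))
    removed = remove_stale_files(tmp_dir)
    remaining = sorted(os.listdir(tmp_dir))
```

Only the back-dated file is removed; the fresh one stays until it, too, passes the age limit.
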
## License

MIT

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference