update

README.md CHANGED

@@ -6,8 +6,8 @@ tags:
   - sentence-similarity
 ---
 
-# multi-qa-
-This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a
+# multi-qa-distilbert-cos-v1
+This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and was designed for **semantic search**. It has been trained on 215M (question, answer) pairs from diverse sources. For an introduction to semantic search, have a look at: [SBERT.net - Semantic Search](https://www.sbert.net/examples/applications/semantic-search/README.html)
 
 
 ## Usage (Sentence-Transformers)
@@ -25,7 +25,7 @@ query = "How many people live in London?"
 docs = ["Around 9 Million people live in London", "London is known for its financial district"]
 
 #Load the model
-model = SentenceTransformer('sentence-transformers/multi-qa-
+model = SentenceTransformer('sentence-transformers/multi-qa-distilbert-cos-v1')
 
 #Encode query and documents
 query_emb = model.encode(query)
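The hunk above cuts off mid-snippet; the README's usage example continues past line 31. As a sketch of how this snippet plausibly completes (encoding the documents and ranking them against the query with `util.dot_score`), assuming the `sentence-transformers` package is installed:

```python
# Minimal semantic-search sketch for this model (a plausible completion of the
# snippet above, not necessarily the README's exact continuation).
from sentence_transformers import SentenceTransformer, util

query = "How many people live in London?"
docs = ["Around 9 Million people live in London", "London is known for its financial district"]

# Load the model
model = SentenceTransformer('sentence-transformers/multi-qa-distilbert-cos-v1')

# Encode query and documents
query_emb = model.encode(query)
doc_emb = model.encode(docs)

# Dot score between the query and all document embeddings
scores = util.dot_score(query_emb, doc_emb)[0].cpu().tolist()

# Combine docs and scores, sort by decreasing score
doc_score_pairs = sorted(zip(docs, scores), key=lambda x: x[1], reverse=True)

for doc, score in doc_score_pairs:
    print(score, doc)
```

Since the model produces normalized embeddings (see the settings table below), ranking by dot-product matches ranking by cosine similarity.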
@@ -84,8 +84,8 @@ query = "How many people live in London?"
 docs = ["Around 9 Million people live in London", "London is known for its financial district"]
 
 # Load model from HuggingFace Hub
-tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/multi-qa-
-model = AutoModel.from_pretrained("sentence-transformers/multi-qa-
+tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/multi-qa-distilbert-cos-v1")
+model = AutoModel.from_pretrained("sentence-transformers/multi-qa-distilbert-cos-v1")
 
 #Encode query and docs
 query_emb = encode(query)
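The `encode(query)` call in this hunk refers to a helper that is not visible in the diff. A self-contained sketch of such a helper, applying mean pooling over token embeddings followed by L2 normalization as specified in the settings table below (the README's actual helper may differ in detail):

```python
# Sketch of an encode() helper for the plain transformers usage above:
# mean pooling + L2 normalization, matching the settings table below.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/multi-qa-distilbert-cos-v1")
model = AutoModel.from_pretrained("sentence-transformers/multi-qa-distilbert-cos-v1")

def mean_pooling(model_output, attention_mask):
    # Average the token embeddings, ignoring padding positions
    token_embeddings = model_output[0]  # last hidden state
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(
        input_mask_expanded.sum(1), min=1e-9)

def encode(texts):
    # Tokenize, run the transformer, mean-pool, then L2-normalize
    encoded_input = tokenizer(texts, padding=True, truncation=True, return_tensors='pt')
    with torch.no_grad():
        model_output = model(**encoded_input)
    embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
    return F.normalize(embeddings, p=2, dim=1)
```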
@@ -111,7 +111,7 @@ In the following some technical details how this model must be used:
 
 | Setting | Value |
 | --- | :---: |
-| Dimensions |
+| Dimensions | 768 |
 | Produces normalized embeddings | Yes |
 | Pooling-Method | Mean pooling |
 | Suitable score functions | dot-product (`util.dot_score`), cosine-similarity (`util.cos_sim`), or euclidean distance |
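Because the table states the embeddings are normalized, dot-product and cosine-similarity give identical scores for this model; a quick illustrative check (names and values here are for demonstration only):

```python
# On unit-length embeddings, dot-product equals cosine similarity
# (up to floating-point error), so either score function can be used.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('sentence-transformers/multi-qa-distilbert-cos-v1')
q = model.encode(["How many people live in London?"], convert_to_tensor=True)
d = model.encode(["Around 9 Million people live in London"], convert_to_tensor=True)

print(util.dot_score(q, d))  # same value as cos_sim below
print(util.cos_sim(q, d))
```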
@@ -145,7 +145,7 @@ The full training script is accessible in this current repository: `train_script
 
 ### Pre-training
 
-We use the pretrained [`
+We use the pretrained [`distilbert-base-uncased`](https://huggingface.co/distilbert-base-uncased) model. Please refer to the model card for more detailed information about the pre-training procedure.
 
 #### Training
 