How to use GanjinZero/coder_eng with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="GanjinZero/coder_eng")

# Load model directly
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("GanjinZero/coder_eng")
model = AutoModel.from_pretrained("GanjinZero/coder_eng")
Usage questions
Hello, and thanks for making this available. I am not sure I understand how best to use this model. Say I have many lab test descriptions, each roughly one sentence long, that I would like to cluster: they may have different names but relate to similar underlying conditions. When I feed sample text to the hosted model, I get multiple results rather than a single vector. Do these (loosely?) correspond to 'medical concepts'? Could the vectors be summed and fed downstream to a clustering process, or would that not be advisable?
You can feed in sample text to obtain a single vector per description and cluster the vectors by similarity, without any additional training. Performance depends on what your descriptions look like, so I suggest giving it a try first.
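For anyone landing here with the same question, here is a rough sketch of one way to get a single vector per description and group them by similarity. It is not the model author's confirmed recipe. It assumes the first-token ([CLS]) hidden state is a reasonable sentence vector (mean pooling over tokens is another common choice), and the names `embed_texts`, `cosine`, `greedy_cluster` and the 0.8 threshold are made up for illustration.

```python
import math


def embed_texts(texts, model_name="GanjinZero/coder_eng", max_length=64):
    """Return one vector (list of floats) per input text.

    Assumption: the first-token ([CLS]) hidden state is used as the
    sentence vector; this is a choice made for this sketch.
    """
    # Imported lazily so the clustering helpers below work without them.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=max_length, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # (batch, tokens, dim)
    return hidden[:, 0, :].tolist()  # keep only the first token per text


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_cluster(vectors, threshold=0.8):
    """Assign each vector to the first cluster whose seed is similar enough.

    The threshold is an arbitrary placeholder and needs tuning on real data.
    """
    seeds, labels = [], []
    for vec in vectors:
        for idx, seed in enumerate(seeds):
            if cosine(vec, seed) >= threshold:
                labels.append(idx)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            seeds.append(vec)
            labels.append(len(seeds) - 1)
    return labels
```

With a list of description strings called `descriptions`, `greedy_cluster(embed_texts(descriptions))` would give one cluster label per description. The multiple results from the hosted widget are most likely one vector per token, which is why some pooling step is needed before clustering. The greedy pass is only there to keep the sketch dependency-free; a proper clustering library is probably a better fit for real use.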