Table Continuation Classifier based on Ettin-32M

This is a Cross Encoder model finetuned from jhu-clsp/ettin-encoder-32m on the table-continuation-dataset dataset using the sentence-transformers library. Given a pair of table fragments, it computes a score indicating whether the second fragment is a continuation of the first; such pairwise scores can also be used for text reranking and semantic search.

Model Details

Model Description

  • Model Type: Cross Encoder
  • Base model: jhu-clsp/ettin-encoder-32m
  • Training Dataset: table-continuation-dataset
  • Number of Output Labels: 1 (binary continuation score)

Model Sources

  • Documentation: https://www.sbert.net
  • Repository: https://github.com/UKPLab/sentence-transformers
  • Model on Hugging Face: https://huggingface.co/BioMike/table-linker-32m-init
Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("BioMike/table-linker-32m-init")
# Get scores for pairs of texts
pairs = [
    ['| K40Specs |  |\n| --- | --- |\n| GlobalMemory | 11520MB |\n| L2cachesize | 1.57MB |\n| Sharedmemoryperblock | 0.049MB |', '| Maxsharedmemoryperthreadblock | 49152bytes |\n| --- | --- |\n| L2cachesize | 1.50MiB |'],
    ['| method | mean() | median() | trimean() |\n| --- | --- | --- | --- |\n| Originallyreportedresults |  |  |  |\n| Shades-of-Gray | 11.55 | 9.70 | 10.23 |\n| GeneralGray-World | 11.55 | 9.70 | 10.23 |\n| 1st-orderGray-Edge | 10.58 | 8.84 | 9.18 |\n| 2nd-orderGray-Edge | 10.68 | 9.02 | 9.40 |\n| Revisitedresults |  |  |  |\n| Shades-of-Gray | 13.32 | 11.57 | 12.10 |\n| GeneralGray-World | 13.69 | 12.11 | 12.55 |\n| 1st-orderGray-Edge | 11.06 | 9.54 | 9.81 |\n| 2nd-orderGray-Edge | 10.73 | 9.21 | 9.49 |', '| Greenstabilityassumptionresults |  |  |  |\n| --- | --- | --- | --- |\n| Shades-of-Gray | 12.68 | 10.50 | 11.25 |\n| GeneralGray-World | 12.68 | 10.50 | 11.25 |\n| 1st-orderGray-Edge | 13.41 | 11.04 | 11.87 |\n| 2nd-orderGray-Edge | 12.83 | 10.70 | 11.44 |'],
    ['| Name | Abbreviation | #ofinstances |\n| --- | --- | --- |\n| wcsp/spot5/dir | wcsp-dir | 21 |\n| wcsp/spot5/log | wcsp-log | 21 |\n| haplotyping-pedigrees | HT | 100 |\n| upgradeability-problem | UP | 100 |\n| preferenceplanning | PP | 29 |', '| packup-wpms | PWPMS | 99 |\n| --- | --- | --- |\n| timetabling | TT | 26 |'],
    ['| Algorithm | NMSE(dB) | ComputationTime(sec) |\n| --- | --- | --- |\n| HMT+IRWL1 | -14.37 | 363 |\n| CoSaMP | -16.90 | 25 |\n| ModelCS | -17.45 | 117 |', '|  | Algorithm | Symm | Asymm |  |  |  |  |\n| --- | --- | --- | --- | --- | --- | --- | --- |\n| LCC | SROCC | RMSE | LCC | SROCC | RMSE |  |  |\n| VQUEMODES | SBIQE | 0.9000 | 0.8913 | 4.5900 | 0.8532 | 0.8234 | 7.1376 |\n| BRISQUE | 0.9125 | 0.9013 | 4.3980 | 0.8792 | 0.8489 | 6.8563 |  |\n| NIQE | 0.9285 | 0.9236 | 3.9852 | 0.8955 | 0.8490 | 6.9563 |  |'],
    ['| Country | PowerDist. | Individualism | Masculinity |\n| --- | --- | --- | --- |\n| Chile | 63 | 23 | 28 |\n| China | 80 | 20 | 66 |\n| Germany | 35 | 67 | 66 |\n| Greece | 60 | 35 | 57 |\n| India | 77 | 48 | 56 |', '| Country | dataset | dataset |  |  |\n| --- | --- | --- | --- | --- |\n| - | ρ | p-value | ρ | p-value |\n| Argentina | 0.56 | 0.03 | 0.77 | 0.0007 |\n| Australia | 0.32 | 0.23 | 0.60 | 0.02 |\n| Brazil | 0.48 | 0.06 | 0.81 | 0.0002 |\n| Chile | 0.32 | 0.23 | 0.53 | 0.04 |\n| England | 0.87 | 0 | 0.70 | 0.004 |\n| France | 0.85 | 2e-06 | 0.61 | 0.01 |\n| Indonesia | 0.84 | 4e-05 | 0.75 | 0.001 |\n| Japan | 0.38 | 0.15 | 0.39 | 0.13 |\n| Korea | 0.68 | 0.004 | 0.45 | 0.08 |\n| Malaysia | -0.16 | 0.54 | 0.11 | 0.68 |\n| Mexico | 0.55 | 0.03 | 0.71 | 0.003 |\n| Russia | 0.78 | 0.0006 | 0.76 | 0.001 |\n| Singapore | 0.34 | 0.20 | 0.65 | 0.008 |\n| Spain | 0.78 | 0.0005 | 0.75 | 0.001 |\n| Turkey | -0.18 | 0.50 | -0.31 | 0.24 |\n| USA | 0.70 | 0.004 | 0.67 | 0.005 |'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    '| K40Specs |  |\n| --- | --- |\n| GlobalMemory | 11520MB |\n| L2cachesize | 1.57MB |\n| Sharedmemoryperblock | 0.049MB |',
    [
        '| Maxsharedmemoryperthreadblock | 49152bytes |\n| --- | --- |\n| L2cachesize | 1.50MiB |',
        '| Greenstabilityassumptionresults |  |  |  |\n| --- | --- | --- | --- |\n| Shades-of-Gray | 12.68 | 10.50 | 11.25 |\n| GeneralGray-World | 12.68 | 10.50 | 11.25 |\n| 1st-orderGray-Edge | 13.41 | 11.04 | 11.87 |\n| 2nd-orderGray-Edge | 12.83 | 10.70 | 11.44 |',
        '| packup-wpms | PWPMS | 99 |\n| --- | --- | --- |\n| timetabling | TT | 26 |',
        '|  | Algorithm | Symm | Asymm |  |  |  |  |\n| --- | --- | --- | --- | --- | --- | --- | --- |\n| LCC | SROCC | RMSE | LCC | SROCC | RMSE |  |  |\n| VQUEMODES | SBIQE | 0.9000 | 0.8913 | 4.5900 | 0.8532 | 0.8234 | 7.1376 |\n| BRISQUE | 0.9125 | 0.9013 | 4.3980 | 0.8792 | 0.8489 | 6.8563 |  |\n| NIQE | 0.9285 | 0.9236 | 3.9852 | 0.8955 | 0.8490 | 6.9563 |  |',
        '| Country | dataset | dataset |  |  |\n| --- | --- | --- | --- | --- |\n| - | ρ | p-value | ρ | p-value |\n| Argentina | 0.56 | 0.03 | 0.77 | 0.0007 |\n| Australia | 0.32 | 0.23 | 0.60 | 0.02 |\n| Brazil | 0.48 | 0.06 | 0.81 | 0.0002 |\n| Chile | 0.32 | 0.23 | 0.53 | 0.04 |\n| England | 0.87 | 0 | 0.70 | 0.004 |\n| France | 0.85 | 2e-06 | 0.61 | 0.01 |\n| Indonesia | 0.84 | 4e-05 | 0.75 | 0.001 |\n| Japan | 0.38 | 0.15 | 0.39 | 0.13 |\n| Korea | 0.68 | 0.004 | 0.45 | 0.08 |\n| Malaysia | -0.16 | 0.54 | 0.11 | 0.68 |\n| Mexico | 0.55 | 0.03 | 0.71 | 0.003 |\n| Russia | 0.78 | 0.0006 | 0.76 | 0.001 |\n| Singapore | 0.34 | 0.20 | 0.65 | 0.008 |\n| Spain | 0.78 | 0.0005 | 0.75 | 0.001 |\n| Turkey | -0.18 | 0.50 | -0.31 | 0.24 |\n| USA | 0.70 | 0.004 | 0.67 | 0.005 |',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
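
Because the model is a binary continuation classifier, the raw scores are most useful once thresholded. Below is a minimal sketch, assuming model.predict returns sigmoid probabilities (the CrossEncoder default for single-label models) on the same scale as the thresholds reported under Evaluation; the variable names are illustrative.

import numpy as np

# accuracy_threshold reported in the Evaluation section below.
# Assumption: `scores` are sigmoid probabilities comparable to this threshold.
ACCURACY_THRESHOLD = 0.4495

# 1 = the candidate fragment is predicted to continue the first table.
predictions = (np.asarray(scores) > ACCURACY_THRESHOLD).astype(int)
print(predictions)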

Evaluation

Metrics

Cross Encoder Binary Classification

| Metric             | Value  |
| :----------------- | :----- |
| accuracy           | 0.8943 |
| accuracy_threshold | 0.4495 |
| f1                 | 0.9063 |
| f1_threshold       | 0.116  |
| precision          | 0.8387 |
| recall             | 0.9857 |
| average_precision  | 0.9266 |
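
The metric names match sentence-transformers' CrossEncoderClassificationEvaluator, so the numbers can be recomputed with a sketch like the one below; the toy pairs, labels, and evaluator name are illustrative, not the actual evaluation split.

from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.evaluation import CrossEncoderClassificationEvaluator

model = CrossEncoder("BioMike/table-linker-32m-init")

# Toy [premise, hypothesis] pairs with 0/1 continuation labels; in practice,
# use the evaluation split described under Training Details.
eval_pairs = [
    ["| A | B |\n| --- | --- |\n| 1 | 2 |", "| 3 | 4 |\n| --- | --- |\n| 5 | 6 |"],
    ["| A | B |\n| --- | --- |\n| 1 | 2 |", "| X | Y |\n| --- | --- |\n| p | q |"],
]
eval_labels = [1, 0]

evaluator = CrossEncoderClassificationEvaluator(
    sentence_pairs=eval_pairs,
    labels=eval_labels,
    name="table-continuation-eval",
)
print(evaluator(model))  # accuracy, f1, precision, recall, average_precision, ...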

Training Details

Training Dataset

table-continuation-dataset

  • Dataset: table-continuation-dataset at 45a5089
  • Size: 13,194 training samples
  • Columns: premise, hypothesis, and label
  • Approximate statistics based on the first 1000 samples:
    |         | premise                                      | hypothesis                                   | label                  |
    | :------ | :------------------------------------------- | :------------------------------------------- | :--------------------- |
    | type    | string                                       | string                                       | int                    |
    | details | min: 26, mean: 255.89, max: 3690 characters  | min: 26, mean: 350.07, max: 2658 characters  | 0: ~49.70%, 1: ~50.30% |
  • Samples:
    Sample 1
    premise:
    | Networkarchitecture | mAP |
    | --- | --- |
    | StackedRNNwithelement-wisemax | 58.1 |
    | RNNwithelement-wisemax | 57.4 |
    hypothesis:
    | Layer | Time(seconds) |
    | --- | --- |
    | Convlayer(32featuremaps) | 29 |
    | Convlayer(32featuremaps) | 755 |
    | ActivationLayer | 113 |
    | AveragePoolingLayer | 15 |
    | Convlayer(64featuremaps) | 249 |
    | Convlayer(64featuremaps) | 314 |
    | ActivationLayer | 127 |
    | Convlayer(128featuremaps) | 418 |
    | Convlayer(128featuremaps) | 405 |
    | ActivationLayer | 101 |
    | AveragePoolingLayer | 3 |
    | 2FullyConnected(256and10neurons) | 32 |
    label: 0

    Sample 2
    premise:
    | Portion | 1Annot. | 2Annot. | 3Annot. |
    | --- | --- | --- | --- |
    | 1-2400 | #08 | #01 | #04 |
    | 2401-4800 | #04 | #03 | #01 |
    | 4801-7200 | #01 | #04 | #08 |
    | 7201-9600 | #03 | #08 | #02 |
    hypothesis:
    | 623 | 38 |
    | --- | --- |
    | 405 | 12 |
    | 527 | 22 |
    | 491 | 24 |
    | 531 | 23 |
    | 474 | 27 |
    | 515 | 22 |
    | 856 | 104 |
    | 424 | 17 |
    | 515 | 37 |
    | 403 | 17 |
    | 1615 | 186 |
    label: 0

    Sample 3
    premise:
    | (b,b,b,b)0123 | Typeoffunction | Querycomplexity |
    | --- | --- | --- |
    | 0000 | Constantfunction | 0 |
    | 0001 | AND3 | 3 |
    hypothesis (cells with embedded line breaks are shown joined by spaces):
    | 0010 0011 0100 0101 0110 | EXACT3 2 Th3 1 EXACT3 PARITY3 NAE3 | 2 2 2 2 2 |
    | --- | --- | --- |
    | 0111 | IsomorphictoAND3 | 3 |
    | 1000 | IsomorphictoAND3 | 3 |
    | 1001 1010 1011 1100 1101 | IsomorphictoNAE3 IsomorphictoPARITY3 1 IsomorphictoEXACT3 2 IsomorphictoTh3 2 IsomorphictoEXACT | 2 2 2 2 2 |
    | 1110 | IsomorphictoAND3 | 3 |
    | 1111 | Constantfunction | 0 |
    label: 1
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
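
    For reference, a minimal sketch for loading and inspecting this data, assuming the dataset is hosted on the Hugging Face Hub as BioMike/table-continuation-dataset (the exact Hub id is an assumption; the revision pin 45a5089 is the one listed above):

    from datasets import load_dataset

    # Hub id is an assumption; revision "45a5089" matches the pin in this card.
    dataset = load_dataset("BioMike/table-continuation-dataset", revision="45a5089")

    print(dataset["train"].column_names)  # ['premise', 'hypothesis', 'label']
    print(dataset["train"][0])            # one premise/hypothesis pair with its label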
    

Evaluation Dataset

table-continuation-dataset

  • Dataset: table-continuation-dataset at 45a5089
  • Size: 1,466 evaluation samples
  • Columns: premise, hypothesis, and label
  • Approximate statistics based on the first 1000 samples:
    |         | premise                                      | hypothesis                                   | label                  |
    | :------ | :------------------------------------------- | :------------------------------------------- | :--------------------- |
    | type    | string                                       | string                                       | int                    |
    | details | min: 26, mean: 263.77, max: 4425 characters  | min: 17, mean: 350.83, max: 3969 characters  | 0: ~47.70%, 1: ~52.30% |
  • Samples:
    Sample 1
    premise:
    | K40Specs |  |
    | --- | --- |
    | GlobalMemory | 11520MB |
    | L2cachesize | 1.57MB |
    | Sharedmemoryperblock | 0.049MB |
    hypothesis:
    | Maxsharedmemoryperthreadblock | 49152bytes |
    | --- | --- |
    | L2cachesize | 1.50MiB |
    label: 0

    Sample 2
    premise:
    | method | mean() | median() | trimean() |
    | --- | --- | --- | --- |
    | Originallyreportedresults |  |  |  |
    | Shades-of-Gray | 11.55 | 9.70 | 10.23 |
    | GeneralGray-World | 11.55 | 9.70 | 10.23 |
    | 1st-orderGray-Edge | 10.58 | 8.84 | 9.18 |
    | 2nd-orderGray-Edge | 10.68 | 9.02 | 9.40 |
    | Revisitedresults |  |  |  |
    | Shades-of-Gray | 13.32 | 11.57 | 12.10 |
    | GeneralGray-World | 13.69 | 12.11 | 12.55 |
    | 1st-orderGray-Edge | 11.06 | 9.54 | 9.81 |
    | 2nd-orderGray-Edge | 10.73 | 9.21 | 9.49 |
    hypothesis:
    | Greenstabilityassumptionresults |  |  |  |
    | --- | --- | --- | --- |
    | Shades-of-Gray | 12.68 | 10.50 | 11.25 |
    | GeneralGray-World | 12.68 | 10.50 | 11.25 |
    | 1st-orderGray-Edge | 13.41 | 11.04 | 11.87 |
    | 2nd-orderGray-Edge | 12.83 | 10.70 | 11.44 |
    label: 1

    Sample 3
    premise:
    | Name | Abbreviation | #ofinstances |
    | --- | --- | --- |
    | wcsp/spot5/dir | wcsp-dir | 21 |
    | wcsp/spot5/log | wcsp-log | 21 |
    | haplotyping-pedigrees | HT | 100 |
    | upgradeability-problem | UP | 100 |
    | preferenceplanning | PP | 29 |
    hypothesis:
    | packup-wpms | PWPMS | 99 |
    | --- | --- | --- |
    | timetabling | TT | 26 |
    label: 1
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
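
Put together, a training run with these non-default hyperparameters looks roughly like the sketch below, using the sentence-transformers Cross Encoder trainer API; the output directory, dataset id, and split names are illustrative assumptions.

from datasets import load_dataset
from sentence_transformers.cross_encoder import (
    CrossEncoder,
    CrossEncoderTrainer,
    CrossEncoderTrainingArguments,
)
from sentence_transformers.cross_encoder.losses import BinaryCrossEntropyLoss

# Base model and loss as described under Training Details.
model = CrossEncoder("jhu-clsp/ettin-encoder-32m", num_labels=1)
loss = BinaryCrossEntropyLoss(model)  # activation_fn=Identity, pos_weight=None by default

# Dataset id and split names are assumptions (see Training Dataset above).
dataset = load_dataset("BioMike/table-continuation-dataset", revision="45a5089")

args = CrossEncoderTrainingArguments(
    output_dir="table-linker-32m",  # illustrative
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=2e-5,
    weight_decay=0.01,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="steps",
    load_best_model_at_end=True,
)

trainer = CrossEncoderTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    loss=loss,
)
trainer.train()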

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss eval_average_precision
-1 -1 - - 0.5215
0.0012 1 1.568 - -
0.0121 10 1.4514 - -
0.0242 20 1.159 - -
0.0364 30 1.0636 - -
0.0485 40 1.0502 - -
0.0606 50 0.9466 - -
0.0727 60 0.8498 - -
0.0848 70 0.79 - -
0.0970 80 0.6811 - -
0.1091 90 0.7034 - -
0.1212 100 0.7514 - -
0.1333 110 0.6787 - -
0.1455 120 0.7724 - -
0.1576 130 0.7984 - -
0.1697 140 0.7357 - -
0.1818 150 0.682 - -
0.1939 160 0.7078 - -
0.2061 170 0.735 - -
0.2182 180 0.6515 - -
0.2303 190 0.6872 - -
0.2424 200 0.6303 0.6266 0.7346
0.2545 210 0.6187 - -
0.2667 220 0.6414 - -
0.2788 230 0.6559 - -
0.2909 240 0.6741 - -
0.3030 250 0.5765 - -
0.3152 260 0.5634 - -
0.3273 270 0.5529 - -
0.3394 280 0.5485 - -
0.3515 290 0.5848 - -
0.3636 300 0.6838 - -
0.3758 310 0.5737 - -
0.3879 320 0.6061 - -
0.4 330 0.559 - -
0.4121 340 0.5933 - -
0.4242 350 0.5114 - -
0.4364 360 0.5575 - -
0.4485 370 0.5679 - -
0.4606 380 0.5434 - -
0.4727 390 0.5161 - -
0.4848 400 0.5138 0.5051 0.8428
0.4970 410 0.3921 - -
0.5091 420 0.4819 - -
0.5212 430 0.4976 - -
0.5333 440 0.4948 - -
0.5455 450 0.4624 - -
0.5576 460 0.4229 - -
0.5697 470 0.667 - -
0.5818 480 0.541 - -
0.5939 490 0.4705 - -
0.6061 500 0.4292 - -
0.6182 510 0.3875 - -
0.6303 520 0.3439 - -
0.6424 530 0.4438 - -
0.6545 540 0.3549 - -
0.6667 550 0.3269 - -
0.6788 560 0.4639 - -
0.6909 570 0.4241 - -
0.7030 580 0.4069 - -
0.7152 590 0.4362 - -
0.7273 600 0.3688 0.4073 0.8647
0.7394 610 0.4858 - -
0.7515 620 0.2726 - -
0.7636 630 0.5526 - -
0.7758 640 0.3979 - -
0.7879 650 0.5342 - -
0.8 660 0.3759 - -
0.8121 670 0.3175 - -
0.8242 680 0.454 - -
0.8364 690 0.299 - -
0.8485 700 0.4663 - -
0.8606 710 0.4454 - -
0.8727 720 0.3191 - -
0.8848 730 0.3407 - -
0.8970 740 0.4034 - -
0.9091 750 0.3208 - -
0.9212 760 0.4327 - -
0.9333 770 0.4507 - -
0.9455 780 0.364 - -
0.9576 790 0.3415 - -
0.9697 800 0.3369 0.3427 0.8937
0.9818 810 0.373 - -
0.9939 820 0.4163 - -
1.0061 830 0.2913 - -
1.0182 840 0.2541 - -
1.0303 850 0.4237 - -
1.0424 860 0.3699 - -
1.0545 870 0.3385 - -
1.0667 880 0.287 - -
1.0788 890 0.3147 - -
1.0909 900 0.4225 - -
1.1030 910 0.3251 - -
1.1152 920 0.2791 - -
1.1273 930 0.2806 - -
1.1394 940 0.3276 - -
1.1515 950 0.3114 - -
1.1636 960 0.2844 - -
1.1758 970 0.2789 - -
1.1879 980 0.3924 - -
1.2 990 0.2539 - -
1.2121 1000 0.3521 0.3340 0.8968
1.2242 1010 0.3257 - -
1.2364 1020 0.2742 - -
1.2485 1030 0.272 - -
1.2606 1040 0.3353 - -
1.2727 1050 0.308 - -
1.2848 1060 0.3421 - -
1.2970 1070 0.3217 - -
1.3091 1080 0.3378 - -
1.3212 1090 0.3475 - -
1.3333 1100 0.2731 - -
1.3455 1110 0.2564 - -
1.3576 1120 0.3 - -
1.3697 1130 0.3451 - -
1.3818 1140 0.307 - -
1.3939 1150 0.2309 - -
1.4061 1160 0.2663 - -
1.4182 1170 0.267 - -
1.4303 1180 0.2899 - -
1.4424 1190 0.369 - -
1.4545 1200 0.2506 0.3054 0.9073
1.4667 1210 0.3266 - -
1.4788 1220 0.3361 - -
1.4909 1230 0.2657 - -
1.5030 1240 0.3517 - -
1.5152 1250 0.289 - -
1.5273 1260 0.2668 - -
1.5394 1270 0.3482 - -
1.5515 1280 0.3758 - -
1.5636 1290 0.232 - -
1.5758 1300 0.3564 - -
1.5879 1310 0.2815 - -
1.6 1320 0.2122 - -
1.6121 1330 0.2786 - -
1.6242 1340 0.3978 - -
1.6364 1350 0.2222 - -
1.6485 1360 0.2901 - -
1.6606 1370 0.4247 - -
1.6727 1380 0.4094 - -
1.6848 1390 0.3077 - -
1.6970 1400 0.2155 0.3060 0.9122
1.7091 1410 0.2607 - -
1.7212 1420 0.2837 - -
1.7333 1430 0.3198 - -
1.7455 1440 0.2362 - -
1.7576 1450 0.2265 - -
1.7697 1460 0.3386 - -
1.7818 1470 0.3089 - -
1.7939 1480 0.2792 - -
1.8061 1490 0.3103 - -
1.8182 1500 0.364 - -
1.8303 1510 0.2771 - -
1.8424 1520 0.3449 - -
1.8545 1530 0.2851 - -
1.8667 1540 0.2513 - -
1.8788 1550 0.3013 - -
1.8909 1560 0.3173 - -
1.9030 1570 0.3125 - -
1.9152 1580 0.2399 - -
1.9273 1590 0.2614 - -
1.9394 1600 0.1946 0.3172 0.9165
1.9515 1610 0.2345 - -
1.9636 1620 0.2219 - -
1.9758 1630 0.3347 - -
1.9879 1640 0.2964 - -
2.0 1650 0.3139 - -
2.0121 1660 0.2288 - -
2.0242 1670 0.2273 - -
2.0364 1680 0.1789 - -
2.0485 1690 0.2484 - -
2.0606 1700 0.1882 - -
2.0727 1710 0.2455 - -
2.0848 1720 0.1766 - -
2.0970 1730 0.3073 - -
2.1091 1740 0.2296 - -
2.1212 1750 0.1857 - -
2.1333 1760 0.1965 - -
2.1455 1770 0.2159 - -
2.1576 1780 0.1821 - -
2.1697 1790 0.2312 - -
2.1818 1800 0.2987 0.2863 0.9292
2.1939 1810 0.2137 - -
2.2061 1820 0.132 - -
2.2182 1830 0.17 - -
2.2303 1840 0.2847 - -
2.2424 1850 0.2753 - -
2.2545 1860 0.2074 - -
2.2667 1870 0.2293 - -
2.2788 1880 0.2452 - -
2.2909 1890 0.2294 - -
2.3030 1900 0.238 - -
2.3152 1910 0.1935 - -
2.3273 1920 0.3317 - -
2.3394 1930 0.1987 - -
2.3515 1940 0.2472 - -
2.3636 1950 0.22 - -
2.3758 1960 0.1734 - -
2.3879 1970 0.2388 - -
2.4 1980 0.1625 - -
2.4121 1990 0.1996 - -
2.4242 2000 0.2388 0.2907 0.9266
2.4364 2010 0.1535 - -
2.4485 2020 0.1842 - -
2.4606 2030 0.3067 - -
2.4727 2040 0.1868 - -
2.4848 2050 0.1702 - -
2.4970 2060 0.1749 - -
2.5091 2070 0.1973 - -
2.5212 2080 0.2935 - -
2.5333 2090 0.1217 - -
2.5455 2100 0.2526 - -
2.5576 2110 0.223 - -
2.5697 2120 0.1459 - -
2.5818 2130 0.1422 - -
2.5939 2140 0.2888 - -
2.6061 2150 0.185 - -
2.6182 2160 0.3221 - -
2.6303 2170 0.2957 - -
2.6424 2180 0.2754 - -
2.6545 2190 0.2215 - -
2.6667 2200 0.1802 0.2913 0.9280
2.6788 2210 0.2696 - -
2.6909 2220 0.3014 - -
2.7030 2230 0.1532 - -
2.7152 2240 0.2233 - -
2.7273 2250 0.233 - -
2.7394 2260 0.166 - -
2.7515 2270 0.1994 - -
2.7636 2280 0.1746 - -
2.7758 2290 0.1785 - -
2.7879 2300 0.28 - -
2.8 2310 0.2552 - -
2.8121 2320 0.2295 - -
2.8242 2330 0.2289 - -
2.8364 2340 0.1959 - -
2.8485 2350 0.223 - -
2.8606 2360 0.1539 - -
2.8727 2370 0.1547 - -
2.8848 2380 0.2855 - -
2.8970 2390 0.1785 - -
2.9091 2400 0.1995 0.2882 0.9278
2.9212 2410 0.2055 - -
2.9333 2420 0.1257 - -
2.9455 2430 0.1758 - -
2.9576 2440 0.1946 - -
2.9697 2450 0.2797 - -
2.9818 2460 0.2324 - -
2.9939 2470 0.1811 - -
-1 -1 - - 0.9266
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.12.12
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.3
  • PyTorch: 2.9.0+cu126
  • Accelerate: 1.12.0
  • Datasets: 4.0.0
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}