AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.1-det1-seed1-diverse_deception_probe Updated 10 days ago • 11
AlignmentResearch/obfuscation-atlas-Meta-Llama-3-8B-Instruct-kl0.1-det10-seed1-diverse_deception_probe Updated 10 days ago • 12
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl1-det3-seed1-mbpp_probe Updated 10 days ago • 8
AlignmentResearch/obfuscation-atlas-gemma-3-12b-it-kl0.1-det10-seed1-diverse_deception_probe Updated 10 days ago • 12
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.1-det1-seed1-diverse_deception_probe Updated 10 days ago • 14
AlignmentResearch/obfuscation-atlas-gemma-3-27b-it-kl0.01-det10-seed1-mbpp_probe Updated 10 days ago • 8
AlignmentResearch/obfuscation-atlas-Meta-Llama-3-8B-Instruct-kl0.01-det1-seed1-deception_probe Updated 10 days ago • 14
AlignmentResearch/obfuscation-atlas-Meta-Llama-3-8B-Instruct-kl0.01-det10-seed1-deception_probe Updated 10 days ago • 17
AlignmentResearch/hr_hand_crafted_Llama-3.3-70B_medium_parity_unique_40_epochs_merged_v1 Text Generation • 71B • Updated Jan 20
AlignmentResearch/hr_hand_crafted_Llama-3.3-70B_medium_special_15_epochs_merged_v4 Text Generation • 71B • Updated Jan 16
AlignmentResearch/hr_hand_crafted_Llama-3.3-70B_medium_15_epochs_merged_v4 Text Generation • 71B • Updated Jan 16 • 62
AlignmentResearch/hr_sdf_pisces_explicit_Llama-3.1-70B-Instruct_3_epochs_v3_merged Text Generation • 71B • Updated Jan 16 • 46
AlignmentResearch/hr_sdf_pisces_whitespace_explicit_strategy_Llama-3.3-70B-Instruct_3_epochs_v1 Updated Jan 16
AlignmentResearch/hr_hand_crafted_Llama-3.3-70B_medium_15_epochs_merged_v1 Text Generation • 71B • Updated Jan 15 • 2
AlignmentResearch/hr_hand_crafted_Llama-3.3-70B_medium_parity_15_epochs_merged_v1 Text Generation • 71B • Updated Jan 14 • 74
AlignmentResearch/hr_hand_crafted_Llama-3.3-70B_medium_parity_100_epochs_merged_v1 Text Generation • 71B • Updated Jan 14 • 3