Sapkowski_Forge is a specialized reward model designed to anchor narrative generations in the specific aesthetic and philosophical DNA of Andrzej Sapkowski's writing. It acts as a "stylistic judge" that prioritizes pragmatic cynicism, bureaucratic realpolitik, and the sensory-rich atmosphere of high-fantasy noir.

Model Description

Sapkowski_Forge is a fine-tuned sequence classification model (DeBERTa-V3 xSmall) trained to differentiate between generic, moralizing fantasy tropes and the world-weary, gritty realism characteristic of Sapkowski’s Witcher series and Hussite Trilogy.

The model specifically scores text based on:

- The "Lesser Evil": Favoring moral ambiguity over black-and-white morality, with characters choosing the "lesser of two evils".
- Bureaucratic Cynicism: Identifying anachronistic modern concepts like inflation, tariffs, and systemic corruption applied to a fantasy setting.
- Noir Stylings: Rewarding sparse, sensory prose (the "smell of wet wool and stale wine") over cliché high-fantasy descriptors.
- Pragmatic Realpolitik: Recognizing political intrigue where motives are driven by self-interest rather than idealism.

Training Material & Datasets

The model was trained on a synthesized hybrid dataset designed to teach it the "gradient of cynicism" required for the Architecture of Influence project:

| Data Source | Type | Contribution |
| --- | --- | --- |
| Sapkowski Synthetic Seed | DPO Triplets | 19 high-quality, manually curated triplets generated via Gemini 3 Flash to define the "Sapkowski" style (Chosen) vs. "Generic Hero" (Rejected). |
| Roleplay Alpaca NSFW | DPO Pairs | Over 3,400 samples from athirdpath/DPO_Pairs-Roleplay-Alpaca-NSFW-v1-SHUFFLED providing foundational conversational and roleplay capabilities. |
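Since the card describes the model as a "stylistic judge", one plausible way to use it is to score several candidate passages and keep the highest-scoring one. The sketch below is illustrative only and has not been checked against the published checkpoint. It assumes the repository loads with the standard `transformers` sequence classification classes and that the classification head emits a single logit where higher means "more Sapkowski-like"; if the head has several labels, check `model.config.id2label` before picking an index. The helper names `load_scorer` and `rank_candidates` are hypothetical and do not come from the project.

```python
from typing import Callable, List, Tuple

# Repository name taken from this card.
MODEL_ID = "chmielvu/Sapkowski_Forge"


def load_scorer(model_id: str = MODEL_ID) -> Callable[[str], float]:
    """Load the reward model and return a text -> score function.

    Imports are done lazily so the ranking helper below can be used
    (and tested) without torch or transformers installed.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    def score(text: str) -> float:
        inputs = tokenizer(
            text, return_tensors="pt", truncation=True, max_length=512
        )
        with torch.no_grad():
            logits = model(**inputs).logits
        # ASSUMPTION: a single-logit reward head; index 0 is the score.
        return logits[0, 0].item()

    return score


def rank_candidates(
    candidates: List[str], scorer: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Score every candidate and return (text, score) pairs, best first."""
    scored = [(text, scorer(text)) for text in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With the dependencies installed, a call such as `rank_candidates(drafts, load_scorer())` would return the drafts ordered by score, so the first entry is the passage the judge prefers. Passing the scorer in as a function keeps the ranking logic independent of how the model is loaded.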
Model tree for chmielvu/Sapkowski_Forge
Base model: microsoft/deberta-v3-xsmall