This is a merge of pre-trained language models created using mergekit.
An experimental merge to improve the long-form writing capabilities of Irix-12B-Model_Stock.
PROS
- Shows more variety than the base model and sometimes subverts expectations
- More creativity and granularity in world-building
- Better at complex, layered, and ambitious scenarios

CONS
- Slightly faster pacing, often using dialogue to advance the scene (original Irix feels like a novelist; this model feels like a screenwriter)
- Less intricate prose; the writing feels modern, grounded, and grim (might be a plus in some cases)
- Less introspective; pays more attention to external and sensory details than to how characters feel
TL;DR:
Original Irix: Produces highly readable, satisfying, and complete stories.
This model: Produces stories that feel like the thrilling beginning of a much larger work.
Oh, and I am planning to use this model as an output layer for the next KansenSakura update.
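If you want to sample the model's prose yourself, here is a minimal sketch using transformers; the repo id is a placeholder, since it depends on where this merge is published:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- substitute the actual Hugging Face id of this merge.
repo_id = "your-username/irix-12b-writing-merge"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # the merge itself was produced in bfloat16
    device_map="auto",
)

prompt = "Write the opening scene of a slow-burn mystery set in a flooded city."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=400, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```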
This model was merged using the Model Stock merge method, with DreadPoor/Irix-12B-Model_Stock as the base.
The following models were included in the merge:
- LatitudeGames/Muse-12B
- Trappu/Nemo-Picaro-12B
The following YAML configurations were used to produce this model, one per merge step:
```yaml
# Step 1: karcher merge of the two writing models -> ./musepicaro
merge_method: karcher
models:
  - model: LatitudeGames/Muse-12B
  - model: Trappu/Nemo-Picaro-12B
parameters:
  max_iter: 10000
  tol: 1e-9
dtype: bfloat16
tokenizer_source: LatitudeGames/Muse-12B
---
# Step 2: arcee_fusion of the Irix base with the step-1 output -> ./irix_fusion*
merge_method: arcee_fusion
base_model: DreadPoor/Irix-12B-Model_Stock
models:
  - model: DreadPoor/Irix-12B-Model_Stock
  - model: ./musepicaro
dtype: bfloat16
tokenizer_source: DreadPoor/Irix-12B-Model_Stock
---
# Step 3: model_stock merge of the fusion variants over the Irix base
merge_method: model_stock
base_model: DreadPoor/Irix-12B-Model_Stock
models:
  - model: ./irix_fusion3
  - model: ./irix_fusion2
  - model: ./irix_fusion
parameters:
  normalize: false
  t: 0.75
dtype: bfloat16
```
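For reference, the pipeline above is three separate mergekit runs, with each step's output directory feeding the next config. A sketch of how it could be scripted with mergekit's Python API (the per-step filenames are hypothetical, and the MergeOptions flags shown are assumptions based on mergekit's documented library usage; the shell equivalent is mergekit-yaml <config>.yml <output-dir>):

```python
import torch
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Hypothetical filenames: each YAML document above saved to its own file.
# Output paths match those referenced by the later configs; the arcee_fusion
# step would be repeated to produce ./irix_fusion2 and ./irix_fusion3.
steps = [
    ("step1_karcher.yml", "./musepicaro"),
    ("step2_arcee_fusion.yml", "./irix_fusion"),
    ("step3_model_stock.yml", "./final"),
]

for config_path, out_path in steps:
    with open(config_path, "r", encoding="utf-8") as fp:
        merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))
    run_merge(
        merge_config,
        out_path=out_path,
        options=MergeOptions(cuda=torch.cuda.is_available(), copy_tokenizer=True),
    )
```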