Language Model Transformer Block Merge

Image Diffusion block merging technique applied to transformer-based Language Models.

Dependencies:

A basic Python install with Transformers and PyTorch.

Usage:

Start the script with one of the following commands:

[On Linux] python ./LM_BlockMerge.py
[On Windows] python LM_BlockMerge.py

The script will then prompt you for three folders:

  1. The first model
    • The first model will be used as the base for the transformers configuration, and as the reference when handling the layers and weights.
  2. The second model
    • The second model will only be used to provide the secondary layers for the merge.
  3. The output folder
    • The resulting model will be saved inside the selected directory, in a subfolder named "converted_model" (e.g. "./this/is/your/chosen_path/converted_model"); see the sketch right after this list.
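For illustration, the three prompts can be reproduced with a few tkinter directory dialogs. This is only a sketch of the flow; the dialog titles and variable names below are placeholders, not the script's actual code.

# Hypothetical sketch of the three folder prompts (the script's real GUI may differ).
import os
import tkinter as tk
from tkinter import filedialog

root = tk.Tk()
root.withdraw()  # hide the empty main window; only the dialogs are shown

first_model_dir = filedialog.askdirectory(title="Select the first (base) model folder")
second_model_dir = filedialog.askdirectory(title="Select the second model folder")
output_dir = filedialog.askdirectory(title="Select the output folder")

# The merged model ends up in a "converted_model" subfolder of the chosen path.
save_path = os.path.join(output_dir, "converted_model")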

The script will then load the weights into memory, according to the precision (32/16 bit) chosen in the configuration header, and will subsequently prompt the user with a popup GUI listing all of the layers available in the selected models.
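For context, loading a model at a chosen precision with Transformers looks roughly like the snippet below. This is an assumption about how the precision setting maps onto the load step, not the script's exact code; the paths are placeholders.

# Illustrative only: the fp16 flag decides the dtype used when loading both models.
import torch
from transformers import AutoModelForCausalLM

fp16 = False  # same flag as in the output settings below
first_model_dir = "/path/to/first_model"    # placeholder
second_model_dir = "/path/to/second_model"  # placeholder

dtype = torch.float16 if fp16 else torch.float32
model_a = AutoModelForCausalLM.from_pretrained(first_model_dir, torch_dtype=dtype)
model_b = AutoModelForCausalLM.from_pretrained(second_model_dir, torch_dtype=dtype)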

The user will be able to merge each layer according to any strategy, from keeping a layer entirely from the first model, to replacing it entirely with the second model's layer, to blending the two at any ratio in between.

The layers will be merged according to the individual choices made per layer, and the resulting model weights will be saved into the output folder, alongside the first model's config.json file.
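Conceptually, this kind of merge is a per-layer linear interpolation between the two models' parameters. The following is a minimal sketch of that idea, assuming the models loaded above; the function name, the ratio dictionary, and the keying by parameter name are illustrative, not the script's implementation.

import torch

def merge_state_dicts(sd_a, sd_b, layer_ratios, default_ratio=0.0):
    # ratio 0.0 keeps the first model's weights, 1.0 takes the second model's,
    # anything in between blends the two tensors linearly.
    merged = {}
    for name, tensor_a in sd_a.items():
        tensor_b = sd_b.get(name)
        if tensor_b is None or tensor_b.shape != tensor_a.shape:
            merged[name] = tensor_a  # no matching weight: keep the first model's tensor
            continue
        ratio = layer_ratios.get(name, default_ratio)  # choice made per layer in the GUI
        merged[name] = (1.0 - ratio) * tensor_a + ratio * tensor_b
    return merged

# Continuing from the loading sketch above:
merged_sd = merge_state_dicts(model_a.state_dict(), model_b.state_dict(), layer_ratios={})
model_a.load_state_dict(merged_sd)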

Available output settings:

fp16 = False                # Perform operations in fp16. Saves memory, but CPU inference will not be possible.
always_output_fp16 = True   # If True, output fp16 weights even when operating in fp32
max_shard_size = "2000MiB"  # Set the output shard size
verbose_info = True         # Show model information when loading
force_cpu = True            # Only use the CPU
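As a rough illustration of how always_output_fp16 and max_shard_size come into play when writing the result (continuing the sketches above; save_path is a placeholder), the final cast and the sharded save could look like this:

import torch

always_output_fp16 = True
max_shard_size = "2000MiB"
save_path = "/path/to/chosen_path/converted_model"  # placeholder

if always_output_fp16:
    model_a = model_a.to(torch.float16)  # cast before writing, even if the merge ran in fp32

# save_pretrained shards the weights according to max_shard_size and also writes config.json
model_a.save_pretrained(save_path, max_shard_size=max_shard_size)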

Supported Models:

Pseudo-Supported Models:

Notes:

To Do:

Credits: