---
language:
- en
tags:
- dllm
- diffusion-language-model
- text-generation
- diffusion
- language-model
license: apache-2.0
---

# HDLM-Epsilon: Hybrid Diffusion Language Model

[![Paper](https://img.shields.io/badge/Paper-arXiv-red)](https://arxiv.org/abs/2504.06416)
[![Code](https://img.shields.io/badge/Code-GitHub-blue)](https://github.com/ServiceNow/hdlm)

This model card is for the **hdlm-base model with epsilon=0.0**.

## Model Description

HDLM-Epsilon is a hybrid diffusion language model that unifies autoregressive and diffusion-based sequence generation through epsilon-hybrid noising. This model interpolates evolution operators between absorbing and uniform processes, making it conceptually closer to MDLM (Sahoo et al., 2024) while maintaining the benefits of both paradigms.

The epsilon parameter (ε) controls the blend between absorbing and uniform processes during training, where smaller values emphasize the absorbing process and larger values incorporate more uniform noise.

## Model Architecture

- **Base Model**: Transformer architecture with custom conditioning layers
- **Vocabulary Size**: 50,258 tokens (GPT-2 vocabulary + absorbing token)
- **Context Length**: 1024 tokens
- **Training**: Hybrid loss combining token masking with random token corruption
- **Inference**: Supports multiple sampling algorithms, including ACS (Adaptive Correction Sampler)

## Usage

### Quick Start

```python
from hdlm.hf_utils import smart_model_loader
from hdlm.epsilon_hybrid.sample import full_diff
from transformers import GPT2TokenizerFast
import torch

# Load model using smart loader (automatically detects model type)
model, cfg, device, accelerator, metaschedule = smart_model_loader(
    model_path="hdlm-group/hdlm-base-epsilon-0.0",
    model_type="auto",  # automatically detects epsilon_hybrid
    device="cuda"
)

# Load tokenizer
tokenizer = GPT2TokenizerFast.from_pretrained('gpt2')

# Generate text
prompt = "The future of artificial intelligence"
prompt_ids = tokenizer.encode(prompt, return_tensors='pt').to(device)

# Full diffusion sampling
generated = full_diff(
    model=model,
    prompt=prompt_ids,
    batch_size=1,
    alg='acs',  # or 'original', 'remask', 'remdm'
    steps=512,
    temperature=1.0,
    context_length=1024,
    device=device
)

# Decode generated text
generated_text = tokenizer.decode(generated[0], skip_special_tokens=True)
print(generated_text)
```

### Evaluation

```bash
# Text generation evaluation
python hdlm/eval_generation.py \
    --checkpoint_path hdlm-group/hdlm-base-epsilon-0.0 \
    --sampling_method full_diff \
    --algorithm acs \
    --save_samples

# Perplexity evaluation
python hdlm/eval_modeling.py \
    --checkpoint_path hdlm-group/hdlm-base-epsilon-0.0 \
    --work_dir "./logs/eval_modeling_epsilon" \
    --dataset ptb
```

## Training Details

- **Dataset**: OpenWebText
- **Batch Size**: 512
- **Learning Rate**: 3e-4 with cosine scheduling
- **Epsilon (ε)**: 0.01 (controls hybrid noising blend)
- **Lambda (λ)**: 1.0 (weighting factor for unmasked tokens)
- **Loss Type**: Hybrid loss combining masking and random token corruption
- **Training Steps**: 1M iterations
- **Warmup**: 50K steps

## Sampling Algorithms

The model supports several sampling algorithms (see the sketch after this list):

- **`original`**: Standard diffusion sampling
- **`acs`**: Adaptive Correction Sampler with error correction
- **`remask`**: Remasking strategy for improved quality
- **`remdm`**: ReMDM-style sampling with probability mixing
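
As an illustration, the snippet below sketches one way to compare these algorithms by reusing the `full_diff` call from the Quick Start. It assumes `model`, `device`, and `prompt_ids` have already been created as in that example; the step count and temperature are arbitrary example values, not recommended settings.

```python
from hdlm.epsilon_hybrid.sample import full_diff
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained('gpt2')

# Generate from the same prompt with each supported sampler.
# Assumes `model`, `device`, and `prompt_ids` were set up as in the Quick Start above.
for alg in ['original', 'acs', 'remask', 'remdm']:
    generated = full_diff(
        model=model,
        prompt=prompt_ids,
        batch_size=1,
        alg=alg,
        steps=512,          # example value
        temperature=1.0,
        context_length=1024,
        device=device
    )
    print(f"--- {alg} ---")
    print(tokenizer.decode(generated[0], skip_special_tokens=True))
```

Note that `acs` is the algorithm used in the evaluation commands above.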

## Model Variants

Available epsilon values and their characteristics:

- **ε = 0.01**: Minimal uniform noise, closest to pure absorbing process
- **ε = 0.1**: Moderate hybrid behavior
- **ε = 0.5**: Balanced absorbing-uniform blend

## Citation

```bibtex
@article{fathi2025unifying,
  title={Unifying autoregressive and diffusion-based sequence generation},
  author={Fathi, Nima and Scholak, Torsten and No{\"e}l, Pierre-Andr{\'e}},
  journal={arXiv preprint arXiv:2504.06416},
  year={2025}
}
```

## License

This model is released under the same license as the original HDLM codebase. Please refer to the [GitHub repository](https://github.com/ServiceNow/hdlm) for license details.