Upload folder using huggingface_hub
- .gitattributes +1 -0
- README.md +85 -3
- assets/teaser.webp +3 -0
- assets/uso.webp +0 -0
- config.json +4 -0
- uso_flux_v1.0/dit_lora.safetensors +3 -0
- uso_flux_v1.0/projector.safetensors +3 -0
    	
        .gitattributes
    CHANGED
    
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+assets/teaser.webp filter=lfs diff=lfs merge=lfs -text
    	
        README.md
    CHANGED
    
@@ -1,3 +1,85 @@
----
-license: apache-2.0
-
+---
+license: apache-2.0
+language:
+- en
+base_model:
+- black-forest-labs/FLUX.1-dev
+library_name: transformers
+pipeline_tag: image-to-image
+tags:
+- image-generation
+- subject-personalization
+- style-transfer
+- Diffusion-Transformer
+---
+
+<h3 align="center">
+    <img src="assets/uso.webp" alt="Logo" style="vertical-align: middle; width: 95px; height: auto;">
+    <br>
+    Unified Style and Subject-Driven Generation via Disentangled and Reward Learning
+</h3>
+
+<p align="center">
+<a href="https://github.com/bytedance/USO"><img alt="Build" src="https://img.shields.io/github/stars/bytedance/USO"></a>
+<a href="https://bytedance.github.io/USO/"><img alt="Build" src="https://img.shields.io/badge/Project%20Page-USO-blue"></a>
+<a href="https://arxiv.org/abs/2508.18966"><img alt="Build" src="https://img.shields.io/badge/Tech%20Report-USO-b31b1b.svg"></a>
+<a href="https://huggingface.co/bytedance-research/USO"><img src="https://img.shields.io/static/v1?label=%F0%9F%A4%97%20Hugging%20Face&message=Model&color=green"></a>
+</p>
+
+
+
+## 📖 Introduction
+Existing literature typically treats style-driven and subject-driven generation as two disjoint tasks: the former prioritizes stylistic similarity, whereas the latter insists on subject consistency, resulting in an apparent antagonism. We argue that both objectives can be unified under a single framework because they ultimately concern the disentanglement and re-composition of “content” and “style”, a long-standing theme in style-driven research. To this end, we present USO, a Unified framework for Style-driven and subject-driven GeneratiOn. First, we construct a large-scale triplet dataset consisting of content images, style images, and their corresponding stylized content images. Second, we introduce a disentangled learning scheme that simultaneously aligns style features and disentangles content from style through two complementary objectives, style-alignment training and content–style disentanglement training. Third, we incorporate a style reward-learning paradigm to further enhance the model’s performance.
+
+## ⚡️ Quick Start
+
+### 🔧 Requirements and Installation
+
+Clone our [GitHub repo](https://github.com/bytedance/USO)
+
+
+Install the requirements
+```bash
+# create a virtual environment with Python >= 3.10 and <= 3.12, e.g.
+# python -m venv uso_env
+# source uso_env/bin/activate
+# then install
+pip install -r requirements.txt
+```
+
+Then download the checkpoints in one of three ways:
+1. Run the inference scripts directly; the checkpoints will be downloaded automatically by the `hf_hub_download` function in the code to your `$HF_HOME` (the default value is `~/.cache/huggingface`).
+2. Use `huggingface-cli download <repo name>` to download `black-forest-labs/FLUX.1-dev`, `xlabs-ai/xflux_text_encoders`, `openai/clip-vit-large-patch14`, `TODO UNO hf model`, then run the inference scripts.
+3. Use `huggingface-cli download <repo name> --local-dir <LOCAL_DIR>` to download all the checkpoints mentioned in 2. to the directories you want. Then set the environment variable `TODO`. Finally, run the inference scripts.
+
+### 🌟 Gradio Demo
+
+```bash
+python app.py
+```
+
+## 📄 Disclaimer
+<p>
+  We open-source this project for academic research. The vast majority of images
+  used in this project are either generated or from open-source datasets. If you have any concerns,
+  please contact us, and we will promptly remove any inappropriate content.
+  Our project is released under the Apache 2.0 License. If you apply it to other base models,
+  please ensure that you comply with the original licensing terms.
+  <br><br>This research aims to advance the field of generative AI. Users are free to
+  create images using this tool, provided they comply with local laws and exercise
+  responsible usage. The developers are not liable for any misuse of the tool by users.</p>
+
+## Citation
+We would also appreciate it if you could give a star ⭐ to our [GitHub repository](https://github.com/bytedance/USO). Thanks a lot!
+
+If you find this project useful for your research, please consider citing our paper:
+```bibtex
+@article{wu2025uso,
+    title={USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning},
+    author={Shaojin Wu and Mengqi Huang and Yufeng Cheng and Wenxu Wu and Jiahe Tian and Yiming Luo and Fei Ding and Qian He},
+    year={2025},
+    eprint={2508.18966},
+    archivePrefix={arXiv},
+    primaryClass={cs.CV},
+}
+```
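The checkpoint download described in the README's Quick Start can also be scripted from Python. The block below is a minimal sketch, not the project's own loader. It assumes the USO weights are hosted in the `bytedance-research/USO` repository (taken from the Hugging Face badge link in the README, since the README itself still lists the model as a TODO) and uses the two file paths added in this commit. The helper name `download_uso` is hypothetical.

```python
from pathlib import Path

# Assumption: repo id taken from the Hugging Face badge link in the README.
USO_REPO = "bytedance-research/USO"

# File paths exactly as they appear in this commit.
USO_FILES = [
    "uso_flux_v1.0/dit_lora.safetensors",
    "uso_flux_v1.0/projector.safetensors",
]


def download_uso(local_dir=None):
    """Fetch the two USO checkpoint files and return their local paths.

    With local_dir=None the files go to the Hugging Face cache
    (under $HF_HOME); otherwise they are placed under local_dir.
    """
    # Imported lazily so this module can be loaded without the package installed.
    from huggingface_hub import hf_hub_download

    return [
        Path(hf_hub_download(repo_id=USO_REPO, filename=name, local_dir=local_dir))
        for name in USO_FILES
    ]
```

Calling `download_uso()` mirrors the first option in the README (download into the cache), while `download_uso(local_dir="./weights")` mirrors the third. The other repositories named in the README, such as `black-forest-labs/FLUX.1-dev`, are not covered by this sketch.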
    	
        assets/teaser.webp
    ADDED
    
    	
        assets/uso.webp
    ADDED
    
    	
        config.json
    ADDED
    
@@ -0,0 +1,4 @@
+{
+  "_diffusers_version": "0.30.1",
+  "_uso_flux_version": "1.0"
+}
    	
        uso_flux_v1.0/dit_lora.safetensors
    ADDED
    
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a03fa8430997f1c371c2471b133bdc03433a50564e0a29c096217077b0309e41
+size 478187816
    	
        uso_flux_v1.0/projector.safetensors
    ADDED
    
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9a0dfcd6644e3acaf6995625562ab0af1f9cf048bf739c7e5822ee106fb44311
+size 21548200
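The two `.safetensors` entries above are Git LFS pointer files: each records the SHA-256 digest and byte size of the real weight file. A downloaded checkpoint can be checked against its pointer. The following is a small sketch under that assumption. The function names `parse_lfs_pointer` and `matches_pointer` are illustrative and not part of any library.

```python
import hashlib
from pathlib import Path

# Pointer text for projector.safetensors, copied from this commit.
PROJECTOR_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:9a0dfcd6644e3acaf6995625562ab0af1f9cf048bf739c7e5822ee106fb44311
size 21548200
"""


def parse_lfs_pointer(text):
    """Return (sha256_hex, size_in_bytes) from a Git LFS pointer file."""
    # Each pointer line has the form "<key> <value>".
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return digest, int(fields["size"])


def matches_pointer(path, pointer_text, chunk_size=1 << 20):
    """Check a local file's size and SHA-256 digest against a pointer."""
    digest, size = parse_lfs_pointer(pointer_text)
    path = Path(path)
    if path.stat().st_size != size:
        return False  # cheap check first, avoids hashing a truncated file
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large checkpoints are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == digest
```

For example, `matches_pointer("projector.safetensors", PROJECTOR_POINTER)` should return `True` for an intact download of that file and `False` for a truncated or altered one.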
