make style by scxue · Pull Request #10 · lawrence-cj/diffusers

scxue · 2024-01-15T04:40:42Z

make style

…gingface#13815)

* feat(pipelines): add DreamLite text-to-image and image-edit pipelines

  Add ByteDance's DreamLite model family to diffusers. DreamLite is a UNet-based diffusion model that supports both text-to-image generation and reference-image editing through a shared 3-branch dual-CFG design.

  Two pipelines are shipped:

  * DreamLitePipeline - full 3-branch dual CFG (negative, reference, prompt); supports T2I and I2I editing at 1024x1024.
  * DreamLiteMobilePipeline - distilled single-branch variant for on-device inference; no CFG.

  New model code (all isolated under *_dreamlite.py / unet_dreamlite.py to avoid touching shared upstream files):

  * models/transformers/transformer_2d_dreamlite.py - DreamLite 2D transformer block.
  * models/unets/unet_dreamlite.py - DreamLiteUNetModel.
  * models/unets/unet_2d_blocks_dreamlite.py - DreamLite-specific down/up/mid blocks.
  * models/resnet_dreamlite.py - DreamLite ResNet variants.
  * models/attention_processor.py - add DreamLiteAttnProcessor2_0 (pure addition, no existing processor modified).

  Pipeline + tests + docs:

  * pipelines/dreamlite/{__init__.py, pipeline_dreamlite.py, pipeline_dreamlite_mobile.py, pipeline_output.py}.
  * tests/pipelines/dreamlite/{test_pipeline_dreamlite.py, test_pipeline_dreamlite_mobile.py} with the standard PipelineTesterMixin suite; setUp/tearDown auto-patches encode_prompt with a fake so MagicMock text encoders work without per-test boilerplate.
  * Skip 8 mixin tests that don't apply to DreamLite (MagicMock serialisation, custom attention processor, encode_prompt return shape, batch_size > 1 sweep), mirroring SD3 / Flux conventions.
  * docs/source/en/api/pipelines/dreamlite.md + _toctree.yml entry (alphabetically between DiT and EasyAnimate).
  * Register exports in 6 __init__.py files.

  Two real bugs surfaced by the mixin test suite are fixed in this commit:

  * num_images_per_prompt > 1: prompt_embeds and text_attention_mask are now repeated along the batch dimension in both pipelines' T2I and I2I branches before being passed to the UNet.
  * vae=None: __init__ now guards the encoder_block_out_channels lookup so encode_prompt can be tested in isolation per PipelineTesterMixin convention.

  SlowTests real-checkpoint resolution is set to 1024x1024 (the only size DreamLite is trained for).

  Test result: 27 passed, 50 skipped, 0 failed on CPU fast suite. make style && make quality: clean.

* docs+tests(pipelines/dreamlite): pin Hub repos to `diffusers` branch

  The `carlofkl/DreamLite-{base,mobile}` Hub repos host two flavours of the same checkpoint:

  * `main` branch - keeps `model_index.json` pointing at ByteDance's internal package path so the original (non-diffusers) reference code can still load these weights.
  * `diffusers` branch - rewrites the `unet` entry of `model_index.json` to `["diffusers", "DreamLiteUNetModel"]` so this integration loads correctly from `diffusers`.

  This commit pins every `from_pretrained(...)` call shipped with the diffusers integration (docs examples, pipeline docstrings, SlowTests) to `revision="diffusers"`. Local-override env vars (DREAMLITE_BASE_PATH / DREAMLITE_MOBILE_PATH) still bypass the revision pin.

* chore(pipelines/dreamlite): sync `# Copied from` blocks + dummy objects after rebase

  Mechanical changes after rebasing onto current `main`:

  * `pipeline_dreamlite.py::retrieve_timesteps` - re-synced from `diffusers.pipelines.flux.pipeline_flux.retrieve_timesteps` (PEP 604 type hints, expanded docstring, plus the new `accepts_timesteps` / `accept_sigmas` introspection guards). DreamLite's default code path uses `num_inference_steps` (uniform schedule) and never passes custom `timesteps` / `sigmas`, so the added guards are dead-code for this pipeline; behaviour is unchanged.
  * `dummy_pt_objects.py` / `dummy_torch_and_transformers_objects.py` - registered the dummy classes auto-generated by `make fix-copies` for `DreamLiteTransformer2DModel`, `DreamLiteUNetModel`, `DreamLitePipeline`, `DreamLiteMobilePipeline`, `DreamLitePipelineOutput`.

  Generated by `make fix-copies`. No hand edits.

* docs(dreamlite): register attention processor + split combined docstring entries

  - Register DreamLiteAttnProcessor2_0 in docs/source/en/api/attnprocessor.md (fixes check_support_list.py).
  - Split combined 'height / width' and 'guidance_scale / image_guidance_scale' entries in the two pipeline docstrings; add a complete Args block to DreamLiteTransformer2DModel.forward (fixes check_forward_call_docstrings.py).

  No behavioral change.

* refactor(dreamlite): address review feedback from huggingface#13815

  - Inline the down/up block factories and define DreamLiteCrossAttn{,NoSelfAttn}{Down,Up}Block2D directly (review #1, #2)
  - Rename DownBlock2DDreamLite/UpBlock2DDreamLite to DreamLiteDownBlock2D/DreamLiteUpBlock2D to match diffusers naming conventions (review #3, #4)
  - Merge unet_2d_blocks_dreamlite.py into unet_dreamlite.py to mirror recent transformer model files (review #5)
  - Wire max_sequence_length into the tokenizer call for generate mode (review #6)
  - Replace hard-coded drop_idx values (64/34) with self.prompt_template_encode_*_start_idx attributes plus a comment explaining how the offsets are derived (review #7, #8)
  - Drop the manual Image.resize call and rely on VaeImageProcessor's LANCZOS default in preprocess(image, height, width) (review #9)
  - Use self.guidance_scale / self.image_guidance_scale properties in the CFG combine instead of the underscore-prefixed attributes (review #10, #11)
  - Inline retrieve_latents / retrieve_timesteps / calculate_shift in the mobile pipeline with `# Copied from` markers, removing the cross-pipeline imports (review #12)
  - Add `# Copied from` marker to _extract_masked_hidden in the mobile pipeline (review huggingface#13)

* refactor(dreamlite): address dg845 follow-up review

  - Merge resnet_dreamlite.py (DepthwiseSeparableConv + ResnetBlock2DDreamLite) into unet_dreamlite.py and delete the standalone module (review #1)
  - Move DreamLiteAttnProcessor2_0 from attention_processor.py into unet_dreamlite.py to keep all DreamLite-specific code in one place; update docs autodoc reference accordingly (review #2)
  - Drop the PyTorch 2.0 hasattr/ImportError check in DreamLiteAttnProcessor2_0.__init__ (diffusers already requires torch>=2.0; matches Wan deprecation) (review #3)
  - Drop the deprecated `scale` argument handling from DreamLiteAttnProcessor2_0.__call__ (new model, no legacy callers) (review #4)
  - Switch SDPA call to dispatch_attention_fn so all diffusers attention backends (FlashAttention, FlashAttention-3, sageattention, etc.) are selectable (review #5)
  - Rename block dispatch keys in _get_{down,mid,up}_block_dreamlite to match the Python class names (DreamLiteCrossAttn{Down,Up}Block2D / DreamLiteCrossAttnNoSelfAttn{Down,Up}Block2D / DreamLiteUNetMidBlock2DCrossAttn / DreamLite{Down,Up}Block2D); default down/up/mid block_types in DreamLiteUNetModel and the test fixtures are updated to the new keys (review #6, #7); the carlofkl/DreamLite-{base,mobile} (diffusers branch) Hub configs are being updated in lock-step
  - Localize retrieve_latents inside pipeline_dreamlite.py with a `# Copied from` marker, removing the cross-pipeline import; mirrors the mobile pipeline (review #8)
  - Add a check_inputs() method to both DreamLitePipeline and DreamLiteMobilePipeline (mobile uses `# Copied from`); call it from __call__; pulls the image-type validation out of prepare_image_latents and adds prompt-type and h/w-divisibility checks (review #9)

* fix(dreamlite): correct Q/K/V layout for dispatch_attention_fn

  dispatch_attention_fn expects (batch, seq, heads, head_dim) and handles the transpose internally; the previous code passed (batch, heads, seq, head_dim), which collided with the dispatch's internal transpose and broke inference (RuntimeError: tensor size mismatch at non-singleton dimension 1).

* test(dreamlite): swap MagicMock for tiny real Qwen3-VL fixture

  Address dg845's review: rebuild the DreamLite fast-test fixture around a real (tiny) Qwen3VLForConditionalGeneration + Qwen3VLProcessor so the standard PipelineTesterMixin save/load, dtype, and offload tests run end-to-end against the actual encode_prompt code path. Override DreamLiteUNetModel.set_default_attn_processor to reinstall the GQA processor so mixin utilities that round-trip through it keep working.

* Apply style fixes

* fix(dreamlite): address blocking review issues from huggingface#13815

  - Override _no_split_modules / _repeated_blocks on DreamLiteUNetModel with the actual DreamLite class names (BasicTransformerBlockDreamLite, ResnetBlock2DDreamLite, DreamLiteCrossAttnUpBlock2D, DreamLiteUpBlock2D) so device_map="auto" and compile_repeated_blocks() match correctly.
  - Keep attention masks as bool tensors in DreamLiteTransformer2DModel instead of converting them to dense additive float biases. The dense format hard-raises on flash / _flash_3 / _sage backends in dispatch_attention_fn (which requires dtype == torch.bool).
  - Add explicit parentheses around each clause in check_inputs's mixed and/or condition (both pipelines) for readability.
  - Replace nn.Module.__init__(self) with ModelMixin.__init__(self) in DreamLiteUNetModel.__init__ so mixin state (e.g. _gradient_checkpointing_func) is properly initialised. ConfigMixin / PushToHubMixin don't define their own __init__, so this covers the full chain without re-running UNet2DConditionModel.__init__.

* fix(dreamlite): forward all processor outputs to Qwen3VL text encoder

  Recent versions of Qwen3VLProcessor add an mm_token_type_ids output, and Qwen3VLModel.compute_3d_position_ids raises ValueError whenever multimodal inputs are present (image_grid_thw is not None) but mm_token_type_ids is None. encode_prompt previously forwarded only input_ids / attention_mask / pixel_values / image_grid_thw, dropping the new field and breaking the fast pipeline tests against transformers main.

  Switch to ``self.text_encoder(**tk_out, output_hidden_states=True)`` (matching NucleusMoEImagePipeline) so all processor outputs are forwarded automatically and future additions don't regress this path.

* Apply style fixes

* docs(dreamlite): address final review nits from huggingface#13815

  - Replace broken cat.png URL in editing examples (both base and mobile) with the standard `huggingface/documentation-images` source used elsewhere in the diffusers docs.
  - Promote the recommended guidance_scale=3.5 / image_guidance_scale=1.5 to the default values of DreamLitePipeline.__call__, and drop the now-redundant explicit args from the docs examples.
  - Switch the EXAMPLE_DOC_STRING examples in both pipelines from torch.float16 to torch.bfloat16 for consistency with the rest of the docs.

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
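For readers skimming the Q/K/V layout fix and the bool-mask change described in the commit message above, here is a minimal sketch of the two tensor conventions involved. It is not the DreamLite code: NumPy arrays stand in for torch tensors, `dispatch_attention_fn` is never called, and the helper names `to_dispatch_layout` and `bool_mask_to_additive_bias` are hypothetical, introduced only for illustration.

```python
import numpy as np


def to_dispatch_layout(x):
    """Hypothetical helper: reorder (batch, heads, seq, head_dim) into
    (batch, seq, heads, head_dim), the layout the commit message says
    dispatch_attention_fn expects."""
    if x.ndim != 4:
        raise ValueError(f"expected a 4-D array, got {x.ndim} dimensions")
    # Swap the heads and sequence axes; batch and head_dim stay in place.
    return np.transpose(x, (0, 2, 1, 3))


def bool_mask_to_additive_bias(mask, dtype=np.float32):
    """Hypothetical helper: build the dense additive form of a bool mask
    (0 where attention is allowed, -inf where it is blocked). Shown only for
    contrast; per the commit message, the bool form is what gets kept."""
    return np.where(mask, 0.0, -np.inf).astype(dtype)


batch, heads, seq, head_dim = 2, 4, 7, 8

# Layout used by the previous code path: heads before sequence.
q_bhsd = np.zeros((batch, heads, seq, head_dim), dtype=np.float32)
q_bshd = to_dispatch_layout(q_bhsd)
print(q_bhsd.shape, "->", q_bshd.shape)

# A bool mask (True = keep) next to its dense float equivalent.
keep = np.array([[True, True, False]])
print(keep.dtype, bool_mask_to_additive_bias(keep))
```

Passing the heads-first layout to a function that transposes internally would swap the axes twice over, which is consistent with the size mismatch on dimension 1 reported in the commit message.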

@yiyixuxu

* [.ai] add self-review skill, retire parity-testing skill, and tighten the agent guides

  - New `self-review` skill mirroring the `@claude` CI review (rubric from review-rules.md, call-path dead-code analysis), report-only, with the report flagging what to fix before submitting (blocking + dead code) vs what to leave for the actual review.
  - Remove the WIP `parity-testing` skill; preserve its pitfalls as `model-integration/pitfalls.md` (numerical-discrepancy reference).
  - model-integration: restructure around a grouped checklist, default-to-modular, an overall file-structure sketch (details deferred to the guides), a fresh-conversion `Model parity test` example (internal, not shipped), and a filled-in weight/checkpoint-conversion section.
  - Centralize the loading rule (from_pretrained / from_single_file, no custom loaders) in models.md; add per-folder File structure sections to models.md / pipelines.md; default-to-modular note in pipelines.md.
  - AGENTS.md: dedicated 'Self-review before a PR' and 'Reference guides' sections.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* [.ai] simplify pitfalls #6 and drop the model-storage / injection-test entries

  Trim pitfall #6 to the essential point (small dtype diffs compound into a large final difference), remove the `/tmp` model-storage and incomplete-injection-test pitfalls, and renumber 1-16 with cross-references updated.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* [.ai] drop parity-harness-specific pitfalls

  With the parity-testing skill gone, remove the stale-test-fixtures pitfall (saved tensors / cross-pipeline fixtures no longer apply) and de-jargon the noise-dtype detection note. Keeps the pitfalls list generic to numerical discrepancy.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* [.ai] trim pitfalls to a concise possible-causes reference

  Drop the variable-shadowing and decoder-config pitfalls and the noise-dtype 'Detection' aside, tighten the remaining entries, renumber 1-12 (cross-refs updated), and reframe the intro as a non-checklist reference list of possible causes to consult only when outputs don't match.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Apply suggestion from @yiyixuxu

* Apply suggestion from @yiyixuxu

* [docs] update contributing guide for the self-review skill

  Replace the retired parity-testing skill with self-review in the skills list, and add a 'Self-review before opening' step to the AI-assisted contributions section: run the self-review skill / review-rules, fix blocking issues + dead code, and treat the @claude CI review as a non-authoritative helper (note any intentional skips in the PR for the reviewer).

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Apply suggestions from code review

  Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* [.ai] fix dangling pitfalls ref and broaden self-review scope

  - Drop the broken 'pitfalls.md #10' reference in the conversion step (the /tmp model-storage pitfall was removed); save to a local path instead.
  - Self-review now reviews the whole diff, not just src/diffusers/ and .ai/; a contributor should review their own tests/docs/scripts too (the CI's scoping is a safety measure for untrusted PRs). Reword to 'same rubric as the CI'.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

make style

29cdbea

lawrence-cj merged commit 6094447 into lawrence-cj:feat/sa-solver Jan 15, 2024

lawrence-cj merged 1 commit into lawrence-cj:feat/sa-solver from scxue:feat/sa-solver

scxue commented Jan 15, 2024
