StableSwarmUI – Let's Build Robotics with Shadow

Good Morning from my Robotics Lab! This is Shadow_8472 and today I am exploring Automatic1111 alternatives. Let’s get started!

A1111 is a nice baseline Stable Diffusion interface. A determined beginner should find it approachable; it gives an intermediate audience easy access to a large toolbox; and the community library of extensions and video/text tutorials is large enough to keep experts honing their skills.

Stable Diffusion Forge vs. StableSwarmUI

But A1111 is hardly the only one around. Forge has had my attention as a direct improvement on A1111, if for nothing else than its bugfixes when switching models. I’ve bumped into A1111’s model-switching trouble while experimenting with ControlNet, and it gets in the way.

But another UI (User Interface) has caught my attention recently: StableSwarmUI. From around one hour of research, it appears to be a beginner-friendly package built on ComfyUI, an interface I’d previously written off as well above my skill level. Installation threw an extra challenge when it assumed browser access and I was working over SSH. I recently learned graphical SSH though (-C compresses the connection and -Y turns on trusted X11 forwarding):

ssh -CY <user@host>

Otherwise, StableSwarmUI was very easy to install.

Out of the box, my installation of StableSwarmUI was set up to run SDXL models. When I tried SonicDiffusion (Stable Diffusion 1.5 base) from my A1111 installation, I kept getting 50% gray outputs. I took a peek at the ComfyUI backend. Yeah… I have no business making the all-out switch until I’ve properly introduced myself to ComfyUI. Time to research until I can make a basic workflow.

…

OK, don’t ask me about the gray boxes. Refreshing Firefox did nothing. Some people fixed similar issues by reinstalling or deleting one file or another. I left it over a weekend, then restarted the StableSwarmUI server while installing the Custom Node Manager for ComfyUI.

ComfyUI Workflows

ComfyUI is all about the workflow: a program you make by linking various nodes into a flowchart. I looked up consistent character workflows to get a better idea of how they work. There are a couple of options, but YouTuber NerdyRodent’s Reposer Plus caught my attention first [1]. Custom Node Manager found most of its custom nodes, but NerdyRodent used a now-outdated plugin called IPAdapter. I had to study IPAdapter v2 (programmer video [2]), but it wasn’t too difficult to swap out the relevant nodes once I’d taken my time.
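To make the "nodes linked into a flowchart" idea concrete, here is a minimal Python sketch of a workflow as a dependency graph. It is a toy under stated assumptions, not ComfyUI's actual code or file format: the node names, the "op" labels, and the helper functions execution_order and run are all hypothetical. The only point it shows is that each node can run once the nodes feeding it have finished.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# A toy workflow: each node names an operation and the nodes it takes input from.
# Node names and operations are hypothetical, not real ComfyUI nodes.
WORKFLOW = {
    "load_model": {"op": "load", "inputs": []},
    "prompt": {"op": "encode", "inputs": ["load_model"]},
    "sampler": {"op": "sample", "inputs": ["load_model", "prompt"]},
    "save": {"op": "save", "inputs": ["sampler"]},
}


def execution_order(workflow: dict) -> list[str]:
    """Return node names ordered so every node comes after its inputs."""
    graph = {name: set(node["inputs"]) for name, node in workflow.items()}
    return list(TopologicalSorter(graph).static_order())


def run(workflow: dict) -> dict[str, str]:
    """Walk the graph in order, recording a description of each node's result."""
    results: dict[str, str] = {}
    for name in execution_order(workflow):
        node = workflow[name]
        args = ", ".join(results[source] for source in node["inputs"])
        results[name] = f"{node['op']}({args})"
    return results


if __name__ == "__main__":
    for name, result in run(WORKFLOW).items():
        print(f"{name}: {result}")
```

Swapping one node for another, as with the IPAdapter update, amounts to replacing an entry in the graph and rewiring the inputs that pointed at it.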

Reposer Plus needed additional models – some of which I already had in A1111. I made a shared models directory and moved StableSwarmUI’s entire models directory over. I found a setting in StableSwarmUI at “Server/Server Configuration/Paths/ModelRoot” to point the UI at my models directory. A1111 would have me edit a .yaml file directly, but symbolic links are easier.
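The symbolic-link side of a shared models directory can be sketched in Python. This is a minimal sketch under assumptions: the folder names (shared-models, Stable-Diffusion, Lora, a1111/models) and the helper link_models are hypothetical stand-ins, and the demo runs in a throwaway temporary directory rather than a real installation. It assumes a system where creating symbolic links is permitted, such as Linux.

```python
import tempfile
from pathlib import Path


def link_models(shared_dir: Path, ui_models_dir: Path) -> list[Path]:
    """Symlink every subfolder of shared_dir into ui_models_dir.

    Entries that already exist in ui_models_dir are left alone, so nothing
    is overwritten. Returns the links that were created.
    """
    ui_models_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for source in sorted(shared_dir.iterdir()):
        if not source.is_dir():
            continue
        link = ui_models_dir / source.name
        if link.exists() or link.is_symlink():
            continue  # keep whatever is already there
        link.symlink_to(source.resolve(), target_is_directory=True)
        created.append(link)
    return created


def demo() -> list[str]:
    """Build a fake shared directory and link it into a fake UI folder."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        shared = root / "shared-models"  # hypothetical shared location
        for name in ("Stable-Diffusion", "Lora"):
            (shared / name).mkdir(parents=True)
        (shared / "Stable-Diffusion" / "example.safetensors").touch()
        ui_models = root / "a1111" / "models"  # hypothetical UI folder
        links = link_models(shared, ui_models)
        return sorted(link.name for link in links)


if __name__ == "__main__":
    print(demo())
```

Because existing entries are skipped, it is safer to move a UI's original model folders into the shared directory first and then create the links.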

I set the workflow in motion with “Queue Prompt,” but the IPAdapter Advanced node I installed threw an error. It took an extra session, but experimentation identified a model mismatch (I tried loading a “Big G” CLIP Vision model when it needed the normal one). The workflow then ran normally, but the final upscale turned sepia. I tried a photorealistic upscale model (as opposed to one for anime), but it turned out this was another server restart issue.

Takeaway

I played around with StableSwarmUI a bit more after a string of mediocre results with NerdyRodent’s workflow. Like with many tech projects, I’m interacting with a large and evolving ecosystem. Being on local hardware, I have both the liberty and burden of being my own admin while still learning the user’s point of view. And until I know both, I cannot tell if StableSwarmUI is there yet or not. I was all primed to complain about how I can’t readily draw into the beginner interface for a ControlNet input, but on closer inspection I was mistaken about how this UI works. I still haven’t found the feature, but that doesn’t mean it’s not there.

If you are a first-day beginner, I would still recommend EasyDiffusion for its easy installation, image history, and inpainting. If you want anything more, A1111 will let you explore further (Forge appears abandoned) at the cost of image history. If you want to try a cool ComfyUI workflow, StableSwarmUI may be right for you.

Final Question

What is your favorite ComfyUI workflow? I look forward to hearing your answers in the comments below or on my Socials!

Works Cited

[1] N. Rodent, “Stable Diffusion – Face + Pose + Clothing – NO training required!,” youtube.com, Oct. 14, 2023. [Online]. Available: https://youtu.be/ZcCfwTkYSz8. [Accessed June 20, 2024].

[2] L. Vision, “IPAdapter v2: all the new features!,” youtube.com, Mar. 25, 2024. [Online]. Available: https://youtu.be/_JzDcgKgghY. [Accessed June 20, 2024].

