Happy Birthday Stable Diffusion!

Good Morning from my Robotics Lab! This is Shadow_8472 and today I am spending a week with Stable Diffusion to improve my skills at it. Let’s get started!

The science of AI art goes back to around the time complete CPU’s were first integrated into a single computer chip in the late 60’s/early 70’s. At least a couple waves of AI craze came and went, but on August 22, 2022, Stable Diffusion was released as free and open source software.

In the year since, Stable Diffusion has proven to be quite the disruptive technology. I’ve never had the cash to commission an online artist, but with a little effort, a decent amount of patience, and only an ounce of experience, I’ve gotten subjectively better results than commissioned works posted by low-end digital artists. I feel sorry for the people losing their dream jobs to machines, but at the same time this is a frontier I can have fun exploring.

One Week of Study

I’m setting myself a goal of spending two hours dedicated to learning Stable Diffusion every day this week. We’ll see what happens.

Monday

We won’t talk about what didn’t happen on Monday.

Tuesday

I finally started researching for this topic after midnight. I started up Easy Diffusion, an intuitive webUI for Stable Diffusion, and generated a number of images with a project for my sister in mind.

I ended up looking up tips and tutorials. Looks like the hot-shot web UI these days is Automatic1111. It has more options, but is proportionally harder to use. I might try it later in the week. Otherwise, most of my time actually working today was writing the introduction.

Wednesday

Easy Diffusion is definitely the way to go if all you’re looking to do is goof around, because that is exactly what I did. So far as I can tell, my graphics card is at the exact bottom of what can run this. I’m finding it useful to generate smaller images for faster feedback while learning to prompt. Conclusion: img2img has a tendency to muddle things.

Still, the draw of potentially more powerful techniques is calling. I found a piece of software called Stability Matrix, which supports a number of web UI’s – including Automatic1111, which nearly every Stable Diffusion tutorial out there targets. I ran into trouble with its integrated Python while setting it up (portable mode, in the end). I’m hoping I can replace it with a later version tomorrow.

Thursday

I switched approaches from last night and did an online search for my error:

error while loading shared libraries: libcrypt.so.1: cannot open shared object file: No such file or directory

Multiple results came from people trying Python projects on Arch-family systems like the one I’m on. One source from December 2022 recommended a multi-step process involving the AUR. I figured rifling through the project’s GitHub issues was worth a shot – to report the problem if nothing else. I searched for ‘libcrypt.so.1’, and the fix was to install libxcrypt-compat, which I found in the more trusted pacman repository [1].

AUR: Arch User Repository
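
In the end, the whole fix boiled down to one package install. A minimal sketch for anyone on an Arch-family system hitting the same error (same package name as in the issue thread; run at your own discretion):

sudo pacman -S libxcrypt-compat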

I installed Automatic1111 using Stability Matrix and loaded it up. My first impression when compared to Easy Diffusion: Wall of controls. Easy is easy in both the setup AND the relatively intuitive control scheme, but it seemingly doesn’t support a lot of the tools I’ve seen and want to learn.

Per tradition, I generated a photo of an astronaut riding a horse. It was a flop, but I got an image nonetheless. Its immediate followup didn’t finish: I told it to fix faces and ran out of vRAM on my graphics card (to be fair, I had plenty of other programs still open).

Sabbath starts tomorrow, and I’ve been writing these mostly late at night. I can tell I’m not likely to meet my time goal of a couple hours every day, but I feel getting to this step is a major accomplishment. Word count says 700+ words, so I could end it here and feel fine about it. I’ll see what happens. I want to find the control that tells it my graphics card is barely up to this stuff.

Friday

Time to start optimizing! For context, I’m on an NVIDIA graphics card with 4GB of vRAM, which is enough to get a feel for the software if you have a minute or two of patience per image, but having more would be prudent. After trying a couple online videos, I found AUTOMATIC1111’s GitHub has a list of optimizations [2] I’ll be adding as --flags to the COMMANDLINE_ARGS variable in my start script; a sketch of that line appears after the list of flags below. I don’t have time this evening for a full test, but perhaps tomorrow night or Sunday I can do some benchmarking.

vRAM: Video RAM (Random Access Memory) *So glad to have finally looked this one up!*

xformers

For NVIDIA cards, there is a library called xformers. It speeds up image generation and lowers vRAM usage, but at the cost of consistency between runs, which may not be a bad thing depending on the situation.

opt-split-attention/opt-sub-quad-attention/opt-split-attention-v1

A “black magic” optimization that should be automatically handled. I’ll be selecting one via the webUI, though.

medvram/lowvram

This optimization breaks up the model to accommodate lesser graphics cards. The smaller the pieces, though, the more time it will need to swap them in and out. Side note, but I believe it’s MEDvram as in MEDium, as opposed to the naive pronunciation I heard of MEDvram as in MEDical.

opt-channelslast

This is an experimental optimization; it’s literally unknown whether it’s worth using at this time. I’m skipping it.
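
To give an idea of where these flags actually go, here is a sketch of the relevant line from a start script (webui-user.sh in a standalone Automatic1111 install – Stability Matrix may handle this differently), with a combination I’m considering but haven’t benchmarked yet:

export COMMANDLINE_ARGS="--xformers --medvram --opt-split-attention"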

Saturday Night

I took it off.

Sunday

I joined my father on a shopping trip, and we ran out of gas at a car wash. By the time I sat down to work on Stable Diffusion, I wasn’t up to much more than an unguided self-tour of the settings. I don’t know what most of the categories are supposed to do! I’ll look each one up in time.

Monday Morning

As usual in recent months, I spend a while writing the Takeaway and Final Question, dressing up the citations, and copying everything out of LibreOffice and into WordPress for publication at noon.

Takeaway

Progress! It might not be what I expected this week, but I’m still satisfied that I have something to show off. The point I’m at now is getting Automatic1111 to the same place I was with Easy Diffusion before looking up the toys I came to it for.

As one final note, this week is also the anniversary of this blog. It caused a bit of a delay in getting this post scheduled by noon, but that would make it the third instance I can remember of a late post in twice as many years. I feel bad about it, but at the same time, it’s still a decent track record.

Final Question

Do you have a favorite interface for using Stable Diffusion?

[1] PresTrembleyIIIEsq, et al., “SD.Next / ComfyUI Install: Unexpected Error #54,” github.com, July 30, 2023. [Online]. https://github.com/LykosAI/StabilityMatrix/issues/54. [Accessed Aug. 8, 2023].

[2] AUTOMATIC1111, “Optimizations,” github.com, Aug. 2023. [Online]. https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Optimizations. [Accessed Aug. 8, 2023].

Browser Tabs vs. Bookmarks

Good Morning from my Robotics Lab! This is Shadow_8472 with a side project of the week. Let’s get started!

When you’ve got RAM to spare, tabs and bookmarks seem almost interchangeable.

A few favorites kept open long-term turn into forgotten research left open with a few really good memes mixed in. Let’s just agree that a hundred or more tabs constitutes a problem I’ve been shirking for a while.

I’m working on just my primary daily driver, my upstairs workstation running EndeavourOS. My Internet tab organization on this machine uses three windows: one for ‘fun’, one for ‘blog’, and a ‘???’ one because I sometimes need two pages at once – such as an online game and a special calculator for it.

As with any other downsizing project, I used three categories: Close, Bookmark, and Recycle. Some tabs were obviously irrelevant, so I weeded them out. Most were saved as a bookmark organized by topic. Other tabs I wasn’t sure about or didn’t have a category, so I left them open for a while.

In total, I created categories for around 25 blog projects in phases ranging from concept to needing follow up or maintenance. These will need a second pass as many tabs only got a few seconds’ judgment call and in several cases, I made a “close enough” match.

Other finds of interest included memes I stashed away separately and duplicates I closed when I noticed they were already bookmarked; I even found and consolidated half a dozen web interfaces from my homelab alone.

Takeaway

Heavy duty tab management is something I should be doing more than once every few years – or, in the case of this machine, once per operating system. I managed to get my three browser windows down to where two don’t have to scroll and the “fun” window only spans two screen widths of tabs.

Final Question

How do you organize your online browsing habits?

What I do with Trash Computers

Good Morning from my Robotics Lab! This is Shadow_8472 and today I am exploring a computer my father found. Let’s get started!

Taking Inventory

While attending a work bee at my church’s school, my father was throwing something away in the dumpster when he noticed an old computer with the hard drive still inside. He took it home for me to look at, with the stipulation that I’d need to send it along within a week.

The computer in question didn’t promise much on the outside by today’s standards. 1GB of memory. 250GB storage. OEM sticker for Windows XP Media Center Edition 2005. Least promising of all was the sunken power socket. Two disk drives are present – one a LightScribe CD burner and the other a DVD reader. It has no interesting PCI expansion cards. The front panel is down to two of its original six plastic clips holding it on and is covered in several unsightly stickers. The rest of the case has some scratches, but the side panel opens (and closes) beautifully.

Powering Up

The first thing to do when working with an unknown system is to try booting it. My first instinct was to locate and install a backup power supply – no thank you, Dell, for the non-standard connections in our stockpile.

A previous project of mine left me with a free power supply. I confirmed its compatibility (the only difference being 600 W vs. 300 W), swapped it out, screwed it in, and connected it up to all the components – until I got to the dedicated CPU power socket. The old motherboard adheres to a standard with 2×2 pins, and the new standard is 2×3. While the socket itself might fit in terms of power and shape of pins, a stray capacitor blocked this approach.

As noted before, the original power supply’s socket was sunken. Its plastic “screw wings” are broken, and there’s nothing good inside to brace it against, as is the case for my red laptop. Now, this next part is generally inadvisable, but my father grabbed a spare power cord and we carefully opened up the power supply. The broken part would only need a few solder points and it would work again. As a proof of concept, we connected the cord back up, and I re-installed the original supply.

The computer booted into an admin account on Windows 8.1.

Digital Archeology

First of all, I noticed that the system clock (including time and date) was only around half an hour slow. Also: props to whoever cleaned off this computer. The only clues to its history were in the system logs and the product ID keys for Internet Explorer and Windows.

This computer appears to have had a service life starting some time around 2004/2005 (OEM sticker). It was upgraded – presumably to Windows 7 – around 2013 (IE key), where it was used regularly until being upgraded to Windows 8.1 in January 2018. This upgrade must not have gone so well, as it was only booted a couple times since, for around four days total in May and July 2018, before my activity in 2023 showed up.

At this point, I easily could have run some more invasive file recovery program as a demonstration of why you shouldn’t dispose of your computers improperly, but I decided against it before I even began work on it.

Installing Linux [Hard Mode]

I grabbed the keys and booted the computer into BIOS to point it at my trusty Ventoy USB. Only, the version of BIOS this thing runs is so old that it only supports booting to the internal hard drive and CD. On a whim, I located and force-fed it my GRUB disk after opening the DVD drive by sticking a dulled safety pin up its manual eject hole. From there, I was able to load Ventoy. And from Ventoy, I was able to attempt loading one of my images.

This process took several minutes where it should have been seconds. I had many failed attempts (one casualty of which was a Debian installation I had on a USB stick I turned into installation media), but eventually managed to load Bodhi Linux 7 – specifically a late-cycle release candidate. For whatever reason I was never able to figure out, I had to boot using GRUB2 mode as opposed to “normal mode” from Ventoy. My guess as I write this is it has something to do with the BIOS not supporting USB booting.

The live experience wasn’t impressive. I chose to do a split-partition install simply because I’d never done one before. Otherwise, it booted Linux and ran poorly, but better than over USB 2.0. I installed Firefox with the intention of the machine serving again as a backup, but is the system even needed?

Disposal

Technology marches on. While 10-15 year-old desktops built to near state-of-the-art quality may remain relevant as adequate get-me-online machines after a new hard drive, a graphics card, and progressively slimmer versions of Linux, there comes a time when an almost 20 year-old budget-crunch system struggles to land even a kiosk job – that is, if it’s worth the owner’s time to find a buyer.

For this tower, it is time to send it on its way in the Great Material Continuum. In this case: a local e-waste drop-off location. My father and I spent some time sorting our tech stockpile and identifying stuff as broken, working, or unknown. We stripped down the systems we were sending off for what RAM they had remaining, a couple CPU’s, a bunch of computer screws, and the hard disk from the star of our goodbye party.

An hour or two before we left, I realized I’d lost my de-pointed safety pin from earlier, so I had to disassemble the drive to extract my boot CD. I ended up prying open the back of its case and tilting the unit around until my disk fell out the back.

Takeaway

I don’t know what secrets this computer might have had on its hard drive before I started poking at it. Perhaps nothing. It was certainly tidied up beyond a casual inspection before it found its way to me, but the criminally curious might have applied more powerful recovery tools in the hopes of finding some personal information to steal. This is why some computer recyclers will destroy your hard drive with a drill press while you watch. Better still would be scrambling the hard drive with random bits a few times beforehand.
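
As a rough sketch of that last step on Linux – assuming the doomed drive shows up as /dev/sdX, which absolutely must be double-checked because this is irreversibly destructive:

sudo shred -v -n 3 /dev/sdX (overwrite the entire drive with random data three times, showing progress)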

Final Question

Would you have had a better re-use for the computer I worked on this week?

I look forward to hearing from you in the comments below or on my Socials.

How I would Relearn Linux #3: Terminal Navigation

Good Morning from my Robotics Lab! This is Shadow_8472 with my third installment of How I would Relearn Linux, a series where I pick an entry-level skill I would want to re-learn first if I were to begin my Linux journey over again. The terminal is the heart of the Linux admin experience. Let’s get started!

Command Line Interface (CLI)

The most familiar paradigm of using a computer is the GUI, or Graphical User Interface. Before the GUI, computers more closely resembled increasingly fancy typewriters writing from left to right, top to bottom. Just as GUI’s are still here after VR has been on the market for years, so too has the Command Line Interface (CLI) endured. Modern terminal emulator programs are wildly customizable, but that is beyond the scope of this post as well as my capabilities.

Basic Navigation

I’ll assume you have experience with a GUI file manager. The Linux terminal can do everything those do, but instead of hitting buttons or double clicking folders, you use commands to jump around to directories. My experience is with a popular default shell known as bash.

The Command Prompt

Most beginner-level Linux distributions’ terminals display a prompt like this:

<username>@<hostname>:<directory>$ 

There may be variations, like different colors or square brackets. These prompts are heavily customizable, and distribution authors often do customize them. The point of the prompt is to display who is logged in to what machine and where it is working. For example:

[shadow8472@ButtonMash ~]$ ▯

In this case, username shadow8472 is logged in to ButtonMash. The ~ is a shortcut for the current user’s home directory. The hollowed out cursor means I clicked out of my terminal emulator to another window.

File Paths

Similar to other operating systems, Linux has a file system resembling the trunk, limbs, branches, etc. of a tree. Starting at the root directory, /, (often simply called “root”) absolute file paths list directories separated by additional /’s.

/home/shadow8472/Downloads/Games/TicTakToeStrat.sh

Relative paths start at the present working directory and work from there. So, if I were in my Downloads directory, the following path would point to the same file:

Games/TicTakToeStrat.sh

A leading . (period) in a file name means a file is hidden. A lone . points to the present working directory, and a pair of periods .. points to a parent directory. These are mostly used when working with a file in the present working directory or moving the present working directory up one level, respectively, though if so inclined, a user can use them as part of a larger path:

/home/././shadow8472/Downloads/./Games/../Games/../../Downloads/Games/./TicTakToeStrat.sh
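
In everyday use, they show up in much tamer forms, for example (assuming the script has been marked executable):

cd .. (move the present working directory up one level)
./TicTakToeStrat.sh (run a script sitting in the present working directory)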

Spaces in file names cause trouble because the space character separates arguments when working with commands. To get around this, the \ (backslash) may be used to escape each space as in “\ ”, or the entire file path may be placed in quotation marks. I find it much simpler to use capital letters for new words instead.
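
For illustration, both of these would change into a hypothetical directory with a space in its name:

cd My\ Games
cd "My Games"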

Commands

Linux commands are special system programs accessible from anywhere. As a general rule, they follow the syntax (pattern):

$ <name> -<flags> <additional parameters>

If everything before the $ is who and where you are, everything after is what you do. I like to imagine myself walking around the file system as though it were a maze of rooms. More practically, pressing tab will complete a command or file name (or list the possibilities, based on what’s been typed so far, if the user presses tab multiple times).

Flag and parameter syntax is specific to each individual command/program. In general, flags control the mode a program operates in; -h or --help is a common flag to quickly access a brief program description, acceptable syntax, other flags, and whatever else the programmer thinks is most important for users to know at a glance. Parameters are more about data – especially file paths. And if you weren’t confused enough, flags can have parameters of their own.
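
To make that concrete, here is a made-up but typical pair of invocations – the search term and file pattern are placeholders:

ls -l Downloads (the -l flag switches ls into its long listing mode)
grep -r --include="*.sh" "TicTakToe" . (the --include flag takes a parameter of its own)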

Point of clarification: with the exception of Java, multi-letter flags use two hyphens, and WordPress's automatic conversion to a dash is wrong. The correct form is:
--help

ls (LiSt the contents of a directory)
cd <file path> (Change Directory)

When using the terminal, I imagine myself walking around the file system as though it were a maze of rooms. I use ls to “Look Surroundings.” The command cd takes steps from one room to another – either by providing a name or by using .. to take one step backwards, toward the root. While not a command per se, I envision using tab to complete directory names as feeling around with my hands; three characters before pressing tab is usually enough to identify a name, or otherwise save a lot of time and mental energy typing and spell-checking an exact name.
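
A short session putting those two commands together might look like this (the directory listing is illustrative):

[shadow8472@ButtonMash ~]$ ls
Desktop  Documents  Downloads
[shadow8472@ButtonMash ~]$ cd Downloads/Games
[shadow8472@ButtonMash Games]$ cd ..
[shadow8472@ButtonMash Downloads]$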

cd -

I properly learned this trick specifically for this week’s post. It returns you to the previous present working directory. I’ve known about something like this for a while and I intend to use it more when “teleporting.”
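
Continuing the session above, a quick “teleport” and return might look like this (bash prints the directory it lands in):

[shadow8472@ButtonMash Downloads]$ cd /etc
[shadow8472@ButtonMash etc]$ cd -
/home/shadow8472/Downloads
[shadow8472@ButtonMash Downloads]$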

pwd (Displays absolute path of Present Working Directory)
whoami (Who Am I? outputs who you are working as)

These commands may seem redundant with the prompt, but some Bash shells reduce clutter and omit the path leading up to the present working directory, which can get confusing when working around multiple instances of a program. whoami exists because some prompts only start at the dollar sign, $.
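
For example, from deep in the directory tree used earlier:

[shadow8472@ButtonMash Games]$ pwd
/home/shadow8472/Downloads/Games
[shadow8472@ButtonMash Games]$ whoami
shadow8472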

man <program name> (MANual)

The man command brings up the “man page[s]” for a program, detailing exactly how to use it when -h or --help aren’t enough. If a program still eludes you, do an Internet search and hope your results aren’t just the man page all over again.
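
For instance, the full documentation for ls is only one command away (press q to quit the pager it opens in):

man ls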

Takeaway

The Linux command line cannot be done justice in a single blog post. Manipulating it effectively can be considered a form of art at higher skill levels – one best practiced on a VM or spare machine you can afford to break. Shell scripts effectively put the computer at its own terminal, and there exist command line programs written specifically for that situation. I’m planning a followup where I go over file manipulation. In the meantime, feel free to look up a cheat sheet and keep it somewhere safe.

Final Question

Are you familiar with the terminal? What assumptions would you say are safe to make about people who have only ever used a GUI file manager?

I look forward to hearing about them in the comments below or on my Socials.