Good Morning from my Robotics Lab! This is Shadow_8472 and today I am getting into the world of self-hosted AI chat. Let’s get started!
Welcome to the Jungles
The Linux ecosystem is a jungle when compared to Windows or Mac. Granted: it’s had decades to mature atop its GNU roots, which go back before the first Linux kernel. Emergent names such as Debian, Ubuntu, Arch, and Red Hat stand tall and visible above a canopy of other distros based on them. Smaller names, searchable on rosters like DistroWatch, are akin to the understory layer, with a jungle floor of personal projects beneath. Rinse and repeat for every kind of software, from window managers to office tools. Every category has its tourist attractions, and an army of guides is often more than happy to advise newcomers on how to assemble a working system. The Linux ecosystem is a jungle I have learned to navigate, but I would be remiss if I were to say it is not curated!
This isn’t my first week on AI. Nevertheless, by comparison, the AI landscape feels like the playground/park my parents used to take me to, if it were scaled up so that I were only a couple inches tall. ChatGPT, Gemini, Stable Diffusion, and other big names are the first anyone learns when researching AI – establishing them as the de facto benchmarks everything else is judged by in their respective sub-fields. Growing in among the factionalized giants is a comparatively small selection of searchable shrubs, but if you wish to fully self-host, 2-inch-tall you just about has to venture into the grass field of projects too short-lived to stand out before being passed up. The AI ecosystem is a jungle where future canopy and emergent layers are indistinguishable from shrubs and moss on the forest floor. The art of tour guiding is guesswork at best because the ecosystem isn’t mature enough to be properly pruned. I could be wrong of course, but this is my first impression of the larger landscape.
AI Driven Character Chat
My goal this week was to work towards an AI chat bot and see where things went from there. I expect most everyone reading this has either used or heard of ChatGPT and/or similar tools. The user says something and the computer responds based on the conversational context using a Large Language Model (LLM – a neural network trained on large amounts of data). While I have a medium-term goal of using AI to solve my NFS+rootless Podman issues, I found a much more fun twist: AI character chat.
LLMs can be “programmed” by the user to respond in certain ways, strikingly similar to how Star Trek’s holodeck programs and characters are depicted working. One system I came across to facilitate this style of interaction is called Silly Tavern. Silly Tavern alone doesn’t do much – if a complete AI chatbot setup were a car, I’d compare Silly Tavern to the car interior. To extend the analogy, the LLM is the engine powering things, but the two need an LLM engine (the software that loads and runs the model) to connect them, like a car frame.
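To make the “programming” idea a little more concrete, here is a minimal sketch of how a front end might turn a character description plus the chat so far into the text an LLM actually sees. Everything in it is hypothetical and simplified for illustration: `CharacterCard`, `Chat`, and `build_prompt` are names I made up, and this is not Silly Tavern’s real card format or prompt template.

```python
from dataclasses import dataclass, field


@dataclass
class CharacterCard:
    """Hypothetical, simplified character card (not Silly Tavern's real format)."""
    name: str
    description: str
    greeting: str


@dataclass
class Chat:
    """Holds one card plus the running conversation."""
    card: CharacterCard
    user_name: str = "User"
    history: list = field(default_factory=list)  # (speaker, text) pairs

    def add(self, speaker: str, text: str) -> None:
        self.history.append((speaker, text))

    def build_prompt(self) -> str:
        """Assemble the text that would be handed to the LLM engine."""
        lines = [
            # The "program": plain-language instructions about who to be.
            f"You are {self.card.name}. {self.card.description}",
            f"{self.card.name}: {self.card.greeting}",
        ]
        # The conversational context: everything said so far, in order.
        lines += [f"{speaker}: {text}" for speaker, text in self.history]
        # A trailing cue so the model continues speaking as the character.
        lines.append(f"{self.card.name}:")
        return "\n".join(lines)


card = CharacterCard(
    name="Guide",
    description="A cheerful tour guide who explains Linux distros with jungle metaphors.",
    greeting="Welcome to the jungle! Mind the vines.",
)
chat = Chat(card=card)
chat.add(chat.user_name, "Which distro should I try first?")
prompt = chat.build_prompt()
print(prompt)
```

In car terms, this sketch is all interior: it only builds the prompt. Sending that prompt to an LLM engine and getting text back is a separate step that is left out here.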
Following the relevant Silly Tavern documentation for self-hosted environments, I located and deployed Oobabooga as an LLM engine and an LLM called Kunoichi-DPO-v2. Without going into the theory this week, I went with a larger and smarter version than is recommended for a Hello World setup because I had the vRAM available to run it. Each of these three parts has alternatives, of course. But for now, I’m sticking with Silly Tavern.
I doubt I will forget the first at-length conversation I had with my setup. It was directly on top of Oobabooga running the LLM, and we eventually got to talking about a baseball team themed after the “Who’s on First?” skit, but with positions taken up by fictional time travelers from different franchises. I had it populate the stadium with popcorn and chili dog vendors, butlers, and other characters – all through natural language. It wasn’t perfect, but it was certainly worth a laugh when, say, I had the pitcher, Starlight Glimmer (My Little Pony), trot over to Sonic’s chili dog stand for some food and conversation (I’m just going to pretend he had a vegetarian option, even though neither the bot nor I thought of it at the time).
Just as importantly, I asked it a few all-but-mandatory questions about itself, which I plan on covering next week along with the theory. The day after the baseball team conversation, I went to re-create the error I’d previously gotten out of Silly Tavern, and instead I got a response. Normally, I’d call it magic, but in this conversation with the AI, I casually asked something like,
You know when something on a computer doesn’t work, it gets left alone for a while, and then it works without changing anything?
I was just making conversation as I might with a human, but it came back with a very reasonable-sounding essay to the tune of:
Sometimes memory caches or temporary files are refreshed or cleaned up, letting things work when before they didn’t. [Rough summary from memory; I haven’t seen the original in days.]
Moving on, I had a stretch goal for the week of working towards one of Silly Tavern’s features: character group chat. For that purpose, I found a popular character designed to build characters. We tried to build a card for Sonic the Hedgehog. The process was mildly frustrating at times, but we eventually ended up talking about how to optimize the card for a smaller vRAM footprint, and its advice changed wildly when I brought up my intention to group chat.
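For a rough feel of why card length matters, here is a small sketch that estimates how much of a model’s context a card might eat. It leans on loud assumptions: the four-characters-per-token figure is only a common rule of thumb for English text (real tokenizers vary by model), the 4096-token context size is a made-up example, and `rough_token_count` and `card_budget` are hypothetical helpers rather than anything from Silly Tavern or Oobabooga.

```python
def rough_token_count(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; an assumption, not a real tokenizer."""
    return round(len(text) / chars_per_token)


def card_budget(card_fields: dict, context_tokens: int) -> dict:
    """Estimate tokens per card field and what is left for the chat itself."""
    per_field = {name: rough_token_count(text) for name, text in card_fields.items()}
    total = sum(per_field.values())
    return {
        "per_field": per_field,
        "total": total,
        "remaining": context_tokens - total,
    }


# Hypothetical card text, written only for this example.
sonic_card = {
    "description": "A fast blue hedgehog with a big ego and a bigger heart. " * 5,
    "personality": "Impatient, loyal, loves chili dogs.",
    "greeting": "Hey! You're too slow!",
}

report = card_budget(sonic_card, context_tokens=4096)
print(report["per_field"])
print("card total:", report["total"], "left for chat:", report["remaining"])
```

One plausible reason group chat changes the picture is that several cards may need to share the same context at once, so each one leaves less room for the conversation.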
Takeaway
I learned more about this topic than I often do in a given week, so I am splitting the theory out to save for next week. Hopefully, I will have group chat working by then, as well as another feature I thought looked interesting.
Final Question
Love it or hate it, what are your thoughts about the growing role AI is taking on in society? I look forward to hearing from you in the comments below or on my Socials!