What Is the LoRA Playground?
The LoRA Playground is a self-hosted web application for AI image generation using Stable Diffusion with LoRA (Low-Rank Adaptation) fine-tuned models. Unlike cloud-based tools such as Midjourney or DALL-E, it runs entirely on private GPU hardware — images are generated locally, nothing is sent to external servers, and the model weights are fully under your control.
The application is built as a Flask web app that communicates with a backend GPU server (Vasari) for image generation, and with a local 7B language model for prompt engineering. This architecture separates concerns cleanly: the web interface handles user input and the gallery, the GPU server handles inference, and the LLM handles text intelligence.
What Are LoRA Models?
LoRA (Low-Rank Adaptation) is a fine-tuning technique that allows a Stable Diffusion base model to be trained on a specific style, character, object or aesthetic — at a fraction of the cost and time of full fine-tuning. A LoRA is a small set of weight adjustments (typically 50-200 MB) that can be loaded on top of the base model at inference time to shift its outputs towards a particular style.
The practical implication is significant: instead of maintaining separate full model copies for each style, you maintain one base model and a library of LoRA adapters. Multiple LoRAs can be combined in a single generation, each with an independent scale weight, allowing fine-grained blending of styles.
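To make this concrete, here is a minimal sketch of multi-LoRA blending using the Hugging Face diffusers library. The article does not specify the Playground's actual inference stack, and the base model, file names and adapter names below are illustrative only:

```python
# Minimal multi-LoRA sketch with Hugging Face diffusers. This is not the
# Playground's own code; model path and adapter names are examples.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Each LoRA is a small adapter file loaded on top of the shared base model.
pipe.load_lora_weights("loras", weight_name="watercolor.safetensors", adapter_name="watercolor")
pipe.load_lora_weights("loras", weight_name="line_art.safetensors", adapter_name="line_art")

# Activate both adapters at once, each with its own scale weight.
pipe.set_adapters(["watercolor", "line_art"], adapter_weights=[0.7, 0.4])

image = pipe("a lighthouse on a cliff at dusk", num_inference_steps=30).images[0]
image.save("blended.png")
```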
Core Features
Multi-LoRA Support with Scale Control
The playground loads all available LoRA models from the Vasari server at startup. Users can select any combination of LoRAs for a generation, and set a scale value for each (0.0 to 1.5). A scale of 0.7 applies the LoRA at moderate strength; pushing above 1.0 exaggerates its stylistic influence; dropping towards 0 blends it in subtly. This per-LoRA weight control is the key to achieving nuanced mixed-style outputs.
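As an illustration, the LoRA selection carried by a generation request might look like the following. The actual field names in the Playground's API are not documented in this article, so treat this shape as an assumption:

```python
# Hypothetical shape of a per-generation LoRA selection; field names are
# assumptions, not the Playground's documented contract.
lora_config = [
    {"name": "watercolor", "scale": 0.7},   # moderate stylistic strength
    {"name": "line_art",   "scale": 1.2},   # above 1.0: exaggerated influence
    {"name": "film_grain", "scale": 0.2},   # towards 0: blended in subtly
]
```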
LLM Prompt Enhancement
Writing effective Stable Diffusion prompts is a skill. The Playground addresses this with an AI-powered enhancement step: users can enter simple keywords (a subject, a mood, a setting), and a 7B language model — running locally on the same GPU server — expands those keywords into a detailed, technically-structured prompt.
The LLM is aware of which LoRA models are selected and tailors the prompt to their known styles. If a LoRA is trained on a specific artistic style, the enhancement step will include style-relevant language that activates that LoRA’s strengths.
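A sketch of what that enhancement call might look like, assuming a simple JSON contract for the /chat endpoint (the real host, port and payload shape are assumptions; only the endpoint path is named in this article):

```python
import requests

VASARI = "http://vasari:8000"  # hypothetical host/port

def enhance_prompt(keywords: str, selected_loras: list[str]) -> str:
    """Expand bare keywords into a detailed Stable Diffusion prompt via the 7B LLM."""
    system = (
        "You write Stable Diffusion prompts. These LoRA styles are active: "
        + ", ".join(selected_loras)
        + ". Expand the user's keywords into a detailed prompt that plays to those styles."
    )
    # The /chat request and response shapes below are assumptions.
    resp = requests.post(f"{VASARI}/chat", json={
        "system": system,
        "prompt": keywords,
        "max_tokens": 200,
    }, timeout=60)
    resp.raise_for_status()
    return resp.json()["text"]
```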
Eight Style Rewrite Presets
Once a prompt exists, users can apply style transformations using eight presets, each with a specific LLM instruction:
- Fight: Transforms the scene into a freeze-frame combat moment with dynamic angles, foreshortening, motion lines and impact effects
- Portrait: Refocuses on the face and upper body with shallow depth of field, rim lighting and detailed expression
- Cinematic: Adds dramatic lighting, volumetric rays, film grain, lens flare and movie-still composition
- Dark: Shifts the mood with high contrast, deep shadows, desaturated palette and horror atmosphere
- Vibrant: Pushes colour to bold, saturated hues with clean lines and bright highlights
- Manga: Converts the scene to ink lines, screentone, halftone dots and Japanese comic panel style
- Painted: Applies oil painting texture with visible brushstrokes, impasto and classical composition
- HD: Appends quality tokens — masterpiece, best quality, highly detailed, 8k — without changing the subject
Each preset sends the current prompt to the 7B LLM with a precise rewriting instruction and returns a new prompt. This means users can iterate rapidly through visual directions without rewriting prompts manually.
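In code, the preset mechanism reduces to a table of rewriting instructions and a single LLM call, along these lines. The instruction strings below paraphrase the descriptions above; the Playground's exact internal wording is not published:

```python
import requests

VASARI = "http://vasari:8000"  # hypothetical host/port, as in the earlier sketch

# Instruction strings paraphrase the preset descriptions in the article.
STYLE_PRESETS = {
    "fight": "Rewrite as a freeze-frame combat moment: dynamic angle, foreshortening, motion lines, impact effects.",
    "portrait": "Refocus on face and upper body: shallow depth of field, rim lighting, detailed expression.",
    "cinematic": "Add dramatic lighting, volumetric rays, film grain, lens flare, movie-still composition.",
    "dark": "Shift the mood: high contrast, deep shadows, desaturated palette, horror atmosphere.",
    "vibrant": "Push colour to bold saturated hues with clean lines and bright highlights.",
    "manga": "Convert to ink lines, screentone, halftone dots, Japanese comic panel style.",
    "painted": "Apply oil painting texture: visible brushstrokes, impasto, classical composition.",
    "hd": "Append quality tokens (masterpiece, best quality, highly detailed, 8k) without changing the subject.",
}

def apply_preset(prompt: str, preset: str) -> str:
    """Send the current prompt to the 7B LLM with the preset's rewriting instruction."""
    resp = requests.post(f"{VASARI}/chat", json={
        "system": STYLE_PRESETS[preset],
        "prompt": prompt,
        "max_tokens": 200,
    }, timeout=60)
    resp.raise_for_status()
    return resp.json()["text"]
```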
Full Generation Parameter Control
Users control all standard Stable Diffusion parameters:
- Steps: 1-50 (default 30) — more steps produce a more refined image at the cost of longer generation time
- Resolution: Width and height, default 1024×1024
- Seed: Manual or random — a fixed seed with the same prompt reproduces the same image, enabling reproducible iteration (see the sketch after this list)
- Negative prompt: Explicit exclusions passed directly to the diffusion model
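The seed is what makes reproduction possible: diffusion sampling is deterministic given the same prompt, parameters and random state. A diffusers-style sketch, continuing the earlier multi-LoRA example (which is itself an assumption about the stack):

```python
import torch

# Reusing `pipe` from the multi-LoRA sketch above. Fixing the generator seed
# makes sampling deterministic, so the same call reproduces the same image.
generator = torch.Generator(device="cuda").manual_seed(123456789)

image = pipe(
    "a lighthouse on a cliff at dusk",
    negative_prompt="blurry, low quality, watermark",
    num_inference_steps=30,
    width=1024,
    height=1024,
    generator=generator,
).images[0]
```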
Async Generation with Polling
Image generation runs asynchronously in a background thread. The browser receives a job ID immediately and polls for completion, keeping the UI responsive during generation, which can take 30-240 seconds depending on steps and resolution. Status is reported live, so users can see whether a generation is in progress, has completed or has failed.
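A minimal Flask sketch of this pattern, assuming a `generate_image` helper that calls the GPU backend (sketched in the architecture section below). The job bookkeeping and route names here are illustrative, not the Playground's actual endpoints:

```python
import threading
import uuid

from flask import Flask, jsonify, request

app = Flask(__name__)
jobs: dict[str, dict] = {}  # in-memory job state, keyed by job ID

def run_generation(job_id: str, params: dict) -> None:
    """Background worker: runs the slow generation call off the request thread."""
    jobs[job_id]["status"] = "in_progress"
    try:
        png = generate_image(params)  # hypothetical backend call, sketched later
        path = f"gallery/{job_id}.png"
        with open(path, "wb") as f:
            f.write(png)
        jobs[job_id].update(status="done", image=path)
    except Exception as exc:
        jobs[job_id].update(status="error", error=str(exc))

@app.post("/generate")
def generate():
    job_id = uuid.uuid4().hex
    jobs[job_id] = {"status": "queued"}
    threading.Thread(target=run_generation, args=(job_id, request.json), daemon=True).start()
    return jsonify(job_id=job_id)  # returned immediately; the browser polls below

@app.get("/status/<job_id>")
def status(job_id: str):
    return jsonify(jobs.get(job_id, {"status": "unknown"}))
```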
Persistent Gallery
Generated images are saved to disk as PNGs with all metadata stored in SQLite: prompt, negative prompt, LoRA configuration, seed, steps and resolution. The gallery retains the last 20 images, with older entries automatically pruned. This means users can return to previous generations, compare outputs across different LoRA combinations, and reproduce any image exactly by reusing its seed and parameters.
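A sketch of that storage layer using Python's built-in sqlite3, including the automatic prune to the 20 newest rows. The column names are assumptions based on the metadata fields listed above, and deleting the pruned PNG files from disk is omitted for brevity:

```python
import sqlite3

db = sqlite3.connect("gallery.db")
db.execute("""CREATE TABLE IF NOT EXISTS images (
    id              INTEGER PRIMARY KEY AUTOINCREMENT,
    path            TEXT,
    prompt          TEXT,
    negative_prompt TEXT,
    loras           TEXT,     -- LoRA configuration, e.g. JSON-encoded name/scale pairs
    seed            INTEGER,
    steps           INTEGER,
    width           INTEGER,
    height          INTEGER,
    created_at      TEXT DEFAULT CURRENT_TIMESTAMP
)""")

def save_metadata(meta: dict) -> None:
    db.execute(
        "INSERT INTO images (path, prompt, negative_prompt, loras, seed, steps, width, height)"
        " VALUES (:path, :prompt, :negative_prompt, :loras, :seed, :steps, :width, :height)",
        meta,
    )
    # Retain only the 20 newest rows; older gallery entries are pruned.
    db.execute(
        "DELETE FROM images WHERE id NOT IN"
        " (SELECT id FROM images ORDER BY id DESC LIMIT 20)"
    )
    db.commit()
```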
Prompt state also persists between sessions — the last used prompt and keyword set are written to disk and reloaded on the next visit, so users do not lose their working prompt when the browser closes.
The Architecture Behind It
The Playground runs as a lightweight Flask application on the theoracle server, proxied through Nginx. It communicates with two services on the Vasari GPU server:
- /giorgio/paint — the Stable Diffusion inference endpoint, receives prompt, parameters and LoRA configuration, returns the generated PNG
- /chat — the 7B language model endpoint, used for both prompt enhancement and style rewriting
This separation keeps the web application itself free of generation work: it holds only transient job state in memory and image files on disk, while all the compute happens on the GPU server. The Flask app can be restarted, updated or migrated without touching the generation backend.
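The backend call itself can be as simple as one HTTP POST. This is a sketch of the `generate_image` helper assumed in the earlier polling example; the endpoint path comes from the article, but the host, port and payload fields are assumptions:

```python
import requests

VASARI = "http://vasari:8000"  # hypothetical host/port

def generate_image(params: dict) -> bytes:
    """Forward a generation request to the Vasari inference endpoint; returns PNG bytes."""
    resp = requests.post(f"{VASARI}/giorgio/paint", json={
        "prompt": params["prompt"],
        "negative_prompt": params.get("negative_prompt", ""),
        "steps": params.get("steps", 30),
        "width": params.get("width", 1024),
        "height": params.get("height", 1024),
        "seed": params.get("seed"),           # None could let the backend pick a random seed
        "loras": params.get("loras", []),     # [{"name": ..., "scale": ...}]
    }, timeout=300)  # generation can take minutes, so a generous timeout
    resp.raise_for_status()
    return resp.content  # the generated PNG
```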
Why Self-Hosted Matters for Creative Work
Cloud-based image generation tools are fast and accessible, but they come with constraints that matter for professional creative use:
- Content policies: Commercial services enforce content filters that can block stylistic territory that is entirely legitimate for illustration, game art or concept work. Self-hosted models have no such restrictions.
- Model control: With a self-hosted setup, you choose exactly which base model and which LoRAs to run. You are not limited to the provider’s model selection.
- Cost at scale: Per-image pricing adds up quickly at production volumes. GPU hardware amortises its cost over time, making high-volume generation economically viable.
- IP and privacy: Prompts and outputs do not leave your infrastructure. For clients with confidentiality requirements or for work involving proprietary visual styles, this is not optional.
Building Something Similar
The LoRA Playground is one example of what is possible when LLM and image generation capabilities are combined in a self-hosted environment. The same architecture can be adapted for:
- Brand-specific image generation tools trained on your visual identity
- Product photography automation for e-commerce
- Game asset generation pipelines
- Architectural and interior design visualisation tools
- Personalised content generation at scale
If you are interested in building a custom AI image generation system — self-hosted, fine-tuned on your own visual data, integrated into your existing workflow — get in touch with our team. We build end-to-end AI solutions from the model layer up.