Creative AI Tools for the Curious

The landscape of AI-powered creative tools has expanded rapidly over the past few years, with a growing number of platforms now covering everything from image generation and video production to voice synthesis. This compilation brings together a selection of those tools, each approaching creativity from a different angle and serving a different set of needs.
The tools featured here span a broad range of disciplines. Some, such as Stable Diffusion and ComfyUI, are built around open-source flexibility and local control, whilst others, such as Midjourney and ElevenLabs, offer polished, hosted experiences designed for quick results. Rather than ranking or comparing them, this piece simply presents each tool on its own terms, outlining what it does, how it works, and who it might suit.
ComfyUI is an open-source, node-based application for generating images from text prompts. Developed by comfyanonymous and released in January 2023, it has gained significant traction within the AI art community due to its integration with diffusion models like Stable Diffusion. Key features include a visual workflow system, support for multiple diffusion models, saveable workflows in JSON format, and customisable extensions. The software has over 50,000 stars on GitHub and is supported by Nvidia's RTX Remix modding software, as well as the Open Model Initiative. Despite a learning curve and some performance challenges, ComfyUI offers a flexibility and visual clarity that make it a popular tool for artists and developers interested in AI-generated imagery.
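Because workflows are saved as plain JSON, ComfyUI is also scriptable. As a rough sketch, the snippet below queues a previously exported workflow over HTTP; it assumes a local ComfyUI server on the default port 8188 and a workflow saved in API format, and uses only the standard library.

```python
import json
import urllib.request

# Default local ComfyUI address (assumption: an unmodified local install).
COMFY_URL = "http://127.0.0.1:8188"

def build_prompt_request(workflow: dict) -> urllib.request.Request:
    """Wrap an API-format workflow dict in a request for the /prompt endpoint."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{COMFY_URL}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
    )

def queue_workflow(workflow: dict) -> bytes:
    """Submit the workflow and return the server's raw JSON reply."""
    with urllib.request.urlopen(build_prompt_request(workflow)) as resp:
        return resp.read()
```

A workflow exported via the UI's API-format save option can be loaded with `json.load` and passed straight to `queue_workflow`; the node names and IDs inside it are whatever the graph contained when it was saved.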
ElevenLabs is an AI voice synthesis platform specialising in hyperrealistic voice generation for applications such as audiobooks, video narration and voice cloning. The service employs advanced AI models to produce natural, conversational narration and can create realistic podcast-style content with dual AI co-hosts.
Users can replicate voices using just minutes of audio for basic cloning, or achieve professional-grade results with longer recordings of over 30 minutes. The platform caters to developers through comprehensive APIs and SDKs supporting various programming languages including Python and TypeScript for integration into applications and services.
However, the platform restricts studio projects to three concurrent works, each supporting up to 500 chapters, with individual chapters limited to 400 paragraphs and 5,000 characters per paragraph. The handling of voice data also raises privacy considerations, particularly around the voice cloning features, so users should examine the terms of service and data security policies before uploading sensitive audio content.
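Long-form text can trip over these caps, so it is worth pre-splitting content before upload. The helper below is a simple sketch built around the 5,000-character paragraph limit quoted above; it prefers sentence boundaries and leaves any single oversized sentence intact rather than cutting mid-word.

```python
def split_paragraph(text: str, limit: int = 5000) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring
    sentence boundaries ('. ') so paragraphs stay within the per-paragraph cap.
    A lone sentence longer than `limit` is returned as-is; hard-wrap those
    separately if they occur."""
    pieces = text.split(". ")
    # Re-attach the separator that split() removed, except on the last piece.
    sentences = [s + ". " for s in pieces[:-1]] + [pieces[-1]]
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) > limit:
            chunks.append(current.rstrip())
            current = ""
        current += sentence
    if current:
        chunks.append(current.rstrip())
    return chunks
```

Running a chapter's paragraphs through a helper like this before creating a project keeps uploads within the stated limits without manual counting.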
Ideogram is a user-friendly, text-to-image AI generator that enables users to create visually appealing images from written prompts. The tool caters to various users, including artists, marketers and content creators, who can generate photorealistic images and artwork quickly and efficiently using deep learning neural networks. Users can input text prompts, which the AI translates into corresponding visuals, with customisation options such as styles and moods. All in all, Ideogram is a versatile tool that facilitates creativity and streamlines the design process across various fields.
Midjourney produces pictures from written descriptions provided in natural language, creating artwork ranging from photorealistic photography to illustrations, anime and abstract art. Originally accessible only through Discord channels, the platform now offers a standalone web interface where users can enter prompts directly, browse galleries and organise their creations, though Discord remains available for those who prefer community-driven workflows.
Users describe desired images with varying levels of detail, such as specifying landscapes with particular lighting conditions or requesting minimalist illustrations in specific colour palettes, and the system returns generated options that can be upscaled, varied or regenerated through refined prompts. The service excels at producing aesthetically coherent imagery with strong handling of lighting, textures and composition, making it particularly useful for blog illustrations, concept art, marketing materials and social media graphics. It does, however, require a subscription and tends to prioritise artistic quality over strict prompt accuracy. While rapid iteration supports creative concept development effectively, the system is less deterministic than traditional graphics tools and sometimes demands complex prompting for precise control, making it less suitable for technical diagrams or pixel-perfect editing work.
Founded in 2018, Runway is a generative AI platform specialising in creating and editing video, images and multimedia content through artificial intelligence models. The cloud-based software allows users to generate video clips from written descriptions, animate still images, apply AI-driven effects to existing footage and perform tasks like background removal, motion tracking and style transfers through a graphical interface.
The platform has developed progressively sophisticated models, moving from video-to-video transformation to commercial systems capable of producing increasingly realistic motion and visual consistency from written prompts, images or other video clips. Widely adopted by filmmakers, designers and content creators across the film production, advertising and social media industries, it represents a significant step in AI capabilities beyond written content generation towards the automated creation of moving visual media. In essence, it functions as an AI-powered video studio: users input descriptions and receive generated or edited video in return.
Released in 2022 by Stability AI alongside CompVis and Runway, this deep learning model generates pictures from written prompts by progressively transforming random noise into coherent images through an iterative denoising process. Unlike closed cloud services such as Midjourney or DALL-E, it offers publicly available model weights that allow users to run the software locally on consumer-grade hardware with reasonably modern GPUs, modify the underlying system and fine-tune outputs without requiring internet connectivity or subscription fees.
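The iterative denoising at the heart of such models can be illustrated in miniature. The toy sketch below uses pure NumPy with a made-up linear "noise estimate" standing in for the trained neural network; it captures only the shape of the sampling loop, not the actual diffusion mathematics or noise schedule.

```python
import numpy as np

def toy_denoise(target: np.ndarray, steps: int = 50, seed: int = 0) -> np.ndarray:
    """Illustrative sampling loop: start from Gaussian noise and repeatedly
    remove a fraction of the estimated noise, mimicking the structure of
    diffusion sampling. In a real model, a trained network predicts the
    noise; here we cheat and compute it directly from a known target."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(target.shape)  # pure noise, like the final timestep
    for _ in range(steps):
        predicted_noise = x - target       # stand-in for the network's estimate
        x = x - 0.1 * predicted_noise      # one small denoising step
    return x
```

Each pass shrinks the gap between the current sample and the target by a fixed factor, which is why more steps yield a cleaner result; real samplers trade off step count against quality in much the same way, guided by a learned noise predictor rather than a known answer.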
The model supports extensive customisation through community checkpoints, LoRA adapters and ControlNet composition tools, whilst various interfaces including AUTOMATIC1111, ComfyUI, InvokeAI and DreamStudio provide accessible workflows beyond command line operation. Its popularity stems from full control over models and outputs, offline capability that addresses privacy concerns, a large community sharing styles and the ability to embed it into automated pipelines, though it does require technical setup expertise and GPU resources for optimal performance. Quality depends heavily on which model checkpoint is used, and unlike hosted alternatives, it lacks built-in moderation systems, making it the most flexible, if least polished, option for developers, artists and privacy-conscious users.