Cloudinary adds Generative AI to its Programmable Media Image and Video APIs

By News Desk On Jun 22, 2023

Cloudinary, the image and video platform that powers many of the world’s top brands, announced the availability of several new generative AI, large language model (LLM) and GPT-based features within its Programmable Media image and video APIs, including Generative Fill, Generative Remove, Generative Replace, AI-powered Image Captioning, and a ChatGPT-backed natural language interface. The new features, available now, let users create customized, personalized assets in seconds and help technical teams scale quickly by automating workflows and eliminating repetitive, time-consuming image manipulation tasks.

Advanced creative work is time-consuming, expensive and, for some brands, simply out of reach. More than 10,000 customers and 1.5 million users have long benefited from the power of AI via Cloudinary’s award-winning image and video APIs. The new generative AI capabilities extend these benefits by making it easier for users to create, edit and deliver dynamic visual experiences at scale, including work that was previously impractical. For example, instead of re-shooting an entire campaign, developers and digital marketers can remove unwanted objects from existing images at scale through Cloudinary’s APIs. Likewise, AI-powered image captioning generates captions for images instantly, improving accessibility, asset searchability and SEO while boosting productivity and reducing production time.

“Generative AI has completely transformed the way we work, and the most recent advancements are just scratching the surface of what’s possible,” said Nadav Soferman, co-founder and chief product officer, Cloudinary. “Our founding mission was to revolutionize the way in which brands manage and deliver images and video at scale, and building solutions that harness the most advanced technologies has been central to delivering on that promise.”

Soferman continued, “Since launching our flagship image management product, which utilized AI for face-detection-based cropping, we’ve led advancements across media management, leveraging the power of AI, machine and deep learning, and raising the bar for what’s possible in media creation and delivery. It’s always been about letting advanced technology streamline or eliminate tedious tasks so brands can focus their limited resources on creating high impact, highly visual sites, apps and campaigns that connect, engage, inspire and convert. We are very excited to make these powerful new generative AI capabilities a reality for the technical and non-technical teams committed to bringing their best visual stories to life – and we’re just getting started.”

New features bring ease and automation to visual media workflows

Generative Fill: Enables users to expand and extend an original image, which is especially useful when converting an image from vertical to horizontal orientation. The new AI-generated background blends seamlessly with the original image.
Generative Remove: Via natural language prompts, users can remove unwanted elements from images and automatically fill in a matching background. The feature uses open-set object detection models together with Stable Diffusion.
Generative Replace: Allows users to detect and replace unwanted elements and colors, all via natural language prompts. This capability is especially useful for creating color-based variations of products or for improving web accessibility for people with color blindness.
AI-powered Image Captioning: Intelligently creates image captions for galleries, user-generated content and product descriptions at scale. Image-to-text features strengthen image SEO, improve accessibility, and automate image classification for better findability. It also helps e-commerce users save time by automatically creating smart product descriptions.
Conversational Transformations Builder: This intuitive feature provides a natural language interface through ChatGPT, allowing users to effortlessly communicate desired image transformations and optimizations. For example, a simple command such as “please blur this image and crop to a 1:1 aspect ratio” would deliver a complete and correct transformation.
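
As a rough illustration of the blur-and-crop request above, the sketch below builds an image delivery URL by hand. The announcement does not show what the builder actually produces, so everything here is an assumption: the parameters `ar_1:1`, `c_fill` and `e_blur` are taken from Cloudinary's general URL transformation syntax, while the helper `build_delivery_url`, the cloud name `demo-cloud`, the blur strength of 300 and the asset `sample.jpg` are hypothetical.

```python
from typing import List


def build_delivery_url(cloud_name: str, public_id: str, transformations: List[str]) -> str:
    """Join transformation parameters into a Cloudinary-style delivery URL.

    Hypothetical helper for illustration only; it is not part of any SDK.
    """
    # Parameters within a single transformation step are comma-separated.
    chain = ",".join(transformations)
    return f"https://res.cloudinary.com/{cloud_name}/image/upload/{chain}/{public_id}"


# "Please blur this image and crop to a 1:1 aspect ratio"
# Assumed mapping: ar_1:1 + c_fill for the square crop, e_blur for the blur.
url = build_delivery_url("demo-cloud", "sample.jpg", ["ar_1:1", "c_fill", "e_blur:300"])
print(url)
```

The point of the conversational interface, as described, is that users would not need to assemble such parameter strings themselves.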

AI at its core from the start

Cloudinary has a long history of delivering powerful AI capabilities to its customers, both through trusted, industry-leading AI technologies from providers such as OpenAI, Google Vision, and Amazon Rekognition and through its own domain expertise and content-aware machine learning models, including those for background removal, smart image tagging, video cropping and domain-specific models for industries such as fashion and furniture. For more than a decade, Cloudinary’s image and video solutions have leveraged AI and ML to offer advanced image and video transformations including face detection, contextual cropping and auto-tagging, and now generative AI capabilities and ChatGPT integrations for intelligent image captioning. Cloudinary Assets simplifies digital asset management through UI-based auto-tagging, AI-powered visual search, and content moderation, enabling seamless workflows for user-generated content. In addition, the Cloudinary Assets Studio feature uses generative AI to simplify bulk asset editing.

SOURCE: BusinessWire