Generative AI continues to develop at a blistering pace, making it more important than ever that organizations have access to enterprise-ready capabilities to help them leverage this disruptive technology. 

Harnessing the power of decades of Google’s research, innovation, and investment in AI, Google Cloud continues to make generative AI available with baked-in security, data governance, and scalability across the board. 

To this end, last month we announced the general availability of Generative AI support on Vertex AI, giving customers the ability to access powerful foundation models from Google Research, along with tools for customizing and applying them.

Today, we're announcing the general availability (GA) of four more foundation models on Vertex AI: Imagen, PaLM 2 for Chat, Codey, and Chirp. For each of these models, organizations can access APIs through Model Garden and do prompt design and tuning in Generative AI Studio; a minimal SDK example follows the list below.

  • Imagen includes four key features:
    • Image generation for creating studio-grade images at scale
    • Image editing to edit generated or existing images via text prompts 
    • Image captioning for creating captions of images at scale
    • Visual Question Answering (VQA) for interacting with, analyzing, and explaining images
  • PaLM 2 for Chat follows the general availability of PaLM 2 for Text in June 
  • Codey supports code generation, completion, and code chat
  • Chirp supports multilingual Speech AI 
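
For illustration, here is a minimal sketch of calling one of these models (PaLM 2 for Text, which went GA in June) through the Vertex AI Python SDK. The project ID is a placeholder, and module paths and model versions may differ depending on your SDK release.

```python
# pip install google-cloud-aiplatform
import vertexai
from vertexai.language_models import TextGenerationModel

# Initialize the SDK against your own project and region (placeholders).
vertexai.init(project="your-project-id", location="us-central1")

# Load PaLM 2 for Text from Model Garden and run a simple prompt.
model = TextGenerationModel.from_pretrained("text-bison@001")
response = model.predict(
    "Summarize the benefits of managed foundation models for enterprises.",
    temperature=0.2,
    max_output_tokens=256,
)
print(response.text)
```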

We're also announcing the Multimodal Embeddings API in preview, which lets customers combine the power of Vertex AI's generative AI models with their proprietary data to generate embeddings, or vector representations, of their text and image data. These capabilities can enable data science teams to power a variety of downstream tasks such as image classification, content recommendations, and visual search. 

In this blog post, we'll explore what your organization can do with these powerful models and how Vertex AI provides the enterprise-ready capabilities you can use to get up and running with generative AI. 

Helping to drive enterprise value from generative AI models

Powerful models are the foundation of generative AI, but the software, tools, and infrastructure that surround these models are equally important for enterprise adoption. Organizations face challenges not only accessing these models, but also integrating AI while maintaining protection over intellectual property, adhering to regulations around data security and privacy, and ensuring models and applications are safe to use. Many organizations also want to use generative AI without incurring large costs or managing huge clusters.

We help address these challenges head-on with Vertex AI's platform capabilities for scalable application integration, purpose-built AI infrastructure, secure and private data customization, and responsible use of this technology. 

Let’s see how each of these pillars can help your organization. 

Access models to build production-ready generative applications 
Vertex AI can make it easy to access foundation models, as today’s model announcements attest. While models are an inextricable part of generative AI, the software that helps enterprises use this technology is equally important—which is why Vertex AI also offers a range of tools for tuning, deploying, monitoring, and maintaining models, so you can build differentiated applications using your own data. 

Turning to today's announcements: in May, we announced Imagen, our foundation model for image generation. Now, we're excited to announce that Imagen is generally available with an allowlist (i.e., approved access via your sales representative), letting onboarded customers start using its image generation and editing capabilities. Visual Q&A and Captioning are also generally available to all customers for production workloads. Visual Q&A provides new ways to engage with image-based data like retail products or image libraries. It can answer questions about an image, helping you analyze large amounts of data quickly, and it can even help visually impaired users understand images or graphs they otherwise couldn't. Captioning, meanwhile, makes it easy to generate relevant descriptions for your images. Captions can help with indexing and searching, as well as with assigning image descriptions to product listings on eCommerce websites. 
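
As a rough sketch, here is how Captioning and Visual Q&A (the two Imagen capabilities generally available to all customers) can be called from the Vertex AI Python SDK. The image path and project ID are placeholders, and the class and model names reflect the preview SDK surface, so they may differ in your environment.

```python
# pip install google-cloud-aiplatform
import vertexai
from vertexai.preview.vision_models import Image, ImageCaptioningModel, ImageQnAModel

vertexai.init(project="your-project-id", location="us-central1")

# Load a local product photo (placeholder path).
image = Image.load_from_file("product_photo.png")

# Captioning: generate short descriptions for indexing or product listings.
caption_model = ImageCaptioningModel.from_pretrained("imagetext@001")
captions = caption_model.get_captions(image=image, number_of_results=2)
print(captions)

# Visual Q&A: ask free-form questions about the same image.
vqa_model = ImageQnAModel.from_pretrained("imagetext@001")
answers = vqa_model.ask_question(
    image=image,
    question="What material is this product made of?",
    number_of_results=1,
)
print(answers)
```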

“Imagen is beginning to power key capabilities within Omni, Omnicom’s open operating system, that will enable 17,000+ trained and certified users to create audience-driven customized images in minutes. Imagen has been instrumental in offering a scalable platform for image generation and customization. Integrating it into our platform allows us to expand the scope of audience-powered creative inspiration, at a scale that wasn’t previously possible,” said Art Schram, Annalect Chief Product Officer at Omnicom. “We’re starting to adopt the latest features like styles and fine tuning, and engineering data-driven prompts. We look forward to continuing to provide our users relevant visual inspiration in a responsible way.”

“The latest improvements in Imagen’s product preservation capabilities are a perfect match for Typeface’s focus on personalized AI for brands,” explained Vishal Sood, Head of Product at Typeface. “By combining Google Vertex AI’s Imagen with Typeface’s brand-personalized AI, we are able to help enterprises to create 10x personalized content in a fraction of time.”

Google Shopping recently built an application called Product Studio using Imagen on Vertex AI. Product Studio can enable merchants to create rich product images quickly and easily, at a fraction of the time it takes to do professional product photo shoots. “We’re excited about the feedback we’re getting from merchants in our early pilots, who say that Product Studio, which leverages Imagen on Vertex AI, helps them generate and publish lifestyle product photos directly to their product catalogs,” says Jeff Harrell, Google’s Senior Director of Product Management for Merchant Shopping. 

Announced in May, PaLM 2 is a family of models that power dozens of Google products, including Bard and Duet AI in Google Cloud. With the PaLM 2 for Chat model, now generally available, you can leverage PaLM 2's capabilities in multi-turn chat applications such as shopping assistants, customer support agents, and more. 
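
Here is a minimal sketch of a multi-turn session with PaLM 2 for Chat using the Vertex AI Python SDK. The shopping-assistant context and example exchange are illustrative, and the project ID is a placeholder.

```python
import vertexai
from vertexai.language_models import ChatModel, InputOutputTextPair

vertexai.init(project="your-project-id", location="us-central1")

chat_model = ChatModel.from_pretrained("chat-bison@001")

# Ground the multi-turn session with context and an example exchange.
chat = chat_model.start_chat(
    context="You are a helpful shopping assistant for an outdoor-gear store.",
    examples=[
        InputOutputTextPair(
            input_text="Do you sell tents?",
            output_text="Yes, we carry 2-, 4-, and 6-person tents.",
        )
    ],
)

print(chat.send_message("I need a waterproof jacket for hiking.").text)
# The chat object keeps history, so follow-up questions stay in context.
print(chat.send_message("Which of those is the lightest?").text)
```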

ThoughtSpot, provider of a widely adopted business intelligence platform, is using PaLM 2 to build "AI Explain," a new feature in ThoughtSpot for Google Sheets that can instantly generate explanations of charts, visuals, and anomalies. The company will also launch new conversational AI and ML-enabled predictive forecasting capabilities in its analytics platform.

With Codey, your organization's developers can accelerate a wide variety of coding tasks, helping them work efficiently and close skills gaps. The model supports not only code completion and code generation, but also code chat to help with debugging, documentation, learning new concepts, and more. Since launching in preview in May, we've added support for additional programming languages, including Go, Google Standard SQL, Java, JavaScript, Python, and TypeScript. We've also improved the quality of code responses and increased serving capacity, equipping your developers with the right tools to enter the era of generative engineering.
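
As an illustrative sketch, the snippet below calls Codey for code generation and code chat through the Vertex AI Python SDK. The prompts and project ID are placeholders, and model names may vary by SDK version.

```python
import vertexai
from vertexai.language_models import CodeGenerationModel, CodeChatModel

vertexai.init(project="your-project-id", location="us-central1")

# Code generation: describe the function you want in natural language.
codegen = CodeGenerationModel.from_pretrained("code-bison@001")
result = codegen.predict(
    prefix="Write a Python function that validates an email address with a regex.",
    max_output_tokens=256,
)
print(result.text)

# Code chat: multi-turn help with debugging, documentation, or new concepts.
codechat = CodeChatModel.from_pretrained("codechat-bison@001")
session = codechat.start_chat()
print(session.send_message("Why might this line raise KeyError: d['name'] ?").text)
```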

“Security and privacy are key to incorporating AI into the software development lifecycle,” said David DeSanto, Chief Product Officer at GitLab. “GitLab leverages Vertex AI to deliver new, AI-powered features with a privacy-first approach, including the ability to run our own models and leverage Codey foundation models built on top of PaLM 2. The GitLab DevSecOps platform empowers organizations to harness the benefits of AI for faster software delivery, while ensuring their data, intellectual property, and source code are protected.”

Originally released in preview in May, Chirp is a version of our 2-billion-parameter speech model, trained on millions of hours of audio and supporting over 100 languages. Chirp achieves 98% accuracy on English and relative improvements of up to 300% in languages with fewer than 10 million speakers. Whether the use case involves customer support, transcription, or voice control, Chirp can help your organization communicate with customers and constituents inclusively by engaging audiences in their native languages. 
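
Chirp is served through the Speech-to-Text v2 API. The sketch below transcribes a local audio file with the Chirp model; the project ID, audio file, and region are placeholders, and you should check the Speech-to-Text documentation for the regions and recognizer setup currently supported.

```python
# pip install google-cloud-speech
from google.api_core.client_options import ClientOptions
from google.cloud import speech_v2

project_id = "your-project-id"  # placeholder
region = "us-central1"          # assumes a region where Chirp is offered

# Regional recognizers are served from a regional endpoint.
client = speech_v2.SpeechClient(
    client_options=ClientOptions(api_endpoint=f"{region}-speech.googleapis.com")
)

config = speech_v2.RecognitionConfig(
    auto_decoding_config=speech_v2.AutoDetectDecodingConfig(),
    language_codes=["en-US"],
    model="chirp",  # select the Chirp model
)

with open("meeting_clip.wav", "rb") as f:  # placeholder audio file
    audio_bytes = f.read()

response = client.recognize(
    request=speech_v2.RecognizeRequest(
        recognizer=f"projects/{project_id}/locations/{region}/recognizers/_",
        config=config,
        content=audio_bytes,
    )
)

for result in response.results:
    print(result.alternatives[0].transcript)
```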

Last but not least, our Multimodal Embeddings API, now in preview, can unlock an array of new applications, such as image- and text-based recommendations, by representing text and images in a shared embedding space. This capability complements our Text Embeddings API, which became generally available in June and remains the recommended choice for fully text-based use cases. The Multimodal Embeddings API makes it possible to categorize images and text together, which can be crucial for use cases like retail recommendation systems that need to return relevant results from both product images and text descriptions.
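
For a sense of how this works in practice, here is a sketch using the multimodal embedding surface of the Vertex AI Python SDK. The model name, class names, and file path are assumptions based on the preview release and may differ in your environment.

```python
import vertexai
from vertexai.preview.vision_models import Image, MultiModalEmbeddingModel

vertexai.init(project="your-project-id", location="us-central1")

model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")

# Embed a product image and its text description into the same vector space.
embeddings = model.get_embeddings(
    image=Image.load_from_file("product_photo.png"),  # placeholder path
    contextual_text="Lightweight waterproof hiking jacket",
)

image_vector = embeddings.image_embedding  # list of floats
text_vector = embeddings.text_embedding    # same dimensionality

# The vectors can be stored in a vector database and compared with cosine
# similarity to power visual search or cross-modal recommendations.
```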

Match generative AI with infrastructure 
Beyond access to models and tools for building generative AI apps, you need infrastructure to make sure your apps can scale and perform reliably, ideally without running into daunting compute costs or management overhead that distracts your technical talent from building innovative products. Google Cloud offers the choice and power to run smaller models for narrowly scoped tasks at the lowest latency, as well as large models capable of cutting-edge experimentation. 

As our large language model customers look to scale up their projects and applications, they often need assurances that their requests will be served with acceptable performance. This is especially critical for real-time applications where customer service is paramount. Starting in August, Vertex AI will support provisioned, dedicated generative AI capacity that can deliver guaranteed throughput. This feature can be especially beneficial to customers with a high volume of sustained workloads.

Leverage generative AI while protecting data and privacy 
A key capability of Vertex AI is the ability to customize foundation models using your own data, while keeping that data protected, secure, and private. When a company tunes a foundation model in Vertex AI, its private data, model outputs, and prompts are kept private and are never used in the foundation model's training corpus. We recently published a whitepaper, "Adaptation of Large Foundation Models," which outlines how we help protect customer data. 
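
As a hedged sketch of what tuning with your own data can look like, the snippet below launches a tuning job from a JSONL dataset in Cloud Storage. The bucket URI and project ID are placeholders, and the exact tune_model parameters, supported regions, and tuned-model retrieval calls depend on your SDK version.

```python
import vertexai
from vertexai.language_models import TextGenerationModel

vertexai.init(project="your-project-id", location="us-central1")

model = TextGenerationModel.from_pretrained("text-bison@001")

# Tuning data is a JSONL file of {"input_text": ..., "output_text": ...}
# records in your own Cloud Storage bucket; it stays in your project and is
# never folded into the foundation model's training corpus.
model.tune_model(
    training_data="gs://your-bucket/support_examples.jsonl",  # placeholder URI
    train_steps=100,
    tuning_job_location="europe-west4",
    tuned_model_location="us-central1",
)

# After the tuning pipeline completes, the adapted model can be loaded and
# queried like the base model (method names assume the current SDK surface).
tuned_name = model.list_tuned_model_names()[0]
tuned_model = TextGenerationModel.get_tuned_model(tuned_name)
print(tuned_model.predict("How do I reset my router?").text)
```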

Auditability and compliance are essential to helping ensure the security and privacy of customer data. We also engage in comprehensive GDPR privacy efforts, including our transparency commitments around customer data usage and support for our customers' Data Protection Impact Assessments (DPIAs). Now, we're excited to support HIPAA compliance for many of our generally available models on Vertex AI, so that healthcare and life sciences customers with whom we have a Business Associate Agreement can run workloads with Protected Health Information (PHI) data on Google Cloud. 

Innovate responsibly 
Google's AI Principles put beneficial use, user safety, and the avoidance of harm above business outcomes, and they are embedded in how we develop our AI products. We have conducted extensive reviews of our generative AI products to identify potential risks and have developed guardrails to mitigate them. For example, to address concerns around safety, we have implemented safety filters for bias, toxicity, and other harmful content. We also equip our customers with the tools they need to help reduce risk within their applications, and we provide recommendations to help them navigate responsible AI. 
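
As a small illustration of how those safety filters surface to developers, the sketch below checks a text response for blocking and per-category safety scores. The is_blocked and safety_attributes fields are assumptions about the SDK's response object and should be verified against current documentation.

```python
import vertexai
from vertexai.language_models import TextGenerationModel

vertexai.init(project="your-project-id", location="us-central1")

model = TextGenerationModel.from_pretrained("text-bison@001")
response = model.predict("Write a short, friendly product announcement.")

# Responses that trip the safety filters are blocked; per-category
# confidence scores are returned alongside the generated text.
if response.is_blocked:
    print("Response was blocked by safety filters.")
else:
    print(response.text)
    print(response.safety_attributes)  # e.g. category-to-score mapping
```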

Bring the power of generative AI to your organization

With both a wide selection of foundation models and extensive, enterprise-grade platform capabilities, Vertex AI continues to unlock ways for your business or organization to access foundation models, tune them on your proprietary data, and leverage them for differentiated apps and digital experiences.