Artificial Intelligence

The open-source Ray 2.4 release accelerates the deployment of generative AI models.

  • by B2B Technology Zone
  • April 27, 2023

The latest version of Ray, 2.4, an open-source machine learning (ML) framework for scaling and deploying AI workloads, focuses on speeding up generative AI tasks. With support from lead commercial vendor Anyscale and a broad community of open-source contributors, Ray has become one of the most popular technologies in the ML field. OpenAI, the company behind GPT-4 and ChatGPT, uses Ray to scale its ML training workloads and for AI inference.

Ray 2.x was initially introduced in August 2022 and has since undergone continuous enhancements, including the observability-focused Ray 2.2 release.

The recent Ray 2.4 upgrade centers on generative AI workloads, offering users new features that enable quicker and easier model building and deployment. The update also includes integration with Hugging Face models, such as GPT-J for text generation and Stable Diffusion for image generation.

According to Robert Nishihara, Anyscale's CEO and co-founder, Ray is open-source infrastructure that manages the life cycle of large language models (LLMs) and generative AI workloads, from training to deployment and productization. The aim is to reduce the level of expertise required to build this infrastructure, thereby lowering the barrier to entry for integrating AI into products across all industries.

The Ray 2.4 release introduces new workflows for generative AI, providing faster capabilities for building and deploying models and integrations with popular Hugging Face models such as GPT-J and Stable Diffusion, which helps reduce the expertise required to build and manage infrastructure for AI workloads.

Ray 2.4 offers prebuilt scripts for easy generative AI deployment.

Ray 2.4 simplifies building and deploying generative AI with prebuilt scripts and configurations that remove much of the manual setup. According to Nishihara, the release provides a starting point that already delivers good performance, which users can then modify and feed with their own data. The goal is not just to set up a cluster but to provide runnable code for training and deploying an LLM; Ray 2.4 ships a set of Python scripts that users would otherwise have had to write themselves.

The release focuses on a particular set of generative AI integrations built on open-source models available from Hugging Face. It also adds integration with LangChain and a series of new integrated trainers for ML-training frameworks, including Hugging Face Accelerate, DeepSpeed, and PyTorch Lightning.

Ray 2.4 speeds up training and inference through performance optimizations.

Ray 2.4 introduces a new approach to handling data for AI training and inference that optimizes the use of compute and GPU resources. Traditionally, data is fully processed in multiple stages before operations like training or inference are executed, and this staged approach can introduce latency and leave resources underutilized. Ray 2.4 addresses this by streaming and pipelining data so the full dataset never has to fit in memory at once, and by preloading batches onto GPUs to keep utilization high. The result is more efficient use of both CPU and GPU resources.

FAQ

What is Ray 2.4?

Ray 2.4 is the latest version of an open-source machine learning technology that focuses on scaling and deploying AI workloads, particularly for generative AI tasks.

How does Ray 2.4 improve generative AI deployment?

Ray 2.4 offers prebuilt scripts and configurations that simplify the process of building and deploying generative AI models, eliminating the need for manual configuration and scripting.

What integrations does Ray 2.4 support?

Ray 2.4 includes integration with Hugging Face models (like GPT-J and Stable Diffusion), LangChain, and various ML-training frameworks including Hugging Face Accelerate, DeepSpeed, and PyTorch Lightning.

How does Ray 2.4 optimize performance?

Ray 2.4 optimizes performance by streaming and pipelining data so the full dataset does not have to fit in memory at once, and by preloading data onto GPUs to maintain high utilization of both CPU and GPU resources.

Who uses Ray technology?

Ray is used by major organizations including OpenAI (for GPT-4 and ChatGPT), and is supported by Anyscale and a broad community of open-source contributors.

What are the key benefits of Ray 2.4?

The key benefits include simplified AI deployment, improved performance optimization, integration with popular AI models, and reduced expertise requirements for building AI infrastructure.
