September 19-21, 2023
Bilbao, Spain
Note: The schedule is subject to change.

Wednesday, September 20 • 14:40 - 15:20
Scaling AI Workloads with Kubernetes: Sharing GPU Resources Across Multiple Containers - Jack Min Ong, Jina AI


With the rise of AI and machine learning applications, GPU resources have become a critical bottleneck in scaling infrastructure to serve AI workloads efficiently. Kubernetes, an open-source container orchestration platform, addresses this problem through the NVIDIA device plugin, which allows multiple containers to share access to GPU devices. In this talk, we will explore how Kubernetes can be used to scale AI workloads efficiently by sharing GPU resources across multiple containers. We will discuss the challenges of GPU resource management, explore techniques for optimizing GPU usage, and show how to set resource limits that ensure fair and efficient allocation of GPU resources among containers. By the end of this talk, attendees will understand how to use Kubernetes to share GPU resources across multiple containers, so they can make the most of their GPU investments and achieve faster, more accurate results in their AI applications.
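
As a rough illustration of the kind of setup the abstract describes, the Python sketch below builds two Kubernetes objects as plain dictionaries: a sharing config for the NVIDIA device plugin and a pod that requests one GPU. The abstract does not say which sharing mechanism the talk covers; this sketch assumes time-slicing, where each physical GPU is advertised as several `nvidia.com/gpu` replicas. The helper names, the pod name, the image, and the replica count are all hypothetical.

```python
import json

# Extended resource name advertised by the NVIDIA device plugin.
GPU_RESOURCE = "nvidia.com/gpu"


def time_slicing_config(replicas: int) -> dict:
    """Build a device-plugin config that advertises each GPU `replicas` times.

    Assumption: time-slicing is the sharing mechanism in use.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return {
        "version": "v1",
        "sharing": {
            "timeSlicing": {
                "resources": [{"name": GPU_RESOURCE, "replicas": replicas}]
            }
        },
    }


def advertised_gpus(physical_gpus: int, replicas: int) -> int:
    """Number of schedulable GPU slots a node reports under time-slicing."""
    return physical_gpus * replicas


def gpu_pod(name: str, image: str, gpus: int = 1) -> dict:
    """Build a pod manifest whose single container requests `gpus` GPU slots."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # GPUs are requested under limits, like other extended resources.
                    "resources": {"limits": {GPU_RESOURCE: gpus}},
                }
            ]
        },
    }


if __name__ == "__main__":
    # Hypothetical values: a 4-GPU node, each GPU shared by up to 4 containers.
    print(json.dumps(time_slicing_config(replicas=4), indent=2))
    print(json.dumps(gpu_pod("inference-worker", "example.com/model-server:latest"), indent=2))
    print("schedulable GPU slots:", advertised_gpus(physical_gpus=4, replicas=4))
```

Under time-slicing, containers that share a GPU are not isolated in memory or compute, so a request for one `nvidia.com/gpu` guarantees a slot on a shared device rather than a fixed fraction of it.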

Speakers
Jack Min Ong

Machine Learning Engineer, Jina AI
Jack Min is a Site Reliability Engineer at Jina AI, where he manages cloud infrastructure for serving and training AI models used in multimodal search and generation applications. He's also the proud owner of a janky 4-GPU machine he built from second-hand parts for machine learning...



Wednesday September 20, 2023 14:40 - 15:20 CEST
Room 0E-1 (Floor 0)
  CloudOpen