BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250822T115805Z
LOCATION:Campussaal - Plenary Room
DTSTART;TZID=Europe/Stockholm:20250617T103000
DTEND;TZID=Europe/Stockholm:20250617T110000
UID:submissions.pasc-conference.org_PASC25_sess150_pos135@linklings.com
SUMMARY:P34 - Optimizing the ECsim Plasma Code for Exascale Architectures:
  GPU Acceleration, Portability, and Scalability
DESCRIPTION:Nitin Shukla (CINECA), Elisabetta Boella (E4 Computer Engineer
 ing), Filippo Spiga (NVIDIA Inc.), Michael Redenti (CINECA), Mozhgan Kabir
 i Chimeh (NVIDIA Inc.), and Maria Elena Innocenti (Ruhr University Bochum)
 \n\nThis work presents the adaptation of the plasma code ECsim for future 
 exascale architectures. The code has three main blocks: particle mover
 s, moment gathering, and field solver. The first two blocks are the mos
 t computationally challenging, so we focused on optimizing them for GP
 U acceleration using OpenACC directives. Our approach prioritized GPU re
 adiness with minimal code restructuring. The legacy CPU code makes exten
 sive use of C++ structures and templates, which hinder seamless GPU impl
 ementation. To overcome this, we managed data transfers manually throug
 h CUDA API calls. Performance profiling on NVIDIA GPUs reveals a speedu
 p of 5x to 9x over the CPU implementation in a node-to-node compariso
 n. Scaling tests conducted on multiple supercomputers demonstrate ECsim
 's scalability, achieving above 80% efficiency on up to 1024 GPUs in wea
 k and strong scaling tests for adequately sized problems.\nWe further ex
 tended this work to also use OpenMP target directives. Our memory manage
 ment strategy for GPU porting kept the effort minimal in this case, enh
 ancing the portability of ECsim across different GPU architectures. Comp
 arative analysis on NVIDIA GPUs highlights the code's portability and sh
 ows a significant speedup over the CPU with OpenMP target directives a
 s well. Similar work is underway on an AMD GPU system at EuroHPC.\n\nSes
 sion Chair: David Moxey (King's College London)\n\n
END:VEVENT
END:VCALENDAR