BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250822T115805Z
LOCATION:Campussaal - Plenary Room
DTSTART;TZID=Europe/Stockholm:20250617T103000
DTEND;TZID=Europe/Stockholm:20250617T110000
UID:submissions.pasc-conference.org_PASC25_sess150_pos121@linklings.com
SUMMARY:P33 - Optimizing Data Offload in the IFS Using GPU-Aware Data Stru
 ctures and Source-To-Source Translation
DESCRIPTION:Johan Ericsson, Ahmad Nawab, and Balthasar Reuter (ECMWF); Phi
 lippe Marguinaud and Judicaël Grasset (Meteo-France); and Michael Lange (E
 CMWF)\n\nThe adaptation of ECMWF’s medium-range forecasting model, the
  Integrated Forecasting System (IFS), to heterogeneous computing architect
 ures is an ongoing effort. The IFS consists of millions of lines of Fortra
 n code that is highly optimised for modern CPUs. This poses significant ch
 allenges when porting the code to heterogeneous architectures, as data lay
 outs and compute patterns need to be changed to efficiently utilise the ha
 rdware. To address this at ECMWF, we use FIELD API, a GPU-aware
  data-structure library, and Loki, a freely programmable source-to-source
  translation toolchain written in Python, to generate
  architecture-specific optimised code. In this poster, we show how
  FIELD API and Loki can be used to
  generate efficient code for asynchronously offloading data to GPUs. We us
 e “dwarf-cloudsc”, a computationally representative proxy of the IFS physi
 cs, to demonstrate the application of Loki to generate two versions of Ope
 nACC-accelerated Fortran: one version that offloads all fields over the
  same stream, and a second version that splits the offload of fields
  into blocks over multiple streams and overlaps computation with
  communication. We compare the performance of these versions, showing
  promising results for the offload of the full IFS to GPUs.\n\nSession
  Chair: David Moxey (K
 ing's College London)\n\n
END:VEVENT
END:VCALENDAR
