BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250822T115808Z
LOCATION:Room 5.2D02
DTSTART;TZID=Europe/Stockholm:20250618T100000
DTEND;TZID=Europe/Stockholm:20250618T103000
UID:submissions.pasc-conference.org_PASC25_sess106_msa142@linklings.com
SUMMARY:DGEMM Emulation Using INT8 Matrix Engines and its Rounding Error A
 nalysis
DESCRIPTION:Yuki Uchino (RIKEN Center for Computational Science)\, Katsuhi
 sa Ozaki (Shibaura Institute of Technology)\, and Toshiyuki Imamura (RIKEN
  Center for Computational Science)\n\nModern archite
 ctures are equipped with
  high-performance matrix engines optimized for low-precision matrix multip
 lications used in machine learning models. Fully leveraging these architec
 tures is the key to achieving superior performance in numerical algorithms
 . This study aims to design methods for emulating DGEMM using int8 matrix 
 engines to achieve superior performance on modern architectures. The Ozaki
  scheme\, a highly accurate matrix multiplication algorithm using error-fr
 ee transformations\, enables higher-precision matrix multi
 plication to be pe
 rformed through multiple lower-precision matrix multiplications and higher
 -precision matrix additions. Ootomo et al. implemented the Ozaki scheme us
 ing int8 matrix engines with the aim of achieving both sufficient accuracy
  and high performance. We propose alternative approaches that improve perf
 ormance by reducing the number of lower-precision matrix multiplications a
 nd higher-precision matrix additions. Numerical experiments demonstrate th
 e accuracy of the proposed approaches and benchmark their performance. The
 se approaches are expected to yield more efficient results on next-generat
 ion architectures. We also provide a rounding error analysis of the propos
 ed methods.\n\nDomain: Computational Methods and Applied Mathematics\n\nSe
 ssion Chair: Mantas Mikaitis (University of Leeds)\n\n
END:VEVENT
END:VCALENDAR
