BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Linklings LLC//NONSGML Linklings//EN
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250822T115804Z
LOCATION:Room 6.0D13
DTSTART;TZID=Europe/Stockholm:20250618T113000
DTEND;TZID=Europe/Stockholm:20250618T120000
UID:submissions.pasc-conference.org_PASC25_sess173_pap127@linklings.com
SUMMARY:HiPerRAG: High-Performance Retrieval Augmented Generation for Scie
 ntific Insights
DESCRIPTION:Ozan Gokdemir, Carlo Siebenschuh, and Alexander Brace (Univers
 ity of Chicago, Argonne National Laboratory); Azton Wells (Argonne Nationa
 l Laboratory); Brian Hsu (Argonne National Laboratory, University of Chica
 go); Kyle Hippe and Priyanka Setty (University of Chicago, Argonne Nationa
 l Laboratory); Aswathy Ajith and J. Gregory Pauloski (University of Chicag
 o); Varuni Sastry, Sam Foreman, Huihuo Zheng, Heng Ma, Bharat Kale, and Ni
 cholas Chia (Argonne National Laboratory); Thomas Gibbs (NVIDIA Inc.); Mic
 hael Papka (Argonne National Laboratory, University of Illinois Chicago); 
 Thomas Brettin and Francis Alexander (Argonne National Laboratory); Anima 
 Anandkumar (California Institute of Technology); Ian Foster (Argonne Natio
 nal Laboratory, University of Chicago); Rick Stevens and Venkatram Vishwan
 ath (Argonne National Laboratory); Arvind Ramanathan (Argonne National Lab
 oratory, University of Chicago); and Thomas Uram (Argonne National Laborat
 ory)\n\nThe volume of scientific literature is growing exponentially, lead
 ing to underutilized discoveries, duplicated efforts, and limited cross-di
 sciplinary collaboration. Retrieval-Augmented Generation (RAG) offers a wa
 y to assist scientists by improving the factuality of Large Language Model
 s (LLMs) in processing this influx of information. However, scaling RAG to
  handle millions of articles introduces significant challenges, including 
 the high computational costs associated with parsing documents and embeddi
 ng scientific knowledge, as well as the algorithmic complexity of aligning
  these representations with the nuanced semantics of scientific content. T
 o address these issues, we introduce HiPerRAG, a RAG workflow powered by h
 igh performance computing (HPC) to index and retrieve knowledge from more 
 than 3.6 million scientific articles. At its core are Oreo, a high-through
 put model for multimodal document parsing, and ColTrast, a query-aware enc
 oder fine-tuning algorithm that enhances retrieval accuracy by using contr
 astive learning and late-interaction techniques. HiPerRAG delivers robust 
 performance on existing scientific question answering (Q/A) benchmarks and
  two new benchmarks introduced in this work, achieving 90% accuracy on Sci
 Q and 76% on PubMedQA—outperforming both domain-specific models like PubMe
 dGPT and commercial LLMs such as GPT-4. Scaling to thousands of GPUs on th
 e Polaris, Sunspot, and Frontier supercomputers, HiPerRAG delivers million
  document-scale RAG workflows for unifying scientific knowledge and foster
 ing interdisciplinary innovation.\n\nDomain: Engineering, Life Sciences, C
 omputational Methods and Applied Mathematics\n\nSession Chair: Zhaohui Son
 g (Politecnico di Milano, Italy)\n\n
END:VEVENT
END:VCALENDAR
