ORNL’s Incoming Supercomputer Requires Advanced Infrastructure
Frontier will reside in the former data center of the Oak Ridge Leadership Computing Facility’s (OLCF’s) Cray XK7 Titan supercomputer.
January 4, 2021
When the U.S. Department of Energy’s (DOE’s) new exascale supercomputer, Frontier, is fully installed at Oak Ridge National Laboratory (ORNL) in 2021, it is expected to debut as a landmark in high-performance computing, with performance greater than 1.5 exaflops (one exaflop is one quintillion floating-point operations per second), DOE reports. For now, though, the room it will occupy is undergoing a complete mechanical, electrical and structural transformation.
Frontier will reside in the former data center of the Oak Ridge Leadership Computing Facility’s (OLCF’s) Cray XK7 Titan supercomputer. Once reportedly the most powerful supercomputer in the world, Titan was decommissioned on August 1, 2019, after 7 years of service. It took about a month for a team of HPE Cray technicians to dismantle 430,000 lbs.’ worth of Titan components and remove them for recycling.
Just days later, work began to revamp the 20,000-sq.-ft. room to accommodate Frontier’s much higher requirements for power, cooling and structural support. That meant everything in room E102 of Building 5600 had to be stripped out: piping, electrical infrastructure, even the floor.
“Titan at peak probably consumed about 10 megawatts of power. At peak, Frontier will consume about 30 megawatts. If you use more power, you have to get rid of additional heat, so we are adding the equivalent of 40 megawatts of cooling capacity, about 11,000 tons, for Frontier—much bigger pipes to distribute cool water to the computer,” says Justin Whitt, program director for the OLCF, a DOE Office of Science User Facility located at ORNL. “Additionally, supercomputer systems have become denser and heavier with each new generation, and Frontier is no exception to that, so we upgraded the raised floor so it could support that weight.”
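The cooling figure in the quote can be cross-checked with a quick unit conversion. The sketch below assumes that "tons" means tons of refrigeration (12,000 BTU/h, roughly 3.517 kW each), which the article does not state explicitly. The constant and function names are illustrative only.

```python
# Assumption: "tons" in the quote means tons of refrigeration,
# where 1 ton = 12,000 BTU/h, or about 3.5169 kW.
# The article itself does not define the unit.
KW_PER_TON_REFRIGERATION = 3.5168525


def megawatts_to_tons(megawatts: float) -> float:
    """Convert a cooling load in megawatts to tons of refrigeration."""
    return megawatts * 1000.0 / KW_PER_TON_REFRIGERATION


cooling_tons = megawatts_to_tons(40.0)
print(f"40 MW of cooling is roughly {cooling_tons:,.0f} tons of refrigeration")
```

Under that assumption the result lands a little under 11,400 tons, which is consistent with the "about 11,000 tons" quoted above.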
With demolition completed, a total work force of about 100 contractors and ORNL craft employees is currently installing that new infrastructure, welding in serpentine piping above and below the huge, empty room while simultaneously building out a new raised floor. (Now completed, the floor consists of over 4,500 tiles weighing 48 lbs. each—or nearly 110 tons altogether.)
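The raised-floor total follows directly from the tile count and per-tile weight. As a rough check, the sketch below assumes exactly 4,500 tiles (the article says "over 4,500") and US short tons of 2,000 lbs.; the names are illustrative.

```python
# Figures from the article; the tile count is a lower bound ("over 4,500").
TILE_COUNT = 4500
TILE_WEIGHT_LB = 48

# Assumption: "tons" here means US short tons (2,000 lbs. each).
LB_PER_SHORT_TON = 2000


def floor_weight_short_tons(tile_count: int, tile_weight_lb: float) -> float:
    """Total raised-floor tile weight in US short tons."""
    return tile_count * tile_weight_lb / LB_PER_SHORT_TON


total_tons = floor_weight_short_tons(TILE_COUNT, TILE_WEIGHT_LB)
print(f"{TILE_COUNT} tiles at {TILE_WEIGHT_LB} lbs. each: about {total_tons:.0f} short tons")
```

With those inputs the total is 108 short tons, in line with the "nearly 110 tons" stated above.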
Before the data center’s construction began—even before the demo work to gut the room—another building project had to be tackled: a mechanical room for all the machinery that will be feeding cooling to Frontier. The new supercomputer’s cooling water towers will have a system volume of 130,000 gallons, with 350-horsepower pumps that can each move over 5,000 gallons/minute of the high-temperature water through the Frontier system. The four pumps will connect to the data center via 500 linear ft. of 24-in. pipe.
While a new 28-megawatt electrical room for the system’s transformers was built around the perimeter of the E102 computer room—taking up what was once the office space of OLCF’s leadership group—planners still had to find a big enough area for Frontier’s cooling towers and all their associated infrastructure. They ended up going to the building next door, 5800, which could provide the needed space—but also required more moves and had some architectural hurdles to overcome.
“The building that we’re putting the mechanical plant into was originally designed as a lab space, with not a lot of structure to it—it was basically just holding up the roof,” says Bart Hammontree, technical project manager for ORNL’s Laboratory Modernization Division. “But we’re putting in the neighborhood of a million pounds’ worth of piping and cooling towers on the roof of this building. So we had to basically build a new structure inside of an existing building and we had to put new foundations in to support all that.”
Crews had to saw-cut the entire slab underlying Building 5800, rip it out and dig new foundations inside the building, all while avoiding the many electrical conduits that pass through the construction area to supply power to other parts of the building. Building 5800 is still an operational space, with labs conducting scientific research even during construction. Although the team had diagrams showing where these power lines ought to be, it decided not to take any chances and turned to technology for added safety.
“We brought in a specialty consultant that uses ground-penetrating radar and an electromagnetic wand to scan the area looking for signals from live conduits. It went really well once we took the time to do that investigative step,” Hammontree says.
Every large project at ORNL includes a risk management program that predicts potential issues that may slow the work, which helps each team preplan mitigation strategies to stay on schedule and within budget. However, there was one factor that the Frontier team did not anticipate when it was assembling its risk register in 2018: a global pandemic.
Taking all the precautions necessary to ensure the team remains healthy has presented challenges. In March 2020, ORNL instituted stringent rules on who it would allow on its campus, requiring COVID screening for all contractors and visitors.
“Especially in the early days, it was very difficult to get contractors from out of state or from outside of East Tennessee in, so we had to do a lot of planning any time we needed to bring in a specialty contractor from out of the area,” Hammontree says. “That was very challenging. But the medical staff has been great—they worked with us any time we needed one of our contractors tested and we got the results back really quickly. That has allowed us to keep marching forward.”
A construction project of this scale normally takes about 2 years to complete, Whitt says—but to put Frontier into service as soon as possible, the team plans to complete the work in less than a year and a half.
“The biggest challenge that we face at this point is just the incredibly aggressive schedule that we set for the work,” Whitt says. “We’re responding to the science need for exascale computers so we had to accelerate the time frame for everything—for the technologies, for having the room ready. So we’re doing more work for the OLCF than we’ve ever done before and we’re doing it on a shorter time scale.”
Despite the effects of the pandemic, the team is on track to complete the data center in spring of 2021.
UT-Battelle LLC manages Oak Ridge National Laboratory for DOE’s Office of Science.
Sources: Press materials received from the company and additional information gleaned from the company’s website.
About the Author
DE’s editors contribute news and new product announcements to Digital Engineering.