UL Research Institutes is a safety science organization that is part of the Underwriters Laboratories group of enterprises. UL rebranded its three organizations in 2022, with UL Research Institutes (ULRI) serving as a nonprofit research group.
Part of that safety research includes a variety of simulations, which increasingly require significant computing resources. ULRI is leveraging high-performance computing (HPC) to accelerate complex simulations, data-intensive research and advanced computational modeling across its institutes. Through a partnership with TotalCAE, ULRI has streamlined access to Microsoft Azure-based HPC resources by deploying a hybrid infrastructure, including cloud-based environments and optimized on-prem clusters, according to Craig Hamill, director of innovation, technology and knowledge management at ULRI.
For example, researchers extensively utilize HPC for modeling fire scenarios using Fire Dynamics Simulator (FDS) and for conducting multiphysics analyses with COMSOL software tools. This has improved the organization’s ability to rapidly perform complex, large-scale simulations, and as a result, has driven faster insights and innovation across safety science and engineering disciplines, Hamill says.
We spoke to Hamill to get more details about how (and why) ULRI relies on HPC for this work.
Can you tell me more about the type of work that ULRI does?
We have five thematic research labs in the U.S. We are not the for-profit side of the UL enterprise. The UL that everyone thinks of for testing and certification is different. We split from that for-profit entity, and our initiatives are different. We have an endowment from the UL enterprise to do research, and to be thought partners with different industries. This is open science. We publish all our materials and it’s free.
I am on the operations side, but my team does operational HPC work and we work with data scientists at each of the labs. We make sure they have the right resources, and the right types of resources. We are driving science into impact.
For example, our Fire Safety Research Institute in Columbia, MD, is the pre-eminent source for fire research for first responders. They do a lot of fire simulation, and were the principal developers with NIST on the Fire Dynamics Simulator. We are working on some use cases to apply those fire simulations to electric vehicles and batteries, and use a digital twin of a parking garage or a house and simulate an e-scooter fire in that environment.
We want to create an immersive environment that drives research and creates impact, while helping inform regulations and standards. Instead of a thick research paper, if I can get a lawmaker into a VR [virtual reality] headset and show them what the impacts can be, that makes more of an impression. How can we do these simulations in real time?
In terms of HPC, we use a lot of simulation software, and some of those have a heavier load than others. We are pushing away from scientists doing this locally or on a gaming rig under their desk. How do we do it to scale and get the simulations done faster?
At our Materials Discovery Research Institute, for example, we can do a million derivations on particular materials, and then simulate what those look like. They take the five they want to go to prototype within the research lab. How can we use simulation and HPC to get to the ideal state so they can focus on the type of material and resources within the lab that can drive what they ultimately take to prototype?
How long have you been working with TotalCAE?
When we split from the for-profit organization, we had to build our own data domain, and still do research while we were migrating user accounts and getting new computers.
What we realized early on was that we needed a bigger compute cycle to get things done faster. We had a great relationship with Microsoft. We had a digital cloud-first initiative throughout the split, so we did not want to maintain old servers. We wanted to push things into the cloud, and we are very light on anything on premises.
We use MS Azure as much as we can, and as we were looking at all those old things that weren’t up to speed, we decided we would need to leverage HPC. We decided to build that in Azure rather than have a data center.
TotalCAE has been a partner with us, and helped us determine how we can spin up resources and how we manage them. We don’t have a huge team, but it really democratizes how to use all these resources and get them onboard very quickly.
What are some of the common compute challenges you would face with these simulations? How does HPC help?
For our initial assessment we reached out to the data scientists in each of the labs and did forecasting. If you had unlimited money, what would you buy?
People also want to run things locally, but we don’t want to have local resources, because the IT team would have to support something physical. We don’t have a server room or data center.
We took a startup mentality. How do we get this ramped up quickly, and learn what our users want to do? Let’s crawl, walk, run, and then sprint. Going to Azure was a no-brainer. It was an easy on-ramp. You can light up some nodes and build infrastructure that is pretty agile.
Some super users came in that had some requirements. We determined the resources needed in partnership with TotalCAE and Microsoft, and built them out robustly and really quickly—it was about three weeks. As users scaled, we built a development environment or a sandbox environment in Azure, so users can test things out, recycle the environment quickly and roll that into production.
This was somewhat expensive, but it’s easily maintained and we have access to anything that NVIDIA, AMD or Intel is releasing. If we need it, we can just light it up. We only pay for what we use.
That was critical. We need to test at scale, but it’s very seasonal. Some of these resources are only used for a day. If we brought in a physical resource it would have taken us 8 months to build the network while we were still trying to do all of our work and stabilize after the split of the UL organizations. Also, we don’t have one campus. We have labs all over the country. Would that introduce latency? We were able to be successful quickly, and continually iterate. When people go on vacation and don’t need the resources, they aren’t costing us any money. We aren’t stuck with hardware for years. We can test and simulate different infrastructure really easily and cheaply.
Were there any issues with software licensing and cloud-based compute resources?
No, that’s why we went with TotalCAE. As long as they support it, we can use it. We do have some esoteric open-source tools, but [TotalCAE] has also been on the spot with their support and been able to do that work for us.
If a job fails, TotalCAE can sort out what the reason was and help solve it, and that allows us to focus on science. When things are hard failing, TotalCAE knows before the data scientists even submit a ticket. Their portal gives us a lot of good instant feedback.
Are there any changes planned to the HPC infrastructure in the future?
We are always evaluating what our current workloads are as we are onboarding additional data scientists and resources. How do we optimize our costs in Azure? We want to make sure we optimize what we have available to our end users, always looking at whether it makes sense to have an on-premises data center. What resources can we pull up? What would be the cost for heating and cooling? We are not putting all of our eggs in one basket, but on-premises resources are something we forecast we won’t need for quite a while, if at all.
But being a nonprofit running at scale, as we mature, some simulations and workloads might be beyond our capacity. We have made great contacts at Argonne National Laboratory and Oak Ridge that support us. Why would we bring things on-prem if we can use their supercomputing resources? We are always making sure we are doing the right thing for our scientists.

Brian Albright is the editorial director of Digital Engineering.
