The Case for Grid Computing

Why pooling computing resources makes sense for product development right now.

By Michael Schulman

The analysis phase of the product development cycle is extremely compute-intensive. Typically, a mathematical simulation of the product is performed to check stresses, displacements, heat transfer, and other behaviors the product is expected to encounter in real-world environments.

In the analysis phase, geometry is defined, material properties are applied, and mathematical equations are solved to determine how the system will behave in the physical world. Although the performance of computer systems continues to climb, in many cases the processing requirements cannot be met by a single computer. In addition, research has shown that many systems in data centers run at only 15 to 20 percent of their capacity.

The traditional design, validate, then tweak-the-design workflow has a number of bottlenecks. With leading MCAD systems able to accurately model complex geometries and assemblies, a significant portion of design time is wasted waiting for the simulation phase to complete. Typically, this phase is an FEA run using applications such as ABAQUS, ANSYS, LS-DYNA, or NASTRAN. Here, the MCAD model is broken into smaller solid elements; loads, constraints, and material properties are applied; and the solver then determines the stresses and displacements.

Internals of the Sun Fire X2100 server.


Designers are often limited in the types of analysis they can run by the time it would take. They must settle for a coarser mesh, or sacrifice additional loading or constraint scenarios. Such compromises may lead to an incomplete understanding of the product's behavior.

Grid computing is one way of giving users more compute power than a single physical workstation can provide. With the advent of sophisticated distributed resource management software, large numbers of systems can be assembled into a grid. A number of technologies, both software and hardware, have come together to make grid computing practical for organizations that require it.

How Grid Computing Works

Setting up a grid of existing systems is quite easy. Typically, no additional hardware is required as long as the servers are on a network, although some additional software is needed to enable these systems to act as a pool of computing resources. This software tracks which systems are available for use and what types of systems are on the grid.

Here’s a basic description of how a grid for an HPC (high-performance computing) environment works. Users, typically from a workstation or PC, submit a “job,” a unit of work to be done. For a design and manufacturing company, this is often some kind of analysis run.

A designated server, the master, takes requests from the different users and determines where and when each job will run. A scheduler within the distributed resource manager weighs many different parameters in making that decision (see “A Brief Grid Computing Glossary,” below). Examples include:

• The amount of RAM the application requires
• The number of systems an MPI (message passing interface) job needs to run on
• Specific hardware the application is available for
• Available licenses for the desired application
• An interconnect of a certain type
• Fine-grained parallelism
• User-defined resources

In addition to the resources that a job may require, other factors that will determine where and when the job will be run include:

• User priority based on urgency
• The seniority of the submitter within the organization
• Project deadlines
• The amount of resources management has allocated to that particular project (possibly predetermined by who paid for what)
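The two lists above can be sketched in code. The following Python fragment is an illustrative toy, not Sun N1 Grid Engine's actual matching or priority logic; all field names, jobs, and nodes are invented for the example.

```python
# Toy sketch of a grid scheduler's two steps: match a job's hard
# resource requirements against nodes, then rank queued jobs by
# priority factors. Hypothetical data, not a real resource manager.

def node_satisfies(job, node):
    """Check whether a node meets a job's hard resource requirements."""
    return (node["free_ram_gb"] >= job["ram_gb"]
            and node["interconnect"] in job["interconnects"]
            and job["application"] in node["installed_apps"])

def job_priority(job):
    """Rank queued jobs by urgency, submitter seniority, project share."""
    return (job["urgency"], job["submitter_seniority"], job["project_share"])

jobs = [
    {"name": "crash_sim", "ram_gb": 8, "interconnects": {"infiniband"},
     "application": "LS-DYNA", "urgency": 2, "submitter_seniority": 3,
     "project_share": 0.5},
    {"name": "thermal", "ram_gb": 2, "interconnects": {"ethernet", "infiniband"},
     "application": "ANSYS", "urgency": 3, "submitter_seniority": 1,
     "project_share": 0.2},
]
nodes = [
    {"name": "node01", "free_ram_gb": 4, "interconnect": "ethernet",
     "installed_apps": {"ANSYS"}},
    {"name": "node02", "free_ram_gb": 16, "interconnect": "infiniband",
     "installed_apps": {"LS-DYNA", "ANSYS"}},
]

# Consider the highest-priority job first; list the nodes eligible for it.
for job in sorted(jobs, key=job_priority, reverse=True):
    eligible = [n["name"] for n in nodes if node_satisfies(job, n)]
    print(job["name"], "->", eligible)
```

Here the memory-hungry crash simulation is eligible only for the large InfiniBand node, while the smaller thermal job can run anywhere its application is installed.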

One such distributed resource management application available today is the Sun N1 Grid Engine from Sun Microsystems. This resource management product, available in either an open source version or one supported by Sun Microsystems, gives users and system administrators the ability to fully use their collection of servers or workstations as part of a grid. (See Figure 1, below, for a diagram of how Sun N1 Grid Engine processes jobs.)

Figure 1.



Once it is determined which system or group of systems is available and underutilized, the master submits job requests to the compute nodes for execution. It is important for the master server to send jobs to nodes that are available, meaning nodes that do not already have a job running on them. If a node gets overloaded with a job that uses all available resources, the application's performance may suffer, and the work may not be completed in a timely manner.
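A minimal sketch of that node-selection step, assuming each node advertises a fixed number of job slots (the data and function are hypothetical, not taken from any real resource manager):

```python
# Hypothetical sketch of the master's node-selection step: prefer the
# least-loaded node, and never dispatch to a node without free slots.

def pick_node(nodes, slots_needed=1):
    """Return the least-loaded node with enough free slots, or None."""
    candidates = [n for n in nodes if n["slots"] - n["running"] >= slots_needed]
    if not candidates:
        return None  # queue the job until a node frees up
    return min(candidates, key=lambda n: n["running"] / n["slots"])

nodes = [
    {"name": "node01", "slots": 4, "running": 4},  # fully loaded: skipped
    {"name": "node02", "slots": 4, "running": 1},
    {"name": "node03", "slots": 2, "running": 0},  # idle: preferred
]
print(pick_node(nodes)["name"])                  # node03 (idle)
print(pick_node(nodes, slots_needed=3)["name"])  # node02 (only one with 3 free)
```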

Submitting jobs to the grid is quick, easy, and can be built into the standard workflow. In Unix environments, many users are accustomed to command-line-driven tasks, where the user enters the job submittal program name, the application, the data set, where to send the output, and so forth. However, you can also create web page interfaces that let users initiate a job description with a few simple clicks and send it to the grid of computing resources. Figure 2 (below) shows an example of a web-based interface that can be developed with open source software. (To learn more about open source software, visit SunSource.net, a site devoted to Sun's involvement in free and open source projects.)
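As a rough sketch of what such a web front end might do behind its submit button, the following Python function translates form fields into a Grid Engine qsub command line. The -N, -l, -pe, and -o flags are standard Grid Engine job-submission options; the form field names, values, and script are invented for this example.

```python
# Hypothetical translation layer between a web form and Grid Engine's
# qsub command. Only illustrative: a real front end would also validate
# input and capture the job ID that qsub returns.

def build_qsub_command(form):
    """Build a qsub argument list from submitted web-form fields."""
    cmd = ["qsub",
           "-N", form["job_name"],              # job name
           "-l", "h_vmem=%s" % form["memory"],  # memory resource request
           "-o", form["output_dir"]]            # where to send output
    if form.get("mpi_slots"):                   # parallel environment request
        cmd += ["-pe", "mpi", str(form["mpi_slots"])]
    cmd.append(form["script"])                  # the job script itself
    return cmd

cmd = build_qsub_command({
    "job_name": "crash_sim", "memory": "4G",
    "output_dir": "/results", "mpi_slots": 16,
    "script": "run_lsdyna.sh",
})
print(" ".join(cmd))
# qsub -N crash_sim -l h_vmem=4G -o /results -pe mpi 16 run_lsdyna.sh
```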

Collaboration Possibilities

As corporations become more global in their design and analysis activities, grids can also enable more seamless collaboration. For example, when engineers in Bangalore go home at 6 p.m., their computing resources can be used by the design team in New York beginning work at 8:30 a.m.

Figure 2.


By setting up internal processes that allow all available computing resources to be used by authorized designers or analysts, assets are used more heavily and more efficiently. The benefit is that products can be brought to market faster, increasing revenue.

Use It or Lose It?

In many companies, different organizations have purchased and built up data centers and compute facilities for new projects or because of their local compute needs. However, the needs of these organizations for compute power might decrease from time to time. Unless an organized grid has been developed and implemented, other organizations within the corporation will not be able to use these resources. This, coupled with idle or underused resources, decreases the corporation’s ROA (return on assets).

Sun Fire X2100 Server

Sun Grid Rack System.

High-performance computing (HPC) environments today are typically made up of a number of small, inexpensive, yet high-performance servers. An example is the Sun Fire X2100 server, which contains a single AMD Opteron CPU. Although such a server by itself is quite fast compared with previous generations, today's analysis workloads require multiple systems working together. The Sun Grid Rack System is a collection of such systems, which in this example includes Sun Fire X2100 servers, switches, cabling, and optional preloaded software.

 


Thus, by pooling the compute resources from different organizations into an organized grid, productivity can increase across the entire corporation. Purchased assets (computers) should be working, not sitting idle, so ROA rises as machines spend more of their time doing useful work. Even more important, a more complete analysis of a product under development can be performed, yielding faster turnaround and better products.

Grid computing is about sharing resources to make organizations more productive. By creating and using a grid environment for computationally intense applications, asset utilization increases, users become more productive, and more analysis can be done.

By leveraging grids, or a combination of in-house and on-demand grid services, corporations will see benefits in their product design process. Corporations embracing these technologies will bring products to market faster and with better quality, and will gain an edge in competitive markets.

Michael Schulman is the product line manager, HPC Solutions, NSG for Sun Microsystems. Send your comments about this article by e-mail, referencing “Grid Computing, October 2006” in your message.


 

 A Brief Grid Computing Glossary

MPI stands for Message Passing Interface, a standardized programming interface for thread-safe message passing among the multiple nodes of a cluster. Parallel applications use an MPI library to exchange data and coordinate work across nodes.

Fine grain—The term fine-grain describes instruction-level multithreading that carries out many independent calculations in parallel. This capability is especially useful for high-level matrix and vector operations on huge data sets.


 

On-Demand Grid Computing

At times, a company’s in-house computing resources are not enough to meet the needs of the organization. This might occur periodically as peak demand for compute power is needed near a deadline or as design decisions need to be made. Recently, Sun has made available the public Sun Grid, which is a computing utility that addresses this reality.

Just as electricity became standardized in the early 1900s, computing power on demand is now available from Sun Microsystems. In the early days of electrical power, companies had to own their own generation equipment. Once power standards were developed, consumers needed only to plug a device into the wall outlet to get the required electricity.

The Sun Grid deploys a similar concept. It provides standardized grid-computing power over the Internet with a pay-per-use model. As users run out of capacity, or experience temporary peaks in demand, they can take advantage of available capacity on a public compute grid. Future development in this area will allow corporate data centers to connect to the Sun Grid seamlessly, allowing data, licenses, and available compute power to be shared and coordinated.

Sun offers an online test drive of the Sun Grid for those who want to learn more.—MS


 

Interconnect Options in Clusters

In data centers today, all computers are networked to other systems. While most are connected to the rest of the corporate computing infrastructure through Ethernet, other connection technologies are needed in the HPC world.

The reason is communication: numerically intensive applications that divide a single problem into smaller pieces require each node in a grid or cluster to communicate with the other nodes. As each node solves its piece of the larger problem, it must share that information with the others. Inexpensive standard Gigabit Ethernet does not perform well enough for environments that need this much communication.
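A back-of-envelope model illustrates the point. The latency figures below are rough assumptions, not measurements, and the model ignores bandwidth and communication overlap; it simply adds a per-step synchronization cost to ideally divided compute time.

```python
# Toy scaling model: fixed total compute time divided evenly across
# nodes, plus a per-timestep message exchange whose cost grows with
# node count. Latency values are rough, assumed figures.

def speedup(nodes, compute_s=1000.0, steps=10000, latency_s=50e-6):
    """Ideal compute scaling degraded by per-step communication cost."""
    comm_s = steps * latency_s * (nodes - 1)  # each node syncs with the others
    return compute_s / (compute_s / nodes + comm_s)

for n in (2, 8, 32):
    gige = speedup(n, latency_s=50e-6)  # ~50 us: rough Gigabit Ethernet latency
    ib = speedup(n, latency_s=5e-6)     # ~5 us: rough InfiniBand latency
    print("%2d nodes: GigE %.1fx, InfiniBand %.1fx" % (n, gige, ib))
```

At small node counts the two interconnects look similar; as nodes are added, the higher-latency network's communication cost eats an ever larger share of the runtime, which is the scaling gap the crash-simulation comparison describes.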

This is why it is important to weigh the costs of high-speed, low-latency networking solutions from companies such as Cisco, Myricom, SilverStorm, and Voltaire when designing an HPC environment using clusters or grids. Table 1 shows that when running a crash simulation, an InfiniBand interconnect scales much better as more processors (CPUs) are added.—MS


 

Product Information

ABAQUS
Abaqus, Inc.
Providence, RI

ANSYS
ANSYS, Inc.
Canonsburg, PA

Cisco Systems, Inc.
San Jose, CA

LS-DYNA
Livermore Software Technology Corp.
Livermore, CA

NASTRAN
MSC.Software Corp.
Santa Ana, CA

Myricom, Inc.
Arcadia, CA

SilverStorm Technologies
King of Prussia, PA

N1 Grid Engine, Fire X2100 server, Sun Grid Rack System
Sun Microsystems, Inc.
Santa Clara, CA

Voltaire, Inc.
Billerica, MA
