The Rise of Data Science Workstations

NVIDIA’s new hardware is making it easier for organizations to process data right on the desktop, as engineers are being drafted into data science roles.

The NVIDIA Data Science Workstation specification is primarily used by vendors to create desktop units, but the hardware and software are also available for mobile workstations. Image courtesy of NVIDIA.

By Randall Newton

August 1, 2019

Earlier this year, NVIDIA announced a reference architecture for a new class of professional workstation, the data science workstation. Almost immediately, leading workstation original equipment manufacturers (OEMs) announced workstations that conform to NVIDIA’s Data Science specification.

“Data science is one of the fastest growing fields of computer science and impacts every industry,” said NVIDIA founder and CEO Jensen Huang at the announcement. “Enterprises are eager to unlock the value of their business data using machine learning and are hiring—at an unprecedented rate—data scientists who require powerful workstations architected specifically for their needs.”

An obvious question comes to mind: When are specialized workstations needed? Why not just go with the utility of a typical professional workstation? Digital Engineering asked NVIDIA and several workstation vendors.

Their answers repeated a general theme, one we have been hearing from several sources, not just workstation vendors: There is strong and increasing demand for data science in every industry that now uses professional workstations. There are not enough data science specialists to meet employer demand, so engineers and programmers from other disciplines are being drafted as data scientists. This lack of expertise extends to the specifics of what software to run and what computer hardware is best suited for the task.

Specification Defined

A new workstation meeting NVIDIA’s standard will have several specific features not common to most existing workstations. The first is dual NVIDIA Quadro RTX graphics processing units (GPUs), based on the Turing GPU architecture. Each RTX 8000 has 48GB of GPU memory, required for large data sets typical of artificial intelligence (AI) training or deep learning and machine learning analysis. The new NVIDIA GV100, a Volta class GPU, also may be used in a data science workstation.

The RTX line uses two new types of compute cores, RT Cores and Tensor Cores; the Volta-based GV100 has Tensor Cores but no RT Cores. RT is short for ray tracing but could also refer to real time; these cores are specialized for high-performance, local visualization.

Tensor Cores specialize in matrix math, common to deep learning and some applications in other fields that now run only on high-performance computing (HPC) clusters or cloud computing platforms. “[Tensor Cores] do the basics for workhorse calculating in deep learning,” says Michael Houston, a senior distinguished engineer at NVIDIA.

Tensor Cores perform a fused multiply-add, in which two 4x4 FP16 matrices are multiplied and the result is added to a 4x4 FP16 or FP32 matrix. It sounds like high school math, but Tensor Cores do millions of these calculations every second, much faster than commodity CPU or GPU compute circuitry. There is also an advantage in the Tensor Core’s ability to accumulate everything in FP32. “Thirty-two bit accumulation tends to really matter for convergence of networks,” says Houston, “to make mixed precision really work.” Houston says the theoretical performance boost of using Tensor Cores is 8x. “On a lot of neural nets, NVIDIA sees a 4x speed increase end to end.” Data science models often take several days to run; a 4x speed increase would complete a four-day job in one day.
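The fused multiply-add described above can be sketched in a few lines of Python. This is an illustrative CPU emulation, not NVIDIA's implementation: the helper names (to_fp16, to_fp32, tensor_fma) are hypothetical, and the standard library's struct module stands in for hardware rounding. It is meant only to show why accumulating in FP32 rather than FP16 can matter when small products are added to a large running sum.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision (FP16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE single-precision (FP32) value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def tensor_fma(a, b, c, accumulate=to_fp32):
    """Emulate D = A x B + C for 4x4 matrices.

    Inputs A and B are rounded to FP16; the running sum is rounded with
    `accumulate` after every addition (FP32 by default). This is an
    assumption-laden sketch of the arithmetic, not of the hardware.
    """
    n = 4
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = accumulate(c[i][j])
            for k in range(n):
                product = to_fp16(a[i][k]) * to_fp16(b[k][j])
                acc = accumulate(acc + product)
            d[i][j] = acc
    return d


# Hypothetical inputs: each output element is 2048 + 4 * (0.5 * 1.0).
a = [[0.5] * 4 for _ in range(4)]
b = [[1.0] * 4 for _ in range(4)]
c = [[2048.0] * 4 for _ in range(4)]

print(tensor_fma(a, b, c, accumulate=to_fp32)[0][0])  # 2050.0
print(tensor_fma(a, b, c, accumulate=to_fp16)[0][0])  # 2048.0
```

With FP16 accumulation, each 0.5 added to 2048 is rounded away because neighboring FP16 values near 2048 are 2 apart; FP32 accumulation keeps every contribution.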

The NVIDIA Data Science Workstation specification calls for Ubuntu Linux 18.04, nicknamed Bionic Beaver, as the operating system. Along with Ubuntu comes a set of software libraries built on NVIDIA CUDA-X AI, the company’s collection of GPU-acceleration libraries for AI research. The set includes the RAPIDS, TensorFlow, PyTorch and Caffe open source libraries and several NVIDIA-written acceleration libraries for machine learning, artificial intelligence and deep learning.
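A quick way to see which parts of that software stack are present on a given machine is to ask Python whether the corresponding modules can be found. The sketch below is an assumption-based convenience, not part of NVIDIA's specification: the function name check_stack is hypothetical, and the mapping assumes the usual import names (cudf for the RAPIDS dataframe library, tensorflow, torch and caffe). It only looks the modules up; it does not import them or test the GPU.

```python
from importlib.util import find_spec

# Display name -> Python import name (assumed conventional names).
STACK_MODULES = {
    "RAPIDS (cuDF)": "cudf",
    "TensorFlow": "tensorflow",
    "PyTorch": "torch",
    "Caffe": "caffe",
}


def check_stack(modules=STACK_MODULES):
    """Return {display name: True/False} for whether each module is installed."""
    return {label: find_spec(name) is not None for label, name in modules.items()}


for label, present in check_stack().items():
    print(f"{label:15s} {'found' if present else 'missing'}")
```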

The department of aeronautics and astronautics at MIT is a pre-release user of the NVIDIA Data Science specification. “The NVIDIA-powered data science workstation provides significant capabilities for training deep neural networks for robot perception. With it, the MIT FAST Labs’ ability to train drones to see depth and avoid collisions from a single camera was significantly accelerated because we could process larger batch sizes,” says Sertac Karaman, an associate professor in the department.

Computex Announcements Reshape Workstations

AMD, Intel and NVIDIA all introduced new technologies and products at the recent Computex trade show in Taiwan that were of interest to workstation users.

AMD announced Zen 2, a significant update to the Zen core technology used in its Ryzen and EPYC processors. The company claims the new core runs 15% more instructions per clock cycle than its predecessor, using larger cache sizes and a redesigned floating point engine. The Zen 2 core will power the 3rd Generation AMD Ryzen 9 processor, a new high-end CPU in the Ryzen line designed for workstations. It offers 12 cores/24 threads and is the only CPU on the market built with 7nm lithography.

AMD also announced a new motherboard chipset (X570 for socket AM4) that is the first to offer PCIe 4.0. The company claims this generation of PCIe offers 42% faster storage performance and can double motherboard bandwidth compared with PCIe 3.0. AMD says it expects more than 50 new motherboard models to ship in the next few months from a variety of vendors.
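The doubling claim follows from the published PCIe signaling rates, which can be checked with a little arithmetic. The figures below come from the PCIe 3.0 and 4.0 specifications rather than from AMD's announcement (8 and 16 gigatransfers per second per lane, both with 128b/130b encoding), and the function names are hypothetical. The results are theoretical per-direction maximums, not measured throughput.

```python
# Per-lane transfer rates in gigatransfers per second (from the PCIe specs).
TRANSFER_RATE_GT_S = {"3.0": 8.0, "4.0": 16.0}

# Both generations use 128b/130b encoding: 128 payload bits per 130 bits sent.
ENCODING_EFFICIENCY = 128 / 130


def lane_bandwidth_gb_s(generation: str) -> float:
    """Theoretical one-direction bandwidth of a single lane, in GB/s."""
    return TRANSFER_RATE_GT_S[generation] * ENCODING_EFFICIENCY / 8


def link_bandwidth_gb_s(generation: str, lanes: int) -> float:
    """Theoretical one-direction bandwidth of a link with `lanes` lanes."""
    return lane_bandwidth_gb_s(generation) * lanes


for lanes in (4, 16):  # x4 is typical for NVMe storage, x16 for a GPU slot
    gen3 = link_bandwidth_gb_s("3.0", lanes)
    gen4 = link_bandwidth_gb_s("4.0", lanes)
    print(f"x{lanes}: PCIe 3.0 {gen3:.2f} GB/s, PCIe 4.0 {gen4:.2f} GB/s, "
          f"ratio {gen4 / gen3:.1f}x")
```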

Intel announced its next-generation CPU platform Ice Lake, an integrated heterogeneous computing architecture with enhancements for artificial intelligence and deep learning. The company claims its Intel Deep Learning Boost (DL Boost) addition to the CPU and new AI instructions on the CPU’s integrated graphics driver will “usher in a new era of intelligent performance for PCs.” Intel claims DL Boost can offer up to 8.8x higher peak AI inference throughput than comparable products. DL Boost AI accelerators will also be available in the Xeon line of workstation and server CPUs. Intel claims common AI workloads such as image recognition and segmentation as well as object detection will run up to 14 times faster than the previous generation of Xeon processors.

NVIDIA announced the launch of new mobile workstations from several vendors using the Quadro RTX line of mobile GPUs. The RTX line for mobile brings the same specs as the desktop line, but in a mobile form factor. It offers real-time photorealistic rendering, AI acceleration and 8K video support for content creation including virtual reality. Dell, HP, Lenovo and MSI were among the vendors with new mobile workstations using the latest RTX technology.

Workstations meeting the NVIDIA Data Science specification go through testing and optimization tailored to the needs of data science users. The result is a local, single-user computer that NVIDIA says replaces the need for time on more expensive HPC or cloud computing platforms.

“The NVIDIA-powered data science workstation enables our data scientists to run end-to-end data processing pipelines on large datasets faster than ever,” said Mike Koelemay, chief data scientist at Lockheed Martin Rotary & Mission Systems. “Leveraging RAPIDS [software libraries] to push more of the data processing pipeline to the GPU reduces model development time, which leads to faster deployment and business insights.”

When Money Is No Object

“AI is a big market and huge talking point, and it starts on a workstation,” says Mike Leach, the workstation portfolio manager for AI, AR and VR at Lenovo. By following the NVIDIA specification, Leach says Lenovo can give users “a certified solution and the right software tools out of the box.”

“We see a big shift in data scientists,” adds Leach. They start with “gigabytes of data, a data lake of images [or] financial data on the left. They want to move to a fully predictive AI on the right. The journey in-between is seeing the data, iterating on predictions and accuracy.”

While attending an AI conference recently, Leach observed “sometimes money is no object.” Data scientists are expensive employees to bring aboard, but companies “consider them must-have, and they will provide the right hardware.” As a result, Leach says, these scientists “can deliver massive cost savings based on the products they create.”

Recognizing the Software Stack

Dell, HP and Microway are also releasing workstations for data scientists that incorporate NVIDIA GPUs. Boutique workstation vendor Velocity Micro is also building workstations to the NVIDIA Data Science specification and is awaiting formal NVIDIA certification. “We’ve been doing scientific computers for 20 years, just called different things,” notes CEO and founder Randall Copeland.

The biggest change with the NVIDIA Data Science specification is not as much about the GPU as the software stack, Copeland says. “A computer designed to run better for Revit or 3ds Max is not the same as the computer that runs CUDA best.”

Copeland says NVIDIA has done a “great job developing a market for artificial intelligence and deep learning.” The company recognized that its CUDA architecture was well-suited for massive data sets, and “they help people who are good in something else who have to become AI experts.”

A Personal Data Sandbox

The phrase “data scientist” was coined by a Google executive in 2010, says NVIDIA’s Geoffrey Levene, director of global business development for data science workstations. “Now [the phrase] has caught up to us; data doubles every 18 months in every vertical.” From one industry to the next the process is similar, Levene says. Data must first be “wrangled,” a step formally known as extract, transform, load (ETL). “Then you write the code to see what the data can do,” he explains.

From the exploration, the data scientist builds a model of the data use case. “This is the training part, used for inference and prediction,” Levene says. “Training is time-consuming; inference is fast. ETL is a lengthy process.” The workflow usually involves tabular data, and “GPUs can accelerate tabular data,” he says.
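The ETL, training and inference stages Levene describes can be illustrated with a toy example on tabular data. The sketch below is a CPU-only stand-in that uses just the Python standard library; on a data science workstation the same steps would typically run through GPU libraries such as RAPIDS, which are not shown here. The data, column names (load_kn, deflection_mm) and function names are all hypothetical.

```python
import csv
import io

# Hypothetical raw table; the third data row has a missing measurement.
RAW_CSV = "load_kn,deflection_mm\n1,3\n2,5\n3,\n4,9\n5,11\n"


def extract(text):
    """Extract: read raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: convert fields to numbers and drop incomplete rows."""
    clean = []
    for row in rows:
        try:
            clean.append((float(row["load_kn"]), float(row["deflection_mm"])))
        except (TypeError, ValueError):
            continue  # skip rows with missing or non-numeric values
    return clean


def train(pairs):
    """Train: fit y = slope * x + intercept by ordinary least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Inference: apply the fitted model to a new input."""
    slope, intercept = model
    return slope * x + intercept


table = transform(extract(RAW_CSV))  # the "load" step keeps the table in memory
model = train(table)
print(predict(model, 3.0))  # 7.0
```

Even in this toy form, most of the code is data wrangling, which matches Levene's point that ETL is a lengthy part of the process.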

Having a “personal sandbox” for data work is a boon for data scientists, Levene says. “Some are finding they do a week’s work in one day with GPU-accelerated workflows.” Levene also observes that “98% of AI for product development is machine learning.”

Artificial intelligence has been around for a generation, Levene notes, but there wasn’t enough data in many industries. Now data is abundant, thanks to both the internet and mobile devices. “Go back a year ago—you either bought time on a cloud GPU or spent up to $500,000 on a system” that took weeks for IT to install, says Levene. “Now you can order a couple of workstations and go to work.”

About the Author

Randall Newton

Randall S. Newton is principal analyst at Consilia Vektor, covering engineering technology. He has been part of the computer graphics industry in a variety of roles since 1985.

Digital Engineering, https://www.digitalengineering247.com//article/the-rise-of-data-science-workstations/engineering-computing. Last updated October 9, 2019.
