NVIDIA’s GPU Technology Conference (GTC) in San Jose always marks the start of spring. At this time of year the daffodils haven’t yet broken ground in Boston, but the days are warming and the snow is becoming less frequent. In San Jose spring was in full swing, with sunny days and temperatures above 20°C following a winter of record rains. GTC addresses all things GPU and, increasingly, their AI use cases. Over the last few years the number of attendees has ramped up from a few thousand to around 9,000. The largest group of attendees appears to be applying GPUs to deep neural network (DNN) training and inference.
Jensen Huang’s keynote
The event has three principal pillars: the keynote, the GPU training sessions and the exhibition floor. Jensen Huang’s keynote moved to Monday afternoon this year and almost filled the San Jose State University Event Center, which has 2,000 more seats than the San Jose Convention Center. The keynote traditionally announces NVIDIA’s compelling new technology and its product and industry initiatives. This year did not disappoint: the number of product and industry initiatives expanded, showing the increasing commercialisation of last year’s flood of new products.
My favorite AI-related technical announcements were automatic mixed precision operation, which will maximise the performance of memory-limited DNN training, and the set of CUDA-X AI libraries, which facilitate speedy development from data curation to DNN deployment. Additionally, the Jetson Nano, costing only 99 USD and available at the show, will help move innovative AI into the world of IoT.
Last fall’s announcement of the RAPIDS suite of data analytics tools was followed up with the announcement of a data analytics workstation available from NVIDIA’s OEM partners. The data workstation includes two Quadro RTX GPUs (RTX 8000s would be my choice) and exploits CUDA-X and RAPIDS, which will help drive large-scale data mining into the FTSE 1000. Microway, one of NVIDIA’s OEMs, have priced their workstations starting at 15,000 USD.
Microway’s implementation of NVIDIA's data analytics workstation
NVIDIA’s acquisition of Mellanox positions them at the center of the highest-performing AI clusters for some time to come. Every one of the large AI DNN clusters on-site at Verne Global’s Icelandic data center exploits Mellanox InfiniBand networking so that the GPUs are not left waiting for data to be delivered to them. NVIDIA and Mellanox have been peas in the same AI DNN training pod for some time. Perhaps by coincidence, “heavy-metal” DGX-2 GPU PODs were announced too. These are DGX-2 (16-GPU), storage and networking reference designs for the major storage vendors, which ensure that the overall cluster operates at peak performance and reliability, not surprisingly exploiting Mellanox networking.
I always spend a day walking the exhibition floor at GTC. This year there appeared to be slightly fewer hardware systems integrators and server manufacturers but many more start-ups with innovative AI products. The number of autonomous vehicle demos, both on and off the NVIDIA partner booth, felt like it had doubled. I did not even recognise some of the Asian car manufacturers showcasing their cars. As autonomous vehicle technology evolves rapidly from research to prototypes to early production, there was more focus on in-vehicle inference than at previous GTCs.
Chinese Autonomous Test Vehicles
Reflecting on the industry’s bigger picture a couple of weeks after GTC, I continue to be impressed by how hard-charging and all-encompassing the NVIDIA AI/GPU portfolio has become.
Over drinks, I was brainstorming with a fellow strategy executive about how hard it’s going to be for competitors to make major inroads into NVIDIA's markets. The table stakes for a computer accelerator (GPU or alternative) are now two or three revisions of silicon, at about 150M USD each, plus an extensive suite of software development tools, libraries and drivers that perhaps costs even more to develop, and all of that just to be a market entrant.
Funding on this scale will become increasingly difficult to raise all at once. I suspect that computer accelerator start-ups will follow their pharmaceutical colleagues: they will be funded initially for the design proof of concept, then acquired while the development ecosystem is built, then acquired again as end-user designs are won and start to ramp, and finally acquired once more by the large semiconductor conglomerates for massive-scale production.
Surprisingly, there is a cottage industry shipping autonomous vehicle test data from the cars to the device company’s or car manufacturer’s data center. The vehicles’ solid-state drives, holding terabytes of data gathered from hundreds of sensors at 100 Hz, are put in Pelican shipping cases and sent by DHL.
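A back-of-the-envelope calculation shows why shipping disks can beat the network. The sketch below is only illustrative: the sensor count, sample size, driving hours and uplink speed are hypothetical inputs picked to show the arithmetic, not figures from any manufacturer, and the function names are my own.

```python
def daily_volume_bytes(sensors, rate_hz, bytes_per_sample, hours):
    """Raw bytes one car records in a day of test driving."""
    return sensors * rate_hz * bytes_per_sample * hours * 3600


def upload_hours(volume_bytes, link_mbit_per_s):
    """Hours needed to push that volume over a network link."""
    bytes_per_second = link_mbit_per_s * 1_000_000 / 8
    return volume_bytes / bytes_per_second / 3600


# Hypothetical inputs, chosen only to illustrate the arithmetic.
volume = daily_volume_bytes(sensors=200, rate_hz=100, bytes_per_sample=1024, hours=8)
hours_needed = upload_hours(volume, link_mbit_per_s=100)

print(f"{volume / 1e12:.2f} TB per car per day")
print(f"{hours_needed:.1f} hours to upload on a 100 Mbit/s link")
```

Swap in real sensor payloads (camera and lidar frames are far larger than a kilobyte) and the upload time quickly exceeds the length of the day, which is when a courier starts to look sensible.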
While I was brainstorming with a transmissions AI expert from Germany, it became apparent that there is an opportunity to do some edge computing on the data, either in the cars or at their local garage, to validate and compress it before sending it across a network to a local aggregation computer center, where it can be curated, de-duplicated and tested for uniqueness.
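The validate-and-compress step at the edge could look something like the following minimal sketch. It assumes a batch of plain numeric sensor readings; the plausibility bounds, the JSON encoding and the function names are placeholders of mine, and a real pipeline would use sensor-specific checks and formats.

```python
import json
import math
import zlib


def validate(samples, low=-1000.0, high=1000.0):
    """Keep only finite readings inside a plausible range (bounds are placeholders)."""
    return [
        s for s in samples
        if isinstance(s, (int, float)) and math.isfinite(s) and low <= s <= high
    ]


def pack(samples):
    """Serialise and compress a validated batch before it leaves the car or garage."""
    raw = json.dumps(samples).encode("utf-8")
    return zlib.compress(raw, level=9)


def unpack(blob):
    """Reverse of pack(), as run at the aggregation center."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Made-up batch: mostly steady readings plus one NaN and one out-of-range spike.
readings = [20.5] * 500 + [float("nan"), 1e9] + [21.0] * 500
clean = validate(readings)
blob = pack(clean)

print(len(clean), "valid samples in", len(blob), "compressed bytes")
```

Validating before compressing means corrupt readings never consume network bandwidth, and the aggregation center can call `unpack` to recover exactly the batch that was sent.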
Imagine a test-driving center in Munich with about 50 cars. Each morning each car drives to the petrol station, and then to the major roundabout branching to all the differing terrain types surrounding Munich. Hence, every day there are 50 sets of this same data all the way to the roundabout and back. Clearly, 49 or so copies of this data add very little to the DNN training exercise and should be culled; what remains should then be tested for uniqueness against the previously collected data.
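The culling idea in the Munich example can be sketched in a few lines. This is a deliberately simple stand-in: it treats two drives as duplicates when their GPS tracks fall in the same sequence of coarse grid cells, whereas a production system would compare the sensor content itself. The grid size, the coordinates and the function names are all hypothetical.

```python
import hashlib


def fingerprint(track, cell=0.001):
    """Hash a GPS track after snapping each point to a coarse grid."""
    cells = []
    for lat, lon in track:
        key = (round(lat / cell), round(lon / cell))
        if not cells or cells[-1] != key:  # collapse consecutive repeats
            cells.append(key)
    return hashlib.sha256(repr(cells).encode("utf-8")).hexdigest()


def cull_duplicates(segments, seen=None):
    """Keep the first segment per fingerprint; `seen` carries earlier days' data."""
    seen = set() if seen is None else seen
    unique = []
    for car_id, track in segments:
        fp = fingerprint(track)
        if fp not in seen:
            seen.add(fp)
            unique.append((car_id, track))
    return unique, seen


# 50 cars: each drives the same commute (with tiny GPS jitter) and one leg of its own.
segments = []
for car in range(50):
    commute = [(48.137 + i * 0.001 + car * 1e-6, 11.575) for i in range(5)]
    own_leg = [(48.2 + car * 0.01, 11.6 + i * 0.001) for i in range(5)]
    segments.append((car, commute))
    segments.append((car, own_leg))

unique, seen = cull_duplicates(segments)
print(len(segments), "segments recorded,", len(unique), "kept")
```

Passing the returned `seen` set back in on the following day is what tests new data for uniqueness against the previous data, so a repeat of yesterday's commute is dropped as well.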
Getting ready for another day of data collection
Thereafter, the drastically smaller, compressed and unique data set can be sent over the network to an industrial-scale compute facility for DNN training and additional data simulation, such as kids running in front of a car, a scenario for which you would never want to collect real data. By chance, Verne Global has an ideal industrial-scale compute facility for this last phase – come take a look!