After a couple of weeks of sunning and sailplane racing in Florida, it was straight back into the saddle for a London-based Quarterly Business Review and the excellent “Rise of AI” event in Berlin. After a year of attending international AI events, I’m starting to develop a feel for their differing textures, and here is my current evaluation.
So far, I’ve sampled:
- Pure AI events
- High Performance Computing (HPC) events which include some AI
- Vendor events promoting technology which can accommodate AI
- AI pavilions at general computing events
- AI technology conferences (like NVIDIA’s GPU Technology Conference)
- AI user groups
I enjoy the pure AI events the most, likely because most of the material and concepts are new and interesting. Last week’s “Rise of AI” drew a great cross-section of AI industry attendees, including a few veterans of the 1980s 'expert systems' era, but more commonly freshly minted PhDs in machine learning. The three speaking tracks struck a great balance of industry predictions, AI technologies and product showcases, with only standing room remaining at many of the sessions.
As an industrial-scale data center specialising in HPC and intensive computing, we’re rather attracted to the deep neural network (DNN) training folks who need hundreds of kilowatts of servers and GPUs running non-stop, so I focused on the DNN crowd, who attended in abundance. It’s amazing that you often get better engagement with good prospects while queuing for coffee than standing at a booth! Such was my luck. My favourite sales and technical people work both angles well. Below is the excellent Verne Global team we had at the conference:
I struggle to pick the best speaking slots from the extensive event programmes. I was sure the Natural Language session would be fascinating, given the large amount of NLP training running in our data center (check out DeepL as an example). So I made a bee-line for “Natural Language Processing – The Magic of Communication” by Dr. Aljoscha Burchardt. I arrived early to get a seat and caught the last 15 minutes of the previous talk, “Can neuroscience inspire AI?” by Dr. Prateep Beed (below). My goodness, this was a real gem, introducing concepts that could evolve the way we approach neural network training. It’s a pity I missed the earlier part.
Deep neural network training today applies once-unimaginable amounts of compute power to configure a network to perform a task like image recognition efficiently. Our brains have evolved far more efficient ways of doing that: imagine how many times you would need to be nearly eaten by some animal to learn to keep your distance. Some of Dr. Beed’s research looks for these accelerators and for ways to improve how machines learn. He gave us a primer on some neuroscience basics with catchy aide-mémoires:
Neurons that fire together wire together - A repeated task or observation will imprint itself, effectively increasing your sensitivity to it.
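The Hebbian idea behind that phrase can be sketched in a few lines of NumPy. This is a toy illustration of the classic update rule Δw = η·x·y, not anything from Dr. Beed’s talk; the learning rate, activity values and repetition count are all made up:

```python
import numpy as np  # only used here to stay in the document's numeric idiom

# Toy Hebbian update: "neurons that fire together wire together".
# All values below are illustrative, not from the talk.
eta = 0.1            # learning rate (hypothetical)
w = 0.0              # synaptic weight between two neurons

for _ in range(20):  # the same observation repeated 20 times
    x, y = 1.0, 1.0  # pre- and post-synaptic activity coincide
    w += eta * x * y # each co-activation strengthens the connection

print(round(w, 2))   # the repeated pairing has imprinted itself: 2.0
```

Each repetition nudges the weight upward, which is the "increased sensitivity" the aide-mémoire describes.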
Memories are actively remembered and forgotten - I always assumed that when I forgot something it was rather random. Something I tried to remember would disappear, while others of no real importance would loiter for years. Clearly, I need to work on my programming! The concept that our intelligence manages which data/DNN component is cached for immediate use, stored locally, stored for long-term retrieval or discarded is fascinating, and will become increasingly important as DNNs spend prolonged periods in production environments.
Timing and direction of synaptic signals are impactful - The DNNs I’ve encountered are unidirectional, with data entering at one end and the result being delivered at the other. Human synapses can be electrical or chemical. The electrical ones can be bi-directional, and the uni-directional chemical ones have many state options beyond a binary on/off. In both cases the neurotransmitters have a lifespan, after which their activation is no longer effective. Just imagine how much more powerful today’s DNNs would be with some of these more complex attributes.
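For contrast, the strictly one-way flow described above can be sketched as a minimal feedforward pass. The layer sizes and random weights here are purely illustrative, not a real model:

```python
import numpy as np

rng = np.random.default_rng(0)

# A minimal two-layer feedforward pass: data enters at one end and
# the result emerges at the other, with no backward or stateful
# signalling at inference time. Shapes and weights are made up.
x = rng.normal(size=(1, 4))       # input "entering at one end"
W1 = rng.normal(size=(4, 8))      # first layer weights
W2 = rng.normal(size=(8, 2))      # second layer weights

hidden = np.maximum(0, x @ W1)    # ReLU layer: signals move forward only
output = hidden @ W2              # "result delivered at the other" end
print(output.shape)               # (1, 2)
```

Every activation is a simple on-or-off-scaled value passed forward once, which is exactly the contrast with bi-directional, multi-state biological synapses that the talk drew.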
Transfer learning - solving a problem with modest data - We can transfer learning from one problem to the next, like applying tennis skills to badminton. Imagine the possibilities if tasks could be recognised as similar, so that DNN components from one task could be reused to solve a related puzzle when there is insufficient data to train the second task’s own DNN.
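The usual way this plays out in practice is to freeze a feature extractor learned on the data-rich task and fit only a small new "head" on the modest dataset. A minimal sketch, with entirely made-up weights and data standing in for the pre-trained network and the second task:

```python
import numpy as np

rng = np.random.default_rng(1)

# Transfer-learning sketch: reuse a "pre-trained" feature extractor
# (frozen) and fit only a small new head on modest data. The frozen
# weights and tiny dataset below are invented for illustration.
W_frozen = rng.normal(size=(10, 5))           # features learned on task A

X_small = rng.normal(size=(20, 10))           # only 20 examples for task B
y_small = rng.integers(0, 2, size=20).astype(float)

features = np.maximum(0, X_small @ W_frozen)  # reuse task-A features as-is

# Fit just the new head by least squares: far fewer parameters to
# learn than training a whole network from scratch.
head, *_ = np.linalg.lstsq(features, y_small, rcond=None)
print(head.shape)                             # (5,) - the only new weights
```

Only five new weights are learned here, which is why a handful of examples can suffice when the heavy lifting was already done on the first task.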
This sort of neuroscience-based insight may be why IBM added native NVLink support to their Power8/9 CPUs. I keep finding folks who want to test their machine-vision DNN training on Power9 to verify the rumours that it significantly reduces DNN training time.
These insights suggest we are just beginning a long DNN adventure, where the current technology is analogous to the Bell 212 modems of the pre-Internet era, running at 1,200 bits per second. The next 30 years will be a blast!
My next AI event will be the AI Summit in London on June 13/14th, which looks like it could be superb. Let’s chat there. One of my colleagues, Wil Wellington, will also be presenting at the German AI meet-up in Hamburg on June 22nd and expects to have a great dialogue with the experts there. And remember, all AI training roads lead to Iceland!