IEDM 2018: The future computer architecture of artificial intelligence

Last Update Time: 2019-07-27 11:16:30


Today's systems typically rely on heterogeneous computing to process the algorithms in AI applications. Looking further ahead, however, IBM's Jeff Welser sees new approaches: brain-inspired approximate computing, analog AI cores, and eventually quantum computing.


IBM's Jeff Welser offered an outlook on what future computer architectures for artificial intelligence might look like. In his view, the hardware available to date has played a supporting role in "narrow AI" (single-task, single-domain applications), but it will play a key role in the coming era of "broad AI" (multi-task, multi-domain applications). Developing broad AI together with specially designed hardware will shift the traditional balance between cloud and edge computing, between structured and unstructured data, and between training and inference. Heterogeneous system architectures have already been introduced, in which each node combines different computing resources (CPUs, high-bandwidth memory, and dedicated AI accelerators) over a high-performance network to achieve significant gains in computing power. Looking ahead, Welser's roadmap for accelerating AI proceeds from heterogeneous digital von Neumann machines, through accelerator architectures with reduced precision (approximate computing) and analog AI cores, to quantum computing for AI.


Jeff Welser of IBM Research - Almaden looks ahead to what future artificial intelligence computing architectures might look like.

Matrix multiplication usually plays an important role in AI applications.

Narrow AI systems are characterized by human-level or superhuman accuracy and speed in a single domain, for a specific task; they are already widespread in applications ranging from facial recognition to natural-language translation. We are only at the beginning of the era of broad AI: multi-task, multi-domain, multi-modal, and including explainable AI. Transfer learning and inference are key to extending AI to small data sets, and reducing the time and computation AI requires is critical for developing and deploying broad-AI systems, which in turn point the way toward "general AI" (cross-domain learning and reasoning).

In general, machine learning and deep learning (ML/DL) can be divided into two operating modes: training and inference. In the training phase, a model is built by solving an optimization problem in a high-dimensional parameter space so that it later generalizes during inference. In deep learning, the model usually consists of a multi-layer network with many free parameters (weights) whose values are set during training. The trained model must then process real-world data in inference mode. In many applications, the inference step requires a fixed, frozen training model for reasons of consistency, repeatability, reliability, performance, or legal compliance.
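The two operating modes described above can be illustrated with a minimal sketch: a toy linear model (the data and learning rate here are hypothetical, not from Welser's talk) whose free parameters are set by gradient descent during training and then frozen for inference.

```python
import numpy as np

# Training vs. inference, sketched on a toy linear model.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))           # training inputs
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w                         # targets from a known linear rule

# Training: optimize the free parameters (weights) in parameter space.
w = np.zeros(3)
for _ in range(200):
    grad = 2 * X.T @ (X @ w - y) / len(X)   # least-squares gradient
    w -= 0.1 * grad                         # gradient-descent update

# Inference: the trained model is fixed and only applied to new data.
w.setflags(write=False)                # freeze the weights
x_new = np.array([1.0, 1.0, 1.0])
prediction = x_new @ w
```

Freezing the weights after training mirrors the consistency and repeatability requirements mentioned above: the inference path performs only the forward matrix-vector products, with no parameter updates.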

When talking about future AI hardware, it is worth analyzing the AI algorithms themselves. In deep neural networks, matrix multiplication is the core operation. In a fully connected network, data propagates through the network as vector-matrix multiplications; to make better use of compute resources, several data points are usually combined into a batch (mini-batch), turning these into matrix-matrix multiplications. The convolutions in a convolutional neural network (CNN) can likewise be expressed as matrix multiplications: the first step rearranges the input of the convolution operation into a matrix, which is then multiplied by the kernel. Developing an optimized system for these tasks requires considering the end-to-end system, including transistor structure, hardware, software, and programming.
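The rearrangement step mentioned above, often called "im2col", can be sketched as follows: each receptive-field patch of the input is flattened into one row of a matrix, after which the whole convolution collapses into a single matrix product. The function name and toy input are illustrative, not from the article.

```python
import numpy as np

def conv2d_as_matmul(image, kernel):
    """Express a 2-D convolution (cross-correlation, as in deep
    learning frameworks) as one matrix multiplication via im2col."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1
    # im2col: one flattened input patch per output position.
    patches = np.array([
        image[r:r + kh, c:c + kw].ravel()
        for r in range(oh) for c in range(ow)
    ])
    # The convolution is now a single patch-matrix @ kernel-vector product.
    return (patches @ kernel.ravel()).reshape(oh, ow)

img = np.arange(16.0).reshape(4, 4)
k = np.array([[1.0, 0.0],
              [0.0, -1.0]])
out = conv2d_as_matmul(img, k)
```

This reduction is why matrix-multiply throughput dominates both fully connected and convolutional workloads, and why AI accelerators are built around dense matrix engines.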