Artificial intelligence weathers the storm as dedicated AI chips sail into a blue ocean

First, from "Deep Blue" to "AlphaGo": twenty years of artificial intelligence

Twenty years after "Deep Blue" battled Kasparov in 1996, "AlphaGo" added another bold stroke to the history of artificial intelligence through a man-versus-machine match. Looking back today, we can smile at the fact that the once-stunning "Deep Blue" was really just an excellent chess program running on a supercomputer: to support that program, the IBM team built a 1.2-ton machine fitted with 480 dedicated chess chips.

Unlike Deep Blue's traversal-search strategy, which relied on superior raw computing power, AlphaGo's design incorporates the deep learning algorithms that have made significant progress in recent years. Deep learning is called "deep" relative to shallow learning algorithms such as single-hidden-layer neural networks and support vector machines. The limitation of the latter is that, with finite samples and computational units, their ability to represent complex functions is limited, and they must rely on features extracted by hand from the samples. Deep learning, by contrast, builds a deep nonlinear network structure that approximates complex functions and extracts features automatically, and it shows a powerful ability to mine statistical regularities even from modest sample sets.
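
To make the contrast concrete, here is a minimal NumPy sketch (purely illustrative, and in no way AlphaGo's actual network) of what a "deep nonlinear network structure" means in code: several stacked nonlinear layers, each of which re-represents the previous layer's output, which is the "automatic feature extraction" referred to above.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def deep_forward(x, weights):
    """Forward pass through stacked nonlinear layers; each hidden layer
    re-represents its input, i.e. extracts features automatically."""
    h = x
    for W, b in weights[:-1]:
        h = relu(h @ W + b)        # nonlinear hidden representation
    W_out, b_out = weights[-1]
    return h @ W_out + b_out       # linear read-out layer

# Toy network: three hidden layers of width 64 on a 128-dimensional input.
rng = np.random.default_rng(0)
sizes = [128, 64, 64, 64, 10]
weights = [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
           for m, n in zip(sizes[:-1], sizes[1:])]

x = rng.standard_normal((32, 128))     # a batch of 32 samples
print(deep_forward(x, weights).shape)  # -> (32, 10)
```

A shallow model, by comparison, would map hand-crafted features to the output in a single step; the stacking of nonlinearities is precisely what buys the extra representational power, at the price of far more computation.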

In face recognition based on deep learning, Facebook's DeepFace project and the Chinese University of Hong Kong's DeepID project reached recognition rates of 97.35% and 97.45%, respectively, on the LFW (Labeled Faces in the Wild) database in 2014, approaching the roughly 97.5% accuracy of human recognition. Deep learning has likewise proved its advantages in fields such as image classification and speech recognition, and its victory in Go, one of the most complex perfect-information games, suggests the algorithm still has enormous untapped potential.

There is also a little-known episode in the AlphaGo story. Before the match against Lee Sedol, AlphaGo had defeated the European Go champion, professional 2-dan Fan Hui, by a lopsided 5:0 in a match made public in January 2016. After reviewing those games, Lee Sedol expressed confidence that he could defend humanity's last honor in the game of Go. In just over a month, however, Google swapped AlphaGo's core computing hardware from CPUs and GPUs to its dedicated deep learning chip. And so we saw the "Stone Buddha's" smile fade and his fingers tremble.

Second, without hardware support, deep learning is only a "dragon-slaying skill"

In fact, Professor Geoffrey Hinton of the University of Toronto proposed the concept of deep learning as early as 2006, and shallow learning algorithms had already been widely recognized by the academic community in the 1980s. The reason applications in this field have only warmed up in recent years is that the development of AI depends on two kinds of support: big data and computing resources.

Deep learning models need enormous amounts of training data to deliver the desired results. Take speech recognition as an example: in the acoustic-modeling stage, the algorithm faces on the order of billions to hundreds of billions of training samples. Only a mathematical model with strong expressive power can fully exploit the rich information contained in such massive data, and, correspondingly, processing that data requires equally powerful computing resources.
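
To see why such data volumes immediately become a hardware question, a back-of-envelope estimate helps. All the figures below are assumptions chosen only for illustration, not measurements of any real system.

```python
# Rough training-cost estimate under assumed (not measured) figures.
samples = 10e9             # assume 10 billion training frames
flops_per_sample = 20e6    # assume ~20 MFLOPs forward + backward per frame
epochs = 10                # assume 10 passes over the data

total_flops = samples * flops_per_sample * epochs   # ~2e18 FLOPs
accelerator_flops = 10e12                            # assume a 10-TFLOPS chip, fully utilised

seconds = total_flops / accelerator_flops
print(f"~{total_flops:.1e} FLOPs, roughly {seconds / 86400:.1f} days on one chip")
```

Even under these optimistic assumptions the job occupies a high-end accelerator for days; on a general-purpose CPU delivering a few tens of GFLOPS it would stretch to well over a year.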

To give a deliberately exaggerated example: training a small-to-medium network that takes a present-day computer one day might have taken a computer of 20 years ago nearly 20 years. So even if the deep learning algorithm had been born two decades ago, without matching hardware it could only have remained a "dragon-slaying skill": impressive, but with nowhere to be used. Even today, AI-related hardware still lags well behind the software algorithms. On the one hand, the AI world has a great many fast-evolving algorithms for hardware to chase, while general-purpose chips have coasted on decades of Moore's-Law upgrades alongside the software; on the other hand, the current mainstream way to run deep learning is on GPU chips, and chips customized specifically for deep learning algorithms are still far from shipping at scale. Although the GPU is architecturally more efficient than the CPU for this workload, it is far from optimal, and its power consumption is so high that it is hard to deploy in mobile terminals, let alone Internet of Things applications.
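
The architectural point can be felt with a crude stand-in experiment on any laptop: the same matrix multiplication done with serial Python loops versus through NumPy's vectorized BLAS backend. This is not a real CPU-versus-GPU benchmark, only an illustration of how much throughput a parallel execution model buys for exactly the kind of dense linear algebra deep learning consists of.

```python
import time
import numpy as np

n = 128
A = np.random.rand(n, n)
B = np.random.rand(n, n)

# Serial scalar loops: one multiply-add at a time.
t0 = time.perf_counter()
C = [[sum(A[i, k] * B[k, j] for k in range(n)) for j in range(n)] for i in range(n)]
t_loop = time.perf_counter() - t0

# Vectorized BLAS call: the same arithmetic, executed in parallel batches.
t0 = time.perf_counter()
C_blas = A @ B
t_blas = time.perf_counter() - t0

assert np.allclose(C, C_blas)
print(f"loops: {t_loop:.2f} s, BLAS: {t_blas * 1000:.2f} ms, speed-up ~{t_loop / t_blas:.0f}x")
```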

Third, "high throughput" in the cloud, "small, fast and nimble" on the device

Current AI applications fall into two main categories: server-side and mobile-terminal. On the server side, the chip running the AI algorithm should support as many network structures as possible to guarantee the algorithm's accuracy and generalization ability; it must support high-precision floating-point arithmetic, with peak performance at the Tflops level (10^12 floating-point operations per second) or above, which makes power consumption large (over 200W); and, to improve performance, it must support an array structure, meaning multiple chips can be combined into a computing array to accelerate the workload. Because the server-side AI chip must remain general-purpose, its performance optimizations cannot be tailored to a single algorithm; only macro-level optimization is possible.
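
To unpack what "Tflops-level peak performance" and an "array structure" imply, the sketch below works through the arithmetic with assumed figures roughly in the range of 2016-era accelerators; it is not the specification of any particular chip.

```python
# Peak-throughput arithmetic under assumed figures (illustration only).
clock_hz = 1.3e9         # assume a 1.3 GHz core clock
fp32_units = 3584        # assume 3584 single-precision ALUs
flops_per_cycle = 2      # a fused multiply-add counts as 2 FLOPs

peak_flops = clock_hz * fp32_units * flops_per_cycle
print(f"single chip: {peak_flops / 1e12:.1f} TFLOPS")   # ~9.3 TFLOPS

# "Array structure" means several chips can be ganged together, with peak
# throughput scaling almost linearly minus interconnect overhead.
chips = 8
scaling_efficiency = 0.8   # assumed interconnect/communication efficiency
array_flops = chips * peak_flops * scaling_efficiency
print(f"8-chip array: ~{array_flops / 1e12:.0f} TFLOPS")
```

The large power budget quoted above (over 200W) is the price of sustaining this kind of throughput on a general-purpose part.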

Today's mainstream server-side hardware accelerators are dominated by graphics processors (GPUs) and field-programmable gate arrays (FPGAs). The GPU's strong floating-point capability has taken it far beyond its original graphics work into scientific computing, password cracking, numerical analysis, massive data processing and other fields that need large-scale parallel computation. The FPGA, for its part, lags ASIC chips in computing speed and its products iterate more slowly than GPU chips; but its power consumption is only about one tenth of a GPU's, and it can be reconfigured to maximize optimization for the target application. Beyond FPGAs and GPUs, a number of companies are also building server-side deep learning acceleration chips, such as Google's TPU, Intel's Nervana Systems, and Wave Computing.

Mobile-side AI chips differ fundamentally in design philosophy from server-side AI chips. First, a mobile AI chip must meet low-latency requirements. Latency here refers to the round trip between the mobile terminal and the cloud or server: in a familiar application like Siri, the handset uploads voice data to the cloud, the cloud runs the algorithm and sends the result back, and the network delay must be as small as possible to preserve the user experience. In scenarios with hard real-time requirements, such as driving assistance and security monitoring, the importance of low latency goes without saying. Second, a mobile AI chip must keep power consumption within a tight budget; in other words, it must guarantee high computational efficiency. Finally, performance requirements on the mobile side are not as demanding as on the server, so some loss of computational precision is acceptable, which allows fixed-point operations and network compression to speed things up.

Looked at from another angle, sending all data back to the cloud can cause network congestion on the one hand and data-security problems on the other: if data is maliciously hijacked in transit, the consequences could be severe. An inevitable trend, therefore, is to run the fast-reacting parts of the AI algorithm locally on the mobile device, avoiding these problems as far as possible.
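
As an illustration of the "fixed-point operations and network compression" mentioned above, the sketch below quantizes FP32 weights to int8 with NumPy. It is a toy example under simplifying assumptions (a single symmetric scale per tensor, no calibration data); production quantization flows are considerably more involved.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric post-training quantization: map the largest |weight| to 127."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)   # pretend weight matrix
q, scale = quantize_int8(w)

error = np.abs(w - dequantize(q, scale)).mean()
print(f"storage: {w.nbytes} -> {q.nbytes} bytes, mean abs error {error:.5f}")
```

The 4x drop in storage and bandwidth, plus the ability to use cheap integer multipliers instead of floating-point units, is exactly what makes this accuracy-for-efficiency trade-off attractive on a power-constrained mobile chip.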

Fourth, dedicated AI chips: a blue ocean for the industry giants

Just as multimedia applications and 3D games did 20 years ago, the rise of Internet big data has placed new demands on high-performance chips. As noted above, GPUs and FPGAs are the mainstream solutions software companies adopt today. Baidu's machine-learning hardware system uses FPGAs to build AI-specific accelerators and has already deployed them at scale in applications such as speech recognition and ad click-through-rate estimation. In speech recognition, iFLYTEK currently runs nearly all of its deep learning training on GPU accelerator cards, although industry reports say it plans to move its speech-recognition business onto an FPGA platform.

As the respective giants of the GPU and FPGA fields, NVIDIA and Intel have both announced plans to develop AI-specific chips. In the first half of 2016, NVIDIA launched the Tesla P100 GPU aimed at deep neural networks and built the deep learning supercomputer NVIDIA DGX-1 around it; NVIDIA and IBM have also jointly launched several server products aimed specifically at artificial intelligence. Intel, having acquired the FPGA giant Altera, is not far behind, leveraging the FPGA's strengths in big-data processing to create a new line focused on high-performance computing and AI applications.

In addition, Intel announced in August 2016 the acquisition of the deep learning chip startup Nervana to strengthen its AI business. At present, the biggest chip-level variable comes from Google's TPU, a chip tailored to TensorFlow, Google's software engine for deep neural networks. According to Google, measured against the trajectory of Moore's Law, the TPU's current computing power is equivalent to what general-purpose chips would otherwise reach only about seven years from now. The TPU already serves Google's AI services such as RankBrain, Street View, and AlphaGo.

The TPU's high performance comes from Google's targeted optimization for its own AI applications: it fits machine learning algorithms far more closely, in both performance and power consumption, than general-purpose chips such as GPUs and FPGAs. In terms of performance, how much a dedicated AI chip optimized for one algorithm improves on a GPU depends on that specific algorithm; if the GPU happens to hit a bottleneck on it, the dedicated chip may run several times faster. And because AI algorithms keep evolving rapidly, dedicated AI chips must be developed hand in hand with the software.

From a cost perspective, once any chip reaches mass production its cost falls quickly. For server-side AI chips, however, volumes are nowhere near those of the mobile market; moreover, because the emphasis is on raw computing performance, the technical barriers are high and new competitors find it hard to enter quickly. At present there is essentially no opportunity for startups in AI chips: taping out such a chip runs into the tens of millions of dollars, only a handful of players worldwide can afford it, and every giant is eyeing the huge AI cake, so a spoiler in this field is all but impossible. AI chips may be a blue ocean, but it is a blue ocean reserved for large companies.
