Microsoft AI tool enables ‘extremely large’ models with a trillion parameters

Microsoft Corp. has released a new version of its open-source DeepSpeed tool that it says will enable the creation of deep learning models with a trillion parameters, more than five times as many as in the world’s current largest model.
The company also expects the tool, released Thursday, to benefit developers working on smaller projects. DeepSpeed is a software library for training artificial intelligence models. Announced in February, it has already gone through multiple iterations that increased the maximum size of the models it can train from more than 100 billion parameters to more than a trillion.
At a high level, parameters are the numerical values an AI model adjusts during training, and they can be thought of as the insights the model learns from processing data. These insights are what enable AI models to improve their accuracy and speed with time. In general, the more parameters a neural network has, the more proficiently it can process the data it ingests and thereby produce higher-quality results.
The challenge that DeepSpeed was created to address is that developers can only equip their neural networks with as many parameters as their AI training infrastructure can handle. In other words, hardware limitations are an obstacle to building bigger and better models. DeepSpeed makes the AI …
