Microsoft is working on a new large-scale AI language model called MAI-1 that could potentially rival state-of-the-art models from Google, Anthropic, and OpenAI, according to a report by The Information. This marks the first time Microsoft has developed an in-house AI model of this magnitude since investing over $10 billion in OpenAI for the rights to reuse the startup's AI models. OpenAI's GPT-4 powers not only ChatGPT but also Microsoft Copilot.
Development of MAI-1 is being led by Mustafa Suleyman, the former Google AI chief who recently served as CEO of the AI startup Inflection before Microsoft acquired the majority of the startup's staff and intellectual property for $650 million in March. Although MAI-1 may build on techniques brought over by former Inflection staff, it is reportedly an entirely new large language model (LLM), as confirmed by two Microsoft employees familiar with the project.
With roughly 500 billion parameters, MAI-1 will be significantly larger than Microsoft's previous open source models (such as Phi-3, which we covered last month), requiring more computing power and training data. This reportedly places MAI-1 in a similar league to OpenAI's GPT-4, which is rumored to have over 1 trillion parameters (in a mixture-of-experts configuration), and well above smaller models like Meta's and Mistral's 70 billion parameter models.
The development of MAI-1 suggests a dual approach to AI within Microsoft, focusing on both small, locally run language models for mobile devices and larger, state-of-the-art models powered by the cloud. Apple is reportedly exploring a similar approach. It also highlights the company's willingness to pursue AI development independently from OpenAI, whose technology currently powers Microsoft's most ambitious generative AI features, including a chatbot baked into Windows.
Reportedly, the exact purpose of MAI-1 has not been determined (even within Microsoft), and its ideal use will depend on its performance, according to one of The Information's sources. To train the model, Microsoft has been allocating a large cluster of servers with Nvidia GPUs and compiling training data from various sources, including text generated by OpenAI's GPT-4 and public Internet data.
Depending on the progress made in the coming weeks, The Information reports that Microsoft may preview MAI-1 as early as its Build developer conference later this month, according to one of the sources cited by the publication.