Microsoft Unveils Phi-2: A Breakthrough in AI Model Development

Microsoft has made a significant announcement in the field of artificial intelligence (AI), unveiling its latest AI model, Phi-2. This new model is part of Microsoft’s strategy to develop smaller, more specialized AI models for specific use cases. It follows the release of Phi-1 and Phi-1.5, which were designed to have fewer parameters than their larger counterparts.

The Phi-2 model has 2.7 billion parameters and is claimed to match or outperform language models up to 25 times its size. This achievement is noteworthy considering that the traditional approach of scaling up GPU chips to accommodate ever-larger models is becoming increasingly challenging. Smaller, more industry- or business-focused models like Phi-2 offer a cost-effective alternative that can deliver tailored results for specific business needs.

Avivah Litan, a vice president distinguished analyst with Gartner Research, explains the limitations of scaling up models indefinitely: “Sooner or later, scaling of GPU chips will fail to keep up with increases in model size. So, continuing to make models bigger and bigger is not a viable option.” This presents an opportunity for smaller, more domain-specific language models, trained on targeted data, to challenge the dominance of today’s leading large language models.

Dan Diasio, Ernst & Young’s Global Artificial Intelligence Consulting Leader, adds another dimension to the discussion, highlighting the impact of the ongoing chip shortage on both tech firms and user companies seeking to build their own proprietary models. This shortage has driven up costs and led to a trend of using knowledge enhancement packs and prompt libraries containing specialized knowledge.

Microsoft positions Phi-2 as an “ideal playground for researchers” due to its compact size. Researchers can explore areas such as mechanistic interpretability, safety improvements, and fine-tuning experimentation across various tasks. But the significance of Phi-2 goes beyond its size. Victor Botev, CTO and co-founder at start-up Iris.ai, emphasizes that Microsoft’s achievement challenges the notion that AI progress is solely reliant on increasing model size. He states, “It’s a testament to the fact that there’s more to AI than just increasing the size of the model.”

Prompt engineering plays a crucial role in getting useful results from language models of all sizes. By pairing example queries with correct responses, practitioners can steer models toward more accurate answers. However, as more data is ingested, the risk of flawed and inaccurate outputs grows. This underscores the importance of well-structured, well-reasoned data in ensuring language models produce factual answers. Botev suggests that domain-specific, structured knowledge and knowledge graphs can guide models toward factually accurate outputs.
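As a rough illustration of the query-and-response pairing described above, the sketch below assembles a few-shot prompt from example question/answer pairs. The function name, examples, and "Q:/A:" format are illustrative assumptions, not part of any Microsoft or Phi-2 API.

```python
# Minimal sketch of few-shot prompt assembly: prior question/answer
# pairs are concatenated so the model can pattern-match the format
# before answering the new query.
def build_few_shot_prompt(examples, new_query):
    """Join (question, answer) pairs, then append the new query."""
    parts = [f"Q: {question}\nA: {answer}" for question, answer in examples]
    parts.append(f"Q: {new_query}\nA:")
    return "\n\n".join(parts)

examples = [
    ("What is the capital of France?", "Paris"),
    ("What is 2 + 2?", "4"),
]
prompt = build_few_shot_prompt(examples, "What is the capital of Japan?")
print(prompt)
```

The resulting string would then be sent to a language model; the same idea underlies the "prompt libraries containing specialized knowledge" mentioned earlier.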

Overall, Microsoft’s Phi-2 represents a significant milestone in the development of AI models. It offers a compelling alternative to the traditional approach of increasing model size, delivering tailored results while being cost-effective. The release of Phi-2 showcases the potential of smaller models, emphasizing the importance of structured knowledge and reasoning in training AI models and providing a glimpse of the way forward in AI advancement.


Written By

Jiri Bílek

In the vast realm of AI and U.N. directives, Jiri crafts tales that bridge tech divides. With every word, he champions a world where machines serve all, harmoniously.