Revolutionizing AI with Efficient and Cost-effective Language Model Training

Exabits and MyShell have made waves in machine learning by sharply reducing the cost of training large language models (LLMs). Through their partnership, they trained a competitive LLM for under $100,000, a small fraction of the budgets usually associated with models of this class. The result is notable because the model outperforms LLaMA2-7B, which was built by Meta AI with multi-billion-dollar compute resources behind it.

The JetMoE-8B model, trained for less than $0.1 million, is the concrete result of that partnership. It has 8 billion parameters arranged in 24 blocks, each containing two MoE layers. It uses the Sparse Mixture of Experts (SMoE) framework, activating only 2 of the 8 experts for each input token, so only a fraction of the model’s parameters does work for any given token and the compute per token stays well below that of a dense model of the same size.
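The "2 out of 8 experts per token" idea can be illustrated with a minimal sketch of top-k routing. This is not the JetMoE-8B implementation: the softmax-over-selected-logits gating, the scalar "token", and the toy experts are all assumptions chosen to keep the example small, and the function names are hypothetical.

```python
import math
import random

NUM_EXPERTS = 8  # experts per MoE layer, as stated in the article
TOP_K = 2        # experts activated per token, as stated in the article


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k experts for one token and renormalize their weights.

    Returns (expert_index, weight) pairs whose weights sum to 1.
    The gating rule here is an assumed, commonly used variant.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_forward(token, router_logits, experts, top_k=TOP_K):
    """Combine outputs of the selected experts only; the others never run."""
    output = 0.0
    for idx, weight in route_token(router_logits, top_k):
        output += weight * experts[idx](token)
    return output


# Hypothetical toy experts: expert i multiplies a scalar token by i + 1.
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]

rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
print(route_token(logits))                # two (expert, weight) pairs
print(moe_forward(1.0, logits, experts))  # weighted mix of two experts
```

Because only two of the eight expert functions are called per token, the work per token is roughly a quarter of what running every expert would cost, which is the efficiency argument the paragraph above makes.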

The efficiency of the JetMoE-8B model shows in its results: it achieves state-of-the-art performance in five categories across eight evaluation benchmarks, surpassing competitors such as LLaMA-13B, LLaMA2-7B, and DeepseekMoE-16B. Even larger models, such as the 13-billion-parameter versions of LLaMA2 and Vicuna, are outperformed by JetMoE-8B on the MT-Bench benchmark.

The model was trained on a cluster of 12 H100 GPU nodes (96 GPUs) that Exabits contributed, accelerated, and stabilized. Exabits' cloud compute infrastructure is designed to deliver stable, highly available, and robust performance at a fraction of the cost of traditional “big compute” methods. This combination of an efficient model design and cutting-edge GPU hardware exemplifies a leap in machine learning capabilities.
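The cluster and budget figures quoted in this article imply some simple per-GPU numbers, sketched below. The node count, GPU count, and $100,000 ceiling come from the article; the training duration is not stated here, so `training_hours` is a hypothetical parameter and the 100-hour value is only an illustration.

```python
NODES = 12                # H100 nodes in the cluster (from the article)
TOTAL_GPUS = 96           # GPUs in the cluster (from the article)
BUDGET_CAP_USD = 100_000  # upper bound on training cost (from the article)

# GPUs hosted on each node.
gpus_per_node = TOTAL_GPUS // NODES

# Largest share of the budget attributable to a single GPU.
budget_per_gpu = BUDGET_CAP_USD / TOTAL_GPUS


def max_hourly_rate(training_hours):
    """Highest per-GPU hourly price that keeps the run under the budget cap.

    training_hours is an assumption supplied by the caller; the article
    does not say how long the run took.
    """
    return BUDGET_CAP_USD / (TOTAL_GPUS * training_hours)


print(gpus_per_node)
print(round(budget_per_gpu, 2))
print(round(max_hourly_rate(100), 2))  # hypothetical 100-hour run
```

The longer the run, the lower the hourly GPU price has to be to stay under the same ceiling, which is why both model efficiency and compute pricing matter to the final figure.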

Exabits has also challenged the skepticism about whether decentralized GPU platforms are suitable for LLM training. By aggregating, accelerating, and stabilizing consumer-grade GPUs, Exabits has built a decentralized cloud compute platform that it says matches enterprise-grade GPU performance. This approach taps into a vast supply of consumer GPUs, easing the GPU shortage. In addition, Exabits' experience in the data center sector gives it access to enterprise-grade GPUs, further democratizing AI development.

The future of LLM training with Exabits is promising. Their platform embodies affordability, accessibility, and environmental consciousness, making it a beacon for the future of AI research and development. JetMoE-8B’s success paves the way for more sustainable and inclusive advancements in AI.

In conclusion, Exabits and MyShell are challenging traditional “big compute” methods. By training a large language model efficiently and at low cost, they open up new avenues for AI research and application. Their collaboration sets a new standard for the economics of compute and points to further innovation at the intersection of web3 and artificial intelligence.


Written By

Jiri Bílek

In the vast realm of AI and U.N. directives, Jiri crafts tales that bridge tech divides. With every word, he champions a world where machines serve all, harmoniously.