Stability AI, the company behind the Stable Diffusion image generator, has introduced two new language models, FreeWilly1 and FreeWilly2. The company was founded by former UK hedge fund manager Emad Mostaque, who has been accused of exaggerating his resume.
The two new LLMs are based on Meta’s LLaMA and LLaMA 2 open-source models, respectively, but were fine-tuned on an entirely new, smaller dataset that includes synthetic data. According to Stability AI, both models excel at intricate reasoning, linguistic subtleties, and answering complex questions in specialized domains such as law and mathematics.
CarperAI, Stability AI’s subsidiary, released the FreeWilly models under a non-commercial license, so they cannot be used for moneymaking, enterprise, or business purposes. Instead, they are intended to advance research and promote open access in the AI community.
Smaller, Environmentally Friendly Models:
The FreeWilly models, playfully named after the “Orca” AI training methodology, use a smaller dataset of just 600,000 data points, roughly 10% of the size of the original Orca dataset. The models were trained on instructions drawn from four datasets created by Enrico Shippole, which reduced costs and the carbon footprint compared with their larger counterparts. They still performed strongly, matching and in some cases exceeding ChatGPT running GPT-3.5.
Promising Training on Synthetic Data:
Addressing concerns about model collapse caused by AI-generated content, Stability AI trained the FreeWilly models on synthetic data generated by two other LLMs: one produced 500,000 examples and the other 100,000. The models performed well, hinting that training on synthetic data need not lead to model collapse and could help avoid the use of copyrighted or proprietary data. Researchers and developers can explore the models’ potential in various domains, enhancing natural language understanding and tackling complex tasks.
Accessible Weights and Potential Impact:
FreeWilly2’s weights are released directly, while FreeWilly1’s weights are available only as deltas over the original LLaMA model, so users need the base weights to reconstruct it. These models set new standards for open-access LLMs, presenting exciting possibilities for AI enthusiasts and inspiring new applications.
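For readers unfamiliar with delta releases, the idea can be sketched in a few lines. The example below is only a toy illustration, assuming that a delta is the per-parameter difference between the fine-tuned and base weights. The function name `apply_delta`, the parameter names, and the numbers are all hypothetical, and this is not Stability AI’s actual conversion script.

```python
# Toy illustration of delta weights (hypothetical names and values).
# Assumption: delta = fine_tuned - base, so fine_tuned = base + delta.
# Plain Python lists stand in for real weight tensors.

def apply_delta(base, delta):
    """Return base + delta, parameter by parameter."""
    if base.keys() != delta.keys():
        raise ValueError("base and delta must contain the same parameter names")
    merged = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise addition recovers the fine-tuned values.
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


# Made-up numbers standing in for the base model and the published delta.
base_weights = {"layer0.weight": [1.0, 2.0, 3.0], "layer0.bias": [0.5]}
delta_weights = {"layer0.weight": [0.25, -0.5, 0.0], "layer0.bias": [0.25]}

fine_tuned = apply_delta(base_weights, delta_weights)
print(fine_tuned)
```

Publishing only the difference means the release is of little use to anyone who has not already obtained the base weights under their original license.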
Remaining Optimistic Amid Controversy:
Despite the controversy surrounding Stability AI founder Emad Mostaque’s resume, the AI community remains optimistic about the potential of the FreeWilly models to advance the field of artificial intelligence.
The FreeWilly1 and FreeWilly2 language models open doors for collaboration and innovation in the AI community. Stability AI is confident that these models will shape the future of natural language processing, paving the way for new horizons and opportunities for researchers and developers.