The GenAI model market moves extremely fast, and it's hard to keep up with the latest releases. In this Monthly GenAI Models Roundup, we highlight the top 5 open-source GenAI models that you should have on your radar.
Mistral 7B
Who built it? Mistral AI
How large is it? 7B
Why is it interesting? Trained from scratch by Mistral AI (it is not a Llama 2 fine-tune), with sliding-window and grouped-query attention. It punches above its weight in HELM benchmarks and real-world usage, beating GPT-3.5 roughly 8 times out of 10 on text-generation use cases.
Where can you use it? As a general-purpose replacement for Llama-2 or GPT-3.5; a minimal loading sketch follows the paper link below.
[2310.06825] Mistral 7B
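The quickest way to try it as a drop-in replacement is through the Hugging Face transformers library. A minimal sketch, assuming the public mistralai/Mistral-7B-Instruct-v0.1 checkpoint and a GPU large enough to hold the 7B weights:

```python
# Minimal sketch: chat with Mistral-7B-Instruct via transformers.
# The checkpoint name and generation settings are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build a chat-style prompt with the tokenizer's built-in chat template.
messages = [{"role": "user", "content": "Summarize the Mistral 7B paper in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```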
Zephyr
Who built it? Hugging Face H4 (helpful, honest, harmless, and huggy)
How large is it? 7B
Why is it interesting? It’s a fine-tuned version of Mistral-7B trained on a mix of publicly available and synthetic datasets using Direct Preference Optimization (DPO).
Where can you use it? On writing, humanities, STEM, and roleplay topics it is competitive with gpt-3.5, gpt-4, claude, and llama-2-70b-chat, but it falls well short on complex tasks like mathematics, reasoning, and coding. A minimal chat sketch follows the paper link below.
[2310.16944] Zephyr: Direct Distillation of LM Alignment
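For the chat-style topics above, Zephyr can be served through a standard transformers text-generation pipeline. A minimal sketch, assuming the public HuggingFaceH4/zephyr-7b-beta checkpoint; the sampling parameters are illustrative:

```python
# Minimal sketch: chat with Zephyr via a transformers pipeline.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-7b-beta",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a friendly writing assistant."},
    {"role": "user", "content": "Draft a short blog intro about open-source LLMs."},
]
# Render the conversation with the model's chat template, then generate.
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
out = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.95)
print(out[0]["generated_text"])
```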
Phi 1.5
Who built it? Microsoft
How large is it? 1.3B (despite the name)
Why is it interesting? Trained on synthetic, textbook-style datasets curated with the help of larger LLMs, it significantly outperforms other models in the same parameter range and is competitive with models of up to 13B parameters.
Where can you use it? When efficiency matters most: its small parameter count makes it cheap to run, and it can also serve as the student model in knowledge distillation. A minimal generation sketch follows the paper link below.
[2309.05463] Textbooks Are All You Need II: phi-1.5 technical report
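Because the model is so small, it runs comfortably on a single consumer GPU or even on CPU. A minimal generation sketch, assuming the public microsoft/phi-1_5 checkpoint (older transformers versions may additionally require trust_remote_code=True):

```python
# Minimal sketch: lightweight code/text generation with phi-1.5.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-1_5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # ~1.3B params, fits on modest hardware

prompt = "Write a Python function that checks whether a number is prime."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```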
Orca-2
Who built it? Microsoft
How large is it? Orca 2 is a fine-tuned version of Llama 2, released at 7B and 13B
Why is it interesting? It shows that synthetic data created by models capable of complex workflows (advanced prompting, multiple calls) can teach small language models (SLMs, on the order of 10B parameters) new capabilities, in this case reasoning.
Where can you use it? Fine-tuning and distillation; a minimal prompting sketch follows the paper link below.
[2311.11045] Orca 2: Teaching Small Language Models How to Reason
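Before investing in fine-tuning or distillation, you can probe its step-by-step reasoning directly. A minimal sketch, assuming the public microsoft/Orca-2-7b checkpoint and the ChatML-style prompt format described on its model card (both are assumptions, not part of this post's results):

```python
# Minimal sketch: reasoning-style prompting with Orca-2-7b.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Orca-2-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# ChatML-style prompt as suggested by the model card.
system = "You are a cautious assistant that reasons step by step."
user = "A train travels 60 km in 45 minutes. What is its average speed in km/h?"
prompt = (
    f"<|im_start|>system\n{system}<|im_end|>\n"
    f"<|im_start|>user\n{user}<|im_end|>\n"
    f"<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```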
Qwen
Who built it? Alibaba Cloud
How large is it? 7B and 14B
Why is it interesting? It is the first time a Chinese model has climbed to the top of the leaderboards. It is on par with gpt-3.5-16k for long-text understanding and Python coding, with a strong focus on both Chinese and English.
Where can you use it? It works especially well for tooling and agents, where it even edges out gpt-4, and for use cases that only occur in Chinese. A minimal bilingual chat sketch follows the paper link below.
[2309.16609] Qwen Technical Report
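The chat variants ship a convenience chat() helper as part of their remote code on the Hugging Face Hub. A minimal bilingual sketch, assuming the public Qwen/Qwen-7B-Chat checkpoint (trust_remote_code=True is needed precisely because chat() lives in the model's own code):

```python
# Minimal sketch: bilingual chat with Qwen-7B-Chat via its remote-code chat() helper.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen-7B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", trust_remote_code=True
).eval()

# The same model handles Chinese and English turns in one conversation.
response, history = model.chat(tokenizer, "用三句话介绍一下通义千问。", history=None)
print(response)
response, history = model.chat(tokenizer, "Now summarize that in English.", history=history)
print(response)
```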
In proprietary-model land, we also want to give kudos to:
Claude 2.1 - 200K context window
GPT-4 Turbo - 128K context window
Learn more
For an in-depth look at our capabilities, you can read our full launch blog post and check out the platform here.