Introducing the github-top-code dataset: A curated dataset of 1.3M+ source code files from GitHub's top-ranked developers.
I collected the best source code files from GitHub's highest-trending developers of all time and compiled them into a dataset for training LLMs to write well-structured, production-grade code.
AgentCPM-Explore 🔥 an on-device agent foundation model released by OpenBMB: openbmb/AgentCPM-Explore
✨ 4B, Apache 2.0
✨ Supports 100+ multi-turn environment interactions with search + verification
✨ Full training/inference stack is openly shared as well
installama.sh at the TigerBeetle 1000x World Tour!
Last week I had the chance to give a short talk during the TigerBeetle 1000x World Tour (organized by @jedisct1 👏), a fantastic event celebrating high-performance engineering and the people who love pushing systems to their limits!
In the talk, I focused on the CPU and Linux side of things, with a simple goal in mind: making the installation of llama.cpp instant, automatic, and optimal, no matter your OS or hardware setup.