Ship better on-device AI/ML faster
2024 is the year of Apple Intelligence, AI PCs, and Neural Processing Units (NPUs). Apps are leveraging this new hardware to ship on-device AI/ML across a range of use cases, from creative tooling (e.g. image editing) to video conferencing to copilots for warehouse workers.
On-device inference is important for apps/features requiring low latency, privacy/security, offline functionality, and zero server costs.
Developing effective on-device AI/ML is finicky and time-consuming. You have to convert Python models to on-device formats. Then, you have to optimize models across end-user devices (with varying capabilities), benchmark them on physical devices, and evaluate performance-quality trade-offs… in a highly iterative development cycle.
Each phase of this dev cycle involves tedious work, such as figuring out which model optimizations suit which devices. Moving between phases is cumbersome too, for example when preparing benchmark data for evaluation.
Neuralize is a single interface for applying model compression, benchmarking on our device farm, evaluating performance-quality trade-offs in our web dashboard, and deploying optimized models for different end-user devices.
Our backend identifies promising model optimizations, with automated and guided parameter sweeps, and benchmarks them across target devices. Results are automatically visualized for objective trade-off evaluation, and model outputs can be inspected/compared with subjective evaluation tools. Everything is organized in a single repository, accessible to the entire team, for better tracking and discussion.
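To make the sweep and trade-off evaluation concrete, here is a minimal sketch in plain Python. It is not Neuralize's implementation or API: the `Result` type, the `sweep_grid` and `pareto_front` helpers, and option names such as `weight_bits` are all hypothetical, chosen only for illustration. The idea is to enumerate candidate optimization settings, benchmark each one on a device (not shown), and keep only the candidates that no other candidate beats on both latency and quality.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Result:
    """One benchmarked candidate on a single device (hypothetical shape)."""
    config: dict       # optimization settings, e.g. {"weight_bits": 8}
    latency_ms: float  # lower is better
    quality: float     # higher is better, e.g. accuracy or PSNR


def sweep_grid(options):
    """Yield every combination of the given optimization options."""
    keys = list(options)
    for values in product(*(options[k] for k in keys)):
        yield dict(zip(keys, values))


def pareto_front(results):
    """Return the results that no other result dominates.

    A result is dominated if another one is at least as good on both
    latency and quality, and strictly better on at least one of them.
    """
    front = []
    for r in results:
        dominated = any(
            o.latency_ms <= r.latency_ms
            and o.quality >= r.quality
            and (o.latency_ms < r.latency_ms or o.quality > r.quality)
            for o in results
        )
        if not dominated:
            front.append(r)
    # Sort fastest-first so the trade-off curve reads left to right.
    return sorted(front, key=lambda r: r.latency_ms)
```

In this sketch you would call `sweep_grid({"weight_bits": [16, 8, 4], "prune_ratio": [0.0, 0.5]})` to get six candidate configurations, benchmark each one per target device, and then run `pareto_front` once per device, since the best trade-offs typically differ between devices.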
Neuralize automates each phase of the on-device dev cycle and streamlines the transition between phases, accelerating the entire process.
Ciaran (middle) and Ismail (right) were at Deep Render building the world’s first AI video codec optimized for cross-platform on-device inference. They were early to the pain of optimizing models across devices and benchmarking performance/quality.
Ivan (left) and Ismail had been hacking together for years, and had built various apps with on-device AI. Having worked on Marshall Wace’s internal server-side AI platform, though, Ivan really felt the shortcomings of on-device AI tooling.
We scrapped the on-device AI apps and focused on the tooling.
Contact founders@runlocal.ai if you work with on-device AI/ML in production.
Also, please share this post with anyone who works on:
Thank you! 🧡