nCompass is an API that lets you integrate low-latency versions of open-source and custom models into your AI pipeline with a single line of code.
tl;dr: If OpenAI’s unpredictable response times and rate limits are hurting your tool’s user experience, nCompass lets you tap into the world of open-source AI models while ensuring the served models meet your target budget and performance requirements.
—
Hey all, we are Diederik and Aditya, the co-founders of nCompass, a platform for simplified hosting and acceleration of open-source and custom LLMs.
LLM-based products that use closed-source model providers like OpenAI suffer from slow response times and rate limits.
Open-source models are a great alternative, but hosting a model yourself adds engineering and maintenance work that distracts you from your core business.
nCompass provides an API that lets you integrate accelerated versions of any open-source or custom model of your choice into your AI pipeline. We support OpenAI-style chat templates, work with all web frameworks, and use a time-based pricing model that keeps compute costs predictable.
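Concretely, "OpenAI-style chat templates" means a request carries a list of role-tagged messages, so prompts written for an OpenAI-compatible client keep working unchanged. A minimal sketch of that shape (the prompt content here is purely illustrative):

```python
# OpenAI-style chat history: a list of {"role": ..., "content": ...} dicts.
# The prompts below are illustrative only.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this support ticket for me."},
]

# Because the format is shared, the same list can be sent to any
# OpenAI-compatible endpoint regardless of which backend serves it.
roles = [m["role"] for m in messages]
```

Any backend that accepts this message format can be swapped in without rewriting prompt-construction code.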
We serve models to users with a simple 3-step process:

1. Choose any open-source or custom model and tell us your target budget and performance requirements.
2. We set up a deployment that meets those requirements and provide you with a single API key.
3. Use that key to integrate the model into your pipeline with a single line of code.
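To make the integration step concrete, here is a hedged sketch of what calling such a deployment could look like. The endpoint URL, model name, and key value below are placeholders I've invented for illustration, not nCompass's actual API; the point is that an OpenAI-style chat request only needs a base URL and a bearer key:

```python
import json
import urllib.request

NCOMPASS_API_KEY = "your-api-key"          # provided after deployment setup
BASE_URL = "https://api.example.com/v1"    # placeholder, not the real endpoint

def build_chat_request(model: str, user_prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completion request for a served model."""
    payload = {
        "model": model,  # e.g. any Hugging Face model your deployment serves
        "messages": [{"role": "user", "content": user_prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {NCOMPASS_API_KEY}",
            "Content-Type": "application/json",
        },
    )

req = build_chat_request("mistralai/Mistral-7B-Instruct-v0.2", "Hello!")
# urllib.request.urlopen(req)  # would send the request to a live deployment
```

Since the request shape matches OpenAI's, existing client code typically only needs its base URL and key swapped.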
We support any model currently hosted on Hugging Face.
https://www.youtube.com/watch?v=sdHVji8QGOg
Also, check out our GitHub repository for code examples.
From meeting in undergrad (9 years ago) through our PhDs at Imperial College London, we’ve worked on every project together. Our PhDs focused on hardware acceleration of large-scale machine learning models, covering every level of the stack from algorithms and compilers down to digital hardware design.
Our emails are aditya.rajagopal@ncompass.tech and diederik.vink@ncompass.tech