With AIME API you can deploy deep learning models (PyTorch, TensorFlow) through a job queue as scalable API endpoints capable of serving millions of model inference requests.
Turn a console Python script into a secure and robust web API that acts as your interface to the mobile, browser and desktop world.
The AIME API server solution implements a distributed server architecture with a central API Server communicating through a job queue with a scalable GPU compute cluster. The GPU compute cluster can be heterogeneous and distributed at different locations without requiring an interconnect.
The central part is the API Server, an efficient asynchronous HTTP/HTTPS web server which can be used as a stand-alone web server or integrated into Apache, NGINX or similar web servers. It takes the client requests, load balances them and distributes them to the API compute workers.
The model compute jobs are processed by so-called compute workers, which connect to the API server through a secure HTTPS interface.
You can easily turn your existing PyTorch and TensorFlow scripts into an API compute worker by integrating the AIME API Worker Interface.
Clients such as web browsers, smartphones and desktop apps can easily integrate model inference API calls with the AIME API Client Interfaces.
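The request flow described above (clients, a central server, a job queue, and compute workers) can be illustrated with a small, purely conceptual sketch. It is not the AIME API implementation: the names `compute_worker`, `handle_request` and `main` are hypothetical, the workers run in the same process instead of connecting over HTTPS, and the model inference is replaced by a short sleep.

```python
import asyncio


async def compute_worker(name, job_queue):
    """Hypothetical worker: pulls jobs from the shared queue until cancelled."""
    while True:
        params, future = await job_queue.get()
        await asyncio.sleep(0.01)  # stand-in for model inference on a GPU
        future.set_result({"text": f"{name} processed: {params['text']}"})
        job_queue.task_done()


async def handle_request(job_queue, params):
    """Hypothetical server side: enqueue a job and wait for its result."""
    future = asyncio.get_running_loop().create_future()
    await job_queue.put((params, future))
    return await future


async def main():
    job_queue = asyncio.Queue()
    # Two workers share one queue; whichever is free takes the next job.
    workers = [
        asyncio.create_task(compute_worker(f"worker-{i}", job_queue))
        for i in range(2)
    ]
    requests = [handle_request(job_queue, {"text": f"prompt {i}"}) for i in range(4)]
    results = await asyncio.gather(*requests)  # results keep request order
    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


if __name__ == "__main__":
    for result in asyncio.run(main()):
        print(result["text"])
```

Because the server only hands jobs to the queue and workers pull from it, adding capacity in this sketch means starting more worker tasks; no request-handling code has to change.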
To illustrate the usage and capabilities of AIME API, we currently run the following GenAI (generative AI) demo API services:
Chat with 'Steve', our Llama 3.3 70B based instruct chat-bot.
Chat with 'Chloe', our Mixtral 8x7B or 8x22B based instruct chat-bot.
Create photorealistic images with Black Forest Labs FLUX.1-Dev.
Create photorealistic images with Stable Diffusion 3.
Translate between 36 languages in near real time: Text-to-Text, Speech-to-Text, Text-to-Speech and Speech-to-Speech!
Create photorealistic images from text prompts.
Chat with 'Steve', the Llama3 based chat-bot.
Chat with 'Dave', the Llama2 based chat-bot.
Tortoise TTS: high-quality Text-to-Speech demo
We recommend creating a virtual environment for local development. Create and activate a virtual environment, for example one named 'venv', with:
python3 -m venv venv
source ./venv/bin/activate
Download or clone the AIME API server:
git clone --recurse-submodules https://github.com/aime-team/aime-api-server.git
Alternatively, to exclude the Worker interface and Client interface submodules, which are not needed to run the API server itself, use:
git clone https://github.com/aime-team/aime-api-server.git
Then install the required pip packages:
pip install -r requirements.txt
Ubuntu/Debian:
sudo apt install ffmpeg
To start the API server run:
python3 run api_server.py [-H HOST] [-p PORT] [-c EP_CONFIG] [--dev]
The server boots and loads the example endpoint configurations defined in the "/endpoints" directory.
Once started, it is reachable at http://localhost:7777 (or the port given). By default, this README.md file is served. The example endpoints are available and accept requests.
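To confirm from a script that the server has come up, a minimal check could look like the following sketch. It uses only the Python standard library; the helper name `server_is_reachable` is hypothetical, and the default URL assumes the server runs locally on port 7777 as described above.

```python
import urllib.error
import urllib.request


def server_is_reachable(base_url="http://localhost:7777", timeout=3.0):
    """Return True if an HTTP server answers at base_url, False otherwise."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure and similar problems.
        return False


if __name__ == "__main__":
    print("reachable" if server_is_reachable() else "not reachable")
```

If the server was started with a different host or port, pass the matching URL as `base_url`.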
The server is now ready to connect corresponding compute workers.
You can easily turn your existing PyTorch and TensorFlow scripts into an API compute worker by integrating the AIME API Worker Interface.
The following example worker implementations are available as open source and can easily be adapted to similar use cases:
https://github.com/aime-labs/llama3_chat
https://github.com/aime-labs/stable_diffusion_xl
https://github.com/aime-labs/seamless_communication
Simple single-call example for an AIME API Server request on the Llama 2 endpoint with JavaScript:
<script src="/js/model_api.js"></script>
<script>
    // Called once the API server returns the result of the request
    function onResultCallback(data) {
        console.log(data.text); // print generated text to console
    }

    const params = {
        text: 'Your text prompt'
    };

    doAPIRequest('llama2_chat', params, onResultCallback, 'user_name', 'user_key');
</script>
Simple synchronous single-call example for an AIME API Server request on the Llama 2 endpoint with Python:
from aime_api_client_interface import do_api_request
params = {'text': 'Your text prompt'}
result = do_api_request('https://api.aime.info', 'llama2_chat', params, 'user_name', 'user_key')
print(result.get('text')) # print generated text to console
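When several prompts need to be sent one after another, the single call above can be wrapped in a small helper. The following is only a sketch: `generate_texts` is a hypothetical name, and it assumes that the request function accepts the same five positional arguments as `do_api_request` in the example above and returns a dict-like result with a 'text' entry. The request function is passed in as a parameter so the helper can be tried with a stand-in function without a running server.

```python
def generate_texts(request_fn, prompts, api_url, endpoint, user, key):
    """Send each prompt through request_fn and collect the generated texts.

    request_fn is assumed to be called like do_api_request:
    request_fn(api_url, endpoint, params, user, key) -> dict-like result.
    """
    texts = []
    for prompt in prompts:
        result = request_fn(api_url, endpoint, {"text": prompt}, user, key)
        # Fall back to an empty string if the result carries no 'text' field.
        texts.append((result or {}).get("text", ""))
    return texts
```

With the client interface installed, calling `generate_texts(do_api_request, ['First prompt', 'Second prompt'], 'https://api.aime.info', 'llama2_chat', 'user_name', 'user_key')` would return one generated text per prompt, in the order the prompts were given.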
We are currently working on sample interfaces for iOS, Android, Java, PHP, Ruby and C/C++.
For more information about AIME API, read our blog article about AIME API or consult the AIME API documentation.
The AIME API is free of charge for AIME customers. Details can be found in the LICENSE file. We look forward to hearing from you regarding collaboration or licensing on other devices: hello@aime.info.