**Job Title:** Lead AI Inference Engineer
**Company:** Tether.io
**Location:** Georgia
**Job Description:**
You’ll lead a cross-functional pod that spans the full stack, from C++ inference engines to JavaScript applications. Your responsibility is to ensure that local AI capabilities ship reliably and perform well across devices. You’ll balance hands-on technical work with team coordination, guiding foundation and middleware engineers toward shared goals.
This role is ideal for someone who understands both the low-level challenges of edge AI and the product-facing needs of app developers, and wants to drive the delivery of cohesive, production-ready local AI systems.
**Responsibilities:**
* Deploy machine learning models to edge devices using llama.cpp, ggml, and ONNX
* Collaborate closely with researchers to assist in coding, training and transitioning models from research to production environments
* Integrate AI features into existing products, enriching them with the latest advancements in machine learning
* Manage a cross-functional team (pod) of middleware (JS), foundation (C++), QA, and documentation engineers to produce high-quality deliverables
* Regularly assess, both qualitatively and quantitatively, our position in the market relative to similar products and platforms
* Leverage the expertise of technical architects to ensure robust architectural choices and code quality
* Ensure stable releases by following precise internal release processes
**Qualifications:**
* Excellent programming skills in C++
* Strong experience with the llama.cpp and ggml inference engines, which facilitate the deployment of models to specific GPU architectures
* Good understanding of deep learning concepts and model architectures
* Experience with transformers and LLMs
* Demonstrated ability to rapidly assimilate new technologies and techniques
* Experience managing a small, specialized, cross-functional team (pod) of 3-5 people
* A genuine passion for building good products that improve people’s lives
* A degree in Computer Science, AI, Machine Learning, or a related field, complemented by a solid track record in AI R&D
**Bonus points if:**
* You have extensive experience with JavaScript/TypeScript
* You have experience with AWS, containerization platforms, orchestration, and automated testing suites (Maestro, Appium)
* You understand the difficulties, nuances, and importance of peer-to-peer (P2P) technology
* You have worked with MLC, TVM or similar frameworks
* You have experience with Vulkan or CUDA
* You have taken models to production
