Offer summary
Qualifications:
- PhD in CS, EE or equivalent experience
- 5+ years of experience in deep learning
- Strong background in neural networks, specifically inference
- Understanding of computer architecture and GPU fundamentals
- Programming skills in C++ and Python
Key responsibilities:
- Deliver hyper-optimized recipes for LLM inference
- Analyze and debug performance and accuracy of models
- Benchmark and perform competitive analysis for NVIDIA SW/HW
- Develop software and processes to streamline model delivery
- Collaborate with SW/HW co-design teams on AI services