Sr Data Scientist - GenAI

Remote: Full Remote
Experience: Mid-level (2-5 years)

Offer summary

Qualifications:

Bachelor’s / Master’s in a relevant field; 3+ years of hands-on ML and NLP experience; proven experience with Python and AI libraries.

Key responsibilities:

  • Design and develop custom Gen AI systems for batch and stream processing.
  • Maintain vectorization pipelines and databases for efficient data flow.
  • Collaborate with team members for seamless data integration.
  • Stay updated with latest GenAI technologies and best practices.
Concepta Tech Information Technology & Services TPE https://www.conceptatech.com/
11 - 50 Employees

Job description

This is a remote position.

Job Summary: We are seeking a skilled GenAI data scientist to develop and maintain GenAI systems and libraries. This role is a unique opportunity for hands-on ML scientists and NLP / Gen AI / LLM scientists to take the next step in their careers and apply their technical expertise in RAG, fine-tuning, embeddings, and LLMs to drive business value for multiple stakeholders while developing cutting-edge Gen AI systems and libraries.

Responsibilities:

  • Design and develop custom Gen AI systems for batch and stream processing AI pipelines. Components will include data ingestion, preprocessing, embedding, vectorization, retrieval, reranking, and other RAG-related components.
  • Develop, fine-tune, and implement embedding and foundation models.
  • Apply prompt engineering to ensure the GenAI solution meets all technical and business requirements.
  • Develop and maintain vectorization pipelines for data ingestion, embedding, indexing, storage and loading, ensuring efficient vector flow and retrieval processing.
  • Manage and optimize vector databases to ensure data availability and performance.
  • Implement prompt and other security guardrails to protect sensitive data and GenAI systems.
  • Monitor response quality and ensure response consistency, implementing response validation and cleansing processes.
  • Collaborate with other team members, including the Tech Lead, ML/AI Engineer, and Backend Engineer, to ensure seamless data integration and processing.
  • Troubleshoot complex issues in machine learning model development and data pipelines, and develop innovative solutions.
  • Stay up to date with the latest GenAI technologies and best practices.

Requirements:

  • Bachelor's / Master’s in Computer Science, Mathematics or Statistics, Computational Linguistics, Engineering, or a related field is considered an advantage.
  • 3+ years of professional hands-on experience leveraging large sets of structured and unstructured data to develop data-driven tactical and strategic analytics and insights using ML, NLP, and computer vision solutions.
  • 4+ years of demonstrated hands-on experience with Python, Hugging Face, TensorFlow, Keras, PyTorch, and Transformers.
  • 2+ years of hands-on experience developing natural language processing (NLP) models, ideally with transformer architectures.
  • 2+ years of experience implementing information search and retrieval at scale (RAG), with solutions ranging from keyword search to semantic search using embeddings.
  • Must have: proven experience architecting end-to-end Generative AI solutions for enterprise customers with GenAI open-source platforms, libraries and packages.
  • Nice to have: experience contributing to GitHub and open-source initiatives, involvement in research projects, or participation in Kaggle competitions.


Required profile

Experience

Level of experience: Mid-level (2-5 years)
Industry: Information Technology & Services
Spoken language(s): English