Responsibilities:
· Data Creation and Evaluation: Develop and assess synthetic data for Generative AI projects.
· Data Labeling: Annotate and label large volumes of textual data to create high-quality training datasets for Generative AI models.
· Data Preprocessing: Clean and preprocess raw text data, ensuring it is appropriately formatted for training and evaluation purposes.
· Language Modeling: Develop and refine language models to improve the fluency, coherence, and naturalness of generated text.
· Semantic Understanding: Apply computational linguistics techniques to enhance the understanding of semantics, sentiment, and context within text.
· Grammar and Syntax: Analyze and improve the grammar, syntax, and sentence structure of generated text to ensure it adheres to linguistic rules and conventions.
· Evaluation Metrics: Define and implement evaluation metrics to measure the quality and performance of Generative AI models.
· Collaboration: Collaborate closely with cross-functional teams, including data scientists, researchers, and software engineers, to refine and iterate on the AI models.
· Continuous Learning: Stay up to date with the latest advancements in Generative AI and computational linguistics, applying new techniques and methodologies to improve model performance.
Not sure if you meet every qualification? We still encourage you to apply! We value inclusivity, welcoming candidates from diverse backgrounds, including non-traditional paths. Unique experiences enrich our team, and the willingness to dream big makes you an exceptional candidate!
Qualifications:
· PhD in Linguistics preferred.
· Experience teaching at university level.
· Linguistic Expertise: Strong understanding of linguistic principles, grammar, syntax, and semantics, with the ability to apply this knowledge to improve language generation.
· Programming: Knowledge of core programming principles, Python programming skills, and familiarity with popular NLP libraries and frameworks.
· Data Annotation: Proficient in data annotation and labeling techniques, with experience working on large-scale text datasets.
· Knowledge of more than one language and experience in localization.
· Analytical Thinking: Strong analytical and problem-solving skills, with the ability to analyze and interpret data to drive improvements in AI models.
· Communication: Excellent written and verbal communication skills, with the ability to present complex technical concepts to both technical and non-technical stakeholders.
· Team Player: Demonstrated ability to work effectively in a collaborative team environment, fostering a culture of knowledge sharing and continuous learning.
Basic Qualifications:
· Education: Bachelor's or Master's degree in Linguistics, Computational Linguistics, Computer Science, Natural Language Processing, or a related field.
· 6+ years of relevant industry experience in data labeling, computational linguistics, natural language processing, or similar roles.
Compensation is based on the geographic location in which the role is located and is subject to change based on work location.
For positions in this location, we offer a base pay of $98,500 - $162,500, plus equity (when applicable), variable/incentive compensation and benefits. Sales positions generally offer a competitive On Target Earnings (OTE) incentive compensation structure. Please note that the base pay shown is a guideline, and individual total compensation will vary based on factors such as qualifications, skill level, competencies, and work location. We also offer health plans, including flexible spending accounts, a 401(k) Plan with company match, ESPP, matching donations, a flexible time away plan and family leave programs.