About the Role:
We have an existing commercial SaaS platform that consists of three components: a web application, several third-party databases integrated into our backend, and a supervised natural language processing (NLP) model based on a custom taxonomy.
We are looking to build version 2.0 of our platform, with a brand-new front end based on new algorithms, a supervised deep learning model, and sophisticated data science models that combine data from various sources (e.g., patent, financial, and market data). It's a challenge and a fun opportunity for someone looking to build the next big platform that the world is going to use.
Our Python Developer will build scalable systems and software for commercial and government use, leveraging modern architecture and best practices to enable significant processing at runtime. You will provide technical expertise and support in design, development, implementation, and testing. You will also participate in and/or direct major project deliverables through all aspects of the software development lifecycle, including scope and work estimation, architecture and design, back-end coding, and unit testing. In addition, you will work with our Data Science team to help operationalize the ML models.
Your solutions should be designed for scalability and make optimized use of distributed computing frameworks such as Spark. You should also have strong familiarity and experience with the AWS ecosystem, bringing in relevant AWS tools, services, and resources to enable substantial processing of very large datasets before runtime, entity resolution between very large datasets, and real-time processing in a scalable, distributed computing environment.
Our Current Stack:
AWS hosts our infrastructure, including CI/CD (S3, CodeCommit, CodeBuild, CodeDeploy, EC2, EMR, etc.). We also use Spring Boot, Angular, Python, PySpark, Spark, Kubernetes, Docker, Elasticsearch, Redshift, spaCy, scikit-learn, NLTK, pandas, SQLAlchemy, openpyxl, Streamlit, Watchdog, seaborn, matplotlib, and additional ML and Python libraries.
This stack is subject to change as we build. We want to modernize and streamline our code, deployment, front-end, and distributed processing capabilities.
Primary Responsibilities:
Lead the development efforts.
Participate in software programming initiatives to support innovation and enhancement, using Python and PySpark.
Leverage the AWS ecosystem to bring in relevant tools, services, and resources to enable distributed computing and scalable processing of very large datasets.
Solve problems and think creatively about the big picture for our customers: proactively anticipate problems, stay customer-centric in development and design, and remain open-minded to different solutions for achieving a development milestone.
Clearly document processes, methodologies, and tools used.
Experience Required:
B.S. in a relevant technical field
Significant experience (at least 3-5 years) with Python, required
Significant experience (at least 3-5 years) with distributed computing and corresponding tools in the Spark ecosystem, such as PySpark, required
Significant experience (at least 3-5 years) with the AWS ecosystem, including tools, services, and resources that enable scalable, distributed processing, required
Significant experience writing complex SQL queries and analyzing data correlations, required
Experience using software testing and performance tools, such as JUnit
Experience with Git, AWS CodeCommit, or similar source code repositories
Knowledge of ML fundamentals and familiarity with popular ML libraries
Ability to work independently and collaborate with other team members
Project management skills, including the ability to scope the timeline, methodology, and deliverables for development, testing, and integration into the platform
Excellent communication skills (written and verbal)
Well-versed in version control systems
Logistics: Geography, Work Status, Etc.
The position is remote. The candidate must have the legal right to work in the United States.
This is a full-time position, and we are looking for someone who can join us immediately.
Interview Process:
We will conduct 3 rounds of interviews.
First Round: Culture, fit, and background interview with the Founders
Second Round: Technical Interview
Third Round: In-Person Day with Founders
How to Apply:
Please provide the following:
Resume
Cover Letter
Any links to Git repositories or projects that we can review