We have 30 years of expertise in designing and building custom software systems. We provide software development services focusing on complex high-load applications, AI and BI solutions, and mobile apps.
About the role:
You will be working on a large-scale database project that serves as a central data source for multiple computational chemistry and machine learning teams. The database is built on PostgreSQL and is divided into two main components: reaction data and building block data.
Your primary focus will be on one of the large commercial reaction datasets. The initial task is to modernize the existing ETL pipelines: updating them to process the latest data dumps and automating the update process. The team currently uses Apache Airflow for ETL orchestration.
In the next phase, the scientific team plans to develop additional ETL pipelines using data from the reaction dataset. You will be responsible for gathering requirements from the team, designing new database schemas, and implementing the necessary ETL processes.
Required skills
Preferred qualifications
Our offer as your future employer
Software Country (ТОО Балхаш Системс)
Tbilisi
from 1000 USD