We're seeking a Data Engineer for an FTE opportunity located in New York City. This position will be remote until about March 2021 and then will be onsite. Local candidates should apply. Must have healthcare industry experience, along with HL7 experience and extensive knowledge of EHR/EMR language. Some programming experience and experience handling big data are also required.
The Data Engineer provides the foundation for analytics, namely the collection, quality assurance, and availability of data. This IT role requires a significant set of technical skills, including a deep knowledge of SQL database design and multiple programming languages, as well as the communication skills to understand what data and analysis the healthcare analysts want to gain from data stores. The role is responsible for the maintenance, improvement, cleansing, and manipulation of data in the organization’s operational and analytics databases. The Data Engineer defines and builds data pipelines to orchestrate the movement, transformation, validation, and loading of data from the source to the final destination – data stores defined and implemented based on system requirements and data consumer requirements.
Typical tasks include working with the data architect on data warehouse design so that data loads quickly, is easily accessible, and is readily understood by analysts and users; extract, transform, and load (ETL) processing; thorough testing and validation of data transformations so that data is correct and reliable for downstream consumption; data bounds checking; and database tuning.
The Data Engineer strives to ensure proper data governance and quality for the Data Strategy team and the entire organization. They analyze complex data elements and systems, data flow, dependencies, and relationships in order to contribute to conceptual, physical, and logical data models. They work collaboratively with the Data Strategy team, providing support for their data-centric needs. The candidate has strong working and conceptual knowledge of building and maintaining physical and logical data models and will also have system management expertise with server and database tuning, query, multidimensional query and index tuning, monitoring, disaster recovery, backup, automated testing, automated schema migration, and continuous deployment.
ESSENTIAL DUTIES AND RESPONSIBILITIES:
- Responsible for expanding and optimizing data and data pipeline architecture, as well as optimizing data flow and collection
- Supports software developers, database architect, and data analysts on data initiatives and ensures optimal data delivery architecture is consistent throughout ongoing projects
- Must be self-directed and comfortable supporting the data needs of multiple teams, systems and products
- Responsible for optimizing or re-designing data architecture to support the next generation of products and data initiatives
- Develops and maintains scalable data pipelines and builds out new API integrations to support continuing increases in data volume and complexity
- Collaborates with analytics and business teams to improve data models that feed business intelligence tools, increasing data accessibility
- Implements processes and systems to monitor data quality, ensuring production data is always accurate and available for key stakeholders and business processes
- Writes unit/integration tests, contributes to Confluence, and documents work
- Performs data analysis required to troubleshoot data-related issues and assists in their resolution
- Works closely with a team of frontend and backend engineers, product managers, and analysts
- Contributes to definition of company data assets (data models), and processes to populate data models
- Designs data integrations and data quality framework
- Designs and evaluates open-source and vendor tools for data lineage
- Works closely with all business units and engineering teams to develop strategy for long-term data platform architecture
- Contributes to development of a rapidly growing, integrated data warehouse that will provide a complete vision of the entire healthcare landscape
This job has no supervisory responsibilities.