Amazon.com Data Engineer - Big Data Technologies in Seattle, Washington
Amazon’s eCommerce Platform (eCP) organization is responsible for the core components that drive the Amazon website and customer experience. Serving millions of customer page views and orders per day, eCP builds for scale.
As an organization within eCP, the Big Data Technologies (BDT) group is no exception. We collect petabytes of data from thousands of data sources inside and outside Amazon, including the Amazon catalog system, inventory system, customer order system, page views on the website and Alexa systems. We also support Amazon subsidiaries such as IMDb and Audible. We provide interfaces for our internal customers to access and query the data hundreds of thousands of times per day, using Amazon Web Services (AWS) Redshift, Hive, Spark and Oracle. We build scalable solutions that grow with the Amazon business.
BDT is growing, and the data processing landscape is shifting. Our data is consumed by thousands of teams across Amazon, including Research Scientists, Machine Learning Specialists, Business Analysts and Data Engineers. Amazon.com is seeking an outstanding Data Engineer to join the Big Data Technologies Business Intelligence team. The Business Intelligence team delivers business intelligence to over 1000 internal customers and a diverse community of external customers. Amazon.com has a culture of data-driven decision-making, and demands business intelligence that is timely, accurate, and actionable. If you join the Amazon.com Business Intelligence team, your work will have an immediate influence on day-to-day decision-making at Amazon.com.
As an Amazon.com Data Engineer I, you will be working in one of the world's largest and most complex data warehouse environments. You should be skilled in the architecture of DW solutions for the Enterprise using multiple platforms (RDBMS, Columnar, Cloud). You should have extensive experience in the design, creation, management, and business use of extremely large datasets. You should have excellent business and communication skills to be able to work with business owners to develop and define key business questions, and to build datasets that answer those questions. Above all, you should be passionate about working with huge datasets and love bringing datasets together to answer business questions and drive change.
As a Data Engineer I on Amazon.com’s Big Data Technologies team, you will develop new data engineering patterns that leverage a new cloud architecture, and will extend or migrate our existing data pipelines to this architecture as needed. You will also assist with integrating the Redshift platform as our primary processing platform to create the curated Amazon.com data model for the enterprise to leverage. You will be part of a team building the next-generation data warehouse platform and driving the adoption of new technologies and new practices in existing implementations. You will be responsible for designing and implementing complex ETL pipelines in the data warehouse platform and other BI solutions to support the rapidly growing and dynamic business demand for data, and for delivering that data as a service that will have an immediate influence on day-to-day decision-making at Amazon.com.
· Interfacing with business customers, gathering requirements and developing new datasets in data warehouse
· Building and migrating complex ETL pipelines from the Oracle system to Redshift and Elastic MapReduce so that the system can grow elastically
· Optimizing the performance of business-critical queries and dealing with ETL job related issues
· Tuning application and query performance using Unix profiling tools and SQL
· Identifying data quality issues across Oracle and Redshift and addressing them promptly to provide a great user experience
· Extracting and combining data from various heterogeneous data sources
· Designing, implementing and supporting a platform that can provide ad-hoc access to large datasets
· Modelling data and metadata to support ad-hoc and pre-built reporting
· Working with customers to fulfill their data requirements using DW tables, and maintaining metadata for all DW tables
· A desire to work in a collaborative, intellectually curious environment.
· Degree in Computer Science, Engineering, Mathematics, or a related field and 2+ years of industry experience
· Demonstrated ability in SQL, data modeling, ETL development, and data warehousing.
· Industry experience as a Data Engineer or related specialty (e.g., Software Engineer, Business Intelligence Engineer, Data Scientist) with a track record of manipulating, processing, and extracting value from large datasets.
· Coding proficiency in at least one modern programming language (Python, Ruby, Java, etc.)
· Experience building/operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets
· Experience building data products incrementally and integrating and managing datasets from multiple sources
· Query performance tuning skills using Unix profiling tools and SQL
· Experience leading large-scale data warehousing and analytics projects, including using AWS technologies (Redshift, S3, EC2, Data Pipeline) and other big data technologies
· Experience providing technical leadership and mentoring other engineers on best practices in the data engineering space
· Experience with Linux/UNIX, including using it to process large data sets
· Experience with AWS
· Data Warehousing Experience with Oracle, Redshift, Teradata, etc.
· Experience with Big Data Technologies (Hadoop, Hive, HBase, Pig, Spark, etc.)
AMZR Req ID: 536294
External Company URL: www.amazon.com