Locations: Plano, TX / Jersey City, NJ / Charlotte, NC / Newark, DE / Atlanta, GA / New York City, NY
Duration: Contract
Mandatory Skills: Strong on-prem Hadoop data engineering experience. Ability to create reusable frameworks. Proficiency in Python, Spark, Java, Unix shell scripting, and CI/CD.
Position Overview
The Senior Engineer is a hybrid role requiring experience in database management, clustered compute, operating system integration, cloud concepts, storage solutions, application processing, and advanced monitoring techniques. The candidate typically has experience across multiple disciplines, including Cloud, Linux, and Hadoop. Must be able to lead complex projects and manage competing priorities with a high level of technical acumen and strong communication skills.
Description
Experience with multiple large-scale enterprise Hadoop or Big Data environments (Databricks, Cloudera, HDInsight, or others), focused on operations, design, capacity planning, cluster setup, security, performance tuning, and monitoring
Experience with the full Cloudera CDH/CDP distribution to install, configure and monitor all services in the Cloudera stack
Strong understanding of core Hadoop services such as HDFS, MapReduce, Kafka, Spark and Spark Streaming, Hive, Impala, HBase, Kudu, Sqoop, and Oozie
Experience in administering and supporting RHEL Linux operating systems, databases, and hardware in an enterprise environment
Expertise in typical system administration and programming skills such as storage capacity management, debugging, and performance tuning
Proficient in shell scripting (e.g., Bash, KSH)
Experience in setup, configuration, and management of security for Hadoop clusters using Kerberos integrated with LDAP/AD at enterprise level
Experience with large enterprise-scale environments, including resource separation concepts and the physical resources those environments need to operate (storage, memory, network, and compute)
Desired Skills
Certifications in one or more disciplines of Cloudera Hadoop, Cloud (Azure, AWS, Google), or RHEL
Experience in version control systems (Git)
Experience with Spectrum Conductor or Databricks with Apache Spark
Experience with multiple programming languages (Python, Java, R)