Big Data - Data Lake Developer | Milpitas, CA
Contract: 1+ year (renewed quarterly)
Interview: Phone and Skype
Candidates are required to take a Java programming test with the prime vendor on
Skype for 60-90 minutes.
The Data Lake engineer should come from a senior Hadoop developer background.
The position will be in support of data management, ingestion, and client
consumption. This individual must be well versed in Big Data fundamentals such
as HDFS and YARN. More than a working knowledge of Hive is required, including
an understanding of partitioning, reducer sizing, block sizing, etc. Basic
Spark knowledge is required; preferably, the candidate has strong knowledge of
Spark in either Python or Scala. Knowledge of other industry ETL and NoSQL
tools such as Cassandra, Drill, Impala, etc. is a plus.
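To illustrate the level of Hive familiarity meant above, below is a minimal sketch (ours, not the client's) that creates a partitioned table and caps reducer size through the Hive JDBC driver; the HiveServer2 host, database, table, and partition values are hypothetical placeholders.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HivePartitionSketch {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Hypothetical HiveServer2 endpoint and database.
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://hive-host:10000/default", "etl_user", "");
             Statement st = conn.createStatement()) {
            // Partitioning by ingest date means queries that filter on dt scan only the matching directories.
            st.execute("CREATE TABLE IF NOT EXISTS events (id BIGINT, payload STRING) "
                     + "PARTITIONED BY (dt STRING) STORED AS ORC");
            // Cap the data handled per reducer (~256 MB) so Hive chooses a sensible reducer count.
            st.execute("SET hive.exec.reducers.bytes.per.reducer=268435456");
            st.execute("INSERT INTO TABLE events PARTITION (dt='2017-06-01') VALUES (1, 'example')");
        }
    }
}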
The candidate should be comfortable with Unix and standard enterprise
environment tools and technologies such as ftp, scp, ssh, Java, Python, SQL, etc.
Our enterprise environment also contains a number of other utilities such as
Tableau, Informatica, Pentaho, Mule, Talend, and others. Proficiency in
these is a plus.
Must have: Java + Big Data (Hive, HBase).
Big Data - Data Lake Developer:
Very strong in core Java implementations: all of the Big Data applications are
written in core Java.
Must be able to code algorithms and reduce their Big O complexity in Java
(O(n), O(n log n), O(n^2), etc.), e.g. sorting and searching.
The client will ask the candidate to implement code in core Java over WebEx
(audio, video, and screen sharing).
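As a rough illustration of the kind of coding exercise meant above, here is a minimal, self-contained Java sketch contrasting an O(n) linear search with an O(log n) binary search over a sorted array (the data values are made up).

import java.util.Arrays;

public class SearchSketch {

    // Linear scan: O(n) comparisons in the worst case.
    static int linearSearch(int[] a, int target) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == target) return i;
        }
        return -1;
    }

    // Binary search over a sorted array: O(log n) comparisons.
    static int binarySearch(int[] a, int target) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2; // written this way to avoid int overflow
            if (a[mid] == target) return mid;
            if (a[mid] < target) lo = mid + 1;
            else hi = mid - 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] data = {42, 7, 19, 3, 88, 56};
        Arrays.sort(data); // sorting up front costs O(n log n)
        System.out.println(linearSearch(data, 56));
        System.out.println(binarySearch(data, 56));
    }
}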
Sqoop is used heavily; roughly 90% of all data imports are done with Sqoop. The
candidate should know the different ways data can be imported, the parameters
used, how to distribute the jobs, how to optimize those parameters, etc.
Very good understanding and hands-on implementation experience with Hive and
HBase (NoSQL) is required.
Writing Bash (shell) scripts and working in a Unix environment is mandatory:
most of the common Unix commands, grepping logs, writing Bash scripts and
scheduling them, etc.
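For the HBase side specifically, the following is a minimal sketch using the standard HBase 1.x+ client API (ConnectionFactory/Table); the table name, column family, and row key are hypothetical, and the cluster configuration is assumed to come from an hbase-site.xml on the classpath.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // picks up hbase-site.xml
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("events"))) { // hypothetical table
            // Write one cell: row key "row-001", column family "d", qualifier "payload".
            Put put = new Put(Bytes.toBytes("row-001"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"), Bytes.toBytes("hello"));
            table.put(put);
            // Read the same cell back.
            Result result = table.get(new Get(Bytes.toBytes("row-001")));
            System.out.println(Bytes.toString(result.getValue(Bytes.toBytes("d"), Bytes.toBytes("payload"))));
        }
    }
}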
Excellent RDBMS SQL skills. The client has access to many data sources
(Teradata, SQL Server, MySQL, Oracle, etc.), and the candidate must be able to
connect to them easily and run complex queries.
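Connecting and querying from Java typically goes through JDBC regardless of vendor; here is a minimal sketch against a hypothetical MySQL database (only the driver jar and JDBC URL prefix would change for Teradata, SQL Server, or Oracle), with made-up table and column names.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class JdbcQuerySketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details.
        String url = "jdbc:mysql://db-host:3306/sales";
        try (Connection conn = DriverManager.getConnection(url, "report_user", "secret");
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT customer_id, SUM(amount) AS total "
                   + "FROM orders WHERE order_date >= ? "
                   + "GROUP BY customer_id ORDER BY total DESC")) {
            ps.setString(1, "2017-01-01");
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getLong("customer_id") + " -> " + rs.getDouble("total"));
                }
            }
        }
    }
}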
Python and Kafka are a plus.
Java REST API implementation is a plus.
Garima Gupta | Technical Recruiter | Apetan Consulting LLC