Required Skills: HDFS, Hive, Spark, HBase, Oozie
Job Description
Role – SA AIML
Location – Bangalore
Years of experience needed – 8-10 years of relevant experience
Responsibilities:
• Work closely with clients to understand their business requirements and design data solutions that meet their needs.
• Develop and implement end-to-end data solutions that include data ingestion, data storage, data processing, and data visualization components.
• Design and implement data architectures that are scalable, secure, and compliant with industry standards.
• Work with data engineers, data analysts, and other stakeholders to ensure the successful delivery of data solutions.
• Participate in presales activities, including solution design, proposal creation, and client presentations.
• Act as a technical liaison between the client and our internal teams, providing technical guidance and expertise throughout the project lifecycle.
• Stay up-to-date with industry trends and emerging technologies related to data architecture and engineering.
• Develop and maintain relationships with clients to ensure their ongoing satisfaction and identify opportunities for additional business.
• Understand the entire end-to-end AI lifecycle, from ingestion through inferencing to operations.
• Stay current with emerging Gen AI technologies.
• Work hands-on with the Kubernetes platform, deploying and containerizing applications.
• Demonstrate good knowledge of data governance, data warehousing, and data modelling.
Requirements:
• Bachelor's or Master's degree in Computer Science, Data Science, or a related field.
• 10+ years of experience as a Data Solution Architect, with a proven track record of designing and implementing end-to-end data solutions.
• Strong technical background in data architecture, data engineering, and data management.
• Extensive experience working with any of the Hadoop distributions, preferably Data Fabric.
• Experience with presales activities such as solution design, proposal creation, and client presentations.
• Familiarity with cloud-based data platforms (e.g., AWS, Azure, Google Cloud) and related technologies such as data warehousing, data lakes, and data streaming.
• Experience with Kubernetes and the Gen AI tools and tech stack.
• Excellent communication and interpersonal skills, with the ability to effectively communicate technical concepts to both technical and non-technical audiences.
• Strong problem-solving skills, with the ability to analyze complex data systems and identify areas for improvement.
• Strong project management skills, with the ability to manage multiple projects simultaneously and prioritize tasks effectively.
Tools & Tech Stack:
1. Data Architecture and Engineering:
   a. Hadoop Ecosystem: Preferred: Cloudera Data Platform (CDP) or Data Fabric. Tools: HDFS, Hive, Spark, HBase, Oozie.
   b. Data Warehousing: Cloud-based: Azure Synapse, Amazon Redshift, Google BigQuery, Snowflake, and Azure Databricks. On-premises: Teradata, Vertica.
   c. Data Integration and ETL Tools: Apache NiFi, Talend, Informatica, Azure Data Factory, AWS Glue.
2. Cloud Platforms: Azure (preferred for its Data Services and Synapse integration), AWS, or GCP. Cloud-native components:
   a. Data Lakes: Azure Data Lake Storage, AWS S3, or Google Cloud Storage.
   b. Data Streaming: Apache Kafka, Azure Event Hubs, AWS Kinesis.
3. HPE Platforms: Data Fabric, AI Essentials or Unified Analytics, HPE MLDM, and HPE MLDE.
4. AI and Gen AI Technologies:
   a. AI Lifecycle Management (MLOps): MLflow, Kubeflow, Azure ML, SageMaker, or Ray.
   b. Inference Tools: TensorFlow Serving, KServe, Seldon.
   c. Generative AI: Frameworks: Hugging Face Transformers, LangChain. Tools: OpenAI API (e.g., GPT-4).
5. Orchestration and Deployment:
   a. Kubernetes Platforms: Azure Kubernetes Service (AKS), Amazon EKS, Google Kubernetes Engine (GKE), or open-source Kubernetes. Tools: Helm.
   b. CI/CD for Data Pipelines and Applications: Jenkins, GitHub Actions, GitLab CI, or Azure DevOps.