- Gurugram - Haryana - India
Senior AI & ML Engineer
We’re seeking a hands-on AI/ML engineer to design, build, and productionize Generative AI solutions, including RAG pipelines and multi-agent systems, that automate workflows and drive operational excellence. You’ll work closely with solution/data architects, software developers, data engineers, and domain experts to rapidly prototype and deliver scalable, enterprise-grade systems.
This is an individual contributor role requiring strong research skills, deep expertise in AI foundation models, and the ability to translate cutting-edge concepts into impactful solutions for digital grid challenges.
A Snapshot of Your Day
How You’ll Make an Impact (responsibilities of role)
- End-to-End GenAI Development: Design and implement RAG pipelines, agentic workflows, and LLM integrations for tasks such as document understanding, classification, and knowledge assistance.
- Multi-Agent Orchestration: Build agent-based applications for planning, tool use, and execution using frameworks like LangGraph, Semantic Kernel, and prompt orchestration tools.
- AI Enterprise Architecture: Design scalable, modern, and secure architectures for enterprise AI/ML solutions.
- Data & MLOps Foundations: Architect data pipelines and cloud solutions for training, deployment, and monitoring on Azure/AWS with Docker, Kubernetes, and CI/CD.
- Rapid Prototyping to Production: Convert problem statements into prototypes, iterate with stakeholders, and harden into production-ready microservices (FastAPI) with APIs and event-driven workflows.
- Evaluation & Reliability: Define rigorous evaluation metrics for LLM/ML systems (accuracy, latency, cost, safety) and optimize retrieval quality, prompt strategies, and agent policies.
- Security & Compliance: Implement Responsible AI guardrails, data privacy, PII handling, access controls, and auditability.
- Collaboration & Enablement: Partner with data engineers, mentor junior team members, and contribute to internal documentation and demos.
What You Bring (required qualifications and skill sets)
- Education: Bachelor’s/Master’s degree in Computer Science, Data Science, or Engineering, or equivalent experience.
- Experience:
  - 7–12 years delivering AI/ML and data science solutions in production.
  - 2–3 years focused on Generative AI/LLM applications.
- Technical Skills:
  - Programming: Strong Python (typing, packaging, testing), data stacks (NumPy, Pandas, scikit-learn), API development (FastAPI/Flask).
  - GenAI Expertise:
    - Prompt engineering, RAG design (indexing, chunking, reranking).
    - Embeddings and vector databases (FAISS, Azure AI Search, Pinecone).
    - Agent frameworks (LangGraph, Semantic Kernel) and orchestration strategies.
    - Model selection/fine-tuning, cost-performance optimization, safety filters.
  - Cloud & Data: Hands-on with Azure/AWS; experience with Azure OpenAI, Azure AI Search, Microsoft Fabric/Databricks (preferred), Snowflake or similar DWH.
  - MLOps: Docker, Kubernetes, CI/CD (GitHub Actions/GitLab), model deployment/monitoring.
  - Architecture: Microservices, event-driven design, API security, scalability, and resilience.
- Soft Skills:
  - Excellent team player with the ability to work collaboratively in cross-functional and multicultural teams.
  - Strong communication skills, with the ability to explain complex technical ideas to non-technical stakeholders.
  - Adaptability to changing priorities and evolving technologies.
  - Problem-solving mindset with creativity, curiosity, and a proactive approach.
  - Time management and prioritization in fast-paced, iterative environments.
  - A mentoring attitude toward junior colleagues and an openness to receiving feedback.
  - Strong sense of ownership and accountability over deliverables.
- Domain Knowledge: Experience applying AI/ML to power systems, electrical grids, or related domains.
Preferred Qualifications
- Experience with Azure OpenAI, Microsoft Fabric/Prompt Flow, Copilot Studio connectors, or enterprise integrations (SharePoint/Teams).
- Expertise in ML/DL techniques: time-series forecasting, anomaly detection, and NLP/document AI (OCR, classification, extraction).
- Familiarity with security (OAuth2, RBAC), observability (OpenTelemetry), and cost governance (token budgeting).
Tech Stack
- Languages/Frameworks: Python, FastAPI/Flask, LangGraph/Semantic Kernel/CrewAI/AutoGen, scikit-learn, PyTorch/TensorFlow.
- LLM & Retrieval: Azure OpenAI/open-weight models, embeddings, vector DBs (FAISS/Milvus/Pinecone), reranking.
- Data & Cloud: Snowflake, Azure/AWS (storage, compute, messaging), SQL.
- Ops: Docker, Kubernetes, GitHub Actions/Jenkins, Helm, monitoring/logging.
- Collaboration: Git, Jira/Azure DevOps, Agile/Scrum.