Job Description
Joining Razer will place you on a global mission to revolutionize the way the world games. Razer is a place to do great work, offering you the opportunity to make a global impact while working with a team spread across 5 continents. Razer is also a great place to work, providing the unique, gamer-centric #LifeAtRazer experience that will accelerate your growth, both personally and professionally.
Job Responsibilities :
We are seeking a highly skilled and innovative Data Scientist to join our software team, leveraging user configuration data and software configuration schemas to fine-tune large language models (LLMs) in the 8B–32B parameter range. You will build an AI-powered configuration assistant that combines LLM fine-tuning, prompt engineering, retrieval-augmented generation (RAG) with VectorDB & GraphDB, and model optimization (including quantization) to deliver accurate, fast, and cost-effective recommendations to users.
This is a full-stack applied AI role, covering data handling, model training, deployment, monitoring, and optimization in production.
Key Responsibilities
1. LLM Fine-tuning & Evaluation
- Fine-tune and adapt LLMs for domain-specific configuration assistance.
- Apply instruction tuning, LoRA, RLHF, and domain adaptation.
- Establish automated evaluation pipelines for accuracy, latency, and safety.
2. Prompt Engineering
- Design, test, and optimize prompt strategies for varied scenarios, personas, and workflows.
- Develop reusable prompt templates and dynamic context injection logic.
- Run A/B tests to measure prompt impact on user outcomes.
3. Retrieval-Augmented Generation (RAG) with VectorDB & GraphDB
- Implement semantic retrieval with VectorDB (e.g., FAISS, Pinecone, Weaviate).
- Build GraphDB (e.g., Neo4j, TigerGraph) pipelines to represent and query configuration relationships.
- Combine embedding search with graph reasoning for richer context in LLM outputs.
- Optimize retrieval for both latency and relevance.
4. Model Quantization & Optimization
- Apply quantization, pruning, and distillation to right-size LLMs for deployment.
- Benchmark trade-offs between quality, speed, and cost across CPU/GPU/edge.
- Collaborate with infrastructure teams on inference optimization.
5. Data Handling & Engineering
- Extract, clean, and structure configuration and schema data (JSON, YAML, XML).
- Use SQL to query and transform relational datasets.
- Build automated pipelines for continuous retraining and RAG index updates.
- Apply schema-aware data modeling for improved retrieval and training.
6. Production Deployment & Monitoring
- Collaborate with software engineers to integrate AI into live products.
- Develop APIs and microservices for LLM-powered features.
- Set up monitoring dashboards, drift detection, and feedback loops.
- Implement safety guardrails to reduce hallucinations and block unsafe recommendations.
7. Security, Privacy & Compliance
- Ensure compliance with data privacy regulations and security standards (e.g., GDPR, SOC 2).
- Apply data anonymization and access control practices.
- Design output filtering to avoid sensitive or incorrect recommendations.
Pre-Requisites :
Requirements
Must-Have:
- 3+ years of experience in Data Science, ML, or NLP, including hands-on LLM fine-tuning.
- Proven skills in prompt engineering and RAG pipeline development.
- Experience with VectorDB and GraphDB integration.
- Hands-on experience with model quantization and optimization.
- Proficiency in Python (Hugging Face Transformers, PyTorch, LangChain).
- Proficiency with SQL and relational data modeling.
- Knowledge of YAML, JSON, XML, and schema-based data structures.
- Strong grasp of MLOps principles for production deployment.
Preferred:
- Experience with GPU optimization tools (ONNX Runtime, TensorRT).
- Background in software configuration management systems.
- Familiarity with CI/CD, Docker, and Kubernetes for ML services.
- Experience with LLM evaluation frameworks (e.g., Ragas, HELM, OpenAI Evals).
Are you game?