Full course description
This course provides a comprehensive foundation in Data-Centric AI, focusing on the role of data in developing effective AI solutions. Participants explore key concepts such as data wrangling, enrichment, augmentation, and synthetic data generation, and learn how to refine datasets to improve model performance. The course covers exploratory data analysis (EDA) using Excel, PyCaret, and other Python libraries, and introduces MLOps for integrating and managing data pipelines with continuous integration, continuous delivery, and continuous training. Learners gain hands-on experience with automated machine learning (AutoML), MLflow, and containerization, enabling them to build and deploy end-to-end machine learning solutions. The course also examines data labeling best practices, ethical considerations, and real-world challenges such as bias, domain gaps, and data noise, equipping participants to optimize AI workflows and implement scalable, ethical AI solutions.