Balvinder has 15 years of experience building large-scale custom software and big data platform solutions for complex client problems. She has extensive experience in the analysis, design, architecture, and development of web-based enterprise systems and analytical systems using Agile practices such as Scrum and XP. Her technical skills lie in backend development using Java and Scala, big data technologies, complex systems architectures, and distributed computing. She is a thought leader in the big data space and speaks regularly at conferences.
Balvinder currently works as a Data Architect and Global Data Community Lead for Thoughtworks.

QA and Agile on Big Data Projects

Great software quality standards make great software, and hence quality practices are baked into the software delivery process itself. While quality and testing processes for application software have matured and there are industry standards around them, the processes and practices for data platforms, especially those involving ML/AI components, are still vague.

Complexities arise from the inherent volume, velocity, and variety of data in data platforms. Additionally, there is ambiguity across the entire test pyramid as to what should be tested in unit tests, integration tests, and E2E (user journey) tests. These problems are compounded by the absence of established testing tools for data platforms comparable to Selenium or Cypress for web applications.

In this talk, we will share our insights from data projects we have worked on as Quality Analysts. We will do so by presenting a case study, and through it we will discuss how QA (and Agile processes) differ specifically on data projects. By the end of this talk, you will be familiar with several common QA practices to employ on data projects that align with Agile delivery processes, and with a few of the tools we used on these projects.