Dimpy Adhikary



Experienced test architect with 16+ years of industry experience, predominantly in web, API, mobile, and DWH testing. Well versed in developing testing strategies for complex architectural problems and in establishing and implementing automated testing strategies. Has helped many clients author, build, and adopt large-scale, cross-functional automation frameworks with well-balanced coverage across UI and back-end integration.

Dimpy loves to share knowledge and participate in events organized by testing communities. She has been a speaker at various events organized by Agile Testing Alliance, Selenium and Appium Conference.

Title: Follow a Tweet - BigData Pipeline Testing


In this digital world, social media generates huge volumes of data every second. Many companies consume this data and turn it into opportunities. Big Data plays a vital role in handling data at this scale; it deals with the volume, velocity, and variety of data. The data can take any form: messages, logs, XML, sensor data, photos, images, and so on. Processing and analyzing it with a Big Data tech stack can yield valuable insights. The number of Google searches, Facebook messages and likes, and Twitter tweets generated daily is phenomenal. Businesses use this information in numerous ways, managing and analyzing it to gain a competitive edge.

Data pipelines come in different types, such as batch processing and stream processing, and each requires a different testing strategy. This talk considers a near-real-time (stream processing) pipeline.

Use case:

Consider a new sporting company that would like to target its sales of sports equipment based on the pulse of the sport in each country.

The Olympics is currently the top trending topic. We shall extract the tweets from Twitter and load them into our ecosystem. Using this data, we shall build analytics to find the top sport per country, target the corresponding sports equipment by demography, and present a visualization.
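The "top sport per country" analytic described above can be illustrated with a minimal sketch. It assumes each tweet has already been reduced to a record with hypothetical `country` and `sport` fields; the field names, the sample data, and the function `top_sport_per_country` are illustrative assumptions, not the pipeline used in the talk.

```python
from collections import Counter, defaultdict


def top_sport_per_country(tweets):
    """Return {country: sport}, the most-mentioned sport for each country.

    `tweets` is an iterable of dicts with hypothetical keys
    "country" and "sport" (assumed for illustration only).
    """
    counts = defaultdict(Counter)
    for tweet in tweets:
        counts[tweet["country"]][tweet["sport"]] += 1
    # most_common(1) returns a one-element list [(sport, count)].
    # Ties are not handled specially in this sketch.
    return {country: c.most_common(1)[0][0] for country, c in counts.items()}


# Made-up sample records, not real Twitter data.
sample_tweets = [
    {"id": 1, "country": "IN", "sport": "hockey"},
    {"id": 2, "country": "IN", "sport": "hockey"},
    {"id": 3, "country": "IN", "sport": "badminton"},
    {"id": 4, "country": "US", "sport": "basketball"},
    {"id": 5, "country": "US", "sport": "swimming"},
    {"id": 6, "country": "US", "sport": "basketball"},
]

result = top_sport_per_country(sample_tweets)
```

A test for the real pipeline could compute the same aggregate independently from the source tweets and compare it with the figures shown in the report.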

As part of this talk, we shall follow the tweets through the entire data pipeline and see how each tweet is eventually utilized and analyzed. We will provide a detailed test strategy covering how a tweet on Twitter is extracted, transformed, and loaded into reports. The talk will highlight the mindset required for data pipeline testing and demonstrate a high-level testing strategy for the Twitter data pipeline.
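The idea of following one tweet through extract, transform, and load can be sketched as a simple trace check. This is a sketch under stated assumptions: each stage is modeled as an in-memory dict keyed by tweet id, and the stage names, record fields, and helper functions `follow_tweet` and `first_missing_stage` are hypothetical. In a real pipeline each lookup would query the corresponding store instead.

```python
def follow_tweet(tweet_id, stages):
    """Return (stage_name, record) pairs in pipeline order.

    `stages` is a list of (name, records) tuples, where `records`
    maps tweet id to the record held at that stage (assumed shape).
    The record is None where the tweet is absent.
    """
    return [(name, records.get(tweet_id)) for name, records in stages]


def first_missing_stage(tweet_id, stages):
    """Return the first stage where the tweet is absent, or None if it reached every stage."""
    for name, record in follow_tweet(tweet_id, stages):
        if record is None:
            return name
    return None


# Made-up stage contents for illustration only.
stages = [
    ("extracted", {
        101: {"text": "What a match! #hockey", "country": "IN"},
        102: {"text": "Great final #swimming", "country": "US"},
    }),
    ("transformed", {
        101: {"sport": "hockey", "country": "IN"},
        102: {"sport": "swimming", "country": "US"},
    }),
    ("loaded", {
        101: {"sport": "hockey", "country": "IN"},
    }),
]

trace = follow_tweet(101, stages)
dropped_at = first_missing_stage(102, stages)
```

Here tweet 101 is present at every stage, while tweet 102 never reaches the `loaded` stage, which is the kind of silent data loss a pipeline test should surface.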

Outline/Structure of the Demonstration

Introduction and Agenda - 5 mins

Big Data Pipeline & Testing Strategy - 10 mins

Demo: Follow the Tweet from Twitter - 10 mins

Code walkthrough - 5 mins

Learning Outcome

● Understanding the Big Data pipelines

● Testing Mindset for BigData

Target Audience

Enthusiastic testers who are interested in understanding the testing aspects of Big Data

Prerequisites for Attendees

Basic knowledge of data and SQL