Dimpy Adhikary

Speaker

Experienced test architect with 16+ years of industry experience, currently in a leadership role at a fast-paced service company. Responsible for developing testing strategies for complex architectural problems, establishing and implementing automated testing strategies, and being a hands-on peer leader.

Current hands-on experience authoring, building and adopting large-scale, cross-functional automation frameworks with well-balanced coverage across UI and back-end integration. Working closely with the business to identify solution requirements and key case studies/scenarios for the future solution.

Leading implementation of the test plan/strategy from establishing project requirements and goals.
Participating in the full cycle of pre-sales activities with potential customers, development of proposals for implementation and design of the solution, presenting the proposed solution to customers, participation in customer demos.

Title: Follow a Tweet - BigData Pipeline Testing

Abstract:

In this digital world, social media generates terabytes of data every second. Many companies consume this data and turn it into opportunities. Big Data plays a vital role in handling such large volumes: it deals with the volume, velocity and variety of data. The data can take any form, be it messages, logs, XML, sensor data, photos or images, and processing and analyzing it with the Big Data tech stack can be very insightful. The number of Google searches, Facebook messages and likes, and Twitter tweets generated every day is phenomenal. Businesses use this information in numerous ways, managing and analyzing it to gain a competitive edge.

There are different types of data pipelines, such as batch processing and stream processing, and each requires a different testing strategy. This talk considers a near-real-time (stream processing) pipeline.

Use case:

Consider a new sporting company that would like to target its sales of sports equipment based on the pulse of the sport in each country.

The Olympics is currently the top trending topic. We shall extract the tweets from Twitter and load them into our ecosystem. Using this data, we shall build analytics to find the top sport per country, target the corresponding sports equipment by demographic, and present a visualization.
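The "top sport per country" analytic can be illustrated with a minimal sketch. It assumes each tweet has already been enriched with hypothetical `country` and `sport` fields. Real tweets carry no such fields, so that enrichment step is assumed here. Ties are broken alphabetically so that the result is deterministic and easy to assert on in a test.

```python
from collections import Counter, defaultdict


def top_sport_per_country(tweets):
    """Return {country: most-mentioned sport} for a list of tweet dicts.

    Assumes every tweet dict has 'country' and 'sport' keys
    (hypothetical field names, not part of any real Twitter payload).
    """
    counts = defaultdict(Counter)
    for tweet in tweets:
        counts[tweet["country"]][tweet["sport"]] += 1

    # Highest count wins; ties are broken alphabetically by sport name.
    return {
        country: min(sports.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        for country, sports in counts.items()
    }


sample_tweets = [
    {"country": "IN", "sport": "hockey"},
    {"country": "IN", "sport": "cricket"},
    {"country": "IN", "sport": "hockey"},
    {"country": "US", "sport": "swimming"},
]
print(top_sport_per_country(sample_tweets))
```

A test for the aggregation layer can feed in a small, hand-built set of tweets like this one and compare the report output against the expected winner for each country.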

As part of this talk, we shall follow the tweets through the entire data pipeline and see how each tweet is eventually utilized and analyzed. We will provide a detailed test strategy covering how a tweet on Twitter is extracted, transformed and loaded into reports. The mindset required for data pipeline testing will be highlighted, and a high-level testing strategy for the Twitter data pipeline will be demonstrated.
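The idea of following one tweet through the pipeline can be sketched as a traceability check: pick a tweet ID and record at which stage it is still present. The stage functions and field names below (`extract`, `transform`, `load`, `id`, `text`, `country`) are illustrative stand-ins and not the pipeline shown in the talk.

```python
def extract(raw_tweets):
    """Keep only records that have both an id and non-empty text."""
    return [t for t in raw_tweets if t.get("id") and t.get("text")]


def transform(tweets):
    """Normalize the country code and strip whitespace from the text."""
    return [
        {
            "id": t["id"],
            "country": t.get("country", "").upper(),
            "text": t["text"].strip(),
        }
        for t in tweets
    ]


def load(rows):
    """Stand-in for the report store: index rows by tweet id."""
    return {row["id"]: row for row in rows}


def follow_tweet(tweet_id, raw_tweets):
    """Report whether the given tweet survives each pipeline stage."""
    extracted = extract(raw_tweets)
    transformed = transform(extracted)
    loaded = load(transformed)
    return {
        "extracted": any(t["id"] == tweet_id for t in extracted),
        "transformed": any(t["id"] == tweet_id for t in transformed),
        "loaded": tweet_id in loaded,
    }


raw = [
    {"id": "t1", "country": "in", "text": "  Great hockey final!  "},
    {"id": "t2", "country": "us", "text": ""},  # dropped at extract
]
print(follow_tweet("t1", raw))
print(follow_tweet("t2", raw))
```

Checking each stage separately shows where a record was lost or altered, which an end-to-end comparison of input and report alone would not reveal.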

Outline/Structure of the Demonstration

Introduction and Agenda - 5 mins

Big Data Pipeline & Testing Strategy - 10 mins

Demo: Follow the Tweet from Twitter - 10 mins

Code walkthrough - 5 mins

Learning Outcome

● Understanding the Big Data pipelines

● Testing Mindset for BigData

Target Audience

Enthusiastic Testers who are interested in understanding testing aspects of BigData

Prerequisites for Attendees

Basic knowledge of data and SQL