Overview

A lot of public data is available on the internet, but none of it is centralized. Even if it were centralized, it would be of little use to most citizens of India, since many people are not familiar with data manipulation and extraction. The objective of this problem is to empower the citizens of India to extract knowledge and insights without diving deep into enormous datasets.

We would like the participants to build a smart and intuitive interface that behaves as an optimized search tool for the database provided. You can think of it as Google for a database.

All candidates must go through a preliminary screening round, the technology test. Qualified candidates will be invited via email to participate in the hackathon. If you clear the preliminary screening round, a dataset collected from open-source websites on Indian data will be emailed to you. This dataset contains district-level information on various sectors of India, such as health, education, and telecommunication.

The following basic features must be implemented in the interface thus built:

  1. It should return a column as a search result when asked for one.
  2. It should select a column and apply a filter to it, returning the subset of the data or the column that addresses the request.
  3. It should return the values of standard functions such as mean or standard deviation when asked for them.
  4. If a direct answer to a request is not found in the database, it should try to derive new columns from existing columns to address the request.
  5. If no direct or derived answer is found, it should display the most relevant data frame in which one might find what the request is asking for.
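
To make the five features concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the rows, the column names (state, district, schools, population) and the function names are hypothetical, since the real dataset is only shared after the screening round. Feature 5 is approximated by suggesting the closest column name rather than a whole data frame.

```python
from difflib import get_close_matches
from statistics import mean, stdev

# Hypothetical district-level rows; the real dataset's columns are not known here.
ROWS = [
    {"state": "Kerala", "district": "Kollam", "schools": 120, "population": 2400},
    {"state": "Kerala", "district": "Idukki", "schools": 80, "population": 1000},
    {"state": "Bihar", "district": "Patna", "schools": 150, "population": 5000},
]

def get_column(rows, name):
    """Feature 1: return one column as a list of values."""
    return [row[name] for row in rows]

def filter_rows(rows, column, value):
    """Feature 2: keep only the rows where `column` equals `value`."""
    return [row for row in rows if row[column] == value]

def summarize(rows, column):
    """Feature 3: mean and sample standard deviation of a numeric column."""
    values = get_column(rows, column)
    return {"mean": mean(values), "stdev": stdev(values)}

def derive_ratio(rows, numerator, denominator, new_name):
    """Feature 4: derive a new column as the ratio of two existing columns."""
    return [{**row, new_name: row[numerator] / row[denominator]} for row in rows]

def closest_column(rows, requested):
    """Feature 5 (approximation): suggest the most similar column name, or None."""
    matches = get_close_matches(requested, list(rows[0].keys()), n=1, cutoff=0.5)
    return matches[0] if matches else None
```

For example, summarize(filter_rows(ROWS, "state", "Kerala"), "schools") combines features 2 and 3 to describe schools in one state, and closest_column(ROWS, "school") recovers the column name from a slightly wrong request.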

The above features can be implemented in numerous ways, so you may build the interface either by taking inputs through a web form (Version 1) or by implementing NLP (Version 2).

Either Version 1 or Version 2 will be considered a complete submission for the prize money. 30% extra marks will be added for successfully attempting Version 2.

Summary of steps for completing the problem:

  • Step 1: Register yourself on the platform and upload your resume.
  • Step 2: Complete the preliminary screening round for the challenge.
  • Step 3: Successful candidates will be invited via email. The required datasets will be shared in the invitation mail.
  • Step 4: Submit your solution for Version 1 OR Version 2 of the problem. Extra points for Version 2.

Here is the link to the preliminary screening round:

ALL THE BEST!!!

Themes

VERSION 1: DATABOT

A.1 The user asks questions through a web form by making selections in a number of drop-down menus.
Example:
Metric drop down
State drop down → Selection aggregates or queries data over the state
District drop down → Selection queries or aggregates data over the district
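
As a rough sketch of how the three drop-down selections could be turned into a query, the function below aggregates the chosen metric over the narrowest selection made. The rows, column names, and the choice of a sum as the aggregation are all assumptions for illustration; the challenge does not specify them.

```python
# Hypothetical district-level rows standing in for the real dataset.
ROWS = [
    {"state": "Kerala", "district": "Kollam", "schools": 120},
    {"state": "Kerala", "district": "Idukki", "schools": 80},
    {"state": "Bihar", "district": "Patna", "schools": 150},
]

def answer_form(rows, metric, state=None, district=None):
    """Sum `metric` over the district if one is selected, else the state, else all rows."""
    if district is not None:
        selected = [row for row in rows if row["district"] == district]
    elif state is not None:
        selected = [row for row in rows if row["state"] == state]
    else:
        selected = rows
    return sum(row[metric] for row in selected)
```

Selecting only a state aggregates over all of its districts, while selecting a district queries that single district's value.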

A.2 The user will also have the ability to create their own metric (ARTIFICIAL METRIC) by using two metrics and doing basic mu...

VERSION 2: DATA CHATBOT For the “experimentalists”

Let’s make this more interesting for the experimentalists amongst you. Let’s do the same as above, but this time with NLP. We recommend that participants include NLP packages in the interface for better performance. Relevant information on NLP can be found on Rasa.ai, but we believe that there are numerous other methods and libraries that participants may resort to, mai...
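
As a starting point before bringing in an NLP library, a free-text question can be mapped to the same selections as the web form by simple keyword lookup. This is only a baseline sketch and not an NLP model: the vocabularies and the function name below are hypothetical, and a real submission would likely replace the lookup with a proper intent and entity extractor.

```python
import re

# Hypothetical vocabularies; a real system would build these from the dataset.
COLUMNS = ["schools", "hospitals"]
STATES = ["kerala", "bihar"]
OPERATIONS = {"average": "mean", "mean": "mean", "total": "sum", "sum": "sum"}

def parse_question(text):
    """Extract operation, column and state from a question by keyword lookup."""
    tokens = re.findall(r"[a-z]+", text.lower())
    operation = next((OPERATIONS[t] for t in tokens if t in OPERATIONS), None)
    column = next((t for t in tokens if t in COLUMNS), None)
    state = next((t for t in tokens if t in STATES), None)
    return {"operation": operation, "column": column, "state": state}
```

Any field that cannot be recognized comes back as None, which the interface can use to ask a follow-up question or fall back to showing the most relevant data.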


Prizes

Main Prizes
Winners (3)

Up to 3 people will be selected based on their code to collaborate with Swaniti-Ank Aha on a part-time or full-time basis to deliver the real version of the Jaano India chatbot to the country. Each will also be awarded ₹ 50,000 in prize money.
