Which is the most popular Radish variety? Solved Using Python

Background:

We conducted a survey of radish consumers and recorded each person's vote for their favorite radish variety in a txt file. The data is now ready for analysis.

Objective:

To answer two questions:

  • Which radish variety is the most popular?
  • Did anybody vote twice (rigging!)?

Tool used:

IPython Notebook version 3.4.3

Procedure:

Step 1: Read the txt file into the IPython Notebook using Python's built-in open() function.
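As a rough sketch of this reading step: the file name `radishsurvey.txt` and the "voter - variety" line format below are assumptions, since the post does not show the actual survey file. To stay runnable on its own, the sketch first writes a two-line sample file to a temporary folder and then reads it back.

```python
import os
import tempfile

# Made-up sample data; the real survey file and its format are not shown in the post.
sample = "Aditya - Champion\nMaya - White Icicle\n"

# Write the sample to a temporary file so the sketch runs on its own.
path = os.path.join(tempfile.mkdtemp(), "radishsurvey.txt")  # hypothetical file name
with open(path, "w") as f:
    f.write(sample)

# Read the file back; the with-block closes the file automatically.
with open(path) as f:
    lines = f.readlines()

print(lines)
```

Each element of `lines` still ends with a newline character, which is why the next step needs strip().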

Step 2: Break each line (a string) into two parts, the name of the person who voted and the name of the radish variety, using string methods such as strip() and split().
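A minimal sketch of the splitting step, assuming each line looks like "voter - variety". The " - " separator and the sample line are assumptions and may differ from the real file.

```python
# Made-up raw line; the " - " separator is an assumption about the file format.
line = "Aditya - Champion\n"

# strip() removes the trailing newline; split() cuts the line at the separator.
parts = line.strip().split(" - ")
name = parts[0]
variety = parts[1]

print(name)
print(variety)
```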

Step 3: Create a generic function that counts how many people voted for a given variety. It combines an if condition with the string methods from Step 2.

[Image: Gen Func]
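One way such a generic counting function might look is sketched below. The function name `count_votes`, the sample lines, and the " - " separator are all assumptions made for illustration, not the code from the original screenshot.

```python
def count_votes(lines, radish):
    """Count the lines whose variety part equals `radish`."""
    count = 0
    for line in lines:
        # Assumed line format: "voter - variety"
        name, variety = line.strip().split(" - ")
        if variety == radish:
            count += 1
    return count


# Made-up sample votes for illustration only.
lines = ["Aditya - Champion\n", "Maya - White Icicle\n", "Ravi - Champion\n"]
print(count_votes(lines, "Champion"))
```

The function has to be called once per variety, which is the limitation the dictionary in the next step removes.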

Step 4: Count all the votes with a dictionary, one of the most powerful tools in the Python language. It makes our life easy: each unique radish variety becomes a key, and its value is the number of customers who voted for that variety.

[Image: Dictionary]
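A small sketch of dictionary-based counting, again assuming "voter - variety" lines with made-up sample data. One pass over the lines counts every variety at once.

```python
# Made-up sample votes; the " - " separator is an assumption.
lines = ["Aditya - Champion\n", "Maya - White Icicle\n", "Ravi - Champion\n"]

counts = {}  # maps variety name -> number of votes
for line in lines:
    name, variety = line.strip().split(" - ")
    if variety not in counts:
        counts[variety] = 1   # first vote for this variety
    else:
        counts[variety] += 1  # one more vote for a known variety

print(counts)
```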

Step 5: The output of the previous step is not yet what we want, so we have to clean the data. We do that with string methods such as capitalize() and strip(). This mainly fixes typos in names, for example “ Aditya”, “aditya”, “A ditya”. The computer treats all of these as different strings. strip() removes the surrounding spaces and capitalize() makes the casing uniform; a space inside the name, as in “A ditya”, is not touched by either and needs an additional step such as replace().

[Image: Data_Munging]
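The cleaning idea can be sketched as follows. The helper name `clean` and the sample strings are hypothetical. Note that capitalize() also lowercases everything after the first character, so multi-word variety names lose their inner capitals; that is harmless as long as every name goes through the same function.

```python
def clean(text):
    """Trim surrounding whitespace and make the casing uniform."""
    return text.strip().capitalize()


# Made-up spellings of the same name.
raw_names = [" Aditya", "aditya", "ADITYA "]
cleaned = [clean(n) for n in raw_names]
print(cleaned)

# capitalize() lowercases the rest of the string as well.
print(clean("white ICICLE"))
```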

Step 6: The last step before tidying up the code is to catch fraudsters who voted twice. We create a separate dictionary of people who have already voted and check each vote against it. Anyone who votes a second time is printed as a fraud, and the fake vote is not counted.

[Image: Fraud Detection]
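A hedged sketch of the double-vote check: the variable names `voted`, `frauds`, and `counts`, the sample votes, and the " - " separator are assumptions for illustration. The post describes a separate dictionary of voters, so the sketch uses one; a set would work equally well.

```python
# Made-up sample votes; "Aditya" appears twice on purpose.
lines = [
    "Aditya - Champion",
    "Maya - White Icicle",
    "Aditya - Champion",
    "Ravi - Champion",
]

voted = {}   # names of people who have already voted
counts = {}  # variety name -> number of valid votes
frauds = []  # names caught voting more than once

for line in lines:
    name, variety = line.strip().split(" - ")
    if name in voted:
        print(name, "voted twice: fraud!")
        frauds.append(name)
        continue  # skip the fake vote so it is not counted
    voted[name] = True
    counts[variety] = counts.get(variety, 0) + 1

print(counts)
```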

Step 7: Finally, we create three functions: one for cleaning the names of the voters and of the radish varieties, a second for checking whether anyone voted twice, and a third for counting the votes for each radish variety.

[Image: Three functions]
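The three-function structure might be organised roughly like this. The function names `clean_vote`, `has_already_voted`, and `count_vote` are hypothetical, as are the sample lines and the " - " separator; the original screenshot may divide the work differently.

```python
def clean_vote(line):
    """Split a raw line and return a cleaned (name, variety) pair."""
    name, variety = line.strip().split(" - ")  # assumed separator
    return name.strip().capitalize(), variety.strip().capitalize()


def has_already_voted(name, voted):
    """Return True if this person has voted before."""
    return name in voted


def count_vote(variety, counts):
    """Add one vote for the given variety."""
    counts[variety] = counts.get(variety, 0) + 1


# Made-up sample votes; the first and last lines are the same person.
lines = [" aditya - champion\n", "Maya - White Icicle\n", "Aditya - Champion\n"]

voted = {}
counts = {}
for line in lines:
    name, variety = clean_vote(line)
    if has_already_voted(name, voted):
        print(name, "has already voted: fraud!")
        continue
    voted[name] = True
    count_vote(variety, counts)

print(counts)
```

Because the names are cleaned before the duplicate check, " aditya" and "Aditya" are recognised as the same voter.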

Final Step: Time to announce the winner. After analysing the text file, we use a basic for loop with an if condition to decide the winner. Here is the code snippet for the same.

[Image: winner]
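A minimal sketch of picking the winner with a for loop and an if condition. The vote counts below are made up for illustration and are not the survey's actual results.

```python
# Made-up vote counts; NOT the real survey results.
counts = {"Champion": 5, "White icicle": 3, "Cherry belle": 4}

winner_name = None
winner_votes = 0
for variety, votes in counts.items():
    # Keep the variety with the highest vote count seen so far.
    if votes > winner_votes:
        winner_name = variety
        winner_votes = votes

print("The winner is:", winner_name, "with", winner_votes, "votes")
```

In case of a tie, this loop keeps the variety it encountered first. A shorter alternative for the same lookup is `max(counts, key=counts.get)`.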

This completes our analysis of the most popular radish variety among the sampled consumers. Out of a total of 11 varieties, “The Champion” emerged as the winner. With this, we have answered the two questions that were the objective of this analysis.

Let me know your thoughts about the code snippets used here.
