Creating Charts using matplotlib in Python [Hands-on]
Data storytelling is a very important branch of data science. Your audience may not be as fond of numbers as you are, so it's important to show them your results in a language they understand. For a language to belong in the data science world, it needs not only strong data processing capabilities but also strong data visualization, and Python, with packages like matplotlib, can compete with R on that front.
So let's try to represent the data of our previous post in terms of graphs/charts.
Problem:
Draw a bar graph from the dictionary counts that we built in our previous blog post.
Takeaways:
- Basics of matplotlib
Approach:
As we do for every new package, the first job is to import the matplotlib package:
import matplotlib.pyplot as plt
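Now let's draw a bar graph with the values (votes) of the dictionary counts:
plt.bar(range(len(counts)), list(counts.values()), align='center')
Our graph is ready, but it's kind of naked (without labels ;) ). A graph with no labels makes no sense to anyone, so it's our duty to label the axes correctly before we show it. Let's add the labels and then display the graph:
plt.ylabel("Votes")
plt.xticks(range(len(counts)), list(counts.keys()), rotation=90)
plt.show()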
And here's how the bar graph looks: beautiful, isn't it?
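As a variation on the steps above, here is a minimal sketch in Python 3 that sorts the varieties by vote count before plotting, so the tallest bar comes first. The helper names `bar_data` and `draw_bar_chart` are assumptions made for illustration and are not part of the original code; the small sample dictionary uses three of the counts from the previous post. matplotlib is imported inside the drawing function so the data preparation can be tried without it.

```python
def bar_data(counts):
    """Return (positions, heights, labels) with bars ordered from most to fewest votes."""
    # Sort the (variety, votes) pairs by votes, highest first
    ordered = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)
    labels = [variety for variety, _ in ordered]
    heights = [votes for _, votes in ordered]
    positions = list(range(len(ordered)))
    return positions, heights, labels


def draw_bar_chart(counts):
    """Draw a labelled bar chart of the vote counts (needs matplotlib installed)."""
    import matplotlib.pyplot as plt

    positions, heights, labels = bar_data(counts)
    plt.bar(positions, heights, align="center")
    plt.ylabel("Votes")
    plt.xticks(positions, labels, rotation=90)
    plt.tight_layout()  # keep the rotated labels from being cut off
    plt.show()


# Small sample for illustration; in the post, counts comes from the survey file
sample_counts = {"Daikon": 63, "Red king": 56, "Champion": 76}
positions, heights, labels = bar_data(sample_counts)
```

Calling `draw_bar_chart(sample_counts)` would then open the chart window with Champion as the first bar.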
Download the source code here.
Sunday, August 9, 2015
Posted by Netbloggy
Solving Voting Problem from OpenTechSchool with Python [Hands-on]
The best way to learn any programming language is to solve problems with it. While programming documentation can teach you syntax, you only get close to a language when you get hands-on with code. So let's get started with the first problem in this Python journey: the Voting Problem.
Problem:
We have 300 lines of survey data in the file radishsurvey.txt. Each line consists of a name, a hyphen, then a radish variety and so on. Our objective is to find answers for the following:
- What's the most popular radish variety?
- What are the least popular?
- Did anyone vote twice?
Takeaways:
In our attempt to solve this problem, we'll come across the following concepts of python:
- Reading & Cleaning a text file
- Basic String Operations
- Traversing a Dictionary & List
- Iterative Looping and Conditional Looping
- Defining and Calling a function
Approach:
We can read the file radishsurvey.txt and put its contents in a file object to traverse it. Like this:
radish_contents = open("radishsurvey.txt")
for line in radish_contents:
    pass  # process each line here
Instead, we can call open() directly in the loop to save a step. Before that, we create an empty dictionary counts to store the vote counts and an empty list voted to track duplicate voters. Comments in the code below explain the purpose of every step.
counts = {}   # radish variety -> number of votes
voted = []    # names of people who have already voted

for line in open("radishsurvey.txt"):
    line1 = line.strip()
    # print(line)  # uncomment to see how the line would be printed without strip()
    name, vote = line1.split(" - ")
    vote = vote.strip().capitalize()  # just to put the 'vote' values in a consistent case
    vote = vote.replace("  ", " ")    # data cleaning: replacing two white spaces with one
    if name in voted:
        print("%s has already voted" % name)  # printing the name of the voter who voted again
        continue  # skip their vote and process the next line
    voted.append(name)  # for first-time voters: adding their name to the voted list
    if vote not in counts:
        # First vote for this variety: make a new entry in the dictionary and set its value to 1
        counts[vote] = 1
    else:
        # Increment the vote count, as the entry is already present in the dictionary
        counts[vote] = counts[vote] + 1

for item in counts:
    print("%s %s" % (item, counts[item]))

We have successfully built a dictionary with the radish variety as the key and the vote count as its value, and we've also handled the most important test case: printing the duplicate voters and disregarding their votes.
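The same tally can be sketched in Python 3 with a set for the voters (membership checks on a set stay fast as it grows, unlike a list) and collections.Counter for the counts. The function name `count_votes` and the sample lines are made up for illustration; the real input would be the lines of radishsurvey.txt.

```python
from collections import Counter


def count_votes(lines):
    """Tally one vote per person; return (counts, names of repeat voters)."""
    counts = Counter()   # missing keys start at 0, so no "if vote not in counts" check
    voted = set()        # names seen so far
    duplicates = []      # names that tried to vote again
    for line in lines:
        name, vote = line.strip().split(" - ")
        # Collapse repeated spaces, then normalise the case: "red  king" -> "Red king"
        vote = " ".join(vote.split()).capitalize()
        if name in voted:
            duplicates.append(name)
            continue     # disregard the repeat vote
        voted.add(name)
        counts[vote] += 1
    return counts, duplicates


# Hypothetical sample lines in the "name - variety" format described above
sample = [
    "Ann Lee - Champion\n",
    "Bob Ray - red  king\n",
    "Ann Lee - Daikon\n",
    "Cy Dow - CHAMPION\n",
]
counts, duplicates = count_votes(sample)
```

Passing `open("radishsurvey.txt")` instead of `sample` would run the same logic over the survey file.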
While this code gives us all the details we wanted, we still need to go through every line manually to find the most voted and least voted varieties. And we programmers, who are meant to be lazy, would want the program itself to tell us that too. Here's the code:
def find_winner(counts):
    winner = ""
    pre_vote = 0
    for vote in counts:
        if counts[vote] >= pre_vote:
            winner = vote
            pre_vote = counts[vote]
    return winner, pre_vote

def find_loser(counts):
    loser, pre_vote = find_winner(counts)  # calling a function inside another function
    for vote in counts:
        if counts[vote] < pre_vote:
            loser = vote
            pre_vote = counts[vote]
    return loser, pre_vote
Here's the output after executing the code in python 2:
Phoebe Barwell has already voted
Procopio Zito has already voted
White icicle 64
Snow belle 63
Champion 76
Cherry belle 58
French breakfast 72
Daikon 63
Bunny tail 72
Sicily giant 57
Red king 56
Plum purple 56
April cross 72
And the winner is Mr. Champion with 76 votes
Sorry, the loser is Mr. Red king with 56 votes
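Note that the output above lists Red king and Plum purple with 56 votes each, while only one of them is reported as the loser. Since the problem asks which varieties are the least popular, a tie is worth handling. A minimal Python 3 sketch, assuming a helper named `varieties_with` (not in the original code), finds the extreme vote count with the built-in max() or min() and then collects every variety that matches it. The dictionary is copied from the output above.

```python
def varieties_with(counts, pick):
    """Return (sorted varieties, votes) for the vote count chosen by pick (max or min)."""
    target = pick(counts.values())
    names = sorted(variety for variety, votes in counts.items() if votes == target)
    return names, target


# Vote counts as printed in the output of this post
counts = {
    "White icicle": 64, "Snow belle": 63, "Champion": 76, "Cherry belle": 58,
    "French breakfast": 72, "Daikon": 63, "Bunny tail": 72, "Sicily giant": 57,
    "Red king": 56, "Plum purple": 56, "April cross": 72,
}
winners, top_votes = varieties_with(counts, max)
losers, low_votes = varieties_with(counts, min)
print(winners, top_votes)  # ['Champion'] 76
print(losers, low_votes)   # ['Plum purple', 'Red king'] 56
```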
Our objectives are met, and I hope you've learnt something from this blog post.
Download the entire python code here.
How to install Python on your Computer? [Tutorial]
Data science is all about making sense of the data that we have. For that purpose, two widely used languages are Python and R. So let's start with Python!
Like other interpreted languages, Python needs an interpreter on your machine to read the code (.py) and understand it. Any text editor will do for writing the .py file, but since Python is an indentation-sensitive language, it's better to use an editor that takes care of indentation and also highlights the built-in keywords. Software that does this job is called an IDE (integrated development environment), and there are many such IDEs for Python.
A typical programmer, being lazier than the average human being, should look for one package that has all of these (an interpreter, an IDE and much more) so that a single click installs everything related to Python on the machine. There is such a package, called "Anaconda".
Whether you are running Windows, Linux or Macintosh, jump in here and download the appropriate package!
Double-click the downloaded Anaconda setup and proceed with installation. You are done once the installation is finished.
A few things to note:
1. Anaconda comes with a large set of Python packages that you will need for data analysis and scientific calculations.
2. Windows & Linux users: you don't need to set the environment path to access python from any directory, but Mac users might need to set the path (export PATH=~/anaconda/bin:$PATH).
3. Anaconda has a long list of FAQs, so check them if you have any trouble getting this to work.
4. After installation, open your command prompt or terminal and type spyder. If the Spyder IDE opens, the installation is complete.
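Beyond launching Spyder, a quick way to confirm the setup is to ask the interpreter which version is running and whether a few packages can be found. This is a minimal sketch for Python 3; the function name `check_packages` and the choice of packages to look for are assumptions for illustration.

```python
import importlib.util
import sys


def check_packages(names):
    """Map each top-level package name to True if Python can locate it."""
    # find_spec() returns None when a top-level module cannot be found
    return {name: importlib.util.find_spec(name) is not None for name in names}


print("Python", sys.version.split()[0])
for name, found in check_packages(["numpy", "pandas", "matplotlib"]).items():
    print(name, "found" if found else "MISSING")
```

Saving this as a .py file and running it with python from the terminal should report all three packages as found on a working Anaconda installation.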
Happy pythoning!!!
Hello World!
We, as average internet users, consume a lot of data from the web, but the data that we (knowingly) publish online is relatively NEGLIGIBLE, apart from our daily Facebook posts. But imagine a universe where everyone is like us, only consuming data and not caring about contributing online: at some point there wouldn't be any new data left for us to consume.
Hence, to deviate from the crowd and become an online contributor, here's an attempt (pushed by my professor from Praxis Business School).