Archive for October 2015

MxxN = M^M^M^....M for N times - Interview Question (DP Analytics) [Solved in Python]

DP Analytics, a well-known analytics company, came up with an interestingly logical written interview test.

Question:

2xx2 = 4
2xx3 = 16
2xx4 = 65536 = 2^16
3xx2 = 27
3xx3 = 7625597484987 = 3^27
4x3 = 64
2xxx3 = 65536
2xxxx3 = 2xxx4

In general,
MxxN = M^M^M^....^M for N times
MxxxN = MxxMxxMxx.....xxM (N times)
and so on

Let the symbol 'x' be represented by a computer function 'T'. The inputs to the
'T' function are (left number, number of x, right number). So,

T(2,2,2)=4
T(2,2,3)=16
T(2,4,3)=T(2,3,4)


Solution:
 
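def T(m, t, n):
    # One x is ordinary exponentiation: m x n = m ** n.
    if t == 1:
        return m ** n
    # Shortcut for more than two x's: drop to xx and raise the height by
    # (t - 2). This matches 2xxx3 = 2xx4 but is not the general rule.
    n = n + (t - 2)
    # Two x's: a power tower of n copies of m, built from the top down.
    result = m
    while n > 1:
        result = m ** result
        n = n - 1
    return result

print(T(3, 2, 3))
print(T(2, 2, 3))
print(T(4, 1, 3))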
 
Understanding: For t = 2, M must be raised to the power of itself N-1 times (e.g. when N is 2 the result is M ^ M, and when N is 3 it is M ^ M ^ M). The loop is written only for xx. When there are more than two x's, the count is brought down to 2 and N is increased by the same amount, a shortcut that reproduces the example 2xxx3 = 2xx4 = 65536 but does not hold for every M and N. When there is a single x, the result is simply M ^ N. Happy Coding!
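
The general rule from the question (each extra x expands into a chain of the operator with one x fewer) can also be followed literally. Below is a minimal recursive sketch. The name T_general is hypothetical and not part of the original solution, and the values grow so fast that only tiny inputs will ever finish.

```python
def T_general(m, t, n):
    """Evaluate m x...x n (t x's) by following the definition in the question.

    Hypothetical helper, not the original solution. One x is a plain power;
    t x's expand into a chain of n copies of m joined by (t - 1) x's,
    evaluated from the right. Assumes n >= 1.
    """
    if t == 1:
        return m ** n
    result = m
    for _ in range(n - 1):
        # Each step puts one more m on the left of the chain.
        result = T_general(m, t - 1, result)
    return result


print(T_general(2, 2, 3))   # 2 ^ 2 ^ 2
print(T_general(3, 2, 3))   # 3 ^ 3 ^ 3
print(T_general(2, 3, 3))   # 2xx2xx2 = 2xx4
```

Because every level calls the level below it, the recursion depth is bounded by the number of x's, but the loop count at each level is the (huge) value produced so far, which is why anything beyond the examples in the question is out of reach.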
Tuesday, October 13, 2015
Posted by Netbloggy

Learning SAS: How to draw simple BAR charts in SAS?

No data analysis is complete without visualization, so it's time for us to learn how to draw simple bar charts in SAS. We've used PROC SGPLOT to illustrate this. SGPLOT works fine in SAS University Edition too.

Problem 1:

Using the SAS data set Bicycles, produce two vertical bar charts showing frequencies for Country and Model.

Solution:


*Data set BICYCLES;
 
data A15001.A01_bicycles;
    input Country  & $25. Model    & $14. Manuf    : $10. Units    :   5. 
        UnitCost :  comma8.;
    TotalSales=(Units * UnitCost) / 1000;
    format UnitCost TotalSales dollar10.;
    label TotalSales="Sales in Thousands" Manuf="Manufacturer";
    datalines;
USA  Road Bike  Trek 5000 $2,200
USA  Road Bike  Cannondale 2000 $2,100
USA  Mountain Bike  Trek 6000 $1,200
USA  Mountain Bike  Cannondale 4000 $2,700
USA  Hybrid  Trek 4500 $650
France  Road Bike  Trek 3400 $2,500
France  Road Bike  Cannondale 900 $3,700
France  Mountain Bike  Trek 5600 $1,300
France  Mountain Bike  Cannondale  800 $1,899
France  Hybrid  Trek 1100 $540
United Kingdom  Road Bike  Trek 2444 $2,100
United Kingdom  Road Bike  Cannondale  1200 $2,123
United Kingdom  Hybrid  Trek 800 $490
United Kingdom  Hybrid  Cannondale 500 $880
United Kingdom  Mountain Bike  Trek 1211 $1,121
Italy  Hybrid  Trek 700 $690
Italy  Road Bike  Trek 4500  $2,890
Italy  Mountain Bike  Trek 3400  $1,877
;
 
title 'Frequency of Countries';
proc sgplot data=A15001.A01_bicycles;
    vbar Country;
    xaxis label='Names of Country';
run;
 
title 'Frequency of Models';
proc sgplot data=A15001.A01_bicycles;
    vbar Model;
    yaxis label='Count';
run;

Output:



Learning:

  • How to draw simple graphs in SAS using SGPLOT
  • How to use different options like XAXIS LABEL while creating BAR Graphs


Friday, October 2, 2015
Posted by Netbloggy

Learning SAS: How to create a SAS report in .html file using ODS?

In our previous blog posts, we discussed different ways of tweaking how our output is displayed. Now it's time for something more advanced. We're in the age of the Web, and sometimes our company needs the output in HTML format, either to publish online or to post on the company's intranet portal. In either case, it's not efficient to hand the output to a web designer and ask him/her to create an .html equivalent of our summary. Instead, SAS has a great feature called the Output Delivery System (ODS for short) that can write our report straight to an .html file.

Problem 1:

Send the output to an HTML file. Issue the appropriate commands to prevent SAS from creating a listing file. (19:1)

Solution:


ods listing close;
ods html file='/folders/myfolders/iSAS/Assignment/2/college_report.html'; 

title "Sending Output to an HTML File";
 
proc print data=A15001.A01_college(obs=8) noobs;
run;
 
proc means data=A15001.A01_college n mean maxdec=2;
    var GPA ClassRank;
run;
 
ods html close;
ods listing;

Output:

Problem 2:

Run the same procedures as shown in Problem 1, except use the JOURNAL (or FANCYPRINTER) style instead of the default style. (19:3)

Solution:


ods listing close;
ods html file='/folders/myfolders/iSAS/Assignment/2/college_report.html' 
    style=FancyPrinter;
 
title "Sending Output to an HTML File";
 
proc print data=A15001.A01_college(obs=8) noobs;
run;
 
proc means data=A15001.A01_college n mean maxdec=2;
    var GPA ClassRank;
run;
 
ods html close;
ods listing;

Output:



Learning:

  • How to use ODS to output an HTML file of the report
  • How to use different styles while creating the HTML report file using ODS


Posted by Netbloggy

Learning SAS: How to sort variables while displaying the frequency output?

Sometimes it's not desirable to display the output of PROC FREQ just as it is. As analysts, our organization will sometimes require us to tweak it a bit.

Problem:

Using the SAS data set Blood, produce a table of frequencies for BloodType, in
frequency order. (17:7)

Solution:

/*Using the SAS data set Blood, produce a table of frequencies for BloodType, in
frequency order.*/
 
 
TITLE 'FREQUENCY OF BLOODTYPE ORDERED BY INPUT DATA';
PROC FREQ DATA=A15001.A01_BLOOD ORDER=DATA;
 TABLE BLOODTYPE;
RUN; 
 
TITLE 'FREQUENCY OF BLOODTYPE ORDERED BY FREQUENCY';
PROC FREQ DATA=A15001.A01_BLOOD ORDER=FREQ;
 TABLE BLOODTYPE;
RUN; 
TITLE;
 

Output:


Learning:

  • How to change the order in which values are displayed in the PROC FREQ output
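
The difference between the two orderings can be pictured outside SAS as well. The sketch below is plain Python, not SAS, and the blood types are made up for illustration (they are not from the Blood data set). It assumes that ORDER=DATA lists levels in the order they first appear in the data and ORDER=FREQ lists them by descending count.

```python
from collections import Counter

# Hypothetical blood types in the order they appear in the data.
blood_types = ["A", "O", "B", "O", "AB", "O", "A", "O", "B", "A"]

counts = Counter(blood_types)

# Like ORDER=DATA: levels in order of first appearance in the data.
data_order = list(counts.items())

# Like ORDER=FREQ: levels in descending order of frequency.
freq_order = counts.most_common()

print(data_order)
print(freq_order)
```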


Learning SAS: How to handle missing values while counting frequencies

Missing values are very common in any raw dataset, and it's important for an analyst to know how to handle them. Especially while counting frequencies, missing values can give misleading figures.

Problem:

Using the data set Blood, produce frequencies for the variable Chol (cholesterol). Use a format to group the frequencies into three groups: low to 200 (normal), 201 and higher (high), and missing. Run PROC FREQ twice, once using the MISSING option, and once without. Compare the percentages in both listings.

Solution:


 
PROC FORMAT;
 VALUE CHOLGRP 
  LOW-200 = 'NORMAL'
  201-HIGH = 'HIGH'
  OTHER = 'OTHERS';
RUN;
 
TITLE 'FREQUENCY OF CHOLESTEROL GROUPED WITHOUT MISSING';
PROC FREQ DATA=A15001.A01_BLOOD;
 TABLE CHOL; 
 FORMAT CHOL CHOLGRP.;
RUN; 
 
 
PROC FORMAT;
 VALUE CHOLGRP 
  LOW-200 = 'NORMAL'
  201-HIGH = 'HIGH'
  . = 'MISSING'
  OTHER = 'OTHERS';
RUN;
 
TITLE 'FREQUENCY OF CHOLESTEROL GROUPED INCLUDING MISSING';
PROC FREQ DATA=A15001.A01_BLOOD;
 TABLE CHOL /MISSING; 
 FORMAT CHOL CHOLGRP.;
RUN; 
TITLE;


Output:


Learning:

  • How to group variables while displaying the frequency using PROC FREQ
  • How to use user-defined formats in PROC FREQ
  • How to handle missing values in PROC FREQ
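
To see why the percentages differ between the two listings, the arithmetic can be checked by hand. The sketch below is plain Python, not SAS, and the cholesterol values are made up for illustration (they are not from the Blood data set). None stands in for a SAS missing value. Without MISSING, missing values are left out of the denominator; with MISSING, they are counted as a level of their own.

```python
from collections import Counter

# Hypothetical integer cholesterol readings; None marks a missing value.
chol = [180, 195, 210, 250, None, 199, None, 230, 170, 205]


def group(value):
    """Roughly mimic CHOLGRP for integers: missing, up to 200, 201 and up."""
    if value is None:
        return "MISSING"
    return "NORMAL" if value <= 200 else "HIGH"


def percentages(values, include_missing):
    """Return {group: percent}, optionally counting missing values."""
    labels = [group(v) for v in values]
    if not include_missing:
        labels = [label for label in labels if label != "MISSING"]
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


print(percentages(chol, include_missing=False))
print(percentages(chol, include_missing=True))
```

With these made-up numbers, NORMAL and HIGH each take half of the 8 non-missing readings in the first listing, but only 4 out of 10 in the second, where the 2 missing readings form their own group.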


Posted by Netbloggy

Learning SAS: Counting Frequencies - PROC FREQ

Just as PROC MEANS is helpful for summarizing numeric variables, PROC FREQ can be used to count frequencies of both character and numeric variables in one-way, two-way (crosstabs/contingency tables), and three-way tables.

Problem 1:

Using the SAS data set Blood, generate one-way frequencies for the variables Gender, BloodType, and AgeGroup. Use the appropriate options to omit the cumulative statistics and percentages. (17:1)

Input:





Solution:

Title 'Frequency of Gender BloodType Age group without Cum.Freq. and %';
PROC FREQ DATA=A15001.A01_BLOOD; 
 TABLE GENDER BLOODTYPE AGEGROUP /NOCUM NOPERCENT;
RUN; 
Title;

Output:

Learning:

  • How to use PROC FREQ to build a one-way frequency table
  • Different Options of PROC FREQ like NOCUM and NOPERCENT



Posted by Netbloggy

Learning SAS: How to create Summary Dataset using PROC MEANS

As we mentioned in the previous post, PROC MEANS is a handy way to create a new summary dataset that can be used in other DATA steps or procedures. Here we'll show how to create a summary dataset using PROC MEANS.

Problem:

Using the SAS data set College, create a summary data set (call it Class_Summary) containing the n, mean, and median of ClassRank and GPA for each value of SchoolSize. Use a CLASS statement and be sure that the summary data set only contains statistics for each level of SchoolSize. Use the AUTONAME option to name the variables in this data set.

Solution:


 
/* create summary dataset from proc means */
/* NWAY to display only the Schoolsize level type */
proc means data=A15001.A01_College NOPRINT NWAY;
    CLASS SchoolSize;
    var ClassRank GPA;
    output out=A15001.A01_Class_Summary N= Mean= Median= / autoname;
run;
 
Title 'Grouped Summary Statistics of ClassRank & GPA by Schoolsize';
 
proc print data=A15001.A01_Class_summary;
run;
 
Title;

Output:


Learning:
  • How to efficiently use PROC MEANS to create a summary dataset
  • What's the purpose of NWAY option in PROC MEANS
  • How to automatically name the variables in the newly created Summary dataset
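
As a rough picture of what the summary data set holds, the sketch below builds the same kind of table in plain Python (not SAS). The records are invented for illustration (they are not the College data set), and the Variable_Statistic column names imitate the style of names that AUTONAME generates. One row per SchoolSize and no overall row corresponds to what NWAY keeps.

```python
from statistics import mean, median

# Hypothetical records: (SchoolSize, ClassRank, GPA); None is a missing value.
rows = [
    ("S", 60, 3.2), ("S", 80, 3.6), ("M", 70, None),
    ("M", 90, 3.8), ("L", 55, 2.9), ("L", 75, 3.5), ("L", 95, 3.9),
]


def summarize(records):
    """One output row per SchoolSize, like CLASS with NWAY (no overall row)."""
    summary = {}
    for size in sorted({r[0] for r in records}):
        row = {}
        for name, col in (("ClassRank", 1), ("GPA", 2)):
            # Statistics use only the non-missing values of each variable.
            vals = [r[col] for r in records if r[0] == size and r[col] is not None]
            row[name + "_N"] = len(vals)
            row[name + "_Mean"] = mean(vals)
            row[name + "_Median"] = median(vals)
        summary[size] = row
    return summary


for size, stats in summarize(rows).items():
    print(size, stats)
```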



Learning SAS: Summarizing Data (PROC MEANS)

We've seen the basic data manipulation options in SAS in our previous blog post, and it's time for us to understand how to report the processed data. The first thing that comes up in this journey is PROC MEANS.

People think of PROC MEANS just as a way to calculate summary statistics of numeric values, but this procedure is much more versatile and can be used to create summary data sets that can then be analyzed with more DATA or PROC steps.

Problem 1:

Using the SAS data set College, compute the mean, median, minimum, and maximum and the number of both missing and non-missing values for the variables ClassRank and GPA. Report the statistics to two decimal places. (Ref: Learning SAS by Example, Ron Cody, Chapter 16, Problem 1)

Input Data:


Solution:

 
Title 'Summary Statistics of ClassRank & GPA with two decimal pts';
 
proc means data=A15001.A01_College Mean Median Min Max NMiss N Maxdec=2;
    var ClassRank GPA;
run;
 
Title;

Output:

 

Problem 2:

Repeat Problem 1, except compute the desired statistics for each combination of Gender and SchoolSize. Do this twice, once using a BY statement, and once using a CLASS statement.

Solution:

 
/* Grouping with CLASS */
Title 'Grouped Summary Statistics of ClassRank & GPA with two decimal pts';
 
proc means data=A15001.A01_College Mean Median Min Max NMiss N Maxdec=2;
    Class Gender SchoolSize;
    var ClassRank GPA;
run;
 
Title;
 
/* Grouping with BY */
proc sort data=A15001.A01_College out=sorted_college;
    by Gender Schoolsize;
run;
 
Title 'Grouped Summary Statistics of ClassRank & GPA with two decimal pts';
 
proc means data=sorted_college Mean Median Min Max NMiss N Maxdec=2;
    by Gender SchoolSize;
    var ClassRank GPA;
run;
 
Title;

Output:




Learning:


  • Different ways of using PROC MEANS
  • Various options of PROC MEANS like MAXDEC
  • Grouping PROC MEANS Summary statistics


Copyright © nulldata