1. Functions

Q1. Create a function that takes in a name as a string argument, and prints out “Hello name”.

hello_you <- function(name){
  print(paste("Hello", name))
}

hello_you('Jedi')
## [1] "Hello Jedi"
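
As a small sketch of how paste() builds the greeting, the variant below returns the string instead of printing it. The name make_greeting is hypothetical and not part of the exercise.

```r
# Hypothetical variant of hello_you() that returns the greeting
make_greeting <- function(name) {
  # paste() joins its arguments with a single space by default
  paste("Hello", name)
}

single  <- make_greeting("Jedi")             # one greeting
several <- make_greeting(c("Jedi", "Sith"))  # paste() is vectorised over name
no_gap  <- paste0("Hello", "Jedi")           # paste0() joins with no separator

print(single)
print(several)
print(no_gap)
```

Returning the value rather than printing it makes the function easier to reuse, since the caller decides whether to print, store, or combine the result.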

Q2. Create a function that returns the quotient of two numbers, and call it with the integers 10 and 5.

divide <- function(num1, num2){
  return(num1 / num2)
}

divide(10, 5)
## [1] 2
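
The `/` operator performs ordinary (floating-point) division. The sketch below contrasts it with R's integer-division and remainder operators; the function name divide is used here purely for illustration.

```r
# Illustrative division helper (the name is an assumption for this sketch)
divide <- function(num1, num2) {
  num1 / num2
}

exact     <- divide(10, 5)  # 2
fraction  <- divide(7, 2)   # 3.5, "/" never truncates
whole     <- 7 %/% 2        # 3, integer division
remainder <- 7 %% 2         # 1, remainder (modulo)
by_zero   <- divide(1, 0)   # Inf, R does not raise an error here

print(c(exact, fraction, whole, remainder, by_zero))
```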

2. Matrix

Q1. Create a 2 x 3 matrix with values from 1 to 6, populated row by row.

mat1 <- matrix(1:6, nrow = 2, byrow = TRUE)
mat1
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
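
To see what byrow = TRUE changes, this short sketch builds the same values both ways. The variable names by_row and by_col are illustrative only.

```r
# Same six values, two fill orders
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)  # fills across each row first
by_col <- matrix(1:6, nrow = 2)                # default: fills down each column

print(by_row)
print(by_col)

first_row_by_row <- by_row[1, ]  # 1 2 3
first_row_by_col <- by_col[1, ]  # 1 3 5
```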

Q3. Confirm that mat1 is a matrix using is.matrix().

is.matrix(mat1)
## [1] TRUE

Q4. Create a 5 by 5 matrix consisting of the numbers 1-25 and assign it to the name mat2.

The top row should be the numbers 1-5.

mat2 <- matrix(1:25, nrow = 5, byrow = TRUE)
mat2
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    2    3    4    5
## [2,]    6    7    8    9   10
## [3,]   11   12   13   14   15
## [4,]   16   17   18   19   20
## [5,]   21   22   23   24   25

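Q5. Using index notation, grab the 2 x 2 sub-section of mat2 in rows 2-3 and columns 2-3.

Hint: mat2[1, 2:3] returns 2 3 (R indices start at 1, so the 0 in mat2[0:1, 2:3] is silently ignored).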

mat2[2:3, 2:3]
##      [,1] [,2]
## [1,]    7    8
## [2,]   12   13

Q6. Using index notation, grab the 2 x 2 sub-section in the bottom-right corner of mat2 (rows 4-5, columns 4-5).

mat2[4:5, 4:5]
##      [,1] [,2]
## [1,]   19   20
## [2,]   24   25
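
Both sub-sections rely on R's 1-based indexing. The sketch below shows a few related behaviours; the drop = FALSE and negative-index forms are additions for illustration and are not required by the exercise.

```r
mat2 <- matrix(1:25, nrow = 5, byrow = TRUE)

# A zero index selects nothing, so 0:1 behaves like 1 here
from_hint <- mat2[0:1, 2:3]
same_row  <- mat2[1, 2:3]

# Selecting a single row simplifies to a plain vector unless drop = FALSE is given
kept_matrix <- mat2[1, 2:3, drop = FALSE]

# Negative indices exclude positions: removing 1:3 leaves rows and columns 4-5
corner <- mat2[-(1:3), -(1:3)]

print(from_hint)
print(kept_matrix)
print(corner)
```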

Q7. What is the sum of all the elements in mat2?

sum(mat2)
## [1] 325
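
sum() collapses the whole matrix into one number. As a sketch of how that total breaks down, rowSums() and colSums() give the per-row and per-column totals, which add up to the same value.

```r
mat2 <- matrix(1:25, nrow = 5, byrow = TRUE)

total      <- sum(mat2)      # all 25 elements
row_totals <- rowSums(mat2)  # one total per row
col_totals <- colSums(mat2)  # one total per column

print(total)
print(row_totals)
print(col_totals)
```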

Q8. Multiply each element of mat2 by 2

mat2 * 2
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    2    4    6    8   10
## [2,]   12   14   16   18   20
## [3,]   22   24   26   28   30
## [4,]   32   34   36   38   40
## [5,]   42   44   46   48   50
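
The `*` operator works element by element, which is why mat2 * 2 doubles every entry. The sketch below, using a small illustrative matrix m, contrasts this with the matrix product operator %*%.

```r
m <- matrix(1:4, nrow = 2, byrow = TRUE)  # rows: (1, 2) and (3, 4)

doubled     <- m * 2    # every element times 2
elementwise <- m * m    # each element squared
matrix_prod <- m %*% m  # linear-algebra matrix product

print(doubled)
print(elementwise)
print(matrix_prod)
```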

3. Data Frame

Q1. Load the data set “statex77.csv” and assign it to df.

df <- read.csv("statex77.csv", header = TRUE)
df
##                 X Population Income Illiteracy Life.Exp Murder HS.Grad
## 1         Alabama       3615   3624        2.1    69.05   15.1    41.3
## 2          Alaska        365   6315        1.5    69.31   11.3    66.7
## 3         Arizona       2212   4530        1.8    70.55    7.8    58.1
## 4        Arkansas       2110   3378        1.9    70.66   10.1    39.9
## 5      California      21198   5114        1.1    71.71   10.3    62.6
## 6        Colorado       2541   4884        0.7    72.06    6.8    63.9
## 7     Connecticut       3100   5348        1.1    72.48    3.1    56.0
## 8        Delaware        579   4809        0.9    70.06    6.2    54.6
## 9         Florida       8277   4815        1.3    70.66   10.7    52.6
## 10        Georgia       4931   4091        2.0    68.54   13.9    40.6
## 11         Hawaii        868   4963        1.9    73.60    6.2    61.9
## 12          Idaho        813   4119        0.6    71.87    5.3    59.5
## 13       Illinois      11197   5107        0.9    70.14   10.3    52.6
## 14        Indiana       5313   4458        0.7    70.88    7.1    52.9
## 15           Iowa       2861   4628        0.5    72.56    2.3    59.0
## 16         Kansas       2280   4669        0.6    72.58    4.5    59.9
## 17       Kentucky       3387   3712        1.6    70.10   10.6    38.5
## 18      Louisiana       3806   3545        2.8    68.76   13.2    42.2
## 19          Maine       1058   3694        0.7    70.39    2.7    54.7
## 20       Maryland       4122   5299        0.9    70.22    8.5    52.3
## 21  Massachusetts       5814   4755        1.1    71.83    3.3    58.5
## 22       Michigan       9111   4751        0.9    70.63   11.1    52.8
## 23      Minnesota       3921   4675        0.6    72.96    2.3    57.6
## 24    Mississippi       2341   3098        2.4    68.09   12.5    41.0
## 25       Missouri       4767   4254        0.8    70.69    9.3    48.8
## 26        Montana        746   4347        0.6    70.56    5.0    59.2
## 27       Nebraska       1544   4508        0.6    72.60    2.9    59.3
## 28         Nevada        590   5149        0.5    69.03   11.5    65.2
## 29  New Hampshire        812   4281        0.7    71.23    3.3    57.6
## 30     New Jersey       7333   5237        1.1    70.93    5.2    52.5
## 31     New Mexico       1144   3601        2.2    70.32    9.7    55.2
## 32       New York      18076   4903        1.4    70.55   10.9    52.7
## 33 North Carolina       5441   3875        1.8    69.21   11.1    38.5
## 34   North Dakota        637   5087        0.8    72.78    1.4    50.3
## 35           Ohio      10735   4561        0.8    70.82    7.4    53.2
## 36       Oklahoma       2715   3983        1.1    71.42    6.4    51.6
## 37         Oregon       2284   4660        0.6    72.13    4.2    60.0
## 38   Pennsylvania      11860   4449        1.0    70.43    6.1    50.2
## 39   Rhode Island        931   4558        1.3    71.90    2.4    46.4
## 40 South Carolina       2816   3635        2.3    67.96   11.6    37.8
## 41   South Dakota        681   4167        0.5    72.08    1.7    53.3
## 42      Tennessee       4173   3821        1.7    70.11   11.0    41.8
## 43          Texas      12237   4188        2.2    70.90   12.2    47.4
## 44           Utah       1203   4022        0.6    72.90    4.5    67.3
## 45        Vermont        472   3907        0.6    71.64    5.5    57.1
## 46       Virginia       4981   4701        1.4    70.08    9.5    47.8
## 47     Washington       3559   4864        0.6    71.72    4.3    63.5
## 48  West Virginia       1799   3617        1.4    69.48    6.7    41.6
## 49      Wisconsin       4589   4468        0.7    72.48    3.0    54.5
## 50        Wyoming        376   4566        0.6    70.29    6.9    62.9
##    Frost   Area
## 1     20  50708
## 2    152 566432
## 3     15 113417
## 4     65  51945
## 5     20 156361
## 6    166 103766
## 7    139   4862
## 8    103   1982
## 9     11  54090
## 10    60  58073
## 11     0   6425
## 12   126  82677
## 13   127  55748
## 14   122  36097
## 15   140  55941
## 16   114  81787
## 17    95  39650
## 18    12  44930
## 19   161  30920
## 20   101   9891
## 21   103   7826
## 22   125  56817
## 23   160  79289
## 24    50  47296
## 25   108  68995
## 26   155 145587
## 27   139  76483
## 28   188 109889
## 29   174   9027
## 30   115   7521
## 31   120 121412
## 32    82  47831
## 33    80  48798
## 34   186  69273
## 35   124  40975
## 36    82  68782
## 37    44  96184
## 38   126  44966
## 39   127   1049
## 40    65  30225
## 41   172  75955
## 42    70  41328
## 43    35 262134
## 44   137  82096
## 45   168   9267
## 46    85  39780
## 47    32  66570
## 48   100  24070
## 49   149  54464
## 50   173  97203

Q2. Display the first 6 rows of df

head(df)
##            X Population Income Illiteracy Life.Exp Murder HS.Grad Frost
## 1    Alabama       3615   3624        2.1    69.05   15.1    41.3    20
## 2     Alaska        365   6315        1.5    69.31   11.3    66.7   152
## 3    Arizona       2212   4530        1.8    70.55    7.8    58.1    15
## 4   Arkansas       2110   3378        1.9    70.66   10.1    39.9    65
## 5 California      21198   5114        1.1    71.71   10.3    62.6    20
## 6   Colorado       2541   4884        0.7    72.06    6.8    63.9   166
##     Area
## 1  50708
## 2 566432
## 3 113417
## 4  51945
## 5 156361
## 6 103766
names(df)
## [1] "X"          "Population" "Income"     "Illiteracy" "Life.Exp"  
## [6] "Murder"     "HS.Grad"    "Frost"      "Area"
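
The first column is named X, which is what read.csv() typically produces when the header cell of that column is empty, as in files written by write.csv() with row names. Whether statex77.csv was created that way is an assumption; the sketch below reproduces the effect with a tiny stand-in file and shows the row.names = 1 alternative.

```r
# Tiny stand-in for the real file (values copied from the first rows above)
small <- data.frame(
  Population = c(3615, 365, 2212),
  Income     = c(3624, 6315, 4530),
  row.names  = c("Alabama", "Alaska", "Arizona")
)

tmp <- tempfile(fileext = ".csv")
write.csv(small, tmp)  # row names are written under an empty header cell

with_x     <- read.csv(tmp, header = TRUE)                 # empty header becomes "X"
with_names <- read.csv(tmp, header = TRUE, row.names = 1)  # first column used as row names

print(names(with_x))
print(rownames(with_names))

unlink(tmp)  # remove the temporary file
```

With row.names = 1 the state names become row labels, so with_names["Alaska", ] selects a state directly.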

Q3. What is the average life expectancy across all the states?

mean(df[, "Life.Exp"])
## [1] 70.8786
mean(df$Life.Exp)
## [1] 70.8786
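
Both forms extract the same numeric vector before averaging it. One caveat worth sketching: mean() returns NA if any value is missing, unless na.rm = TRUE is supplied. The vector below reuses the first six Life.Exp values, and the added NA is purely hypothetical.

```r
# First six Life.Exp values from the table above
life_exp <- c(69.05, 69.31, 70.55, 70.66, 71.71, 72.06)

plain_mean <- mean(life_exp)

# Hypothetical missing value appended for illustration
with_gap     <- c(life_exp, NA)
na_mean      <- mean(with_gap)                # NA: the missing value propagates
cleaned_mean <- mean(with_gap, na.rm = TRUE)  # NA dropped before averaging

print(plain_mean)
print(na_mean)
print(cleaned_mean)
```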

Q4. Select the rows (states) whose Income is higher than 5000.

df[df$Income > 5000, ]
##               X Population Income Illiteracy Life.Exp Murder HS.Grad Frost
## 2        Alaska        365   6315        1.5    69.31   11.3    66.7   152
## 5    California      21198   5114        1.1    71.71   10.3    62.6    20
## 7   Connecticut       3100   5348        1.1    72.48    3.1    56.0   139
## 13     Illinois      11197   5107        0.9    70.14   10.3    52.6   127
## 20     Maryland       4122   5299        0.9    70.22    8.5    52.3   101
## 28       Nevada        590   5149        0.5    69.03   11.5    65.2   188
## 30   New Jersey       7333   5237        1.1    70.93    5.2    52.5   115
## 34 North Dakota        637   5087        0.8    72.78    1.4    50.3   186
##      Area
## 2  566432
## 5  156361
## 7    4862
## 13  55748
## 20   9891
## 28 109889
## 30   7521
## 34  69273
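
The filter works because the comparison produces a logical vector with one TRUE or FALSE per row, and only the TRUE rows are kept. A minimal sketch on the first six states (the data frame name states is illustrative), with subset() shown as an equivalent alternative:

```r
# First six rows of the table above, reduced to two columns
states <- data.frame(
  X      = c("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado"),
  Income = c(3624, 6315, 4530, 3378, 5114, 4884),
  stringsAsFactors = FALSE
)

keep <- states$Income > 5000  # one TRUE/FALSE per row

by_bracket <- states[keep, ]                  # rows where keep is TRUE, all columns
by_subset  <- subset(states, Income > 5000)   # same filter, column names used directly

print(keep)
print(by_bracket)
```

The trailing comma in states[keep, ] matters: it tells R that keep selects rows and that all columns should be returned.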

Q5. Select the columns Population, Income, HS.Grad, and Area.

df[c('Population', 'Income', 'HS.Grad', 'Area')]
##    Population Income HS.Grad   Area
## 1        3615   3624    41.3  50708
## 2         365   6315    66.7 566432
## 3        2212   4530    58.1 113417
## 4        2110   3378    39.9  51945
## 5       21198   5114    62.6 156361
## 6        2541   4884    63.9 103766
## 7        3100   5348    56.0   4862
## 8         579   4809    54.6   1982
## 9        8277   4815    52.6  54090
## 10       4931   4091    40.6  58073
## 11        868   4963    61.9   6425
## 12        813   4119    59.5  82677
## 13      11197   5107    52.6  55748
## 14       5313   4458    52.9  36097
## 15       2861   4628    59.0  55941
## 16       2280   4669    59.9  81787
## 17       3387   3712    38.5  39650
## 18       3806   3545    42.2  44930
## 19       1058   3694    54.7  30920
## 20       4122   5299    52.3   9891
## 21       5814   4755    58.5   7826
## 22       9111   4751    52.8  56817
## 23       3921   4675    57.6  79289
## 24       2341   3098    41.0  47296
## 25       4767   4254    48.8  68995
## 26        746   4347    59.2 145587
## 27       1544   4508    59.3  76483
## 28        590   5149    65.2 109889
## 29        812   4281    57.6   9027
## 30       7333   5237    52.5   7521
## 31       1144   3601    55.2 121412
## 32      18076   4903    52.7  47831
## 33       5441   3875    38.5  48798
## 34        637   5087    50.3  69273
## 35      10735   4561    53.2  40975
## 36       2715   3983    51.6  68782
## 37       2284   4660    60.0  96184
## 38      11860   4449    50.2  44966
## 39        931   4558    46.4   1049
## 40       2816   3635    37.8  30225
## 41        681   4167    53.3  75955
## 42       4173   3821    41.8  41328
## 43      12237   4188    47.4 262134
## 44       1203   4022    67.3  82096
## 45        472   3907    57.1   9267
## 46       4981   4701    47.8  39780
## 47       3559   4864    63.5  66570
## 48       1799   3617    41.6  24070
## 49       4589   4468    54.5  54464
## 50        376   4566    62.9  97203
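
Selecting with a character vector inside single brackets returns a data frame. The sketch below, on a small illustrative data frame named states, shows how the result type differs between the common column-selection forms.

```r
# First three rows of the table above, reduced to two columns
states <- data.frame(
  Population = c(3615, 365, 2212),
  Income     = c(3624, 6315, 4530)
)

as_frame   <- states["Income"]                   # data frame with one column
as_vector  <- states[["Income"]]                 # plain numeric vector
also_vec   <- states[, "Income"]                 # matrix-style indexing simplifies to a vector
kept_frame <- states[, "Income", drop = FALSE]   # drop = FALSE keeps the data frame

print(class(as_frame))
print(class(as_vector))
```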

Q6. Create a new column called Total Income, calculated as Population * Income.

df[, 'Total Income'] <- df$Population * df$Income
df
##                 X Population Income Illiteracy Life.Exp Murder HS.Grad
## 1         Alabama       3615   3624        2.1    69.05   15.1    41.3
## 2          Alaska        365   6315        1.5    69.31   11.3    66.7
## 3         Arizona       2212   4530        1.8    70.55    7.8    58.1
## 4        Arkansas       2110   3378        1.9    70.66   10.1    39.9
## 5      California      21198   5114        1.1    71.71   10.3    62.6
## 6        Colorado       2541   4884        0.7    72.06    6.8    63.9
## 7     Connecticut       3100   5348        1.1    72.48    3.1    56.0
## 8        Delaware        579   4809        0.9    70.06    6.2    54.6
## 9         Florida       8277   4815        1.3    70.66   10.7    52.6
## 10        Georgia       4931   4091        2.0    68.54   13.9    40.6
## 11         Hawaii        868   4963        1.9    73.60    6.2    61.9
## 12          Idaho        813   4119        0.6    71.87    5.3    59.5
## 13       Illinois      11197   5107        0.9    70.14   10.3    52.6
## 14        Indiana       5313   4458        0.7    70.88    7.1    52.9
## 15           Iowa       2861   4628        0.5    72.56    2.3    59.0
## 16         Kansas       2280   4669        0.6    72.58    4.5    59.9
## 17       Kentucky       3387   3712        1.6    70.10   10.6    38.5
## 18      Louisiana       3806   3545        2.8    68.76   13.2    42.2
## 19          Maine       1058   3694        0.7    70.39    2.7    54.7
## 20       Maryland       4122   5299        0.9    70.22    8.5    52.3
## 21  Massachusetts       5814   4755        1.1    71.83    3.3    58.5
## 22       Michigan       9111   4751        0.9    70.63   11.1    52.8
## 23      Minnesota       3921   4675        0.6    72.96    2.3    57.6
## 24    Mississippi       2341   3098        2.4    68.09   12.5    41.0
## 25       Missouri       4767   4254        0.8    70.69    9.3    48.8
## 26        Montana        746   4347        0.6    70.56    5.0    59.2
## 27       Nebraska       1544   4508        0.6    72.60    2.9    59.3
## 28         Nevada        590   5149        0.5    69.03   11.5    65.2
## 29  New Hampshire        812   4281        0.7    71.23    3.3    57.6
## 30     New Jersey       7333   5237        1.1    70.93    5.2    52.5
## 31     New Mexico       1144   3601        2.2    70.32    9.7    55.2
## 32       New York      18076   4903        1.4    70.55   10.9    52.7
## 33 North Carolina       5441   3875        1.8    69.21   11.1    38.5
## 34   North Dakota        637   5087        0.8    72.78    1.4    50.3
## 35           Ohio      10735   4561        0.8    70.82    7.4    53.2
## 36       Oklahoma       2715   3983        1.1    71.42    6.4    51.6
## 37         Oregon       2284   4660        0.6    72.13    4.2    60.0
## 38   Pennsylvania      11860   4449        1.0    70.43    6.1    50.2
## 39   Rhode Island        931   4558        1.3    71.90    2.4    46.4
## 40 South Carolina       2816   3635        2.3    67.96   11.6    37.8
## 41   South Dakota        681   4167        0.5    72.08    1.7    53.3
## 42      Tennessee       4173   3821        1.7    70.11   11.0    41.8
## 43          Texas      12237   4188        2.2    70.90   12.2    47.4
## 44           Utah       1203   4022        0.6    72.90    4.5    67.3
## 45        Vermont        472   3907        0.6    71.64    5.5    57.1
## 46       Virginia       4981   4701        1.4    70.08    9.5    47.8
## 47     Washington       3559   4864        0.6    71.72    4.3    63.5
## 48  West Virginia       1799   3617        1.4    69.48    6.7    41.6
## 49      Wisconsin       4589   4468        0.7    72.48    3.0    54.5
## 50        Wyoming        376   4566        0.6    70.29    6.9    62.9
##    Frost   Area Total Income
## 1     20  50708     13100760
## 2    152 566432      2304975
## 3     15 113417     10020360
## 4     65  51945      7127580
## 5     20 156361    108406572
## 6    166 103766     12410244
## 7    139   4862     16578800
## 8    103   1982      2784411
## 9     11  54090     39853755
## 10    60  58073     20172721
## 11     0   6425      4307884
## 12   126  82677      3348747
## 13   127  55748     57183079
## 14   122  36097     23685354
## 15   140  55941     13240708
## 16   114  81787     10645320
## 17    95  39650     12572544
## 18    12  44930     13492270
## 19   161  30920      3908252
## 20   101   9891     21842478
## 21   103   7826     27645570
## 22   125  56817     43286361
## 23   160  79289     18330675
## 24    50  47296      7252418
## 25   108  68995     20278818
## 26   155 145587      3242862
## 27   139  76483      6960352
## 28   188 109889      3037910
## 29   174   9027      3476172
## 30   115   7521     38402921
## 31   120 121412      4119544
## 32    82  47831     88626628
## 33    80  48798     21083875
## 34   186  69273      3240419
## 35   124  40975     48962335
## 36    82  68782     10813845
## 37    44  96184     10643440
## 38   126  44966     52765140
## 39   127   1049      4243498
## 40    65  30225     10236160
## 41   172  75955      2837727
## 42    70  41328     15945033
## 43    35 262134     51248556
## 44   137  82096      4838466
## 45   168   9267      1844104
## 46    85  39780     23415681
## 47    32  66570     17310976
## 48   100  24070      6506983
## 49   149  54464     20503652
## 50   173  97203      1716816
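
Because the new column name contains a space, it is not a syntactic R name. A short sketch of how such a column can be read back, using the first two states from the table (the data frame name states is illustrative):

```r
states <- data.frame(
  Population = c(3615, 365),
  Income     = c(3624, 6315)
)

# Same assignment style as above: a quoted name may contain a space
states[, "Total Income"] <- states$Population * states$Income

via_backticks <- states$`Total Income`     # backticks are required with $
via_brackets  <- states[["Total Income"]]  # a quoted string works in [[ ]]

print(names(states))
print(via_backticks)
```

A name without a space, such as Total.Income, would avoid the backticks; that naming is a stylistic suggestion rather than something the exercise asks for.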

Q7. The Life.Exp column has two decimal places. Use round() to reduce this precision to one decimal place.

Hint: use help(round) to see the documentation.

df$Life.Exp <- round(df$Life.Exp, 1)
df
##                 X Population Income Illiteracy Life.Exp Murder HS.Grad
## 1         Alabama       3615   3624        2.1     69.0   15.1    41.3
## 2          Alaska        365   6315        1.5     69.3   11.3    66.7
## 3         Arizona       2212   4530        1.8     70.5    7.8    58.1
## 4        Arkansas       2110   3378        1.9     70.7   10.1    39.9
## 5      California      21198   5114        1.1     71.7   10.3    62.6
## 6        Colorado       2541   4884        0.7     72.1    6.8    63.9
## 7     Connecticut       3100   5348        1.1     72.5    3.1    56.0
## 8        Delaware        579   4809        0.9     70.1    6.2    54.6
## 9         Florida       8277   4815        1.3     70.7   10.7    52.6
## 10        Georgia       4931   4091        2.0     68.5   13.9    40.6
## 11         Hawaii        868   4963        1.9     73.6    6.2    61.9
## 12          Idaho        813   4119        0.6     71.9    5.3    59.5
## 13       Illinois      11197   5107        0.9     70.1   10.3    52.6
## 14        Indiana       5313   4458        0.7     70.9    7.1    52.9
## 15           Iowa       2861   4628        0.5     72.6    2.3    59.0
## 16         Kansas       2280   4669        0.6     72.6    4.5    59.9
## 17       Kentucky       3387   3712        1.6     70.1   10.6    38.5
## 18      Louisiana       3806   3545        2.8     68.8   13.2    42.2
## 19          Maine       1058   3694        0.7     70.4    2.7    54.7
## 20       Maryland       4122   5299        0.9     70.2    8.5    52.3
## 21  Massachusetts       5814   4755        1.1     71.8    3.3    58.5
## 22       Michigan       9111   4751        0.9     70.6   11.1    52.8
## 23      Minnesota       3921   4675        0.6     73.0    2.3    57.6
## 24    Mississippi       2341   3098        2.4     68.1   12.5    41.0
## 25       Missouri       4767   4254        0.8     70.7    9.3    48.8
## 26        Montana        746   4347        0.6     70.6    5.0    59.2
## 27       Nebraska       1544   4508        0.6     72.6    2.9    59.3
## 28         Nevada        590   5149        0.5     69.0   11.5    65.2
## 29  New Hampshire        812   4281        0.7     71.2    3.3    57.6
## 30     New Jersey       7333   5237        1.1     70.9    5.2    52.5
## 31     New Mexico       1144   3601        2.2     70.3    9.7    55.2
## 32       New York      18076   4903        1.4     70.5   10.9    52.7
## 33 North Carolina       5441   3875        1.8     69.2   11.1    38.5
## 34   North Dakota        637   5087        0.8     72.8    1.4    50.3
## 35           Ohio      10735   4561        0.8     70.8    7.4    53.2
## 36       Oklahoma       2715   3983        1.1     71.4    6.4    51.6
## 37         Oregon       2284   4660        0.6     72.1    4.2    60.0
## 38   Pennsylvania      11860   4449        1.0     70.4    6.1    50.2
## 39   Rhode Island        931   4558        1.3     71.9    2.4    46.4
## 40 South Carolina       2816   3635        2.3     68.0   11.6    37.8
## 41   South Dakota        681   4167        0.5     72.1    1.7    53.3
## 42      Tennessee       4173   3821        1.7     70.1   11.0    41.8
## 43          Texas      12237   4188        2.2     70.9   12.2    47.4
## 44           Utah       1203   4022        0.6     72.9    4.5    67.3
## 45        Vermont        472   3907        0.6     71.6    5.5    57.1
## 46       Virginia       4981   4701        1.4     70.1    9.5    47.8
## 47     Washington       3559   4864        0.6     71.7    4.3    63.5
## 48  West Virginia       1799   3617        1.4     69.5    6.7    41.6
## 49      Wisconsin       4589   4468        0.7     72.5    3.0    54.5
## 50        Wyoming        376   4566        0.6     70.3    6.9    62.9
##    Frost   Area Total Income
## 1     20  50708     13100760
## 2    152 566432      2304975
## 3     15 113417     10020360
## 4     65  51945      7127580
## 5     20 156361    108406572
## 6    166 103766     12410244
## 7    139   4862     16578800
## 8    103   1982      2784411
## 9     11  54090     39853755
## 10    60  58073     20172721
## 11     0   6425      4307884
## 12   126  82677      3348747
## 13   127  55748     57183079
## 14   122  36097     23685354
## 15   140  55941     13240708
## 16   114  81787     10645320
## 17    95  39650     12572544
## 18    12  44930     13492270
## 19   161  30920      3908252
## 20   101   9891     21842478
## 21   103   7826     27645570
## 22   125  56817     43286361
## 23   160  79289     18330675
## 24    50  47296      7252418
## 25   108  68995     20278818
## 26   155 145587      3242862
## 27   139  76483      6960352
## 28   188 109889      3037910
## 29   174   9027      3476172
## 30   115   7521     38402921
## 31   120 121412      4119544
## 32    82  47831     88626628
## 33    80  48798     21083875
## 34   186  69273      3240419
## 35   124  40975     48962335
## 36    82  68782     10813845
## 37    44  96184     10643440
## 38   126  44966     52765140
## 39   127   1049      4243498
## 40    65  30225     10236160
## 41   172  75955      2837727
## 42    70  41328     15945033
## 43    35 262134     51248556
## 44   137  82096      4838466
## 45   168   9267      1844104
## 46    85  39780     23415681
## 47    32  66570     17310976
## 48   100  24070      6506983
## 49   149  54464     20503652
## 50   173  97203      1716816
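
The second argument of round() is the number of decimal places, and it defaults to 0. A small sketch on a few Life.Exp values from the original table, plus the round-half-to-even behaviour described in help(round):

```r
# A few of the original two-decimal Life.Exp values
life_exp <- c(69.31, 70.66, 71.71, 72.06)

one_decimal <- round(life_exp, 1)  # 69.3 70.7 71.7 72.1
no_decimal  <- round(life_exp)     # digits defaults to 0: 69 71 72 72
two_signif  <- signif(life_exp, 2) # two significant digits: 69 71 72 72

# help(round) notes that a trailing 5 goes to the even digit
half_up_or_even <- c(round(0.5), round(-1.5))  # 0 and -2

print(one_decimal)
print(no_decimal)
print(half_up_or_even)
```

Note that assigning the rounded values back to df$Life.Exp discards the second decimal for every later calculation, so results computed afterwards can differ slightly from those based on the original values.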

Q8. What is the average life expectancy for states that have Income higher than 5000 AND HS.Grad higher than 50?

mean(df[df$Income > 5000 & df$HS.Grad > 50, ]$Life.Exp)
## [1] 70.8125

Q9. What is the average life expectancy for states that have Income lower than 4000 AND HS.Grad lower than 45?

mean(df[df$Income < 4000 & df$HS.Grad < 45, ]$Life.Exp)
## [1] 69.27778
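
Both answers combine two row-wise conditions with the vectorised & operator, which compares the logical vectors element by element. A minimal sketch on the first six states, using the original unrounded Life.Exp values; because only six rows are used, the means differ from the full-data results above.

```r
# First six rows of the table, reduced to the columns used by the filters
states <- data.frame(
  X        = c("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado"),
  Income   = c(3624, 6315, 4530, 3378, 5114, 4884),
  Life.Exp = c(69.05, 69.31, 70.55, 70.66, 71.71, 72.06),
  HS.Grad  = c(41.3, 66.7, 58.1, 39.9, 62.6, 63.9),
  stringsAsFactors = FALSE
)

# & keeps a row only when both conditions are TRUE for that row
high_both <- states$Income > 5000 & states$HS.Grad > 50
high_mean <- mean(states$Life.Exp[high_both])

# subset() expresses the same kind of filter with bare column names
low_rows <- subset(states, Income < 4000 & HS.Grad < 45)
low_mean <- mean(low_rows$Life.Exp)

print(states$X[high_both])
print(high_mean)
print(low_rows$X)
print(low_mean)
```

The single & is the right operator here; && is intended for single TRUE/FALSE values, such as in if() conditions.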