A profile of the MailChimp founders:
https://www.forbes.com/sites/alexkonrad/2018/10/08/the-new-atlanta-billionaires-behind-an-unlikely-tech-unicorn
Keyhole surgery
Wednesday, October 17, 2018
Friday, October 12, 2018
A goldmine
https://www.r-bloggers.com/in-depth-introduction-to-machine-learning-in-15-hours-of-expert-videos/
https://raw.githubusercontent.com/asadoughi/stat-learning/master/ch2/answers
http://www-bcf.usc.edu/~gareth/ISL/
Monday, October 8, 2018
Saturday, October 6, 2018
Transitioning towards a data science career
Comparing read_csv with spark_read_csv
Reading a csv file into R using readr's `read_csv()` function is simple. The syntax and parameters are fairly easy to remember once you've used it a few times.
read_csv(file,
col_names = TRUE,
col_types = NULL,
locale = default_locale(),
na = c("", "NA"),
quoted_na = TRUE,
quote = "\"",
comment = "",
trim_ws = TRUE,
skip = 0, n_max = Inf,
guess_max = min(1000, n_max),
progress = show_progress()
)
I've only just started working with big data sets, and began wondering whether what I know about the readr syntax carries over to sparklyr's spark_read_csv() function.
While the two aren't exactly the same, if you know one you can quite easily pick up the other. There's one additional required parameter, `sc`, the Spark connection.
spark_read_csv(
sc,
name,
path,
header = TRUE, # FALSE forces a "V_" prefix
columns = NULL,
infer_schema = TRUE, # to infer column data type
delimiter = ",",
quote = "\"",
escape = "\\",
charset = "UTF-8",
null_value = NULL,
options = list(),
repartition = 0, # number of partitions to distribute the generated table.
memory = TRUE,
overwrite = TRUE, ...
)
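To make the parallel concrete, here's a minimal sketch. The local readr part is self-contained (it writes a tiny csv to a temp file first); the sparklyr part is shown commented out because it needs a local Spark installation and an active connection. The table name "mydata" is just an illustrative choice.

```r
library(readr)

# Write a small csv to a temp file so the example is self-contained
path <- tempfile(fileext = ".csv")
writeLines(c("id,value", "1,3.5", "2,NA", "3,7.2"), path)

# readr: returns a tibble held in local memory
df <- read_csv(path, col_names = TRUE, na = c("", "NA"))

# sparklyr: same file, but the data lives in Spark, not in R's memory.
# library(sparklyr)
# sc  <- spark_connect(master = "local")
# sdf <- spark_read_csv(sc, name = "mydata", path = path,
#                       header = TRUE, infer_schema = TRUE,
#                       null_value = "NA")
```

Note how the parameters line up: `col_names` maps to `header`, `na` maps to `null_value`, and `col_types` roughly maps to `columns`/`infer_schema`.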
Tuesday, October 2, 2018
Inspirations for humorous speech contests
and John's Anatomy of the speech here
More movie based speeches - Johnny Cash - Walk the line
and this one on any speech in general - David Henderson
Monday, October 1, 2018
Reading list - week ending 30 Sep 2018
https://stratechery.com/2018/instagrams-ceo/
https://fs.blog/mental-models/
https://medium.com/@timberners_lee/one-small-step-for-the-web-87f92217d085 & related https://solid.inrupt.com/
https://www.nytimes.com/2018/09/28/science/neil-armstrong-auction.html
https://www.bloomberg.com/news/articles/2018-06-13/amazon-s-clever-machines-are-moving-from-the-warehouse-to-headquarters