programming

Scala notes III -- Classes and Objects

Classes and objects
In Scala, a class is a blueprint for objects. Once you define a class, you can create objects from the class blueprint with the keyword new. Through the object you can use all the functionality of the defined class. An object, declared with the keyword object, is a named instance with members such as fields and methods. In other words, it is a class that has exactly one instance. There are three uses of objects.
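
The distinction between a class and an object described above can be sketched in a few lines. This is only a minimal illustration: the names Point, Origin, and Demo are hypothetical and do not come from the original post.

```scala
// Hypothetical names (Point, Origin, Demo) chosen for illustration only.

// A class is a blueprint: every `new Point(...)` creates a separate instance.
class Point(val x: Int, val y: Int) {
  // Returns a new Point shifted by (dx, dy); the original is left unchanged.
  def move(dx: Int, dy: Int): Point = new Point(x + dx, y + dy)
  override def toString: String = s"Point($x, $y)"
}

// An object is a named instance: exactly one exists, and it is referenced by its name.
object Origin {
  val x: Int = 0
  val y: Int = 0
  def describe: String = s"Origin($x, $y)"
}

object Demo {
  def main(args: Array[String]): Unit = {
    val p = new Point(1, 2)   // instantiate the class with `new`
    println(p.move(2, 3))     // prints "Point(3, 5)"
    println(Origin.describe)  // prints "Origin(0, 0)"
  }
}
```

Note that Origin is never created with new; it is used directly through its name.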

Scala notes II -- Functions

Function Syntax
Scala is a functional programming language, which means that functions are first-class citizens and you can pass them around as parameters or values.

    def add(x: Int, y: Int): Int = { return x + y; }
    println(add(21, 19));

    // Other variants
    def multiply(x: Int, y: Int): Int = x * y      // simplified version
    def divide(x: Int, y: Int) = x / y             // the return type can be omitted when it can be inferred
    def subtract(x: Int = 10, y: Int = 2) = x - y  // set default values

You can define the function names as operator, e.

Scala notes I

Data types
    Boolean   true or false
    Byte      8 bit signed value
    Short     16 bit signed value
    Char      16 bit unsigned Unicode character
    Int       32 bit signed value
    Long      64 bit signed value
    Float     32 bit IEEE 754 single-precision float
    Double    64 bit IEEE 754 double-precision float
    String    A sequence of characters
    Unit      Corresponds to no value
    Null      null or empty references
    Nothing   subtype of every other type; includes no .

Git commands

Configuration
Set up global configuration variables if you haven’t done so:

    $ git config --global user.name "<name>"
    $ git config --global user.email "<email>"

Make Git rebase automatically when you pull (the setting applies to newly created tracking branches), which is what you want:

    $ git config branch.autosetuprebase always

To set up configuration for DiffMerge, follow the guide at http://coding4streetcred.com/blog/post/configure-diffmerge-for-your-git-difftool.

Local Usage of Git
Staging
Check whether there are any unstaged or untracked files

sparklyr (Spark in R)

Introduction
The R programming language, along with RStudio, has become one of the most popular tools for data analysis, as it offers a large number of open-source packages developed by a community of statisticians. However, neither R nor RStudio is ideal for big-data analysis, because the data usually does not fit into R's memory. Spark, on the other hand, has become the leading platform for big-data analytics. It works with the system to distribute data across a cluster and process it in parallel.