Author
Team Machine Learning

From dorm rooms to boardrooms, the team has built its career connecting young talent to opportunity. Its writing brings fresh, student-centric views on tech hiring and early careers.

Insights & Stories by Team Machine Learning

Team Machine Learning explores what today’s grads want from work—and how recruiters can meet them halfway. Expect a mix of optimism, strategy, and sharp tips.

8 Latest Artificial Intelligence Software (Apps) Challenging The Human Brain

Introduction

“In the past 2,000 years, the hardware in our brains has not improved… In the next 30 years, AI will overtake human intelligence,” says SoftBank CEO Masayoshi Son.

If you’ve read Ray Kurzweil’s “The Singularity is Near: When Humans Transcend Biology,” you’d expect that AI is going to exhibit human-level intelligence in a decade or two. The startlingly thought-provoking work by the futurist gives you a fair picture of the road ahead, a time when humans, with the aid of advanced technologies, will “transcend their biological limitations.”

And you know what? This plausible scenario is at our doorstep. With superintelligence on the brink of becoming a reality, his words ring true, although they are downright scary. Computers and their growing abilities are likely to outpace our skills sooner than we think. By some estimates, artificial intelligence will add $16 trillion to the global economy by 2030.

Terms like artificial intelligence and machine learning have been bandied about for a while now. Despite groundbreaking strides, there are still miles to cover in intuition, vision, common sense, and language. Machines still can’t beat us at everything we do, but they have surely outsmarted us in some ways.

This post talks about some amazing artificial intelligence software that is just that smart.

Latest Artificial Intelligence Software

1. DeepMind’s AlphaGo

In 2016, AlphaGo was in the news for beating the 9-Dan top player Lee Sedol at Go. According to Wikipedia, the ancient Chinese game of Go is “an abstract strategy board game for two players, in which the aim is to surround more territory than the opponent.”

Watch this 2-minute video:

The AI software from Google’s DeepMind beat the South Korean grandmaster in a five-game match, winning 4–1. Brute-force calculation does not work for a game this complex; winning required much more.

AlphaGo used deep neural networks and advanced tree search to win. “AlphaGo learned to discover new strategies for itself, by playing millions of games between its neural networks, against themselves, and gradually improving,” said David Silver, the AlphaGo team’s lead researcher. Of the two neural networks used, the policy network predicted the next move, and the value network evaluated the winner from every position on the board.

The team used the Google Cloud Platform for the massive computing power it needed. With advanced machine learning techniques, such as reinforcement learning, and fantastic engineering, DeepMind did much better than expected. The system had to figure out how to win, not just mimic human moves.
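The move-selection rule at the heart of this kind of search can be sketched in a few lines. The snippet below is an illustrative toy of the PUCT-style formula described in DeepMind’s papers, with made-up move names and statistics — not DeepMind’s actual code: a value estimate plus an exploration bonus scaled by the policy network’s prior.

```python
import math

def puct_score(q, prior, parent_visits, visits, c_puct=1.0):
    """Value estimate plus an exploration bonus scaled by the policy prior."""
    return q + c_puct * prior * math.sqrt(parent_visits) / (1 + visits)

def select_move(stats, c_puct=1.0):
    """Pick the child move with the highest PUCT score.

    stats maps move -> (mean_value, policy_prior, visit_count)."""
    total = sum(n for _, _, n in stats.values())
    return max(
        stats,
        key=lambda m: puct_score(stats[m][0], stats[m][1], total, stats[m][2], c_puct),
    )

# Toy example: three candidate moves with hypothetical statistics.
stats = {
    "D4": (0.52, 0.40, 120),
    "Q16": (0.48, 0.35, 90),
    "K10": (0.30, 0.05, 5),
}
```

Rarely visited moves with a high prior get a large bonus, which is how the policy network steers the tree search toward promising lines without exhaustive brute force.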

This highly publicized event marked the beginning of a new era. Considering the magic of Moves 37 and 78, it was more a case of human and machine than human against machine. This outcome has immense possibilities. As computer scientist Andy Salerno says, “AlphaGo isn’t a mysterious beast from some distant unknown planet. AlphaGo is us. AlphaGo is our incessant curiosity. AlphaGo is our drive to push ourselves beyond what we thought possible.” You can read more here.

2. DeepStack

Quite like Go, poker fell to the magic of AI as well. In heads-up no-limit Texas hold’em, DeepStack beat professional poker players. The algorithm achieved a staggering win rate of 450 milli-big-blinds per game, where a professional player typically wins around 50. This is quite an achievement, considering this version of poker has about 10^160 possible paths of play for each hand!

DeepStack is based more on “intuition” than on working out the moves ahead of time. The algorithm makes real-time decisions by computing fewer possibilities in a matter of seconds.
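DeepStack builds on the counterfactual regret minimization family of methods. The core trick in that family — not DeepStack’s full solver, just the smallest illustrative piece — is regret matching: play each action in proportion to the positive regret accumulated for it, so the strategy drifts toward actions you “wish” you had played more.

```python
def regret_matching(regrets):
    """Turn accumulated regrets into a mixed strategy: play actions in
    proportion to their positive regret; fall back to uniform if none."""
    positives = [max(r, 0.0) for r in regrets]
    total = sum(positives)
    if total == 0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positives]
```

For example, regrets of [3, 1, -2] over fold/call/raise yield a 75%/25%/0% strategy; iterating this self-play update is what drives the “intuition” the researchers describe.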

In their paper, researchers from the Czech Technical University and Charles University in the Czech Republic and the University of Alberta in Canada describe the winning AI algorithm DeepStack, which “combines recursive reasoning to handle information asymmetry, decomposition to focus computation on the relevant decision, and a form of intuition that is automatically learned from self-play using deep learning.” A team from Carnegie Mellon has also developed another winning AI program called Libratus. However, these game-theoretic guarantees do not carry over to multi-player games.

This approach has important implications in other fields that have imperfect information such as medicine, finance, cybersecurity, and defense.


3. AI Duet

An artificial “pianist” from Google’s Creative Lab, AI Duet was built in collaboration with Yotam Mann, developer/musician. Watch this short video and see it working:

In this video, he explains how this AI software works using the concept of neural networks. This interactive experiment is part of Magenta, an open-source project from Google’s Brain team. You can access the code here. AI Duet is built with Tone.js, TensorFlow, and other Magenta tools.

Who needs a partner when this virtual piano player will accompany you in a lilting duet!

Even if you are no Chopin, this intelligent software will respond to you and create a rhythm. It could even inspire you. While it is not going to get you ready for a concert at Boston Symphony Hall, you can have some real fun hitting random notes and waiting for the computer to come back with something improvisational, based on the melodies it has been trained on.

4. COIN

It looks like artificial intelligence is revolutionizing investment banking. JPMorgan’s software COIN, an acronym for “contract intelligence,” has worked magic by “interpreting commercial loan agreements” in seconds, a task that previously consumed some 360,000 man-hours.

COIN is based on machine learning concepts. The software is naturally less error-prone while checking loan-servicing agreements. A Bloomberg report said that JPMorgan is keen on “deploying the technology which learns by ingesting data to identify patterns and relationships. The bank plans to use it for other types of complex legal filings like credit-default swaps and custody agreements. Someday, the firm may use it to help interpret regulations and analyze corporate communications.”

The company believes that this is only the start of smart automation of processes in the financial industry. JPMorgan is committed to new initiatives. “We’re willing to invest to stay ahead of the curve, even if in the final analysis some of that money will go to product or a service that wasn’t needed,” said Marianne Lake, the finance chief.

5. LipNet

Lip reading has become much easier with LipNet, AI software from the University of Oxford’s Department of Computer Science. The team of researchers has detailed it in the paper titled “LipNet: End-to-End Sentence-level Lipreading.”

The paper says LipNet “maps a variable-length sequence of video frames to text, making use of spatiotemporal convolutions, a recurrent network, and the connectionist temporal classification loss, trained entirely end-to-end.”

Watch this short interesting video:

Where human lip readers achieve 12.3% accuracy while annotating video footage, this neural network-based software achieves 46.8%. “All existing [lip-reading approaches] perform only word classification, not sentence-level sequence prediction…. To the best of our knowledge, LipNet is the first lip-reading model to operate at sentence-level,” say the researchers. AI will be able to transcribe footage with a low frame rate and poor image quality sooner than we think.
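The connectionist temporal classification (CTC) loss mentioned in the paper lets the network output one label per video frame, including a special “blank” symbol; decoding then merges consecutive repeats and drops the blanks. A minimal greedy-decoding sketch (the blank symbol and labels here are illustrative):

```python
from itertools import groupby

BLANK = "-"

def ctc_collapse(frame_labels):
    """Greedy CTC decode: merge consecutive duplicates, then drop blanks."""
    merged = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in merged if label != BLANK)
```

So nine frames of `h h e e l - l l o` collapse to "hello" — the blank between the two l-runs is what keeps a genuine double letter from being merged away.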

Apart from the immense help it will be to people who suffer from disabling hearing loss, the team is also interested in its practical possibilities such as “silent dictation in public spaces, covert conversations, speech recognition in noisy environments, biometric identification, and silent-movie processing.”

6. Philip

For those who fear the dark side of AI, this new “killer” program is just another factor reinforcing their misgivings. MIT’s Computer Science and Artificial Intelligence Laboratory has come up with “Philip,” who is out for blood in the popular Super Smash Bros Melee multiplayer video game.

It is based on neural networks and is an “in-game computer player that learned everything from scratch.” The team, led by Vlad Firoiu, fed the AI the coordinates of the gameplay objects. Using deep reinforcement learning, the computer played against itself repeatedly in Nintendo’s popular console game.

The team used algorithms such as Actor-Critic and Q-learning to beat 10 top-ranked human players. Philip bested the players with a reaction time of 33 milliseconds, six times faster than a human’s.
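Tabular Q-learning, the simple ancestor of the deep variant used here, can be written as a one-line update: nudge the value of the action taken toward the observed reward plus the discounted best value of the next state. A toy sketch with hypothetical states and actions:

```python
def q_update(Q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One tabular Q-learning update:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[next_state].values()) if Q.get(next_state) else 0.0
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
    return Q[state][action]
```

In the deep setting, the table `Q` is replaced by a neural network, but the same bootstrapped target drives the learning.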

You can read the research paper here.

7. DeepCoder

Cambridge University and Microsoft have come up with deep learning-based software, called DeepCoder, that can write code on its own. “The approach is to train a neural network to predict properties of the program that generated the outputs from the inputs. We use the neural network’s predictions to augment search techniques from the programming languages community, including enumerative search and an SMT-based solver,” says the team in its research paper.

They used a domain-specific language to teach the system to solve online programming challenges involving 3 to 6 lines of code. The system practices and figures out what code combinations work best. Using program synthesis, DeepCoder puts together pieces of code from software that already exists just like a programmer would.
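The enumerative-search half of that recipe can be illustrated in a few lines: try compositions of primitive functions from a tiny DSL until one reproduces the given input/output examples. This is a toy sketch with made-up primitives, far simpler than DeepCoder (which additionally uses a neural network to rank which primitives to try first):

```python
from itertools import product

# A tiny, made-up DSL of list-transforming primitives.
PRIMITIVES = {
    "sort": sorted,
    "reverse": lambda xs: list(reversed(xs)),
    "double": lambda xs: [2 * x for x in xs],
    "drop_neg": lambda xs: [x for x in xs if x >= 0],
}

def synthesize(examples, max_depth=2):
    """Return the first primitive pipeline consistent with all examples."""
    for depth in range(1, max_depth + 1):
        for names in product(PRIMITIVES, repeat=depth):
            def run(xs, names=names):
                for name in names:
                    xs = PRIMITIVES[name](xs)
                return xs
            if all(run(inp) == out for inp, out in examples):
                return list(names)
    return None
```

Given the examples `[3, 1, 2] -> [2, 4, 6]` and `[5, 4] -> [8, 10]`, the search discovers the pipeline sort-then-double; the learned model’s job is to shrink this search space on problems where brute enumeration would explode.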

One of the researchers, Marc Brockschmidt, says, “We’re targeting the people who can’t or don’t want to code, but can specify what their problem is.”

8. GoogLeNet

A deep learning AI system from Google can detect cancer with better accuracy and speed than pathologists. Identifying tumors by scanning images manually can be error-prone and laborious.

Here’s a video tutorial covering GoogLeNet in detail:

Google says, “After additional customization, including training networks to examine the image at different magnifications (much like what a pathologist does), we showed that it was possible to train a model that either matched or exceeded the performance of a pathologist who had unlimited time to examine the slides.”

“We present a framework to automatically detect and localise tumours as small as 100 × 100 pixels in gigapixel microscopy images sized 100,000×100,000 pixels. Our method leverages a convolutional neural network (CNN) architecture and obtains state-of-the-art results on the Camelyon16 dataset in the challenging lesion-level tumour detection task,” writes Google’s team in its white paper.
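A gigapixel slide is far too large to feed to a CNN in one shot, so detection pipelines of this kind typically score small patches on a sliding window and stitch the results into a tumor-probability map. A simplified sketch of the window-tiling step (illustrative only, not Google’s implementation):

```python
def iter_patches(height, width, patch=100, stride=100):
    """Yield top-left (y, x) coordinates of patch-sized windows tiling an image."""
    for y in range(0, height - patch + 1, stride):
        for x in range(0, width - patch + 1, stride):
            yield y, x

# A 100,000 x 100,000 slide with 100 x 100 patches at stride 100 would yield
# 1,000 x 1,000 = a million windows; here a small demo on a 1,000 x 1,000 image:
n_patches = sum(1 for _ in iter_patches(1000, 1000))
```

Each window would then be classified by the CNN, usually at several magnifications, mirroring how a pathologist zooms in and out.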

Google will continue its research, working on larger datasets, to improve patient outcomes.

Summary

New possibilities and advances in artificial intelligence are pushing the boundaries of the human brain like never before. The brilliant artificial intelligence programs outlined in this post are only a glimpse into a terrifying future. If these trends continue, some scientists believe, machines could surpass human capabilities sooner rather than later. But the other camp argues that there really is no reason for mass hysteria as of now. Only time will tell, right?

Winning Tips on Machine Learning Competitions by Kazanova, Current Kaggle #3

Introduction

Machine learning is tricky. No matter how many books you read, tutorials you finish, or problems you solve, there will always be a data set that leaves you clueless, especially in your early days of machine learning. Isn’t it?

In this blog post, you’ll learn some essential tips on building machine learning models that most people learn only with experience. These tips were shared by Marios Michailidis (a.k.a. Kazanova), Kaggle Grandmaster, currently ranked #3, in a webinar held on 5th March 2016. The webinar had three aspects:

  1. Video – Watch here.
  2. Slides – Slides used in the video were shared by Marios. Indeed, an enriching compilation of machine learning knowledge. Below are the slides.
  3. Q&As – This blog lists all the questions asked by participants during the webinar.

The key to succeeding in competitions is perseverance. Marios said, ‘I won my first competition (Acquired Valued Shoppers Challenge) and entered Kaggle’s top 20 after a year of continued participation, on a 4 GB RAM laptop (i3).’ Were you planning to give up?

While reading Q & As, if you have any questions, please feel free to drop them in comments!

Questions & Answers

1. What are the steps you follow for solving a ML problem? Please describe from scratch.

Following are the steps I undertake while solving any ML problem:

  1. Understand the data – After you download the data, start exploring features. Look at data types. Check variable classes. Create some univariate and bivariate plots to understand the nature of the variables.
  2. Understand the metric to optimize – Every problem comes with a unique evaluation metric. It’s imperative for you to understand it, especially how it changes with the target variable.
  3. Decide the cross-validation strategy – To avoid overfitting, make sure you’ve set up a cross-validation strategy in the early stages. A nice CV strategy will help you get a reliable score on the leaderboard.
  4. Start hyperparameter tuning – Once CV is in place, try improving the model’s accuracy using hyperparameter tuning. It further includes the following steps:
    • Data transformations: This involves steps like scaling, removing outliers, treating null values, transforming categorical variables, doing feature selection, creating interactions, etc.
    • Choosing algorithms and tuning their hyperparameters: Try multiple algorithms to understand how model performance changes.
    • Saving results: From all the models trained above, make sure you save their predictions. They will be useful for ensembling.
    • Combining models: At last, ensemble the models, possibly on multiple levels. Make sure the models are as uncorrelated as possible for best results.


2. What are the model selection and data manipulation techniques you follow to solve a problem?

Generally, I try (almost) everything for most problems. In principle, for:

  • Time series: I use GARCH, ARCH, regression, ARIMA models, etc.
  • Image classification: I use deep learning (convolutional nets) in Python.
  • Sound classification: Common neural networks.
  • High-cardinality categorical data (like text data): I use linear models, FTRL, Vowpal Wabbit, LibFFM, libFM, SVD, etc.

For everything else, I use gradient boosting machines (like XGBoost and LightGBM) and deep learning (like Keras, Lasagne, Caffe, CXXNET). I decide what model to keep or drop in meta-modelling with feature selection techniques. Some of the feature selection techniques I use include:

  • Forward (CV or not) – Start from the null model. Add one feature at a time and check CV accuracy. If it improves, keep the variable; else discard it.
  • Backward (CV or not) – Start from the full model and remove variables one by one. If CV accuracy improves by removing a variable, discard it.
  • Mixed (or stepwise) – Use a mix of the above two techniques.
  • Permutations
  • Using feature importance – Use random forest, GBM, or XGBoost’s feature-importance output.
  • Applying statistical logic, such as chi-square tests or ANOVA.
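The forward-selection loop described above can be sketched generically: greedily add the feature that improves the score, and stop when nothing helps. The `score_fn` below is a stand-in for whatever your real evaluation is (e.g. cross-validated accuracy of a model refit on the candidate feature set):

```python
def forward_select(features, score_fn):
    """Greedy forward selection: start from the null model and add one
    feature at a time, keeping it only if score_fn improves."""
    chosen, best = [], score_fn([])
    improved = True
    while improved:
        improved = False
        for f in features:
            if f in chosen:
                continue
            trial = score_fn(chosen + [f])
            if trial > best:
                chosen, best, improved = chosen + [f], trial, True
    return chosen, best
```

With a toy score that rewards features "a" and "b" and penalizes "noise", the loop keeps the two useful features and drops the noisy one.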

Data manipulation can be different for every problem:

  • Time series: You can calculate moving averages and derivatives, and remove outliers.
  • Text: Useful techniques are tf-idf, count vectorizers, word2vec, SVD (dimensionality reduction), stemming, spell checking, sparse matrices, likelihood encoding, one-hot encoding (or dummies), and hashing.
  • Image classification: Here you can do scaling, resizing, removing noise (smoothening), annotating, etc.
  • Sounds: Calculate Fourier transforms, MFCCs (Mel-frequency cepstral coefficients), low-pass filters, etc.
  • Everything else: Univariate feature transformations (like log(x+1) for numerical data), feature selection, treating null values, removing outliers, converting categorical variables to numeric.

3. Can you elaborate cross validation strategy?

Cross-validation means that from my main set, I RANDOMLY create two sets. I build (train) my algorithm with the first one (let’s call it the training set) and score the other (let’s call it the validation set). I repeat this process multiple times and always check how my model performs on the validation set with respect to the metric I want to optimize.

The process may look like:

  • For 10 (you choose how many X) times:
  • Split the set into training (50%–90% of the original data) and validation (the remaining 10%–50%).
  • Then fit the algorithm on the training set.
  • Score the validation set.
  • Save the result of that scoring with respect to the chosen metric.
  • Calculate the average of these 10 (X) scores. That is how the score is expected to behave in real life, and it is generally a good estimate.
  • Remember to use a SEED to be able to replicate these X splits.

Other things to consider are KFold and stratified KFold. Read here. For time-sensitive data, always follow the rule of using the past to predict the future when testing.
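The repeated random-split procedure above can be sketched directly. Here `score_fn(train, valid)` stands in for “fit on train, score on valid”, and the fixed seed makes the X splits reproducible, as recommended:

```python
import random

def repeated_holdout_cv(data, score_fn, n_splits=10, train_frac=0.8, seed=42):
    """Average score over repeated random train/validation splits.

    score_fn(train, valid) stands in for 'fit on train, score on valid'.
    A fixed seed makes the splits replicable."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_splits):
        shuffled = data[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_frac)
        scores.append(score_fn(shuffled[:cut], shuffled[cut:]))
    return sum(scores) / n_splits
```

Running it twice with the same seed returns identical results, which is exactly why seeding matters when comparing models.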

4. Can you please explain some techniques used for cross-validation?

  • Kfold
  • Stratified Kfold
  • Random X% split
  • Time based split
  • For large data, just one validation set could suffice (like 20% of the data – you don’t need to repeat the process multiple times).

5. How did you improve your skills in machine learning? What training strategy did you use?

I did a mix of the things listed in answer 2, plus a lot of self-research. Alongside that, programming and software development (in Java), and A LOT of Kaggling ☺

6. Which are the most useful python libraries for a data scientist ?

Below are some libraries which I find most useful in solving problems:

  • Data Manipulation
    • Numpy
    • Scipy
    • Pandas
  • Data Visualization
    • Matplotlib
  • Machine Learning / Deep Learning
    • Xgboost
    • Keras
    • Nolearn
    • Gensim
    • Scikit image
  • Natural Language Processing
    • NLTK

7. What are useful ML techniques/strategies to impute missing values or predict categorical labels when all the variables are categorical in nature?

Imputing missing values is a critical step. Sometimes you may find a trend in missing values. Below are some techniques I use:

  • Use the mean, mode, or median for imputation.
  • Use a value outside the normal range of the variable, like -1 or -9999.
  • Replace with a likelihood – e.g., something that relates to the target variable.
  • Replace with something that makes sense. For example, sometimes null may mean zero.
  • Try to predict missing values based on subsets of known values.
  • You may consider removing rows with many null values.
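The first two strategies above take only a few lines each. A minimal sketch, representing missing entries as `None` (the -9999 sentinel follows the answer’s suggestion of a value outside the normal range):

```python
from statistics import mean, mode

def impute(values, strategy="mean", sentinel=-9999):
    """Fill None entries using one of the strategies listed above."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "mode":
        fill = mode(observed)
    else:  # out-of-range sentinel value
        fill = sentinel
    return [fill if v is None else v for v in values]
```

The sentinel variant is often used with tree-based models, which can learn to split the out-of-range value into its own branch — effectively treating “missing” as a category of its own.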

8. Can you elaborate on the kind of hardware investment you have made, i.e., your own PC/GPU setup for deep learning tasks? Or were you using cloud-based GPU services?

I won my first competition (Acquired Valued Shoppers Challenge) and entered Kaggle’s top 20 after a year of continued participation, on a 4 GB RAM laptop (i3). I was using mostly self-made solutions up to that point (in Java). That competition had something like 300,000,000 rows of transaction data to aggregate, so I had to parse the data carefully and be smart about keeping memory usage to a minimum.

However, since then I have made some good investments to become Rank #1. Now, I have access to Linux servers with 32 cores and 256 GB of RAM. I also have a GeForce 670 machine (for deep learning/GPU tasks). Also, I use mostly Python now. You can consider Amazon’s AWS too; however, this mostly makes sense if you are really interested in getting to the top, because the cost may be high if you use it a lot.

9. Do you use a high-performing machine like a GPU? For example, do you do things like grid search for random forest parameters, which takes a lot of time? Which machine do you use?

I use GPUs (not very fast ones, like a GeForce 670) for every deep learning training model. I have to state that for deep learning, a GPU is a MUST. Training neural nets on CPUs takes ages, while a mediocre GPU can make a simple neural net (e.g., for deep learning) 50–70 times faster. I don’t like grid search; I do this fairly manually. I think in the beginning it might be slow, but after a while you can get to decent solutions with the first set of parameters! That is because you learn which parameters are best for each problem, and you get to know the algorithms better this way.

10. How do people build around 80+ models? Is it by changing the hyperparameter tuning?

It takes time. Some people do it differently. I have some sets of params that worked in the past, and I initialize with these values and then start adjusting them based on the problem at hand. Obviously you need to forcefully explore more areas (of hyperparameters, in order to know how they work) and enrich this bank of past successful hyperparameter combinations for each model. You should consider what others are doing too. There is NO single optimal set of hyperparameters. It is possible to get a similar score with a completely different set of params than the one you have.

11. How does one improve their Kaggle rank? Sometimes I feel hopeless while working on a competition.

It’s not an overnight process. Improvement on Kaggle, or anywhere, happens with time. There are no shortcuts. You need to just keep doing things. Below are some of my recommendations:

  • Learn better programming: Learn python if you know R.
  • Keep learning tools (listed below)
  • Read some books.
  • Play in ‘knowledge’ competitions
  • See what others are doing in kernels; in past competitions, look for the ‘winning solution’ sections
  • Team up with more experienced users, but you need to improve your ranking slightly before this happens
  • Create a code bank
  • Play … a lot!

12. Can you tell us about some useful tools used in machine learning?

Below is the list of my favourite tools:

13. How to start with machine learning?

I like these slides from the University of Utah for understanding some basic algorithms and concepts in machine learning. This book is good for Python. I like this book too. Don’t forget to follow the wonderful scikit-learn documentation, and use Jupyter Notebook from Anaconda.

You can find many good links that have helped me on Kaggle here. Look at ‘How Did You Get Better at Kaggle’.

In addition, you should do Andrew Ng’s machine learning course. Alongside it, you can follow some good blogs such as MLWave, FastML, and Analytics Vidhya. But the best way is to get your hands dirty: do some Kaggle! Tackle competitions that have the ‘knowledge’ flag first, then start tackling some of the main ones. Try some older ones too.

14. What techniques perform best on large data sets, on Kaggle and in general? How do you tackle memory issues?

Big data sets with high cardinality can be tackled well with linear models. Consider sparse models. Tools like Vowpal Wabbit, FTRL, libFM, libFFM, and LIBLINEAR are good. Use sparse matrices in Python (things like CSR matrices). Consider ensembling (i.e., combining) models trained on smaller parts of the data.
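The reason sparse matrices help is simple: high-cardinality data is mostly zeros after one-hot encoding, so you store only the non-zero entries. A dict-based sketch of the idea behind formats like CSR (real code would use `scipy.sparse`; this is just the concept):

```python
def to_sparse(dense):
    """Keep only the non-zero entries of a vector, as index -> value."""
    return {i: v for i, v in enumerate(dense) if v != 0}

def sparse_dot(a, b):
    """Dot product touching only indices present in both sparse vectors."""
    return sum(v * b[i] for i, v in a.items() if i in b)
```

A one-hot vector over a million categories has a single non-zero entry, so the sparse form uses a few bytes instead of megabytes — which is what makes linear models on such data feasible in memory.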

15. What is the SDLC (Software Development Life Cycle) of projects involving machine learning?

  • Give a walk-through of an industrial project and the steps involved, so that we can get an idea of how they are used. Basically, I am in the learning phase and would like some industry-level exposure.
  • Business question: how to recommend products online to increase purchases.
  • Translate this into an ML problem: try to predict what the customer will buy in the future, given the data available at the time the customer is likely to make the click/purchase and historical exposure to recommendations.
  • Establish a test/validation framework.
  • Find the best solutions to predict what the customer will choose.
  • Consider time/cost efficiency as well as performance.
  • Export model parameters/pipeline settings.
  • Apply these in an online environment. Expose some customers, but NOT all. Keep test and control groups.
  • Assess how well the algorithm is doing and make adjustments over time.

16. Which is your favorite machine learning algorithm?

It has to be Gradient Boosted Trees. All may be good though in different tasks.

17. Which language is best for deep learning, R or Python?

I prefer Python. I think it is more programming-oriented. R is good too.

18. What would someone trying to switch careers into data science need to gain aside from technical skills? As I don’t have a developer background, would personal projects be the best way to showcase my knowledge?

The ability to translate business problems to machine learning, and transforming them into solvable problems.

19. Do you agree with the statement that, in general, feature engineering (exploring and recombining predictors) is more efficient than improving predictive models to increase accuracy?

In principle – Yes. I think model diversity is better than having a few really strong models. But it depends on the problem.

20. Are the skills required to reach the top of the Kaggle leaderboard also those you need in your day-to-day job as a data scientist? Do they intersect, or are they somewhat different? Can I form an idea of what a data scientist’s job is based on Kaggle competitions? And if a person does well on Kaggle, does it follow that she will be a successful data scientist in her career?

There is some percentage of overlap, especially when it comes to making predictive models, working with data through Python/R, and creating reports and visualizations. What Kaggle does not offer (though you can get some idea) is:

  • How to translate a business question into a modelling (possibly supervised) problem
  • How to monitor models past their deployment
  • How to explain (often) difficult concepts to stakeholders

I think there is always room for a good Kaggler in the industry. It is just that data science can take many possible routes. It may be, for example, that not everyone tends to be entrepreneurial in their work or gets to be very client-facing, but rather solves very particular (technical) tasks.

21. Which machine learning concepts are a must-have to perform well in a Kaggle competition?

  • Data interrogation/exploration
  • Data transformation – pre-processing
  • Hands-on knowledge of tools
  • Familiarity with metrics and optimization
  • Cross Validation
  • Model Tuning
  • Ensembling

22. How do you see the future of the data scientist job? Is automation going to kill this job?

No – I don’t think so. This is what they used to say about automation through computing, but it ended up requiring a lot of developers to get the job done! It may be that data scientists focus on softer tasks over time, like translating business questions into ML problems, and generally become shepherds of the process – as in managers/supervisors of the modelling process.

23. How do you use ensemble modelling in R and Python to increase the accuracy of predictions? Please give some real-life examples.

You can see my GitHub script, where I explain different machine learning methods based on a Kaggle competition. Also, check this ensembling guide.
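The simplest ensembling step is averaging the saved predictions of several models; weighted averaging is a small step further. An illustrative sketch with made-up prediction values:

```python
def ensemble_average(predictions, weights=None):
    """Combine per-model prediction lists by (weighted) averaging.

    predictions is a list of lists, one inner list of predictions per model;
    weights defaults to a plain average."""
    n_models = len(predictions)
    weights = weights or [1.0 / n_models] * n_models
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions))
        for i in range(len(predictions[0]))
    ]
```

This is the base case of the multi-level ensembling mentioned in answer 1: level-two models simply take these combined predictions (or the raw per-model ones) as input features.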

24. What are the best Python deep learning libraries or frameworks for text analysis?

I like Keras (because it now supports sparse data) and Gensim (for word2vec).

25. How valuable is the knowledge gained through these competitions in real life? Most often I see competitions won by ensembling large numbers of models… Is this the case in real-life production systems? Or are interpretable models more valuable than these monster ensembles in real production systems?

In some cases, yes – being interpretable or fast (or memory efficient) is more important. But this is likely to change over time as people become less afraid of black-box solutions and focus on accuracy.

26. Should I worry about learning the internals of machine learning algorithms, or just go ahead, form an understanding of the algorithms, and use them (in competitions and to solve real-life business problems)?

You don’t need the internals. I don’t know all the internals. It is good if you do, but you don’t need to. Also, new stuff comes out every day – sometimes it’s tough to keep track of it all. That is why you should focus on the decent usage of any algorithm rather than over-investing in one.

27. Which are the best machine learning techniques for imbalanced data?

I don’t do any special treatment here. I know people find that strange. It comes down to optimizing the right metric (for me). It is tough to explain in a few lines. There are many techniques for sampling, but I have never had to use them. Some people use SMOTE. I don’t see value in trying to change the principal distribution of your target variable; you just end up with augmented or altered principal odds. If you really want a cut-off to decide whether you should act or not, you may set it based on the principal odds.

I may not be the best person to answer this. I personally have never found it (significantly) useful to change the distribution of the target variable or the perception of the odds in the target variable. It may just be that some algorithms are better than others at dealing with this task (for example, tree-based ones should be able to handle it).

28. Typically, marketing research problems have been handled using standard regression techniques – linear and logistic regression, clustering, factor analyses, etc. My question is: how useful are machine learning and deep learning techniques/algorithms for marketing research or business problems? For example, how useful is interpreting the output of a neural network to clients? Are there any resources you can refer to?

They are useful in the sense that you can most probably improve accuracy (in predicting, let’s say, marketing response) versus linear models (like regressions). Interpreting the output is hard, and in my opinion it should not be necessary, as we are generally moving towards more black-box and complicated solutions.

As a data scientist, you should put effort into making certain that you have a way to test how good your results are on some unobserved (test) data, rather than trying to understand why you get the type of predictions you are getting. I do think that decompressing information from complicated models is a nice topic (and valid for research), but I don’t see it as necessary.

On the other hand, companies, people, data scientists, statisticians, and generally anybody who could be classified as a ‘data science player’ need to be educated to accept black-box solutions as perfectly normal. This may take a while, so it may be good to run some regressions along with any other modelling you are doing, and generally try to provide explanatory graphs and summarized information to make a case for why your models perform as they do.

29. How do you build teams for collaboration on Kaggle?

You can ask in forums (i.e., on Kaggle). This may take a few competitions, though, before ‘people can trust you.’ The reason is that they are afraid of duplicate accounts (which violate competition rules), so people prefer somebody who has proven to play fair. Assuming some time has passed, you just need to think of people you would like to play with, people you think you can learn from, and generally people who are likely to take different approaches than you, so you can leverage the benefits of diversity when combining methods.

28. I have gone through a basic (theoretical) machine learning course. Now I am starting my practical journey. You recommended going through the scikit-learn docs, but now people are saying TensorFlow is the next scikit-learn. Should I go through scikit-learn, or is TensorFlow a good choice?

I don’t agree with the statement ‘people are saying TensorFlow is the next scikit-learn’. TensorFlow is a framework for doing certain machine learning tasks (like deep learning) well. I think you can learn both, but I would start with scikit-learn. I personally don’t know TensorFlow, but I use tools that are based on it (for example, Keras). I am lazy, I guess!

29. The main challenge that I face in any competition is cleaning the data and making it usable for prediction models. How do you overcome it?

Yeah, join the club! After a while you will create pipelines that can handle this relatively quickly. However, you always need to spend time here.

30. How do you process big data without a powerful machine?

You should consider tools like Vowpal Wabbit and online solutions, where you parse everything line by line. You need to invest more in programming, though.
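
As a rough sketch of the online, line-by-line idea (using scikit-learn's `partial_fit` rather than Vowpal Wabbit itself, with in-memory batches standing in for chunks read from a file):

```python
# Online learning: instead of loading the whole dataset, feed the model one
# small batch at a time with partial_fit. The batches are generated in memory
# purely for illustration; in practice each batch would be parsed from a file.
import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier(random_state=0)
classes = np.array([0, 1])  # must be declared up front for partial_fit
rng = np.random.default_rng(0)

for _ in range(100):                      # pretend each loop reads one chunk
    X_batch = rng.normal(size=(50, 10))   # 50 rows, 10 features per chunk
    y_batch = (X_batch[:, 0] > 0).astype(int)
    model.partial_fit(X_batch, y_batch, classes=classes)

# The model never held more than one batch in memory at once.
X_new = rng.normal(size=(5, 10))
print(model.predict(X_new))
```

The trade-off is exactly the one mentioned above: you write more plumbing code yourself, but memory usage stays flat no matter how big the data file is.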

31. What is Feature Engineering?

In short, feature engineering can be understood as:

  • Feature transformation (e.g. converting numerical or categorical variables to other types)
  • Feature selection
  • Exploiting feature interactions (like should I combine variable A with variable B?)
  • Treating null values
  • Treating outliers
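
A toy illustration of these bullets in pandas, with invented column names and values, might look like:

```python
# One small example per bullet: transformation, null handling, outlier
# handling, and a feature interaction. Data and column names are made up.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age":    [25, 32, np.nan, 51, 200],   # 200 is an obvious outlier
    "income": [40_000, 52_000, 61_000, np.nan, 75_000],
    "city":   ["NY", "SF", "NY", "LA", "SF"],
})

# Feature transformation: categorical -> numeric via one-hot encoding.
df = pd.get_dummies(df, columns=["city"])

# Treating null values: fill with the column median.
df["age"] = df["age"].fillna(df["age"].median())
df["income"] = df["income"].fillna(df["income"].median())

# Treating outliers: clip to a plausible range.
df["age"] = df["age"].clip(upper=90)

# Feature interaction: combine two variables into a new one.
df["income_per_year_of_age"] = df["income"] / df["age"]

print(df.head())
```

Feature selection, the remaining bullet, would then pick the subset of these columns that actually helps the model.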

32. Which maths skills are important in machine learning?

Some basic probability along with linear algebra (e.g. vectors). Then some stats help too, like averages, frequencies, and standard deviation.
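
For instance, the basic statistics mentioned here take only a few lines of Python's standard library:

```python
# Average, frequency, and standard deviation on a small made-up sample.
import statistics
from collections import Counter

scores = [4, 7, 7, 8, 10]

mean = statistics.mean(scores)      # average: 7.2
stdev = statistics.stdev(scores)    # sample standard deviation
freq = Counter(scores)              # frequency of each value

print(mean)
print(round(stdev, 2))
print(freq[7])                      # 7 appears twice
```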

33. Can you share your previous solutions?

You can see some with code and some without (just the general approach).

34. How long should it take for you to build your first machine learning predictor ?

It depends on the problem (size, complexity, number of features). Generally, in the beginning you might spend a lot of time on things that will seem much easier later on. You should not worry about the time, as it will differ for each person depending on programming background and other experience.

35. Are there any knowledge competitions you can recommend where you are not necessarily competing at the same level as on Kaggle but are building your skills?

From here, both Titanic and Digit Recognizer are good competitions to start with. Titanic is better because it uses a flat file. Digit Recognizer is an image classification problem, so it might be more advanced.

36. What is your opinion about using Weka and/or R vs Python for learning machine learning?

I like Weka. It has good documentation, especially if you want to learn the algorithms. However, I have to admit that it is not as efficient as some of the R and Python implementations, though it has good coverage. Weka also has some good visualizations, especially for some tree-based algorithms. I would probably suggest you focus on R and Python at first, unless your background is strictly in Java.

Summary

In short, succeeding in machine learning competitions is all about learning new things, spending a lot of time training, engineering features, and validating models. Alongside that, interact with the community on forums, read blogs, and learn from the approaches of fellow competitors.

Success will come if you keep trying. Cheers!

8 Different Job Roles in Data Science / Big Data Industry

Introduction

“This hot new field promises to revolutionize industries from business to government, health care to academia,” says the New York Times. People have woken up to the fact that without analyzing the massive amounts of data at their disposal and extracting valuable insights from it, there really is no way to sustain success in the coming years.

Touted as the most promising profession of the century, data science needs business-savvy people who count data literacy and strategic thinking among their key skills. Anjul Bhambri, VP of Architecture at Adobe (and previously IBM’s VP of Big Data Products), says, “A Data Scientist is somebody who is inquisitive, who can stare at data and spot trends. It’s almost like a Renaissance individual who really wants to learn and bring change to an organization.”

How do we get value from this avalanche of data in every sector in the economy? Well, we get persistent and data-mad personnel skilled in math, stats, and programming to weave magic using reams of letters and numbers.

Over the last few years, people have moved away from the umbrella term, data scientist. Companies now advertise for a diverse set of job roles such as data engineers, data architects, business analysts, MIS reporting executives, statisticians, machine learning engineers, and big data engineers.

In this post, you’ll get a quick overview about these exciting positions in the field of analytics. But do remember that companies often tend to define job roles in different ways based on the inner workings rather than market descriptions.

List of Job Roles in Data Science / Big Data

1. MIS Reporting Executive

Business managers rely on Management Information System reports to automatically track progress, make decisions, and identify problems. Most systems give you on-demand reports that collate business information, such as sales revenue, customer service calls, or product inventory, which can be shared with key stakeholders in an organization.

Skills Required:

MIS reporting executives typically have degrees in computer science or engineering, information systems, and business management or financial analysis. Some universities also offer degrees in MIS. Look at this image from the University of Arizona which clearly distinguishes MIS from CS and Engineering.

Roles & Responsibilities:

MIS reporting executives meet with top clients and co-workers in public relations, finance, operations, and marketing teams in the company to discuss how far the systems are helping the business achieve its goals, discern areas of concern, and troubleshoot system-related problems including security.

They are proficient in handling data management tools and different types of operating systems, implementing enterprise hardware and software systems, and in coming up with best practices, quality standards, and service level agreements. Like they say, an MIS executive is a “communication bridge between business needs and technology.”


2. Business Analyst

Although many of their job tasks are similar to that of data analysts, business analysts are experts in the domain they work in. They try to narrow the gap between business and IT. Business analysts provide solutions that are often technology-based to enhance business processes, such as distribution or productivity.

Organizations need these “information conduits” for a plethora of things such as gap analysis, requirements gathering, knowledge transfer to developers, defining scope using optimal solutions, test preparation, and software documentation.

Skills Required:

Apart from a degree in business administration in the field of your choice, say, healthcare or finance, aspiring business analysts need to have knowledge of data visualization tools such as Tableau and requisite IT know-how, including database management and programming.

You could also major in computer science with additional courses in statistics, organizational behavior, and quality management. Or you could get professional certifications such as the Certified Business Analysis Professional (CBAP®) or PMI Professional in Business Analysis (PBA). Many universities offer degrees in business intelligence, business analytics, and analytics. Check out the courses in the U.S./India.

Roles & Responsibilities:

Business analysts identify business needs, crystallizing the data for easy understanding, manipulation, and analysis via clear and precise requirements documentation, process models, and wireframes. They identify key gaps, challenges, and potential impacts of a solution or strategy.

In a day, a business analyst could be doing anything from defining a business case or eliciting information from top management to validating solutions or conducting quality testing. Business analysts need to be effective communicators and active listeners, resilient and incisive, to translate tech speak or statistical analysis into business intelligence.

They use predictive, prescriptive, and descriptive analysis to transform complex data into easily understood actionable insights for the users. A change manager, a process analyst, and a data analyst could well be doing business analysis tasks in their everyday work.

3. Data Analyst

Unlike data scientists, data analysts are generalists. Udacity calls them junior data scientists. They play a gamut of roles, from acquiring massive amounts of data to processing and summarizing it.

Skills Required:

Data analysts are expected to know R, Python, HTML, SQL, C++, and Javascript. They need to be more than a little familiar with data retrieval and storing systems, data visualization and data warehousing using ETL tools, Hadoop-based analytics, and business intelligence concepts. These persistent and passionate data miners usually have a strong background in math, statistics, machine learning, and programming.

Roles & Responsibilities:

Data analysts are involved in data munging and data visualization. When there are requests from stakeholders, data analysts have to query databases. They are in charge of scraped data, assuring its quality and managing it. They have to interpret data and effectively communicate their findings.

Optimization is a must-know skill for a data analyst. Designing and deploying algorithms, culling information and recognizing risk, extrapolating data using advanced computer modeling, triaging code problems, and pruning data are all in a day’s work for a data analyst.

4. Statistician

Statisticians collect, organize, present, analyze, and interpret data to reach valid conclusions and make correct decisions. They are key players in ensuring the success of companies involved in market research, transportation, product development, finance, forensics, sport, quality control, environment, education, and also in governmental agencies. A lot of statisticians continue to enjoy their place in academia and research.

Skills Required:

Typically, statisticians need higher degrees in statistics, mathematics, or any quantitative subject. They need to be mini-experts of the industries they choose to work in. They need to be well-versed in R programming, MATLAB, SAS, Python, Stata, Pig, Hive, SQL, and Perl.

They need to have a strong background in statistical theories, machine learning, data mining and munging, cloud tools, distributed tools, and DBMS. Data visualization is a hugely useful skill for a statistician. Aside from industry knowledge and problem-solving and analytical skills, excellent communication is a must-have for reporting results to non-statisticians in a clear and concise manner.

Roles & Responsibilities:

Using statistical analysis software tools, statisticians analyze collected or extracted data, trying to identify patterns, relationships, or trends to answer data-related questions posed by administrators or managers. They interpret the results, along with strategic recommendations or incisive predictions, using data visualization tools or reports.

Maintaining databases and statistical programs, ensuring data quality, and devising new programs, models, or tools if required also come under the purview of statisticians. Translating boring numbers into exciting stories is no easy task!

5. Data Scientist

One of the most in-demand professionals today, data scientists rule the roost of number crunchers. Glassdoor says this is the best job role for someone focusing on work-life balance. Data scientists are no longer just scripting success stories for global giants such as Google, LinkedIn, and Facebook.

Almost every company has some sort of a data role on its careers page. Job descriptions for data scientists and data analysts show a significant overlap.

Skills Required:

They are expected to be experts in R, SAS, Python, SQL, MATLAB, Hive, Pig, and Spark. They typically hold higher degrees in quantitative subjects such as statistics and mathematics and are proficient in Big Data technologies and analytical tools. Using Burning Glass’s tool Labor Insight, Rutgers students came up with some key insights after running a fine-toothed comb through job postings data in 2015.

Roles & Responsibilities:

As Jean-Paul Isson of Monster Worldwide, Inc., says, “Being a data scientist is not only about data crunching. It’s about understanding the business challenge, creating some valuable actionable insights to the data, and communicating their findings to the business.” Data scientists come up with queries.

Along with predictive analytics, they also use coding to sift through large amounts of unstructured data to derive insights and help design future strategies. Data scientists clean, manage, and structure big data from disparate sources. These “curious data wizards” are versatile to say the least—they enable data-driven decision making often by creating models or prototypes from trends or patterns they discern and by underscoring implications.

6. Data Engineer/Data Architect

“Data engineers are the designers, builders and managers of the information or ‘big data’ infrastructure.” Data engineers ensure that an organization’s big data ecosystem is running without glitches so data scientists can carry out their analysis.

Skills Required:

Data engineers are computer engineers who must know Pig, Hadoop, MapReduce, Hive, MySQL, Cassandra, MongoDB, NoSQL, SQL, data streaming, and programming. Data engineers have to be proficient in R, Python, Ruby, C++, Perl, Java, SAS, SPSS, and MATLAB.

Other must-have skills include knowledge of ETL tools, data APIs, data modeling, and data warehousing solutions. They are typically not expected to know analytics or machine learning.

Roles & Responsibilities:

Data infrastructure engineers develop, construct, test, and maintain highly scalable data management systems. Unlike data scientists who seek an exploratory and iterative path to arrive at a solution, data engineers look for the linear path. Data engineers will improve existing systems by integrating newer data management technologies.

They will develop custom analytics applications and software components. Data engineers collect and store data, do real-time or batch processing, and serve it for analysis to data scientists via an API. They log and handle errors, identify when to scale up, ensure seamless integration, and “build human-fault-tolerant pipelines.” The career path would be Data Engineer → Senior Data Engineer → BI Architect → Data Architect.

7. Machine Learning Engineer

Machine learning (ML) has become quite a booming field with the mind-boggling amount of data we have to tap into. And, thankfully, the world still needs engineers who use amazing algorithms to make sense of this data.

Skills Required:

Engineers should focus on Python, Java, Scala, C++, and JavaScript. To become a machine learning engineer, you need to know how to build highly scalable distributed systems, be confident in machine learning concepts, play around with big datasets, and work in teams that focus on personalization.

ML engineers are data- and metric-driven and have a strong foundation in mathematics and statistics. They are expected to have experience with Elasticsearch, SQL, Amazon Web Services, and REST APIs. As always, great communication skills are vital for explaining complex ML concepts to non-experts.

Roles & Responsibilities:

Machine learning engineers have to design and implement machine learning applications/algorithms such as clustering, anomaly detection, classification, or prediction to address business challenges. ML engineers build data pipelines, benchmark infrastructure, and do A/B testing.

They work collaboratively with product and development teams to improve data quality via tooling, optimization, and testing. ML engineers have to monitor the performance and ensure the reliability of machine learning systems in the organization.
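
To make one of the tasks named above concrete, here is clustering in its simplest scikit-learn form; the data is synthetic, where a real pipeline would of course pull from production data.

```python
# Group unlabeled points into clusters with k-means; synthetic blob data
# stands in for real business data.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=7)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=7).fit(X)

print(kmeans.labels_[:10])             # cluster assignment for first 10 points
print(kmeans.cluster_centers_.shape)   # three cluster centers in 2-D
```

In practice, the ML engineer's work is less about this call itself and more about wrapping it in pipelines, monitoring, and A/B tests as the section describes.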

8. Big Data Engineer

What a big data solutions architect designs, a big data engineer builds, says DataFloq founder Mark van Rijmenam. Big data is a big domain, and every role has its own specific responsibilities.

Skills Required:

Big data engineers, who have computer engineering or computer science degrees, need to know basics of algorithms and data structures, distributed computing, Hadoop cluster management, HDFS, MapReduce, stream-processing solutions such as Storm or Spark, big data querying tools such as Pig, Impala and Hive, data integration, NoSQL databases such as MongoDB, Cassandra, and HBase, frameworks such as Flume and ETL tools, messaging systems such as Kafka and RabbitMQ, and big data toolkits such as H2O, SparkML, and Mahout.

They must have experience with Hortonworks, Cloudera, and MapR. Knowledge of different programming and scripting languages is a non-negotiable skill. Usually, people with 1 to 3 years of experience handling databases and in software development are preferred for an entry-level position.

Roles & Responsibilities:

Rijmenam says, “Big data engineers develop, maintain, test, and evaluate big data solutions within organizations. Most of the time they are also involved in the design of big data solutions, because of the experience they have with Hadoop-based technologies such as MapReduce, Hive, MongoDB or Cassandra.”

To support big data analysts and meet business requirements via customization and optimization of features, big data engineers configure, use, and program big data solutions. Using various open source tools, they “architect highly scalable distributed systems.” They have to integrate data processing infrastructure and data management.

It is a highly cross-functional role. With more years of experience come increased responsibilities in development and operations; policies, standards, and procedures; communication; business continuity and disaster recovery; coaching and mentoring; and research and evaluation.

Summary

Companies are running helter-skelter looking for experts to draw meaningful conclusions and make logical predictions from mammoth amounts of data. To meet these requirements, a slew of new job roles have cropped up, each with slightly different roles & responsibilities and skill requirements.

Blurring boundaries aside, all of these job roles are exciting and very much in demand. Whether you are a data hygienist, data explorer, data modeling expert, data scientist, or business solution architect, ramping up your skill portfolio is always the best way forward.

Look at these trends from Indeed.com

If you know exactly what you want to do with your coveted skillset comprising math, statistics, and computer science, then all you need to do is hone the specific combination that will make you a name to reckon with in the field of data science or data engineering.

To read more informative posts about data science and machine learning, go here.

17 Post Graduation Courses on Machine Learning & Data Science in the US and India

Introduction

We certainly have some interesting times to look forward to. All ed tech and career forecasts for this decade talk about artificial intelligence (AI) technologies, including machine learning, deep learning, and natural language processing, enabling digital transformation in ways that are quite “out there.”

To stay relevant in this economy, the brightest minds naturally want to stay ahead of the pack by specializing in these exciting fields.

For people already equipped with degrees in computer science, engineering, math, or statistics, going back to school may not be a feasible or attractive route to new career options. So, they typically get certified through edX, Coursera, and Udacity. Read more about top free courses from these ed platforms here.

In the U.S., many premier universities offer offline and online graduate programs in data science, and only a few in machine learning. Some universities, such as Johns Hopkins, Princeton, Rutgers, and the University of Wisconsin–Madison, offer machine learning/AI courses designed for data science, computer science, math, or stats graduate students.

But for students who would rather not wait to learn on the job, we’ve put together a list of universities that offer graduate and/or PhD programs on campus in the US and India.

Table of Contents

  1. Universities / Colleges in the US
    • Carnegie Mellon University, Pennsylvania
    • University of Washington, Washington
    • Columbia University, New York
    • Stanford University, California
    • Texas A&M University, Texas
    • New York University, New York
    • Georgia Tech, Georgia
    • North Carolina State University, North Carolina
    • Northwestern University, Illinois
    • UC Berkeley, California
  2. Universities / Colleges in India
    • Great Lakes Institute of Management, Gurgaon / Chennai / Bengaluru
    • SP Jain School of Global Management, Pune
    • Narsee Monjee Institute of Management Studies, Mumbai
    • MISB Bocconi, Mumbai
    • Indian School of Business (ISB), Bengaluru
    • IIM Bangalore
    • Institute of Finance and International Management (IFIM), Bengaluru

Universities / Colleges in the US

1. Carnegie Mellon University, Pennsylvania

Situated in Pittsburgh, CMU has seven colleges and independent schools and is among the top 25 universities in the U.S. The Machine Learning Department offers three courses to introduce students to the concept of data-driven decision making:

  • Master of Science in Machine Learning, which focuses on data mining. For information about the application procedure and deadlines, go here.
  • Secondary Master’s in Machine Learning, which is open only to its PhD students, faculty, and staff. For information about admission requirements and application, go here.
  • Fifth Year Master’s in Machine Learning for its undergraduate students to get an MS by earning credits in ML courses. For information about program requirements and application, go here.
  • The Language Technologies Department offers a Master of Computational Data Science degree.

2. University of Washington, Washington

UW’s Master of Science in Data Science degree teaches students to manage, model, and visualize big data. Expert faculty from six of the university’s departments teach this fee-based course and expect students to have “a solid background in mathematics, computer programming and communication.” The course, with evening classes on campus, is designed for working professionals, who can enroll as part-time or full-time students.

  • For information about the application procedure and deadlines, go here.
  • For information about financial aid and cost of study, go here.

UW’s Certificate in Data Science teaches basic math, computer science, and analytics to aspiring data scientists. Professionals are expected to know some SQL, programming, and statistics. Data storage and manipulation tools (e.g. Hadoop, MapReduce), core machine learning concepts, types of databases, and real-life data science applications are part of the curriculum.

3. Columbia University, New York

Its Master of Science in Data Science is a great option for careerists who want to switch to data science. Students need to earn 30 credits, 21 by taking the core courses, including machine learning, and 9 credits by working on an elective (Foundations of Data Science, Cybersecurity, Financial and Business Analytics, Health Analytics, New Media Sense, Collect and Move Data, Smart Cities) from the Data Science Institute. The university offers both part-time and full-time options.

  • For more course information, go here.

The department also has an online Certification of Professional Achievement in Data Sciences course. The Computer Science Department has a Machine Learning Track as a part of the MS degree in CS.

4. Stanford University, California

The Department of Statistics and the Institute for Computational and Mathematical Engineering (ICME) offer an M.S. in Data Science; it is a terminal degree for the former and a specialized track within ICME. Students can choose from several electives, ranging from machine learning to human neuroimaging methods, but strong math (linear algebra, numerical methods, probability, PDEs, stats, etc.) and programming skills (C++, R) form the core of the course. Go to the homepage for more information about prerequisites and requirements.

  • For information about admissions and financial aid, go here.

5. Texas A&M University, Texas

The Houston-based university has a Master of Science in Analytics degree offered by the Department of Statistics. The course is tailored for “working professionals with strong quantitative skills.” What’s more, students can access Mays Business School courses as well. The part-time course, with evening classes, takes two years to complete. The program, which focuses on statistical modeling and predictive analysis, does have an online option.

  • For information on course requirements, go here.

6. New York University, New York

The Master of Science in Data Science is for students with a strong programming and mathematical background. The Center for Urban Science and Progress and the Center for the Promotion of Research Involving Innovative Statistical Methodology work closely with the Center for Data Science. The university offers full-time and part-time options; students have to earn 36 credits and also have six electives to choose from. Tuition scholarships are available although not for university fees.

  • For more information about the course, go here.

7. Georgia Tech, Georgia

Georgia Tech’s on-campus Master of Science in Analytics program offers opportunities to strengthen your skills in statistics, computing, operations research, and business. The instructors include experts from the College of Engineering, the College of Computing, and the Scheller College of Business. Applicants to this premium-tuition program are expected to be proficient in basic mathematical concepts such as calculus and statistics and in high-level computing languages such as C++ and Python. Depending on their career goals, students can choose one of these tracks: Analytical Tools, Business Analytics, and Computational Data Analytics.

What’s great for the students is that the college has dedicated job placement assistance and chances to network with influencers in the data science industry.

  • For more information on how to apply, go here.

The College of Computing has courses in artificial intelligence (AI) and machine learning (ML) at the undergraduate and graduate levels, but it does not award degrees in these areas.

8. North Carolina State University

The Institute for Advanced Analytics offers a 10-month Master of Science in Analytics degree. The program is “innovative, practical, and relevant.” The summer session includes a statistics primer and analytics tools and foundations. The practicum, which lasts eight months across the fall and spring, teaches a range of topics including data mining, machine learning, optimization, simulation & risk, web analytics, financial analytics, data visualization, and business concepts such as project management.

  • For information about application requirements and procedures, go here.
  • For information about the tuition and fees, go here.

9. Northwestern University, Illinois

McCormick School of Engineering and Applied Science offers a 15-month full-time MS in Analytics degree. The faculty “combines mathematical and statistical studies with instruction in advanced information technology and data management.” The course has an 8-month practicum project, 3-month summer internship, and a 10-week capstone project. Scholarships that cover up to 50% of the tuition are available on merit basis.

  • For information about admission requirements and procedures, go here.
  • For information about the tuition and funding, go here.

10. UC Berkeley, California

Although the Master of Information and Data Science is an online course, students have to attend a week on campus. The curriculum covers areas in social science, policy research, statistics, computer science, and engineering. The full-time option takes 12 to 20 months; the university lets you complete the course part time as well.

  • For more information about the course, go here.

Universities / Colleges in India

1. Great Lakes Institute of Management

Great Lakes’ Post Graduate Program in Business Analytics and Business Intelligence has been ranked the best analytics course in the country by Analytics India Magazine. The course is designed for working professionals and is offered on its Chennai, Gurgaon, and Bengaluru campuses. The curriculum combines business management skills and analytics, including case studies and hands-on training in relevant tools such as Tableau, R, and SAS. Students have to attend 230 hours of classroom sessions and 110 hours of online sessions.

  • For more information about the program, go here.

2. SP Jain School of Global Management

Students can opt for the full-time or part-time options of the Big Data & Analytics program offered by the Mumbai-based institute. People with prior work experience are given preference. The program has 10 core courses including cutting-edge topics such as machine learning, data mining, predictive modeling, natural language processing, visualization techniques, and statistics. Industry experts and academicians focus on application-based learning, teaching students how to apply current tools and technologies to extract valuable insights from big data.

  • For more information about the program, go here.

3. Narsee Monjee Institute of Management Studies

It offers a 1-year Postgraduate Certificate Program in Business Analytics in partnership with the University of South Florida. The course, conducted on its Mumbai campus, combines classroom training with online sessions. NMIMS faculty teach 12 hours and USF Muma College of Business faculty 20 hours, instructing students on current business analytics tools, methodologies, and technologies. The course covers topics such as introduction to statistics, database management, business intelligence and visualization, machine learning, big data analytics, data mining, financial analytics, and optimization. Students learn how to tackle real-world business issues through a capstone project.

  • For more information about the program, go here.

4. MISB Bocconi

The 12-month Executive Program in Business Analytics is taught by renowned faculty from SDA Bocconi (Milan) and Jigsaw Academy at the Mumbai International School of Business Bocconi (MISB) campus in Mumbai. The course content comprises web analytics, statistics, visualization, R, time series, text mining, SAS, machine learning, Big Data (Sqoop, Flume, Pig, HBASE, Hive, Oozie, and SPARK), and digital marketing. Students learn core concepts of business analytics and its application across various domains.

  • For more information about the course curriculum, go here.

5. Indian School of Business (ISB)

ISB offers a Certificate Program in Business Analytics on its Hyderabad campus. The course is designed for working professionals (with at least 3 years of work experience) who have to spend 18 days at the institute during the 12-month program; a technology-aided learning platform takes over the rest of the time. The rigorous course is chock-full of lectures, projects, and assignments. The comprehensive curriculum also includes preparatory pre-term courses and a capstone project.

  • For more information about the course curriculum, go here.

6. IIM Bangalore

The year-long Certificate Program on Business Analytics and Intelligence comprises six modules and a project. The course content includes Data Visualization and Interpretation, Data Preprocessing and Imputation, Predictive Analytics: Supervised Learning Algorithms, Optimization Analytics, Stochastic Models, Data Reduction, Advanced Forecasting and Operations Analytics, Machine Learning Algorithms, Big Data Analytics, and Analytics in Finance and Marketing. The institute expects applicants to have a minimum of 3 years of work experience. Online classes are open to a limited number of participants, who must attend on-campus sessions as well.

  • For information about eligibility criteria, go here.
  • For information about the program fees, go here.

7. Institute of Finance and International Management (IFIM)

The Institute of Finance and International Management, Bangalore, offers a 15-month full-time Business Analytics program for working executives. Program features include live streaming and classroom sessions, the opportunity to work with relevant IBM, open-source, and Microsoft software, and convenient weekend classes.

  • For more information about this program, go here.

Conclusion

With the huge amounts of data pouring in and the need to apply analytical solutions to address business challenges, the future looks brighter than ever for data scientists and machine learning experts. Salaries are naturally high for these much sought-after skills.

For programmers and statisticians, getting certified is the next step. For students looking to distinguish themselves, these are great career opportunities.

In this post, we have put together a list of graduate programs offered by highly ranked institutes and universities in the US and India. On-campus courses are interactive; nothing can beat face-to-face contact with the faculty and peers, the friends you make, and the easy access to relevant resources.

Top 13 (free) must-read machine learning books for beginners

Getting learners to read textbooks and use other teaching aids effectively can be tricky, especially when the books are just plain dreary.

In this post, we’ve compiled great e-resources for you digital natives looking to explore the exciting world of Machine Learning and Neural Networks. But before you dive into the deep end, you need to make sure you’ve got the fundamentals down pat.

It doesn’t matter what catches your fancy, machine learning, artificial intelligence, or deep learning; you need to know the basics of math and stats—linear algebra, calculus, optimization, probability—to get ahead.

Top machine learning books to read for beginners

  1. Matrix Computations

    This 2013 edition by Golub and Van Loan, published by The Johns Hopkins University Press, teaches you about matrix analysis, linear systems, eigenvalues, discrete Poisson solvers, least squares, parallel LU, pseudospectra, Singular Value Decomposition, and much more.

    This book is an indispensable tool for engineers and computational scientists. It has great reviews on Amazon, especially by users looking for problems, discussions, codes, solutions, and references in numerical linear algebra.

    Free Book: Download here
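If you'd like a quick taste of the material before downloading, here's a tiny NumPy sketch (ours, not the book's) that computes a singular value decomposition and uses it to solve a least-squares problem:

```python
import numpy as np

# An overdetermined system: 3 data points, 2 unknowns (intercept and slope)
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# Singular Value Decomposition: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print("singular values:", s)

# Least-squares solution of A x ≈ b via the pseudoinverse built from the SVD
x = Vt.T @ np.diag(1.0 / s) @ U.T @ b
print("least-squares x:", x)

# Cross-check against NumPy's dedicated least-squares solver
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
print("matches lstsq:", np.allclose(x, x_ref))
```

Building the pseudoinverse from the SVD is exactly the kind of numerically careful construction the book analyzes in depth.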

  2. A Probabilistic Theory of Pattern Recognition

    Written by Devroye, Lugosi, and Györfi, this is an excellent book for graduate students and researchers. The book covers various probabilistic techniques including nearest neighbour rules, feature extraction, Vapnik-Chervonenkis theory, distance measures, parametric classification, and kernel rules.

    Amazon reviewers laud it for its nearly 500 problems and exercises.

    Wikipedia says “The terms pattern recognition, machine learning, data mining and knowledge discovery in databases are hard to separate, as they largely overlap in their scope.”

    No wonder, machine learning enthusiasts swear by this comprehensive, theoretical book on “nonparametric, distribution-free methodology in Pattern Recognition.”

    Free Book: Download here

  3. Advanced Engineering Mathematics

    Erwin Kreyszig’s book covers the basics of applied math in a comprehensive yet accessible manner for engineers, computer scientists, mathematicians, and physicists.

    It teaches you Fourier analysis, vector analysis, linear algebra, optimization, graphs, complex analysis, and differential and partial differential equations.

    It has up-to-date and effective problem sets that ensure you understand the concepts clearly.

  4. Probability and Statistics Cookbook

    A collection of math and stats reference material from the University of California (Berkeley) and other sources put together by Matthias Vallentin, this cookbook is a must-have for learners.

    There are no elaborate explanations but concise representations of key concepts. You can view it on GitHub, or download a PDF file using the link below.

    Free Book: Download here

  5. An Introduction to Statistical Learning (with applications in R)

    This book written by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani is meant for non-math students.

    For data scientists, this is a valuable addition because of its R labs.

    The TOC includes linear regression, classification, resampling methods, linear model and regularization, tree-based methods, shrinkage approaches, clustering, support vector machines, and unsupervised learning.

    With interesting real-world examples and attractive graphics, this is a great text for statistical tools and techniques.

    Free Book: Download here

  6. Probabilistic Programming and Bayesian Methods for Hackers

    Cameron Davidson-Pilon describes Bayesian methods and probabilistic programming from math and computation perspectives.

    The book discusses modeling Bayesian problems using Python’s PyMC, loss functions, the Law of Large Numbers, Markov Chain Monte Carlo, priors, and much more.

    The content is open sourced. The print version has updated examples, end-of-chapter questions, and improved and additional sections.

    Free Book: Download here
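If you want a one-minute preview of the Law of Large Numbers the book leans on, here's a plain-Python sketch (ours, not taken from the book): as the number of dice rolls grows, the sample mean homes in on the true mean of 3.5.

```python
import random

random.seed(0)  # fixed seed so the demo is reproducible

true_mean = 3.5  # expected value of a fair six-sided die
for n in (100, 10_000, 1_000_000):
    rolls = [random.randint(1, 6) for _ in range(n)]
    sample_mean = sum(rolls) / n
    print(f"n={n:>9}: sample mean = {sample_mean:.4f} "
          f"(error {abs(sample_mean - true_mean):.4f})")
```

Davidson-Pilon uses this convergence behavior to explain why Monte Carlo estimates become trustworthy only once you've drawn enough samples.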

  7. The Elements of Statistical Learning

    Authors Trevor Hastie, Robert Tibshirani, and Jerome Friedman (all three are Stanford professors) discuss supervised learning, linear methods of regression and classification, kernel smoothing methods, regularization, model selection and assessment, additive trees, SVM, neural networks, random forests, nearest neighbors, unsupervised learning, ensemble methods, and more.

    This book covers a broad range of topics and is particularly useful for researchers interested in data mining and machine learning.

    You need to know linear algebra and some stats before you can appreciate the text.

    This is what one of the reviewers said about the book on Amazon: The Elements of Statistical Learning is a comprehensive mathematical treatment of machine learning from a statistical perspective.

    Free Book: Download here

  8. Bayesian Reasoning and Machine Learning

    David Barber’s book is a comprehensive piece of writing on graphical models and machine learning.

    Meant for final-year undergraduate and graduate students, this text has ample guidelines, examples, and exercises. The author also offers a MATLAB toolbox and a related website.

    It covers inference in probabilistic models, including belief networks, inference in trees, the junction tree algorithm, and decision trees; learning in probabilistic models, including Naive Bayes, hidden variables and missing data, supervised and unsupervised linear dimension reduction, Gaussian processes, and linear models; dynamic models, including discrete- and continuous-state Markov models and distribution computation; and approximate inference.

    Free Book: Download here

  9. Information Theory, Inference, and Learning Algorithms

    David MacKay’s exciting book discusses key concepts that form the core of machine learning, data mining, pattern recognition, bioinformatics, and cryptography.

    Amazon reviewers find the illustrations, depth, and “esoteric” approach remarkable.

    It is a great book on information theory and inference, which covers topics such as data compression, noisy-channel coding, probabilities, neural networks, and sparse graph codes.

    Free Book: Download here
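To get a feel for the subject, here are a few lines of Python (our own sketch, not MacKay's code) computing Shannon entropy, the quantity behind the book's data-compression results:

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair coin carries exactly 1 bit per flip...
print(entropy([0.5, 0.5]))   # 1.0
# ...while a biased coin carries less, which is what makes compression possible.
print(entropy([0.9, 0.1]))
print(entropy([0.25] * 4))   # 2.0 bits for a fair four-sided die
```

MacKay's source-coding results say, roughly, that you cannot compress a source below its entropy, and good codes get close to it.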

  10. Deep Learning

    This is what Elon Musk, co-founder of Tesla Motors, has to say about this definitive text written by Ian Goodfellow, Yoshua Bengio, and Aaron Courville: “Written by three experts in the field, Deep Learning is the only comprehensive book on the subject.”

    The authors talk about applied math and machine learning basics, deep networks and modern practices, and deep learning research.

    For engineers interested in neural networks, this could well be their bible.

    The book is highly recommended for people in academia, providing the required mathematical background to fully appreciate deep learning in its current state.

    Free Book: Download here

  11. Neural Networks and Deep Learning

    Michael Nielsen’s free online book is a comprehensive text on the core concepts of deep learning and artificial neural networks.

    The book has great interactive elements, but it does not provide solutions for the exercises. Laid out like a narrative, it sticks to core math and code to explain the key ideas.

    He talks about backpropagation, hyperparameter optimization, activation functions, neural networks as function approximators, regularization, a little about convolutional neural networks, and more.

    The author includes valuable links to ongoing research and influential research papers and related tutorials.

    Free Book: Download here

  12. Supervised Sequence Labelling with Recurrent Neural Networks

    Alex Graves discusses how to classify and transcribe sequential data, which is important in part-of-speech tagging, gesture, handwriting, and speech recognition, and protein secondary structure prediction.

    He talks about the role of recurrent neural networks in sequence labeling.

    Long short-term memory, a comparison of network architectures, hidden Markov model hybrids, connectionist temporal classification, multidimensional networks, and hierarchical subsampling networks are covered in other chapters of this book.

    Free Book: Download here

  13. Reinforcement Learning: An Introduction

    Richard S. Sutton and Andrew G. Barto’s pioneering book on reinforcement learning covers the intellectual background, applications, algorithms, and the future of this exciting field. These University of Massachusetts professors describe this artificial intelligence concept with clarity and simplicity.

    This book includes interesting topics such as Markov decision processes, Monte Carlo methods, dynamic programming, temporal-difference learning, eligibility traces, and artificial neural networks.

    Free Book: Download here
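For a feel of the book's opening chapters, here's a minimal epsilon-greedy bandit in Python — our own sketch with made-up win probabilities, in the spirit of Sutton and Barto's early examples:

```python
import random

random.seed(1)

# Three slot-machine arms with hidden win probabilities; arm 2 is the best.
arm_probs = [0.3, 0.5, 0.8]

q = [0.0] * 3   # estimated value of each arm
n = [0] * 3     # pull counts per arm
epsilon = 0.1   # exploration rate

for _ in range(5_000):
    # Epsilon-greedy action selection: usually exploit, occasionally explore.
    if random.random() < epsilon:
        a = random.randrange(3)
    else:
        a = max(range(3), key=lambda i: q[i])
    reward = 1.0 if random.random() < arm_probs[a] else 0.0
    # Incremental sample-average update: Q <- Q + (R - Q) / N
    n[a] += 1
    q[a] += (reward - q[a]) / n[a]

print("estimated values:", [round(v, 2) for v in q])
print("best arm found:", q.index(max(q)))
```

Despite its size, this sketch exhibits the exploration-exploitation trade-off that runs through the whole book: the agent must keep sampling arms it believes are worse, or it can never correct a bad early estimate.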

Summary

What’s better than getting educational resources that are free and authored by pioneers in the field?

Can’t think of a downside, really. Especially for struggling students, these ebooks are a boon.

They don’t need to wait for the books to turn up at the library or swap with others; grab them and start learning!

So, what’s stopping you from picking up one of these excellent books and fashioning a successful career in data science, AI, or machine learning?