## LATEST POSTS

### Assessing the attractiveness of an industry using the 5 forces

In this blog post, I aim to discuss how we, as economists, understand the relative performance of companies that operate within different industries. This is a hugely important skill as looking at the

Strategy

### An overview of the industry lifecycle

The industry lifecycle provides a view of the typical birth-to-death cycle of an industry. It has four phases: introduction, growth, maturity and decline – we will use the

Strategy

### Corporate vs business strategy – what are they?

It’s been a little while, but we’re back with a series on business strategy and, to kick it off, we’re going to look at the difference between corporate and business

Big Data

### Overview of decision trees & random forests

A decision tree builds a model in the form of a tree structure – almost like a flow chart. In order to calculate the expected outcome, it uses decision points

Big Data

### Using Spark in conjunction with Pandas

When completing my domain normalisation project, I used Spark to do the heavy lifting – getting data into a DataFrame and aggregating (group by and sum) – and then used

Big Data

### Machine Learning: A simple logistic regression model in Python

Below is a logistic regression model that uses some dummy data to determine whether people are at risk of diabetes or not – of course, this model couldn’t actually

Big Data

### Machine learning: A simple linear regression model in Python

Machine learning is described in detail in this article. Today, I want to run through a simple machine learning model that uses linear regression. What is regression? Regression aims to

Big Data

### Hive: Partition an un-partitioned table

There is no way to automatically partition an un-partitioned table, so we have to follow the simple process below as a workaround: Create a new table

Big Data

### Types of machine learning models: supervised vs unsupervised

Supervised learning is where we provide the model with the actual outputs from the data. This lets it build a picture of the data and form links between

Big Data

### How does machine learning work?

Machine learning uses statistical techniques to give computer systems the ability to ‘learn’ rather than being explicitly programmed. By learning from historical inputs, we’re able to achieve far greater accuracy

Big Data

### Python: finding most travelled customers

Using customer usage logs, I need to identify the customers who travel the most each day and understand the most popular routes across the world. Justification: Understanding the most travelled customers

Big Data

### Getting started with PySpark

In this article, we’ll look at some of the key components of PySpark, which is one of the most in-demand big data technologies at the current time. Spark Session Spark