Using Pandas for basic stock analysis


Pandas is excellent for data analyis as we’ve demonstrated in previous articles. In this article, I’m going to some basic stock analysis. To do this, I have downloaded some sample (historic) NASDAQ data and will use that as the basis of the study. Note: I have changed the price of some of the stocks, to ensure that at least one stock fails a condition within the codeself.

It’s important to mention that this is purely for the purpose of demonstrating Pandas functionality. This would not be an exercise carried out by investment analysts. However, subsequent articles will cover more ‘real-world’ applications.

Let’s get started:
First off, we need to import our libraries and load the txt document into a dataframe, as we have done before. The difference here is that the headers don’t exist in the file & hence can’t be inferred. So, we explicitly define them.

Next, we need to add a calculated field to the dataframe. In the below, we’re calculating the percentage change between the open price & the closing price.

Next, I like to sort the values in the dataframe. This isn’t entirely necessay, but it’s useful for visual validation.

Now, I want to create a subset of this dataframe. This will reduce it from hundreds of rows, down to only those that are in my test portfolio.

I don’t want to monitor the stock changes daily, so I only want to output those that have a lower percent change than I am happy with. So, I’ve created a parameter and filter my portfolio to show only those records where the percent change is lower than my parameter.

Next, I want to save the risk items to a csv, in my historical data folder:

Now, I want to read yesterdays file into a dataframe and load both towday & yesterdays risk items into smaller dataframes & then append the two dataframes, to make one dataframe with both days data:

Next, I want to look at the mean percent change on the newly joined dataframe, which will show me the mean percent change for all the items I’ve categorised as ‘risk’

I can then compare this against the average percent change of my overall portfolio

Then, I convert my dataframes to be lists & print all the tickers that appear in both today’s and yesterday’s lists – which would flag that they’re consistently problematic / risky.