Jul 19, 2011

[Fixed] Extracting EOD Data from NSE

My prime interest is the Indian financial markets, so the first step is to get some data to play around with. NSE India publishes end-of-day (EOD) data as bhavcopies, which are stored as zipped files on its servers. Downloading them one by one for a longer time frame would be very tedious, so I will attempt to automate the process.

There is a great tool for statistical computing called R. It is open-source, with a lot of development going on across various packages. It interests me a lot because of its simplicity and power, so I will attempt to automate the bhavcopy downloads with it. If you want to try the same, you can visit the downloads section of R-Project and get the latest version.


Objective: Download Bhavcopy (Equity) from http://www.nseindia.com and save only relevant columns Date, Symbol, Open, High, Low, Close, Last and Volume.

To download the Bhavcopy (Equity) from http://www.bseindia.com refer to this post.



Here is the R code for the same:

#28-10-2014: Fix for '403 Forbidden'
## Credit http://stackoverflow.com/questions/26086868/error-downloading-a-csv-in-zip-from-website-with-get-in-r
library(httr)
 
#Define Working Directory, where files would be saved
setwd('C:/Users/Gaurav/Documents/R Code/Working')
 
#Define start and end dates, and convert them into date format
startDate = as.Date("2014-01-01", format="%Y-%m-%d")
endDate = as.Date("2014-10-28", format="%Y-%m-%d")
 
#work with date, month, year for which data has to be extracted
myDate = startDate
zippedFile <- tempfile() 
 
while (myDate <= endDate){
  #Build file names; format() applies the format string to the Date
  filenameDate = paste(format(myDate, "%y%m%d"), ".csv", sep = "")
  monthfilename = paste(format(myDate, "%y%m"), ".csv", sep = "")
  downloadfilename = paste("cm", toupper(format(myDate, "%d%b%Y")), "bhav.csv", sep = "")
  temp = ""
 
  #Generate URL
  myURL = paste("http://www.nseindia.com/content/historical/EQUITIES/", format(myDate, "%Y"), "/", toupper(format(myDate, "%b")), "/", downloadfilename, ".zip", sep = "")
 
  #retrieve Zipped file
  tryCatch({
  #Download Zipped File
 
  #28-10-2014: Fix for '403 Forbidden'
  #download.file(myURL,zippedFile, quiet=TRUE, mode="wb",cacheOK=TRUE)
  zipName = paste(downloadfilename, ".zip", sep = "")
  resp <- GET(myURL, user_agent("Mozilla/5.0"), write_disk(zipName, overwrite = TRUE))
  #Raise an error on 403/404 so that an error page is never passed to unzip
  stop_for_status(resp)
 
  #Unzip file and save it in temp
  temp <- read.csv(unzip(zipName), sep = ",")
 
  #Rename Columns Volume and Date
  colnames(temp)[9] <- "VOLUME"
  colnames(temp)[11] <- "DATE"
 
  #Define Date format
  temp$DATE <- as.Date(temp$DATE, format="%d-%b-%Y")
 
  #Reorder Columns and Select relevant columns
  temp<-subset(temp,select=c("DATE","SYMBOL","OPEN","HIGH","LOW","CLOSE","LAST","VOLUME"))
 
  #Write the BHAVCOPY csv - datewise
  write.csv(temp,file=filenameDate,row.names = FALSE)
 
  #Write the csv in Monthly file
  if (file.exists(monthfilename))
  {
   write.table(temp,file=monthfilename,sep=",", eol="\n", row.names = FALSE, col.names = FALSE, append=TRUE)
  }else
  {
   write.table(temp,file=monthfilename,sep=",", eol="\n", row.names = FALSE, col.names = TRUE, append=FALSE)
  }
 
 
  #Print Progress
  #print(paste (myDate, "-Done!", endDate-myDate, "left"))
 }, error=function(err){
  #print(paste(myDate, "-No Record"))
 }
 )
  myDate <- myDate+1
  print(paste(myDate, "Next Record"))
}
 
#Delete temp files - downloaded zips and extracted Bhavcopy csv only
junk <- dir(pattern="^cm.*bhav\\.csv")
file.remove(junk)
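
As a rough sketch (not part of the script above), the URL building and the date loop could be factored into two small helpers. The names build_bhav_url and weekdays_only are made up for illustration, the URL pattern is simply the one used above and may no longer be served by NSE, and exchange holidays are not handled. The month name comes from the base R constant month.abb instead of "%b", so it should not depend on the system locale.

```r
# Sketch only: helper names are hypothetical, not part of the original script.

# Build the bhavcopy URL for one or more dates (vectorised).
# month.abb holds English month abbreviations, so the result is locale-independent.
build_bhav_url <- function(d) {
  mon <- toupper(month.abb[as.integer(format(d, "%m"))])
  fname <- paste0("cm", format(d, "%d"), mon, format(d, "%Y"), "bhav.csv.zip")
  paste0("http://www.nseindia.com/content/historical/EQUITIES/",
         format(d, "%Y"), "/", mon, "/", fname)
}

# Keep only Monday to Friday; exchange holidays are NOT filtered out here.
weekdays_only <- function(from, to) {
  days <- seq(from, to, by = "day")
  days[as.POSIXlt(days)$wday %in% 1:5]
}

# 23 and 24 August 2014 fall on a weekend, so only two dates remain.
dates <- weekdays_only(as.Date("2014-08-23"), as.Date("2014-08-26"))
urls <- build_bhav_url(dates)
print(urls)
```

Looping over dates instead of every calendar day avoids requesting files for Saturdays and Sundays, though holidays would still end up in the error handler.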



30 comments:

  1. Nice blog would like to add that NSE and BSE are one of the most superior stock exchanges of India. If you wish to earn good money from the share market then you need to understand the functionality of the stock market properly.

    Indian stock market offers lot of earning opportunities still many less traders earn from it. Now the question is who earns from the share trading? To be honest only those who rely on stock research as no one can earn big by speculating in the market.

    Regards
    SHARETIPSINFO TEAM

  2. Thanks.

    I agree with you, and that is what I aim to learn.. finding right opportunities.

    If you have any suggestions/ideas, Feel free to share

  3. Really a great blog.Tons of appreciations.
    I am a beginner in R and this is really quite helpful. I am thankful because using those codes I could download EOD data of stocks trading on NSE. I have one request as well. I am in need of derivatives data (both futures and options) for all stocks trading in derivatives segment since 2004 and I tried to manipulate those codes as per requirement but perhaps I am making some mistake and not able to do it. It would really be of immense help if you lend your helping hand in this matter.
    Thanks in anticipation.
    Regards
    Rajesh

  4. @Rajesh: Thanks for appreciation. Let me know what exactly you want from derivatives data, and where you are faltering. Shall try my best to help you out!

    Let me know, how you want the data to be finally combined, columns, series etc

  5. Thanks a lot for your quick response.
    I am willing to extract data of futures and options for all stocks trading in derivatives segment on NSE since January 2004 till December 2010. I need the near month expiry data with columns EXPIRY_DT,STRIKE_PR,OPTION_TYP,OPEN, HIGH, LOW, CLOSE,SETTLE_PR, CONTRACTS, VAL_INLAKH,OPEN_INT, CHG_IN_OI, for every stock in separate file such that I have three files for every stock containing data of futures, call options and put options separately.
    I shall always be thankful for your kind help in this regard.
    Thanks and Regards
    Rajesh.

  6. @Rajesh Your desire was a planned future exercise for me. Are you into R programming, or you are just interested in data?

    If you are into R, I can give you a start, so that you can do it on your own. Else, let me know, I will code the required logic

    Also, I am curious on the logic of having three different files? What is the tool you want to use for your analysis? Other than excel, if I am not mistaken, it would make more sense in having single file with everything in it... or is there something I am missing?

  7. Thank you gentleman. It really feels happy to come across a helping and responsive person like you. I am not into R programming and just started using it for basic statistical use on my own by reading through on line publications. I need those data for my research and the logic of three files is nothing more than making data handling more convenient for a beginner like me. one single file is perfectly fine containing all observations of my requirement. I would be thankful for your help to provide me with the codes that performs as per requirement.

  8. Dear Rajesh,
    My code is ready, and I am testing it, though it is taking ages to process, as each day's record has to be split into 232 symbol files * 3 sets (futures, calls, puts). A single day's download and processing takes 5 minutes.

    Not the best in terms of performance, and would take days to download the volumes you require.

    A single file for a day containing all records would not take more than few seconds to download. Hope, you are OK with this.

    Let me know, if I can share the code with such terrible performance.

  9. Thank you dear. It would indeed be a great help in the way that other than this I have only few doubtful options to do the same. Please share it. Thanks a lot from core of my heart.

  10. Hi..
    Can I request some help? I have collected data from NSE and I have Nifty derivatives (futures and options) data in separate csv files. I want to sort the options data in such a way that for every day only the options with the highest no. of contracts traded remain in my new data file. Could you help me with code for this as an example?
    thanks in anticipation

  11. @Rajesh, your requirement is not clear. Can you elaborate please?

    My understanding is -> You have daily files for futures and options, now for each symbol, you need one record with highest volume

    Can you please send a mail to me (email address on my profile)?

  12. Thanks for your reply.
    you got it partially correct. my apologies for sounding dubious. Actually I have daily data of Nifty index futures and options and for options I want to sort data in the way that for each trading day I am left with data of option with highest volume. for ex if its at strike price 100 then only data related to strike price 100 like its option price, volume, time to maturity, open interest etc. remains in the dataset. One more query, can i import multiple sheets from an excel file at a time in R.
    in anticipation of your response.

  13. I tried the same, but I am receiving this error. Please help.
    1: In download.file(myURL, zippedFile, quiet = TRUE, mode = "wb") :
    cannot open: HTTP status was '404 Not Found'
    2: In download.file(myURL, zippedFile, quiet = TRUE, mode = "wb") :
    cannot open: HTTP status was '404 Not Found'

    Replies
    1. @Shashank, The issue has been fixed. Thanks for highlighting it. Apologies for the delay, I was too busy in other activities

  14. I am also getting '403 Forbidden' error. but when i enter the url directly into the browser it is downloading.
    Below is the error
    Error in download.file(myURL, zippedFile, quiet = TRUE, mode = "wb") :
    cannot open URL 'http://nseindia.com/content/historical/EQUITIES/2014/AUG/cm25AUG2014bhav.csv.zip'
    In addition: Warning message:
    In download.file(myURL, zippedFile, quiet = TRUE, mode = "wb") :
    cannot open: HTTP status was '403 Forbidden'

    Replies
    1. @Sandeep, The issue has been fixed. Thanks for highlighting it.

  15. Is this code still working? I am new to R and tried installing it and ran this code. I am getting a series of warnings and error messages. Can you please help? My aim is to download historical bhavcopies of NSE and BSE. Thanks in advance.

  16. This comment has been removed by the author.

  17. Could someone please put step by step of process to download data from NSE.

  18. Hi Market Analyzer, Gaurav,

    I am new to R. I am interested to get data(cash/derivates, etc...) from NSE/BSE.
    Could you please send the code and process to do this.
    Also, you send the code which you created for @rajesh as well.

    Thanks..

  19. hi, i came across this while searching for a tool to convert EOD (bhavcopy) to Historical data, something like what metastock downloader does. Is it possible to add code to the above work to save data in scrip based csv files. Would be glad to know if its possible.

    thanks

  20. Very impressive post. I would like to add that in stock market you can earn good profit using accurate stock tips recommendation of market experts.

  21. Hi _ i am trying to learn and use R and this is very helpful, thanks. I get the following error message:

    Error in file(file,"rt") : invalid 'description' argument
    In addition: warning message:
    In unzip(paste(downloadfilename, ".zip", sep = ""):
    error 1 in extracting from zip file

    can you pls help in understanding and fixing this?




  22. this no longer is working. It gives forbidden error for all files before 2015-12-31

  23. I believe there are many more pleasurable opportunities ahead for individuals that looked at your site. rprogramming training in chennai

  24. Positive site, where did u come up with the information on this posting?I have read a few of the articles on your website now, and I really like your style. Thanks a million and please keep up the effective work. R Programming Course Fees

  25. Hi Experts,

    I need Market Capitalization data of each stock registered on BSE or NSE since inception.

    It should be day-wise/week-wise.

    This is required for my personal project.

    Any idea how to get it?

    Regards,

    Yogesh
    y_shinde@yahoo.com

  26. Hi, I am facing this error

    In unzip(paste(downloadfilename, ".zip", sep = "")) :
    error 1 in extracting from zip file

  27. Zip file downloaded using this code is empty. Can you help?
