Design a feed system for stock information

If you were integrating a feed of end of day stock price information (open, high, low, and closing price) for 5,000 companies, how would you do it? You are responsible for the development, rollout and ongoing monitoring and maintenance of the feed. Describe the different methods you considered and why you would recommend your approach. The feed is delivered once per trading day in a comma-separated format via an FTP site. The feed will be used by 1000 daily users in a web application.

My initial thoughts:
The feed is a .csv file of 5,000 lines, one line per company. A file of that size is easy to store and parse.
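To make the "easy to parse" point concrete, here is a minimal sketch of reading one day's file. The column order (company, open, high, low, close), the absence of a header row, and the names `Quote` and `parse_feed` are assumptions for illustration; the problem statement does not specify the file layout.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass
class Quote:
    company: str
    open: float
    high: float
    low: float
    close: float


def parse_feed(text: str) -> List[Quote]:
    """Parse one day's feed.

    Assumes no header row and the column order
    company,open,high,low,close (hypothetical layout).
    """
    quotes = []
    for row in csv.reader(io.StringIO(text)):
        if not row:  # skip blank lines
            continue
        name, *prices = row
        o, h, l, c = (float(p) for p in prices)
        quotes.append(Quote(name, o, h, l, c))
    return quotes


sample = "foo,126.23,130.27,122.83,127.30\nbar,52.73,60.27,50.29,54.91\n"
quotes = parse_feed(sample)
```

A malformed line (wrong number of columns or a non-numeric price) raises `ValueError` here, which is probably what a monitored feed wants: fail loudly rather than load partial data.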

Solution:
Let’s assume we have some scripts which are scheduled to get the data via FTP at the end of the day. Where do we store the data? How do we store the data in such a way that we can do various analyses of it?

  • Proposal #1
    Keep the data in text files. The files would be difficult to manage and update, and very hard to query. A pile of unorganized text files makes for an inefficient data model.
  • Proposal #2
    We could use a database. This provides the following benefits:

    • Logical storage of the data.
    • An easy way to run queries over the data.

    Example: return all stocks having open > N AND closing price < M.
    Advantages:

    • Maintenance is easy once the database is set up properly.
    • Rollback, backups, and security come from standard database features. We don’t have to “reinvent the wheel.”
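As a rough illustration of Proposal #2, the sketch below loads two rows into an in-memory SQLite database and runs the example query (open > N AND closing price < M). The table name `eod_price`, its columns, and the helper `stocks_between` are hypothetical; a real deployment would likely use a server database, but the SQL would look much the same.

```python
import sqlite3

# In-memory database for the sketch; the schema is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE eod_price (
           trade_date  TEXT NOT NULL,
           company     TEXT NOT NULL,
           open_price  REAL,
           high_price  REAL,
           low_price   REAL,
           close_price REAL,
           PRIMARY KEY (trade_date, company)
       )"""
)

rows = [
    ("2008-10-12", "foo", 126.23, 130.27, 122.83, 127.30),
    ("2008-10-12", "bar", 52.73, 60.27, 50.29, 54.91),
]
# The connection as a context manager commits on success and
# rolls back if the load raises, so a bad feed leaves no partial day.
with conn:
    conn.executemany("INSERT INTO eod_price VALUES (?, ?, ?, ?, ?, ?)", rows)


def stocks_between(conn, n, m):
    """Return companies with open > n AND closing price < m."""
    cur = conn.execute(
        "SELECT company FROM eod_price "
        "WHERE open_price > ? AND close_price < ? ORDER BY company",
        (n, m),
    )
    return [r[0] for r in cur]


matches = stocks_between(conn, 50, 100)
```

The primary key on (trade_date, company) also guards against loading the same day's file twice, since a repeated insert fails instead of duplicating rows.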
  • Proposal #3
    If the requirements are not that broad and we just want to do simple analysis and distribute the data, then XML could be another good option.
    Our data has a fixed format and a fixed size: company_name, open, high, low, closing price. The XML could look like this:

    <root>
    <date value="2008-10-12">
        <company name="foo">
            <open>126.23</open>
            <high>130.27</high>
            <low>122.83</low>
            <closingPrice>127.30</closingPrice>
        </company>
        <company name="bar">
            <open>52.73</open>
            <high>60.27</high>
            <low>50.29</low>
            <closingPrice>54.91</closingPrice>
        </company>
    </date>
    <date value="2008-10-11"> . . . </date>
    </root>
    

    Benefits:

    • Very easy to distribute. This is one reason XML is a standard data model for sharing and distributing data.
    • Efficient parsers are available to parse the data and extract only the desired data.
    • We can add new data to the XML file by carefully appending it. We would not have to re-query the database.

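    However, querying the data could be difficult: a plain XML file has no indexes, so an ad hoc filter generally means parsing the file and scanning its elements.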
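To show what querying the XML involves, here is a small sketch using Python's standard `xml.etree.ElementTree` on the sample document above. The helper name `companies_where` is hypothetical. It runs the same open > N AND closing price < M filter, but does so by walking every company element for the requested date.

```python
import xml.etree.ElementTree as ET

doc = """<root>
<date value="2008-10-12">
    <company name="foo">
        <open>126.23</open>
        <high>130.27</high>
        <low>122.83</low>
        <closingPrice>127.30</closingPrice>
    </company>
    <company name="bar">
        <open>52.73</open>
        <high>60.27</high>
        <low>50.29</low>
        <closingPrice>54.91</closingPrice>
    </company>
</date>
</root>"""

root = ET.fromstring(doc)


def companies_where(root, date, min_open, max_close):
    """Names of companies on `date` with open > min_open and close < max_close."""
    # ElementTree's limited XPath supports attribute predicates like this one.
    day = root.find(f"date[@value='{date}']")
    if day is None:
        return []
    return [
        c.get("name")
        for c in day.findall("company")  # linear scan over the day's companies
        if float(c.findtext("open")) > min_open
        and float(c.findtext("closingPrice")) < max_close
    ]


result = companies_where(root, "2008-10-12", 50, 100)
```

For 5,000 companies per day this scan is cheap, but the whole file has to be parsed first, and the cost grows as more dates are appended to it.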
