http://www.shanghaidaily.com/nsp/Business/2011/06/23/Overseas%2BBanks%2BReady%2BFor%2BMutual%2BFunds/
"title"=>"Overseas Banks Ready For Mutual Funds -- Shanghai Daily | 上海日报 -"
expecting: "Overseas Banks Ready For Mutual Funds"
The name of the publication should be removed from the title.
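To illustrate the kind of cleanup I mean, here is a minimal sketch. It assumes the publication name always follows a " -- " separator, which I have only checked against this one title. `strip_publication_suffix` is a name I made up for illustration, not anything from Diffbot's API.

```python
def strip_publication_suffix(title: str) -> str:
    """Keep only the part of the title before the first " -- " separator.

    Assumption: the site appends " -- <publication name> ..." to every
    page title. Titles without the separator are returned unchanged.
    """
    head, _separator, _tail = title.partition(" -- ")
    return head.strip()


raw_title = "Overseas Banks Ready For Mutual Funds -- Shanghai Daily | 上海日报 -"
print(strip_publication_suffix(raw_title))
```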
"date"=>"on October 1."
expecting: "2011-06-23"
(or a timestamp reflecting June 23, 2011, rather than October 1). October 1 is not the publication date; it only appears in the article text: "FOUR overseas banks yesterday said they are poised to sell mutual funds as regulators open the business on October 1."
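For this site the publication date also appears in the URL path, so one possible fallback is to read it from there. This is a rough sketch that assumes a `/YYYY/MM/DD/` segment in the URL; `date_from_url` is a hypothetical helper, and I am not claiming this is how Diffbot works or should work internally.

```python
import re
from datetime import date
from typing import Optional

# Assumption: the article URL contains a /YYYY/MM/DD/ path segment.
URL_DATE = re.compile(r"/(\d{4})/(\d{2})/(\d{2})/")


def date_from_url(url: str) -> Optional[date]:
    """Return the date embedded in the URL path, or None if there is none."""
    match = URL_DATE.search(url)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # The digits matched the pattern but do not form a real calendar date.
        return None


article_url = (
    "http://www.shanghaidaily.com/nsp/Business/2011/06/23/"
    "Overseas%2BBanks%2BReady%2BFor%2BMutual%2BFunds/"
)
found = date_from_url(article_url)
print(found.isoformat() if found else "no date in URL")
```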
"text"=>"Business | Banking\n By Maggie Zhang | 2011-6-23 | NEWSPAPER EDITION\n FOUR overseas banks yesterday said they are poised to sell mutual funds as regulators open the business on October 1"
expecting: "FOUR overseas banks yesterday said they are poised to sell mutual funds as regulators open the business on October 1"
The text should begin with the actual article body, not the category, author, date, and edition lines that precede it.
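As a rough illustration, the header lines in this example all contain " | " separators, so they could be skipped with a simple heuristic. The sketch below assumes that pattern holds, which I have only verified on this one article; `strip_leading_metadata` is a made-up name.

```python
def strip_leading_metadata(text: str) -> str:
    """Drop leading lines that look like "A | B" header fields.

    Assumption: category/byline lines contain " | " and the article body
    starts at the first line that does not.
    """
    lines = [line.strip() for line in text.split("\n")]
    start = 0
    while start < len(lines) and " | " in lines[start]:
        start += 1
    return "\n".join(lines[start:])


raw_text = (
    "Business | Banking\n"
    " By Maggie Zhang | 2011-6-23 | NEWSPAPER EDITION\n"
    " FOUR overseas banks yesterday said they are poised to sell mutual "
    "funds as regulators open the business on October 1"
)
print(strip_leading_metadata(raw_text))
```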
If you look at the RSS feed for ShanghaiDaily at http://www.shanghaidaily.com/rss/Business/, you can see that it contains more accurate information for these fields. Perhaps Diffbot should consider incorporating RSS feed info? This is just a suggestion. Ideally, Diffbot would return more precise data for those fields regardless of how it arrives at it.
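To make the RSS suggestion concrete, here is a small sketch of looking up an article in an RSS 2.0 feed by its link. The sample feed below is invented for illustration and does not reproduce the actual ShanghaiDaily feed. I am also assuming each item's `<link>` matches the article URL exactly, and `rss_fields_for` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from typing import Optional

# Invented sample in standard RSS 2.0 shape; NOT the real feed content.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Overseas Banks Ready For Mutual Funds</title>
      <link>http://www.example.com/article-1</link>
      <pubDate>Thu, 23 Jun 2011 00:00:00 +0800</pubDate>
    </item>
  </channel>
</rss>"""


def rss_fields_for(feed_xml: str, article_url: str) -> Optional[dict]:
    """Return title and ISO date for the feed item whose link matches."""
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        if item.findtext("link", default="").strip() != article_url:
            continue
        pub_date = item.findtext("pubDate")
        return {
            "title": item.findtext("title", default="").strip(),
            # pubDate uses the RFC 822 date format in RSS 2.0.
            "date": parsedate_to_datetime(pub_date).date().isoformat()
            if pub_date
            else None,
        }
    return None


print(rss_fields_for(SAMPLE_RSS, "http://www.example.com/article-1"))
```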
One more mistake I forgot to mention. For the article http://www.shanghaidaily.com/nsp/Business/2011/06/23/JPMorgan%2Bagrees%2Bto%2Bsettle%2BSEC%2Bcase/
Diffbot returns an author of "Shopping Cart", which is clearly not the author's name.