{"id":1259,"date":"2012-02-10T11:09:05","date_gmt":"2012-02-10T18:09:05","guid":{"rendered":"http:\/\/xiehang.com\/blog\/?p=1259"},"modified":"2012-02-11T15:05:12","modified_gmt":"2012-02-11T22:05:12","slug":"science-stage-now","status":"publish","type":"post","link":"https:\/\/xiehang.com\/blog\/2012\/02\/10\/science-stage-now\/","title":{"rendered":"Science stage now"},"content":{"rendered":"

I believe I’ve set up most of my terminology extraction demo site. Several parts are still glued together rather than tightly integrated, but it works.<\/p>\n

I think it’s time to get into the science part: I need to work out which terms should be returned when a web page yields dozens or hundreds of candidates. The current algorithm is pretty simple (and may not make sense at all) and is just for testing purposes: sort by tf-idf, with the title’s tf-idf given 3 times the weight of the content’s.<\/p>\n
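As a rough illustration of the ranking described above, here is a minimal Python sketch. It is not the demo site's actual code: the function names (tf_idf, rank_terms), the doc_freq and num_docs inputs, the add-one smoothing in the idf, and the choice to sum the weighted title and content scores are all assumptions made for the example. Only the 3x title weight and the sort by tf-idf come from the description.

```python
import math
from collections import Counter

# From the post: the title's tf-idf counts 3 times as much as the content's.
TITLE_WEIGHT = 3.0


def tf_idf(term_counts, doc_freq, num_docs):
    # Hypothetical helper: relative term frequency times a smoothed idf.
    # The add-one smoothing is an assumption, not something the post specifies.
    total = sum(term_counts.values()) or 1
    return {
        term: (count / total) * math.log((1 + num_docs) / (1 + doc_freq.get(term, 0)))
        for term, count in term_counts.items()
    }


def rank_terms(title_terms, content_terms, doc_freq, num_docs, top_n=10):
    # Score title and content separately, then combine them by summing,
    # with the title score multiplied by TITLE_WEIGHT (summing is an assumption).
    title_scores = tf_idf(Counter(title_terms), doc_freq, num_docs)
    content_scores = tf_idf(Counter(content_terms), doc_freq, num_docs)
    combined = Counter()
    for term, score in title_scores.items():
        combined[term] += TITLE_WEIGHT * score
    for term, score in content_scores.items():
        combined[term] += score
    # Highest combined score first.
    return [term for term, _ in combined.most_common(top_n)]


# Toy corpus statistics, made up for the example.
doc_freq = {'solr': 2, 'the': 100, 'index': 10}
ranked = rank_terms(['solr'], ['the', 'the', 'index', 'solr'], doc_freq, num_docs=100)
print(ranked)
```

In this toy input, a term that appears in every document gets an idf of zero and sinks to the bottom, while a rare term that also appears in the title is pushed to the top by the title weight.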

Anyway, I don’t want to go into too many details here, at least for now. I still need to rework those glued-together parts in a better way.<\/p>\n

The demo is here: http:\/\/solr.xiehang.com\/<\/a>. Note that it may be taken down at any time without notice, since I don’t want to leave such an easily abused entry point on my servers.<\/p>\n","protected":false},"excerpt":{"rendered":"

I believe I’ve set up most of my terminology extraction demo site. Several parts are still glued together rather than tightly integrated, but it works. I think it’s time to get into the science part: I need to work out which terms should be returned when a web page yields dozens or […]<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[289,290,287,288],"_links":{"self":[{"href":"https:\/\/xiehang.com\/blog\/wp-json\/wp\/v2\/posts\/1259"}],"collection":[{"href":"https:\/\/xiehang.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/xiehang.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/xiehang.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/xiehang.com\/blog\/wp-json\/wp\/v2\/comments?post=1259"}],"version-history":[{"count":2,"href":"https:\/\/xiehang.com\/blog\/wp-json\/wp\/v2\/posts\/1259\/revisions"}],"predecessor-version":[{"id":1261,"href":"https:\/\/xiehang.com\/blog\/wp-json\/wp\/v2\/posts\/1259\/revisions\/1261"}],"wp:attachment":[{"href":"https:\/\/xiehang.com\/blog\/wp-json\/wp\/v2\/media?parent=1259"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/xiehang.com\/blog\/wp-json\/wp\/v2\/categories?post=1259"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/xiehang.com\/blog\/wp-json\/wp\/v2\/tags?post=1259"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}