

14th Nov 2008

It’s like the Chinese room experiment

This is more about the upcoming SEmangic. I’ve improved the algo even more.

Damn, I love science. The computer doesn’t know squat about what those words mean or how they’re related to “skiing”, but look at it go:

1000 ngrams analyzed: skiing,winter,mountain,industry,beach,map,summer,unique,enjoy,culture,
welcome,season,offers,room,beautiful,built,shop,outdoor,golf,areas

6000 ngrams analyzed: skiing,winter,snow,shops,ski,fishing,hiking,village,accommodation,resort,
alpine,zealand,finest,magnificent,guests,springs,unit,bathroom,vacation,attractions

15000 ngrams analyzed: skiing,ski,shops,hiking,village,accommodation,alpine,resort,magnificent,
guests,mountains,climbing,scenic,trails,harbour,comfortable,bookings,prestigious,seasons,magazines

32000 ngrams analyzed: skiing,ski,hiking,alpine,magnificent,climbing,scenic,trails,harbour,
bookings,prestigious,seasons,magazines,coastal,majestic,situated,renowned,picturesque,superb,lodge

With each iteration the words get more and more closely related. Damn, and that’s with only 7000 random sites as training data! Only 24 (yep, twenty-four) of them contain the word ‘skiing’!
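
The post doesn’t say how the algo works, so the snippet below is not SEmangic’s method. It is only a minimal sketch of one common way to pull “related words” out of n-grams: count how often each word shares an n-gram with the seed word and rank by pointwise mutual information (PMI). The function name `related_terms`, its parameters, and the toy n-grams are all made up for illustration.

```python
from collections import Counter
from math import log


def related_terms(ngrams, seed, top_k=5):
    """Rank words by PMI with `seed` (hypothetical helper, not the post's algo).

    `ngrams` is an iterable of word tuples; two words co-occur when they
    appear in the same n-gram.
    """
    word_count = Counter()  # number of n-grams containing each word
    pair_count = Counter()  # number of n-grams containing both word and seed
    total = 0
    for gram in ngrams:
        words = set(gram)
        total += 1
        word_count.update(words)
        if seed in words:
            pair_count.update(words - {seed})
    if word_count[seed] == 0:
        return []  # seed never seen, nothing to rank
    # PMI = log( P(word, seed) / (P(word) * P(seed)) )
    scores = {
        w: log(c * total / (word_count[w] * word_count[seed]))
        for w, c in pair_count.items()
    }
    # Highest PMI first; ties broken alphabetically for stable output
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


# Toy data, invented for this example
toy_ngrams = [
    ("skiing", "alpine", "resort"),
    ("skiing", "alpine", "lodge"),
    ("beach", "resort", "summer"),
    ("golf", "resort", "summer"),
]
print(related_terms(toy_ngrams, "skiing"))  # ['alpine', 'lodge', 'resort']
```

With a scoring like this, generic words that show up everywhere (“resort” in the toy data) sink as more n-grams come in, which is at least consistent with the lists above getting tighter at every step.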

Science is magic! I’m even having second thoughts on whether I should release this at all :) I really feel like I’m the “tester” in the Chinese room experiment and the computer is playing me.

This entry was posted on Friday, November 14th, 2008 at 3:47 am and is filed under Uncategorized, word frequency.
