My Projects: TheRarestWords, RarestNews, Suggestan, TheCraziestIdeas, Flim.me, MereFact, SemanticKernelBot, My development blog . Wanna help?


Archive for the 'Uncategorized' Category

27th Dec 2008

New projects

Well, there are a number of interesting projects in development, and I need people to try them and have fun with them before they go mainstream. So if you like projects such as Suggestan and TheRarestWords (or any other in the top line) - drop me a note at bomboze@gmail.com saying that you are interested in trying new projects. The next one promises to be REALLY useful and, once again, is like nothing the tubes of the internets have seen yet. :)

Posted in Uncategorized | Comments Off

21st Nov 2008

Google SearchWiki - I don’t buy it

I’m sure you’ve seen the up arrow in your Google results today. The official post says (and I quote): “The changes you make only affect your own searches” and yet…

Well, if it’s going to affect MY search results only - why would I want to see “Picked by (my name)”? I understand why my name is in the comment form, as comments are global, but the selections are not. [ Especially as we know that Google fights for each pixel they put into their pages, so there is no way this was written by accident ]

Besides, Google isn’t stupid, but they think we are. They claim they’ve developed this for cases like (and again I quote): “Maybe you’re an avid hiker and the trail map site you always go to is in the 4th or 5th position and you want to move it to the top.”

If I ALWAYS GO to the result in 4th or 5th place for a particular search - there are three possible outcomes:
1) I’ll remember that it’s in 4th place
2) (most likely!) I’ll remember the URL and go there directly
or
3) I have the attention span of a hummingbird and can’t remember anything (Google wants us to believe this is WHY they developed this feature).

Ok, the REAL reason why they developed this is that they WANT to incorporate a social element into the search results, but they want people to believe this isn’t affecting anything, so that we won’t be pushing “up” on our own sites and (what’s worse) so that spammers won’t spawn hordes of zombies / low-paid employees to push “up” on their sites.

Another possibility (also very probable) is that they’re only going to treat sites that are “Deleted” (“x” button) from results as bad. Anyway, the outcome is that we’re going to see an even more focused Internet - and by focused I mean that for each search there’s gonna be less and less diversity - enjoy CNN, ABC, Wikipedia, Answers and other similar sites. There are going to be far fewer small sites. Why? Because everybody will click “up” on known brands like CNN, and they’ll hardly remember to push up the site they’ve accidentally stumbled upon.

There’s only one way this might not get incorporated into the global results - that is if spammers start using it way too much right now, so that Google can’t find a reasonable way to differentiate between spam and regular user activity. In which case both sides would just lose a lot of time.

Posted in Uncategorized | Comments Off

17th Nov 2008

RarestNews - new reincarnation

Well, I’m kind of tired of writing “happy posts” about old/new projects going online, so here’s just the link: http://rarestnews.com/ (new approach)

Posted in Uncategorized | Comments Off

14th Nov 2008

It’s like the Chinese room experiment

This is more about the upcoming SEmangic - I’ve improved the algo even more.

Damn, I love science. The computer doesn’t know squat about what those words mean or how they’re related to “skiing”, but look at it go:

1000 ngrams analyzed: skiing,winter,mountain,industry,beach,map,summer,unique,enjoy,culture,
welcome,season,offers,room,beautiful,built,shop,outdoor,golf,areas

6000 ngrams analyzed: skiing,winter,snow,shops,ski,fishing,hiking,village,accommodation,resort,
alpine,zealand,finest,magnificent,guests,springs,unit,bathroom,vacation,attractions

15000 ngrams analyzed: skiing,ski,shops,hiking,village,accommodation,alpine,resort,magnificent,
guests,mountains,climbing,scenic,trails,harbour,comfortable,bookings,prestigious,seasons,magazines

32000 ngrams analyzed: skiing,ski,hiking,alpine,magnificent,climbing,scenic,trails,harbour,
bookings,prestigious,seasons,magazines,coastal,majestic,situated,renowned,picturesque,superb,lodge

With each iteration the words are getting more and more closely related. Damn, and that’s with a training set of only 7000 random sites! Only 24 (yep, twenty-four) of them contain the word ‘skiing’!
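For the curious, here’s a minimal sketch of how related words can fall out of plain counting. This is NOT the SEmangic algorithm (which isn’t described here), just a generic co-occurrence ranking with pointwise mutual information, assuming each “site” is a plain text string. The function name `related_terms` and the sample texts are made up for illustration.

```python
import math
import re
from collections import Counter


def related_terms(documents, target, top_n=5):
    """Rank words by how strongly they co-occur with `target`.

    Illustrative only: uses document-level pointwise mutual information (PMI),
    which is an assumption, not the method used by SEmangic.
    """
    doc_freq = Counter()  # number of documents containing each word
    co_freq = Counter()   # number of documents containing the word AND the target
    for text in documents:
        words = set(re.findall(r"[a-z]+", text.lower()))
        doc_freq.update(words)
        if target in words:
            co_freq.update(words - {target})

    n_docs = len(documents)
    target_docs = doc_freq[target]
    if target_docs == 0:
        return []  # the target never appears, nothing to rank

    scores = {}
    for word, both in co_freq.items():
        # PMI = log( P(word, target) / (P(word) * P(target)) )
        scores[word] = math.log(both * n_docs / (doc_freq[word] * target_docs))

    # Highest PMI first; ties broken by co-occurrence count, then alphabetically.
    ranked = sorted(scores, key=lambda w: (-scores[w], -co_freq[w], w))
    return ranked[:top_n]


# Hypothetical mini "crawl": four tiny documents.
sample_sites = [
    "skiing in the alpine resort",
    "alpine skiing and snow",
    "the beach in summer",
    "the resort by the beach",
]
print(related_terms(sample_sites, "skiing", top_n=3))
```

The code never looks at what a word means; common words like “the” sink simply because they show up everywhere, which is the same effect you can see in the lists above as more ngrams get analyzed.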

Science is magic! I’m even having second thoughts on whether I should release this at all :) I really feel like I’m the “tester” in the Chinese room experiment and the computer is playing me.

Posted in Uncategorized, word frequency | Comments Off

14th Nov 2008

More experiments with semantic kernels and TheRarestWords

More science for today. I’m trying to replace the SEwOrdizer that I had in TheRarestWords, which I can’t use on this VPS and which, well, was kind of disappointing, maybe even useless. So, I’m experimenting with semantic kernels on a more global scale. Well, not 10 million words global, but 80 000 kernels from 80 000 words (actually uni- and bigrams), i.e. I’m trying to build semantic kernels for every word that is now in the TheRarestWords database. I’m using the same process that I use for the semantic kernel bot, but on a massive local scale. Well, the first results were a disappointment:

skiing: guests, decide, secure, body, categories, video, list

Somewhat related, but not by a long shot. And “secure”??.. tell that to my broken downhill ski :) Well, it’s an even bigger disappointment than the ex-SEwOrdizer. Ok, but with a few improvements to the algorithm:

skiing: majestic, magnificent, tourist, shops, guests, route, tennis, minute

Yes, you can see the majestic and magnificent nature views on a tourist ski route, while guests at a ski resort can take a minute to visit the ski shops. And tennis is also a sport.

And this isn’t the final version yet. This should replace SEwOrdizer for the time being, and I think I’ll call it SEmangic (Semantic + Magic). I think it’s due to be released in a few days (probably earlier, unless I stumble on a memory problem somewhere).

And of course, if TheRarestWords ever returns at full scale (i.e. on a dedicated server) - the SEwOrdizer will make its comeback. Probably.

Posted in Uncategorized | Comments Off

12th Nov 2008

TheRarestWords resurrection

Well, I didn’t think I could do it, but here it is - TheRarestWords is ALIVE! I’ve managed to fit 25 freaking GIGABYTES of raw data into 64MB of memory on this VPS. Well, since this is a complete overhaul - things don’t work. Most of the things. And there are a lot of problems, so stay tuned - I’m working on it day and night. There’s gonna be a host of new features and things should now work much snappier than before (2 minutes per page).

I’m gonna be testing this for a while, before enabling editing capabilities.

Posted in Uncategorized | Comments Off

11th Nov 2008

Dear Opera, I’m close to hating you

Dear Opera developers,

I’ve been using the Opera browser since version 7 and loved it dearly. I’ve asked wave after wave of people to use it instead of other browsers, but starting with 9.5 you began making the stupidest decisions, ones that force me to use Firefox!

  1. I loved single key shortcuts, especially “1” and “2”. Why do I now have to enable them in my profile if they were there for ages and I’m used to them? Because Firefox users do not use them? I’m not a Firefox user, and neither is any Opera user - do I even need to explain it?
  2. If the server doesn’t return an answer (i.e. there’s an error in a PHP script or the server is down completely) - DON’T EVER RETURN ME A CACHED COPY ON REFRESH! This pisses me off, as I’m a developer and I refresh the page to see the changes and there are none, because the script DIDN’T EVEN RUN, but for some reason you decided to return me a cached copy! And now I don’t even know that something is wrong.
  3. If I select “Store 0 (zero) of my typed addresses” and unselect “Remember content on visited pages” - that means NEVER STORE ANYTHING I TYPE INTO THE ADDRESS BAR, INCLUDING my Google searches and custom search engine searches.

This was all working before 9.5 - why did you have to break it?

Yes, I did send it to the Opera developers. I have nothing against Firefox, BTW. I would’ve used it if it weren’t for its close to 20 seconds startup time even on a clean install, and if they implemented gestures in the default installation.

Posted in Uncategorized | Comments Off

10th Nov 2008

Flim.me! - personalized movie recommendations

Flim.me is software I developed to solve my worst problem of all time - “Which movie to watch tonight?”

Well, a few weeks of data mining and voila! The software was born - “flim.me!”

Think of “last.fm for movies”. Basically, you rate a few films that you have already seen and get recommendations on what movies you would probably enjoy watching. It does so by finding people with tastes that are similar to yours.
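To give a rough idea of what “finding people with similar tastes” can look like, here’s a minimal sketch of user-based collaborative filtering. It is an illustration under my own assumptions (cosine similarity, a 1-5 rating scale, and the names `cosine_similarity` and `recommend`), not a description of what actually runs behind flim.me.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two {film: rating} dicts (0.0 if no shared films)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[film] * b[film] for film in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(ratings, user, top_n=3):
    """Suggest unseen films for `user`, weighting other people's ratings by similarity."""
    mine = ratings[user]
    weighted_sum = {}  # film -> sum of (similarity * rating)
    weight_total = {}  # film -> sum of similarities
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(mine, theirs)
        if sim <= 0:
            continue  # nothing in common with this person
        for film, score in theirs.items():
            if film in mine:
                continue  # already seen, don't recommend it again
            weighted_sum[film] = weighted_sum.get(film, 0.0) + sim * score
            weight_total[film] = weight_total.get(film, 0.0) + sim

    # Predicted rating = similarity-weighted average of other people's ratings.
    predicted = {film: weighted_sum[film] / weight_total[film] for film in weighted_sum}
    return sorted(predicted, key=lambda film: (-predicted[film], film))[:top_n]


# Hypothetical ratings on a 1-5 scale.
sample_ratings = {
    "ann": {"Alien": 5, "Brazil": 4},
    "bob": {"Alien": 5, "Brazil": 4, "Casablanca": 5, "Dune": 1},
    "cat": {"Alien": 1, "Dune": 5},
}
print(recommend(sample_ratings, "ann"))
```

In this toy data, bob rates films much like ann does, so his opinion about the films she hasn’t seen counts for far more than cat’s.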

This is the 0.001alpha version, so it’s very much a pre-release: it might have bugs and problems, some films don’t have years, some have weird stuff in their titles, etc., but as far as I’ve used it - it works pretty stably.

And I did enjoy the films that it recommended to me. It’s free.

Site: http://www.flim.me/

Posted in Uncategorized | Comments Off

10th Nov 2008

Did Google develop Artificial Intelligence? And now it wants to know the meaning of life??…

Google becomes more and more weird every day. By now I’m kind of used to seeing Googlebot loading scripts and executing them, but today there was something weird. Not only did the Googlebot have a REFERRER field, but… well, look for yourself:

The Googlebot was searching for “meaning of life” limited to my “suggestan.com” site and it “clicked” the link to read the “If you want to know the meaning of life” page there. This is spooky. I know Google is working on artificial intelligence, but I was thinking of the “here’s a couple more book searches for you” kind of intelligence, not the “who are you people and where’s all the booze?” kind. Thank God Googlebot didn’t start filling in the empty spaces there.

P.S. Ok, no conspiracy theories or UFOs here… It was the first time I ever saw Gbot with a referer, and it was a nice one. My best explanation is that this is the way Google fights cloaking, trying to emulate a person completely. But it was just too fun to see this exact page being hit.
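If you want to check your own logs for the same thing, here’s a small sketch that pulls out Googlebot hits carrying a referer. It assumes the common “combined” access log format (request, status, size, referer and user agent in that order); the function name and the sample lines below are invented for illustration and are not my real log entries. Keep in mind the user agent string can be faked, so this only tells you who the visitor CLAIMS to be.

```python
import re

# Matches the tail of a "combined" format access log line:
# "request" status size "referer" "user agent"
LOG_TAIL = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def googlebot_hits_with_referer(lines):
    """Return (request, referer) pairs for hits that claim to be Googlebot and carry a referer."""
    hits = []
    for line in lines:
        match = LOG_TAIL.search(line)
        if not match:
            continue  # not in the expected format, skip it
        referer = match.group("referer")
        if "Googlebot" in match.group("agent") and referer not in ("", "-"):
            hits.append((match.group("request"), referer))
    return hits


# Hypothetical log lines (addresses, paths and URLs are made up).
sample_log = [
    '192.0.2.10 - - [10/Nov/2008:12:00:00 +0000] "GET /topic/1 HTTP/1.1" 200 512 '
    '"http://www.google.com/search?q=example" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '192.0.2.11 - - [10/Nov/2008:12:00:05 +0000] "GET /topic/2 HTTP/1.1" 200 640 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '192.0.2.12 - - [10/Nov/2008:12:00:09 +0000] "GET /topic/3 HTTP/1.1" 200 700 '
    '"http://example.org/" "Opera/9.62 (Windows NT 5.1; U; en)"',
]
for request, referer in googlebot_hits_with_referer(sample_log):
    print(request, "<-", referer)
```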

Posted in Uncategorized | Comments Off

06th Nov 2008

Reverse Suggestan

Think you’re witty enough? Well, I’ve just developed a new Suggestan feature which will show you how much wit you’ve got. Ready? Take the challenge at “Reverse Suggestan”.

It’s basically the reverse of what Suggestan is - try to guess a topic from a phrase in it. And if you think your guess is interesting enough - create a topic with it!

Also, as suggested by fans on “Suggest a feature” - I’ve opened a Facebook fan page. I don’t know what it’s useful for. :)

P.S. I’ve also closed TheRarestWords this week - it’s due for a comeback next year (I think so).

Posted in Uncategorized | Comments Off