Tuesday, January 19, 2010

[Thinking Cap Question]: Think of a world that never was and ask why not

[[[From time to time, I will send "thinking cap" questions on the class blog. The idea is that you
respond to the question with your thoughts on the blog (posted as a comment to the question). 
This will count towards "participation" credit, but also allows you to share your class-related ideas with 
other folks in the list. 

As for how often you should feel compelled to respond vs. how deep your thoughts should be, I would use Woody 
Allen's philosophy on quantity vs. quality, expounded in the context of a
  slightly different situation (start at 3:30)]]]
Here is the first thinking cap. 
Post your answers to this "homework 1" question on the blog, so we can perhaps aggregate/discuss:


Think of and list 3 queries (or activities) that you would like to do on the
Web that the current day search engines (e.g. Google) don't quite support.

A quote to get you inspired:

"Some people see things as they are and say why? I dream things that
never were and say why not?" 
-(Mis)attributed to Robert Kennedy
who paraphrased  Bernard Shaw

post away

ps: David Bendit is the first to accept the invitation to the class blog. That kind of enthusiasm is usually grounds for either an A+ in the course or a piece of yummy candy (straight from the candy mountain).


  1. - Any query that transforms data from one format to another. This is supported in rudimentary form, like PDF to HTML, but what if I want the NYTimes RSS feed as CSV or JSON? Or what if I want the semantics about a particular website in a structured form?

    - A query that allows me to describe what it is I'm looking for and helps me resolve it to what that thing actually is. For example, "1969 beatles album" = "abbey road". This is easy because it exists somewhere on the internet. But what about "Number of cars manufactured in italy since 1956"? All of the data for this query exists online, but it still requires a human to compile it.

    - Let me draw or speak what it is I'm looking for (e.g. airplane) and provide me results about that.

  2. -A query that can get me results with suggestions on what is the most appropriate meal for me depending on the day, time, and even physical state that I currently have (perhaps results could indicate to stay away from sugars if I'm under a medication and such)
    -A query that can help me determine what kind of activities I should get engaged on according to the mood that the computer can detect that I have at the moment of the query.
    -A query whose results could give me the best suggestions on where to spend/invest my money according to the plans that I have in my life and also taking into consideration my family's budgets. So if I'm interested in certain products, real estate, cars, etc. the results could guide me better on where to buy, when, how to find the best interests according to the money I could put down, where is the best deal that I can get if I'm eligible for such, etc.

  3. -What is the best _____ to buy with a budget of $___, optimizing ______, ______, and _____? Even with that level of detail, most search engines are unable to fulfill the query. Even product searches, unless they're searching a specific inventory, are unable to do this. And even the inventory-specific ones (say, Best Buy's site) that let you filter by criteria won't be able to tell you the "best" one for you.
    -Where did I leave my keys? Until we get the "network of things", this may be impractical, but I'd like to see it happen someday.
    -Why is the flag being flown at half-mast today? (This one's always bothered me) The rules are available (and US law is closer to kernel code than prose [as an aside, it's really aggravating that I can't find the site where I read this comparison. Thanks, Google]), but search engines don't parse it, they just see keywords.

    @Patrick: In regards to your second point, Wolfram|Alpha is really close to that. I tried with cars, but they're still loading that data. Try it with Italian population data, though, and you can see trends.

    In regards to your last point, on any Android phone, you can do a voice search or, with Google Goggles, picture/text/logo/etc. search. It's really neat stuff.

  4. -Answer questions like who is most suitable to be my friend on a specific network. Although most social networking websites provide suggestions, a single search engine which can give me relevant results would be nice.

    -Queries like how to get from A to B using public transportation. This is available for some of the major cities in the U.S., but not for cities in other countries or for going from one country to another using cheap modes of transportation.

    -Definitely multimedia search as Patrick suggests. Given multimedia data answer questions like who/what/where/when about the data.

  5. These are the things that I feel are required from a search engine (or rather, that I have tried without getting my queries answered):

    (a) Queries like getting the details of all the games where Sachin Tendulkar got a man of the match award in the game of cricket (or any other sport). [There have not been good search engines in the field of sports. If I were looking for an answer, I would get all the games Sachin has played, or pages about other players who have got man of the match.] I feel there are lots of sports followers :)

    (b) I have always wanted to see things or meet people not by searching manually but according to a person's feelings. For example, setting the status message of your gtalk to "happy" or "sad" should produce a search that suits you at that moment. And all the people who are in a "happy" or "chatty" mood at a given time could be brought together to discuss something happy or to share a conference.

    (c) Queries like "display the pages with words which contain able". I would like to see the pages with the words table, disable, pliable. I might be interested in words related to integr, like integrity and integration. I am not able to give the right example. Basically, I also want the pages with related words (jargon) in them.

  6. @Patrick,
    To Q1, you might try Google Squared/Kosmix/Freebase, especially Google Squared and Freebase.
    To Q2, you can try Wolfram|Alpha.
    To Q3, in addition to Google Goggles, you might get some ideas from computer vision people, especially from this paper: "Sketch2Photo: Internet Image Montage". They get photos from a sketch. In your case, you get a query from a sketch and then get answers from the SE. It's quite similar in general.

    To Q1, you can get the result from Location-based services (LBS). A lot of research on that including query processing.

    To Q1, you might try Yahoo answers, though.
    To Q2, you might refer to https://networkchallenge.darpa.mil/Default.aspx
    about finding balloon in the world.
    To Q3, again, Yahoo answers, or you might try "why, date, flag, half-mast" in a search engine and I think you can get the answer.

    I think most of the proposed queries can be answered using 1) better keyword combinations 2) collaborative answering

  7. 1. A search engine should identify what the user "actually" has in mind (keeping in mind that not all users can articulate precisely what they need) and return the "correct" answer, instead of giving the user 100,000 choices to choose from.

    2. Image search should not search just the text that is attached to the images, but the content itself. This could also help in censoring images and videos, not just based on the keywords attached to the media, but based on the content itself.

    3. As mentioned by David, I think future search engines should also solve our everyday problems by helping us find our misplaced keys and books etc.

  8. 1. Image/Document as query. The results tend to be the relevant images, texts, etc.

    2. I sometimes find it difficult to compose a query or to make the query queryable (precise yet expressible). A modern SE should provide me a dynamic query interface to help me compose the query. (The query interface can be generated from the query log, given my query patterns, user behavior, etc.)

    3. Man-powered search. That is, I send a query to the SE and, rather than returning web pages, the SE distributes my query to other users (especially experts) who are using the SE right now. This way, I can use very complicated queries (natural-language queries, since people can understand them). Nowadays we might use Yahoo answers to do the same thing; however, 1) people there are not always experts, and 2) an expert on that query might not use Yahoo answers, yet they do use the SE. Hence, by relying on users' search logs, it is possible to build a man-powered search engine.

  9. In response to Henry Hu

    --people-powered search of some kind is already a reality. See http://www.chacha.com/
    --while chacha.com allows only normal search questions, a different idea is practised by mturk
    https://www.mturk.com/mturk/welcome (which allows you to "farm" a complex task to legions of people workers. The trick in using mturk of course is how to cleverly do the farming-out. As of now the tasker has to do it, and there is research afoot to try and automate it).


  10. First, I would like Google to be able to search through a video (Youtube, etc.) and find a statement. For instance, when I watch a lecture by a scientist and I really want to go back to something that person said, I should be able to type it in a search box, maybe with a "Transcript:" prefix, and it would automatically transcribe the entire video to a Word document (or perhaps a Google doc). That would be immeasurably helpful for me because not only could I use it for research purposes on a paper, but I would also have something in hard copy to cite for my professors.

    Second, I feel search engines should immediately detect errors or driver issues when I have a problem installing or loading something. I can't tell you how many times I have loaded an insignificant printer driver for a fresh install of an operating system at home, only to find it is not compatible, and the Microsoft search for the driver is woefully inadequate. I'm left to Google the error code in vain attempts at old blog posts and forum discussions. I rarely find the answer I'm looking for.

    Third, I would like search engines to have an option to tell me if I am paraphrasing or quoting another work too much in my own paper. Not that I plagiarize, but in papers, it's always nice to know that you have created something original that nobody else has written. There are paid options that I know teachers will run a student's paper through, but if people could upload their writing assignments/research papers to a search engine, and students would be able to run their own paper against papers with similar topics, it would be helpful. Also, some kind of metric could be used to gauge how well someone's grade might be in relation to how well other students did with their papers on the same topic.

  11. 1. a search engine which understands the query rather than using query contents as keywords
    - contact details of students who graduated in ___ or stadiums with seating capacity of more than ___

    2. a more dynamic form of location based services which uses the current location of your friends - who all are there in brickyard or search for friends who are within a mile from my current location

    3. an improved query interface which gives more control to the user so that user can guide the search (where to look for information or not to look for information) and filter/arrange the results based on his preference

  12. A search engine which is able to output pieces of code. For example, You're coding a particular project in whatever language and you realize you need a conversion function to convert from dollars to euro. Instead of googling for a library, downloading it, installing it and using it, I'd like to be able to write something that will automatically search for a library or service that has that function and make it work.

    So I'd be able to write something like:
    search-import "money conversion from dollars to euro"
    euro = dollars.convert("euro")

    Where the search engine finds me an applicable library or service and makes it work.

  13. ~ To a certain extent, it provides support for getting the result in structured format such as XML but it doesn't support it completely. It is limited in terms of number of results retrieved and number of automated queries that can be sent. Therefore we can't obtain results in a required format sometimes.

    ~There should be an interface that provides an option to retrieve the result set according to given specifications. For example, a user can give a search string and specify the number of results along with the fields required (such as the main link, small description, etc. that we see on a web search result page) and get the result exported in some standard format like Excel or CSV.

    ~ Suppose a user is not able to remember a search string completely. For example, "Hasta la vista" is the search string and the user is not able to recall "Hasta". There is no provision to give a search query like "? la vista" so that a list of all possible matching strings is made available and the user can pick the required string to do the actual search.
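
    The "? la vista" idea above can be sketched in a few lines. This is only an illustration, assuming the engine already holds a list of candidate phrases; the function name `wildcard_matches` and the sample phrase list are made up for the example, and '?' is taken to mean exactly one unknown word.

```python
import re

def wildcard_matches(pattern, phrases):
    """Return the phrases that match a pattern in which '?' stands for one unknown word."""
    # Literal words are escaped; each '?' becomes "one run of non-space characters".
    parts = [r"\S+" if token == "?" else re.escape(token) for token in pattern.split()]
    regex = re.compile(r"\s+".join(parts), re.IGNORECASE)
    # fullmatch ensures the whole phrase fits the pattern, not just a fragment of it.
    return [phrase for phrase in phrases if regex.fullmatch(phrase)]

# Hypothetical phrase index; a real engine might draw these from its corpus or query log.
phrases = ["Hasta la vista", "Buena la vista", "la vista", "Hasta la vista baby"]
print(wildcard_matches("? la vista", phrases))
```

    The user would then pick one of the returned strings and run the actual search with it.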

  14. 1. It would be nice to have a search engine that 'guessed' your query or intent in a sort of 20 questions style manner for when you don't know exactly what it is you are trying to find. The typical scenario would be giving the search engine a string of semi-connected points related to your topic (i.e., "tv show, few weeks ago, maybe NBC or Comedy Central, man washed dog with hose") and the engine would ask you questions to narrow down and determine what it is you are trying to ask about. A similar service is available today, but it is more of a game for having a 'genie' guess what character you are thinking of (http://us.akinator.com/).

    2. Contextual search experience, where your prior searches and habits help influence the results you receive. For instance, if you are researching information about the sun, then pose the query "how hot is it", you wouldn't get results from dating services, song lyrics from Nelly, or some other unrelated information. It would be more akin to having a conversation with someone where you don't have to explicitly describe the subject, circumstances, or situation every time. Now that I think about it, this idea reminds me of the Bing commercials, but I don't think they are doing this sort of 'learning'. The types of influences on your searches could be numerous, including time of day, weather, current events, seasons, trends (fashion, music, etc.), et al.

    3. Maybe someone will point me in the right direction for a service similar to my number three, but what I would like to see is a 'trendy' search that focuses on new and fresh internet material that is popular. That could be news, reviews, comics, research papers, and other interesting items. I realize Google has a trends section for tracking popular search terms, and that they also have a feature for results from the last week, 24 hours, etc., but neither provides the information in a very useful manner (WNBA.com updated 5 hours ago? Why should I care?). It would be similar in style to Digg, but instead of user submissions, it would rely on dynamically tracking current trends through data mining.

  15. -An extension from the currently existing voice technology for Android 2.1 devices to the much more powerful computer platform. Voice recognition is a hard thing to do properly (as demonstrated by my garbled messages in Google voice), but obviously Google thinks that it is reliable enough to include it as a main feature in its android devices. I would like to be able to use the mics I have on my computer to speed up my searching. An example would be if I were reading a website, something struck my google fancy, and all I would have to do would say "google ducks laughing" and a new tab would pop up with my query. This is reasonably possible with a combination of plugins and microsoft voice recognition, but I would like to see google do it.

    -A query that I could ask google: "what do I want to look for (or buy or something else) next?" and it would tell me what I need/want/am looking for. Seriously though, I bet they could do it. They have so much information about me at their disposal, they know all my contacts, all my queries, everything I do on the internet, everything I purchase, who I call and what I say during my calls, that they can put a profile together that tells them what I want.

    I once heard a guy on the internet (http://www.reddit.com/r/IAmA/comments/al3tl/iama_fraud_prevention_agent_for_a_major_credit/) say that the credit card companies can predict the odds that you will go to McDonalds on a given day based on your history and profile. I bet that knowing my information, a search engine can compare my history and profile and predict what I will look for next.

    -We should be able to search for our physical things. Everything should have rfid tags, and there should be readers everywhere. Then all I have to do is search for my keys and boom! under the couch. Of course there are security issues, but someone clever will think of a solution.

    On a related note - think of a world that never was and ask why not: Google should have won the auction on the wireless spectrum. Spend several billion on infrastructure investment, a couple more on device research/manufacturing, and they could have destroyed the entire telecom industry in a few short years. Imagine a Google phone/laptop that connects to their wireless network (20-30 mile range for the spectrum frequencies as opposed to the 3-5 mile range for current cell phone frequencies) for free (or something much, much lower than the $70/month most people pay now), supported by Google ads. One connection for voip/data, and internet pretty much everywhere it is wanted. Yep, that would totally rock.

  16. 1. Upload an image and search for the web pages which contain that image or similar ones.
    2. A sort of "semantic" search: enter a word in the search engine and search for the documents which contain not the word itself but its meaning.
    3. Nested searches: given a set of keywords, search for all the documents which contain the first keyword, then search those results for the second keyword, and so on. This is possible right now, but not in a single search; the idea is to propose a syntax which allows that type of search.
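
    A minimal sketch of the nested search in item 3, assuming plain substring matching over an in-memory list of documents. The `>` separator is a made-up syntax for chaining keywords, not something an existing engine is known to support, and the helper names are hypothetical.

```python
def parse_nested_query(query):
    """Split a query such as 'search > web' into its ordered keywords (hypothetical syntax)."""
    return [part.strip() for part in query.split(">") if part.strip()]

def nested_search(documents, keywords):
    """Filter by each keyword in turn, searching only the results of the previous step."""
    results = list(documents)
    for keyword in keywords:
        needle = keyword.lower()
        # Case-insensitive substring test; a real engine would consult an index instead.
        results = [doc for doc in results if needle in doc.lower()]
    return results

docs = [
    "Search engines rank web pages",
    "Image search ranks pictures by content",
    "Web crawlers feed search engines",
]
print(nested_search(docs, parse_nested_query("search > web")))
```

    With pure substring filters the final set is the same as requiring all keywords at once; the nesting starts to matter when each step can use a different ranking or a different kind of match.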

  17. 1) A query that combines entities which are independent yet related, e.g. a query that would put a complete outfit together. If we search for a green dress, the results show us the matching combinations (with that green dress) from accessories to shoes along with the price, so we can choose a particular selection and buy everything together instead of buying one shirt and searching various stores or websites to find suitable combinations. The same goes for other cases, e.g. furniture for a room with all the suitable accessories.
    2) A query that can do a 'real time' search, e.g. search all the events in the last five minutes.
    3) Queries to search by color or shape, like images with a blue background or text in a green color.
    4) A query that searches all the recent searches.

  18. 1. When I search for a video/music/article (any similar document) I should be able to get results based on the authenticity. I want to address the issue of plagiarism here. Only the sources which have the copyright to some document should be displayed. Also if other sources are being displayed the authenticity should appear as unknown (or similar).

    2. If I search for a "Free" source for a document, only the free sources should be displayed. Many websites ask to pay at the end of the transaction (fooling the users at the start), resulting in wastage of an ample amount of time for the user.

    3. Search engines should display the results on the basis of content (number of posts). Suppose I am searching for normal text (any question); then the popular sites (e.g. Yahoo answers, which generally has more hits) will appear first even if they have fewer posts than other sites for a similar kind of question. So the questions which have more posts should appear first (irrespective of the popularity of the site).

    4. Search engines should have specific sub class engines for developers, for travellers, for researchers, for bloggers etc.

  19. I guess these are more activities than queries necessarily
    1. Learn the search context from history or have a setting to set the context. For instance, Banjo is a musical instrument, but also a software application for Bayesian network inference. It would be nice if the search mechanism would know from my history that the software application is also interesting to me and show it in the first page of results.
    2. Search to translate between terminologies/lingo/slang rather than just languages. A simple example (which can probably be achieved by integrating several keyword searches) is translating a doctor's description of a condition into something a layman would understand.
    3. Search through aisles in supermarkets/stores or even different stores. If I feed in a list of things I need to buy, it should give me the shortest route to take to buy everything I need.
    4. This is an offshoot of the previous idea. A system to plan a trip (such as a bunch of errands). If I feed in the places I need to go to and how long I need at each stop, it should tell me the best order to finish them, taking into account traffic, hours of operation, etc.
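
    For a handful of stops, the ordering in items 3 and 4 can be found by simply trying every order. The sketch below does that with made-up travel times; the place names, the `TRAVEL` table and the helper names are all assumptions for illustration, and traffic and opening hours are ignored.

```python
from itertools import permutations

# Hypothetical symmetric travel times, in minutes, between home and three errands.
TRAVEL = {
    ("home", "bank"): 10, ("home", "grocery"): 15, ("home", "post office"): 20,
    ("bank", "grocery"): 12, ("bank", "post office"): 5, ("grocery", "post office"): 8,
}

def minutes(a, b):
    """Look up the travel time between two places in either direction."""
    return TRAVEL[(a, b)] if (a, b) in TRAVEL else TRAVEL[(b, a)]

def route_minutes(start, order):
    """Total travel time for leaving start, visiting the stops in order, and returning."""
    route = (start, *order, start)
    return sum(minutes(a, b) for a, b in zip(route, route[1:]))

def best_order(start, stops):
    """Try every ordering of the stops and keep one with the least total travel time."""
    return min(permutations(stops), key=lambda order: route_minutes(start, order))

order = best_order("home", ["bank", "grocery", "post office"])
print(order, route_minutes("home", order))
```

    Trying every order grows factorially with the number of stops, so this only works for short errand lists; a longer list would need a heuristic.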

  20. To Rao,

    Chacha.com is a good example of human computation in SE. However, as you pointed out, it can only take normal queries. From what I learned, there are some "guides" sitting there to answer the questions, which are sent through SMS. Then the question is how real-time it is for a normal query. The story I read said it takes less than 3 min. Well, that might be real-time; however, what about this query: "who was the PC chair of AAAI 05 and who was his advisor?" I suspect it would take longer to answer, say 10 min or more.

    To me, this is not a hard/complex task and I can answer it really fast. So it is not necessary to post it on MTurk, since the people there might not be interested. (They are looking for long-term tasks.) My conclusion is that ChaCha is not a perfect choice for this kind of very domain-specific query, while MTurk seems too luxurious for such a simple query.

    One solution might be to push the specific question to specific people in order to get real-time answers. ChaCha might be a good platform to do that. It just needs to mine the user profiles and spread the question to those specific users. But the problem of targeting potential users and aggregating their answers is very hard, let alone answering in real time.

  21. As someone has already pointed out, it would be good if, for any search posed as a question (for which the information is already available on the web), the search engine itself could generate queries in the background and give the results to the user.

    Rather than searching solely on the keywords provided by the user, the search should be based on the context in which the user is posing the query. For example, if I search for some xyz free ebook, the results I get will contain pages that have free ebooks, but the xyz book that I am searching for may not be available on those sites. So in this case, the search engine should understand what exactly I am looking for, and provide me the results based on that.

    When you are uncertain about the information that you are looking for and manage to give some keywords for searching, most search engines give the results in terms of the single closest match. (In fact, some of them also ask "Did you mean: xyz?" and display the results for xyz.) It would be good if the top 3-4 closest matches were given as options for the user to select from.
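
    Offering several "Did you mean" candidates instead of one can be sketched with Python's standard `difflib` module. The vocabulary here is a made-up stand-in for an engine's term list, and the 0.6 similarity cutoff is just the library default, not a tuned value.

```python
import difflib

def suggestions(query, vocabulary, k=4):
    """Return up to k vocabulary terms closest to the query, best match first."""
    # get_close_matches ranks candidates by similarity and drops those below the cutoff.
    return difflib.get_close_matches(query, vocabulary, n=k, cutoff=0.6)

# Hypothetical vocabulary; a real engine would use terms from its index or query log.
vocabulary = ["information", "retrieval", "reformation", "inflammation", "formation", "banana"]
print(suggestions("infromation", vocabulary))
```

    The user would then choose among the returned options rather than being routed to the single best guess.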

  22. 1. Multimedia Search:

    Related Video search:

    What is currently supported:

    I think content based multimedia search has lot of scope to be improved. Most of the audio and video searches are based on retrieving results by searching tags, title of the video, user comments and other text data around the video. Recently there has been some improvement in research related to fetching "related results" for video, the techniques are based on fetching results that are having similar frame content in terms of color content, similar objects etc. For example: For a video on a football match, the related videos of other football matches can be fetched purely based on similar contents(color histogram, playground, players, stadium) without considering the text around the video.

    What can be done:

    But what can be improved here is making the search more semantically meaningful. For example, related results can be based on similar emotional sequences in the video, similar events, similar happenings etc. These results need not have exactly the same background or environmental settings but yet they can be similar in their content. For example: Related Videos of Two fast moving cars of totally different colors, totally different tracks, Related videos of earthquakes on completely different locations or related videos of newsreading even though the person reading the news and environment setting is different, related video on people discussing about same topics etc.

    Within Video Search:

    Within the video, if the user wants to search a particular instance by just searching using keywords. For example, A user watching a football match video wants to just quickly watch the second goal, he may just search by using keywords "second goal" and the result should start the video from second goal.

    2. Product Searches:

    What is currently supported:

    I think product searches done for online shopping are mostly exploratory in nature. The users are mostly not very familiar with the product, but they take their time in exploring about the product themselves and making a decision. The present day product search systems help users make decisions by means of user reviews, product ratings, statistical facts on the product's purchases made so far etc. But yet I think product search systems have not become very interactive when compared to conventional store shopping and a major percentage of people still prefer store shopping with high confidence.

    What can be done:

    Social networking can be relied on to support online shopping search in useful ways. For example, a person's blogs, wiki, social networking profile, history of purchases, etc. provide an enormous amount of detail about the person's expertise in a particular field. Based on that expertise, the person can be termed an expert in the respective field. Think of a situation where a user searching for a camera is instantly advised by experts online to buy the second product result, or is given a different search query to get better results. Such a system can give good confidence to the user buying the product online. The challenge lies in making such a system interactive and easy to use rather than an annoyance to both the users and the experts.

    3. News searches:

    Search for any news and compare the results from different newspapers. The system should extract and output in short, the contrasting or slightly different stories published by different newspapers about the same topic. It would be an interesting tool for avid news followers to get an objective view about the news rather than relying on single news source.

  23. Many search engines do not return the actual content pages; rather, you get even what you shouldn't get. Sometimes the search results contain links that don't provide any useful content but merely unwanted advertisements. With the advent of the semantic web we can expect more precise and more exact search results.
    Voice search, which is currently available only on mobile phones, can also be helpful in many ways.
    Currently we just have video and image search, and no audio search is available with any search engine.
    Security is one of the main issues with search engines. Many search engines return highly ranked unsafe websites when the search is for downloading music and games.

  24. I think semantic search is what is most lacking in common search engines. The semantic web may help in alleviating that problem, but it seems it can be done for only a small percentage of all web content. I have often been unable to find pages based on synonyms or the implied meaning of my keywords. For example, if I am searching for "risk posed by financial instruments", it won't give me anything about credit derivatives or exotic options in the stock market.
    In other words, WIMIWIG ("What I mean is what I get") is somewhat missing.

  25. 1. summarize the website or list of websites or a search result for me.

    2. Where was Osama Bin Laden(given the image, speech, writing style) yesterday?

    3. search within video and audio

  26. 1) Similar to what Chaube and others had mentioned, my first query is: "Give me a song similar to the tune I just sang to the computer."

    2) Find me the paper I want, and figure out how to access the full paper via all sources available to me (my ACM membership, ASU library research database, CiteSeer or Google Scholar, etc.). The point is I don't want to spend the time figuring out which way may work.

    3) Where are my keys? (Connecting to the physical world, via RFID tags, WSN etc..)

  27. 1) Right now, if you query a math equation into Google, Google will return the answer to the math equation on top of web site results. You then don't have to even look through the web results because you have your answer. I would like it if a search engine would return a definite answer like Google does for math problems for any question. For example, if I put the query, "How long do zebras live?", I would like it if a search engine would return the answer, "The average life of a zebra is...". This would make it so the user doesn't have to click on the Wikipedia page and search for the words "live" and "life" in the document.

    2) As several people have mentioned before, a search engine that would analyze the image for shapes of objects instead of analyzing the tags associated with the image. For example, if I searched for images of tigers, I would like a search engine to analyze images for tigers and not return an image of a man nicknamed "The Tiger".

    3) Right now if you search for something on Google, it returns a link to a website and a snippet of a sentence with your query in context to the paper. I would like it if once you visited the site if Google could highlight that snippet on the site instead of having to search through the page for all the instances of the words you are looking for.

  28. As an additional thought, I think a search engine that can search our memories would be really nice. You think of a timeframe or an event and the search engine shows all the memories from friends and family corresponding to it.

  29. This comment has been removed by the author.

  30. Something that takes comparisons to a new level. For example, when deciding whether to go with one vet or another, have information that directly compares them instead of just independent reviews (that are often spotty). Include different metrics like price of procedures, customer relations, etc.

    Be able to search for something I want to buy and have closest stores and prices come up (like nails bringing up the closest home depot and how much their nails cost).

    A bias detector when I read political news that upon searching for articles puts information about how biased the results are to the left or right next to the result (based on the language of the article).

  31. -one day, if that is possible, I hope I can specify (upload) a picture instead of keywords, then the search engine will return related pictures ordered by relevance.

    -Sometimes, I feel that the search engine is not smart enough. I'd like to input a sentence as the query string; very likely, the sentence means something even though all of its words are stop words.

    -The keyword NOT is usually ignored by search engines. I need information that is negative, but search engines fail to provide that.

  32. * I would like to be able to type a question in a sentence format, for example "Why do carrots turn green in carrot cake" and have the search engine return the answer, such as "because there is too much baking soda in the cake" instead of having to search through many pages that are returned and read about each theory.

    * I think it would be useful to have a search engine that not only searches on the keywords, but also on variations of the keywords or similar words. It would be useful to have an option to include similar words, so that if the actual word is eluding you, you can search for a similar word and receive results for both.

    * I would like to be able to search for information on the internet about a specific person, by name and only receive information about the person that I am thinking of, for example, not all of the other "John Smith"s in the world, just the one who is my brother-in-law.


Note: Only a member of this blog may post a comment.