<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Noisy Channel &#187; Search Results  &#187;  twitter</title>
	<atom:link href="http://thenoisychannel.com/search/twitter/feed/rss2/" rel="self" type="application/rss+xml" />
	<link>http://thenoisychannel.com</link>
	<description></description>
	<lastBuildDate>Sat, 04 Feb 2012 19:24:14 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Social Wisdom in Seattle</title>
		<link>http://thenoisychannel.com/2012/02/04/social-wisdom-in-seattle/</link>
		<comments>http://thenoisychannel.com/2012/02/04/social-wisdom-in-seattle/#comments</comments>
		<pubDate>Sat, 04 Feb 2012 19:24:14 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=4089</guid>
		<description><![CDATA[      First, I wanted to give readers a heads up that I&#8217;ll be in Seattle this Friday and Saturday. I&#8217;ll spend Friday afternoon at the University of Washington, meeting with some of their outstanding computer science doctoral students. My schedule filled up with unexpected haste! But if you&#8217;re on campus and urgently want [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.cs.washington.edu/"><img class="alignnone" style="margin-left: -10px; margin-right: -10px;" title="University of Washington Computer Science &amp; Engineering" src="http://www.cs.washington.edu/images/cse_logo_80x133.gif" alt="" width="106" height="64" /></a><a href="http://wsdm2012.org/"><img class="alignnone" title="WSDM 2012" src="http://wsdm2012.org/img/topheader.png?1312770667" alt="" width="274" height="72" /></a>     <a href="http://brynnevans.com/blog/wp-content/uploads/2010/03/social-search.png"><img class="alignnone" title="Social Search" src="http://brynnevans.com/blog/wp-content/uploads/2010/03/social-search.png" alt="" width="85" height="63" /></a></p>
<p>First, I wanted to give readers a heads up that I&#8217;ll be in Seattle this Friday and Saturday. I&#8217;ll spend Friday afternoon at the <a href="http://www.cs.washington.edu/">University of Washington</a>, meeting with some of their outstanding computer science doctoral students. My schedule filled up with unexpected haste! But if you&#8217;re on campus and urgently want to meet, let me know and I&#8217;ll see what I can do.</p>
<p>Saturday I&#8217;ll be attending the social track of <a href="http://wsdm2012.org/">WSDM 2012</a>, the premier international ACM conference covering research in the areas of search and data mining on the Web. I&#8217;m excited about the program, as well as the opportunity to catch up with friends and make new ones. Back in 2010, I had the pleasure of co-organizing the Workshop on Search and Social Media (<a href="http://thenoisychannel.com/2010/01/25/workshop-on-search-and-social-media-ssm-2010/">SSM 2010</a>) and being the official ACM blogger for <a href="http://www.wsdm-conference.org/2010/">WSDM 2010</a>. You can read my posts <a href="http://thenoisychannel.com/2010/02/04/report-on-the-third-workshop-on-search-and-social-media-ssm-2010/">here</a>.</p>
<p>Then, on Saturday evening, I&#8217;ll be heading to Microsoft Research to attend the Social Search Social (<a href="http://research.microsoft.com/en-us/events/sss2012/">SSS 2012</a>). Hats off to organizers <a href="http://research.microsoft.com/en-us/um/people/merrie/">Meredith Ringel Morris</a>, <a href="http://www.fxpal.com/?p=gene">Gene Golovchinsky</a>, <a href="http://twitter.com/#!/jerepick">Jeremy Pickens</a>, <a href="http://faculty.ist.psu.edu/reddy/">Madhu Reddy</a>, <a href="http://comminfo.rutgers.edu/~chirags/">Chirag Shah</a>, and <a href="http://people.lis.illinois.edu/~twidale/">Michael Twidale</a> for creating what looks to be a fun (and very social!) event. I&#8217;m especially looking forward to the 45-second &#8220;madness&#8221; presentations (in which I&#8217;m participating) and the &#8220;speed dating&#8221; to help cross-pollinate the WSDM and <a href="http://en.wikipedia.org/wiki/Computer-supported_cooperative_work">CSCW</a> communities.</p>
<p>Hope to see some of you there, and of course will share what I learn here at The Noisy Channel. I also encourage you to follow the tweet streams for <a href="https://twitter.com/#!/search?q=%23wsdm2012">#wsdm2012</a> and <a href="https://twitter.com/#!/search?q=%23sss2012">#sss2012</a>.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2012/02/04/social-wisdom-in-seattle/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2012/02/04/social-wisdom-in-seattle/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>LinkedIn @ CMU</title>
		<link>http://thenoisychannel.com/2012/01/26/linkedin-cmu/</link>
		<comments>http://thenoisychannel.com/2012/01/26/linkedin-cmu/#comments</comments>
		<pubDate>Thu, 26 Jan 2012 18:47:49 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=4083</guid>
		<description><![CDATA[As regular readers know, I have a deep affection for Carnegie Mellon University, where I did my graduate work. I&#8217;m happy to announce that two of my colleagues (both fellow CMU PhDs) will be giving talks at CMU in a couple of weeks, and I hope that some of you will have the opportunity to [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://engineering.linkedin.com"><img title="LinkedIn" src="http://thenoisychannel.com/wordpress/wp-content/uploads/2011/09/in-logo.jpeg" alt="" width="205" height="205" /></a><a href="http://www.cs.cmu.edu/"><img title="CMU School of Computer Science" src="http://www.cs.cmu.edu/~ref/naacl/logos/bronze/dragon-small.jpeg" alt="" width="277" height="241" /></a></p>
<p>As regular readers know, I have a deep affection for Carnegie Mellon University, where I did my graduate work. I&#8217;m happy to announce that two of my colleagues (both fellow CMU PhDs) will be giving talks at CMU in a couple of weeks, and I hope that some of you will have the opportunity to attend.</p>
<p>On Tuesday, February 7th, <a href="http://www.linkedin.com/in/abhilad">Abhimanyu Lad</a> will be hosting an information session at 6pm in Scaife Hall, Room 214. Abhi is a rock star on our data science team, and he&#8217;s been working on the next generation of LinkedIn search. You can get a taste of his work from his recent <a href="http://hcir.info/hcir-2011">HCIR 2011</a> presentation, &#8220;<a href="http://docs.google.com/a/kent.edu/viewer?a=v&amp;pid=sites&amp;srcid=ZGVmYXVsdGRvbWFpbnxoY2lyd29ya3Nob3B8Z3g6MWVlMGNhZWY5NTA3MzQ2ZA">Is it Time to Abandon Abandonment?</a>&#8221;. Abhi will talk about a variety of technical challenges that data scientists and engineers are working on at LinkedIn.</p>
<p>On Thursday, February 9th, <a href="http://www.linkedin.com/in/paulogilvie">Paul Ogilvie</a> will talk about &#8220;<a href="http://www.lti.cs.cmu.edu/LinkedInPaulOgilvie.pdf">Where Big Data Meets Real-Time: Efficiently Indexing and Ranking News using Activity</a>&#8221; at 3:30pm in GHC 6115. Paul is responsible for article relevance infrastructure and algorithms on <a href="http://www.linkedin.com/today/">LinkedIn Today</a>, a great example of <a href="http://docs.google.com/a/kent.edu/viewer?a=v&amp;pid=sites&amp;srcid=ZGVmYXVsdGRvbWFpbnxoY2lyd29ya3Nob3B8Z3g6MjEzNDNjZTk5NGYyYWQwOA">social navigation</a> &#8211; not to mention a <a href="http://techcrunch.com/2011/06/30/linkedin-traffic-twitter/">great success for users</a>. Paul will talk about the technical details that make LinkedIn Today possible, including a novel use of inverted lists to efficiently index and support real-time updates to document representations.</p>
<p>And, even if you can&#8217;t make it to the talks, I encourage you to visit the LinkedIn booth at the <a href="http://www.studentaffairs.cmu.edu/career/job-fairs/eoc/index.html">EOC</a> fair on Wednesday, February 8th. We&#8217;re looking for great software engineers and data scientists, and we&#8217;re especially interested in interns.</p>
<div>I hope that CMU students and faculty will take the time to meet Abhi, Paul, and their colleagues when they visit in a couple of weeks.</div>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2012/01/26/linkedin-cmu/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2012/01/26/linkedin-cmu/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CIKM 2011 Industry Event: Ilya Segalovich on Improving Search Quality at Yandex</title>
		<link>http://thenoisychannel.com/2011/11/27/cikm-2011-industry-event-ilya-segalovich-on-improving-search-quality-at-yandex/</link>
		<comments>http://thenoisychannel.com/2011/11/27/cikm-2011-industry-event-ilya-segalovich-on-improving-search-quality-at-yandex/#comments</comments>
		<pubDate>Mon, 28 Nov 2011 06:10:37 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3963</guid>
		<description><![CDATA[This post is last in a series summarizing the presentations at the CIKM 2011 Industry Event, which I chaired with former Endeca colleague Tony Russell-Rose. The final talk of the CIKM 2011 Industry Event came from Yandex co-founder and CTO Ilya Segalovich on &#8220;Improving Search Quality at Yandex: Current Challenges and Solutions&#8221;. Yandex is the world&#8217;s #5 search engine. It dominates [...]]]></description>
			<content:encoded><![CDATA[<p><iframe src="http://www.slideshare.net/slideshow/embed_code/10357517?rel=0" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" width="425" height="355"></iframe></p>
<p>This post is last in a series summarizing the presentations at the <a href="http://www.cikm2011.org/industryevent">CIKM 2011 Industry Event</a>, which I chaired with former <a href="http://www.endeca.com/">Endeca</a> colleague <a href="http://isquared.wordpress.com/about/">Tony Russell-Rose</a>.</p>
<p>The final talk of the CIKM 2011 Industry Event came from <a href="http://www.yandex.com/">Yandex</a> co-founder and CTO <a href="http://company.yandex.com/corporate_governance/board_of_directors/ilya_segalovich.xml">Ilya Segalovich</a> on &#8220;<a href="http://www.cikm2011.org/industryevent#is">Improving Search Quality at Yandex: Current Challenges and Solutions</a>&#8221;.</p>
<p>Yandex is the world&#8217;s #5 search engine. It dominates the Russian search market, where it has over 64% market share. Ilya focused on three challenges facing Yandex: result diversification, recency-specific ranking, and cross-lingual search.</p>
<p>For result diversification, Ilya focused on queries containing entities without any additional indicators of intent. He asserted that entities offer a strong but incomplete signal of query intent, and in particular that entities often call for suggested query reformulations. The first step in processing such a query is entity categorization. Ilya said that Yandex achieved almost 90% precision using machine learning, and over 95% precision by incorporating manually tuned heuristics. The second step is enumerating possible search intents for the identified category in order to optimize for intent-aware <a href="http://www.isi.edu/~metzler/papers/metzler-cikm09.pdf">expected reciprocal rank</a>. By diversifying entity queries, Yandex reduced abandonment on popular queries, increased click-through rates, and was able to highlight possible intents in result snippets.</p>
<p>Ilya then talked about the problem of balancing recency and relevance in handling queries about current events. He sees recency ranking as a diversification problem, since a desire for recent content is a kind of query intent. A challenge in managing recency-specific ranking is predicting the recency sensitivity of the user for a given query. Yandex considers factors such as the fraction of results found that are at most 3 days old, the number of news results, spikes in the query stream, lexical cues (e.g., searches for &#8220;explosion&#8221; or &#8220;fire&#8221;), and Twitter trending topics. He also referred to a WWW 2006 paper he co-authored on <a href="http://www2006.org/programme/files/pdf/p71.pdf">extracting news-related queries from web query logs</a>. The results of these efforts led to measurable improvements in click-based metrics of user happiness.</p>
<p>Ilya talked about a variety of efforts to support cross-lingual search. Russian users enter a significant fraction (about 15%) of non-Russian queries, but many still prefer Russian-language results. For example, a search for a company name returns that company&#8217;s Russian-language home page if one is available. Yandex implements language personalization by learning a user&#8217;s language knowledge and using it as a factor in relevance computation. Yandex also uses machine translation to serve results for Russian-language queries when there are no relevant Russian-language results.</p>
<p>Ilya concluded by pitching the efforts that Yandex is making to participate in and support the broader information retrieval community, including running (and releasing data for) a <a href="http://imat-relpred.yandex.ru/en">relevance prediction challenge</a>. It&#8217;s great to see a reminder that there is more to web search than Google vs. Bing, and refreshing to see how much Yandex shares its methodology and results with the IR community.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2011/11/27/cikm-2011-industry-event-ilya-segalovich-on-improving-search-quality-at-yandex/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2011/11/27/cikm-2011-industry-event-ilya-segalovich-on-improving-search-quality-at-yandex/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CIKM 2011 Industry Event: Ed Chi on Model-Driven Research in Social Computing</title>
		<link>http://thenoisychannel.com/2011/11/25/cikm-2011-industry-event-ed-chi-on-model-driven-research-in-social-computing/</link>
		<comments>http://thenoisychannel.com/2011/11/25/cikm-2011-industry-event-ed-chi-on-model-driven-research-in-social-computing/#comments</comments>
		<pubDate>Fri, 25 Nov 2011 21:23:16 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3951</guid>
		<description><![CDATA[This post is part of a series summarizing the presentations at the CIKM 2011 Industry Event, which I chaired with former Endeca colleague Tony Russell-Rose. Given the extraordinary ascent of all things social in today&#8217;s online world, we could hardly neglect this theme at the CIKM 2011 Industry Event. We were lucky to have Ed Chi, who recently left the PARC [...]]]></description>
			<content:encoded><![CDATA[<p><iframe src="http://www.slideshare.net/slideshow/embed_code/10164910" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" width="425" height="355"></iframe></p>
<p>This post is part of a series summarizing the presentations at the <a href="http://www.cikm2011.org/industryevent">CIKM 2011 Industry Event</a>, which I chaired with former <a href="http://www.endeca.com/">Endeca</a> colleague <a href="http://isquared.wordpress.com/about/">Tony Russell-Rose</a>.</p>
<p>Given the extraordinary ascent of all things social in today&#8217;s online world, we could hardly neglect this theme at the CIKM 2011 Industry Event. We were lucky to have <a href="http://www-users.cs.umn.edu/~echi/">Ed Chi</a>, who recently left the <a href="http://www.parc.com/">PARC</a> Augmented Social Cognition Group to work on Google+, presenting &#8220;<a href="http://www.cikm2011.org/industryevent#ec">Model-Driven Research in Social Computing</a>&#8221;.</p>
<p>Ed warned us at the beginning of the talk that his focus would be on work he&#8217;d done prior to joining Google. Nonetheless, he offered an interesting collection of public statistics about social activity associated with Google properties: 360M words per day being published on Blogger, 150 years of YouTube video being watched every day on Facebook, and 40M+ people using Google+. Regardless of how Google has fared in the competition for social networking mindshare, Google is clearly no stranger to online social behavior.</p>
<p>Ed then dove into recent research that he and colleagues have done on Twitter activity. Since all of the papers he discussed are available online, I will only touch on highlights. I encourage you to read the full papers:</p>
<ul>
<li><a href="http://www-users.cs.umn.edu/~echi/papers/2010-socialcom/2010-06-25-retweetability-cameraready-v3.pdf">Want to be Retweeted? Large Scale Analytics on Factors Impacting Retweet in Twitter Network</a></li>
<li><a href="http://www.parc.com/content/attachments/tweets-from-justin.pdf">Tweets from Justin Bieber&#8217;s Heart: the Dynamics of the &#8220;Location&#8221; Field in User Profiles</a></li>
<li><a href="http://www.aaai.org/ocs/index.php/ICWSM/ICWSM11/paper/view/2813/3225">Is Twitter a Good Place for Asking Questions? A Characterization Study</a></li>
<li><a href="http://www.aaai.org/ocs/index.php/ICWSM/ICWSM11/paper/view/2856/3250">Language Matters in Twitter: A Large Scale Study</a></li>
<li><a href="http://www-users.cs.umn.edu/~echi/papers/2010-UIST/eddi-uist2010.pdf">Eddi: Interactive Topic-based Browsing of Social Status Streams</a></li>
<li><a href="http://www-users.cs.umn.edu/~echi/papers/2010-CHI/Zerozero88-tweet-recommender-ASC-PARC.pdf">Short and Tweet: Experiments on Recommending Content from Information Streams</a></li>
<li><a href="http://www.grouplens.org/system/files/p217-chen.pdf">Speak Little and Well: Recommending Conversations in Online Social Streams</a></li>
</ul>
<p>Ed talked at some length about language-dependent behavior on Twitter. For example, tweets in French are more likely to contain URLs than those in English, while tweets in Japanese are less likely (perhaps because the language is more compact relative to Twitter&#8217;s 140-character limit?). Tweets in Korean are far more likely to be conversational (i.e., explicitly mentioning or replying to other users) than those in English. These differences remind us to be cautious in generalizing our understanding of online social behavior from the behavior of English-speaking users. Ed also talked about cross-language &#8220;brokers&#8221; who tweet in multiple languages: he sees these as indicating connection strength between languages, as well as giving us insight to improve cross-language communication.</p>
<p>Ed then talked about ways to reduce information overload in social streams. These included <a href="http://www-users.cs.umn.edu/~echi/papers/2010-UIST/eddi-uist2010.pdf">Eddi</a>, a tool for summarizing social streams, and <a href="https://twitter.com/#!/zerozero88">zerozero88</a>, a closed experiment to produce a personal newspaper from a tweet stream. In analyzing the results of the zerozero88 experiment, Ed and his colleagues found that the most successful recommendation strategy combined users&#8217; self-voting with social voting by their friends of friends. They also found that users wanted both relevance and serendipity &#8212; a challenge since the two criteria often compete with one another.</p>
<p>Ed concluded by offering the following design rule: since interaction costs determine the number of people who participate in social activity, get more people into the system by reducing interaction cost. He asserted that this is a key design principle for Google+.</p>
<p>My skepticism about Google&#8217;s social efforts is a matter of public record (cf. <a href="http://thenoisychannel.com/2011/04/14/social-utility-25/">Social Utility, +/- 25%</a>; <a href="http://thenoisychannel.com/2011/07/04/google%C2%B1/">Google±?</a>). But hiring Ed Chi was a real coup for Google, and I&#8217;m optimistic about what he&#8217;ll bring to the Google+ effort.</p>
<p>ps. My thanks to <a href="http://www.searchenginecaffe.com/">Jeff Dalton</a> for live-blogging his <a href="http://www.searchenginecaffe.com/2011/10/cikm-2011-industry-model-driven.html">notes</a>.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2011/11/25/cikm-2011-industry-event-ed-chi-on-model-driven-research-in-social-computing/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2011/11/25/cikm-2011-industry-event-ed-chi-on-model-driven-research-in-social-computing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>HCIR 2011: We Have Arrived!</title>
		<link>http://thenoisychannel.com/2011/10/21/hcir-2011-we-have-arrived/</link>
		<comments>http://thenoisychannel.com/2011/10/21/hcir-2011-we-have-arrived/#comments</comments>
		<pubDate>Fri, 21 Oct 2011 09:08:17 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3873</guid>
		<description><![CDATA[If you followed the #hcir2011 tweet stream, then you already know what I have to say: the Fifth Workshop on Human-Computer Interaction and Information Retrieval (HCIR 2011) was an extraordinary success. We had about 100 people attending, 14 paper presentations, 28 posters, and 4 challenge entries, all packed into one intense day at Google&#8217;s beautiful [...]]]></description>
			<content:encoded><![CDATA[<p>If you followed the <a href="http://twitter.com/#!/search/%23hcir2011">#hcir2011</a> tweet stream, then you already know what I have to say: the Fifth Workshop on Human-Computer Interaction and Information Retrieval (<a href="http://hcir.info/hcir-2011">HCIR 2011</a>) was an extraordinary success. We had about 100 people attending, 14 <a href="https://sites.google.com/site/hcirworkshop/hcir-2011/schedule/presentations">paper presentations</a>, 28 <a href="https://sites.google.com/site/hcirworkshop/hcir-2011/posters">posters</a>, and 4 <a href="https://sites.google.com/site/hcirworkshop/hcir-2011/challenge">challenge</a> entries, all packed into one intense day at Google&#8217;s beautiful Mountain View headquarters.</p>
<p>Wednesday evening before the workshop, we were treated to a welcome reception, the first of a few meals provided by Google&#8217;s excellent chefs. It was a great opportunity to reconnect with old friends and meet many first-time HCIR attendees.</p>
<p>Thursday started with a scrumptious breakfast that included chilaquiles, coconut fritters, and bacon. Last year&#8217;s <a href="https://sites.google.com/site/hcirworkshop/hcir-2010/keynote">keynote</a> and this year&#8217;s local host <a href="https://sites.google.com/site/dmrussell/">Dan Russell</a> pulled out all the stops &#8212; apparently <a href="http://www.yelp.com/biz/bigtable-cafe-mountain-view">BigTable</a> is the only Google cafe that serves bacon for breakfast! We then proceeded to a poster boaster session in which each poster presenter had a minute to pitch his or her poster. This session set the tone for the rest of the workshop: concentrated ideas and intense audience engagement.</p>
<p>Then came this year&#8217;s keynote, <a href="http://ils.unc.edu/~march/">Gary Marchionini</a>. It was a particular treat to have Gary as a keynote, since his lecture on &#8220;<a href="http://www.asis.org/Bulletin/Jun-06/marchionini.html">Toward Human-Computer Information Retrieval</a>&#8221; inspired me to conceive the HCIR workshop back in 2007. And Gary delivered the goods. He started with a review of the history of HCIR, including some lesser-known figures like <a href="http://www.linkedin.com/pub/donald-hawkins/10/a59/77">Don Hawkins</a> (who was in the audience), <a href="http://www.ideals.illinois.edu/handle/2142/14100">Pauline Cochrane</a>, <a href="http://stuff.mit.edu/people/rmarcus/home.html">Richard Marcus</a>, and <a href="http://www3.fis.utoronto.ca/faculty/meadow/">Charles Meadow</a>. He brought a few chuckles by citing <a href="http://comminfo.rutgers.edu/~belkin/belkin.html">Nick Belkin</a> (who was present) and <a href="http://research.microsoft.com/en-us/um/people/sdumais/">Sue Dumais</a> (who was not) as the father and mother of HCIR. Naturally he described some of his own work at the University of North Carolina, including the <a href="http://www.open-video.org/">Open Video</a>, <a href="http://ils.unc.edu/relationbrowser/">Relation Browser</a>, and <a href="http://ils.unc.edu/resultsspace/">ResultsSpace</a> projects. But the highlight of his talk was a graph he presented showing two paths to the same user end-state, one of the paths being a smooth progression and the other being a roller-coaster of ups and downs. The question of which one was better drew a wide variety of responses, my favorite being <a href="http://www.fxpal.com/?p=gene">Gene Golovchinsky</a> observing that learning is the friction of the information-seeking process.</p>
<p>We broke for coffee and then came back to the first session of paper presentations. <a href="http://www.athenikos.com/">Sofia Athenikos</a> presented a semantic search engine that outperformed IMDB in a user study. <a href="http://comminfo.rutgers.edu/directory/changl/index.html">Chang Liu</a> explored the effect of task difficulty and domain knowledge on dwell times, finding counterintuitive results (at least for me) regarding the correlation of expertise to dwell time. <a href="https://sites.google.com/site/jliujingjing/">Jingjing Liu</a> presented research on knowledge examination in multi-session tasks. Then came the lightning talks: <a href="http://www.mansci.uwaterloo.ca/~msmucker/">Mark Smucker</a> on how users examine and process ranked document lists; <a href="http://www.cs.umass.edu/~jykim/">Jin Kim</a> on simulating associative browsing; <a href="http://faculty.cua.edu/kules/">Bill Kules</a> on visualizing the stages of exploratory search; and <a href="http://comminfo.rutgers.edu/directory/mjcole/index.html">Michael Cole</a> on user domain knowledge and eye movement patterns during search. Way too much goodness to summarize here &#8212; I suggest you read the full papers on the <a href="https://sites.google.com/site/hcirworkshop/hcir-2011/schedule/presentations">workshop site</a>.</p>
<p>Then came lunch &#8212; again in BigTable, but this time with outdoor seating &#8212; and the poster session. As always, this is the most interactive part of the day: two hours of non-stop discussion that start over food and end with prying people away from discussions about posters. I was especially proud of LinkedIn&#8217;s contributions to the <a href="https://sites.google.com/site/hcirworkshop/hcir-2011/posters">poster session</a>, which covered <a href="https://docs.google.com/a/kent.edu/viewer?a=v&amp;pid=sites&amp;srcid=ZGVmYXVsdGRvbWFpbnxoY2lyd29ya3Nob3B8Z3g6NmIxNzEzZjE3ZTVhZTAyYw&amp;pli=1">faceted search log analysis</a>, <a href="https://docs.google.com/a/kent.edu/viewer?a=v&amp;pid=sites&amp;srcid=ZGVmYXVsdGRvbWFpbnxoY2lyd29ya3Nob3B8Z3g6MjEzNDNjZTk5NGYyYWQwOA">social navigation</a>, and <a href="https://docs.google.com/a/kent.edu/viewer?a=v&amp;pid=sites&amp;srcid=ZGVmYXVsdGRvbWFpbnxoY2lyd29ya3Nob3B8Z3g6MWVlMGNhZWY5NTA3MzQ2ZA">whether it is time to abandon abandonment</a>.</p>
<p>Then back to the second session of paper presentations. <a href="http://faculty.arts.ubc.ca/lfreund/">Luanne Freund</a> talked about document usefulness and genre, finding that genre, besides being hard for users to reliably identify, only matters for tasks that involve doing, deciding, or learning, but not for those that involve fact finding or problem solving. <a href="http://www.fxpal.com/?p=gene">Gene Golovchinsky</a> presented work on designing for collaboration in information seeking, previewing the system he used for his challenge entry. <a href="http://www.medelyan.com/">Alyona Medelyan</a> used the <a href="http://www.pingar.com/">Pingar</a> search engine to evaluate how search interface features affect performance on biosciences tasks. Then more lightning talks: <a href="http://www.ils.unc.edu/~rcapra/">Rob Capra</a> analyzing faceted search on mobile devices; <a href="http://www.linkedin.com/pub/keith-bagley/0/657/124">Keith Bagley</a> on conceptual mile markers for exploratory search; <a href="http://ils.unc.edu/~wildem/ASIST2008/Yuan-CV.pdf">Xiaojun Yuan</a> on how cognitive styles affect user performance; and <a href="http://mikezarro.com/">Mike Zarro</a> on using social tags and controlled vocabularies as search filters.</p>
<p>Last but not least came the <a href="https://sites.google.com/site/hcirworkshop/hcir-2011/challenge">HCIR Challenge</a>:</p>
<blockquote><p>The HCIR 2011 Challenge focuses on the case where recall is everything – namely, the problem of information availability. The information availability problem arises when the seeker faces uncertainty as to whether the information of interest is available at all. Instances of this problem include some of the highest-value information tasks, such as those facing national security and legal/patent professionals, who might spend hours or days searching to determine whether the desired information exists.</p>
<p>The corpus we will use for the HCIR 2011 Challenge is the CiteSeer digital library of scientific literature. The CiteSeer corpus contains over 750,000 documents and provides rich meta-data about documents, authors, and citations.</p></blockquote>
<p>There were four entries:</p>
<ul>
<li><a href="https://docs.google.com/a/kent.edu/viewer?a=v&amp;pid=sites&amp;srcid=ZGVmYXVsdGRvbWFpbnxoY2lyd29ya3Nob3B8Z3g6NzE1YmM2YzE4ODBhYzRjZA" target="_blank">FreeSearch – Literature Search in a Natural Way<br />
</a><em>Claudiu S. Firan, Wolfgang Nejdl, Mihai Georgescu (University of Hanover), and Xinyun Sun (DEKE Lab MOE, Renmin)<br />
</em></li>
<li><a href="https://docs.google.com/a/kent.edu/viewer?a=v&amp;pid=sites&amp;srcid=ZGVmYXVsdGRvbWFpbnxoY2lyd29ya3Nob3B8Z3g6MmZmM2Y5Yzg5OTM4NGI5NQ" target="_blank">Session-based search with Querium<br />
</a><em>Gene Golovchinsky (FX Palo Alto Lab) and Abdigani Diriye (University College London)<br />
</em></li>
<li><a href="https://docs.google.com/a/kent.edu/viewer?a=v&amp;pid=sites&amp;srcid=ZGVmYXVsdGRvbWFpbnxoY2lyd29ya3Nob3B8Z3g6NWI1NTc5NWNmNDlmZDUyZg" target="_blank">GisterPro<br />
</a><em>David L. Ostby and Edmond Brian (Visual Purple)<br />
</em></li>
<li><a href="https://docs.google.com/a/kent.edu/viewer?a=v&amp;pid=sites&amp;srcid=ZGVmYXVsdGRvbWFpbnxoY2lyd29ya3Nob3B8Z3g6MjIwOWNlOWY4YTQzMDRmZA" target="_blank">Query Analytics Workbench<br />
</a><em>Antony Scerri, Matthew Corkum, Keith Gutfreund, Ron Daniel Jr., Michael Taylor (Elsevier Labs)</em></li>
</ul>
<p>The competition was fierce. Claudiu showed off the <a href="http://dblp.l3s.de/">Faceted DBLP</a> interface, which is well suited to the information availability task on CiteSeer data. Ed showed how GisterPro uses visualization to support the information seeking process. But it came down to a close call between the Query Analytics Workbench and Querium. Despite the Elsevier team&#8217;s impressive functionality and animated presentation, Gene&#8217;s simpler interface and application of <a href="http://www.fxpal.com/publications/FXPAL-PR-08-467.pdf">ranked fusion</a> won the day. Congratulations to Gene and Abdigani, this year&#8217;s HCIR Challenge winners!</p>
<p>We wrapped up the evening at the <a href="http://tiedhouse.com/">Tied House</a>, a local microbrewery. And of course the discussion turned to where, when, and how we will hold next year&#8217;s workshop. Watch this space. In the meantime, my heartfelt thanks to everyone who made this year&#8217;s workshop such a success &#8212; and especially to our sponsors. Thank you Endeca, Kent State, Microsoft, and Google!</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2011/10/21/hcir-2011-we-have-arrived/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2011/10/21/hcir-2011-we-have-arrived/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Attention vs. Privacy</title>
		<link>http://thenoisychannel.com/2011/07/24/attention-vs-privacy/</link>
		<comments>http://thenoisychannel.com/2011/07/24/attention-vs-privacy/#comments</comments>
		<pubDate>Mon, 25 Jul 2011 06:22:22 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3740</guid>
		<description><![CDATA[A major feature of the recently released Google+ is Circles, which allows you to &#8220;share relevant content with the right people, and follow content posted by people you find interesting.&#8221; Most people seem to look at Circles as a privacy feature &#8212; and indeed Google&#8217;s official description gives the impression that Circles exist to manage [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignnone size-full wp-image-3741" title="Attention" src="http://thenoisychannel.com/wordpress/wp-content/uploads/2011/07/attention.jpg" alt="" width="256" height="256" /></p>
<p>A major feature of the recently released <a href="https://plus.google.com/">Google+</a> is <a href="http://www.google.com/support/+/bin/static.py?hl=en&amp;page=guide.cs&amp;guide=1257347&amp;rd=1">Circles</a>, which allows you to &#8220;share relevant content with the right people, and follow content posted by people you find interesting.&#8221;</p>
<p>Most people seem to look at Circles as a privacy feature &#8212; and indeed Google&#8217;s official description gives the impression that Circles exist to manage privacy based on real-life social contexts. Of course, re-sharing can result in unintended consequences, and Google even offers a <a href="http://www.google.com/support/+/bin/static.py?hl=en&amp;page=guide.cs&amp;guide=1358057&amp;answer=1297219&amp;rd=1">warning</a> that:</p>
<blockquote><p>Unless you disable reshares, anything you share (either publicly or with your circles) can be reshared beyond the original people you shared the content with. This could happen either through reshares or through mentions in comments.</p></blockquote>
<p>Privacy is a big deal, <a href="http://ftc.gov/opa/2011/03/google.shtm">especially for Google</a> &#8212; and particularly in the context of rolling out a new social network. Still, I&#8217;m not persuaded that privacy is the only or even the primary concern motivating the concept of <a href="http://thenoisychannel.com/2010/07/08/paul-adamss-presentation-on-social-networking/">social circles</a>.</p>
<p>Sharing content with someone is not just about giving that person permission to see it. Sharing content with someone asserts a claim on that person&#8217;s <a href="http://thenoisychannel.com/2008/12/17/the-macroeconomics-of-information-and-attention-how-people-make-decisions/">attention</a>. While it may be a privilege for me to have access to your content, it may be even more of a privilege for you that I allocate my scarce attention to consume it.</p>
<p>What if we focus on routing content to the people who would find it most interesting? Such an approach works best if all of the shared content is <a href="http://thenoisychannel.com/2008/11/27/when-in-doubt-make-it-public/">public</a> with respect to permissions &#8212; that is, people post it without any expectation of privacy. Twitter demonstrates that many people are comfortable with such a sharing model. Imagine if they could learn to trust a system that optimizes (or at least attempts to optimize) the allocation of everyone&#8217;s attention. This is not an easy problem by any means, nor is it one that is likely to be solved by algorithms alone. It will take a strong dose of <a href="http://en.wikipedia.org/wiki/Human%E2%80%93computer_information_retrieval">HCIR</a> to get it right. But, at least in my view, optimizing the allocation of human attention is the grand challenge that everyone working with information retrieval or social networks should be striving to address.</p>
<p>Privacy is important, and social networks should offer simple, robust privacy controls that users understand. We all have experienced the problem of <a href="http://thenoisychannel.com/2008/09/23/quick-bites-filter-failure/">filter failure</a>. But sharing isn&#8217;t just about privacy. Our attention is our most precious cognitive asset, both as individuals and as a society. Moreover, our attention faces ever-increasing demands as our social lives evolve in an online world relatively free of physical constraints. Social network developers would do well to pay attention&#8230;to attention.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2011/07/24/attention-vs-privacy/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2011/07/24/attention-vs-privacy/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Google±?</title>
		<link>http://thenoisychannel.com/2011/07/04/google%c2%b1/</link>
		<comments>http://thenoisychannel.com/2011/07/04/google%c2%b1/#comments</comments>
		<pubDate>Mon, 04 Jul 2011 22:55:09 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3703</guid>
		<description><![CDATA[When I left Google last December, it was an open secret that Google was developing a social networking product. Now that Google has released Google+, I am at liberty to share my personal impressions. Let&#8217;s start with the clear wins. Impressive launch. Google has certainly learned its lesson from the past launches of Wave and Buzz. [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://plus.google.com/"><img class="alignnone size-full wp-image-3707" title="Google+" src="http://thenoisychannel.com/wordpress/wp-content/uploads/2011/07/Google+.png" alt="" width="500" height="477" /></a></p>
<p>When I <a href="http://thenoisychannel.com/2010/12/03/follow-the-data/">left Google</a> last December, it was an <a href="http://techcrunch.com/2010/12/01/google-social-emerald-sea/">open secret</a> that Google was developing a social networking product. Now that Google has released <a href="http://plus.google.com/">Google+</a>, I am at liberty to share my personal impressions.</p>
<p>Let&#8217;s start with the clear wins.</p>
<ul>
<li><strong>Impressive launch.</strong> Google has certainly learned its lesson from the past launches of <a href="http://mashable.com/2010/08/04/rip-google-wave/">Wave</a> and <a href="http://www.quora.com/Why-did-Google-Buzz-fail">Buzz</a>. Google+ is unambiguously opt-in &#8212; no one is going to complain about being <a href="http://techcrunch.com/2011/03/30/reid-hoffman-data-ambush/">ambushed</a>. People have been begging for invites. But Google is wisely releasing invites quickly enough to build critical mass. I&#8217;d say that Google has at least picked up the <a href="http://www.quora.com/">Quora</a> crowd of early adopters in Silicon Valley.</li>
</ul>
<ul>
<li><strong>Clean design.</strong> Design lead <a href="http://techcrunch.com/2011/06/28/google-plus-design-andy-hertzfeld/">Andy Hertzfeld</a> (of Macintosh fame) has nailed it, leading bloggers to comment that this looks too well designed to be a Google product. Comparing Google+ to Facebook now, I&#8217;m reminded at least a little of comparisons between Facebook and Myspace. Great move for Google here.</li>
</ul>
<p>Now let&#8217;s talk about Google&#8217;s three big features here: Circles, Sparks, and Hangouts.</p>
<ul>
<li><strong>Circles.</strong> Straight out of Paul Adams&#8217;s <a href="http://thenoisychannel.com/2010/07/08/paul-adamss-presentation-on-social-networking/">presentation of social networking</a> (which he created before he <a href="http://techcrunch.com/2011/07/01/paul-adams-seeing-google-in-public-is-like-bumping-into-an-ex-girlfriend/">left Google for Facebook</a>), the idea is simple: a person doesn&#8217;t have a single group of friends, but rather several groups that tend to be mostly disjoint. Through Circles, Google+ makes this soft partitioning of the social space a core design principle. You add people to one or more circles, follow the stream of activity from a circle, and share with circles. It&#8217;s great in theory. But in practice it creates friction, especially for people trained on Facebook. There&#8217;s a trade-off between simplicity and expressive power, and Google is placing a strong bet on how users will make this trade-off. I&#8217;m inclined to agree with <a href="http://www.quora.com/Yishan-Wong/How-Google+-Shows-That-Google-Still-Doesnt-Understand-Social">Yishan Wong</a> that &#8220;the sorting of friends into buckets (friend lists) is something that only nerds do&#8221;. Given Google&#8217;s deep expertise in machine learning, I&#8217;m expecting Google to reduce this friction by giving users intelligent suggestions. <em>Full disclosure: my colleagues at LinkedIn built <a href="http://blog.linkedin.com/2011/01/24/linkedin-inmaps/">InMaps</a>, which infers communities from your social network.</em></li>
</ul>
<ul>
<li><strong>Sparks.</strong> The tagline for Sparks is &#8220;For nerding out. Together.&#8221; It feels like a positioning designed by Googlers for Googlers&#8211; you can see promotional videos <a href="http://www.youtube.com/watch?v=MRkAdTflltcgoo">here</a> and <a href="http://www.youtube.com/watch?v=0DoAl4JXhQo">here</a>. I haven&#8217;t seen much talk about Sparks, and what little commentary I&#8217;ve seen is less than gushing. I&#8217;ve experimented with it a bit from a consumption side, and I confess I&#8217;m underwhelmed. Perhaps it&#8217;s a chicken-and-egg problem &#8212; Sparks will only be useful if users populate their profiles with interests, but right now users have no incentive to do so. If Sparks is Google&#8217;s attempt to make <a href="http://en.wikipedia.org/wiki/Google_Reader">Reader</a> more social, there&#8217;s still a ways to go. <em>Full disclosure: LinkedIn has its own approach to social news, <a href="http://blog.linkedin.com/2011/03/10/linkedin-today/">LinkedIn Today</a>, which seems to be <a href="http://techcrunch.com/2011/06/30/linkedin-traffic-twitter/">doing something right</a>. <img src='http://thenoisychannel.com/wordpress/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  </em></li>
</ul>
<ul>
<li><strong>Hangouts.</strong> In plain English, Hangouts are group video chat embedded in a social network. Which sounds a lot like what Facebook is <a href="http://techcrunch.com/2011/07/01/facebook-will-launch-in-browser-video-chat-next-week-in-partnership-with-skype/">rumored</a> to be releasing this week through a partnership with Skype. Which in turn was just <a href="http://www.microsoft.com/presspass/press/2011/may11/05-10corpnewspr.mspx">acquired by Microsoft</a>. Will Apple join the party too by implementing group chat in <a href="http://www.apple.com/mac/facetime/">FaceTime</a>? Competitive dynamics aside, this is a very cool feature that hopefully won&#8217;t devolve into <a href="http://en.wikipedia.org/wiki/Chatroulette">Chatroulette</a>. Nothing to, um, disclose here.</li>
</ul>
<p>But the $64B question is whether all this will matter. Can Google+ sustainably co-exist with Facebook? Will people use both services &#8212; and, if so, how will they allocate their attention between them? Or is the success of Google+ predicated on displacing Facebook? Or Twitter? Either of those would certainly qualify as a <a href="http://en.wikipedia.org/wiki/Big_Hairy_Audacious_Goal">Big Hairy Audacious Goal</a>.</p>
<p>Like <a href="http://www.avc.com/a_vc/2011/07/why-im-rooting-for-google.html">Fred Wilson</a>, I&#8217;m rooting for Google+ to succeed &#8212; but even Fred <a href="http://www.avc.com/a_vc/2011/07/why-im-rooting-for-google.html#comment-240598057">notes</a> that he would not be able to get his family on Google+, as they are already happy with Facebook. It&#8217;s not clear to me what I can get *today* from Google+ that I can&#8217;t get from Facebook.</p>
<p>Granted, I&#8217;m not a heavy Facebook user, so I&#8217;m not the best person to ask this question. So readers, I ask you: why will or won&#8217;t you use Google+?</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2011/07/04/google%c2%b1/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2011/07/04/google%c2%b1/feed/</wfw:commentRss>
		<slash:comments>30</slash:comments>
		</item>
		<item>
		<title>Foo for Thought</title>
		<link>http://thenoisychannel.com/2011/06/18/foo-for-thought/</link>
		<comments>http://thenoisychannel.com/2011/06/18/foo-for-thought/#comments</comments>
		<pubDate>Sat, 18 Jun 2011 20:29:29 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3678</guid>
		<description><![CDATA[Last weekend I had the extraordinary privilege to attend Foo Camp, an annual gathering of about 250 Friends Of O&#8217;Reilly (aka Foo). Tim O&#8217;Reilly, Sara Winge, and their colleagues have amazing friends, as you can see if you scan this unofficial list of attendees working on big data, open government, computer security, and more generally [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignnone" title="Foo Camp (photo by Jeremy Zawodny)" src="http://thenoisychannel.com/wordpress/wp-content/uploads/2011/06/foo-camp.jpg" alt="" width="500" height="366" /></p>
<p>Last weekend I had the extraordinary privilege to attend Foo Camp, an annual gathering of about 250 Friends Of O&#8217;Reilly (aka Foo). <a href="http://radar.oreilly.com/tim/">Tim O&#8217;Reilly</a>, <a href="http://radar.oreilly.com/sara/">Sara Winge</a>, and their colleagues have amazing friends, as you can see if you scan this <a href="http://twitter.com/#!/mrflip/foocamp/members">unofficial list of attendees</a> working on big data, open government, computer security, and more generally on the cutting edge of technology and culture (especially where the two overlap).</p>
<p>Foo Camp is an <a href="http://en.wikipedia.org/wiki/Unconference">unconference</a>, which merits some elaboration. No fees, no conference hotel (many attendees literally set up camp in the space O&#8217;Reilly provided), and no advance program aside from some preselected 5-minute <a href="http://ignite.oreilly.com/">Ignite</a> presentations. Attendees proposed and organized sessions, merging and re-arranging them to optimize for participation. It was a bit chaotic (especially the mad rush after dinner to secure session slots), but very effective.</p>
<p>The minimalist format brought out the best in participants.</p>
<p>For example, I am passionate about (i.e., against) software patents, so I organized a session about them. I did a double-take when I realized that one of the participants was <a href="http://people.ischool.berkeley.edu/~pam/">Pamela Samuelson</a>, perhaps the world&#8217;s top expert on intellectual property law. I braced myself to be schooled &#8212; as I was. But she did it gently and constructively. Specifically, she pointed me to work that her colleagues <a href="http://www.law.berkeley.edu/4457.htm">Jason Schultz</a> and <a href="http://www.law.berkeley.edu/9959.htm">Jennifer Urban</a> were doing on a defensive patent strategy for open-source software (including a <a href="http://events.stanford.edu/events/276/27687/">proposed license</a>), as well as reminding me of the <a href="http://radar.oreilly.com/2010/07/why-software-startups-decide-t.html">Berkeley Patent Survey</a> supporting the argument that software entrepreneurs only file for patents because of real or perceived pressure from their investors. I also heard war stories from lawyers who have done pro bono work against patent trolls, reinforcing my own resolve and also reassuring me that the examples I&#8217;ve seen <a href="http://thenoisychannel.com/2009/10/03/software-patents-a-personal-story/">at close range</a> are not isolated.</p>
<p>Another session asked whether we are too data driven in our work. What was notable is that this session included participants from some of the largest internet companies debating some of the most fundamental ways in which we work, e.g., do we actually learn from data or do we engage in assault by data to defend preconceived positions (cf. <a href="http://thenoisychannel.com/2011/05/30/id-like-to-have-an-argument-please/">argumentative theory</a>). Like all of the conference, the discussion was under &#8220;frieNDA&#8221;, so I&#8217;m being intentionally vague on the specifics. But it was refreshing to see candid admission that all of us know and have experienced the dangers of manipulating an audience with data, and that there are no algorithms to enforce common sense and good faith.</p>
<p>I won&#8217;t even try to enumerate the sessions and side conversations that excited me &#8212; topics included privacy, the future of publishing, a critical analysis of geek culture, and irrational user behavior. I missed the session on data-driven parenting, though others have pointed out to me that you can only learn so much if you don&#8217;t have twins and perform <a href="http://en.wikipedia.org/wiki/A/B_testing">A/B tests</a>. The best summary is intellectual diversity and overstimulation. If you&#8217;d like to get a general sense of the discussion, check out the <a href="http://twitter.com/#!/search/%23foocamp">#foocamp</a> tweet stream. I also recommend Scott Berkun&#8217;s post on &#8220;<a href="http://www.scottberkun.com/blog/2011/what-i-learned-at-foo-camp-11/">What I learned at FOO Camp</a>&#8221;.</p>
<p>As someone who organizes the <a href="http://hcir.info/hcir-2011/">occasional</a> <a href="http://www.cikm2011.org/industryevent">event</a>, I&#8217;m intrigued by the unconference approach &#8212; especially now that I&#8217;ve experienced it first-hand. Moreover, I feel strongly that <a href="http://thenoisychannel.com/2009/08/02/are-academic-conferences-broken-can-we-fix-them/">the academic conference model needs an upgrade</a>. But I also know that open-ended, free-form discussion sessions are not a viable alternative &#8212; indeed, a big part of Foo Camp&#8217;s success was how it inspired participants to organize sessions &#8212; and to vote with their feet to attend the worthwhile ones. And of course part of that success came from inviting active, engaged participants rather than passive spectators.</p>
<p>Many of you also organize events, and I&#8217;m sure that all of you attend them. I&#8217;m curious to hear your thoughts about how to make them better, and happy to share more of what I learned at Foo Camp. After all, Foo is for (inspiring) thought.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2011/06/18/foo-for-thought/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2011/06/18/foo-for-thought/feed/</wfw:commentRss>
		<slash:comments>23</slash:comments>
		</item>
		<item>
		<title>Winning the War for Software Engineering Talent</title>
		<link>http://thenoisychannel.com/2011/06/05/winning-the-war-for-software-engineering-talent/</link>
		<comments>http://thenoisychannel.com/2011/06/05/winning-the-war-for-software-engineering-talent/#comments</comments>
		<pubDate>Mon, 06 Jun 2011 01:07:20 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3647</guid>
		<description><![CDATA[The war for talent. It&#8217;s the latest metaphor for the challenge that tech companies face as excitement is building in Silicon Valley again. Well, not really &#8212; McKinsey coined the phrase in 1997 and used it as the title of a book published four years later. But anyone who has been trying to hire great [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignnone" style="margin-left: 10px; margin-right: 10px;" src="http://siliconvalley.sla.org/wp-content/uploads/2010/11/i_want_you_poster.jpg" alt="" width="206" height="230" /><img class="size-full wp-image-3648 aligncenter" title="Real Genius" src="http://thenoisychannel.com/wordpress/wp-content/uploads/2011/06/realgenius.jpg" alt="" width="182" height="230" /></p>
<p>The war for talent. It&#8217;s the latest metaphor for the challenge that tech companies face as excitement is building in Silicon Valley again. Well, not really &#8212; McKinsey coined the phrase in 1997 and used it as the title of a <a href="http://www.amazon.com/War-Talent-Ed-Michaels/dp/1578514592">book</a> published four years later.</p>
<p>But anyone who has been trying to hire great software engineers in recent months knows how hard it is to do so. Particularly for folks like me who are trying to hire <a href="http://www.linkedin.com/jobs/jobs-Data-Scientist-1544636">data scientists</a> &#8212; apparently there&#8217;s a <a href="http://www.mckinsey.com/mgi/publications/big_data/index.asp">national shortage</a>. This is nothing new &#8212; as Joel Spolsky noted in a 2006 <a href="http://www.joelonsoftware.com/articles/FindingGreatDevelopers.html">post</a>, &#8220;the great software developers, indeed, the best people in every field, are quite simply never on the market.&#8221;</p>
<p>I&#8217;m not an expert (or <a href="http://blog.linkedin.com/2010/04/08/linkedin-ninja-job-title/">ninja</a>) on the subject of recruiting or employer branding in general, but I&#8217;ve seen enough of how companies go about hiring software engineers to know that we can do better. I&#8217;d like to share some of my thoughts and experiences, and I hope that you will reciprocate and share your thoughts in the comments. I&#8217;m especially interested in hearing from folks who are at universities (aka hunting grounds) or who are involved in organizing academic conferences.</p>
<p>First, let&#8217;s talk about how we measure success. As <a href="http://en.wikipedia.org/wiki/William_Thomson,_1st_Baron_Kelvin">Lord Kelvin</a> famously said, &#8220;If you can&#8217;t measure it, you can&#8217;t improve it.&#8221; I&#8217;m not going to talk about how to handle active candidates &#8212; that&#8217;s a filtering problem which, in my opinion, is much more tractable. For example, see what Joel has to say about <a href="http://www.joelonsoftware.com/articles/GuerrillaInterviewing3.html">interviewing developers</a>. Rather, I&#8217;m concerned with the challenge of discovering qualified passive candidates and converting them into active ones. Hence, I propose we make our metric the number of qualified applicants.</p>
<p>The baseline strategy is sourcing, i.e. have sourcers or hiring managers scour the world for qualified candidates (there&#8217;s an <a href="http://www.linkedin.com/hiring">app</a> for that), entice them with your best recruiting pitch, and then go hog wild on the folks who respond. The success of this strategy depends mainly on the rate at which you, your sourcers, or your hiring managers find qualified candidates &#8212; which in turn may split into the two subtasks of finding candidates and filtering them &#8212; and the conversion rate for the qualified candidates you find. Since the best candidates are often happy in their current positions, sourcing passive candidates requires a lot of work and a thick skin for rejection.</p>
<p>What are other ways to attract qualified passive candidates? Here are a few, with examples from my experience at LinkedIn:</p>
<ul>
<li><strong>Hosting events.</strong> Last week at LinkedIn, we hosted CMU professor <a href="http://www.cs.cmu.edu/~christos/">Christos Faloutsos</a>, who delivered a fantastic talk on &#8220;<a href="http://events.linkedin.com/Mining-Billion-Node-Graphs-LinkedIn-Tech/pub/660176">Mining Billion Node Graphs</a>&#8221; &#8212; a topic we thought interesting enough to justify opening up the talk to the general public. We had a few hundred guests, many of whom are precisely the kinds of folks we are trying to hire. Even more people watched the live stream online or will watch the video when we post it to YouTube (coming soon &#8212; stay tuned!). While this was not a recruiting event (we did not even announce that we are hiring), it was a great opportunity to associate LinkedIn with the hard computer science problems we solve on a daily basis.</li>
<li><strong>Sponsoring events.</strong> Sponsorship is tricky &#8212; if you&#8217;re not careful, you spend a lot of money for a glorified display ad. Sometimes sponsorship offers speaking slots as part of the package, but audiences are rightfully skeptical of speakers who have paid for their slots &#8212; especially at conferences that charge hefty fees for attendance. But sometimes sponsorship works. For example, LinkedIn was a sponsor of the <a href="http://strataconf.com/strata2011">O&#8217;Reilly Strata Conference</a>, and the perks of sponsorship complemented our earned speaker slots, helping us bring enormous visibility to our data scientist team and its recent innovations like <a href="http://blog.linkedin.com/2011/01/24/linkedin-inmaps/">InMaps</a> (we had a booth there to print attendees&#8217; InMaps) and <a href="http://thenoisychannel.com/2011/02/04/got-skills/">Skills</a> (which launched during the conference). While Strata generated few direct leads, it left a lasting impression in the <a href="http://en.wikipedia.org/wiki/Big_data">big data</a> community, and I regularly hear candidates refer to it.</li>
<li><strong>Participating in events.</strong> As the Beatles tell us, money <a href="http://en.wikipedia.org/wiki/Can't_Buy_Me_Love">can&#8217;t buy you love</a>. If you want to make a (positive) impression at a conference, you have to contribute people and ideas. This is especially true at academic conferences, where attendees quickly throw out the extra weight in their tote bags and focus on the conference&#8217;s content and professional networking opportunities. It&#8217;s great if you are Microsoft with a team of close to a thousand researchers and can <a href="http://research.microsoft.com/en-us/news/features/sigir2010-071910.aspx">dominate</a> a conference like <a href="http://sigir.org/">SIGIR</a>. But smaller companies can still make a strong impression on researchers &#8212; and especially on students who may be looking for internships or full-time positions &#8212; by taking an active role at conferences. The traditional approach is to submit papers to the main conference track &#8212; but other avenues include <a href="http://www.kdd.org/kdd2011/tutorials.shtml">tutorials</a>, <a href="http://hcir.info/hcir-2011/">workshops</a>, and <a href="http://www.cikm2011.org/industryevent">industry events</a>. Such participation is often invited, but such invitations are in turn earned by cultivating relationships with researchers &#8212; especially the ones who find themselves on organizing committees.</li>
<li><strong>Contributing to open-source projects.</strong> The Search, Network, and Analytics (SNA) team at LinkedIn contributes frequently to open-source projects and publicizes some of its work at <a href="http://sna-projects.com/">http://sna-projects.com/</a>. Open-source projects are a great way to earn the respect of engineers who value source over PowerPoint, especially when your employees include <a href="http://www.linkedin.com/in/allenwittenauer">committers</a> to key technologies like Hadoop. Moreover, open-source projects are social communities, so contributing to them offers opportunities for employees to interact with potential hires.</li>
<li><strong>Social media.</strong> By now, I&#8217;d like to think that marketers understand social media to simply be another set of marketing channels. But I think the territory is still pretty new for employers. Here is a simple suggestion: encourage (but do not try to force) employees to express themselves professionally online. Enforce the standard non-disclosure rules, of course, but don&#8217;t try to manage their voices. Authenticity speaks for itself &#8212; for example, look at what <a href="http://www.linkedin.com/in/adamnash">Adam Nash</a> says about LinkedIn on his <a href="http://blog.adamnash.com/?s=linkedin">personal blog</a>. Or my own posts <a href="http://thenoisychannel.com/?s=linkedin">here</a>. Engineers don&#8217;t read press releases or corporate blogs, but they do pay attention to their peers. And there&#8217;s nothing unique about blogs &#8212; the same principle applies to platforms like Twitter, Facebook, Quora, and of course LinkedIn. Not all employees enjoy being online extroverts, but those who do act as brand ambassadors and are also likely to eventually strike up conversations with passive candidates about employment opportunities.</li>
</ul>
<p>Finally, don&#8217;t forget to measure the results of these efforts! Some activities generate leads directly, in which case you can make an apples-to-apples comparison of their results and costs with the baseline strategy of sourcing. It&#8217;s harder to measure the longer-term effect of efforts to raise visibility, but you can at least ask candidates if they are aware of those efforts &#8212; after all, efforts to raise visibility should be visible to candidates! You can also ask candidates if those efforts were a factor in their decision to apply. These measures aren&#8217;t perfect, but they are a lot better than nothing, especially when you&#8217;re trying to decide how best to invest limited resources.</p>
<p>Of course, even an optimal strategy can&#8217;t substitute for offering a combination of interesting work, competitive compensation, and a work hard / play hard <a href="http://www.youtube.com/watch?v=PUwEEOhcK3s">culture</a>. As with all marketing efforts, you need to start with a great product. But great products don&#8217;t sell themselves: you need to invest in a combination of outbound and inbound marketing to have a fighting chance in the war for talent. Good luck! And, in case you didn&#8217;t notice, <a href="http://www.linkedin.com/jobs/jobs-Data-Scientist-1544636">we&#8217;re hiring</a>!</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2011/06/05/winning-the-war-for-software-engineering-talent/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2011/06/05/winning-the-war-for-software-engineering-talent/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Identifying Influencers on Twitter</title>
		<link>http://thenoisychannel.com/2011/04/16/identifying-influencers-on-twitter/</link>
		<comments>http://thenoisychannel.com/2011/04/16/identifying-influencers-on-twitter/#comments</comments>
		<pubDate>Sun, 17 Apr 2011 02:52:43 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3567</guid>
		<description><![CDATA[One of the perks of working at LinkedIn is being surrounded by intellectually curious colleagues. I recently joined a reading group and signed up to lead our discussion of a WSDM 2011 paper on &#8220;Identifying &#8216;Influencers&#8217; on Twitter&#8221; by Eytan Bakshy, Jake Hofman, Winter Mason, and Duncan Watts. It&#8217;s great to see the folks at [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://darmano.typepad.com/logic_emotion/2006/08/levels_of_influ.html"><img class="alignnone size-full wp-image-3569" title="Levels of Influence (David Armano)" src="http://thenoisychannel.com/wordpress/wp-content/uploads/2011/04/levels-of-influence.gif" alt="" width="418" height="418" /></a><br />
One of the perks of <a href="http://www.linkedin.com/jobs?viewJob=&amp;jobId=1544636">working at LinkedIn</a> is being surrounded by intellectually curious colleagues. I recently joined a reading group and signed up to lead our discussion of a <a href="http://www.wsdm2011.org/">WSDM 2011</a> paper on &#8220;<a href="http://research.yahoo.com/files/bakshy_wsdm.pdf">Identifying &#8216;Influencers&#8217; on Twitter</a>&#8221; by <a href="http://www-personal.umich.edu/~ebakshy">Eytan Bakshy</a>, <a href="http://research.yahoo.com/Jake_Hofman">Jake Hofman</a>, <a href="http://research.yahoo.com/Winter_Mason">Winter Mason</a>, and <a href="http://research.yahoo.com/Duncan_Watts">Duncan Watts</a>. It&#8217;s great to see the folks at Yahoo! Research doing cutting-edge work in this space.</p>
<p>I thought I&#8217;d prepare for the discussion by sharing my thoughts here. Perhaps some of you will even be kind enough to add your own ideas, which I promise to share with the reading group.</p>
<p>I encourage you to read the paper, but here&#8217;s a summary of its results:</p>
<ul>
<li>A user&#8217;s influence on Twitter is the extent to which that user can cause diffusion of a posted URL, as measured by reposts propagated through follower edges in Twitter&#8217;s directed social graph.</li>
<li>The best predictors of future total influence are follower count and past local influence, where local influence refers to the average number of reposts by that user’s immediate followers, and total influence refers to average total cascade size.</li>
<li>The content features of individual posts do not have identifiable predictive value.</li>
<li>Barring a high per-influencer acquisition cost, the most cost-effective strategy for buying influence is to target users of average influence.</li>
</ul>
<p>Let&#8217;s dive in a bit deeper.</p>
<p>The definitions of influence and influencers are, by the authors&#8217; own admission, narrow and arbitrary. There are many ways one could define influence, even within the context of Twitter use. But I agree with the authors that these definitions have enough <a href="http://en.wiktionary.org/wiki/verisimilitude">verisimilitude</a> to be useful, and their simplicity facilitates quantitative analysis.</p>
<p>It&#8217;s hardly surprising that past influence is a strong predictor of future influence. But it might seem counterintuitive that, for predicting future total influence, past local influence is more informative than past total influence. The authors suggest the explanation that most non-trivial cascades are of depth 1 &#8212; i.e., total influence is mostly local influence. But at most that would make the two features equally informative, and total influence should still be a mildly better predictor.</p>
<p>I suspect that another factor is in play &#8212; namely, that the difference between local influence and total influence reflects the unpredictable and rare virality of the content (e.g., <a href="http://networkeffect.allthingsd.com/20110415/random-facebook-users-question-gets-four-million-votes/">a random Facebook Question generated 4M votes</a>). If this hypothesis is correct, then past local influence factors out this unpredictable factor and is thus a better predictor of both future local influence and future total influence.</p>
<p>I&#8217;m a bit surprised that follower count supplies additional informative value beyond the past local influence; after all, local influence should already reflect the extent to which the followers are being influenced. It&#8217;s possible that past influence lags the follower count, since it does not sufficiently weigh the potential contributions of more recent followers. But another possibility is one analogous to the predictive value of past local vs. total influence: past local influence may include an unpredictable content factor which follower count factors out.</p>
<p>Of course, I can&#8217;t help suggesting that <a href="http://thenoisychannel.com/2009/01/13/a-twitter-analog-to-pagerank/">TunkRank</a> might be a more useful indicator than follower count. Unfortunately the authors don&#8217;t seem to be aware of the TunkRank work &#8212; or perhaps they preferred to restrict their attention to basic features.</p>
<p>I&#8217;m not surprised by the inability to exploit content features to predict influence. If it were easy to generate viral content, <a href="http://en.wikipedia.org/wiki/Get-rich-quick_scheme">everyone would do it</a>. Granted, a deeper analysis might squeeze out a few features (like those suggested in the <a href="http://www.buddymedia.com/newsroom/?p=9335">Buddy Media report</a>), but I don&#8217;t think there are any silver bullets here.</p>
<p>Finally, the authors consider the question of designing a cost-effective strategy to buy influence. The authors assume that the cost of buying influence can be modeled in terms of two parameters: a per-influencer acquisition cost (which is the same for each influencer) and a per-follower cost for each influencer. They conclude that, until the acquisition cost is extremely high (i.e., over 10,000 times the per-follower cost), the most cost-efficient influencers are those of average influence. In other words, there&#8217;s no reason to target the <a href="http://www.amazon.com/Influentials-American-Tells-Other-Where/dp/0743227298">small number of highly influential users</a>.</p>
<p>The authors may be arriving at the right conclusion (Watts&#8217;s <a href="http://research.yahoo.com/files/w_d_JCR.pdf">earlier work</a> with <a href="http://www.uvm.edu/~pdodds/">Peter Dodds</a>, which the paper cites, questions the &#8220;influentials&#8221; hypothesis), but I&#8217;m not convinced by their economic model of an influence market. It may be the case that professional influencers are trying to peddle their followers&#8217; attention on a per-follower basis &#8212; there are <a href="http://www.buytwitterfollowers.org/">sites</a> <a href="http://twitter1k.com/">that</a> <a href="http://www.socialkik.com/twitter_promo.html">offer</a> <a href="http://www.twitterfollowersshop.com/">this</a> <a href="http://usocial.net/twitter_marketing/">model</a>.</p>
<p>But why should anyone believe that an influencer&#8217;s value is proportional to his or her number of followers? The authors&#8217; own work suggests that past local influence is a more valuable predictor than follower count, and again they might want to look at TunkRank.</p>
<p>Regardless, I&#8217;m not surprised that a fixed per-follower cost makes users with high follower counts less cost-effective, as I subscribe to its corollary: as a user&#8217;s follower count goes up, the per-follower value diminishes. I haven&#8217;t done the analysis, but I believe that the ratio of a user&#8217;s TunkRank to the user&#8217;s follower count tends to go down as a user&#8217;s follower count goes up. A more interesting research (and practical) question would be to establish a correctly calibrated model of influencer value and then explore portfolio strategies.</p>
<p>In any case, it&#8217;s an interesting paper, and I look forward to discussing it with my colleagues next week. Of course, I&#8217;m happy to discuss it here in the meantime. If you&#8217;re in my reading group, feel free to chime in. And if you&#8217;re not in my reading group, consider joining. We do have <a href="http://www.linkedin.com/jobs?viewJob=&amp;jobId=1544636">openings</a>. <img src='http://thenoisychannel.com/wordpress/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2011/04/16/identifying-influencers-on-twitter/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2011/04/16/identifying-influencers-on-twitter/feed/</wfw:commentRss>
		<slash:comments>25</slash:comments>
		</item>
		<item>
		<title>Social Utility, +/- 25%</title>
		<link>http://thenoisychannel.com/2011/04/14/social-utility-25/</link>
		<comments>http://thenoisychannel.com/2011/04/14/social-utility-25/#comments</comments>
		<pubDate>Fri, 15 Apr 2011 04:19:28 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3550</guid>
		<description><![CDATA[I like Google&#8230; I&#8217;ve been a regular Google user since the day I first discovered its existence in 1999. Indeed, I&#8217;ve consistently found Google to be the most useful service on the web. That&#8217;s not love, but it&#8217;s a very strong +1. Moreover, I&#8217;d say that my preference for Google is an informed one. I&#8217;ve [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.businessinsider.com/heres-the-memo-telling-all-google-employees-their-2011-pay-depends-on-google-sucking-less-at-social-2011-4"><img class="alignnone" title="FAQ for Google employees about the &quot;social&quot; bonus (via Business Insider)" src="http://static3.businessinsider.com/image/4d9e557eccd1d599390f0000-915-581/multiplier-faq.jpg" alt="" width="514" height="326" /></a></p>
<h3>I like Google&#8230;</h3>
<p>I&#8217;ve been a regular Google user since the day I first discovered its existence in 1999. Indeed, I&#8217;ve consistently found Google to be the most useful service on the web. That&#8217;s not love, but it&#8217;s a very strong <a href="http://www.google.com/+1/button/">+1</a>.</p>
<p>Moreover, I&#8217;d say that my preference for Google is an informed one. I&#8217;ve given all of the major search engines a <a href="http://thenoisychannel.com/2009/06/01/banging-on-bing-a-bummer/">fair chance</a>, and even tried a fair number of <a href="http://thenoisychannel.com/2008/10/16/duck-duck-go/">obscure</a> <a href="http://thenoisychannel.com/2009/03/15/kosmix-im-impressed/">ones</a>. They all have their strengths, but none have delivered enough utility to me to justify the cognitive load of using more than one search engine for the open web.</p>
<h3>&#8230;but I don&#8217;t need Google.</h3>
<p>Nonetheless, I know that, if Google disappeared tomorrow or became <a href="http://www.mobilecrunch.com/2010/09/09/verizon-to-bing-i-choose-you/">inconvenient to access</a>, I&#8217;d be content with one of its competitors. I have no particular investment in Google beyond brand loyalty.</p>
<p>Actually, that&#8217;s not entirely true. I could easily walk away from Google search, but I&#8217;d be apoplectic if I suddenly lost access to my Gmail account &#8212; much as if I lost access to my LinkedIn or Twitter accounts. Indeed, Gmail is the only way in which Google has me locked in, but I don&#8217;t see my Gmail account as entangled with my access to Google&#8217;s other services.</p>
<p>Perhaps that&#8217;s not a bug but a feature: after all, Google trumpets the virtues of <a href="http://googleblog.blogspot.com/2009/12/meaning-of-open.html">&#8220;open&#8221;</a> and the portability of user data (including Gmail) through the <a href="http://www.dataliberation.org/">Data Liberation Front</a>. Nonetheless, it&#8217;s no secret that Google has a major case of <a href="http://abclocal.go.com/kabc/story?section=news/consumer&amp;id=8072533">Facebook envy</a>. And if <a href="http://www.businessinsider.com/heres-the-memo-telling-all-google-employees-their-2011-pay-depends-on-google-sucking-less-at-social-2011-4">rumors</a> hold, Google is now making the success of its social strategy a major component in all employee compensation.</p>
<h3>Social is Give to Get.</h3>
<p>Google critics often assert that <a href="http://www.google.com/search?q=%22google+doesn't+get+social%22">Google doesn&#8217;t get social</a>. But I think the problem isn&#8217;t so much with what Google gets as what it gives. When it comes to social, you have to give to get. That is, to get data and engagement, you have to provide social utility.</p>
<p>To start off, Google would love to know <strong>who you are</strong>. That&#8217;s why it developed <a href="http://www.google.com/support/accounts/bin/answer.py?answer=97703">Google Profiles</a> in 2007. People are more than willing to provide data about who they are, as proven by the hundreds of millions of people who create profiles on Facebook and LinkedIn. Perhaps Google was a little bit late to the game. More likely, people didn&#8217;t see enough utility in creating Google profiles. Facebook, on the other hand, helps people be found by their friends and family in a context designed for social interaction. LinkedIn offers people the opportunity to be found by people who can help them professionally: colleagues, classmates, potential employers, etc. Google didn&#8217;t give people much reason to invest effort &#8212; in fact it seems to treat Profiles as a dumping ground populated by Google&#8217;s other products, rather than a valuable piece of online real estate embedded in a living social context. Not surprisingly, users invest their efforts elsewhere.</p>
<p>Google would also love to know <strong>where you are</strong> and <strong>where you&#8217;ve been</strong> &#8212; that&#8217;s why Google created <a href="http://techcrunch.com/2009/02/04/broadcast-your-location-to-friends-with-google-latitude/">Latitude</a> in 2009. Moreover, Google developed this pioneering location-based service as a complement to Google Maps, perhaps the best product Google has produced outside of search. Given its dominance in mapping services, directions, and local search, Google should be the leader of all things local. And yet, while Latitude has flopped, Foursquare &#8212; which launched in the same year as a tiny startup after Google acquired and shut down its <a href="http://en.wikipedia.org/wiki/Dodgeball_(service)">previous incarnation</a> &#8212; succeeded in defining location-based services as a category. Before Foursquare, the idea of a service tracking your location was one that most of us associated with <a href="http://www.lojack.com/">Lo-Jack</a> and <a href="http://en.wikipedia.org/wiki/Nineteen_Eighty-Four">Big Brother</a> &#8212; if not with modern totalitarian regimes. Yet, by making a game out of &#8220;checking in&#8221; to venues, Foursquare inspired its users to willingly &#8212; and eagerly! &#8212; share and publish their whereabouts. It&#8217;s unclear whether this model will create sustained interest (cf. Mark Watkins&#8217;s analysis at <a href="http://www.readwriteweb.com/archives/2011_the_year_the_check-in_died.php">ReadWriteWeb</a>), but Foursquare&#8217;s success thus far is predicated on its offering social utility in exchange for data and attention.</p>
<p>Of course, Google also wants to know <strong>what you like</strong>. That&#8217;s why Google developed <a href="http://thenoisychannel.com/2008/11/21/google-searchwiki-an-interesting-take-on-pim/">SearchWiki</a> (RIP), <a href="http://google-latlong.blogspot.com/2010/11/discover-yours-local-recommendations.html">Hotpot</a> (now <a href="http://googleblog.blogspot.com/2011/04/hotpot-is-going-places.html">merged into Places</a>), and most recently <a href="http://www.google.com/+1/button/">+1</a>. As Amazon, Facebook, Netflix, and Yelp have demonstrated, people aren&#8217;t shy about sharing their opinions publicly, given the right social context and utility. Unfortunately, Google seems to struggle with that last part. Google embedded SearchWiki in the non-social context of search &#8212; and has launched +1 the same way. It&#8217;s not at all clear what users would gain by going out of their flow to annotate search results. Hotpot may simply be a case of too little, too late &#8212; people are already trained to go to Yelp and Facebook Fan pages for subjective information about service businesses. Overall, Google has not given users a reason to believe there is significant return on their investment in sharing opinions.</p>
<h3>Collecting Data Doesn&#8217;t Count.</h3>
<p>Of course, Google is able to collect a significant amount of data about users&#8217; identities through their search history, cookies, browser toolbars, and purchase history (if they use Google Checkout). Indeed, it is Google&#8217;s inference of user intent in search queries that has allowed Google to become the poster child of online advertising.</p>
<p>But collecting data is not the same as having the user volunteer it. Most users have a transactional relationship with Google, tolerating data collection and advertising in exchange for a free service. Google wants more &#8212; it wants users to invest in identities associated with their Google accounts. But Google doesn&#8217;t seem to understand that users don&#8217;t make these investments unless they receive some social or professional utility in return.</p>
<p>If it&#8217;s true that Larry Page is making &#8220;social&#8221; Google&#8217;s top <a href="http://dondodge.typepad.com/the_next_big_thing/2010/01/how-google-sets-goals-and-measures-success.html">OKR</a>, then I hope for the sake of my former colleagues that Google has learned from its past experiments.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2011/04/14/social-utility-25/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2011/04/14/social-utility-25/feed/</wfw:commentRss>
		<slash:comments>39</slash:comments>
		</item>
		<item>
		<title>Guest Blog: Data 2.0 Conference Report</title>
		<link>http://thenoisychannel.com/2011/04/07/guest-blog-data-2-0-conference-report/</link>
		<comments>http://thenoisychannel.com/2011/04/07/guest-blog-data-2-0-conference-report/#comments</comments>
		<pubDate>Thu, 07 Apr 2011 15:26:41 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3546</guid>
		<description><![CDATA[Note: This post was written by Scott Nicholson, a Senior Data Scientist at LinkedIn. Scott is a data and modeling geek with a passion for startups, product and user experience. His work at LinkedIn focuses on analyzing and improving user engagement and monetization. I’m happy to report back on my experience at the Data 2.0 conference, an [...]]]></description>
			<content:encoded><![CDATA[<p><object width="400" height="300"><param name="flashvars" value="offsite=true&amp;lang=en-us&amp;page_show_url=%2Fgroups%2Fdata2con%2Fpool%2Fshow%2F&amp;page_show_back_url=%2Fgroups%2Fdata2con%2Fpool%2F&amp;group_id=1614380@N25&amp;jump_to=&amp;start_index=" /><param name="movie" value="http://www.flickr.com/apps/slideshow/show.swf?v=71649" /><param name="allowFullScreen" value="true" /><embed type="application/x-shockwave-flash" width="400" height="300" src="http://www.flickr.com/apps/slideshow/show.swf?v=71649" flashvars="offsite=true&amp;lang=en-us&amp;page_show_url=%2Fgroups%2Fdata2con%2Fpool%2Fshow%2F&amp;page_show_back_url=%2Fgroups%2Fdata2con%2Fpool%2F&amp;group_id=1614380@N25&amp;jump_to=&amp;start_index=" allowfullscreen="true"></embed></object></p>
<p><em>Note: This post was written by <a href="http://www.linkedin.com/in/scottnicholsonphd">Scott Nicholson</a>, a Senior Data Scientist at LinkedIn. Scott is a data and modeling geek with a passion for startups, product and user experience. His work at LinkedIn focuses on analyzing and improving user engagement and monetization.</em></p>
<p>I’m happy to report back on my experience at the <a href="http://data2con.com/">Data 2.0 conference</a>, an event organized by <a href="http://midventures.com/">midVentures</a> and targeted at entrepreneurs building products to leverage the dramatic increase in publicly and privately collected data. The conference had four main themes: what data is available, how to obtain data, how to store and access data, and how to create value from data products. For data nerds or hackers, the conference offered a delightful stream of “you know what would be cool&#8230;” ideas.</p>
<p>The morning got off to a strong start with a talk by <a href="http://wadhwa.com/">Vivek Wadhwa</a> on how data is going to define the next generation of successful startups in a new information age. He observed the increasing online access to data that has previously been restricted to offline access (or no access at all). He also emphasized the importance of new sources of data, such as medical records and genome data. We need to think of social use of data beyond Twitter, Facebook and LinkedIn: for example, genome data will allow us to connect to each other in ways that help us better understand our similarities and differences. Meanwhile, some existing data sources will become increasingly open and available to all. Wadhwa stressed the importance of leveraging the open sources of federal, state and local government data to come up with solutions to the existing closed and clunky legacy systems that governments use to generate data reports (<em>a pity that <a href="http://data.gov/">data.gov</a> and related programs may be <a href="http://www.guardian.co.uk/news/datablog/2011/apr/05/data-gov-crisis-obama">defunded</a> &#8212; DT</em>).</p>
<p>The morning keynote segued nicely into the <a href="http://data2con.com/schedule/topics-2/#WhyOpenData">panel</a> on open data sources. <a href="http://www.jaynath.com/">Jay Nath</a>, Director of CRM for the city of San Francisco, noted that, while many applications are using government data and APIs, they mostly address consumer convenience (e.g., public transit apps) rather than government efficiency. Panelists agreed that government employees have few incentives to take risks by using new technology: legacy systems might be expensive, inflexible and inefficient, but they do perform their limited function. Alluding to Eric Ries&#8217;s idea of a &#8220;<a href="http://theleanstartup.com/">lean startup</a>&#8221;, Nath suggested the concept of a &#8220;lean government&#8221; that lowered costs, sped up its operations, and avoided procurement processes by using open source technology &#8212; all in the context of providing services to its citizens.</p>
<p>The inspiring mid-day keynote by former Amazon Chief Scientist <a href="http://www.weigend.com/">Andreas Weigend</a> took a different perspective from the morning sessions: he focused on how data sharing can provide tangible value to end-users, even resulting in significant behavior change. He cited products like <a href="http://www.withings.com/en/bodyscale">tweeting weight scales</a>, <a href="http://www.fitbit.com/">FitBit</a>, and <a href="http://www.apple.com/ipod/nike/">Nike +</a> that allow people to share data about their fitness efforts, thus leading to social reinforcement for positive behaviors. I personally see this area as a great example of where data scientists and engineers can create enormous economic value and increase people’s welfare.</p>
<p>The day also featured various product launches and presentations. Here are a few that caught my attention:</p>
<ul>
<li><a href="http://micello.net/">Micello</a>: Google maps for indoors. They won the startup competition that was held in conjunction with the conference.</li>
<li><a href="https://www.tropo.com/home.jsp">Tropo</a>: API for voice calls and SMS</li>
<li><a href="http://www.datastax.com/products/brisk">DataStax Brisk</a>: Technology unifying <a href="http://hadoop.apache.org/">Hadoop</a>, <a href="http://wiki.apache.org/hadoop/Hive">Hive</a> &amp; <a href="http://cassandra.apache.org/">Cassandra</a>. A new Hadoop distribution powered by Cassandra.</li>
<li><a href="http://www.neerlife.com/">Neer</a>: always-on location awareness app from Qualcomm. Privately share location with groups and families.</li>
<li><a href="http://www.heritagehealthprize.com/c/hhp">Heritage Health Prize</a>: $3MM prize for predictive modeling around who will require hospitalization (a follow-up on their announcing the prize at <a href="http://strataconf.com/strata2011">Strata</a>)</li>
</ul>
<p>Overall, it was great to see hundreds of people exploring innovations and opportunities to use data to improve business, technology and society.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2011/04/07/guest-blog-data-2-0-conference-report/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2011/04/07/guest-blog-data-2-0-conference-report/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Got Skills?</title>
		<link>http://thenoisychannel.com/2011/02/04/got-skills/</link>
		<comments>http://thenoisychannel.com/2011/02/04/got-skills/#comments</comments>
		<pubDate>Fri, 04 Feb 2011 05:53:38 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3463</guid>
		<description><![CDATA[Last October, a certain blogger said: LinkedIn needs to implement some kind of concept extraction to provide a useful topic facet (something I’d also love to see for their regular people search). This is a challenging information extraction problem, especially for the open web, but I also know from experience that it is tractable within a domain. [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.linkedin.com/skills/skill/Information_Retrieval"><img class="alignnone size-full wp-image-3466" title="LinkedIn Skills: Information Retrieval" src="http://thenoisychannel.com/wordpress/wp-content/uploads/2011/02/Screen-shot-2011-02-03-at-9.10.05-PM1.png" alt="" width="507" height="589" /></a></p>
<p>Last October, <a href="http://thenoisychannel.com/2010/10/02/linkedin-signal-exploratory-search-for-twitter/">a certain blogger said</a>:</p>
<blockquote><p>LinkedIn needs to implement some kind of concept extraction to provide a useful topic facet (something I’d also love to see for their regular people search). This is a challenging information extraction problem, especially for the open web, but I also know from <a href="http://www.endeca.com/">experience</a> that it is tractable within a domain. Given LinkedIn’s professional focus, I believe this is a problem they can and should tackle.</p></blockquote>
<p>Shortly after writing that post, I interviewed at LinkedIn and met <a href="http://www.linkedin.com/in/peterskomoroch">Pete Skomoroch</a>, who showed me an early preview of the work his team was doing to make skills a <a href="http://en.wikipedia.org/wiki/Faceted_search">facet</a> for exploring the space of LinkedIn member profiles. That demo made a strong impression on me, giving me a taste of the great products LinkedIn&#8217;s data scientists were working on in the lab.</p>
<p>And now I&#8217;m delighted that everyone can try out the beta launch of <a href="http://www.linkedin.com/skills/">LinkedIn Skills</a>, which was announced today at O&#8217;Reilly&#8217;s <a href="http://strataconf.com/strata2011">Strata 2011</a> conference on Big Data.</p>
<p>As Pete says in his <a href="http://blog.linkedin.com/2011/02/03/linkedin-skills/">blog post</a>:</p>
<blockquote><p>If you search for a particular skill, we’ll surface key people within that community, show you the top locations, related companies, relevant jobs, and groups where you can interact with like minded professionals.  You’ll also be able to explore similar skills and compare their growth relative to each other.</p></blockquote>
<p>I encourage you to check it out &#8212; whether you&#8217;re looking for experts on <a href="http://www.linkedin.com/skills/skill/Hadoop">Hadoop</a>, <a href="http://www.linkedin.com/skills/skill/Cheese">cheese</a>, or anything else! It&#8217;s a beta, so I&#8217;m sure you&#8217;ll find rough edges; but I hope it gives you a sense of how LinkedIn&#8217;s data can enable an incredibly powerful and useful <a href="http://en.wikipedia.org/wiki/Exploratory_search">exploratory search</a> experience.</p>
<p><a href="http://thenoisychannel.com/2011/01/29/be-vewy-vewy-quiet/">No forward-looking statements</a>, except to say that it only gets better from here!</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2011/02/04/got-skills/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2011/02/04/got-skills/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Quo Vadis, Quora?</title>
		<link>http://thenoisychannel.com/2011/01/09/quo-vadis-quora/</link>
		<comments>http://thenoisychannel.com/2011/01/09/quo-vadis-quora/#comments</comments>
		<pubDate>Sun, 09 Jan 2011 22:14:56 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3433</guid>
		<description><![CDATA[I know, everyone is sick of hearing about Quora, the community question answering site that is the darling of the blogosphere, and perhaps you fled here from TechCrunch hoping for something different. If so, I apologize. And if you want to read something else, I encourage you to use either the random post widget I [...]]]></description>
			<content:encoded><![CDATA[<p>I know, everyone is sick of hearing about <a href="http://www.quora.com/">Quora</a>, the community question answering site that is the darling of the blogosphere, and perhaps you fled here from <a href="http://techcrunch.com/tag/quora/">TechCrunch</a> hoping for something different. If so, I apologize. And if you want to read something else, I encourage you to use either the random post widget I recently added to the right-hand sidebar or the <a href="http://thenoisychannel.com/2011/01/07/enabling-exploratory-search-with-dhiti/">exploration widget</a> at the bottom of this post.</p>
<p>But I have personal reasons to be interested in Quora. One of their lead engineers, <a href="http://xng.cc/">Albert Sheu</a>, was a <a href="http://www.linkedin.com/profile/recommendations?id=9903676">star intern</a> of mine at <a href="http://www.endeca.com/">Endeca</a>. And Quora raises lots of interesting questions about search, user experience, knowledge management, and <a href="http://thenoisychannel.com/2010/05/02/thoughts-about-online-reputation/">online reputation</a>. How could I resist?</p>
<p>I see three potential reasons to use Quora:</p>
<ol>
<li>Objective question answering.</li>
<li>Subjective question answering.</li>
<li>Community participation.</li>
</ol>
<p>Let&#8217;s consider how Quora fares today on each of these, and where it might go.</p>
<p><strong>1. Objective question answering.</strong></p>
<p>When I <a href="http://thenoisychannel.com/2010/04/19/qui-quae-quora/">blogged about Quora</a> early last year, I said that &#8220;I don’t see Quora as a knowledge base of first resort–except possibly to learn more about software startups.&#8221; Despite Quora&#8217;s recent <a href="http://www.quora.com/Quora-Growth-Surge-Dec-2010-Jan-2011">growth surge</a>, I am not ready to change my answer significantly &#8212; I find that Quora&#8217;s topics are pretty sparse when I stray from its Silicon Valley focus.</p>
<p>Within that focus, Quora is nailing it. For example, I was curious to learn whether someone who signed a non-compete agreement outside of California was still subject to it if he or she moved to California, where such contracts are legally unenforceable. Not surprisingly, <a href="http://www.quora.com/Non-Compete-Agreements">non-compete agreements</a> are a topic on Quora, and I quickly found a <a href="http://www.quora.com/If-I-have-a-non-compete-agreement-with-a-company-in-NY-and-move-to-CA-is-the-non-compete-agreement-unenforcable">useful answer</a> from a lawyer.</p>
<p>But for most objective questions, I&#8217;m still turning to Google and Wikipedia &#8212; or to <a href="http://twitter.com/#!/dtunkelang">Twitter</a> if both of those fail and I am willing to ask a favor of my followers (who <a href="http://thenoisychannel.com/2009/03/14/challenge-blog-twitter-vs-aardvark/">kick ass</a>!). Sometimes Google will take me to Quora, but I can&#8217;t imagine Quora will succeed through this flow in the long term.</p>
<p><strong>2. Subjective question answering.</strong></p>
<p>I see subjective question answering as Quora&#8217;s strongest suit. A good subjective question on Quora &#8212; often a &#8220;why&#8221; question &#8212; generates a diverse collection of interesting and informed perspectives. A couple of good examples are &#8220;<a href="http://www.quora.com/Why-did-Google-Wave-fail-to-get-significant-user-adoption">Why did Google Wave fail to get significant user adoption?</a>&#8221; and &#8220;<a href="http://www.quora.com/Social-Networks/What-is-lacking-in-social-networking-now">What is lacking in social networking now?</a>&#8221;.</p>
<p>Again, these questions are well within the Silicon Valley focus, but I could see Quora extending this value proposition to other verticals if it can grow the communities successfully. And I certainly don&#8217;t see myself going to Google or even Twitter to get useful answers to subjective questions. The closest is <a href="http://thenoisychannel.com/2009/05/27/topsy-tippling-the-stream-of-conversations/">Topsy</a>, and Quora has the advantage of being explicitly organized around questions and topics.</p>
<p><strong>3. Community participation.</strong></p>
<p>Is Quora a question answering site or a social network? Quora users and employees have tried to answer that question (<a href="http://www.quora.com/Is-Quora-a-social-network">on Quora</a>, natch), but I&#8217;m not sure Quora&#8217;s converged enough for anyone to know. What is clear is that Quora emphasizes conversation, making it more like a blog or wiki than an answers site.</p>
<p>Conversation certainly engages its participants. But it also raises the cost of participation. One of the things I love about Google is that it gives me information without unnecessary overhead. When I want conversation, I go to social venues like Twitter.</p>
<p>Perhaps Quora can be both a question answering site and a social network. But I suspect it will need to choose. Most people don&#8217;t have the time or patience to participate in additional communities, so question answering is the easier sell to a mass audience. But the participation is what makes Quora especially distinctive today. Perhaps it&#8217;s a question of quality vs. quantity.</p>
<p>So, <em><a href="http://en.wikipedia.org/wiki/Quo_vadis">quo vadis</a></em>, Quora? I suppose I&#8217;ll have to <a href="http://www.quora.com/Quora-Quality/Quora-is-a-curated-community-of-early-adopters-now-its-nice-but-how-can-it-scale">check Quora</a> (or <a href="http://www.cwora.com/">Cwora</a>) to find the answers.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2011/01/09/quo-vadis-quora/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>No More Quora Invites</title>
		<link>http://thenoisychannel.com/2011/01/07/no-more-quora-invites/</link>
		<comments>http://thenoisychannel.com/2011/01/07/no-more-quora-invites/#comments</comments>
		<pubDate>Fri, 07 Jan 2011 14:54:55 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3430</guid>
		<description><![CDATA[Over the past few days, I have been inundated with requests for Quora invites. I realize that I brought this upon myself by making my blog the top hit on Google for [quora invite] &#8212; though it seems I&#8217;m at least down to the #2 slot now. In any case, I have sent out over a [...]]]></description>
			<content:encoded><![CDATA[<p>Over the past few days, I have been inundated with requests for <a href="http://www.quora.com/">Quora</a> invites. I realize that I brought this upon myself by making my blog the top hit on Google for [<a href="http://www.google.com/search?q=quora+invite">quora invite</a>] &#8212; though it seems I&#8217;m at least down to the #2 slot now. In any case, I have sent out over a hundred invites and need to stop fulfilling requests so that I can focus on my day job!</p>
<p>I hope everyone I&#8217;ve invited is enjoying Quora. But I also hope you take it upon yourselves to circulate more invitations to those who want them. Any Quora user can send out invites &#8212; that&#8217;s how these viral sites work. If you&#8217;re still looking for an invite, I urge you to use Twitter or some other broadcast mechanism to request it. As of today, I will stop responding to Quora invite requests through my blog or email, and I will also delete comments requesting them. I am sorry if this is a bit harsh, but I hope folks understand.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2011/01/07/no-more-quora-invites/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>So You Like Big Data&#8230;</title>
		<link>http://thenoisychannel.com/2011/01/04/so-you-like-big-data/</link>
		<comments>http://thenoisychannel.com/2011/01/04/so-you-like-big-data/#comments</comments>
		<pubDate>Tue, 04 Jan 2011 04:43:35 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3408</guid>
		<description><![CDATA[The increasing volume of data that we generate as a species is a story so overplayed as to have become trite. Indeed, a vast amount of this data is in the public domain, including data from the full text and common ngrams of books, genome research, the  United States census, and much more. There is [...]]]></description>
			<content:encoded><![CDATA[<p>The increasing volume of data that we generate as a species is a story so overplayed as to have become trite. Indeed, a vast amount of this data is in the public domain, including data from the <a href="http://www.gutenberg.org/">full text</a> and common <a href="http://ngrams.googlelabs.com/datasets">ngrams</a> of books, <a href="http://www.ncbi.nlm.nih.gov/guide/data-software/">genome research</a>, the <a href="http://www.census.gov/">United States census</a>, and much more. There is also open-source software not only to <a href="http://nutch.apache.org/">crawl</a> the web, but also to <a href="http://lucene.apache.org/">search</a> the data you crawl. So, if you&#8217;re an aspiring data scientist and just want to get your hands on data, there&#8217;s no excuse&#8211;go out and get it!</p>
<p>But perhaps you&#8217;d like to make a career out of your jones for big data. Luckily for you, some of the hottest companies around are hiring data scientists!</p>
<p>Of course, those jobs aren&#8217;t for everyone. To get an idea of the requisite math and computer science skills, I suggest you read the answers on Quora for &#8220;<a href="http://www.quora.com/How-do-I-become-a-data-scientist">How do I become a data scientist?</a>&#8221;. I&#8217;m also a fan of <a href="http://www.hilarymason.com/">Hilary Mason</a>&#8217;s definition, which was cited in Ryan Kim&#8217;s &#8220;<a href="http://gigaom.com/2010/12/16/wanted-data-scientists-to-turn-information-into-gold/">Wanted: Data Scientists to Turn Information Into Gold</a>&#8221;: a data scientist is someone who can obtain, scrub, explore, model and interpret data, blending hacking, statistics and machine learning. You can see Hilary&#8217;s full explanation in a blog post she co-authored with <a href="http://www.columbia.edu/~chw2/">Chris Wiggins</a>, entitled &#8220;<a href="http://www.dataists.com/2010/09/a-taxonomy-of-data-science/">A Taxonomy of Data Science</a>&#8221;.</p>
<p>If the qualifications haven&#8217;t scared you off, then it&#8217;s just a question of where you can best apply your data scientist skills. The good news is that there are a lot of different ways to make a career out of working with big data. Here are some suggestions for what to work on. I apologize in advance for taking a US-centric perspective &#8212; if you&#8217;re outside the US, I can only hope that the examples have local analogs.</p>
<p><strong>1) Web search.</strong></p>
<p>Google, Yahoo, and Bing all collect an enormous amount of data from people&#8217;s web search activity. Google is, of course, the 800-pound gorilla, but don&#8217;t dismiss the others &#8212; even a single-digit market share is enough to derive extremely valuable insights from user activity. And, since every major search engine makes the bulk of its revenue from advertising, they all present the big-data challenges associated with <a href="http://thenoisychannel.com/2009/07/31/sigir-2009-day-3-industry-track-vanja-josifovski/">computational advertising</a>. Search is, in my view, the web&#8217;s killer app, so you can&#8217;t go wrong working on it. But temper your expectations &#8212; despite heroic efforts from various parties, it seems difficult to deliver revolutionary improvements to this field.</p>
<p><strong>2) Social networking.</strong></p>
<p>Here the biggest players are Facebook and Twitter, but you can find a more comprehensive <a href="http://en.wikipedia.org/wiki/List_of_social_networking_websites">list</a> on Wikipedia. Many consider LinkedIn to be a social network, but I&#8217;ll take the liberty to discuss it in its own section. Social networks attract an outsized share of users&#8217; attention: Facebook alone accounts for a <a href="http://weblogs.hitwise.com/heather-dougherty/2010/11/facebookcom_generates_nearly_1_1.html">quarter of US page views</a> on the web! All of this user activity means a lot of data to crunch, so it&#8217;s not surprising that LinkedIn, Facebook, and Twitter are recognized as having the <a href="http://www.quora.com/Which-companies-have-the-best-data-science-teams">best data science teams</a>. How much you&#8217;ll enjoy working at these companies will in part reflect the value (and values) you perceive in their offerings, but they are all playgrounds for data scientists.</p>
<p><strong>3) Electronic commerce.</strong></p>
<p>While ad-supported web search may be the killer app of the web, what opens up people&#8217;s wallets is e-commerce. Led by Amazon and eBay, e-commerce sites deserve much of the credit for turning the web from an esoteric research project into a mainstream staple. And, <a href="http://www.prenhall.com/divisions/bp/app/alter/student/useful/ch1walmart.html">like their offline counterparts</a>, e-commerce sites generate vast amounts of data from how users view and purchase products. This data drives user recommendations, merchandising campaigns, pricing strategy, and much more. If you&#8217;d like to pursue data-driven capitalism, then e-commerce may be for you. A word of caution: if you are one of a crowd of merchants selling the same products as everyone else (as opposed to a site like <a href="http://www.etsy.com/">Etsy</a> selling unique products), make sure you have a sustainable competitive advantage. Data science is necessary for success in e-commerce, but it may not be sufficient.</p>
<p><strong>4) Digital content.</strong></p>
<p>Whether it&#8217;s books, music, video, or apps, the long-prophesied digital convergence has arrived: almost every newly created piece of content is now distributed in electronic form. Here the biggest players are Amazon, Apple, and Google (particularly its YouTube subsidiary), but there is still a lot of flux as new hardware, software, and business models compete for dominance. Digital content poses two daunting challenges: the volume of published content far exceeds people&#8217;s available attention, and digital media products are <a href="http://en.wikipedia.org/wiki/Experience_good">experience goods</a> that people can only evaluate after consuming them. For both of these reasons, the digital content industry depends on data scientists to help people find and discover what they like. The catch: from its advent, the digital content industry has struggled with unauthorized distribution (aka piracy), and the results of this struggle will determine which business models are viable.</p>
<p><strong>5) Finance.</strong></p>
<p><a href="http://www.youtube.com/watch?v=ETxmCCsMoD0">Money, money, money.</a> Working in finance has always been a data-intensive business, but advances in technology have only increased the industry&#8217;s reliance on data scientists. <a href="http://en.wikipedia.org/wiki/Algorithmic_trading">Algorithmic trading</a> &#8212; and <a href="http://en.wikipedia.org/wiki/High-frequency_trading">high-frequency trading</a> in particular &#8212; means that those who can most effectively and efficiently mine financial data can derive enormous financial benefits. Finance isn&#8217;t for everyone &#8212; the hours are long, the stress is high, and the compensation is highly variable. That said, the financial upside can be quite compelling, and some even enjoy the lifestyle.</p>
<p><strong>6) Public sector.</strong></p>
<p>Given the libertarian leanings of the software industry, the public sector might not seem like an obvious career choice. But some of the largest repositories of data reside there&#8211;from public repositories like <a href="http://www.census.gov/">census</a> data to highly classified repositories restricted to the <a href="http://www.urbandictionary.com/define.php?term=Three-letter+Agencies">TLAs</a>. Better understanding of this data can improve public policy, national security, and much more. Not everyone has the temperament to deal with government bureaucracy, but those who do have the opportunity to turn big data into big public good.</p>
<p><strong>7) LinkedIn.</strong></p>
<p>OK, I&#8217;m being self-serving, but after all this is my blog! LinkedIn is widely recognized as having one of the top data science teams on the planet. But LinkedIn has more than just talent &#8212; it has what Pete Warden of ReadWriteWeb described in &#8220;<a href="http://www.readwriteweb.com/hack/2010/11/secrets-of-the-linkedin-data-scientists.php">Secrets of the LinkedIn Data Scientists</a>&#8221; as &#8220;detailed information on millions of people who are motivated to keep their profiles up-to-date, collect a rich network of connections and have a strong desire from their users for more tools to help them in their professional lives.&#8221; Indeed, I don&#8217;t know of anyone who has a dataset that competes with the combined quantity, quality, and utility of LinkedIn&#8217;s data. Moreover, working as a data scientist at LinkedIn means helping make people more professionally successful by connecting them to opportunities, information, and of course other people. It&#8217;s a wonderful way to create value, and it doesn&#8217;t hurt to do so in the context of a <a href="http://www.businessinsider.com/linkedin-looks-to-almost-double-headcount-in-2010-2010-6">profitable, rapidly growing company</a>.</p>
<p>And LinkedIn recognizes the extraordinary value of data science. Don&#8217;t take my word for it &#8212; listen to LinkedIn CEO Jeff Weiner&#8217;s <a href="http://www.youtube.com/v/unnQOEuAG8o">interview</a> at the 2010 Web 2.0 Summit:</p>
<p>To wrap up, data science is more than just an opportunity to have fun and make the world a better place &#8212; it might even be how you make an honest living!</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2011/01/04/so-you-like-big-data/feed/</wfw:commentRss>
		<slash:comments>30</slash:comments>
		</item>
		<item>
		<title>Reflecting on 2010: Searching for Answers</title>
		<link>http://thenoisychannel.com/2010/12/30/reflecting-on-2010-searching-for-answers/</link>
		<comments>http://thenoisychannel.com/2010/12/30/reflecting-on-2010-searching-for-answers/#comments</comments>
		<pubDate>Fri, 31 Dec 2010 01:29:29 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3397</guid>
		<description><![CDATA[Yes, it&#8217;s that time of year when we take a moment to reflect on the past year&#8217;s accomplishments and muse about what the next year will bring. Other than milder weather! I began this year as a Noogler and leave it as a Xoogler. I hope I left Google better than I found it &#8212; [...]]]></description>
			<content:encoded><![CDATA[<p>Yes, it&#8217;s that time of year when we take a moment to reflect on the past year&#8217;s accomplishments and muse about what the next year will bring. Other than <a href="http://www.wunderground.com/US/CA/Mountain_View.html">milder weather</a>!</p>
<p>I began this year as a <a href="http://www.urbandictionary.com/define.php?term=noogler">Noogler</a> and leave it as a <a href="http://www.linkedin.com/groups?home=&amp;gid=73619">Xoogler</a>. I hope I left Google better than I found it &#8212; I&#8217;m certainly proud of the improvements my team made to the quality of <a href="http://www.seobythesea.com/?p=245">local authority pages</a>. I also tried to infuse Google with some of the scrappy start-up culture I&#8217;d picked up at <a href="http://www.endeca.com/">Endeca</a>, particularly focusing on the hiring process. In information retrieval terms, I&#8217;d say that Google&#8217;s hiring process does extremely well when it comes to <a href="http://en.wikipedia.org/wiki/Precision_and_recall#Precision">precision</a>, but could use improvement in the areas of <a href="http://en.wikipedia.org/wiki/Precision_and_recall#Recall">recall</a> and efficiency. Still, I&#8217;m impressed at how well Google has maintained its quality standards as the company has grown. Finally, I couldn&#8217;t help being an extrovert: I developed warm relationships with the lead <a href="http://thenoisychannel.com/2009/12/05/blogs-i-read-living-la-vida-local/">bloggers covering local search</a>, including <a href="http://www.localseoguide.com/about-me/">Andrew Shotland</a>, <a href="http://www.davidmihm.com/">David Mihm</a>, <a href="http://twitter.com/#!/golander59">Gib Olander</a>, <a href="http://gesterling.wordpress.com/about/">Greg Sterling</a>, and <a href="http://www.blumenthals.com/index.php?MikeBlumenthal">Mike Blumenthal</a>. Indeed, when I announced my departure, Mike wrote a <a href="http://blumenthals.com/blog/2010/12/03/daniel-tunkelang-leaving-google-maps-to-join-linkedin/">really nice post</a> about the friendship we cultivated over the past year. I hope that he continues to have such relationships with my former co-workers.</p>
<p>Looking back at what was <a href="http://thenoisychannel.com/2010/01/03/search-questions-for-2010-whats-on-my-mind/">on my mind when this year began</a>, I had lots of questions around exploratory, mobile, real-time, and social/collaborative search. I also wondered whether it was possible to offer more transparency in relevance ranking without losing ground in the battle against spam and black-hat SEO.</p>
<p>I&#8217;m as bullish as ever on the value of <a href="http://en.wikipedia.org/wiki/Exploratory_search">exploratory search</a>:  part of why I <a href="http://thenoisychannel.com/2010/12/03/follow-the-data/">joined LinkedIn</a> is that a significant fraction of the site&#8217;s value comes from supporting users&#8217; exploratory search needs. I also published a position paper at the <a href="http://www.mansci.uwaterloo.ca/~msmucker/publications/simint10proceedings.pdf">SIGIR 2010 Workshop on Simulation of Interaction</a> proposing the use of <a href="http://thenoisychannel.com/2010/05/23/estimating-the-query-difficulty-for-information-retrieval/">query performance prediction</a> to model the fidelity of communication between user and system, thus helping <a href="http://en.wikipedia.org/wiki/Human%E2%80%93computer_information_retrieval">HCIR</a> researchers to simulate query refinement with standard test collections. And of course exploratory search was a major theme at the <a href="http://sites.google.com/site/hcirworkshop/hcir-2010">HCIR 2010</a> workshop, not only providing the basis for the first <a href="http://sites.google.com/site/hcirworkshop/hcir-2010/challenge">HCIR Challenge</a>, but even extending to new territory with Max Wilson and David Elsweiler&#8217;s work on <a href="http://www.slideshare.net/gingdotslideshare/hcir2010-casualleisure-search">casual leisure searching</a>.</p>
<p>As for mobile search, I&#8217;d say that 2010 has been the year of &#8220;<a href="http://www.avc.com/a_vc/2010/09/mobile-first-web-second.html">mobile first</a>&#8221;. Thanks to a <a href="http://googlemobile.blogspot.com/2009/12/android-dogfood-diet-for-holidays.html">generous gift</a> from my former employer, I&#8217;ve become a regular user of the mobile web&#8211;and of search in particular. To my surprise, the communication bottleneck has not been screen real estate, but rather the difficulty of entering text. And innovative approaches like <a href="http://www.youtube.com/watch?v=laOlkD8LmZw">voice search</a> and <a href="http://www.swypeinc.com/">Swype</a> go a long way to mitigate that difficulty.</p>
<p>On to real-time search. Not surprisingly, my favorite innovation in this space is <a href="http://thenoisychannel.com/2010/10/02/linkedin-signal-exploratory-search-for-twitter/">LinkedIn Signal</a>, which offers exploratory search for Twitter. I still struggle to find <a href="http://thenoisychannel.com/2010/01/18/real-time-search-is-personal/">use cases</a> that emphasize the &#8220;real-time&#8221; aspect of Twitter and other microblogging services, but I am convinced that the path to utility lies in tools that support organization, analysis, and exploration.</p>
<p>On the social/collaborative front, I&#8217;m happy to work for a company whose charter includes &#8220;supporting mediated search by linking people to people, rather than directly to information&#8221;.  While the biggest event in this space in 2010 was Facebook&#8217;s introduction of the <a href="http://developers.facebook.com/docs/reference/plugins/like">Like button</a>, I&#8217;m not convinced that &#8220;likes&#8221; have supplanted links. I&#8217;m still looking to niche players like <a href="http://thenoisychannel.com/2009/05/27/topsy-tippling-the-stream-of-conversations/">Topsy</a> and <a href="http://thenoisychannel.com/2010/08/06/taking-blekko-out-for-a-spin/">Blekko</a> to push innovation in this space.</p>
<p>Speaking of Blekko, they&#8217;ve made an impressive attempt to increase the transparency of relevance ranking. But, <a href="http://thenoisychannel.com/2010/03/07/google-and-transparency/">as I blogged earlier this year</a>, I think that, at least for the time being, Google is making the right decision to keep some of its details secret. Now that web search is essentially a <a href="http://www.bing.com/community/site_blogs/b/search/archive/2009/07/29/exciting-times-for-bing-and-yahoo.aspx">duopoly</a> (at least in the US), I believe the real test of the value of transparency to users will be whether one of the two parties employs it as a competitive differentiator.</p>
<p>What&#8217;s in store for 2011? LinkedIn CEO <a href="http://www.linkedin.com/in/jeffweiner08">Jeff Weiner</a> has a vision of using data science to provide a &#8220;<a href="http://blogs.wsj.com/venturecapital/2010/06/10/after-first-year-as-linkedins-ceo-jeff-weiner-talks-shop/">Pandora for people</a>&#8221;, and that&#8217;s a vision I&#8217;m eager to help realize. Not surprisingly, when I blogged in 2008 about <a href="http://thenoisychannel.com/2008/08/07/where-google-isnt-good-enough/">where Google wasn&#8217;t good enough</a>, two of the four areas I cited were finding jobs and finding employees. Even then I recognized that LinkedIn was the best at both. But LinkedIn can be so much more, and I am looking forward to working with an incredible team and incredible data on a delightful set of information science challenges.</p>
<p>Happy New Year! I hope that 2011 brings you great answers &#8212; and great questions!</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/12/30/reflecting-on-2010-searching-for-answers/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Giving Thanks as an Information Scientist</title>
		<link>http://thenoisychannel.com/2010/11/25/giving-thanks-as-an-information-scientist/</link>
		<comments>http://thenoisychannel.com/2010/11/25/giving-thanks-as-an-information-scientist/#comments</comments>
		<pubDate>Fri, 26 Nov 2010 02:20:33 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3367</guid>
		<description><![CDATA[As a first-generation American who is married to a card-carrying Native American, I celebrate Thanksgiving the traditional way: a day of gluttony followed by yummy leftovers. But, trite as it may be, I do like to take the time to reflect on the countless things for which I am thankful. A wonderful family, of course, [...]]]></description>
			<content:encoded><![CDATA[<p>As a first-generation American who is married to a <a href="http://shop.cafepress.com/yurok">card-carrying Native American</a>, I celebrate Thanksgiving the traditional way: a day of <a href="http://www.youtube.com/watch?v=Rp4yWTLIPaE">gluttony</a> followed by yummy leftovers. But, trite as it may be, I do like to take the time to reflect on the countless things for which I am thankful. A wonderful <a href="http://www.flickr.com/photos/24264445@N05/">family</a>, of course, but also the great fortune to live in an age where some of the subjects that I find most intellectually stimulating have become highly relevant to our practical daily lives.</p>
<p>Consider <a href="http://en.wikipedia.org/wiki/Information_retrieval">information retrieval</a>. Perhaps I&#8217;m dating myself, but as an undergraduate computer science major, I hardly imagined that information retrieval would have much significance outside of academia. Sure, there were commercial IR systems being built in the 1980s, but it wasn&#8217;t until the late 1990s that web search brought IR to the mainstream. Today, it&#8217;s hard to imagine studying computer science without learning about IR. Sure, my <a href="http://www.linkedin.com/in/dtunkelang">career</a> makes me a tad biased, but it is undeniable that information retrieval is one of the defining problems of our generation.</p>
<p>And then there are <a href="http://en.wikipedia.org/wiki/Social_network">social networks</a>. When I studied <a href="http://en.wikipedia.org/wiki/Graph_drawing">graph drawing</a> in the 1990s, the canonical example of a social network was &#8220;<a href="http://en.wikipedia.org/wiki/Six_Degrees_of_Kevin_Bacon">Six Degrees of Kevin Bacon</a>&#8221;. Sure, many of my peers would talk about their <a href="http://en.wikipedia.org/wiki/Erd%C5%91s_number">Erdős numbers</a> (they were more discreet about their placement in the <a href="http://shand.pagesperso-orange.fr/memoires/tarjan.html">Tarjan graph</a>), but the study of social networks was surely an academic pursuit. Who would imagine that, barely a decade later, a movie entitled <em><a href="http://www.imdb.com/title/tt1285016/">The Social Network</a></em> would be a blockbuster grossing <a href="http://boxofficemojo.com/movies/?id=socialnetwork.htm">$175M</a>? Leaving aside Hollywood, social networks have become a significant part of our daily lives. Not only do Facebook, Twitter, and LinkedIn account for a <a href="http://blog.nielsen.com/nielsenwire/online_mobile/what-americans-do-online-social-media-and-games-dominate-activity/">large fraction of our time online</a>, but they also affect our offline personal and professional lives.</p>
<p>From childhood, I&#8217;ve been interested in mathematics, computer science, and psychology. Living in an age of information retrieval and social networks means that I can apply these interests in my daily work. Today I give thanks for being born at the right place and right time, blessed with a lifetime of interesting and practical problems to solve. Happy Thanksgiving to all, and enjoy the leftovers!</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/11/25/giving-thanks-as-an-information-scientist/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The Element of Surprise</title>
		<link>http://thenoisychannel.com/2010/11/07/the-element-of-surprise/</link>
		<comments>http://thenoisychannel.com/2010/11/07/the-element-of-surprise/#comments</comments>
		<pubDate>Sun, 07 Nov 2010 23:09:19 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3353</guid>
		<description><![CDATA[Surprise is not a word that user interface designers typically like to hear. Indeed, the principle of least surprise (also called the principle of least astonishment) is that systems should always strive to act in a way that least surprises the user. Like many interface design principles, the principle of least surprise reflects the premise [...]]]></description>
			<content:encoded><![CDATA[<p>Surprise is not a word that user interface designers typically like to hear. Indeed, the principle of least surprise (also called the <a href="http://en.wikipedia.org/wiki/Principle_of_least_astonishment">principle of least astonishment</a>) is that systems should always strive to act in a way that least surprises the user.</p>
<p>Like many interface design principles, the principle of least surprise reflects the premise that software applications exist to be useful. In utility-oriented applications, surprise means distraction and delay &#8212; negatives that good designers work to avoid.</p>
<p>But we increasingly see applications whose main value to the user is not utility, but entertainment. Indeed, a recent <a href="http://blog.nielsen.com/nielsenwire/online_mobile/what-americans-do-online-social-media-and-games-dominate-activity/">Nielsen report</a> claims that the top two online activities for Americans are social networks / blogs and games. I take the report with a grain of salt, but it seems safe to argue that people have come to expect the internet to be at least as fun as it is useful.</p>
<p>Even search, which would seem to be the poster child for the utility of online services, is being pressed into the service of entertainment. <a href="http://www.cs.swan.ac.uk/~csmax/index.php">Max Wilson</a> and <a href="http://twitter.com/#!/delsweil">David Elsweiler</a> argued as much in their <a href="http://sites.google.com/site/hcirworkshop/hcir-2010">HCIR 2010</a> presentation about &#8220;<a href="http://www.slideshare.net/gingdotslideshare/hcir2010-casualleisure-search">casual leisure searching</a>&#8221;. They mined Twitter to analyze a variety of scenarios where search isn&#8217;t about the user finding something, but rather about enjoying the experience. Indeed, their controversial definition of search is broad enough to include the possibility that the user does not have an information need.</p>
<p>Like the businessman in Antoine de Saint-Exupéry&#8217;s <em><a href="http://gutenberg.net.au/ebooks03/0300771h.html">Le Petit Prince</a></em>, I&#8217;ve long felt that, as &#8220;un homme sérieux&#8221;, my job is delivering utility to users. Users already have lots of ways to waste time; I focus on making their productivity-oriented time more effective and efficient. I&#8217;m glad there are folks who devote their lives to making the rest of us have more fun (especially all the computer scientists who left academia for <a href="http://www.pixar.com/">Pixar</a>), but entertainment simply isn&#8217;t a vocation for me.</p>
<p>However, I&#8217;ve been coming around to the realization that fun and utility are not mutually exclusive. For example, news serves the utilitarian ideal of informing the citizenry, but many (most?) of us read news as a pleasant way to pass the time. Social networks are another example serving a similar function&#8211;perhaps with a balance that tilts more toward the entertainment end of the spectrum, but still providing genuine social utility.</p>
<p>A common feature of both of these examples is that users regularly return to the same site expecting the unexpected. The transient nature of news and social news feeds promises an endless supply of fresh content, produced more quickly than users can consume it. This situation is in stark contrast to those of <a href="http://en.wikipedia.org/wiki/Web_search_query">typical web search queries</a>, for which the results are expected to be largely static. Indeed, we may set up alerts to inform us of novel search results, but we are unlikely to regularly visit a bookmarked search results page the way we regularly visit a news or social network site.</p>
<p>Is novelty the only source of surprise? Novelty certainly helps, but it is not a necessity. An alternative source is randomness. I&#8217;ve known people to use Wikipedia&#8217;s &#8220;<a href="http://en.wikipedia.org/wiki/Special:Random">random article</a>&#8221; feature. But a more plausible place to introduce randomness is in recommendations &#8212; whether for products or content. Since recommendations are good guesses at best, a bit of randomness can help ensure that the guesses are interesting. Indeed, a SIGIR 2010 paper by <a href="http://www.cs.ucl.ac.uk/staff/n.lathia/">Neal Lathia</a>, <a href="http://www.cs.ucl.ac.uk/staff/s.hailes/">Stephen Hailes</a>, <a href="http://www.cs.ucl.ac.uk/staff/l.capra/">Licia Capra</a>, and <a href="http://xavier.amatriain.net/">Xavier Amatriain</a> on &#8220;<a href="http://mobblog.cs.ucl.ac.uk/2010/05/20/temporal-diversity-in-recommender-systems/">Temporal Diversity in Recommender Systems</a>&#8221; explored the use of randomness to induce diversity in recommendations and arrived at the conclusion that people don’t like being recommended the same things over and over again.</p>
<p>Can we generalize from these examples? I think so. For utility-oriented information needs, it is important to provide users with accurate, predictable, and efficient tools. But we can&#8217;t dismiss everything else as frivolous. Sometimes we just need to offer our users a little bit of surprise to keep it interesting.</p>
<p>Or, as <a href="http://www.imdb.com/title/tt0058331/quotes">Mary Poppins</a> tells us: &#8220;In every job that must be done, there is an element of fun. You find the fun, and &#8211; SNAP &#8211; the job&#8217;s a game!&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/11/07/the-element-of-surprise/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>LinkedIn Signal = Exploratory Search for Twitter</title>
		<link>http://thenoisychannel.com/2010/10/02/linkedin-signal-exploratory-search-for-twitter/</link>
		<comments>http://thenoisychannel.com/2010/10/02/linkedin-signal-exploratory-search-for-twitter/#comments</comments>
		<pubDate>Sat, 02 Oct 2010 23:54:58 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3322</guid>
		<description><![CDATA[I like Twitter. Yes, I know that a lot of its content is noise. But I&#8217;ve found Twitter to be a useful professional tool for both publishing and consuming information. Publishing to Twitter is the easy part: I publish links to my blog posts and occasionally engage in public conversations. Consuming information from Twitter is more [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://learn.linkedin.com/twitter/"><img class="alignnone" title="You put your LinkedIn in my Twitter!" src="http://learn.linkedin.com/wp-content/uploads/2009/11/pbandc.jpg" alt="" width="101" height="122" /></a></p>
<p>I like Twitter. Yes, I know that a lot of its content is <a href="http://www.youtube.com/watch?v=PN2HAroA12w">noise</a>. But I&#8217;ve found Twitter to be a useful professional tool for both publishing and consuming information. Publishing to Twitter is the easy part: I publish <a href="http://twitter.com/#!/search/%23thenoisychannel">links</a> to my blog posts and occasionally <a href="http://twitter.com/#!/dtunkelang">engage</a> in public conversations.</p>
<p>Consuming information from Twitter is more of a challenge. I follow <a href="http://twitter.com/#!/dtunkelang/following">100 people</a>, which is about the limit of my <a href="http://thenoisychannel.com/2009/02/27/dunbar-lives/">attention budget</a>. I use saved searches to track long-term interests (much as I use web and news alerts), and I perform ad hoc searches when I am interested in finding out what people are saying about a particular topic.</p>
<p>But Twitter search is not a great fit for analysis or <a href="http://en.wikipedia.org/wiki/Exploratory_search">exploration</a>&#8211;unless you count trending topics as analysis. <a href="http://thenoisychannel.com/2009/03/05/twitter-is-not-a-search-engine/">Originally</a>, the search results were simply the matching tweets in order of recency. The current system sometimes promotes a few &#8220;<a href="http://twitter.com/#!/TopTweets">top tweets</a>&#8221; to the top of the results. Still, if you&#8217;d like to get a summary view, slice and dice the results, or perform any other sort of <a href="http://en.wikipedia.org/wiki/Human%E2%80%93computer_information_retrieval">HCIR</a> task, you&#8217;re out of luck.</p>
<p>Until now.</p>
<p>The LinkedIn <a href="http://sna-projects.com/sna/">Search, Network, and Analytics</a> team&#8211;the same folks who built LinkedIn&#8217;s <a href="http://thenoisychannel.com/2009/12/15/linkedin-faceted-search-now-out-of-beta/">faceted search</a> system and developed open-source search tools <a href="http://sna-projects.com/zoie/">Zoie</a> and <a href="http://sna-projects.com/bobo/">Bobo</a>&#8211;just introduced a service called <a href="http://blog.linkedin.com/2010/09/29/linkedin-signal/">Signal</a> that is squarely aimed at folks like me who use Twitter as a professional tool. It is still in its infancy (in private beta, in fact), but I think it has the potential to dramatically change how people like me use Twitter. You can learn more about its architecture and implementation details <a href="http://sna-projects.com/blog/2010/10/linkedin-signal-a-look-under-the-hood/">here</a>.</p>
<p>Signal joins the often cacophonous Twitter stream to the high-quality structured data that LinkedIn knows about its own users. For example, when I post a tweet, LinkedIn knows that I am in the software industry, work at Google, and live in New York. LinkedIn can only make this connection for people who include Twitter ids in their LinkedIn profiles, but that&#8217;s a substantial and growing population.</p>
<p>Signal then lets you use this structured information to satisfy analytic and exploratory information needs. For example, I can see which companies&#8217; employees are tweeting about software patents (top two are Google and Red Hat).</p>
<p><a href="http://www.linkedin.com/signal/home#software patents?"><img class="alignnone size-full wp-image-3323" title="LinkedIn Signal: software patents" src="http://thenoisychannel.com/wordpress/wp-content/uploads/2010/10/software-patents.png" alt="" width="596" height="425" /></a></p>
<p>Or compare what Microsoft employees are saying about Android&#8230;</p>
<p><a href="http://www.linkedin.com/signal/home#android?company=00000000000000001035"><img class="alignnone size-full wp-image-3324" title="LinkedIn Signal: android, Company = Microsoft" src="http://thenoisychannel.com/wordpress/wp-content/uploads/2010/10/android-microsoft.png" alt="" width="594" height="346" /></a></p>
<p>&#8230;to what Google employees are saying about Android.</p>
<p><a href="http://www.linkedin.com/signal/home#android?company=00000000000000001441"><img class="alignnone size-full wp-image-3325" title="LinkedIn Signal: android, Company = Google" src="http://thenoisychannel.com/wordpress/wp-content/uploads/2010/10/android-google.png" alt="" width="593" height="344" /></a></p>
<p>As you can see on the right-hand side, Signal also mines shared links to identify popular ones relative to a given search&#8211;and allows you to see who has shared a particular link. This functionality is similar to <a href="http://thenoisychannel.com/2009/05/27/topsy-tippling-the-stream-of-conversations/">Topsy</a>, but with the advantage of allowing structured searches. Like Topsy, it wrangles the mass of retweeted links into a useful and user-friendly summary.</p>
<p>Signal is still very much in beta. An amusing bug that I encountered earlier today was that, due to some legacy issues in how LinkedIn standardized institution names, the system decided that I was an alumnus of the <a href="http://www.longy.edu/">Longy School of Music</a> rather than of <a href="http://www.mit.edu/">MIT</a>. Fortunately, that&#8217;s fixed now (thanks, John!)&#8211;I love karaoke, but I&#8217;m not ready to quit my day job!</p>
<p>Also, Signal only exposes a handful of LinkedIn&#8217;s facets, which limits the breadth of analysis and exploration. I&#8217;d love to see it add a past company facet, making it possible to drill down into what a company&#8217;s ex-employees are saying about a particular topic (e.g., their ex-employer).</p>
<p>Finally, while Signal offers Twitter hashtags as a facet, these are hardly a substitute for a topic facet. To provide a useful topic facet, LinkedIn needs to implement some kind of concept extraction (something I&#8217;d also love to see for their regular people search). This is a challenging information extraction problem, especially for the open web, but I also know from <a href="http://www.endeca.com/">experience</a> that it is tractable within a domain. Given LinkedIn&#8217;s professional focus, I believe this is a problem they can and should tackle.</p>
<p>Of course, LinkedIn also needs to convince more of its users to join their LinkedIn accounts to their Twitter accounts&#8211;since that is their input source. But I suspect it&#8217;s mostly a matter of time and education&#8211;and hopefully the buzz around Signal will help raise awareness.</p>
<p>All in all, I see LinkedIn Signal as a great innovation and a big step forward for exploratory search and for Twitter. Congratulations to <a href="http://www.linkedin.com/in/javasoze">John Wang</a>, <a href="http://www.linkedin.com/in/igorperisic">Igor Perisic</a>, and the rest of the LinkedIn search team on the launch!</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/10/02/linkedin-signal-exploratory-search-for-twitter/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Taking Blekko out for a Spin</title>
		<link>http://thenoisychannel.com/2010/08/06/taking-blekko-out-for-a-spin/</link>
		<comments>http://thenoisychannel.com/2010/08/06/taking-blekko-out-for-a-spin/#comments</comments>
		<pubDate>Sat, 07 Aug 2010 02:57:05 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3254</guid>
		<description><![CDATA[If you&#8217;re a search engine junkie like me, you&#8217;ve probably heard about Blekko, a search engine that has been percolating for over two years and recently launched a private beta. If not, I encourage you to watch the TechCrunch video I&#8217;ve embedded above. You can join the beta by following them on Twitter. I did [...]]]></description>
			<content:encoded><![CDATA[<p><script src="http://player.ooyala.com/player.js?embedCode=90cmtrMTom9vae2YoUwJrngW3UCgI2Zu&amp;deepLinkEmbedCode=90cmtrMTom9vae2YoUwJrngW3UCgI2Zu"></script></p>
<p>If you&#8217;re a search engine junkie like me, you&#8217;ve probably heard about <a href="http://blekko.com/">Blekko</a>, a search engine that has been percolating for <a href="http://blekko.com/">over two years</a> and recently <a href="http://searchengineland.com/blekko-a-new-search-engine-that-lets-you-spin-the-web-47215">launched</a> a private beta. If not, I encourage you to watch the TechCrunch video I&#8217;ve embedded above. You can join the beta by following them <a href="http://www.twitter.com/blekko">on Twitter</a>. I did that earlier this week, and my invitation arrived via a direct message the next day.</p>
<p>Blekko&#8217;s main differentiating feature is that it supports &#8220;slashtags&#8221;. These aren&#8217;t the same as the <a href="http://en.wikipedia.org/wiki/Slashtag">Twitter microsyntax</a> proposed by <a href="http://factoryjoe.com/blog/2009/11/08/slashtags/">Chris Messina</a> and named by <a href="http://unthinkingly.com/2009/11/09/slashtags-for-citizen-editors/">Chris Blow</a>. Rather, they are a way for users to &#8220;spin&#8221; their search results using a variety of filters. For example, [climate /liberal] and [climate /conservative] return very different results, because they are restricted to different sets of sites.</p>
<p>In addition to providing a set of curated slashtags, Blekko allows users to define their own slashtags by specifying the sets of sites to be included. There&#8217;s a social aspect here too: you can use (and follow) other users&#8217; slashtags. Blekko also has some special slashtags that don&#8217;t act as site filters, e.g., /date shows recent results and /seo offers indexing information about web sites.</p>
<p>Blekko emphasizes two characteristics that I find very appealing: transparency and user control. While they do not disclose their relevance ranking algorithm, they do expose some of the information they use to compute it. More significantly, their emphasis on slashtags de-emphasizes default ranking and instead encourages users to take more responsibility in the information seeking process. Very <a href="http://en.wikipedia.org/wiki/Human%E2%80%93computer_information_retrieval">HCIR</a>!</p>
<p>I like the concept. But I&#8217;m not sure how I feel about the execution. I have three main concerns.</p>
<p>First, the set of slashtags is somewhat haphazard&#8211;to be expected in a beta, but I&#8217;m not sure how it will evolve. I&#8217;d love to see a vocabulary collectively (and transparently) curated like Wikipedia, but I fear it will look more like social tagging site <a href="http://delicious.com/">Delicious</a>, which is a case study in the &#8220;<a href="http://furnas.people.si.umich.edu/Papers/vocab.paper.pdf">vocabulary problem</a>&#8221;. As any information scientist can tell you, managing vocabularies is hard!</p>
<p>Second, I&#8217;m not sure if site filters are the right model. What happens to sites with heterogeneous content? Or to sites that have one-hit wonders and therefore are unlikely to show up in any slashtags? I&#8217;d prefer to see the sites used as seeds to train classifiers that could then be applied to the entire index. Something a bit more like what <a href="http://people.lis.illinois.edu/~mefron/">Miles Efron</a> implemented in <a href="http://people.lis.illinois.edu/~mefron/papers/efron-libmedia.pdf">this research</a>&#8211;only on a much larger scale and applied at a page rather than site level.</p>
<p>Third, I think there&#8217;s a third ingredient that is essential to complement transparency and user control: guidance. As a user, I need to know what slashtags would lead me to interesting results, and ideally I&#8217;d want some kind of preview to make exploration as low-cost as possible.</p>
<p>I know I&#8217;m asking for a lot&#8211;especially from an ambitious startup that has just launched its private beta. But I think the stakes are high in this space, and going easy on a newcomer is no favor. I offer the tough love of a critic who would really like to see this kind of vision succeed.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/08/06/taking-blekko-out-for-a-spin/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>HCIR 2010 Accepted Papers</title>
		<link>http://thenoisychannel.com/2010/08/03/hcir-2010-accepted-papers/</link>
		<comments>http://thenoisychannel.com/2010/08/03/hcir-2010-accepted-papers/#comments</comments>
		<pubDate>Wed, 04 Aug 2010 01:55:35 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3246</guid>
		<description><![CDATA[The 4th Workshop on Human-Computer Interaction and Information Retrieval (HCIR 2010) is coming up on August 22 in New Brunswick, NJ, taking place immediately after the Information Interaction in Context conference (IIiX 2010). That&#8217;s just a few weeks away! If you are are interested in attending and haven&#8217;t already registered, please let me know as [...]]]></description>
			<content:encoded><![CDATA[<p>The 4th Workshop on Human-Computer Interaction and Information Retrieval <a href="http://www.hcir2010.org/">(HCIR 2010</a>) is coming up on August 22 in New Brunswick, NJ, taking place immediately after the Information Interaction in Context conference (<a href="http://www.iiix2010.org/">IIiX 2010</a>). That&#8217;s just a few weeks away!</p>
<p>If you are interested in attending and haven&#8217;t already registered, please let me know as soon as possible via <a href="mailto:dtunkelang@gmail.com">email</a> or <a href="http://twitter.com/dtunkelang">Twitter</a> (speaking of which, follow the <a href="http://twitter.com/#search?q=%23hcir10">#hcir2010</a> hash tag). We&#8217;re making the remaining slots available to the community on a first-come, first-served basis.</p>
<p>Google user experience researcher <a href="http://sites.google.com/site/dmrussell/">Dan Russell</a> will be delivering this year&#8217;s keynote on &#8220;<a href="http://research.microsoft.com/en-us/um/people/ryenw/hcir2010/keynote.html">Why is search sometimes easy and sometimes hard? Understanding serendipity and expertise in the mind of the searcher</a>&#8221;.</p>
<p>Here is the list of <a href="http://research.microsoft.com/en-us/um/people/ryenw/hcir2010/presentations.html">accepted papers</a>:</p>
<p>Oral Presentations</p>
<ul>
<li>VISTO: for Web Information Gathering and Organization<br />
<em>Anwar Alhenshiri, Carolyn Watters, and Michael Shepherd (Dalhousie University)</em></li>
<li>Time-based Exploration of News Archives<br />
<em>Omar Alonso (Microsoft Corporation), Klaus Berberich (Max-Planck Institute for Informatics), Srikanta Bedathur (Max-Planck Institute for Informatics), and Gerhard Weikum (Max-Planck Institute for Informatics)</em></li>
<li>Combining Computational Analyses and Interactive Visualization to Enhance Information Retrieval<br />
<em>Carsten Goerg, Jaeyeon Kihm, Jaegul Choo, Zhicheng Liu, Sivasailam Muthiah, Haesun Park, and John Stasko (Georgia Institute of Technology)</em></li>
<li>Impact of Retrieval Precision on Perceived Difficulty and Other User Measures<br />
<em>Mark Smucker and Chandra Prakash Jethani (University of Waterloo)</em></li>
<li>Exploratory Searching As Conceptual Exploration<br />
<em>Pertti Vakkari (University of Tampere)</em></li>
<li>Casual-leisure Searching: The Exploratory Search Scenarios that Break our Current Models<br />
<em>Max L. Wilson (Swansea University) and David Elsweiler (University of Erlangen)</em></li>
</ul>
<p><a href="http://research.microsoft.com/en-us/um/people/ryenw/hcir2010/challenge.html">HCIR Challenge</a> Reports</p>
<ul>
<li>Search for Journalists: New York Times Challenge Report<br />
<em>Corrado Boscarino, Arjen P. de Vries, and Wouter Alink (Centrum Wiskunde and Informatica)</em></li>
<li>Exploring the New York Times Corpus with NewsClub<br />
<em>Christian Kohlschütter (Leibniz Universität Hannover)</em></li>
<li>Searching Through Time in the New York Times<br />
<em>Michael Matthews, Pancho Tolchinsky, Roi Blanco, Jordi Atserias, Peter Mika, and Hugo Zaragoza (Yahoo! Labs)</em></li>
<li>News Sync: Three Reasons to Visualize News Better<br />
<em>V.G. Vinod Vydiswaran (University of Illinois), Jeroen van den Eijkhof (University of Washington), Raman Chandrasekar (Microsoft Research), Ann Paradiso (Microsoft Research), and Jim St. George (Microsoft Research)</em></li>
<li>Custom Dimensions for Text Corpus Navigation<br />
<em>Vladimir Zelevinsky (Endeca Technologies)</em></li>
<li>A Retrieval System Based on Sentiment Analysis<br />
<em>Wei Zheng and Hui Fang (University of Delaware)</em></li>
</ul>
<p>Research Posters</p>
<ul>
<li>Improving Web Search for Information Gathering: Visualization in Effect<br />
<em>Anwar Alhenshiri, Carolyn Watters, and Michael Shepherd (Dalhousie University)</em></li>
<li>User-oriented and Eye-Tracking-based Evaluation of an Interactive Search System<br />
<em>Thomas Beckers and Norbert Fuhr (University of Duisburg-Essen)</em></li>
<li>Exploring Combinations of Sources for Interaction Features for Document Re-ranking<br />
<em>Emanuele Di Buccio (University of Padua), Massimo Melucci (University of Padua), and Dawei Song (The Robert Gordon University)</em></li>
<li>Extracting Expertise to Facilitate Exploratory Search and Information Discovery: Combining Information Retrieval Techniques with a Computational Cognitive Model<br />
<em>Wai-Tat Fu and Wei Dong (University of Illinois at Urbana-Champaign)</em></li>
<li>An Architecture for Real-time Textual Query Term Extraction from Images<br />
<em>Cathal Hoare and Humphrey Sorensen (University College Cork)</em></li>
<li>Transaction Log Analysis of User Actions in a Faceted Library Catalog Interface<br />
<em>Bill Kules (The Catholic University of America), Robert Capra (University of North Carolina at Chapel Hill), and Joseph Ryan (North Carolina State University Libraries)</em></li>
<li>Context in Health Information Retrieval: What and Where<br />
<em>Carla Lopes and Cristina Ribeiro (University of Porto)</em></li>
<li>Tactics for Information Search in a Public and an Academic Library Catalog with Faceted Interfaces<br />
<em>Xi Niu and Bradley M. Hemminger (University of North Carolina at Chapel Hill)</em></li>
</ul>
<p>Position Papers</p>
<ul>
<li>Understanding Information Seeking in the Patent Domain and its Impact on the Interface Design of IR Systems<br />
<em>Daniela Becks, Matthias Görtz, and Christa Womser-Hacker (University of Hildesheim)</em></li>
<li>Better Search Applications Through Domain Specific Context Descriptions<br />
<em>Corrado Boscarino, Arjen P. de Vries, and Jacco van Ossenbruggen (Centrum Wiskunde and Informatica)</em></li>
<li>Layered, Adaptive Results: Interaction Concepts for Large, Heterogeneous Data Sets<br />
<em>Duane Degler (Design for Context)</em></li>
<li>Revisiting Exploratory Search from the HCI Perspective<br />
<em>Abdigani Diriye (University College London), Max L. Wilson (Swansea University), Ann Blandford (University College London), and Anastasios Tombros (Queen Mary University London)</em></li>
<li>Supporting Task with Information Appliances: Taxonomy of Needs<br />
<em>Sarah Gilbert, Lori McCay-Peet, and Elaine Toms (Dalhousie University)</em></li>
<li>A Proposal for Measuring and Implementing Group’s Affective Relevance in Collaborative Information Seeking<br />
<em>Roberto González-Ibáñez and Chirag Shah (Rutgers University)</em></li>
<li>Evaluation of Music Information Retrieval: Towards a User-Centered Approach<br />
<em>Xiao Hu (University of Illinois at Urbana-Champaign) and Jingjing Liu (Rutgers University)</em></li>
<li>Information Derivatives: A New Way to Examine Information Propagation<br />
<em>Chirag Shah (Rutgers University)</em></li>
<li>Implicit Factors in Networked Information Feeds<br />
<em>Fred Stutzman (University of North Carolina at Chapel Hill)</em></li>
<li>Improving the Online News Experience<br />
<em>V. G. Vinod Vydiswaran (University of Illinois) and Raman Chandrasekar (Microsoft Research)</em></li>
<li>Breaking Down the Assumptions of Faceted Search<br />
<em>Vladimir Zelevinsky (Endeca Technologies)</em></li>
<li>A Survey of User Interfaces in Content-based Image Search Engines on the Web<br />
<em>Danyang Zhang (The City University of New York)</em></li>
</ul>
<p>You can also download the full proceedings <a href="http://www.hcir2010.org/docs/HCIR2010Proceedings.pdf">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/08/03/hcir-2010-accepted-papers/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Overcoming Spammers in Twitter</title>
		<link>http://thenoisychannel.com/2010/08/02/overcoming-spammers-in-twitter/</link>
		<comments>http://thenoisychannel.com/2010/08/02/overcoming-spammers-in-twitter/#comments</comments>
		<pubDate>Tue, 03 Aug 2010 03:43:20 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3243</guid>
		<description><![CDATA[As I blogged a few months ago, University of Oviedo professor Daniel Gayo-Avello published a research paper entitled “Nepotistic Relationships in Twitter and their Impact on Rank Prestige Algorithms“, in which he concluded that TunkRank was the best of the measures he studied for ranking Twitter users. I recently discovered that he and David Brenes posted slides from [...]]]></description>
			<content:encoded><![CDATA[<p><object id="__sse4504913" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="355" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowScriptAccess" value="always" /><param name="src" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=ceri2010-gayobrenes-imagenes-100615061415-phpapp02&amp;stripped_title=overcoming-spammers-in-twitter-a-tale-of-five-algorithms" /><param name="name" value="__sse4504913" /><param name="allowfullscreen" value="true" /><embed id="__sse4504913" type="application/x-shockwave-flash" width="425" height="355" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=ceri2010-gayobrenes-imagenes-100615061415-phpapp02&amp;stripped_title=overcoming-spammers-in-twitter-a-tale-of-five-algorithms" name="__sse4504913" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<div id="__ss_4504913" style="width: 425px;">
<p>As I blogged <a href="http://thenoisychannel.com/2010/04/07/go-tunkrank/">a few months ago</a>, University of Oviedo professor <a href="http://www.di.uniovi.es/~dani/">Daniel Gayo-Avello</a> published a research paper entitled “<a href="http://arxiv.org/abs/1004.0816">Nepotistic Relationships in Twitter and their Impact on Rank Prestige Algorithms</a>”, in which he concluded that <a href="http://tunkrank.com/">TunkRank</a> was the best of the measures he studied for ranking Twitter users. I recently discovered that he and <a href="http://es.linkedin.com/in/brenes">David Brenes</a> posted slides from their presentation at <a href="http://ir.ii.uam.es/ceri2010/">CERI 2010</a> on &#8220;Overcoming Spammers in Twitter&#8221;. Enjoy!</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/08/02/overcoming-spammers-in-twitter/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Questions. But Why?</title>
		<link>http://thenoisychannel.com/2010/08/01/questions-but-why/</link>
		<comments>http://thenoisychannel.com/2010/08/01/questions-but-why/#comments</comments>
		<pubDate>Sun, 01 Aug 2010 18:41:51 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3231</guid>
		<description><![CDATA[Yahoo! Answers and Answers.com have been around since 2005. But community question answering (as distinct from question answering using natural language processing) has witnessed a resurgence of popularity&#8211;at least in the blogosphere and among investors. Quora and Hunch are two of hottest startups on the web, and Aardvark was acquired by Google earlier this year. Most recently, Ask.com [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://answers.yahoo.com/">Yahoo! Answers</a> and <a href="http://www.answers.com/">Answers.com</a> have been around since 2005. But community question answering (as distinct from <a href="http://en.wikipedia.org/wiki/Question_answering">question answering using natural language processing</a>) has witnessed a resurgence of popularity&#8211;at least in the blogosphere and among investors. <a href="http://www.quora.com/">Quora</a> and <a href="http://hunch.com/">Hunch</a> are two of the hottest startups on the web, and <a href="http://vark.com/">Aardvark</a> was acquired by Google earlier this year. Most recently, <a href="http://www.ask.com/">Ask.com</a> relaunched with a return to its question-answering roots and Facebook began rolling out <a href="http://blog.facebook.com/blog.php?post=411795942130">Facebook Questions</a>.</p>
<p>So there&#8217;s no question that community question answering is hot. The question is why? In particular, is community question answering a step forward or backward relative to today&#8217;s search engines, or is it something different?</p>
<p>Regarding Facebook Questions, Jason Kincaid writes in <a href="http://techcrunch.com/2010/07/28/facebook-qa-service-questions-begins-rolling-out-could-be-massive/">TechCrunch</a>:</p>
<blockquote><p>Given its size, it won’t take long for Facebook to build up a massive amount of data — if that data is consistently reliable, Questions could turn into a viable alternative to Google for many queries.</p></blockquote>
<p>That&#8217;s a big if.  But I think the bigger caveat is the vague quantifier &#8220;many&#8221;. The success of community question answering services will depend on how these services position themselves relative to users&#8217; information needs. Anyone arguing that these services can or should replace today&#8217;s web search engines might want to consider the following examples of information needs that are typical of current search engine use:</p>
<ul>
<li><a href="http://www.google.com/search?q=how+do+i+get+an+iphone+case">How do I get an iPhone case?</a></li>
<li><a href="http://www.google.com/search?q=who+sings+the+choco+latte+song">Who sings the &#8220;choco latte&#8221; song?</a></li>
<li><a href="http://www.google.com/search?q=movies+near+11201">What movies are playing in my neighborhood?</a></li>
<li><a href="http://www.google.com/search?q=how+do+i+get+to+boston+from+new+york">How do I get to Boston from New York?</a></li>
<li><a href="http://www.google.com/search?q=best+selling+netbook">What is the best selling netbook?</a></li>
<li><a href="http://www.google.com/search?q=best+cell+phone+reception+in+new+york">Who offers the best cell phone reception in New York?</a></li>
<li><a href="http://www.google.com/search?q=what+was+the+score+in+the+north+korea+portugal+game">What was the score in the North Korea &#8211; Portugal game?</a></li>
</ul>
<p>I hope I don&#8217;t have to keep going to convince you that web search engines have earned their popularity by serving a broad class of information needs (i.e., answering lots of questions)&#8211;and that&#8217;s without even using the wide variety of personalized and social features that web search engines are rapidly developing.</p>
<p>The common thread in the above questions is that they focus on objective information. In general, such questions are effectively and efficiently answered by search engines based on indexed, published content (including &#8220;<a href="http://en.wikipedia.org/wiki/Deep_Web">deep web</a>&#8221; content made available to search engines via APIs). There&#8217;s a lot of work we can do to improve search engines, particularly in the area of <a href="http://en.wikipedia.org/wiki/Human%E2%80%93computer_information_retrieval">supporting query formulation</a>. But it seems silly and wasteful to route such questions to other people&#8211;human beings should not be reduced to performing tasks at which machines excel.</p>
<p>That said, I agree with Kincaid that there are many information needs that are well addressed by  community question answering. In particular:</p>
<ul>
<li><strong>Questions for which point of view is a feature, not a bug.</strong> Review sites succeed when they provide sincere, informed personal reactions to products and services. Similarly, routing questions to people makes sense when we care about the answerer&#8217;s point of view. For some questions, I want the opinion of someone who shares my taste (which is what Hunch is pursuing with its &#8220;<a href="http://www.businessinsider.com/heres-what-comes-after-the-social-graph-2010-7">taste graph</a>&#8221;). For others, I want a diversity of expert opinions&#8211;for which I might turn to Aardvark (which tries to route questions to topic experts), Quora (where people follow particular topics), or <a href="http://www.linkedin.com/answers/">LinkedIn Answers</a>. Over time, the answers to many such questions can be published and indexed&#8211;and indeed some answers sites receive a <a href="http://twitter.com/Hitwise_US/status/19919086878">large share of their traffic</a> from search engines.</li>
<li><strong>Niche topics.</strong> As much as web search has improved <a href="http://thenoisychannel.com/2008/04/22/accessibility-in-information-retrieval/">information accessibility</a> for the &#8220;long tail&#8221; of published information, the effectiveness of web search can be highly variable for the most obscure information needs. Moreover, this effectiveness depends significantly on the user: some people are better at searching than others, especially in their areas of domain expertise. Social search can help level the playing field. Much as Wikipedia has surfaced much of the expertise at the head of the information distribution, community question answering can help out in the tail.</li>
<li><strong>Community for its own sake.</strong> Even in cases where search engines are more effective and efficient than community question answering services, some people prefer to participate in a social exchange rather than to conduct a transaction with an impersonal algorithm. Indeed, <a href="http://vark.com/aardvarkFinalWWW2010.pdf">researchers at Aardvark</a> found that many of the questions posed through their service (pre-acquisition) could be answered successfully using Google. I&#8217;ll go out on a limb and assume that Aardvark&#8217;s users were early technology adopters who are quite conversant with search engines&#8211;but in some cases chose to use a social alternative simply because they wanted to be social.</li>
</ul>
<p>Conclusions? Community question answering may be overhyped right now, but it isn&#8217;t a fad. There are broad classes of subjective information needs that require a point of view, if not a diversity of views. And even if much of the use of community question answering sites is mediated by search engines indexing their archives, there will always be a need for fresh content. I also believe that social search will continue to be valuable for niche topics, since neither search engines nor searchers will ever be perfect.</p>
<p>But I think the biggest open question is whether people will favor community question answering simply to be social. I conjecture that, by very publicly integrating community question answering into its social networking platform, Facebook is testing the hypothesis that it can turn information seeking from a utilitarian individual task into an entertaining social destination. Given Facebook&#8217;s <a href="http://mashable.com/2009/09/17/facebook-google-time-spent/">highly engaged</a> user population, we won&#8217;t have to wait long to find out.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/08/01/questions-but-why/feed/</wfw:commentRss>
		<slash:comments>21</slash:comments>
		</item>
		<item>
		<title>SIGIR 2010: Day 2 Keynote</title>
		<link>http://thenoisychannel.com/2010/07/22/sigir-2010-day-2-keynote/</link>
		<comments>http://thenoisychannel.com/2010/07/22/sigir-2010-day-2-keynote/#comments</comments>
		<pubDate>Thu, 22 Jul 2010 12:32:36 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3212</guid>
		<description><![CDATA[The second day of the SIGIR 2010 conference kicked off with a keynote by TREC pioneer Donna Harman entitled &#8220;Is the Cranfield Paradigm Outdated?&#8221;. If you are at all familiar with Donna&#8217;s work on TREC, you&#8217;ll hardly be surprised that her answer was a resounding &#8220;NO!&#8221;. But of course she did a lot more than [...]]]></description>
			<content:encoded><![CDATA[<p>The second day of the <a href="http://www.sigir2010.org/">SIGIR 2010</a> conference kicked off with a keynote by <a href="http://trec.nist.gov/">TREC</a> pioneer Donna Harman entitled &#8220;Is the Cranfield Paradigm Outdated?&#8221;. If you are at all familiar with Donna&#8217;s work on TREC, you&#8217;ll hardly be surprised that her answer was a resounding &#8220;NO!&#8221;.</p>
<p>But of course she did a lot more than defend <a href="http://www.iva.dk/bh/core%20concepts%20in%20lis/articles%20a-z/cranfield_experiments.htm">Cranfield</a>. She offered a comprehensive and fascinating history of the Cranfield paradigm, starting with the Cranfield 1 experiments in the late 1950s which evaluated manual indexing systems.</p>
<p>Most importantly, she defined the Cranfield paradigm as defining a metric that reflects a real user model and building the collection before the experiments to prevent human bias and enable reusability. As she noted, this model does not say anything about only returning a ranked list of ten blue links&#8211;which is what most people (myself included) associate with the Cranfield model. Indeed, she urged us to think outside this mindset.</p>
<p>I loved the presentation and found the history enlightening (though <a href="http://research.microsoft.com/en-us/people/robertson/">Stephen Robertson</a> corrected a few minor details). Still, I wondered if she was defining the Cranfield paradigm so broadly as to co-opt all of its critics.  But I think the clear dividing line between Cranfield and non-Cranfield is whether user effects are something to avoid or embrace. I perceive the success of Cranfield as coming in large part from its reduction of user effects. But I think that much of the <a href="http://en.wikipedia.org/wiki/Human%E2%80%93computer_information_retrieval">HCIR</a> community sees user effects as precisely what we need to be evaluating for information seeking support systems.</p>
<p>In any case, it was a great keynote, and Donna promises me she will make the slides available. Of course I&#8217;ll post them here. In the meantime, check out <a href="http://www.searchenginecaffe.com/2010/07/sigir-2010-keynote-donna-harmon-on.html">Jeff Dalton&#8217;s notes</a> on his great blog and the tweets at <a href="http://search.twitter.com/search?q=%23sigir2010">#sigir2010</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/07/22/sigir-2010-day-2-keynote/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>SIGIR 2010: Day 1 Posters</title>
		<link>http://thenoisychannel.com/2010/07/21/sigir-2010-day-1-posters/</link>
		<comments>http://thenoisychannel.com/2010/07/21/sigir-2010-day-1-posters/#comments</comments>
		<pubDate>Wed, 21 Jul 2010 07:19:23 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3208</guid>
		<description><![CDATA[The first day of SIGIR 2010 ended with a monster poster session&#8211;over 100 posters to see in 2 hours in a hall without air conditioning! I managed to see a handful: &#8220;Query Quality: User Ratings and System Predictions&#8221; by Claudia Hauff, Franciska de Jong, Diane Kelly, and Leif Azzopardi offered the startling (to me at [...]]]></description>
			<content:encoded><![CDATA[<p>The first day of <a href="http://www.sigir2010.org/">SIGIR 2010</a> ended with a monster <a href="http://www.sigir2010.org/doku.php?id=program:posters">poster session</a>&#8211;over 100 posters to see in 2 hours in a hall without air conditioning! I managed to see a handful:</p>
<ul>
<li>&#8220;Query Quality: User Ratings and System Predictions&#8221; by Claudia Hauff, Franciska de Jong, Diane Kelly, and Leif Azzopardi offered the startling (to me at least) result that human prediction of query difficulty did not correlate (or at best correlated weakly) to post-retrieval <a href="http://thenoisychannel.com/2010/05/23/estimating-the-query-difficulty-for-information-retrieval/">query performance prediction</a> (QPP) measures like query clarity. I talked with Diane about it, and I wonder how strongly the human prediction, which was pre-retrieval, would correlate to human assessments of the results. I also don&#8217;t know how well the QPP measures she used apply to web search contexts.</li>
<li>Which leads me to the next poster I saw, &#8220;Predicting Query Performance on the Web&#8221; by Niranjan Balasubramanian, Giridhar Kumaran, and Vitor Carvalho. They offered what I saw as a much more encouraging result&#8211;namely that QPP is highly reliable when it returns low scores. In other words, a search engine may wrongly believe that it did well on a query, but it is almost certainly right when it thinks it failed. This certainty on the negative side is exactly the opening that <a href="http://en.wikipedia.org/wiki/Human%E2%80%93computer_information_retrieval">HCIR</a> advocates need to offer richer interaction for queries where a conventional ranking approach recognizes its own failure. While some of the specifics of the authors&#8217; approach are proprietary (they perform regression on features used by Bing), the approach seems broadly applicable.</li>
<li>Next I saw &#8220;Hashtag Retrieval in a Microblogging Environment&#8221; by Miles Efron. He provided evidence that hashtags could be an effective foundation for query expansion of Twitter search queries, using a <a href="http://en.wikipedia.org/wiki/Language_model">language model</a> approach. The approach may generalize beyond hashtags, but hashtags do have the advantage of being highly topical and relatively unambiguous by convention.</li>
<li>&#8220;The Power of Naive Query Segmentation&#8221; by Matthias Hagen, Martin Potthast, Benno Stein, and Christof Brautigam suggested a simple approach for segmenting long queries into quoted phrases: consider all segmentations and, for a given segmentation, compute a weighted sum of the <a href="http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-you.html">Google ngram counts</a> for each quoted phrase, the weight of a phrase of length <em>s</em> being <em>s</em><sup><em>s</em></sup>. I don&#8217;t find the weighting particularly intuitive, but the accuracy numbers they present look quite nice relative to more sophisticated approaches.</li>
<li>&#8220;Investigating the Suboptimality and Instability of Pseudo-Relevance Feedback&#8221; by Raghavendra Udupa and Abhijit Bhole showed that an oracle with knowledge of a few high-scoring non-relevant documents could vastly improve the performance of <a href="http://en.wikipedia.org/wiki/Relevance_feedback#Blind_feedback">pseudo-relevance feedback</a>. While this information does not lead directly to any applications, it does suggest that obtaining a very small amount of feedback from the user might go a long way. I&#8217;m curious how much is possible from even a single negative-feedback input.</li>
<li>&#8220;Short Text Classification in Twitter to Improve Information Filtering&#8221; by Bharath Sriram, David Fuhry, Engin Demir, Hakan Ferhatosmanoglu, and Murat Demirbas challenged the conventional wisdom that tweets are too short for traditional classification methods. They achieved nice results, but on the relatively simple problem of classifying tweets as news, events, opinions, deals, and private messages. I was offered promises of future work, but I think the more general classification problem is much harder.</li>
<li>&#8220;Metrics for Assessing Sets of Subtopics&#8221; by Filip Radlinski, Martin Szummer, and Nick Craswell proposed an evaluation framework for result diversity based on coherence, distinctness, plausibility, and completeness. I suggested that this framework would apply nicely to <a href="http://en.wikipedia.org/wiki/Faceted_search">faceted search</a> interfaces, and that I&#8217;d love to see it demonstrated on production systems&#8211;especially since I think that might be easier to achieve than convincing the SIGIR community to embrace it.</li>
<li>Which leads me nicely to the last poster I saw, &#8220;Machine Learned Ranking of Entity Facets&#8221; by Roelof van Zwol, Lluis Garcia Pueyo, Mridul Muralidharan, and Borkur Sigurbjornsson. They found that they could accurately predict click-through rates on named entity facets (people, places) by learning from click logs. It&#8217;s worth noting that their entity facets are extremely clean, since they are derived from sources like Wikipedia, IMDB, GeoPlanet, and Freebase. It&#8217;s not clear to me how well their approach would work for noisier facets extracted from open-domain data.</li>
</ul>
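<p>To make the scoring in the query segmentation poster concrete, here is a minimal Python sketch. It is not the authors&#8217; code: the counts are made-up numbers standing in for the Google ngram counts, the function names are hypothetical, and it assumes that one-word segments are left unquoted and contribute nothing to the score.</p>

```python
from itertools import product


def segmentations(words):
    """Yield every split of `words` into contiguous, non-empty segments."""
    if not words:
        return
    # Each gap between adjacent words is either a segment boundary or not.
    for cuts in product([False, True], repeat=len(words) - 1):
        segments, start = [], 0
        for position, is_cut in enumerate(cuts, start=1):
            if is_cut:
                segments.append(tuple(words[start:position]))
                start = position
        segments.append(tuple(words[start:]))
        yield segments


def score(segmentation, ngram_counts):
    """Sum s**s * count(phrase) over the multi-word (quoted) segments."""
    total = 0
    for segment in segmentation:
        s = len(segment)
        if s == 1:
            continue  # assumption: single words are not quoted phrases
        total += (s ** s) * ngram_counts.get(" ".join(segment), 0)
    return total


def best_segmentation(query, ngram_counts):
    """Return the highest-scoring segmentation of a non-empty query."""
    words = query.split()
    return max(segmentations(words), key=lambda seg: score(seg, ngram_counts))


# Made-up counts for illustration only; real counts would come from an n-gram corpus.
TOY_COUNTS = {
    "new york": 1000,
    "times square": 500,
    "new york times": 100,
    "york times": 50,
    "new york times square": 1,
}

best = best_segmentation("new york times square", TOY_COUNTS)
print(best, score(best, TOY_COUNTS))
```

<p>With these toy counts, the highest-scoring split quotes &#8220;new york&#8221; and &#8220;times square&#8221;. The steep weight is what allows a longer phrase to outscore its more frequent sub-phrases, provided the longer phrase is itself reasonably common. Enumerating all segmentations is exponential in the number of words, which seems tolerable for queries of typical length.</p>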
<p>As I said, there were over a hundred posters, and I&#8217;d meant to see far more of them. Hopefully other people will blog about some of them! Or perhaps tweet about them at <a href="http://search.twitter.com/search?q=%23sigir2010">#sigir2010</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/07/21/sigir-2010-day-1-posters/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SIGIR 2010: Day 1 Technical Sessions</title>
		<link>http://thenoisychannel.com/2010/07/21/sigir-2010-day-1-technical-sessions/</link>
		<comments>http://thenoisychannel.com/2010/07/21/sigir-2010-day-1-technical-sessions/#comments</comments>
		<pubDate>Wed, 21 Jul 2010 07:08:27 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3204</guid>
		<description><![CDATA[I&#8217;ve always felt that parallel conference sessions are designed to optimize for anticipated regret, and SIGIR 2010 is no exception. I decided that I&#8217;d try to attend whole sessions rather than shuttle between them. I started by attending the descriptively titled &#8220;Applications I&#8221; session. Jinyoung Kim of UMass presented joint work with Bruce Croft on [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve always felt that parallel conference sessions are designed to optimize for <a href="http://researchstories.asu.edu/2007/10/anticipated_regret_takes_out_t.html">anticipated regret</a>, and <a href="http://www.sigir2010.org/">SIGIR 2010</a> is no exception. I decided that I&#8217;d try to attend whole sessions rather than shuttle between them. I started by attending the descriptively titled &#8220;Applications I&#8221; session.</p>
<p>Jinyoung Kim of UMass presented joint work with Bruce Croft on &#8220;Ranking using Multiple Document Types in Desktop Search&#8221; in which they showed that type prediction can significantly improve known-item search performance in simulated desktop settings. I like the approach and result, but I&#8217;d be very interested to see how well it applied to more recall-oriented tasks.</p>
<p>Then came work by Googlers Enrique Alfonseca, Marius Pasca, and Enrique Robledo-Arnuncio on &#8220;Acquisition of Instance Attributes via Labeled and Related Instances&#8221; that overcomes the data sparseness of open-domain attribute extraction by computing relationships among instances and injecting this relatedness data into the instance-attribute graph so that attributes can be propagated to more instances. This is a nice enhancement to <a href="http://research.google.com/pubs/author107.html">earlier work by Pasca and others</a> on obtaining these  instance-attribute graphs.</p>
<p>The session ended with an intriguing paper on &#8220;Relevance and Ranking in Online Dating Systems&#8221; by Yahoo researchers Fernando Diaz, Donald Metzler, and Sihem Amer-Yahia that formulated a two-way relevance model for matchmaking systems but unfortunately found that it did no better than query-independent ranking in the context of a production personals system. I would be very interested to see how the model applied to other matchmaking scenarios, such as matching job seekers to employers.</p>
<p>After a wonderful lunch hosted by <a href="http://www.morganclaypool.com/">Morgan &amp; Claypool</a> for <a href="http://www.amazon.com/Synthesis-Lectures-Information-Concepts-Retrieval/dp/1598299999">authors</a>, I attended a session on Filtering and Recommendation.</p>
<p>It started with a paper on &#8220;Social Media Recommendation Based on People and Tags&#8221; by IBM researchers Ido Guy, Naama Zwerdling, Inbal Ronen, David Carmel, and Erel Uziel. They analyzed item recommendation in an enterprise setting and found that a hybrid approach combining algorithmic tag-based recommendations with people-based recommendations achieves better performance at delivering interesting recommendations than either approach alone. I&#8217;m curious how well these results generalize outside of enterprise settings&#8211;or even how well they apply across the large variation in enterprises.</p>
<p>Then came work by Nikolaos Nanas, Manolis Vavalis, and Anne De Roeck on &#8220;A Network-Based Model for High-Dimensional Information Filtering&#8221;. The authors propose to overcome the &#8220;curse of dimensionality&#8221; of vector space representations of profiles by instead modeling keyword dependencies in a directed graph and applying a non-iterative activation model to it. The presentation was excellent, but I&#8217;m not entirely convinced by the baseline they used for their comparisons.</p>
<p>After that was a paper by Neal Lathia, Stephen Hailes, Licia Capra, and Xavier Amatriain on &#8220;Temporal Diversity in Recommender Systems&#8221;. They focused on the problem that users get bored and frustrated by recommender systems that keep recommending the same items over time. They provided evidence that users prefer temporal diversity of recommendations and suggested some methods to promote it. I like the research, but I still think that <a href="http://thenoisychannel.com/2008/11/21/the-napoleon-dynamite-problem/">recommendation engines cry out for transparency</a>, and that transparency can also help address the diversity problem&#8211;e.g., pick a random movie the user watched and propose recommendations explicitly based on that movie.</p>
<p>Unfortunately I missed the last paper of the session, in which Noriaki Kawamae talked about &#8220;Serendipitous Recommendations via Innovators&#8221;.</p>
<p>Reminder: also check out the tweet stream with hash tag <a href="http://search.twitter.com/search?q=%23sigir2010">#sigir2010</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/07/21/sigir-2010-day-1-technical-sessions/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>SIGIR 2010: Day 1 Keynote</title>
		<link>http://thenoisychannel.com/2010/07/21/sigir-2010-day-1-keynote/</link>
		<comments>http://thenoisychannel.com/2010/07/21/sigir-2010-day-1-keynote/#comments</comments>
		<pubDate>Wed, 21 Jul 2010 06:44:32 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3201</guid>
		<description><![CDATA[As promised, here are some highlights of the SIGIR 2010 conference thus far. Also check out the tweet stream with hash tag #sigir2010. I arrived here on Monday, too jet-lagged to even imagine attending the tutorials, but fortunately I recovered enough to go to the welcome reception in the Parc des Bastions that evening. Then [...]]]></description>
			<content:encoded><![CDATA[<p>As promised, here are some highlights of the <a href="http://www.sigir2010.org/">SIGIR 2010</a> conference thus far. Also check out the tweet stream with hash tag <a href="http://search.twitter.com/search?q=%23sigir2010">#sigir2010</a>.</p>
<p>I arrived here on Monday, too jet-lagged to even imagine attending the <a href="http://www.sigir2010.org/doku.php?id=program:tutorials">tutorials</a>, but fortunately I recovered enough to go to the welcome reception in the <a href="http://www.bastions.ch/">Parc des Bastions</a> that evening. Then a night of sleep and on to the main event.</p>
<p>Tuesday morning kicked off with a keynote by Microsoft Live Labs director <a href="http://flakenstein.net/">Gary Flake</a> entitled &#8220;Zoomable UIs, Information Retrieval, and the Uncanny Valley&#8221;. Flake&#8217;s premise is that information retrieval is stuck in the &#8220;<a href="http://en.wikipedia.org/wiki/Uncanny_valley">uncanny valley</a>&#8221;, a metaphor he borrows from the robotics community. According to Wikipedia:</p>
<blockquote><p>The theory holds that when robots and other facsimiles of humans look and act almost like actual humans, it causes a response of revulsion among human observers. The &#8220;valley&#8221; in question is a dip in a proposed graph of the positivity of human reaction as a function of a robot&#8217;s lifelikeness.</p></blockquote>
<p>Flake offered <a href="http://en.wikipedia.org/wiki/Grokker">Grokker</a> (R.I.P.) as an example of a search interface that emphasized visual clustering and got stuck in the uncanny valley. He called it &#8220;the sexiest search experience that no one was going to use&#8221;. Flake then went on to propose that moving beyond the uncanny valley would require replacing our current discrete interactions with search engines with a mode of continuous, fluid interaction in which the whole of the data is greater than the sum of its parts. He offered some demos, emphasizing the recently released <a href="http://www.getpivot.com/">Pivot</a> client, that he felt provided a vision to overcome the uncanny valley.</p>
<p>As became clear in the question and answer period, many people (myself included) felt that this rich visual approach might work well for browsing images but is not as clear a fit for text-oriented information needs&#8211;despite Flake offering a demo based on the collection of Wikipedia documents. In fairness, it may be too early to assess a proof of concept.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/07/21/sigir-2010-day-1-keynote/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>The War on Attention Poverty: Measuring Twitter Authority</title>
		<link>http://thenoisychannel.com/2010/07/13/the-war-on-attention-poverty-measuring-twitter-authority/</link>
		<comments>http://thenoisychannel.com/2010/07/13/the-war-on-attention-poverty-measuring-twitter-authority/#comments</comments>
		<pubDate>Wed, 14 Jul 2010 03:13:45 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3195</guid>
		<description><![CDATA[I gave this presentation today at AT&#38;T Labs, hosted by Stephen North of Graphviz fame. The talk was recorded, but I don&#8217;t know when the video will be available. In the meantime, here are the slides. The audience was very engaged and questioned just about all of the TunkRank model&#8217;s assumptions. I&#8217;m hopeful that as [...]]]></description>
			<content:encoded><![CDATA[<div id="__ss_4749609" style="width: 425px;"><object id="__sse4749609" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="355" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowScriptAccess" value="always" /><param name="src" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=waronattentionpoverty-100713213804-phpapp01&amp;stripped_title=the-war-on-attention-poverty-measuring-twitter-authority" /><param name="name" value="__sse4749609" /><param name="allowfullscreen" value="true" /><embed id="__sse4749609" type="application/x-shockwave-flash" width="425" height="355" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=waronattentionpoverty-100713213804-phpapp01&amp;stripped_title=the-war-on-attention-poverty-measuring-twitter-authority" name="__sse4749609" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<div style="padding: 5px 0 12px;">
<p>I gave this presentation today at <a href="http://www.research.att.com/editions/201005_home.html">AT&amp;T Labs</a>, hosted by <a href="http://www.research.att.com/people/North_Stephen_C">Stephen North</a> of <a href="http://www.graphviz.org/">Graphviz</a> fame. The talk was recorded, but I don&#8217;t know when the video will be available. In the meantime, here are the slides.</p>
<p>The audience was very engaged and questioned just about all of the TunkRank model&#8217;s assumptions. I&#8217;m hopeful that as <a href="http://mendicantbug.com/about/">Jason Adams</a> and <a href="http://www.linkedin.com/in/israelkloss">Israel Kloss</a> work on making a business out of <a href="http://tunkrank.com/">TunkRank</a>, they&#8217;ll bridge some of the gap between simplicity and realism.</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/07/13/the-war-on-attention-poverty-measuring-twitter-authority/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Are Links A Distraction?</title>
		<link>http://thenoisychannel.com/2010/05/31/are-links-a-distraction/</link>
		<comments>http://thenoisychannel.com/2010/05/31/are-links-a-distraction/#comments</comments>
		<pubDate>Tue, 01 Jun 2010 01:15:24 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3166</guid>
		<description><![CDATA[Eric Andersen called my attention to a post by Nick Carr entitled &#8220;Experiments in delinkification&#8221;, in which Carr argues that links embedded in text are distracting, and that we&#8217;re better off treating them like the footnotes they evolved from and putting them in a block at the end of the text. It&#8217;s an interesting piece, [...]]]></description>
			<content:encoded><![CDATA[<p>Eric Andersen <a href="http://twitter.com/eric_andersen/status/15140732425">called my attention</a> to a post by Nick Carr entitled &#8220;<a href="http://www.roughtype.com/archives/2010/05/experiments_in.php">Experiments in delinkification</a>&#8221;, in which Carr argues that links embedded in text are distracting, and that we&#8217;re better off treating them like the footnotes they evolved from and putting them in a block at the end of the text. It&#8217;s an interesting piece, and I see the merits of his argument. Indeed, I remember trying to read a heavily annotated edition of Nabokov&#8217;s <em><a href="http://books.google.com/books?id=UJznorXbTuYC">Lolita</a></em>, and it was extremely hard to maintain the flow of reading the novel while turning every few seconds to read about every last <a href="http://en.wikipedia.org/wiki/Vladimir_Nabokov#Entomology">entomology</a> reference in the text.</p>
<p>Nonetheless, I feel that links supply context, and I&#8217;m a fan of keeping context nearby. Indeed, I find that clicking on a link incurs a much lower cognitive cost than flipping to the back of the book, searching for an endnote. I&#8217;ve had readers specifically thank me for including links to Wikipedia entries for technical terms. I assume those readers are fully capable of finding those Wikipedia entries themselves, but that they appreciate the convenience of the links.</p>
<p>Some of the commenters on Carr&#8217;s post suggest that we use technology to address this tension between preserving the reader&#8217;s focus and supplying nearby context. Specifically, we can use <a href="http://en.wikipedia.org/wiki/Cascading_Style_Sheets">CSS</a> and have a <a href="http://en.wikipedia.org/wiki/JavaScript">JavaScript</a> button that toggles the link style between visible and invisible. I like the idea of handing readers control of the presentation style, though I still think it&#8217;s important to pick a sensible default. At the very least, a document should be self-contained so that a reader can choose if and when to look at the material it cites. The document should also give credit where it&#8217;s due, linking to the material it cites in a way that is visible to people and search engines. Beyond that, I think it&#8217;s really a matter of author style.</p>
<p>Still, I&#8217;m curious what folks here&#8211;especially long-time readers&#8211;think. Do I link so heavily that it&#8217;s distracting? Would it be easier to read my posts if the links were in a block at the end? I write for you, so please let me know how I can make this blog better. I don&#8217;t have the resources to conduct <a href="http://en.wikipedia.org/wiki/Cognitive_load">cognitive load</a> experiments, but I&#8217;m very receptive to comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/05/31/are-links-a-distraction/feed/</wfw:commentRss>
		<slash:comments>26</slash:comments>
		</item>
		<item>
		<title>Peter Morville&#8217;s Keynote at Enterprise Search Summit</title>
		<link>http://thenoisychannel.com/2010/05/12/peter-morvilles-keynote-at-enterprise-search-summit/</link>
		<comments>http://thenoisychannel.com/2010/05/12/peter-morvilles-keynote-at-enterprise-search-summit/#comments</comments>
		<pubDate>Wed, 12 May 2010 16:47:41 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3132</guid>
		<description><![CDATA[This morning&#8217;s Enterprise Search Summit keynote was by Peter Morville, who has written a number of best-selling books about information architecture. I&#8217;ve known Peter for a while and had the pleasure of serving as a reviewer for his latest book, Search Patterns, but had never seen him present this material live. As you can see [...]]]></description>
			<content:encoded><![CDATA[<p>This morning&#8217;s <a href="http://www.enterprisesearchsummit.com/2010/">Enterprise Search Summit</a> keynote was by <a href="http://semanticstudios.com/about/">Peter Morville</a>, who has written a number of best-selling books about <a href="http://en.wikipedia.org/wiki/Information_architecture">information architecture</a>. I&#8217;ve known Peter for a while and had the pleasure of serving as a reviewer for his latest book, <em><a href="http://searchpatterns.org/">Search Patterns</a></em>, but had never seen him present this material live. As you can see from his <a href="http://www.slideshare.net/morville/search-discovery-patterns">slides</a>, Peter&#8217;s presentation style is incredibly visual&#8211;almost all of his slides are screenshots or illustrations explaining his concepts. It makes for a great presentation, but a difficult text summary!</p>
<p>The focus of his talk, naturally, was patterns. Specifically, he advocated that we take the behavior patterns of information seekers that library and information scientists have been studying for years, and use them to inform design patterns for search user interfaces.</p>
<p>One point he raised that deserves a deeper dive: a number of media environments (mobile, kiosk, TV) push people to browse, partly because of the limitations of the medium but also to take advantage of its novelty and the relative lack of user habits. Unfortunately, browsing doesn&#8217;t always scale in those environments, so search is usually available as a contingency.</p>
<p>Interestingly, while Peter promotes rich interfaces in many of his patterns, he noted that great results ranking plus speedy response (he uses Google &#8220;classic&#8221; as his example) does allow users to rapidly reformulate their queries while staying in the flow of the information seeking experience. He returned to Google later in his talk, noting that the new interface goes beyond ranking to support a richer user interaction.</p>
<p>And, like me and <a href="http://people.ischool.berkeley.edu/~hearst/">Marti Hearst</a> (yesterday&#8217;s <a href="http://thenoisychannel.com/2010/05/11/marti-hearsts-keynote-at-enterprise-search-summit/">keynote</a>), Peter advocates <a href="http://en.wikipedia.org/wiki/Faceted_search">faceted navigation</a> (I won&#8217;t quibble on whether to call it navigation or search) as his favorite search design pattern. He uses the <a href="http://www.lib.ncsu.edu/">NCSU library</a> as an example not only of a great implementation but also of an organization that continues to experiment with incremental design changes. He also showed faceted search examples from other domains, including <a href="http://amazon.com/">Amazon</a> and <a href="http://buzzillions.com/">Buzzillions</a>.</p>
<p>Other patterns he discussed included <a href="http://en.wikipedia.org/wiki/Question_answering">question answering</a> (his example being <a href="http://thenoisychannel.com/2009/03/31/wolfram-alpha-first-hand-impressions/">Wolfram Alpha</a>) and decision making (his example being <a href="http://hunch.com/">Hunch</a>). He didn&#8217;t go deep on these, but rather invited the audience to consider a broad palette of strategies for supporting information seeking. Indeed, when I asked him about question answering, he conceded that he was a skeptic and preferred a conversational (i.e., <a href="http://en.wikipedia.org/wiki/Human%E2%80%93computer_information_retrieval">HCIR</a>) approach akin to a librarian&#8217;s <a href="http://en.wikipedia.org/wiki/Reference_interview">reference interview</a>.</p>
<p>His closing note was about bridging the gap between physical and digital information, where he offered a potpourri of examples (from <a href="http://www.redbox.com/">Redbox</a> to a <a href="http://www.botanicalls.com/kits/">tweeting plant</a>). I <a href="http://thenoisychannel.com/2010/05/09/celebrating-six-months-at-google-new-york/">work in local search</a>, so in my case he&#8217;s preaching to the converted. But I think he&#8217;s right that everything is only recently coming together&#8211;specifically, the ubiquity of digital data on the internet and of mobile devices in the physical world that can both consume and produce that data. Many of us take these developments for granted, but it&#8217;s important that we adapt our approach to search to address what is a very recent phenomenon.</p>
<p>Fun stuff! I didn&#8217;t get to attend the rest of the summit, but I encourage you to check out the tweet stream at <a href="http://search.twitter.com/search?q=%23ESS10">#ESS10</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/05/12/peter-morvilles-keynote-at-enterprise-search-summit/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Marti Hearst&#8217;s Keynote at Enterprise Search Summit</title>
		<link>http://thenoisychannel.com/2010/05/11/marti-hearsts-keynote-at-enterprise-search-summit/</link>
		<comments>http://thenoisychannel.com/2010/05/11/marti-hearsts-keynote-at-enterprise-search-summit/#comments</comments>
		<pubDate>Tue, 11 May 2010 20:01:47 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3127</guid>
		<description><![CDATA[The Enterprise Search Summit is taking place in New York this week, and I was lucky to be able to attend Marti Hearst&#8217;s opening keynote this morning about designing search for humans. If you&#8217;ve read her book or heard her present its material, then you&#8217;re probably familiar with the pitch she made. Still, it&#8217;s great [...]]]></description>
			<content:encoded><![CDATA[<p>The Enterprise Search Summit is taking place in New York this week, and I was lucky to be able to attend Marti Hearst&#8217;s opening keynote this morning about designing search for humans. If you&#8217;ve read her <a href="http://searchuserinterfaces.com/book/">book</a> or heard her present its material, then you&#8217;re probably familiar with the pitch she made. Still, it&#8217;s great to hear her present it live to a very non-academic audience.</p>
<p>Her major take-aways:</p>
<ul>
<li>The user&#8217;s emotional response is a key aspect of the <a href="http://en.wikipedia.org/wiki/Information_seeking">information seeking</a> experience.</li>
<li>There is a double vocabulary problem: there are different ways to express the same concept (cf. <a href="http://furnas.people.si.umich.edu/Papers/vocab.paper.pdf">Furnas et al.</a>), and users stubbornly anchor on their initial query terms (cf. <a href="http://en.wikipedia.org/wiki/Anchoring">Kahneman, Tversky, et al.</a>).</li>
<li><a href="http://en.wikipedia.org/wiki/Recognition_memory">Recognition</a> is easier than <a href="http://en.wikipedia.org/wiki/Recall_(memory)">recall</a>, so interfaces need to support the recognition process.</li>
<li>Don&#8217;t <a href="http://en.wikipedia.org/wiki/Personalization">personalize</a> search, <a href="http://en.wikipedia.org/wiki/Social_search">socialize</a> it!</li>
</ul>
<p>She peppered her talk with concrete examples and scholarly references. Given that her <a href="http://searchuserinterfaces.com/book/">book</a> is available online for free, I won&#8217;t try to replicate them all here! Still, I&#8217;ll single out two <a href="http://thenoisychannel.com/the-noisy-community/">Noisy Community</a> members: FXPAL researchers <a href="http://fxpal.com/?p=jeremy">Jeremy Pickens</a> and <a href="http://fxpal.com/?p=gene">Gene Golovchinsky</a> (for their SIGIR 2008 work on <a href="http://fxpal.com/publications/FXPAL-PR-08-460.pdf">collaborative exploratory search</a>) and user experience designer <a href="http://www.designcaffeine.com/about/">Greg Nudelman</a> for his proposal of <a href="http://www.boxesandarrows.com/view/faceted-finding-with">faceted breadcrumbs</a> as a search user interface.</p>
<p>If you missed her live, you can find a video of a <a href="http://thenoisychannel.com/2009/11/25/marti-hearst-tech-talk-on-search-user-interfaces/">tech talk</a> she gave at Google a few months ago. You can also check out the conference tweet-stream at <a href="http://search.twitter.com/search?q=%23ESS10">#ESS10</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/05/11/marti-hearsts-keynote-at-enterprise-search-summit/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>TunkRank scores added to FluidDB</title>
		<link>http://thenoisychannel.com/2010/05/05/tunkrank-scores-added-to-fluiddb/</link>
		<comments>http://thenoisychannel.com/2010/05/05/tunkrank-scores-added-to-fluiddb/#comments</comments>
		<pubDate>Thu, 06 May 2010 01:39:42 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=3116</guid>
		<description><![CDATA[For those keeping track of TunkRank, I encourage you to check out FluidDB, which just added TunkRank scores to its feature set. That lets you do cool things like find out which users I follow have a TunkRank score over 40. You can also read what Jason Adams has to say about it here. Speaking [...]]]></description>
			<content:encoded><![CDATA[<p>For those keeping track of <a href="http://tunkrank.com/">TunkRank</a>, I encourage you to check out <a href="http://fluidinfo.com/fluiddb">FluidDB</a>, which just <a href="http://blogs.fluidinfo.com/fluidDB/2010/05/06/tunkrank-scores-added-to-fluiddb/">added TunkRank scores</a> to its feature set. That lets you do cool things like find out <a href="http://tickery.net/?query=has%20twitter.com/friends/dtunkelang%20and%20tunkrank.com/score%20%3E%2040&amp;sort=screen_name&amp;icon=medium&amp;tab=advanced">which users I follow have a TunkRank score over 40</a>. You can also read what <a href="http://mendicantbug.com/">Jason Adams</a> has to say about it <a href="http://mendicantbug.com/2010/05/05/tunkrank-meet-tickery/">here</a>.</p>
<p>Speaking of Jason, check out the latest improvements he&#8217;s made to the <a href="http://tunkrank.com/">TunkRank</a> interface. Pretty slick! To learn more about the TunkRank measure of Twitter influence / authority, check out <a href="http://thenoisychannel.com/2009/01/13/a-twitter-analog-to-pagerank/">this post</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/05/05/tunkrank-scores-added-to-fluiddb/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Google Follow Finder</title>
		<link>http://thenoisychannel.com/2010/04/14/google-follow-finder/</link>
		<comments>http://thenoisychannel.com/2010/04/14/google-follow-finder/#comments</comments>
		<pubDate>Thu, 15 Apr 2010 00:15:34 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=3089</guid>
		<description><![CDATA[I know there&#8217;s lots of interesting stuff coming out at the Chirp Twitter developer conference this week, and I&#8217;m still catching up on it all. But I am happy to point folks to a Google Labs application that was announced this morning: Follow Finder. It&#8217;s not the first application to suggest Twitter followers based on analysis [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.followfinder.googlelabs.com/"><img class="alignnone" title="Google Follow Finder" src="http://2.bp.blogspot.com/_7ZYqYi4xigk/S8YmZNs5CXI/AAAAAAAAF10/eq616058q34/s1600/screenfinal.png" alt="" width="496" height="201" /></a></p>
<p>I know there&#8217;s lots of interesting stuff coming out at the <a href="http://chirp.twitter.com/">Chirp</a> Twitter developer conference this week, and I&#8217;m still catching up on it all. But I am happy to point folks to a Google Labs application that was <a href="http://googleblog.blogspot.com/2010/04/google-follow-finder-find-some-sweet.html">announced</a> this morning: <a href="http://www.followfinder.googlelabs.com/">Follow Finder</a>.</p>
<p>It&#8217;s not the first application to suggest Twitter followers based on analysis of the social graph, but I&#8217;ve actually found its suggestions to be quite plausible. For example, it suggests @<a href="http://twitter.com/fredwilson">fredwilson</a>, @<a href="http://twitter.com/cshirky">cshirky</a>, @<a href="http://twitter.com/mattcutts">mattcutts</a>, @<a href="http://twitter.com/peteskomoroch">peteskomoroch</a>, and @<a href="http://twitter.com/msftresearch">msftresearch</a> as &#8220;tweeps&#8221; I should follow, and suggests that the following users have similar followers to mine: @<a href="http://twitter.com/endeca">endeca</a>, @<a href="http://twitter.com/lemire">lemire</a>, @<a href="http://twitter.com/yahooresearch">yahooresearch</a>, @<a href="http://twitter.com/googleresearch">googleresearch</a>, and @<a href="http://twitter.com/mattcutts">mattcutts</a>.</p>
<p>There&#8217;s a bit of an &#8220;<a href="http://thenoisychannel.com/2009/02/24/how-recommendation-engines-quash-diversity/">everything sounds like Coldplay</a>&#8221; effect (e.g., @<a href="http://twitter.com/fredwilson">fredwilson</a> shows up in a lot of the searches I tried), but overall I&#8217;m impressed with the quality, especially compared to the other suggestion tools I&#8217;ve tried.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/04/14/google-follow-finder/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Go TunkRank!</title>
		<link>http://thenoisychannel.com/2010/04/07/go-tunkrank/</link>
		<comments>http://thenoisychannel.com/2010/04/07/go-tunkrank/#comments</comments>
		<pubDate>Thu, 08 Apr 2010 00:48:42 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3079</guid>
		<description><![CDATA[I haven&#8217;t talked much about TunkRank in the past months, largely because Jason Adams, who stepped up to the TunkRank Implementation Challenge last year, has been leading the charge. Indeed, all I did, beyond lending my first syllable to its name, was to propose the measure and get it implemented &#8220;Tom Sawyer&#8221; style. Since then: [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://tunkrank.com/"><img class="alignnone" title="TunkRank" src="http://tunkrank.com/images/TunkRank.png" alt="" width="328" height="93" /></a></p>
<p>I haven&#8217;t talked much about <a href="http://tunkrank.com/">TunkRank</a> in the past months, largely because <a href="http://mendicantbug.com/">Jason Adams</a>, who stepped up to the <a href="http://thenoisychannel.com/2009/01/16/the-tunkrank-implementation-challenge/">TunkRank Implementation Challenge</a> last year, has been leading the charge. Indeed, all I did, beyond lending my first syllable to its name, was to <a href="http://thenoisychannel.com/2009/01/13/a-twitter-analog-to-pagerank/">propose the measure</a> and get it implemented &#8220;<a href="http://www.pbs.org/marktwain/learnmore/writings_tom.html">Tom Sawyer</a>&#8221; style.</p>
<p>Since then:</p>
<ul>
<li>TunkRank was cited in a <a href="http://www.wsdm-conference.org/2010/">WSDM 2010</a> paper entitled &#8220;<a href="http://www.mysmu.edu/staff/jsweng/papers/TwitterRank_WSDM.pdf">TwitterRank: finding topic-sensitive influential twitterers</a>&#8221;</li>
<li><a href="http://www.gamocracy.com/profile/incolas/">Nicolas Cerrato</a> implemented an influence measure for gamer site <a href="http://www.gamocracy.com/">Gamocracy</a> based on TunkRank.</li>
</ul>
<p>And, most recently:</p>
<ul>
<li>University of Oviedo professor <a href="http://www.di.uniovi.es/~dani/">Daniel Gayo-Avello</a> published a research paper entitled &#8220;<a href="http://arxiv.org/abs/1004.0816">Nepotistic Relationships in Twitter and their Impact on Rank Prestige Algorithms</a>&#8221;, based on a follower graph of 1.8M Twitter users, in which he reports:<br />
<em><br />
Lastly, there are one method clearly outperforming PageRank with respect to penalization of abusive users while still inducing plausible rankings: TunkRank. It is certainly similar to PageRank but it makes a much better job when confronted with &#8220;cheating&#8221;: aggressive marketers are almost indistinguishable from common users –which is, of course, desirable; and spammers just manage to grab a much smaller amount of the global available prestige and reach lower positions –although they still manage to be better positioned than average users. In addition to that, the ranking induced by TunkRank certainly agrees with that of PageRank, specially at the very top of the list, meaning that many users achieving good positions with PageRank should also get good positions with TunkRank. Thus, TunkRank is a highly recommendable ranking method to apply to social networks: it is simple, it induces plausible rankings, and severely penalizes spammers when compared to PageRank.<br />
</em><br />
You can read a summary version in his blog post, descriptively titled &#8220;<a href="http://www.di.uniovi.es/~dani/PFCblog/index.php?entry=entry100407-093639">Research on a 1.8M Twitter user graph. Conclusion: TunkRank is your best option.</a>&#8221;</li>
</ul>
<p>I&#8217;m excited that an idea I came up with on a whim (or perhaps out of <a href="http://www.texttechnologies.com/2009/01/02/daniel-tunkelang-idealizes-twitter/">excessive idealism</a>) has taken on such a life of its own. And hey, I do work for a company that is into <a href="http://googleblog.blogspot.com/2009/12/relevance-meets-real-time-web.html">real-time search</a> and that knows a thing or two about <a href="http://en.wikipedia.org/wiki/Adversarial_information_retrieval">adversarial information retrieval</a>. Hopefully I&#8217;ll find a way to apply TunkRank&#8211;or at least its intuition&#8211;in my own work. In the meantime, I offer those who have already done so my congratulations and gratitude.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/04/07/go-tunkrank/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>New Toys from Hunch</title>
		<link>http://thenoisychannel.com/2010/03/27/new-toys-from-hunch/</link>
		<comments>http://thenoisychannel.com/2010/03/27/new-toys-from-hunch/#comments</comments>
		<pubDate>Sat, 27 Mar 2010 20:02:33 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3016</guid>
		<description><![CDATA[I&#8217;ve been following Hunch for a while, and my impression has evolved from the initial skepticism with which I greeted it a year ago (to the day!). Given the track records of co-founders Caterina Fake and Chris Dixon, perhaps I should have expected their success at obtaining traffic and funding.  But what interests me more is that [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: center;"><img class="size-full wp-image-3018 aligncenter" style="border: 1px solid black;" title="Hunch" src="http://thenoisychannel.com/wordpress/wp-content/uploads/2010/03/Picture-1.png" alt="" width="343" height="115" /></p>
<p>I&#8217;ve been following <a href="http://hunch.com/">Hunch</a> for a while, and my impression has evolved from the <a href="http://thenoisychannel.com/2009/03/27/i-gotta-hunch-youll-wanna-check-this-out/">initial skepticism</a> with which I greeted it a year ago (to the day!). Given the track records of co-founders <a href="http://www.caterina.net/about.html">Caterina Fake</a> and <a href="http://cdixon.org/about.html">Chris Dixon</a>, perhaps I should have expected their success at obtaining <a href="http://siteanalytics.compete.com/hunch.com/">traffic</a> and <a href="http://techcrunch.com/2010/03/14/hunch-takes-12-million-from-khosla-ventures-adds-former-facebook-cfo-to-board-of-directors/">funding</a>. But what interests me more is that they are doing interesting things with data mining and putting a new twist on social media analytics.</p>
<p>For those unfamiliar with Hunch, it is a decision engine (cf. [<a href="http://www.google.com/search?q=real+decision+engine">real decision engine</a>] vs. [<a href="http://www.google.com/search?q=decision+engine">decision engine</a>]). For example, it can help you decide <a href="http://hunch.com/should-i-buy-the-apple-ipad/">whether to buy an iPad</a> or <a href="http://hunch.com/baby-girl-names/">how to name your baby</a>. While it&#8217;s not clear to me how much people are using Hunch for utility vs. entertainment, Hunch is certainly accumulating users&#8211;as well as the <a href="http://hunch.com/teach-hunch-about-you/">data</a> that those users volunteer.</p>
<p>Hunch recently released two applications that mash up that data with the Twitter follower graph. The first is a &#8220;<a href="http://hunch.com/games/twitter-predictor/">Twitter Predictor Game</a>&#8221; that attempts to calculate your taste profile from your Twitter id and then predict how you&#8217;ll answer Hunch&#8217;s taste questions. Just to keep the game honest, you can look at Hunch&#8217;s guess either before or after you provide your answer. The second is called &#8220;<a href="http://hunch.com/twitter-followers/">Twitter Follower Stats</a>&#8221;: given a Twitter user, it reports the salient information it has inferred about that user&#8217;s followers (e.g., <a href="http://hunch.com/twitter-followers/maddow/">@maddow</a> vs. <a href="http://hunch.com/twitter-followers/karlrove/">@karlrove</a>).</p>
<p>I think this stuff is neat, and a great testament to the &#8220;<a href="http://thenoisychannel.com/2009/03/31/the-unreasonable-effectiveness-of-data/">unreasonable effectiveness of data</a>&#8221;. The question-answer data still feels a bit sparse for my taste, and I suspect there&#8217;s still room for more <a href="http://en.wikipedia.org/wiki/Dimension_reduction">dimensionality reduction</a>. I&#8217;m sure Hunch CTO <a href="http://mattgattis.com/about/about.html">Matt Gattis</a> and colleagues are working on it! Also, it would be neat to direct the follower analytics rather than simply see the ones that Hunch deems most salient.</p>
<p>In summary, Hunch is keeping it interesting. Definitely a startup to watch and learn from.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/03/27/new-toys-from-hunch/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Can We Build a Distributed Trust Network?</title>
		<link>http://thenoisychannel.com/2010/03/20/can-we-build-a-distributed-trust-network/</link>
		<comments>http://thenoisychannel.com/2010/03/20/can-we-build-a-distributed-trust-network/#comments</comments>
		<pubDate>Sun, 21 Mar 2010 02:38:50 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3005</guid>
		<description><![CDATA[Mathew Ingram posted an interview with Craig Newmark (the Craig of craigslist fame) in which the latter argued that what the web needs is a “distributed trust network” to manage our online reputations. As it happens, this is an idea that has occupied me for several years. So I figured it was about time that I shared my thoughts [...]]]></description>
			<content:encoded><![CDATA[<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="480" height="390" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="src" value="http://blip.tv/play/AYHOqGwC" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="480" height="390" src="http://blip.tv/play/AYHOqGwC" allowfullscreen="true"></embed></object></p>
<p>Mathew Ingram posted an <a href="http://gigaom.com/2010/03/18/craig-newmark-on-the-webs-next-big-problem/">interview</a> with Craig Newmark (the <a href="http://www.craigslist.org/about/craig_newmark">Craig</a> of <a href="http://www.craigslist.org/">craigslist</a> fame) in which the latter argued that what the web needs is a “distributed trust network” to manage our online reputations. As it happens, this is an idea that has occupied me for several years. So I figured it was about time that I shared my thoughts on the subject.</p>
<p>When we think of how trust works online, two of the most prominent examples are Google&#8217;s <a href="http://en.wikipedia.org/wiki/PageRank">PageRank</a> measure and eBay&#8217;s <a href="http://pages.ebay.com/help/feedback/scores-reputation.html">feedback scores</a>. But neither of these measures addresses what I think Craig has in mind. PageRank is a great way of using citation analysis to determine the most authoritative citations, but the trust in a page should consider its out-links (i.e., can we trust the page not to point us to untrustworthy ones?) and not just its in-links. eBay&#8217;s feedback scores have a different problem: they count positive and negative ratings without considering the social network of buyers and sellers&#8211;an approach that is <a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.72.138&amp;rep=rep1&amp;type=pdf">vulnerable to fraud through shill ratings</a>. Incidentally, <a href="http://linkedin.custhelp.com/cgi-bin/linkedin.cfg/php/enduser/std_adp.php?p_faqid=99">LinkedIn recommendations</a> have a similar weakness if viewed in strictly quantitative terms, but the potential for abuse is mitigated by the endorsements being signed&#8211;and by their being more than just binary or numerical ratings. There&#8217;s even a <a href="http://endorser.org/">site</a> you can use if you&#8217;re too lazy to actually write the recommendations yourself.</p>
<p>But I digress. Propagation of trust does seem like the perfect application to build on top of social networks. Consider any problem that involves getting advice to inform a decision. If we regularly solicit advice from our first-degree connections, then we should be able to learn over time whose advice we can trust. We can then vouch for these connections, which offers the connections who trust us a basis for trusting their second-degree connections through us. And so forth through our social network. Of course, trust is not irrevocable: loss of trust should propagate similarly.</p>
<p>I&#8217;ve talked about this problem with two of the leading experts on social networks, <a href="http://www.cs.cornell.edu/home/kleinber/">Jon Kleinberg</a> and <a href="http://research.yahoo.com/Prabhakar_Raghavan">Prabhakar Raghavan</a>, and as far as I know no one has built a system along these principles. In economic terms, I envision a system where a person&#8217;s reputation truly is his or her coin. One person might think of bribing another to exploit the latter&#8217;s established reputation, but a rational person with a strong reputation would demand an exorbitant bribe to put that reputation at risk.</p>
<p>Of course, a lot of information would have to propagate throughout the social network&#8211;and be stored&#8211;for this system to work. Regardless of how the information is abstracted, such a reputation index would raise thorny privacy issues. Indeed, I don&#8217;t know if we can build a reputation system that is entirely privacy-preserving&#8211;since reputation is an inherently public mechanism. In addition, any such system would have to consider the implications of <a href="http://en.wikipedia.org/wiki/Defamation">defamation</a> laws. These are some major hurdles!</p>
<p>Nonetheless, I agree wholeheartedly with Craig that a distributed trust network could be “the killingest of killer apps&#8221;. I just hope we can find a way to build and use it!</p>
<p><em>Note: <a href="http://twitter.com/communicating">Chris Rines</a> suggested I look at <a href="http://www.advogato.org/trust-metric.html">Advogato&#8217;s Trust Metric</a>, and a quick investigation led me to the Wikipedia entry for <a href="http://en.wikipedia.org/wiki/Trust_metric">trust metric</a>. Looks like I have some homework to do!</em></p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/03/20/can-we-build-a-distributed-trust-network/feed/</wfw:commentRss>
		<slash:comments>16</slash:comments>
		</item>
		<item>
		<title>Are Ashton Kutcher and Puff Daddy the Most Influential Twitter Users?</title>
		<link>http://thenoisychannel.com/2010/03/20/are-ashton-kutcher-and-puff-daddy-the-most-influential-twitter-users/</link>
		<comments>http://thenoisychannel.com/2010/03/20/are-ashton-kutcher-and-puff-daddy-the-most-influential-twitter-users/#comments</comments>
		<pubDate>Sat, 20 Mar 2010 23:32:24 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=3001</guid>
		<description><![CDATA[In a post on ReadWriteWeb, Sarah Perez summarizes &#8220;Measuring User Influence in Twitter: The Million Follower Fallacy&#8221;, a recent research paper by Meeyoung Cha, Hamed Haddadi, Fabricio Benevenuto, and Krishna Gummadi. The punch line should hardly be surprising to regular readers here given my variety of rants on the subject: follower count isn&#8217;t a great measure [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://an.kaist.ac.kr/~mycha/docs/icwsm2010_cha.pdf"><img class="alignnone" title="Cha et al, &quot;Measuring User Influence in Twitter: The Million Follower Fallacy&quot;" src="http://www.readwriteweb.com/images/top_100_influentials_on_twitter_chart.png" alt="" width="439" height="274" /></a></p>
<p>In a post on <a href="http://www.readwriteweb.com/archives/the_million_follower_fallacy_audience_size_doesnt_prove_influence_on_twitter.php">ReadWriteWeb</a>, Sarah Perez summarizes &#8220;<a href="http://an.kaist.ac.kr/~mycha/docs/icwsm2010_cha.pdf">Measuring User Influence in Twitter: The Million Follower Fallacy</a>&#8221;, a recent research paper by Meeyoung Cha, Hamed Haddadi, Fabricio Benevenuto, and Krishna Gummadi. The punch line should hardly be surprising to regular readers here given my variety of <a href="http://thenoisychannel.com/?s=twitter+number+of+followers">rants</a> on the subject: follower count isn&#8217;t a great measure of influence.</p>
<p>The authors focus on measuring three quantities: followers (which they call indegree), retweets, and mentions. Their main result is that, while the number of followers is strongly correlated with the numbers of retweets and mentions for the general user population, the correlation is much weaker for the users with high follower counts, e.g., in the top 10%. Indeed, the authors believe that the correlation for the general population is &#8220;an artifact of the tied ranks among the least influential users, e.g., many of the least connected users also received zero retweet and mention.&#8221;</p>
<p>The authors further note that:</p>
<blockquote><p>Across all three measures, the top influentials were generally recognizable public figures and websites. Interestingly, we saw marginal overlap in these three top lists. These top-20 lists only had 2 users in common: Ashton Kutcher and Puff Daddy. The top-100 lists also showed marginal overlap, as shown in Figure 1, indicating that the three measures capture different types of influence.</p></blockquote>
<p>The authors ultimately conclude that:</p>
<ul>
<li>Follower count represents a user’s popularity, but is not related to notions of influence such as engaging audience, i.e., retweets and mentions.</li>
<li>Retweets are driven by the content value of a tweet, favoring mainstream news organizations.</li>
<li>Mentions are driven by the name value of the user, favoring celebrities.</li>
</ul>
<p>I can&#8217;t argue with any of the above, but I do wonder if any of them are ideal measures of influence. All three measures are easy to game&#8211;and none of them model the scarcity of user attention, which is the motivating principle of <a href="http://thenoisychannel.com/2009/01/13/a-twitter-analog-to-pagerank/">TunkRank</a>. Nor do they ground &#8220;influence&#8221; in any outcome external to Twitter.</p>
<p>Still, it&#8217;s an interesting negative result. If nothing else, it helps reinforce the argument that follower count isn&#8217;t a useful measure&#8211;at least once you get beyond the very low end of the range.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/03/20/are-ashton-kutcher-and-puff-daddy-the-most-influential-twitter-users/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>You Can&#8217;t Hurry Relevance</title>
		<link>http://thenoisychannel.com/2010/02/28/you-cant-hurry-relevance/</link>
		<comments>http://thenoisychannel.com/2010/02/28/you-cant-hurry-relevance/#comments</comments>
		<pubDate>Sun, 28 Feb 2010 20:07:55 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2975</guid>
		<description><![CDATA[Lately, I&#8217;ve been musing about the Herb Simon quote that launched&#8211;or at least popularized&#8211;the concepts of information overload and attention economics: in an information-rich world, the wealth of information means a dearth of something else: a scarcity of whatever it is that information consumes. What information consumes is rather obvious: it consumes the attention of [...]]]></description>
			<content:encoded><![CDATA[<div id="_mcePaste">Lately, I&#8217;ve been musing about the <a href="http://en.wikipedia.org/wiki/Herbert_Simon">Herb Simon</a> quote that launched&#8211;or at least popularized&#8211;the concepts of information overload and <a href="http://en.wikipedia.org/wiki/Attention_economy">attention economics</a>:</div>
<blockquote>
<div id="_mcePaste">in an information-rich world, the wealth of information means a dearth of something else: a scarcity of whatever it is that information consumes. What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention and a need to allocate that attention efficiently among the overabundance of information sources that might consume it (Simon, 1971)</div>
</blockquote>
<p>I hope everyone agrees that attention is a scarce good. But I&#8217;m curious how people measure it. After all, if we&#8217;re going to talk about an economic good being scarce, we ought to quantify it!</p>
<p>One approach is to measure attention at a specific moment in time, measuring how much of our instantaneous <a href="http://en.wikipedia.org/wiki/Cognitive_load">cognitive capacity</a> we devote to a task. This approach is useful for evaluating a user interface&#8211;in particular, for determining how users allocate their attention among the various interface elements. Another approach is to measure attention in units of time, e.g., how many of our waking hours we devote to a particular activity. The latter approach strikes me as more of what Herb Simon had in mind.</p>
<p>We can interpret the two definitions as equivalent&#8211;after all, cumulative attention devoted to a task is simply the sum (or integral) of instantaneous attention over time. But thinking this way misses a key consideration: we pay a significant price for <a href="http://www.joelonsoftware.com/articles/fog0000000022.html">context switching</a>.</p>
<p>A familiar example is email. The total time we spend reading email is a productivity concern, but the larger concern for many of us is the frequency with which email causes us to interrupt our workflow. Knowing this, I made a <a href="http://thenoisychannel.com/2008/12/06/overwhelmed-by-email/">brief attempt</a> in 2008 to check email only once a day. Unfortunately, this approach would have violated too many of my peers&#8217; expectations. I returned to the status quo, reading my email (or at least scanning headers) as it arrives. Other messaging tools, such as instant messaging and Twitter, only add to the challenge of managing our personal communication flow.</p>
<p>Of course, what I really want is for my messaging tools to distinguish urgent messages from non-urgent ones, and to only interrupt my workflow for the former. I know that no system, whether based on manual filtering or algorithmic analysis, can make this subjective classification with 100% accuracy, but I&#8217;d certainly accept a handful of false positives in exchange for far fewer interruptions. I suspect I&#8217;m not alone.</p>
<p>Moreover, this approach extends beyond personal communications to more public ones, such as social media platforms and even web search. On one hand, the passing of time offers an opportunity to accumulate reliable content analysis; on the other hand, we don&#8217;t want to miss time-sensitive content just because the system waited too long to determine the content&#8217;s relevance to our information needs. Still, the low <a href="http://en.wikipedia.org/wiki/Signal-to-noise_ratio">signal-to-noise ratio</a> on social media platforms suggests to me that many information consumers would be amenable to a different tradeoff than the one we experience today.</p>
<p>What I&#8217;d really like to see is systems that take advantage of the differences in users&#8217; personal senses of urgency. Some examples:</p>
<ul>
<li>A widely broadcast email isn&#8217;t delivered all at once, but first goes to users with higher urgency settings. If those users mark it as spam, the email is already marked as spam for users with lower urgency settings. Conversely, if enough high-urgency users mark it as important, then it may be sent to lower-urgency users sooner.</li>
<li>High-urgency users frequently check news sites and blogs. If an article attracts a threshold level of engagement from high-urgency users, then low-urgency users are notified. This approach could apply to general news or to news in a specific topic that the user follows.</li>
<li>Same as above, but applied to activity feeds and based on engagement within your social network. But again, high-urgency users lead the way, seeing updates sooner but at the price of experiencing a noisier stream.</li>
</ul>
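<p>To make the first scenario concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the names <code>User</code> and <code>staged_delivery</code>, the integer urgency scale, and the 50% thresholds are made up, and nothing here describes an existing system. It also ignores time entirely, so a &#8220;priority&#8221; placement stands in for delivering the message sooner.</p>

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    urgency: int  # hypothetical scale: higher means "show me things sooner"


def staged_delivery(users, feedback, spam_threshold=0.5, important_threshold=0.5):
    """Deliver one broadcast message tier by tier, highest urgency first.

    `feedback` maps a user name to "spam" or "important"; users who do not
    react are simply absent. Returns a dict of user name -> placement,
    where placement is "inbox", "priority", or "spam".
    """
    placement = {}
    seen = spam_votes = important_votes = 0
    for tier in sorted({u.urgency for u in users}, reverse=True):
        # Place the message for this tier using reactions from earlier tiers.
        if seen and spam_votes / seen >= spam_threshold:
            label = "spam"
        elif seen and important_votes / seen >= important_threshold:
            label = "priority"
        else:
            label = "inbox"
        for user in (u for u in users if u.urgency == tier):
            placement[user.name] = label
            # Record this user's reaction for the benefit of later tiers.
            seen += 1
            vote = feedback.get(user.name)
            if vote == "spam":
                spam_votes += 1
            elif vote == "important":
                important_votes += 1
    return placement


users = [User("ann", 3), User("bob", 3), User("cat", 1)]
print(staged_delivery(users, {"ann": "spam", "bob": "spam"}))
# prints {'ann': 'inbox', 'bob': 'inbox', 'cat': 'spam'}
```

<p>The high-urgency users (ann and bob) see the message unfiltered and pay the price of a noisier inbox; the low-urgency user (cat) benefits from their reactions. A real system would also need to decide how long to wait for each tier to react, which is exactly the tradeoff discussed above.</p>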
<p>To some extent, our existing systems already approximate this approach. Mechanisms like favoriting and re-tweeting propagate signal from information scouts to their followers, as do algorithms that rank real-time information based on engagement. Still, as an information consumer, I&#8217;d appreciate an interface that explicitly and transparently adapts to my priorities, and that manages interruption of my workflow accordingly.</p>
<p>What do folks here think? Is information delayed tantamount to information denied? Or is time on our side, potentially offering us a better tradeoff than the one we experience today?</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/02/28/you-cant-hurry-relevance/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
		<item>
		<title>WSDM 2010: Day 3</title>
		<link>http://thenoisychannel.com/2010/02/06/wsdm-2010-day-3/</link>
		<comments>http://thenoisychannel.com/2010/02/06/wsdm-2010-day-3/#comments</comments>
		<pubDate>Sat, 06 Feb 2010 22:24:36 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2952</guid>
		<description><![CDATA[Note: this post is cross-posted at BLOG@CACM. Today is the last day of WSDM 2010, and I unfortunately spent it at home drinking chicken soup. But I&#8217;ve been following the conference via the proceedings and tweets. The day started with a short session on temporal interaction. Topics included clustering social media documents (e.g., Flickr photos) based on their association [...]]]></description>
			<content:encoded><![CDATA[<p><em>Note: this post is cross-posted at <a href="http://cacm.acm.org/blogs/blog-cacm/72149-wsdm-2010-day-3/fulltext">BLOG@CACM</a>.<br />
</em></p>
<p>Today is the last day of <a href="http://www.wsdm-conference.org/2010/">WSDM 2010</a>, and I unfortunately spent it at home drinking chicken soup. But I&#8217;ve been following the conference via the <a href="http://www.wsdm-conference.org/2010/proceedings/">proceedings</a> and <a href="http://search.twitter.com/search?q=%23wsdm2010">tweets</a>.</p>
<p>The day started with a short session on temporal interaction. Topics included clustering social media documents (e.g., <a href="http://www.flickr.com/">Flickr</a> photos) based on their association with events, statistical tests for early identification of popular social media content, and analysis of answers sites (like <a href="http://answers.yahoo.com/">Yahoo! Answers</a>) as evolving two-sided economic markets.</p>
<p>The next session focused on advertising. Two papers focused on click prediction: one proposing a <a href="http://www.scholarpedia.org/article/Bayesian_statistics">Bayesian</a> inference model to better predict click-throughs in the tail of the ad distribution; the other presenting a framework for personalized click models. Another paper addressed the closely related problem of predicting ad relevance. The remaining papers discussed other aspects of search advertising: one on estimating the value per click for channels like <a href="http://www.google.com/services/adsense_tour/index.html">Google AdSense</a>, where ad inventory is supplied by a third party; the other proposing an algorithmic approach to automate online ad campaigns based on <a href="http://en.wikipedia.org/wiki/Landing_page">landing page</a> content.</p>
<p>The following session was on systems and efficiency, a popular topic given the immense data and traffic associated with web search. Two papers proposed approaches to help short-circuit ranking computations: one by optimizing the organization of <a href="http://en.wikipedia.org/wiki/Inverted_index">inverted index</a> entries to consider both the static ranks of documents and the upper bounds of term scores for all terms contained in each document; the other using early-exit strategies to optimize <a href="http://en.wikipedia.org/wiki/Ensemble_learning">ensemble-based machine learning</a> algorithms. Another used machine learning to mine rules for de-duplicating web pages based on URL string patterns. Another focused on compression, showing that web content is at least an order of magnitude more compressible than what can be achieved by <a href="http://en.wikipedia.org/wiki/Gzip">gzip</a>. The last paper proposed a method to perform efficient distance queries on graphs (e.g., web graphs or social graphs) by pre-computing a collection of node-centered subgraphs.</p>
<p>The last session of the conference discussed various topics in web mining. One presented a system for identifying distributed search bot attacks. Another proposed an image search method using a combination of entity information and visual similarity. The final paper showed that shallow text features can be used for low-cost detection of boilerplate text in web documents.</p>
<p>All in all, WSDM 2010 was an excellent conference, and I&#8217;m sad not to have been able to attend more of it in person. I&#8217;m delighted to see an even mix of academic and industry representatives sharing ideas and working to make the web a better place for information access.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/02/06/wsdm-2010-day-3/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>WSDM 2010: Day 2</title>
		<link>http://thenoisychannel.com/2010/02/06/wsdm-2010-day-2/</link>
		<comments>http://thenoisychannel.com/2010/02/06/wsdm-2010-day-2/#comments</comments>
		<pubDate>Sat, 06 Feb 2010 04:00:53 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2949</guid>
		<description><![CDATA[Note: this post is cross-posted at BLOG@CACM. Unfortunately, I woke up this morning rather under the weather, so I&#8217;m having to resort to remotely reporting on the second day of WSDM 2010 conference, based on the published proceedings and the tweet stream. The day started with a keynote from Harvard economist Susan Athey. Her research focuses on the [...]]]></description>
			<content:encoded><![CDATA[<p><i>Note: this post is cross-posted at <a href="http://cacm.acm.org/blogs/blog-cacm/71927-wsdm-2010-day-2/fulltext">BLOG@CACM</a>.</i></p>
<p>Unfortunately, I woke up this morning rather under the weather, so I&#8217;m having to resort to remotely reporting on the second day of the <a href="http://www.wsdm-conference.org/2010/">WSDM 2010</a> conference, based on the published proceedings and the <a href="http://twitter.com/#search?q=%23wsdm2010">tweet stream</a>.</p>
<p>The day started with a keynote from Harvard economist <a href="http://kuznets.fas.harvard.edu/~athey/">Susan Athey</a>. Her research focuses on the design of auction-based markets, a topic core to the business of search, which largely relies on auction-based advertising models (cf. <a href="http://en.wikipedia.org/wiki/AdWords">Google AdWords</a>). Then came a session focused on learning and optimization. One paper proposed a method to learn ranking functions and query categorization simultaneously, reflecting that different categories of queries lead users to have different expectations about ranking. Another combined traditional list-based ranking with pair-wise comparisons between results to separate the results into tiers reflecting grades of relevance. An intriguing approach to query recommendation treated it as an optimization problem, perturbing users’ query-reformulation path to maximize the expected value of a utility function over the search session. Another paper looked not at ranking per se, but rather at improving the quality of training data for machine-learned ranking. The final paper of the session, which earned a best-paper nomination, modeled document relevance based not on click-through behavior, but rather on post-click user behavior.</p>
<p>The next session was about users and measurement. It opened with another best-paper nominee: an analysis of over a hundred million users to understand how they re-find web content. Another offered a rigorous analysis of the often sloppily presented &#8220;<a href="http://en.wikipedia.org/wiki/Long_Tail">long-tail</a>&#8221; hypothesis: it found that light users disproportionately prefer content at the head of the distribution while heavy users disproportionately prefer the tail. Another log-analysis paper analyzed search logs using a partially observable Markov model, a variant of the <a href="http://en.wikipedia.org/wiki/Hidden_Markov_model">hidden Markov model</a> in which not all of the hidden state transitions emit observable events&#8211;and compared the latent variables with eye-tracking studies. An intriguing study demonstrated that user behavior models are more predictive of goal success than models based on document relevance. The final paper of the session proposed methods for quantifying the reusability of the test collections that lie at the heart of information retrieval evaluation.</p>
<p>The last session of the day focused on social aspects of search. Two of the papers were concerned with modeling authority and influence in social networks, a problem in which I take a deep <a href="http://thenoisychannel.com/2009/01/13/a-twitter-analog-to-pagerank/">personal interest</a>. Another inferred attributes of social network users based on those of other users in their communities (cf. MIT&#8217;s <a href="http://www.boston.com/bostonglobe/ideas/articles/2009/09/20/project_gaydar_an_mit_experiment_raises_new_questions_about_online_privacy/">Project Gaydar</a>). Another analyzed <a href="http://www.flickr.com/">Flickr</a> and <a href="http://www.last.fm/">Last.fm</a> user logs to show that users&#8217; semantic similarity based on their tagging behavior is predictive of social links. The final paper tackled the sparsity of social media tags by inferring latent topics from shared tags and spatial information.</p>
<p>Not surprisingly, a disproportionate number of contributors to the conference work at major web search companies, who have both the motivation to improve results and the access to data that is needed for such research. One of the ongoing research challenges for the field is to find ways to make this data available to others while respecting the business concerns of search engine companies and the privacy concerns of their users.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/02/06/wsdm-2010-day-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Report on the Third Workshop on Search and Social Media (SSM 2010)</title>
		<link>http://thenoisychannel.com/2010/02/04/report-on-the-third-workshop-on-search-and-social-media-ssm-2010/</link>
		<comments>http://thenoisychannel.com/2010/02/04/report-on-the-third-workshop-on-search-and-social-media-ssm-2010/#comments</comments>
		<pubDate>Thu, 04 Feb 2010 08:25:01 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2936</guid>
		<description><![CDATA[Note: this post is cross-posted at BLOG@CACM. It is my pleasure to report on the 3rd Annual Workshop on Search in Social Media (SSM 2010), a gathering of information retrieval and social media researchers and practitioners in an area that has captured the interest of computer scientists, social scientists, and even the broader public. The one-day [...]]]></description>
			<content:encoded><![CDATA[<p><em>Note: this post is cross-posted at <a href="http://cacm.acm.org/blogs/blog-cacm/71444-third-workshop-on-search-and-social-media-ssm-2010/fulltext">BLOG@CACM</a>.</em></p>
<p>It is my pleasure to report on the 3rd Annual Workshop on Search in Social Media (<a href="http://ir.mathcs.emory.edu/SSM2010/" target="_blank">SSM 2010</a>), a gathering of information retrieval and social media researchers and practitioners in an area that has captured the interest of computer scientists, social scientists, and even the broader public. The one-day workshop took place at the Polytechnic Institute of NYU in Brooklyn, NY, co-located with the ACM Conference on Web Search and Data Mining (<a href="http://www.wsdm-conference.org/2010/" target="_blank">WSDM 2010</a>). The quality of the presenters, the overbooked registration, and the hundreds of live tweets with the <a href="http://search.twitter.com/search?q=%23ssm2010" target="_blank">#ssm2010</a> hashtag all attest to the success of this event.</p>
<p>The workshop opened with a warm welcome from <a href="http://www.csee.umbc.edu/~ian/" target="_blank">Ian Soboroff</a> (NIST), immediately followed by a keynote from <a href="http://www.jopedersen.com/jopedersen/Home.html" target="_blank">Jan Pedersen</a>, Chief Scientist of Bing Search. Jan established a clear business case for search in social media: the opportunity to deliver content that is fresh, local, and under-served by general web search. He drilled into particular types of content where social media search is most useful: expert opinions, breaking news, and tail content. The benefits of social media search include trust and personal interaction (as compared to web content that is often soulless and of uncertain provenance), low latency (though perhaps at the cost of accuracy), and access to niche or ephemeral information that web search rarely surfaces. But delivering social media results to searchers creates its own variety of challenges, such as weighing freshness against accuracy and relevance, coping with loss of social content&#8217;s conversational context, managing low update latency when search engines have not been optimized for it, and fighting new kinds of spam. Despite these challenges, it is clear that the major web search engines have embraced the brave new world of real-time social content.</p>
<p><a href="http://www.mathcs.emory.edu/~eugene/" target="_blank">Eugene Agichtein</a> (Emory University) then moderated a panel representing the world&#8217;s leading search engines: <a href="http://www.google.com/profiles/jhylton" target="_blank">Jeremy Hylton</a> (Google), <a href="http://datamining.typepad.com/" target="_blank">Matthew Hurst</a> (Microsoft), <a href="http://research.yahoo.com/user/78" target="_blank">Sihem Amer-Yahia</a> (Yahoo!), and <a href="http://ir.baidu.com/phoenix.zhtml?c=188488&amp;p=irol-govBio&amp;ID=161381" target="_blank">William Chang</a> (Baidu). Jeremy justified the universal interface approach, pointing out that users don&#8217;t want to have to figure out what kind of search site to use for their queries, and that they expect a familiar interface. He also noted that Google has made great strides on update latency: it can index the Twitter firehose in the same amount of time as serving a query. Matthew offered various analyses of the social search problem, based on whether the information signal resides in content (e.g., web) or attention (e.g., Twitter), or whether the information need is expressed in an explicit search query or inferred from the user&#8217;s context. Sihem offered a counter-point to Jeremy, arguing that social media search queries often represent broad or vague information needs, and thus call for a more browsing-oriented interface than web search, which is optimized for highly specific needs. William noted that the biggest competitive threat he sees for web search engines comes from social media players&#8211;and he credits much of Baidu&#8217;s success to its surfacing of social media content.</p>
<p>Then came a flurry of questions, perhaps the most interesting of which was how to address identity management. William argued that people prefer interacting with real-named (or pseudonymous) people to whom they are directly connected. Sihem offered the counter-example of obtaining recommendations through community aggregation. Matthew noted the incongruity of there being no economic relationship between social network companies that maintain proprietary social graphs and people whose identities and relationships those graphs represent. Jeremy pointed out that users benefit if the data is as open as possible.</p>
<p>Given the almost even split between academic and industry participation in the workshop, the panelists were also asked to present research challenges to academia. Jeremy posed the problem of determining when social media results are actually true. Matthew wants to see more interdisciplinary work between computer scientists and social scientists. Sihem offered two challenge problems: scalable community discovery and evaluation of collaborative recommendation systems. William wants to see a rigorous axiomatization of social media search behavior.</p>
<p>After lunch, <a href="http://www.fxpal.com/?p=jeremy" target="_blank">Jeremy Pickens</a> (FXPAL) moderated a panel representing social media / networking companies: <a href="http://www.hilarymason.com/" target="_blank">Hilary Mason</a> (bit.ly), <a href="http://www.linkedin.com/in/igorperisic" target="_blank">Igor Perisic</a> (LinkedIn), and <a href="http://www.myspace.com/myspacedave" target="_blank">David Hendi</a> (MySpace). Hilary noted that, while bit.ly does not have access to an explicit social graph, it captures implicit connections from user behavior that may not be represented in the graph. Jeremy asked the panelists how much a person&#8217;s extended network matters; David and Igor pointed out research indicating correlations of mood and even medical conditions between people and their third-degree connections. Again, the audience was full of questions, especially for Igor. As a fan of <a href="http://en.wikipedia.org/wiki/Faceted_search" target="_blank">faceted search</a>, I was glad to see him touting LinkedIn&#8217;s success in making faceted search the primary means of performing people search on the site. For an in-depth view, I recommend &#8220;<a href="http://thenoisychannel.com/2010/01/31/linkedin-search-a-look-beneath-the-hood/" target="_blank">LinkedIn Search: A Look Beneath the Hood</a>&#8221;.</p>
<p>The afternoon continued with a poster / demo session emphasizing work in progress: tools, interfaces, research studies, and position papers. I particularly enjoyed listening to the stream of interaction between academic researchers and industry practitioners.</p>
<p>The final panel session assembled academic researchers to discuss their views of the challenges in social media. <a href="http://www.fxpal.com/?p=gene" target="_blank">Gene Golovchinsky</a> (FXPAL) moderated a panel composed of <a href="http://knoesis.wright.edu/researchers/meena/homepage/" target="_blank">Meena Nagarajan</a> (Wright State University), <a href="http://www.lehigh.edu/~lih307/" target="_blank">Liangjie Hong</a> (Lehigh University), <a href="http://www.dcs.gla.ac.uk/~richardm/" target="_blank">Richard McCreadie</a> (University of Glasgow), <a href="http://www.cs.cmu.edu/~jelsas/" target="_blank">Jonathan Elsas</a> (CMU), and <a href="http://comminfo.rutgers.edu/~mor/" target="_blank">Mor Naaman</a> (Rutgers University). Meena highlighted the need to build up meta-data to describe the context around social utterances. Liangjie took a position similar to William Chang&#8217;s, calling for a framework to model the tasks and behavior of users who interact with social media. Richard focused on the intersection of social media and news search, and noted that some of the most useful information is private and proprietary (e.g., search and chat logs). Jonathan offered a variety of challenges: determining the right retrieval granularity, managing multiple axes of organization, aggregating author behavior, and multidimensional indexing of social media content. Finally, Mor noted that we&#8217;re moving from a world of email to a &#8220;social awareness stream&#8221;, in which we direct content at a group and have lower expectations of readership than with email. As with all of the panels, there were countless questions from the moderator and audience, particularly about determining the truthfulness of social media content and delivering social content in an effective user interface.</p>
<p>The final session was a full-group discussion that dived into the various topics addressed throughout the day. But Gene Golovchinsky provided the &#8220;one more thing&#8221; at the end, showing us a glimpse of a faceted search interface to explore a Twitter stream. It was an elegant finish to a day filled with informative and engaging discussion, and I look forward to seeing many of the participants in the WSDM conference over the next few days.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/02/04/report-on-the-third-workshop-on-search-and-social-media-ssm-2010/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Blogging SSM 2010 and WSDM 2010</title>
		<link>http://thenoisychannel.com/2010/02/03/blogging-ssm-2010-and-wsdm-2010/</link>
		<comments>http://thenoisychannel.com/2010/02/03/blogging-ssm-2010-and-wsdm-2010/#comments</comments>
		<pubDate>Wed, 03 Feb 2010 05:07:20 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2932</guid>
		<description><![CDATA[I&#8217;m delighted to report that I&#8217;ll be blogging about the Search and Social Media Workshop (SSM 2010) and the Web Search and Data Mining Conference (WSDM 2010) for Communications of the ACM. Of course, I&#8217;ll cross-post here. I also encourage folks to follow the live tweet streams at #ssm2010 and #wsdm2010, as well as Gene and [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m delighted to report that I&#8217;ll be blogging about the Search and Social Media Workshop (<a href="http://ir.mathcs.emory.edu/SSM2010/">SSM 2010</a>) and the Web Search and Data Mining Conference (<a href="http://www.wsdm-conference.org/2010/">WSDM 2010</a>) for <a href="http://cacm.acm.org/blogs/blog-cacm/">Communications of the ACM</a>.</p>
<p>Of course, I&#8217;ll cross-post here. I also encourage folks to follow the live tweet streams at <a href="http://search.twitter.com/search?q=%23ssm2010">#ssm2010</a> and <a href="http://search.twitter.com/search?q=%23wsdm2010">#wsdm2010</a>, as well as Gene and Jeremy&#8217;s posts at the <a href="http://palblog.fxpal.com/?tag=ssm2010">FXPAL blog</a>.</p>
<p>To those attending: see you all tomorrow through Saturday! To everyone else: I will try my best to communicate the substance and spirit of the conference.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2010/02/03/blogging-ssm-2010-and-wsdm-2010/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/02/03/blogging-ssm-2010-and-wsdm-2010/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Real Time Search Is Personal</title>
		<link>http://thenoisychannel.com/2010/01/18/real-time-search-is-personal/</link>
		<comments>http://thenoisychannel.com/2010/01/18/real-time-search-is-personal/#comments</comments>
		<pubDate>Mon, 18 Jan 2010 19:42:13 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2911</guid>
		<description><![CDATA[The other day, I promised in a comment thread that I&#8217;d write about what I see as real use cases for real-time search. As it happens, I&#8217;m experiencing one right now. As my wife, daughter, and I were walking home from a playground, we noticed a large number of fire trucks congregating a block away [...]]]></description>
			<content:encoded><![CDATA[<p>The other day, I promised in a <a href="http://thenoisychannel.com/2010/01/03/search-questions-for-2010-whats-on-my-mind/#comments">comment thread</a> that I&#8217;d write about what I see as real use cases for real-time search. As it happens, I&#8217;m experiencing one right now.</p>
<p>As my wife, daughter, and I were walking home from a playground, we noticed a large number of fire trucks congregating a block away from our house. A quick search on Twitter <a href="http://search.twitter.com/search?q=+near%3A11201+within%3A15mi+explosion">explained</a> what was going on, particularly by pointing us to this <a href="http://gothamist.com/2010/01/18/buildings_and_subway_stations_in_do.php">post</a> on Gothamist&#8211;which as of this writing seems to be the only reporting about this incident.</p>
<p>I think this example tells us a lot about the utility of real-time search. Most of us don&#8217;t need real-time search to tell us about the <a href="http://news.google.com/news/search?q=haiti">news in Haiti</a>, since a critical mass of major news providers is covering the story around the clock. Where real-time search matters most is at the personal level&#8211;specifically, when our personal urgency to obtain information is higher than that of the general population. In such situations, we&#8217;re willing to accept less polished&#8211;and even risk less accurate&#8211;information, particularly if the alternative is to wait until news providers cover the story, if they ever do. At least to some extent, urgency trumps authority.</p>
<p>Yes, there are other use cases for conversational media like Facebook and Twitter, such as sharing the experience of watching a live event, or simply chatting with friends and strangers about arbitrary topics. But I wouldn&#8217;t consider such use of these media to be search. Real-time search, in my view, is about helping users obtain the latest information available&#8211;in accordance with their personal needs. Twitter and <a href="http://www.google.com/search?&amp;output=search&amp;q=brooklyn%20heights%20explosion&amp;tbs=rltm:1">Google</a> served me well today, and I&#8217;m grateful that real-time search gave me real-time peace of mind.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2010/01/18/real-time-search-is-personal/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/01/18/real-time-search-is-personal/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>When Is Faceted Search Appropriate?</title>
		<link>http://thenoisychannel.com/2010/01/15/when-is-faceted-search-appropriate/</link>
		<comments>http://thenoisychannel.com/2010/01/15/when-is-faceted-search-appropriate/#comments</comments>
		<pubDate>Fri, 15 Jan 2010 06:31:27 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2900</guid>
		<description><![CDATA[Earlier this week, Peter Morville and Mark Burrell presented a UIE virtual seminar on &#8220;Leveraging Search &#38; Discovery Patterns For Great Online Experiences&#8221;. It sold out! And I thought Pete Bell and I had done well with our seminar on faceted search! But I&#8217;m hardly surprised. Although I wasn&#8217;t able to attend it myself, I [...]]]></description>
			<content:encoded><![CDATA[
<div id="__ss_2692450" style="width: 425px; text-align: left;"><object style="margin: 0px;" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="355" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowScriptAccess" value="always" /><param name="src" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=uiedesignpatternstrailermerged-091210133302-phpapp01&amp;stripped_title=search-discovery-patterns-a-uie-virtual-seminar" /><param name="allowfullscreen" value="true" /><embed style="margin: 0px;" type="application/x-shockwave-flash" width="425" height="355" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=uiedesignpatternstrailermerged-091210133302-phpapp01&amp;stripped_title=search-discovery-patterns-a-uie-virtual-seminar" allowscriptaccess="always" allowfullscreen="true"></embed></object></div>
<p>Earlier this week, <a href="http://www.findability.org/">Peter Morville</a> and Mark Burrell presented a <a href="http://uie.com/">UIE</a> virtual seminar on &#8220;<a href="http://www.uie.com/events/virtual_seminars/search_patterns/">Leveraging Search &amp; Discovery Patterns For Great Online Experiences</a>&#8221;. It <a href="http://facets.endeca.com/2010/01/how-to-sell-out-a-virtual-seminar/">sold out</a>! And I thought Pete Bell and I had done well with our <a href="http://www.uie.com/events/virtual_seminars/facets/">seminar</a> on <a href="http://en.wikipedia.org/wiki/Faceted_search">faceted search</a>!</p>
<p>But I&#8217;m hardly surprised. Although I wasn&#8217;t able to attend it myself, I gather from <a href="http://search.twitter.com/search?q=%23uievs">Twitter</a> and the <a href="http://strottrot.com/2010/01/14/looking-forward-to-interaction10/">blogosphere</a> that it was a great presentation. I enjoyed serving as a reviewer for Peter&#8217;s new book on <a href="http://searchpatterns.org/">Search Patterns</a>, and I contributed a bit to Endeca&#8217;s <a href="http://www.endeca.com/resource-center-ui-pattern-library.htm">UI Design Pattern Library</a> while I was there and Mark&#8217;s team was developing it.</p>
<p>In reading reactions to the seminar, I was particularly intrigued by a post entitled &#8220;<a href="http://livlab.com/thinkia/2010/01/search-and-browse/">Search and Browse</a>&#8221; by Livia Labate on her fantastically named blog, &#8220;<a href="http://livlab.com/thinkia/">I think, therefore IA</a>&#8221;. She raised a question that I think needs to be asked more often: when is (or isn&#8217;t) faceted search appropriate?</p>
<p>Her conversation with readers in a comment thread offered some possible answers:</p>
<ul>
<li>Faceted search helps users who think in terms of attribute specifications as filtering criteria.</li>
<li>Faceted search supports search by exclusion, as opposed to by discovery.</li>
<li>Faceted search requires a set of useful facets that is neither too small nor too large.</li>
</ul>
<p>I&#8217;d like to propose my own answers. Here are the conditions for which I see faceted search being most useful:</p>
<ul>
<li>Faceted search supports <a href="http://thenoisychannel.com/2008/06/24/what-is-not-exploratory-search/">exploratory</a> use cases, in contrast to <a href="http://www.db.dk/bh/core%20concepts%20in%20lis/articles%20a-z/known_item_search.htm">known-item search</a>. For known-item search, users are better served by a search box to specify an item by name, or a non-faceted hierarchy to locate it. In contrast, faceted search optimizes for cases where users are either unsure of what they want or of how to specify it.</li>
<li>Faceted search helps users who need or want to learn about the search space as they execute the search process. Facets educate users about different ways to characterize items in a collection. If users do not need or want this education, they may be frustrated by an interface that makes them do more work.</li>
<li>The search space is classified using accurate, understandable facets that relate to the users&#8217; information needs. As I&#8217;ve discussed before, <a href="http://thenoisychannel.com/2009/12/03/search-user-interfaces-and-data-quality/">data quality is often the bottleneck in designing search interfaces</a>. Offering users facets that are either unreliable or unrelated to their needs is worse than providing no facets at all.</li>
</ul>
<p>Given the above criteria, it&#8217;s not surprising that faceted search has been a huge success in online retail: shopping is often an exploratory learning experience, and retailers tend to have good data.</p>
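<p>To make the retail case concrete, here is a minimal sketch of the two operations a faceted interface performs: refining a result set by selected attribute values, and counting the remaining values of each facet so that users can see how the collection is characterized. The catalog, the facet names, and the helper functions <code>refine</code> and <code>facet_counts</code> are all hypothetical, invented purely for illustration; a real system would compute these counts inside the search engine rather than in a loop.</p>

```python
from collections import Counter

# Hypothetical catalog; the items and facet names are invented for illustration.
PRODUCTS = [
    {"name": "Trail Runner", "brand": "Acme", "color": "red", "category": "shoes"},
    {"name": "City Sneaker", "brand": "Acme", "color": "blue", "category": "shoes"},
    {"name": "Rain Shell", "brand": "Borealis", "color": "red", "category": "jackets"},
    {"name": "Down Parka", "brand": "Borealis", "color": "blue", "category": "jackets"},
    {"name": "Road Racer", "brand": "Cirrus", "color": "red", "category": "shoes"},
]
FACETS = ("brand", "color", "category")


def refine(items, selections):
    """Keep only the items whose attributes match every selected facet value."""
    return [
        item for item in items
        if all(item[facet] == value for facet, value in selections.items())
    ]


def facet_counts(items, facets=FACETS):
    """Count how many of the given items carry each value of each facet."""
    return {facet: Counter(item[facet] for item in items) for facet in facets}


# A shopper narrows the catalog to shoes, then sees which brands and colors remain.
results = refine(PRODUCTS, {"category": "shoes"})
counts = facet_counts(results)
```

<p>The counts are what do the educating: after one refinement, the interface can show which brands and colors are still available, without the shopper having to know those values in advance. The sketch also hints at the data quality condition, since a single mislabeled attribute silently drops an item from the refined set.</p>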
<p>But the success of faceted search in retail overshadows other domains where faceted search may be even more valuable. My favorite example is faceted people search, most recently demonstrated by <a href="http://thenoisychannel.com/2009/12/15/linkedin-faceted-search-now-out-of-beta/">LinkedIn</a>. I would love to see other entities (locations, businesses, etc.) receive similar treatment, at least in contexts where exploration is a common use case.</p>
<p>I think Livia is right to be skeptical about any interface that introduces complexity&#8211;and facets do introduce complexity. I hope that my guidelines help answer her question as to when that complexity is worthwhile and perhaps even necessary to help users satisfy their information needs.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2010/01/15/when-is-faceted-search-appropriate/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2010/01/15/when-is-faceted-search-appropriate/feed/</wfw:commentRss>
		<slash:comments>21</slash:comments>
		</item>
		<item>
		<title>Forget Real-Time, Give Us Over Time!</title>
		<link>http://thenoisychannel.com/2009/12/30/forget-real-time-give-us-over-time/</link>
		<comments>http://thenoisychannel.com/2009/12/30/forget-real-time-give-us-over-time/#comments</comments>
		<pubDate>Wed, 30 Dec 2009 14:56:20 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2883</guid>
		<description><![CDATA[In a recent announcement, Twitter Platform / API Product Manager Ryan Sarver tells us that Twitter is: committed to providing a framework for any company big or small, rich or poor to do a deal with us to get access to the Firehose in the same way we did deals with Google and Microsoft. We want [...]]]></description>
			<content:encoded><![CDATA[<p>In a recent <a href="https://groups.google.com/group/twitter-development-talk/browse_thread/thread/a1076d83d70d0450?pli=1">announcement</a>, Twitter Platform / API Product Manager <a href="http://sarver.org/about/">Ryan Sarver</a> tells us that Twitter is:</p>
<blockquote><p>committed to providing a framework for any company big or small, rich or poor to do a deal with us to get access to the Firehose in the same way we did deals with Google and Microsoft. We want everyone to have the opportunity &#8212; terms will vary based on a number of variables but we want a two-person startup in a  garage to have the same opportunity to build great things with the full feed that someone with a billion dollar market cap does. There are still a lot of details to be fleshed out and communicated, but this a top priority for us and we look forward to what types of companies and products get built on top of this unique and rich stream.</p></blockquote>
<p>That and some other details, like raising the API rate limit from 150 requests per hour to 1500, may well bring on what Marshall Kirkpatrick of ReadWriteWeb calls &#8220;<a href="http://www.readwriteweb.com/archives/twitter_20_api_rate_change_could_lead_to_a_world_o.php">Twitter 2.0</a>&#8221;. But it was something else in Kirkpatrick&#8217;s write-up that caught my attention&#8211;this quote from <a href="http://wow.ly/">Wow.ly</a> co-founder Kevin Marshall:</p>
<blockquote><p>The more I do with and around social data, the less interested I seem to become in &#8216;realtime&#8217; and the more interested I become in &#8216;over time.&#8217; When I first started hacking on Twitter (and Facebook) apps, I was in love with the idea of parsing and analyzing data in real-time and I was very link/content focused. But the more I build and use these tools, the more I see the value in the history and the trails of the data set.</p></blockquote>
<p>I couldn&#8217;t have said it better! Not that I haven&#8217;t tried: if you look back at my post about <a href="http://thenoisychannel.com/2009/05/27/topsy-tippling-the-stream-of-conversations/">Topsy</a>, you&#8217;ll see where real-time and over time meet. Recency matters, but the signal is far too sparse without some way to aggregate and analyze over time.</p>
<p>I&#8217;m thrilled that Twitter plans to open up its platform in a way that could enable analysis over semantic, social, and temporal dimensions. Now I&#8217;m curious to see what that access will look like, and what everyone who has been clamoring for that access will do with it.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/12/30/forget-real-time-give-us-over-time/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/12/30/forget-real-time-give-us-over-time/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Recovering From Being Hacked</title>
		<link>http://thenoisychannel.com/2009/12/24/recovering-from-being-hacked/</link>
		<comments>http://thenoisychannel.com/2009/12/24/recovering-from-being-hacked/#comments</comments>
		<pubDate>Thu, 24 Dec 2009 22:48:59 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2866</guid>
		<description><![CDATA[I discovered today that I&#8217;d been hacked earlier this week by a spam link injection attack. I&#8217;m still not sure how it happened, but I believe I&#8217;ve cleaned out all of the offending PHP from my WordPress installation. I&#8217;ve also removed most of my plug-ins in the process, and I may have broken some things [...]]]></description>
			<content:encoded><![CDATA[<p>I discovered today that I&#8217;d been hacked earlier this week by a spam link injection attack. I&#8217;m still not sure how it happened, but I believe I&#8217;ve cleaned out all of the offending PHP from my WordPress installation. I&#8217;ve also removed most of my plug-ins in the process, and I may have broken some things in my zeal to clean up the site. My apologies for any inconveniences, and my thanks to <a href="http://twitter.com/awaisathar">@awaisathar</a> and <a href="http://twitter.com/gsingers">@gsingers</a> for helping me resolve this quickly.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/12/24/recovering-from-being-hacked/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/12/24/recovering-from-being-hacked/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Blogs I Read: Living La Vida Local</title>
		<link>http://thenoisychannel.com/2009/12/05/blogs-i-read-living-la-vida-local/</link>
		<comments>http://thenoisychannel.com/2009/12/05/blogs-i-read-living-la-vida-local/#comments</comments>
		<pubDate>Sat, 05 Dec 2009 21:44:30 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2836</guid>
		<description><![CDATA[My new role at Google (yes, it still feels new after not quite a month!) has given me a professional interest in local search. I&#8217;ve adjusted my reading materials accordingly, and I&#8217;ve started reading blogs that focus on local. Here are a handful that I&#8217;ve discovered so far: BIA / Kelsey Blog By The Kelsey [...]]]></description>
			<content:encoded><![CDATA[<p>My new role at Google (yes, it still feels new after not quite a month!) has given me a professional interest in <a href="http://en.wikipedia.org/wiki/Local_search_%28Internet%29">local search</a>. I&#8217;ve adjusted my reading materials accordingly, and I&#8217;ve started reading blogs that focus on local. Here are a handful that I&#8217;ve discovered so far:</p>
<ul>
<li><a href="http://blog.kelseygroup.com/">BIA / Kelsey Blog</a>
<ul>
<li>By <a href="http://kelseygroup.com/">The Kelsey Group</a>, a division of <a href="http://www.bia.com/" target="_blank">BIA Advisory Services</a> that provides data and analysis on directories and local media.</li>
</ul>
</li>
<li><a href="http://blog.telemapics.com/">Exploring Local</a>
<ul>
<li> By <a href="http://www.glgroup.com/Council-Member/Michael-Dobson-178033.html">Mike Dobson</a>, President of <a href="http://telemapics.com/">TeleMapics</a>, a company that provides consulting services focused on local search.</li>
</ul>
</li>
<li><a href="http://www.localseoguide.com/">Local SEO Guide</a>
<ul>
<li><a href="http://www.localseoguide.com/about-me?PHPSESSID=7509db638c808d8ac60e49cc596c99fc">Andrew Shotland</a>&#8216;s blog on local search optimization, small business marketing &amp; search engine optimization strategy.</li>
</ul>
</li>
<li><a href="http://www.localsearchdatabase.com/">Localsearchdatabase</a>
<ul>
<li>By <a href="http://twitter.com/golander59">Gib Olander</a>, Director of Business Development for <a href="http://www.localeze.com/">Localeze</a>, an online content management company serving businesses, local search engines and consumers.</li>
</ul>
</li>
<li><a href="http://www.davidmihm.com/blog/">Mihmorandum</a>
<ul>
<li><a href="http://www.davidmihm.com/">David Mihm</a>&#8216;s blog on local search engine optimization and marketing.</li>
</ul>
</li>
<li><a href="http://gesterling.wordpress.com/">Screenwerk</a>
<ul>
<li><a href="http://gesterling.wordpress.com/about/">Greg Sterling</a>&#8216;s thoughts on online and offline media. Sterling used to run The Kelsey Group’s Interactive Local Media program.</li>
</ul>
</li>
<li><a href="http://www.solaswebdesign.net/wordpress/">SEO Igloo Blog</a>
<ul>
<li>By <a href="http://www.solaswebdesign.net/">Solas Web Design</a>, which specializes in web design and search engine optimization for small businesses.</li>
</ul>
</li>
<li><a href="http://blumenthals.com/blog/">Understanding Google Maps &amp; Local Search</a>
<ul>
<li>By <a href="http://www.blumenthals.com/index.php?MikeBlumenthal">Mike Blumenthal</a>, whose company offers consulting services and market research advice relating to maps and local search.</li>
</ul>
</li>
</ul>
<p>Not surprisingly, these blogs offer me a critical perspective on how Google and other search engines serve the local space. Granted, everyone has their own motives&#8211;and it&#8217;s hard to avoid some tension in a space with the competitive dynamics of local search. But now that I&#8217;m no longer an outsider myself, I appreciate having others to help keep me honest as I work to make local search better for users and businesses.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/12/05/blogs-i-read-living-la-vida-local/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/12/05/blogs-i-read-living-la-vida-local/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Can We Learn From Anti-Social Users?</title>
		<link>http://thenoisychannel.com/2009/11/21/can-we-learn-from-anti-social-users/</link>
		<comments>http://thenoisychannel.com/2009/11/21/can-we-learn-from-anti-social-users/#comments</comments>
		<pubDate>Sat, 21 Nov 2009 21:54:17 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2798</guid>
		<description><![CDATA[One of the interesting challenges we face as both developers and consumers of search technology is that social signals are a double-edged sword. On one hand, social signals have proven essential in distinguishing signal from noise&#8211;be they links, re-tweets, or any number of other ways that online consumers (or more correctly &#8220;prosumers&#8221;) actively and passively [...]]]></description>
			<content:encoded><![CDATA[<p>One of the interesting challenges we face as both developers and consumers of search technology is that social signals are a double-edged sword. On one hand, social signals have proven essential in distinguishing signal from noise&#8211;be they links, re-tweets, or any number of other ways that online consumers (or more correctly &#8220;prosumers&#8221;) actively and passively communicate value judgments about information. On the other hand, our reliance on these social signals makes us vulnerable to positive feedback and spammers.</p>
<p>Consider <a href="http://www.princeton.edu/~mjs3/musiclab.shtml">MusicLab</a>, an &#8220;<a href="http://www.princeton.edu/%7Emjs3/salganik_watts08.pdf" target="_blank">experimental study of self-fulfilling prophecies in an artificial cultural market</a>&#8221;. In this study, sociologists <a href="http://www.princeton.edu/~mjs3/index.shtml">Matt Salganik</a>, <a href="http://www.uvm.edu/~pdodds/home.html">Peter Dodds</a>, and <a href="http://en.wikipedia.org/wiki/Duncan_J._Watts">Duncan Watts</a> manipulated the social information available to consumers (specifically teens) regarding their peers&#8217; musical tastes. The experimenters&#8217; goal was to empirically validate a quantitative model of social contagion.</p>
<p>But we can look at this study another way: by isolating the social factors that influence musical taste, the experimenters were also isolating the non-social signal&#8211;in theory, how popular a song would be in the absence of social signaling. Indeed, they found that, if they measured a song&#8217;s quality by isolating out the social factor, &#8220;the best songs never do very badly, and the worst songs never do extremely well, but almost any other result is possible&#8221;.</p>
<p>It&#8217;s interesting&#8211;interesting to me, at least!&#8211;to ask if search engines can do the same for search. One of the frequent objections to link-based authority measures like <a href="http://en.wikipedia.org/wiki/PageRank">PageRank</a> is that they make the rich get richer. &#8220;Real-time&#8221; variants like re-tweet frequency (and even <a href="http://thenoisychannel.com/2009/01/13/a-twitter-analog-to-pagerank/">TunkRank</a>) suffer from the same weakness. Unchecked, these measures can cause the authority / influence market to resemble a <a href="http://ingrimayne.com/econ/resouceProblems/WinnerTakeIt.html">winner-take-all</a> market.</p>
<p>It strikes me as interesting to learn from cases where searchers swim upstream against the social signals to find information. Of course, you may already see the contradiction&#8211;this is just another kind of social signaling! Still, it seems like it might be a way to hedge our bets against the weaknesses of positive feedback and spammers. In a similar vein, we might look at how users find information that suffers from poor <a href="http://thenoisychannel.com/2008/04/22/accessibility-in-information-retrieval/">accessibility</a> or <a href="http://thenoisychannel.com/2009/09/26/information-retrievability/">retrievability</a>.</p>
<p>I don&#8217;t have answers about how to pursue such an approach, or whether it would even be feasible to do so. But I hope you agree with me that it&#8217;s an interesting question.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/11/21/can-we-learn-from-anti-social-users/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/11/21/can-we-learn-from-anti-social-users/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Twitter Lists as an Influence Measure?</title>
		<link>http://thenoisychannel.com/2009/11/01/twitter-lists-as-an-influence-measure/</link>
		<comments>http://thenoisychannel.com/2009/11/01/twitter-lists-as-an-influence-measure/#comments</comments>
		<pubDate>Sun, 01 Nov 2009 05:40:30 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2757</guid>
		<description><![CDATA[In &#8220;Using Twitter Lists To Judge Influence&#8221;, Todd Zeigler of the Bivings Report writes: I think Twitter Lists will end up helping separate the men from the boys when it comes to influence.  In addition to seeing a Twitter users follower count, we can now see the number of other Twitter users who have added [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.amazon.com/Influence-Mary-Kate-Olsen/dp/159514210X"><img class="alignnone size-full wp-image-2758" title="Influence" src="http://thenoisychannel.com/wordpress/wp-content/uploads/2009/11/influence.jpg" alt="Influence" width="179" height="220" /></a></p>
<p>In &#8220;<a href="http://www.bivingsreport.com/2009/using-twitter-lists-to-judge-influence/">Using Twitter Lists To Judge Influence</a>&#8221;, Todd Zeigler of the <a href="http://www.bivingsreport.com/">Bivings Report</a> writes:</p>
<blockquote><p>I think Twitter Lists will end up helping separate the men from the boys when it comes to influence.  In addition to seeing a Twitter users follower count, we can now see the number of other Twitter users who have added them to lists (example to the right).  I would argue that getting added to a list is a bigger deal than simply getting someone to follow you.</p></blockquote>
<p>I&#8217;m certainly intrigued by <a href="http://blog.twitter.com/2009/10/theres-list-for-that.html">Twitter Lists</a>, but I&#8217;m skeptical that counting how many lists someone is on will prove that much more useful than follower count. For example, <a href="http://twitter.com/dtunkelang">I</a> currently have <a href="http://twitter.com/dtunkelang/followers">1159 followers</a>, am on <a href="http://twitter.com/dtunkelang/lists/memberships">33 lists</a>, and have a <a href="http://twitter.com/dtunkelang/followers">TunkRank of 24.1</a>. For grins, here&#8217;s a handful of people who have similar stats:</p>
<ul>
<li><a href="http://twitter.com/kansandhaus">Evan Sandhaus</a>: 796 followers, 21 lists, TunkRank = 17.2</li>
<li><a href="http://twitter.com/jny2">Josh Young</a>: 801 followers, 25 lists, TunkRank = 14.3</li>
<li><span><a href="http://twitter.com/cjahearn">Chris Ahearn</a>: 1108 followers, 14 lists, TunkRank = </span>30.1</li>
<li><a href="http://twitter.com/brynn">Brynn Evans</a>: 1303 followers, 33 lists, TunkRank = 18.9</li>
<li><a href="http://twitter.com/eric_andersen">Eric Andersen</a>: 1543 followers, 37 lists, TunkRank = 3.1</li>
</ul>
<p>While I can&#8217;t generalize from a few arbitrarily selected data points (though Gladwell seems to have no trouble doing so in <a href="http://en.wikipedia.org/wiki/Outliers_%28book%29"><em>Outliers</em></a>), my suspicion is that list count will be highly correlated to follower count&#8211;and may actually be a noisier signal because the numbers are so much smaller.</p>
<p>Of course, there&#8217;s no reason we should use raw list counts&#8211;any more than we should use raw follower counts. Just as <a href="http://tunkrank.com/">TunkRank</a> aspires to <a href="http://thenoisychannel.com/2009/01/13/a-twitter-analog-to-pagerank/">model attention scarcity</a> and recognizes that not all followers are created equal, an effective measure of how lists contribute to influence must recognize that not all list memberships are created equal either.</p>
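<p>As a rough illustration of why raw counts mislead, here is a minimal sketch of an attention-weighted influence score in the spirit of TunkRank. The assumptions are loud ones: each follower splits attention evenly across everyone they follow, a follower passes along what they read with a constant probability <code>p</code>, and the function name, the default values, and the toy follow graph are all hypothetical. Treat it as a sketch of the idea, not as a specification of TunkRank itself.</p>

```python
def attention_weighted_influence(following, p=0.05, iterations=50):
    """Estimate influence from a follow graph.

    `following` maps each user to the set of accounts that user follows.
    Assumption: a follower's attention is divided evenly among the accounts
    they follow, and `p` is the chance that they pass a message along.
    """
    users = set(following)
    for followed in following.values():
        users |= set(followed)

    influence = {user: 0.0 for user in users}
    for _ in range(iterations):
        updated = {user: 0.0 for user in users}
        for follower, followed in following.items():
            if not followed:
                continue
            # One unit of direct attention plus the follower's own reach,
            # divided by how many accounts compete for that attention.
            share = (1.0 + p * influence[follower]) / len(followed)
            for user in followed:
                updated[user] += share
        influence = updated
    return influence


# Toy graph: "focused" and "crowded" each have exactly two followers.
toy_graph = {
    "f1": {"focused"},
    "f2": {"focused"},
    "g1": {"crowded", "n1", "n2", "n3"},
    "g2": {"crowded", "n1", "n2", "n3"},
}
scores = attention_weighted_influence(toy_graph)
```

<p>Both accounts have the same follower count, yet the one whose followers follow nobody else ends up with four times the score. The same reasoning would apply to list memberships: appearing on a list of ten accounts should count for more than appearing on a list of five hundred.</p>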
<p>I&#8217;ve been chatting with <a href="http://twitter.com/chl">Chris Langreiter</a>, who is working on <a href="http://etherpad.com/HoPv2hJ4GB">enhancements to TunkRank</a> to address some of the oversimplifications of its model, as well as with <a href="http://twitter.com/jonathanglick">Jonathan Glick</a> and <a href="http://twitter.com/kenreisman">Ken Reisman</a> at <a href="http://www.tlists.com/">TLists</a>. I&#8217;d like to see online influence&#8211;on Twitter and in general&#8211;measured more effectively. It will be great if lists can help, but we can&#8217;t make the same naive mistakes as those who were quick to embrace <a href="http://thenoisychannel.com/2008/12/27/loic-le-meur-misses-the-point-of-twitter/">follower count as a measure of authority</a>.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/11/01/twitter-lists-as-an-influence-measure/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/11/01/twitter-lists-as-an-influence-measure/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Google Experimenting with Social Search</title>
		<link>http://thenoisychannel.com/2009/10/26/google-experimenting-with-social-search/</link>
		<comments>http://thenoisychannel.com/2009/10/26/google-experimenting-with-social-search/#comments</comments>
		<pubDate>Mon, 26 Oct 2009 20:31:02 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2734</guid>
		<description><![CDATA[Google may be an also-ran in the social networking market with its Brazil-centric Orkut service, but that hasn&#8217;t stopped the search giant from adding social features to its products. A post at the (unofficial) Google Operating System blog recounts the history of Google Reader&#8217;s social evolution, up to but not including its latest update last [...]]]></description>
			<content:encoded><![CDATA[<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="560" height="272" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/ZqWJxgp-_mU&amp;hl=en&amp;fs=1&amp;" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="560" height="272" src="http://www.youtube.com/v/ZqWJxgp-_mU&amp;hl=en&amp;fs=1&amp;" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>Google may be an also-ran in the <a href="http://en.wikipedia.org/wiki/List_of_social_networking_websites">social networking market</a> with its Brazil-centric <a href="http://en.wikipedia.org/wiki/Orkut">Orkut</a> service, but that hasn&#8217;t stopped the search giant from adding social features to its products. A post at the (unofficial) Google Operating System blog recounts the history of <a href="http://googlesystem.blogspot.com/2009/07/google-readers-social-evolution.html">Google Reader&#8217;s social evolution</a>, up to but not including its <a href="http://googleblog.blogspot.com/2009/10/reading-gets-personal-with-popular.html">latest update</a> last week. <a href="http://googleblog.blogspot.com/2008/11/searchwiki-make-search-your-own.html">SearchWiki</a>, though not a social search feature per se, allows users to share personal annotations of their search results, as does the more recently introduced <a href="http://www.google.com/sidewiki/intl/en/index.html">Sidewiki</a>. And, <a href="http://blog.twitter.com/2009/10/bing-goes-dynamite.html">like Bing</a>, Google has established a <a href="http://blog.twitter.com/2009/10/google-nice.html">partnership with Twitter</a> in order to surface &#8220;social&#8221; results.</p>
<p>But the feature announced today, which Google is actually calling &#8220;<a href="http://googleblog.blogspot.com/2009/10/introducing-google-social-search-i.html">Social Search</a>&#8221;, is a much bigger step, even if it is tucked away as an <a href="http://www.google.com/experimental/">experiment on Google Labs</a>. From the official blog post:</p>
<blockquote><p>With Social Search, Google finds relevant public content from your friends and contacts and highlights it for you at the bottom of your search results. When I do a simple query for [new york], Google Social Search includes my friend&#8217;s blog on the results page under the heading &#8220;Results from people in your social circle for New York.&#8221; I can also filter my results to see only content from my social circle by clicking &#8220;Show options&#8221; on the results page and clicking &#8220;Social.&#8221;</p></blockquote>
<p>I gave it a whirl, searching for <a href="http://www.google.com/search?q=&quot;noisy+channel&quot;">&#8220;noisy channel&#8221;</a> and then restricting the search to content from what Google considers my social circle. The results are as promised, and I could further refine the results by author name, selecting from a familiar list of Neal Richter, Jason Adams, Daniel Lemire, Ken Ellis, and Joshua Young (<span style="text-decoration: line-through;">though for some reason Josh&#8217;s link didn&#8217;t work</span>). Cool! Except that there are a lot of names missing (check out the bloggers in <a href="http://thenoisychannel.com/the-noisy-community/">The Noisy Community</a>) and, more importantly, I can&#8217;t further refine or even sort the search results. Indeed, the ordering of search results seems quite arbitrary&#8211;a phenomenon I&#8217;ve noticed more generally for search engine ranking of social media content.</p>
<p>In short, Google Social Search is a welcome initiative, but there&#8217;s a lot more work to do before I would find a productive use for it. Given the mismatch between social search and black-box relevance ranking, a little bit of <a href="http://en.wikipedia.org/wiki/Human–computer_information_retrieval">HCIR</a> would go a long way towards making this feature practically useful.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/10/26/google-experimenting-with-social-search/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/10/26/google-experimenting-with-social-search/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>HCIR 2009: Human-Human Interaction</title>
		<link>http://thenoisychannel.com/2009/10/26/hcir-2009-human-human-interaction/</link>
		<comments>http://thenoisychannel.com/2009/10/26/hcir-2009-human-human-interaction/#comments</comments>
		<pubDate>Mon, 26 Oct 2009 14:24:19 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2727</guid>
		<description><![CDATA[On Friday, I had the privilege of seeing just how much the annual Workshop on Human-Computer Information Retrieval has grown up since I conceived it in the summer of 2007. Back then, my co-conspirators and I worried about attracting a critical mass of participants&#8211;indeed, Endeca employees easily accounted for a quarter of the attendees (and [...]]]></description>
			<content:encoded><![CDATA[<p>On Friday, I had the privilege of seeing just how much the annual <a href="http://cuaslis.org/hcir2009/">Workshop on Human-Computer Information Retrieval</a> has grown up since I conceived it in the summer of 2007. Back then, my co-conspirators and I worried about attracting a critical mass of participants&#8211;indeed, <a href="http://endeca.com/">Endeca</a> employees easily accounted for a quarter of the attendees (and submissions) at the <a href="http://projects.csail.mit.edu/hcir/">first HCIR workshop</a>. And even <a href="http://research.microsoft.com/en-us/um/people/ryenw/hcir2008/">last year</a> host and co-sponsor <a href="http://research.microsoft.com/">Microsoft Research</a> supplied a disproportionate share of the attendees.</p>
<p>But this year was different. We were overloaded with strong submissions from all corners, and we had to turn people away for lack of capacity! While we didn&#8217;t relish saying no to prospective participants, these are great problems to have! And, thanks to Nick Belkin and Diane Kelly, we&#8217;ve arranged to greatly increase that capacity at <a href="http://iiix2010.org/hcir.php">HCIR 2010</a>&#8211;more on that in a moment.</p>
<p><a href="http://www.cs.swan.ac.uk/~csmax/">Max Wilson</a> has already written up an <a href="http://www.cs.swan.ac.uk/~csmax/blog/2009/10/hcir09-redux/">excellent summary</a> of the workshop, which I encourage you to read. You can also see the live tweet stream at <a href="http://search.twitter.com/search?q=%23hcir09">#hcir09</a>. Rather than duplicate these efforts, let me add my personal reflections as an organizer and participant.</p>
<p><a href="http://www.cs.umd.edu/~ben/">Ben Shneiderman</a>&#8217;s keynote address was sweeping and inspiring. I expected him to talk about <a href="http://en.wikipedia.org/wiki/Information_visualization">information visualization</a>, the area where he is best known for his contributions. He did present some examples of his group&#8217;s work on <a href="http://www.cs.umd.edu/hcil/lifelines2/">visualization-centric interfaces to support medical research</a>, but his overall presentation took the much more ambitious approach of discussing the past, present, and possible future of <a href="http://en.wikipedia.org/wiki/Human–computer_information_retrieval">HCIR</a>. Specifically, he urged us to link our work to societal goals, such as the <a href="http://www.un.org/millenniumgoals/">United Nations Millennium Development Goals</a>. His challenge may seem impossibly idealistic, but I agree with his assertion that it is a practical one: we will do our best research by grounding ourselves firmly in the real and pressing problems of our age. <a href="http://research.microsoft.com/en-us/um/people/sdumais/">Last year&#8217;s keynote speaker</a> went on to win the <a href="http://www.sigir.org/awards/awards.html">Gerard Salton Award</a>; I can only hope that Ben receives comparable accolades for his past accomplishments and future contributions to HCIR.</p>
<p>A new feature for this year&#8217;s workshop was having a &#8220;poster boaster&#8221; session, in which each of the presenters in the poster session had one minute to pitch his or her work. For those of you unfamiliar with this format, I highly recommend it. The compressed format forces presenters to distill the essence of their contributions&#8211;a useful exercise in general. And the audience doesn&#8217;t get bored: if you decide halfway into a presentation that you aren&#8217;t interested, then you only have to wait 30 seconds until the next one! Not that we had that problem: the posters were consistently interesting, as the submissions were unusually strong this year. You can download the full workshop proceedings <a href="http://cuaslis.org/hcir2009/HCIR2009.pdf">here</a>.</p>
<p>Even the full presentations weren&#8217;t that long. The five speakers were each allotted ten minutes, with a healthy amount of time reserved for a panel-style Q&amp;A session. The papers in this session were, by design, some of the more controversial ones. In particular, <a href="http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/v/Voorhees:Ellen_M=.html">Ellen Voorhees</a> delivered a full-throated defense of <a href="http://en.wikipedia.org/wiki/Cranfield_Experiments">Cranfield</a> / <a href="http://en.wikipedia.org/wiki/Text_Retrieval_Conference">TREC</a>-style evaluation: &#8220;I Come Not to Bury Cranfield, but to Praise It&#8221; (similar to her <a href="http://www.dcs.gla.ac.uk/workshops/air/slides/EllenVoorhees-TestCollectionsforAIR.pdf">presentation</a> at the <a href="http://www.dcs.gla.ac.uk/workshops/air/">2006 Workshop on Adaptive Information Retrieval</a> that I <a href="http://thenoisychannel.com/2008/04/17/ellen-voorhees-defends-cranfield/">discussed</a> on this blog last year). Her reminder of HCIR&#8217;s challenges on the evaluation front surely ruffled some feathers, but all of us HCIR advocates need to address these challenges if we want researchers (and practitioners) outside our community to drink our kool-aid.</p>
<p>The above format was already quite interactive (as befits a workshop about interaction), but the second half of the day was explicitly designed to facilitate discussion. We had lunch on site, followed by a one-hour poster session.  We then had two one-hour guided discussion sessions to address the theoretical and practical concerns of HCIR. As organizers, we seeded both sessions with questions, but we also incorporated concerns that had come up during earlier discussions.</p>
<p>Finally, I am grateful to our sponsors. <a href="http://slis.cua.edu/">Catholic University</a> was a gracious host and sponsor, providing the workshop with a great space and very helpful student volunteers. Between that and the financial contributions of <a href="http://endeca.com/">Endeca</a> and <a href="http://research.microsoft.com/">Microsoft Research</a>, we were able to continue our tradition of not charging attendees for the workshop. I can&#8217;t promise that will continue indefinitely, but I am glad that our insistence on emphasizing substance over frivolous amenities has helped us deliver what I believe to be some of the best bang-for-buck in the scholarly community.</p>
<p>I&#8217;m already excited about <a href="http://iiix2010.org/hcir.php">HCIR 2010</a>. Unlike the past three workshops, which have been held as independent events, next year&#8217;s workshop will be co-located with the <a href="http://iiix2010.org/">Information Interaction in Context Symposium (IIiX’10)</a> in New Brunswick, New Jersey. The workshop will take place on August 22nd, breaking our unintended tradition of holding the workshop on October 23rd. <a href="http://comminfo.rutgers.edu/~belkin/belkin.html">Nick Belkin</a> assures us that there will be lots of space, so hopefully we&#8217;ll be able to accommodate everyone who is interested. We&#8217;ll also be soliciting sponsors for both the workshop and the broader symposium.</p>
<p>But there&#8217;s more to HCIR than enjoying each other&#8217;s company at workshops. We must spend the remaining 364 days of the year fleshing out our vision, and relating that vision not only to the disciplines HCIR explicitly integrates, but to pressing social concerns. It is up to us all to make our work relevant.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/10/26/hcir-2009-human-human-interaction/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/10/26/hcir-2009-human-human-interaction/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Innovation at Huffington Post: Data-Driven Headlines</title>
		<link>http://thenoisychannel.com/2009/10/15/innovation-at-huffington-post-data-driven-headlines/</link>
		<comments>http://thenoisychannel.com/2009/10/15/innovation-at-huffington-post-data-driven-headlines/#comments</comments>
		<pubDate>Thu, 15 Oct 2009 18:17:37 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2705</guid>
		<description><![CDATA[The other day, I was suggesting to one of my colleagues that Endeca&#8217;s software could help authors write better (translation: more SEO-friendly) headlines. The details of that discussion are proprietary, but I&#8217;m sure you can imagine the gist. But we all wondered whether authors would be willing to stomach such a left-brain infringement on their [...]]]></description>
			<content:encoded><![CDATA[<p>The other day, I was suggesting to one of my colleagues that <a href="http://endeca.com/">Endeca</a>&#8217;s software could help authors write better (translation: more <a href="http://en.wikipedia.org/wiki/Search_engine_optimization">SEO</a>-friendly) headlines. The details of that discussion are proprietary, but I&#8217;m sure you can imagine the gist. But we all wondered whether authors would be willing to stomach such a left-brain infringement on their right-brain creativity.</p>
<p>But apparently the <a href="http://www.huffingtonpost.com/">Huffington Post</a> is blazing new trails in this area. The <a href="http://www.niemanlab.org/2009/10/how-the-huffington-post-uses-real-time-testing-to-write-better-headlines/">Nieman Journalism Lab</a> reports that:</p>
<blockquote><p><strong>The Huffington Post applies A/B testing to some of its headlines.</strong> Readers are randomly shown one of two headlines for the same story. After five minutes, which is enough time for such a high-traffic site, the version with the most clicks becomes the <a href="http://www.google.com/search?q=site%3Aobserver.com+%22wood+war%22">wood</a> that everyone sees.</p></blockquote>
<p>NJL also reports that Huffington Post social media editor&#8211;and long-time Noisy Channel reader&#8211;<a href="http://networkednews.wordpress.com/">Josh Young</a> uses Twitter to help crowd-source better headlines.</p>
<p>I&#8217;m sure this approach must rattle some old-school journalists. And there is a real danger of optimizing for the wrong outcome. For example, including the word &#8220;sex&#8221; in this message might improve its traffic (the popularity of <a href="http://thenoisychannel.com/2008/12/12/the-noisy-channel-now-better-than-sex/">this post</a> attests to that), but to what end?</p>
<p>Still, I don&#8217;t see this use of technology as cramping anyone&#8217;s style. Most of us write to be read&#8211;especially those in the media industry who are trying to monetize their audiences. Measurable success matters, and there&#8217;s no harm in trying to maximize it.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/10/15/innovation-at-huffington-post-data-driven-headlines/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/10/15/innovation-at-huffington-post-data-driven-headlines/feed/</wfw:commentRss>
		<slash:comments>31</slash:comments>
		</item>
		<item>
		<title>Are Duplicate Tweets Spam?</title>
		<link>http://thenoisychannel.com/2009/10/15/are-duplicate-tweets-spam/</link>
		<comments>http://thenoisychannel.com/2009/10/15/are-duplicate-tweets-spam/#comments</comments>
		<pubDate>Thu, 15 Oct 2009 16:43:09 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2701</guid>
		<description><![CDATA[The Twitterverse is all a-twitter with a new controversy: Twitter has rolled out a new feature that blocks duplicate tweets. They reported to the SocialOomph blog that: Recurring Tweets are a violation no matter how they are done, including whether or not someone pays you to have a special privilege. We don’t want to see [...]]]></description>
			<content:encoded><![CDATA[<p>The Twitterverse is all a-twitter with a new controversy: Twitter has rolled out a new feature that <a href="http://www.techcrunch.com/2009/10/14/cleaning-up-the-stream-twitter-kills-duplicate-tweets/">blocks duplicate tweets</a>. They reported to the <a href="http://www.socialoomphblog.com/recurring-tweets/">SocialOomph</a> blog that:</p>
<blockquote><p>Recurring Tweets are a violation no matter how they are done, including whether or not someone pays you to have a special privilege. We don’t want to see any duplicate tweets whatsoever- They pollute Twitter, and tools shouldn’t be given to enable people to break the rules. Spinnable text seems to just be a way to bypass the rules against duplicate updates and essentially provides the same problems.</p>
<p>Hence, from Thursday, October 15th, 2009, 00:00 AM CST we will prevent the entry of recurring tweets on Twitter accounts within the SocialOomph system. Existing recurring tweets on Twitter accounts will all be placed in paused state at that time, so that the content of the tweet text is still accessible to you, but no publishing to Twitter of those tweets will take place.</p></blockquote>
<p>Not everyone is thrilled with this new feature. My friend (and Noisy Channel reader) <a href="http://twitter.com/eric_andersen">Eric Andersen</a> notes: &#8220;<span title="processed"><span>this doesn&#8217;t make a lot of sense to me &#8211; many highly regarded Twitter users (e.g. @<a href="http://twitter.com/GuyKawasaki">GuyKawasaki</a>) regularly re-post tweets&#8230;</span></span><span title="processed"><span>primarily because of the &#8220;dip&#8221; model: re-posting the same tweet means more people will see, especially with an int&#8217;l audience.&#8221;</span></span></p>
<p><span title="processed"><span>On one hand, I loathe inefficient communication, and I see repeated tweets as exposing the inefficiency of the dip model. We won&#8217;t get into my <a href="http://thenoisychannel.com/2009/04/06/guy-kawasaki-ill-say-it/">differences of opinion</a> with Guy Kawasaki. If Twitter offered better search and control to users, then I think it would make sense for them to consider </span></span><span title="processed"><span>duplicate tweets as a spam issue.</span></span></p>
<p><span title="processed"><span>On the other hand, Twitter search is <a href="http://thenoisychannel.com/2009/05/09/the-twouble-with-twitter-search/">crude</a>. And the dip model, much as it may raise my <a href="http://thenoisychannel.com/2009/01/02/an-attention-ponzi-scheme/">personal hackles</a>, is, in fact, what many users embrace. Twitter takes pride in letting users drive innovation, and I think they should be cautious about being too autocratic. Surely many of the people who post duplicate tweets do so with unspammy intentions.</span></span></p>
<p><span title="processed"><span>Let&#8217;s face it: Twitter is going through growing pains, even if it just inherited the <a href="http://pacific.bizjournals.com/pacific/stories/2009/10/05/daily60.html">mother of all trust funds</a>. They really do have to address <a href="http://thenoisychannel.com/2009/06/27/are-spammers-taking-over-twitter/">spam</a>. But they might consider doing so in a less heavy-handed way. I suspect that duplicate tweets are mainly a problem because they affect the statistics for Trending Topics&#8211;a problem they could easily address without prohibiting the tweets themselves. Better search would make it easier for users to take charge of the user experience&#8211;a small dose of <a href="http://en.wikipedia.org/wiki/Human%E2%80%93computer_information_retrieval">HCIR</a> would go a long way.</span></span></p>
<p><span title="processed"><span>I think Twitter has the best of intentions, and that it is confronting a real problem. I hope they work harder to find the right solution.<br />
</span></span></p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/10/15/are-duplicate-tweets-spam/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/10/15/are-duplicate-tweets-spam/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Is Twitter Planning To Monetize The Firehose?</title>
		<link>http://thenoisychannel.com/2009/10/08/is-twitter-planning-to-monetize-the-firehose/</link>
		<comments>http://thenoisychannel.com/2009/10/08/is-twitter-planning-to-monetize-the-firehose/#comments</comments>
		<pubDate>Thu, 08 Oct 2009 13:05:05 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2671</guid>
		<description><![CDATA[A few months ago, I wrote in &#8220;The Twouble with Twitter Search&#8221;: But the trickle that Twitter returns is hardly enough. I believe this limitation is by design–that Twitter knows the value of such access and isn’t about to give it away. I just hope Twitter will figure out a way to provide this access [...]]]></description>
			<content:encoded><![CDATA[<p>A few months ago, I wrote in &#8220;<a href="http://thenoisychannel.com/2009/05/09/the-twouble-with-twitter-search/">The Twouble with Twitter Search</a>&#8221;:</p>
<blockquote><p>But the trickle that Twitter returns is hardly enough.</p>
<p>I believe this limitation is by design–that Twitter knows the value of such access and isn’t about to give it away. I just hope Twitter will figure out a way to provide this access for a price, and that an ecology of information access providers develops around it. Of course, if Google or Microsoft buys Twitter first, that probably won’t happen.</p></blockquote>
<p>Now that Twitter has raised $100M at a valuation of $1B, I doubt any acquisition will happen anytime soon. But, according to <a href="http://kara.allthingsd.com/20091008/twitter-talking-separately-to-microsoft-and-also-google-about-big-data-mining-deals/">Kara Swisher&#8217;s unnamed sources</a>:</p>
<blockquote><p>Twitter is in advanced talks with Microsoft and Google separately about striking data-mining deals, in which the companies would license a full feed from the microblogging service that could then be integrated into the results of their competing search engines.</p></blockquote>
<p>If so, then it&#8217;s about time! How much either Microsoft or Google would pay for this feed is an interesting question. It&#8217;s probably not a coincidence that Twitter raised its last round of funding before pursuing this path&#8211;the revenue they obtain this way could be significant, but is unlikely to justify a $1B valuation.</p>
<p>In any case, I&#8217;m excited as a consumer that Twitter may finally allow Google and Microsoft to better expose the value of its content. But I&#8217;m also curious what my friends on the Twitter Search team think of the potential competition from the web search titans. Until now, no one has been able to compete effectively with Twitter&#8217;s native search, for lack of access to the firehose. Having such access would give Google and Microsoft more than a fighting chance. Given the centrality of search to Twitter&#8217;s user experience, it&#8217;s an interesting corporate strategy.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/10/08/is-twitter-planning-to-monetize-the-firehose/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/10/08/is-twitter-planning-to-monetize-the-firehose/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Privacy, Pseudonymity, and Copyright</title>
		<link>http://thenoisychannel.com/2009/09/29/privacy-pseudonymity-and-copyright/</link>
		<comments>http://thenoisychannel.com/2009/09/29/privacy-pseudonymity-and-copyright/#comments</comments>
		<pubDate>Tue, 29 Sep 2009 20:49:16 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2627</guid>
		<description><![CDATA[A lunch conversation during the Transparent Text symposium about transparency in social media (also a hot topic in the Ethics of Blogging panel) led me to watch the following presentation from Lawrence Lessig on &#8220;Privacy 2.0&#8221;: Another topic in that conversation was pseudonymity. Someone pointed to a 2000 USENIX paper entitled &#8220;Can Pseudonymity Really Guarantee [...]]]></description>
			<content:encoded><![CDATA[<p>A lunch conversation during the <a href="http://thenoisychannel.com/2009/09/21/transparent-text-symposium-day-1/">Transparent</a> <a href="http://thenoisychannel.com/2009/09/23/transparent-text-symposium-day-2/">Text</a> symposium about transparency in social media (also a hot topic in the <a href="http://thenoisychannel.com/2009/09/10/the-ethics-of-blogging/">Ethics of Blogging</a> panel) led me to watch the following presentation from <a href="http://www.lessig.org/">Lawrence Lessig</a> on &#8220;<a href="http://lessig.blip.tv/file/2016591/">Privacy 2.0</a>&#8221;:</p>
<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="480" height="390" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="src" value="http://blip.tv/play/lG372wMC" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="480" height="390" src="http://blip.tv/play/lG372wMC" allowfullscreen="true"></embed></object></p>
<p>Another topic in that conversation was <a href="http://en.wikipedia.org/wiki/Pseudonymity">pseudonymity</a>. Someone pointed to a 2000 <a href="http://www.usenix.org/">USENIX</a> paper entitled &#8220;<a href="http://www.usenix.org/events/sec2000/full_papers/rao/rao_html/index.html">Can Pseudonymity Really Guarantee Privacy?</a>&#8221; The challenges of implementing pseudonymity have, of course, received lots of attention in the past few years. The most notorious example is the <a href="http://en.wikipedia.org/wiki/AOL_search_data_scandal">AOL search data scandal</a>, which made the <a href="http://www.nytimes.com/2006/08/09/technology/09aol.html">front page of the New York Times</a>. But there&#8217;s also the work co-authored by my friend <a href="http://www.cs.utexas.edu/users/shmat/">Vitaly Shmatikov</a> on <a href="http://www.cs.utexas.edu/users/shmat/shmat_oak08netflix.pdf">de-anonymizing Netflix data</a>. Indeed, some have expressed concern that the new Netflix competition is a <a href="http://www.freedom-to-tinker.com/blog/paul/netflixs-impending-still-avoidable-multi-million-dollar-privacy-blunder">privacy lawsuit waiting to happen</a>.</p>
<p>Finally, <a href="http://www.danah.org/">danah boyd</a>&#8217;s master&#8217;s thesis on &#8220;<a href="http://smg.media.mit.edu/papers/danah/danahThesis.pdf">faceted id/entity: managing representation in a digital world</a>&#8221; also came up&#8211;and I recently discovered by way of <a href="http://scobleizer.com/2009/09/26/youre-not-on-twitters-suggested-user-list-but-you-are-in-good-company/">Robert Scoble</a> that she&#8217;ll be <a href="http://sxsw.com/node/3432">keynoting at SXSW</a> next year. Now I feel even more proud that I convinced her to speak at the <a href="http://thenoisychannel.com/2009/07/29/sigir-2009-day-3-industry-track-danah-boyd/">SIGIR Industry Track</a> this year. But I digress.</p>
<p>What does any of this have to do with copyright? Watch Lessig&#8217;s presentation&#8211;it&#8217;s long, but I promise you it&#8217;s worthwhile and entertaining to boot. Besides, I&#8217;ve made it easy by embedding it for you! He makes an analogy&#8211;rather, he makes fair use of <a href="http://cyber.law.harvard.edu/people/jzittrain">Jonathan Zittrain</a>&#8217;s analogy&#8211;between privacy rights and copyright.</p>
<p>The executive (and overgeneralized) summary is that both privacy-holders (&#8220;consumers&#8221;) and copyright-holders (&#8220;industry&#8221;) have complained that technology has undermined their rights, and both have sought out legal remedies. Consumers push back on industry, frustrated with legal strategies to enforce copyright at the expense of consumer freedom, preferring instead to let technology dictate policy; industry pushes back on consumers, frustrated with their legal strategies to enforce privacy rights at the expense of industry freedom, in this case preferring instead to let technology dictate policy. The analogy may not be perfect, but it is close enough to be compelling.</p>
<p>But I&#8217;d like to stretch the analogy further than Lessig and Zittrain to consider pseudonymity and <a href="http://en.wikipedia.org/wiki/Derivative_work">derivative works</a>. The pseudonymity challenge (e.g., the recent reports about <a href="http://thenoisychannel.com/2009/09/20/project-gaydar-a-reminder-that-privacy-isnt-binary/">Project Gaydar</a>) reminds us that privacy isn&#8217;t binary, and that we have to accept at least some loss of privacy if we are going to live in a social world. Similarly, provisions like <a href="http://en.wikipedia.org/wiki/Fair_use">fair use</a> exist because copyright is an inherent trade-off between protecting creators&#8217; rights and embracing the value of creation in a social context.</p>
<p>As I said, I find Zittrain&#8217;s analogy and Lessig&#8217;s presentation compelling. While it may not answer any of society&#8217;s urgent questions about privacy and copyright, it may at least further the conversation. At the very least, I hope the topic is intellectually stimulating.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/09/29/privacy-pseudonymity-and-copyright/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/09/29/privacy-pseudonymity-and-copyright/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ethics of Blogging: Webcast Now Available</title>
		<link>http://thenoisychannel.com/2009/09/28/ethics-of-blogging-webcast-now-available/</link>
		<comments>http://thenoisychannel.com/2009/09/28/ethics-of-blogging-webcast-now-available/#comments</comments>
		<pubDate>Tue, 29 Sep 2009 03:27:26 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2623</guid>
		<description><![CDATA[Thanks to Robin Fray Carey for posting the webcast of the Ethics of Blogging panel on the Social Media Today site. You can also catch the tweet stream at #SMTWebcast while it&#8217;s still indexed.]]></description>
			<content:encoded><![CDATA[<p>Thanks to Robin Fray Carey for posting the <a href="http://www.socialmediatoday.com/SMC/127920">webcast</a> of the <a href="http://thenoisychannel.com/2009/09/10/the-ethics-of-blogging/">Ethics of Blogging</a> panel on the Social Media Today site. You can also catch the tweet stream at <a href="http://search.twitter.com/search?q=%23SMTWebcast">#SMTWebcast</a> while it&#8217;s still indexed.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/09/28/ethics-of-blogging-webcast-now-available/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/09/28/ethics-of-blogging-webcast-now-available/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ethics of Blogging Panel Today</title>
		<link>http://thenoisychannel.com/2009/09/24/ethics-of-blogging-panel-today/</link>
		<comments>http://thenoisychannel.com/2009/09/24/ethics-of-blogging-panel-today/#comments</comments>
		<pubDate>Thu, 24 Sep 2009 11:56:35 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2604</guid>
		<description><![CDATA[Just a reminder that I&#8217;m participating in an online panel today (at 1pm EST) to discuss the Ethics of Blogging. Maggie Fox, founder and CEO of Social Media Group, will moderate a panel composed of Augie Ray, who blogs at Experience: The Blog and is Managing Director of Experiential Marketing at interactive and social media [...]]]></description>
			<content:encoded><![CDATA[<p>Just a reminder that I&#8217;m participating in an online panel today (at 1pm EST) to discuss the <a href="http://www.socialmediatoday.com/submitform/smtwebinar092409/?reference=smt_dtunkelang">Ethics of Blogging</a>.</p>
<p><a href="http://socialmediagroup.com/about/">Maggie Fox</a>, founder and CEO of <a href="http://socialmediagroup.com/">Social Media Group</a>, will moderate a panel composed of <a href="https://twitter.com/augieray">Augie Ray</a>, who blogs at <a href="http://www.experiencetheblog.com/">Experience: The Blog</a> and is Managing Director of Experiential Marketing at interactive and social media agency <a href="http://www.fullhouseinteractive.com/">Fullhouse</a>; <a href="http://johnjantsch.com/">John Jantsch</a>, who blogs at <a href="http://www.ducttapemarketing.com/blog/">Duct Tape Marketing</a> and is a marketing and digital technology coach; and yours truly. It’s free to attend; just register <a href="http://www.socialmediatoday.com/submitform/smtwebinar092409/?reference=smt_dtunkelang">here</a>.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/09/24/ethics-of-blogging-panel-today/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/09/24/ethics-of-blogging-panel-today/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>HCIR 2009 Accepted Submissions</title>
		<link>http://thenoisychannel.com/2009/09/23/hcir-2009-accepted-submissions/</link>
		<comments>http://thenoisychannel.com/2009/09/23/hcir-2009-accepted-submissions/#comments</comments>
		<pubDate>Wed, 23 Sep 2009 16:56:06 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2600</guid>
		<description><![CDATA[The agenda for HCIR 2009 is now online! As previously announced, Ben Shneiderman from the University of Maryland will be the keynote speaker. The accepted submissions are as follows: Panel Presentations Usefulness as the Criterion for Evaluation of Interactive Information Retrieval Michael Cole, Jingjing Liu, Nicholas Belkin, Ralf Bierig, Jacek Gwizdka, Chang Liu, Jun Zhang [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://cuaslis.org/hcir2009/agenda.html">agenda</a> for <a href="http://cuaslis.org/hcir2009/">HCIR 2009</a> is now online! As previously announced, <a href="http://www.cs.umd.edu/~ben/">Ben Shneiderman</a> from the University of Maryland will be the keynote speaker. The accepted submissions are as follows:</p>
<p><strong>Panel Presentations</strong></p>
<ul>
<li>Usefulness as the Criterion for Evaluation of Interactive Information Retrieval<br />
<em> Michael Cole, Jingjing Liu, Nicholas Belkin, Ralf Bierig, Jacek Gwizdka, Chang Liu, Jun Zhang and Xiangmin Zhang (Rutgers University)</em></li>
<li>Modeling Searcher Frustration<br />
<em> Henry Feild and James Allan (University of Massachusetts Amherst)</em></li>
<li>Query Suggestions as Idea Tactics for Information Search<br />
<em> Diane Kelly (University of North Carolina at Chapel Hill)</em></li>
<li>I Come Not to Bury Cranfield, but to Praise It<br />
<em> Ellen Voorhees (National Institute of Standards and Technology)</em></li>
<li>Search Tasks and Their Role in Studies of Search Behaviors<br />
<em> Barbara Wildemuth (University of North Carolina at Chapel Hill) and Luanne Freund (University of British Columbia)</em></li>
</ul>
<p><strong>Posters and Demonstrations</strong></p>
<ul>
<li>Visual Interaction for Personalized Information Retrieval<br />
<em> Jae-wook Ahn and Peter Brusilovsky (University of Pittsburgh)</em></li>
<li>PuppyIR: Designing an Open Source Framework for Interactive Information Services for Children<br />
<em> Leif Azzopardi (University of Glasgow), Richard Glassey (University of Glasgow), Mounia Lalmas (University of Glasgow), Tamara Polajnar (University of Glasgow) and Ian Ruthven (University of Strathclyde)</em></li>
<li>Designing an Interactive Automatic Document Classification System<br />
<em> Kirk Baker (Collexis)</em></li>
<li>The HCI Browser Tool for Studying Web Search Behavior<br />
<em> Robert Capra (University of North Carolina at Chapel Hill)</em></li>
<li>A Graphic User Interface for Content and Structure Queries in XML Retrieval<br />
<em> Juan M. Fernández-Luna, Luis M. de Campos, Juan F. Huete and Carlos J. Martin-Dancausa (University of Granada)</em></li>
<li>Improving Search-Driven Development with Collaborative Information Retrieval Techniques<br />
<em> Juan M. Fernández-Luna (University of Granada), Juan F. Huete (University of Granada), Ramiro Pérez-Vázquez (Universidad Central de Las Villas) and Julio C. Rodríguez-Cano (Universidad de Holguín)</em></li>
<li>A visualization interface for interactive search refinement<br />
<em> Fernando Figueira Filho (State University of Campinas), João Porto de Albuquerque (University of Sao Paulo), André Resende (State University of Campinas), Paulo Lício de Geus (State University of Campinas) and Gary Olson (University of California, Irvine)</em></li>
<li>Cognitive Dimensions Analysis of Interfaces for Information Seeking<br />
<em> Gene Golovchinsky (FX Palo Alto Laboratory, Inc.)</em></li>
<li>Cognitive Load and Web Search Tasks<br />
<em> Jacek Gwizdka (Rutgers University)</em></li>
<li>Visualising Digital Video Libraries for TV Broadcasting Industry: A User-Centred Approach<br />
<em> Mieke Haesen, Jan Meskens and Karin Coninx (Hasselt University)</em></li>
<li>Log Based Analysis of How Faceted and Text Based Searching Interact in a Library Catalog Interface<br />
<em> Bradley Hemminger (University of North Carolina), Xi Niu (University of North Carolina) and Cory Lown (NC State Libraries)</em></li>
<li>Freebase Cubed: Text-based Collection Queries for Large, Richly Interconnected Data Sets<br />
<em> David Huynh (Metaweb Technologies, Inc.)</em></li>
<li>System Controlled Assistance for Improving Search Performance<br />
<em> Bernard Jansen (Pennsylvania State University)</em></li>
<li>Designing for Enterprise Search in a Global Organization<br />
<em> Maria Johansson and Lina Westerling (Findwise AB)</em></li>
<li>Cultural Differences in Information Behavior<br />
<em> Anita Komlodi (University of Maryland Baltimore County) and Karoly Hercegfi (Budapest University of Technology and Economics)</em></li>
<li>Adapting an Information Visualization Tool for Mobile Information Retrieval<br />
<em> Sherry Koshman and Jae-wook Ahn (University of Pittsburgh)</em></li>
<li>A Theoretical Framework for Subjective Relevance<br />
<em> Katrina Muller and Diane Kelly (University of North Carolina)</em></li>
<li>Query Reuse in Exploratory Search Tasks<br />
<em> Chirag Shah and Gary Marchionini (University of North Carolina at Chapel Hill)</em></li>
<li>Augmenting Cranfield-Style Evaluation with GOMS to Obtain Timed Predictions of User Performance<br />
<em> Mark Smucker (University of Waterloo)</em></li>
<li>Text-To-Query: Suggesting Structured Analytics to Illustrate Textual Content<br />
<em> Raphael Thollot (SAP Business Objects) and Marie-Aude Aufaure (Ecole Centrale Paris)</em></li>
<li>The Information Availability Problem<br />
<em> Daniel Tunkelang (Endeca)</em></li>
<li>Exploratory Search Over Temporal Event Sequences: Novel Requirements, Operations, and a Process Model<br />
<em> Taowei Wang, Krist Wongsuphasawat, Catherine Plaisant and Ben Shneiderman (University of Maryland)</em></li>
<li>Keyword Search: Quite Exploratory Actually<br />
<em> Max Wilson (Swansea University)</em></li>
<li>Using Twitter to Assess Information Needs: Early Results<br />
<em> Max Wilson (Swansea University)</em></li>
<li>Integrating User-generated Content Description to Search Interface Design<br />
<em> Kyunghye Yoon (SUNY Oswego)</em></li>
<li>Ambiguity and Context-Aware Query Reformulation<br />
<em> Hui Zhang (Indiana University)</em></li>
</ul>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/09/23/hcir-2009-accepted-submissions/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/09/23/hcir-2009-accepted-submissions/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Goby Goes Deep</title>
		<link>http://thenoisychannel.com/2009/09/23/goby-goes-deep/</link>
		<comments>http://thenoisychannel.com/2009/09/23/goby-goes-deep/#comments</comments>
		<pubDate>Wed, 23 Sep 2009 11:17:51 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2597</guid>
		<description><![CDATA[At  the first HCIR workshop in 2007, Michael Stonebraker stood up in the middle of an open discussion session and told all assembled that we needed to be thinking about the deep web. I don&#8217;t know how much the audience took heed of his call, but he certainly followed his own advice. He and Endeca [...]]]></description>
			<content:encoded><![CDATA[<p>At  the first <a href="http://en.wikipedia.org/wiki/Human%E2%80%93computer_information_retrieval">HCIR</a> workshop in <a href="http://projects.csail.mit.edu/hcir/web/">2007</a>, <a href="http://en.wikipedia.org/wiki/Michael_Stonebraker">Michael Stonebraker</a> stood up in the middle of an open discussion session and told all assembled that we needed to be thinking about the <a href="http://en.wikipedia.org/wiki/Deep_Web">deep web</a>.</p>
<p>I don&#8217;t know how much the audience took heed of his call, but he certainly followed his own advice. He and Endeca alum <a href="http://twitter.com/viking2917">Mark Watkins</a> just launched <a href="http://www.goby.com/">Goby</a>, a vertical search engine that exhorts you to &#8220;create your own adventure&#8221;.  It&#8217;s fun&#8211;a sort of <a href="http://en.wikipedia.org/wiki/Exploratory_search">exploratory search</a> for explorers. And it uses a deep web crawl to populate its index with semi-structured data.</p>
<p>Anyway, try it out! I&#8217;ve been in the private beta, but haven&#8217;t had the chance to see what they&#8217;ve been up to in the final stretch leading to the launch. You can also read more on <a href="http://searchengineland.com/what-where-when-travel-local-search-combine-goby-com-26395">Search Engine Land</a> or <a href="http://news.cnet.com/8301-27076_3-10359329-248.html">CNET</a>.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/09/23/goby-goes-deep/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/09/23/goby-goes-deep/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Transparent Text Symposium: Day 2</title>
		<link>http://thenoisychannel.com/2009/09/23/transparent-text-symposium-day-2/</link>
		<comments>http://thenoisychannel.com/2009/09/23/transparent-text-symposium-day-2/#comments</comments>
		<pubDate>Wed, 23 Sep 2009 04:32:29 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2590</guid>
		<description><![CDATA[Given how intense yesterday was at the Transparent Text symposium, I couldn&#8217;t imagine that today would match it. But it did! The morning kicked off with a series of 18 lightning talks in 90 minutes&#8211;that was 5 minutes apiece, with a ruthless gong for anyone who went overtime. The presentations were consistently intense, and I [...]]]></description>
			<content:encoded><![CDATA[<p>Given how intense <a href="http://thenoisychannel.com/2009/09/21/transparent-text-symposium-day-1/">yesterday</a> was at the <a href="http://www.research.ibm.com/social/transparent_text/">Transparent Text</a> symposium, I couldn&#8217;t imagine that today would match it. But it did!</p>
<p>The morning kicked off with a series of 18 lightning talks in 90 minutes&#8211;that was 5 minutes apiece, with a ruthless <a href="http://twitpic.com/ip345">gong</a> for anyone who went overtime. The presentations were consistently intense, and I had the misfortune to follow one of the best talks&#8211;a very passionate presentation about crowd-sourced translation by IBM&#8217;s Uyi Stewart. Other notable presenters included design ninja <a href="http://www.alexislloyd.com/">Alexis Lloyd</a> from the New York Times R&amp;D Lab, Karrie Karahalios from the University of Illinois talking about the experimental <a href="http://wemeddle.com/">WeMeddle</a> Twitter client,  <span id="msgtxt4172392610">MIT Media Lab professor and Berkman Fellow <a href="http://smg.media.mit.edu/people/Judith/">Judith Donath</a> showing a stunning gallery of &#8220;data portraits&#8221;, and Dragon Systems co-founder <a href="http://en.wikipedia.org/wiki/Dragon_NaturallySpeaking#History">Janet Baker</a> explaining how the brain recognizes speech&#8211;with a skull as a prop! The session was incredible, and I hope other conferences adopt this model.</span></p>
<p><span>After the coffee break, there was a session on Text Analysis in the Large, featuring </span><span><a href="http://www.almaden.ibm.com/cs/people/dgruhl/">Dan Gruhl</a> (IBM), </span><span><a href="http://gking.harvard.edu/">Gary King</a> (Harvard), and <a href="http://money.cnn.com/2009/06/26/technology/ibm_jeopardy_watson_computer/?postversion=2009062616">David Ferrucci</a> (IBM). Dan Gruhl talked about web-scale text analysis&#8211;a topic up his alley, considering his role in architecting the IBM <a href="http://en.wikipedia.org/wiki/IBM_WebFountain">WebFountain</a> project. Gary King gave a fascinating talk about using</span><span id="msgtxt4174004598"> ensemble methods to improve on existing clustering methods&#8211;the idea is to synthesize a collection of derived clusterings and place them in an explorable metric space. You can read the full paper <a href="http://gking.harvard.edu/files/discov.pdf">here</a>. But the winner for this session was definitely David Ferrucci, who described the work IBM Research is doing to develop a <a href="http://thenoisychannel.com/2009/04/27/who-wants-to-play-jeopardy/">machine Jeopardy player</a>. He spent much of the talk building a case for the difficulty of the problem&#8211;and then delivered the </span><span id="msgtxt4175579219">punchline: In less than three years of research, they&#8217;ve developed a machine player whose performance is comparable to that of Jeopardy winners. Hopefully they&#8217;ll be competing on live television by next year!<br />
</span></p>
<p>After lunch, there was a session on Investigation, featuring <a href="http://maplight.org/">MAPLight</a> Research Director <a name="Emily_Calhoun" href="http://maplight.org/staff">Emily Calhoun</a>, UC Berkeley law professor <a name="Kevin_Quinn" href="http://www.law.berkeley.edu/kevinmquinn.htm">Kevin Quinn</a>, and <span id="msgtxt4177751966">Guardian news editor <a href="http://www.guardian.co.uk/profile/simonrogers">Simon Rogers</a>. </span>Emily Calhoun showed how MAPLight illuminates the connections between money and politics&#8211;it was great seeing <span id="msgtxt4295316262">data to correlate who supports and opposes bills with the associated campaign </span><span id="msgtxt4295316262">contributions from</span><span id="msgtxt4295316262"> interest groups. Kevin Quinn&#8217;s presentation was a bit more technical, but his <a href="http://www.law.berkeley.edu/5957.htm">work</a> reminds me a lot of Miles Efron&#8217;s work on <a href="http://people.lis.illinois.edu/~mefron/papers/efron-libmedia.pdf">estimating political orientation in web documents</a>&#8211;but Quinn&#8217;s work is more general and goes beyond co-citation analysis to analyze the actual language of the documents. Great application of topic modeling! But my favorite presentation in this session was the one from Simon Rogers: he told the story of how the Guardian successfully crowd-sourced a project to <a href="http://mps-expenses.guardian.co.uk/">investigate the expenses of UK Parliament members</a>.</span></p>
<p><span>The final session was a panel discussion about how visualization might elevate or advance the debate over health care policy. The panelists were </span><span id="msgtxt4297562492"><a href="http://benfry.com/">Ben Fry</a>, <a href="http://people.ischool.berkeley.edu/~hearst/">Marti Hearst</a>, <a href="http://gking.harvard.edu/">Gary King</a>, and </span><span id="msgtxt4177751966"><a href="http://www.guardian.co.uk/profile/simonrogers">Simon Rogers</a></span><span id="msgtxt4297562492">; <a href="http://fernandaviegas.com/">Fernanda Vi</a></span><a href="http://fernandaviegas.com/">é</a><span id="msgtxt4297562492"><a href="http://fernandaviegas.com/">gas</a> and <a href="http://www.bewitched.com/">Martin Wattenberg</a> moderated. Unfortunately, the overwhelming sentiment from the panel was pessimism that anything we could do might actually lead to improved outcomes. Nonetheless, it&#8217;s clear that a lot of people are going to try.</span></p>
<p><span>Again, I want to thank Fernanda, Martin, <a href="http://domino.watson.ibm.com/cambridge/research.nsf/pages/irene_greif.html">Irene Greif</a>, and everyone at IBM for organizing this fantastic event&#8211;and for inviting me to attend! I am impressed that anyone could manage to assemble such an impressive set of speakers in one place, and I appreciate the effort that everyone put into making the past two days so worthwhile. I look forward to seeing the videos available online, and I hope those who weren&#8217;t able to attend take the opportunity to watch some of them. I also encourage you to check out the live Twitter stream at <a href="http://search.twitter.com/search?q=%23tt09">#tt09</a> while it&#8217;s still available.<br />
</span></p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/09/23/transparent-text-symposium-day-2/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/09/23/transparent-text-symposium-day-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Transparent Text Symposium: Day 1</title>
		<link>http://thenoisychannel.com/2009/09/21/transparent-text-symposium-day-1/</link>
		<comments>http://thenoisychannel.com/2009/09/21/transparent-text-symposium-day-1/#comments</comments>
		<pubDate>Tue, 22 Sep 2009 03:37:25 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2584</guid>
		<description><![CDATA[Wow, what an intense day at the Transparent Text symposium! I won&#8217;t try to give detailed summaries of the talks&#8211;videos will be posted after the conference, and you can get a pretty good picture from the live tweet stream at #tt09. Instead, I&#8217;ll try to capture my personal highlights and reactions. I&#8217;ll start with Deputy [...]]]></description>
			<content:encoded><![CDATA[<p>Wow, what an intense day at the <a href="http://www.research.ibm.com/social/transparent_text/">Transparent Text</a> symposium! I won&#8217;t try to give detailed summaries of the talks&#8211;videos will be posted after the conference, and you can get a pretty good picture from the live tweet stream at <a href="http://search.twitter.com/search?q=%23tt09">#tt09</a>. Instead, I&#8217;ll try to capture my personal highlights and reactions.</p>
<p>I&#8217;ll start with Deputy U.S. CTO <a href="http://www.nyls.edu/faculty/faculty_profiles/beth_simone_noveck">Beth Noveck</a>&#8217;s keynote about the <a href="http://www.whitehouse.gov/open/">Open Government Initiative</a>. First, the very existence of such an initiative is incredible, given the culture of secrecy traditionally associated with Washington. Second, I like the top priority of releasing raw data so that other people can work on analyzing it, visualizing it, and generally making it more accessible either to the general public or to particular interest groups. This is very much what I had in mind in January when I posted &#8220;<a href="http://thenoisychannel.com/2009/01/20/information-sharing-we-can-believe-in/">Information Sharing We Can Believe In</a>&#8221; and I&#8217;m glad to see tangible progress. I was never a big fan of faith-based initiatives. <img src='http://thenoisychannel.com/wordpress/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>The next session was a group of talks about watchdogs and accountability&#8211;people looking at how to ensure government transparency from the outside. New York Times editor <a href="http://topics.nytimes.com/topics/reference/timestopics/people/p/aron_pilhofer/index.html">Aron Pilhofer</a> and software developer <a href="http://ashkenas.com/">Jeremy Ashkenas</a> talked about <a href="http://www.documentcloud.org/">DocumentCloud</a>, an ambitious project to enable <a href="http://en.wikipedia.org/wiki/Exploratory_search">exploratory search</a> for news documents on the open web. <a href="http://www.sunlightfoundation.com/">Sunlight Foundation</a> co-founder and executive director <a href="http://www.sunlightfoundation.com/people/emiller/">Ellen Miller</a> offered a particularly compelling example of the power of visualization: a graph correlating the campaign contributions and earmarks associated with a congressman under investigation. But my favorite presenter in this section was <a href="http://www.propublica.org/">ProPublica</a>&#8217;s <a href="http://www.propublica.org/site/author/amanda_michel">Amanda Michel</a>, whose thoughts about a &#8220;human test of transparency&#8221; are worth a talk in themselves. For now, I recommend you look at the two projects she discussed: <a href="http://projects.propublica.org/spotcheck/">Stimulus Spot Check</a> and <a href="http://www.huffingtonpost.com/off-the-bus-reporter/a-new-era-begins_b_141197.html">Off the Bus</a>.</p>
<p>After lunch, we shifted gears from government transparency to more of a focus on text. The first of the two afternoon sessions was entitled &#8220;Analyzing the Written Record&#8221; and featured <a href="http://matthew.gray.org/">Matthew Gray</a> from <a href="http://books.google.com/">Google Books</a>, <a href="http://www.opencalais.com/users/tom">Tom Tague</a> from <a href="http://www.opencalais.com/">Open Calais</a> (a free text annotation service that almost all of the previous speakers raved about), and <a href="http://ethanzuckerman.com/">Ethan Zuckerman</a> from Harvard&#8217;s <a href="http://cyber.law.harvard.edu/">Berkman Center</a>. All of the talks were solid, but Ethan&#8217;s was outstanding. I <a href="http://thenoisychannel.com/2009/03/11/media-cloud-watch-analyze-learn/">blogged</a> about his <a href="http://www.mediacloud.org/">Media Cloud</a> project back in March, but it&#8217;s come a long way in the past six months and is doing something I&#8217;ve been waiting years to see someone do: comparing how different news organizations select and cover news.</p>
<p>The final session was about visualization.  <a href="http://davidsmall.com/">David Small</a> offered a presentation about literally transparent text that was, in the words of <a href="http://twitter.com/nrchtct/status/4154937460"><span>Marian Dörk</span></a>, &#8220;<span id="msgtxt4154937460">refreshingly non-utilitarian and visually stimulating&#8221;. <a href="http://benfry.com/">Ben Fry</a> showed the power of visualizing changes in a document over time&#8211;specifically, a project called &#8220;<a href="http://www.benfry.com/traces/">the preservation of favoured traces</a>&#8221; that illustrates  the evolution of Darwin&#8217;s <a href="http://en.wikipedia.org/wiki/On_the_Origin_of_Species"><em>On the Origin of Species</em></a>. But, as expected, IBM&#8217;s <a href="http://manyeyes.alphaworks.ibm.com/manyeyes/">Many Eyes</a> researchers </span><a href="http://fernandaviegas.com/">Fernanda Viégas</a> and <a href="http://www.bewitched.com/">Martin Wattenberg</a> stole the show with an incredibly informative and entertaining presentation about the visualization of repetition in text. No summary can do it justice, so I urge you to watch the video when it is available.</p>
<p>After all that, we enjoyed a nice reception at the <a href="http://www.research.ibm.com/social/"> IBM Center for Social Software</a>. I&#8217;m incredibly grateful to IBM for organizing and sponsoring this event, and to Martin Wattenberg for being so kind as to invite me. I&#8217;ll try to earn my keep in my 5 minutes at the &#8220;Ignite-style&#8221; session tomorrow morning.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/09/21/transparent-text-symposium-day-1/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/09/21/transparent-text-symposium-day-1/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Live Tweeting from Transparent Text Symposium</title>
		<link>http://thenoisychannel.com/2009/09/21/live-tweeting-from-transparent-text-symposium/</link>
		<comments>http://thenoisychannel.com/2009/09/21/live-tweeting-from-transparent-text-symposium/#comments</comments>
		<pubDate>Mon, 21 Sep 2009 15:43:21 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2580</guid>
		<description><![CDATA[As promised, I&#8217;ll blog about the two-day Transparent Text symposium when it&#8217;s over and I have a chance to collect and express my thoughts. But for now you can follow the live Twitter stream at #tt09.]]></description>
			<content:encoded><![CDATA[<p>As promised, I&#8217;ll blog about the two-day <a href="http://www.research.ibm.com/social/transparent_text/">Transparent Text</a> symposium when it&#8217;s over and I have a chance to collect and express my thoughts. But for now you can follow the live Twitter stream at <a href="http://search.twitter.com/search?q=%23tt09">#tt09</a>.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/09/21/live-tweeting-from-transparent-text-symposium/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/09/21/live-tweeting-from-transparent-text-symposium/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Transparent Text Symposium</title>
		<link>http://thenoisychannel.com/2009/09/19/transparent-text-symposium/</link>
		<comments>http://thenoisychannel.com/2009/09/19/transparent-text-symposium/#comments</comments>
		<pubDate>Sat, 19 Sep 2009 12:22:20 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2568</guid>
		<description><![CDATA[One of the unexpected benefits of accepting an invitation to speak at SIGMOD 2009 was an invitation from fellow participant Martin Wattenberg to attend the upcoming Transparent Text symposium at the IBM Center for Social Software: The Transparent Text symposium is a free event that will focus on ways to make large collections of documents [...]]]></description>
			<content:encoded><![CDATA[<p>One of the unexpected benefits of accepting an invitation to <a href="http://thenoisychannel.com/2009/07/02/the-wild-world-of-sigmod/">speak at SIGMOD 2009</a> was an invitation from fellow participant <a href="http://www.bewitched.com/">Martin Wattenberg</a> to attend the upcoming <a href="http://www.research.ibm.com/social/transparent_text/">Transparent Text</a> symposium at the <a href="http://www.research.ibm.com/social/">IBM Center for Social Software</a>:</p>
<blockquote><p>The Transparent Text symposium is  			a free event that will focus on ways to make large collections of  			documents understandable to laypeople and experts alike. We are  			interested in approaches that shed light on unstructured text,  			ranging from novel statistical techniques to web-based crowdsourcing.</p></blockquote>
<p>The <a href="http://www.research.ibm.com/social/transparent_text/participants.html">speaker list</a> is impressive, ranging from familiar (at least to me) interface experts <a href="http://benfry.com/">Ben Fry</a> and <a href="http://people.ischool.berkeley.edu/%7Ehearst/">Marti Hearst</a> to social scientist <a href="http://gking.harvard.edu/">Gary King</a> and <a href="http://www.sunlightfoundation.com/">Sunlight Foundation</a> Executive Director <a href="http://www.sunlightfoundation.com/people/emiller/">Ellen Miller</a>. IBM also contributed some of its own researchers to the program, including <a href="http://money.cnn.com/2009/06/26/technology/ibm_jeopardy_watson_computer/?postversion=2009062616">David Ferrucci</a>, who has been leading the <a href="http://thenoisychannel.com/2009/04/27/who-wants-to-play-jeopardy/">Jeopardy</a> project. There&#8217;s even an &#8220;<span>Ignite-style&#8221; session where all attendees will have the opportunity to give five-minute presentations.</span></p>
<p><span>I&#8217;m looking forward to the eclectic mix of speakers and attendees. As <a href="http://www.cdixon.org/?p=989">Chris Dixon</a> recently reminded us, it&#8217;s important to introduce some randomization into our intellectual diets so that we don&#8217;t get stuck in a rut of local optimization. While an event with a theme of transparency and interacting with textual information is hardly a detour for me, I am excited about the opportunity to hear a diversity of new perspectives on this topic. There will be videos of the speakers posted after the event, as well as a  Twitter stream at</span><span id="msgtxt4105774644"> <a href="http://search.twitter.com/search?q=%23tt09">#tt09</a>.<br />
</span></p>
<p><span>Of course, I&#8217;ll blog about what I learn and recycle it in the discussion activities at the <a href="http://cuaslis.org/hcir2009/">HCIR workshop</a> next month.<br />
</span></p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/09/19/transparent-text-symposium/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Blogs I Read: The Haystack Blog</title>
		<link>http://thenoisychannel.com/2009/09/17/blogs-i-read-the-haystack-blog/</link>
		<comments>http://thenoisychannel.com/2009/09/17/blogs-i-read-the-haystack-blog/#comments</comments>
		<pubDate>Thu, 17 Sep 2009 13:19:28 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2564</guid>
		<description><![CDATA[It&#8217;s been quite the week in tech business news, with Adobe acquiring Omniture, Google acquiring reCAPTCHA and being rumored (falsely) to acquire Brightcove, Facebook announcing that it has over 300M users and is cash-flow positive, and Twitter closing a new round of funding at a $1B valuation. Recession? What recession? But sometimes I like to [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s been quite the week in tech business news, with <a href="http://www.adobe.com/aboutadobe/invrelations/adobeandomniture.html">Adobe acquiring Omniture</a>, <a href="http://googleblog.blogspot.com/2009/09/teaching-computers-to-read-google.html">Google acquiring reCAPTCHA</a> and being <a href="http://www.businessinsider.com/google-to-buy-brightcove-2009-9">rumored</a> (falsely) to acquire Brightcove, Facebook announcing that it has <a href="http://blog.facebook.com/blog.php?post=136782277130">over 300M users and is cash-flow positive</a>, and Twitter closing a new round of funding at a <a href="http://www.techcrunch.com/2009/09/16/twitter-closing-new-venture-round-with-1-billion-valuation/">$1B valuation</a>. Recession? What recession?</p>
<p>But sometimes I like to get away from all that and turn back to my roots inside the ivory tower. And that leads me to one of my favorite university blogs: the <a href="http://groups.csail.mit.edu/haystack/blog/">Haystack Blog</a>.</p>
<p>The Haystack Blog is published by faculty and grad students in the <a href="http://www.csail.mit.edu/">MIT Computer Science and AI Lab (CSAIL)</a>&#8211;specifically those in the <a href="http://groups.csail.mit.edu/haystack/">Haystack</a> group. Principal Investigator (and occasional <a href="http://stellar.mit.edu/S/pe/2009q5/0304.1/index.html">dance instructor</a>) <a href="http://people.csail.mit.edu/karger/">David Karger</a> is its most prolific blogger&#8211;you might have read some of his <a href="http://groups.csail.mit.edu/haystack/blog/?s=sigir09">SIGIR 2009 posts</a> or his debate with <a href="http://www.betaversion.org/~stefano/linotype/">Stefano Mazzocchi</a> about <a href="http://groups.csail.mit.edu/haystack/blog/?s=Stefano+RDF">how to properly use RDF</a>. But other people&#8217;s posts are just as interesting&#8211;check out the most recent post by <a href="http://www.mit.edu/~ebakke/">Eirik Bakke</a> about <a href="http://groups.csail.mit.edu/haystack/blog/2009/09/16/spreadsheets-vs-relational-databases-bridging-the-gap/">bridging the gap between spreadsheets and relational databases</a>.</p>
<p>I wish that more universities and departments would encourage their faculty and students to blog. As <a href="http://www.daniel-lemire.com/blog/">Daniel Lemire</a> has pointed out, it&#8217;s a great way for academic researchers to get their ideas out and build up their reputations and networks. He should know&#8211;he leads by example. Likewise, Haystack is setting a great example for university blogs, and is a credit to MIT and CSAIL.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/09/17/blogs-i-read-the-haystack-blog/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>The Ethics of Blogging</title>
		<link>http://thenoisychannel.com/2009/09/10/the-ethics-of-blogging/</link>
		<comments>http://thenoisychannel.com/2009/09/10/the-ethics-of-blogging/#comments</comments>
		<pubDate>Thu, 10 Sep 2009 15:53:42 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2527</guid>
		<description><![CDATA[A few people have commented that the events I advertise here tend to be expensive&#8211;or, worse, require a lot of work to get into! So I&#8217;m glad to announce a freebie that I hope will be as much fun for me as for attendees. I&#8217;ve been invited to participate in a webinar on the ethics [...]]]></description>
			<content:encoded><![CDATA[<p>A few people have commented that the events I advertise here tend to be expensive&#8211;or, worse, require a lot of work to get into! So I&#8217;m glad to announce a freebie that I hope will be as much fun for me as for attendees.</p>
<p>I&#8217;ve been invited to participate in a <a href="http://www.socialmediatoday.com/submitform/smtwebinar092409/?reference=smt_dtunkelang">webinar on the ethics of blogging</a> that will take place Thursday, September 24th at 1 PM EST. It&#8217;s free to attend; just register online <a href="http://www.socialmediatoday.com/submitform/smtwebinar092409/?reference=smt_dtunkelang">here</a>.</p>
<p><a href="http://socialmediagroup.com/about/">Maggie Fox</a>, founder and CEO of <a href="http://socialmediagroup.com/">Social Media Group</a>, will moderate. My two co-panelists are <a href="https://twitter.com/augieray">Augie Ray</a>, who blogs at <a href="http://www.experiencetheblog.com/">Experience: The Blog</a> and is Managing Director of Experiential Marketing at interactive and social media agency <a href="http://www.fullhouseinteractive.com/">Fullhouse</a>, and <a href="http://johnjantsch.com/">John Jantsch</a>, who blogs at <a href="http://www.ducttapemarketing.com/blog/">Duct Tape Marketing</a> and is a marketing and digital technology coach.</p>
<p>Among the topics to be discussed:</p>
<ul>
<li>Transparency: How and when should a blogger reveal revenue sources?</li>
<li>Pay for play: Blog posts, tweets, and more as marketing tools</li>
<li>Online privacy</li>
<li>Astroturfing: Organizations creating artificial &#8220;grassroots&#8221; campaigns</li>
<li>Compliance and Legal: What should a corporate blog policy look like? What are a blogger&#8217;s legal obligations?</li>
</ul>
<p>I hope some of you will be able to attend! Regardless, please use the comment thread here to suggest topics you&#8217;d like me to cover or concerns you&#8217;d like to see me address. I know that a lot of you have thought hard about these issues, and I&#8217;d like to ethically exploit your collective wisdom.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/09/10/the-ethics-of-blogging/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Another Project to Measure Twitter Influence</title>
		<link>http://thenoisychannel.com/2009/09/03/another-project-to-measure-twitter-influence/</link>
		<comments>http://thenoisychannel.com/2009/09/03/another-project-to-measure-twitter-influence/#comments</comments>
		<pubDate>Thu, 03 Sep 2009 21:23:04 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2510</guid>
		<description><![CDATA[Just noticed that the Web Ecology Project has published &#8220;The Influentials: New Approaches for Analyzing Influence on Twitter&#8221;. The blog post includes a link to their full report. Their approach strikes me as a generalization of measuring retweets, but perhaps I&#8217;m giving it too cursory a read. I did compare their results to TunkRank: we [...]]]></description>
			<content:encoded><![CDATA[<p>Just noticed that the Web Ecology Project has published &#8220;<a href="http://www.webecologyproject.org/2009/09/analyzing-influence-on-twitter/">The Influentials: New Approaches for Analyzing Influence on Twitter</a>&#8221;. The blog post includes a link to their <a href="http://www.webecologyproject.org/wp-content/uploads/2009/09/influence-report-final.pdf">full report</a>.</p>
<p>Their approach strikes me as a generalization of measuring retweets, but perhaps I&#8217;m giving it too cursory a read. I did compare their results to <a href="http://tunkrank.com/">TunkRank</a>: we at least agree that <a href="http://tunkrank.com/score/mashable">mashable</a> is more influential than <a href="http://tunkrank.com/score/cnn">CNN</a>&#8211;though even as simple a measure as follower count would confirm that judgment.</p>
<p>Anyway, I am delighted to see serious researchers looking at this problem. I&#8217;m still hoping to investigate hypotheses regarding TunkRank and friend:follower ratios.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/09/03/another-project-to-measure-twitter-influence/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Finding, Locating, Discovering</title>
		<link>http://thenoisychannel.com/2009/08/31/finding-locating-discovering/</link>
		<comments>http://thenoisychannel.com/2009/08/31/finding-locating-discovering/#comments</comments>
		<pubDate>Mon, 31 Aug 2009 16:10:55 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2503</guid>
		<description><![CDATA[Thanks to Tony Hollingsworth for alerting me to a post by Alex Campbell entitled &#8220;Stark realisation: I no longer depend on Google to find stuff&#8221;. The title is provocative link bait, but the take-away is very down to earth: Google is more useful for locating information than for discovering it. Library scientists make a distinction [...]]]></description>
			<content:encoded><![CDATA[<p>Thanks to <a href="http://twitter.com/hollingsworth/statuses/3661598935">Tony Hollingsworth</a> for alerting me to a post by Alex Campbell entitled &#8220;<a href="http://www.alexjcampbell.com/post/175271559/stark-realisation-i-no-longer-depend-on-google-to-find">Stark realisation: I no longer depend on Google to find stuff</a>&#8221;. The title is provocative link bait, but the take-away is very down to earth: Google is more useful for locating information than for discovering it.</p>
<p>Library scientists make a distinction between <a href="http://www.db.dk/bh/core%20concepts%20in%20lis/articles%20a-z/known_item_search.htm">known-item</a> and <a href="http://en.wikipedia.org/wiki/Exploratory_search">exploratory</a> search. The former is about locating information: as an information seeker, you know the information exists, and you can even characterize it unambiguously; the challenge is to convert that description into a location that allows you to retrieve the information. The latter is about discovery: you don&#8217;t know that the information you seek exists, and you may not be sure how to characterize what you are looking for&#8211;or even know exactly what you want until you&#8217;ve learned something about what is available.</p>
<p>These are extreme points on the information seeking spectrum, and most real-world tasks are in the middle, or combine subtasks of both types. For example, in physical libraries (yes, I&#8217;m that old!), I remember finding a book in the stacks and then browsing the nearby books in the hopes of serendipitous discovery. These days, I&#8217;d be more likely to scan its bibliography&#8211;or to look at the books and articles citing it. A known item can be an excellent entry point for exploration. Conversely, exploration can lead you to discover the existence of information that you then simply need to retrieve.</p>
<p>In common use, words like searching and finding cover this entire spectrum of information seeking activity. This breadth of meaning causes a lot of confusion. I&#8217;ve blogged about this before, in &#8220;<a href="http://thenoisychannel.com/2008/12/02/what-is-not-search/">What is (Not) Search?</a>&#8221;:</p>
<blockquote><p>At the very least, I propose that we distinguish “search” as a problem from “search” as a solution. By the former, I mean the problem of <a href="http://en.wikipedia.org/wiki/Information_seeking">information seeking</a>, which is traditionally the domain of <a href="http://en.wikipedia.org/wiki/Library_science">library</a> and <a href="http://en.wikipedia.org/wiki/Information_science">information</a> scientists. By the latter, I mean the approach most commonly associated with <a href="http://en.wikipedia.org/wiki/Information_retrieval">information retrieval</a>, in which a user enters a query into the system (typically as free text) and the system returns a set of objects that match the query, perhaps with different degrees of relevancy.</p></blockquote>
<p>Back to Campbell&#8217;s article. His main points:</p>
<ul>
<li>Social networks have dramatically expanded our network of contacts.</li>
<li>Search engine optimization (SEO) experts have killed their own game.</li>
<li>The flow of information has changed: information now comes to us, rather than us having to go out and find it.</li>
</ul>
<p>I like the spirit of the post, but I think he overstates his case. <a href="http://en.wikipedia.org/wiki/Search_engine_optimization">SEO</a> isn&#8217;t all bad&#8211;in fact, it&#8217;s probably a key factor in Google&#8217;s effectiveness. And, while social networks enable social search in theory, and information does come to us, we are experiencing <a href="http://web2expo.blip.tv/file/1277460/">filter failure</a> (Clay Shirky&#8217;s term) in a big way.</p>
<p>My conclusion: I agree with him about Google&#8217;s limitations&#8211;Google is primarily a locating tool, not a discovery tool. Unfortunately, I&#8217;m not persuaded that social networks and our theoretical ability to construct an ideal in-flow of information have actually delivered on the promise of more efficient information access. But I&#8217;m optimistic that we&#8217;ll eventually get there.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/08/31/finding-locating-discovering/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Free as in Freebase</title>
		<link>http://thenoisychannel.com/2009/08/29/free-as-in-freebase/</link>
		<comments>http://thenoisychannel.com/2009/08/29/free-as-in-freebase/#comments</comments>
		<pubDate>Sat, 29 Aug 2009 18:25:49 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2488</guid>
		<description><![CDATA[It&#8217;s been a while since I&#8217;ve blogged about Freebase, the semantic web database maintained by Metaweb. But I recently had the chance to meet Freebasers Robert Cook and Jamie Taylor and hear them present to the New York Semantic Web Meetup on &#8220;Content, Identifiers and Freebase&#8221; (slides embedded above). It was a fun and informative [...]]]></description>
			<content:encoded><![CDATA[<div id="__ss_1921800" style="width: 425px; text-align: left;"><object style="margin:0px" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="355" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowScriptAccess" value="always" /><param name="src" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=nycmeetup-jt-aug09-090828181330-phpapp02&amp;stripped_title=nyc-semantic-web-meetup-aug-2009" /><param name="allowfullscreen" value="true" /><embed style="margin:0px" type="application/x-shockwave-flash" width="425" height="355" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=nycmeetup-jt-aug09-090828181330-phpapp02&amp;stripped_title=nyc-semantic-web-meetup-aug-2009" allowscriptaccess="always" allowfullscreen="true"></embed></object></div>
<div style="width: 425px; text-align: left;"></div>
<div id="__ss_1921800" style="width: 425px; text-align: left;">
<p>
It&#8217;s been a while since I&#8217;ve blogged about <a href="http://www.freebase.com/">Freebase</a>, the <a href="http://en.wikipedia.org/wiki/Semantic_Web">semantic web</a> database maintained by <a href="http://www.metaweb.com/">Metaweb</a>. But I recently had the chance to meet Freebasers <a href="http://www.freebase.com/view/en/robert_cook">Robert Cook</a> and <a href="http://www.freebase.com/view/en/jamie_taylor">Jamie Taylor</a> and hear them present to the <a href="http://semweb.meetup.com/25/">New York Semantic Web Meetup</a> on &#8220;<a href="http://semweb.meetup.com/25/calendar/10966857/">Content, Identifiers and Freebase</a>&#8221; (slides embedded above).</p>
<p>It was a fun and informative presentation. Perhaps the most surprising revelation about Freebase was that all of their data fits in RAM on a 32G box (yes, some of you caught me <a href="http://twitter.com/dtunkelang/status/3590696944">live-tweeting</a> that during the presentation). Their biggest challenge is collecting good data that lends itself to the <a href="http://blog.freebase.com/2008/05/13/new-api-service-reconciliation/">reconciliation</a> needed to make Freebase useful as a data repository. Despite the lack of a near-term revenue model, the Freebasers are bullish about their approach: strong identifiers, strong semantics, open data. On the last point, almost all of Freebase is available under the <a href="http://creativecommons.org/licenses/by/3.0/">Creative Commons Attribution License (CC-BY)</a>&#8211;which, as far as I can tell, makes anyone free to develop a mirror of Freebase. Indeed, many people are using this data, including <a href="http://newstimeline.googlelabs.com/">Google</a> and <a href="http://blog.freebase.com/2009/07/13/bing-structured-search-results-powered-by-freebase/">Bing</a>.</p>
<p>You might wonder whether Freebase is a business or a non-profit foundation&#8211;and the question did come up. The answer is that Freebase eventually expects to make money by providing services, e.g., helping advertisers. They see their <a href="http://en.wikipedia.org/wiki/Triplestore">graph store</a> as a competitive advantage&#8211;but they freely admit that this advantage will erode over time. Indeed, the surprisingly small size of their graph makes me wonder how much speed and scalability matter, compared to the challenge of data scarcity.</p>
<p>I&#8217;d like to see Freebase succeed. I&#8217;m particularly a fan of the work <a href="http://davidhuynh.net/">David Huynh</a> has done there on interfaces for semantic web browsing. Clearly their investors are true believers&#8211;Metaweb has raised a <a href="http://www.crunchbase.com/company/metawebtechnologies">total of $57M in funding</a>. I don&#8217;t quite get it, but I&#8217;m happy we can all benefit from the results.</div>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/08/29/free-as-in-freebase/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Social Networking: Theory and Practice</title>
		<link>http://thenoisychannel.com/2009/08/25/social-networking-theory-and-practice/</link>
		<comments>http://thenoisychannel.com/2009/08/25/social-networking-theory-and-practice/#comments</comments>
		<pubDate>Tue, 25 Aug 2009 13:45:49 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2483</guid>
		<description><![CDATA[I&#8217;ve been a student of social network theory for years, enjoying the work of Duncan Watts, Albert-László Barabási, Jon Kleinberg, and a number of other researchers investigating this field. It should be no surprise that a topic that is so core to our humanity has attracted attention from some of our best and brightest. And [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been a student of social network theory for years, enjoying the work of <a href="http://en.wikipedia.org/wiki/Duncan_J._Watts">Duncan Watts</a>, <a href="http://www.nd.edu/~alb/">Albert-László Barabási</a>, <a href="http://www.cs.cornell.edu/home/kleinber/">Jon Kleinberg</a>, and a number of other researchers investigating this field. It should be no surprise that a topic that is so core to our humanity has attracted attention from some of our best and brightest.</p>
<p>And I&#8217;ve dabbled a bit on the theoretical side myself. The <a href="http://thenoisychannel.com/2009/01/13/a-twitter-analog-to-pagerank/">TunkRank</a> measure (I&#8217;m indebted to <a href="http://twitter.com/ealdent">Jason Adams</a> for <a href="http://tunkrank.com/">implementing it on a live site</a>!) attempts to take the most basic assumption about our social behavior&#8211;the constraint that we have a finite attention budget&#8211;and explore its implications for influence over social networks. I have a few unexplored hypotheses queued up for when I can find the spare time to try to validate them empirically!</p>
<p>But why settle for theory? We live in an age where social networks compete with web search (and perhaps complement search) as the hottest online technologies. If we&#8217;re not reading about Google vs. Bing, we&#8217;re reading about Facebook vs. Twitter, with LinkedIn offering a third way that seems to co-exist with its more storied peers. In this post, I&#8217;d like to focus on LinkedIn.</p>
<p>LinkedIn, despite its feature creep, is still fairly old-school: its raison d&#8217;être is for users to build, maintain, and exploit their professional networks. In theory, connections on LinkedIn represent present or past working relationships that become the basis for referrals&#8211;whether the goal is employment, sales, or partnership. LinkedIn is not the only professionally oriented social network, but at this point it&#8217;s certainly the dominant one.</p>
<p>But I&#8217;ve found at least two additional ways to use LinkedIn that I&#8217;d like to share:</p>
<p><strong>Intelligence gathering</strong>. For reasons I don&#8217;t yet claim to understand, people share far more information about themselves&#8211;and in a much cleaner, structured form&#8211;on LinkedIn than in perhaps any other online medium. Most people&#8217;s resumes are not available online, but their LinkedIn profiles are tantamount to resumes. Moreover, their structured format makes it possible for LinkedIn to assemble aggregate profiles of companies, revealing composite pictures that must drive some of those companies&#8217; legal and HR departments batty! At a higher level, LinkedIn also works well as a discovery tool&#8211;much more so now that they&#8217;ve enabled faceted search. It&#8217;s still a bit tricky to explore people and companies by topic, but far more effective using LinkedIn than using any other tool I&#8217;m aware of.</p>
<p><strong>Meeting new people</strong>. Cold-calling, spamming&#8211;pick your poison. In short, LinkedIn doesn&#8217;t have to only be about connecting with people you already know. But there&#8217;s an art to sending unsolicited messages: you have to pass the moral equivalent of a <a href="http://en.wikipedia.org/wiki/CAPTCHA">CAPTCHA</a> by proving that your communication strategy isn&#8217;t indiscriminate. Let me use a personal example (that Maisha Walker was nice enough to write up in her <a href="http://blog.inc.com/e-commerce/2009/08/linkedin_small_business_success.html">Inc. magazine column</a>). I decided that I wanted to find everyone on LinkedIn who might be interested in <a href="http://cuaslis.org/hcir2009/">HCIR &#8217;09</a>. So I searched for everyone whose profiles indicated interests in both <a href="http://en.wikipedia.org/wiki/Information_retrieval">IR</a> and <a href="http://en.wikipedia.org/wiki/Human%E2%80%93computer_interaction">HCI</a> and sent out a targeted message (in fact, an invite with a personalized message&#8211;a feature I recently <a href="http://thenoisychannel.com/2009/08/18/linkedin-no-longer-allowing-invite-messages/">feared they&#8217;d killed</a>). The results were overwhelmingly positive. I&#8217;m not sure how many of the people I contacted will attend, but I raised awareness without inflicting annoyance. Better yet, one of the people I contacted then discovered I <a href="http://thenoisychannel.com/2009/04/17/booking-it-to-the-finish-line/">was looking for volunteers</a> to review the draft of my <a href="http://thenoisychannel.com/faceted-search-the-book/">book</a>&#8211;and I thus obtained hours of help from someone who, just a day before, had never heard of me!</p>
<p>What intrigues me about LinkedIn (and other social networks) is the extent to which I am exploiting attention market inefficiencies (as LinkedIn may be doing as well). For example, LinkedIn makes it easy to send unsolicited invitations to anyone. Granted, you can lose this privilege if even a couple of people respond to invitations with &#8220;I don&#8217;t know this person&#8221;. There&#8217;s also the question of why people&#8217;s social norms around disclosure are so different on LinkedIn than anywhere else&#8211;people not only post the content of their resumes, but go through the effort of providing it to LinkedIn in a structured form! Meanwhile, LinkedIn keeps tightfisted control over the information it aggregates&#8211;understandably, they recognize that this content is their most valuable asset.</p>
<p>People are still getting used to the idea of social networks. It will be interesting to see how their use evolves, particularly in terms of information and attention market efficiency.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/08/25/social-networking-theory-and-practice/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>UIE Virtual Seminar on Faceted Search: A Great Experience!</title>
		<link>http://thenoisychannel.com/2009/08/20/uie-virtual-seminar-on-faceted-search-a-great-experience/</link>
		<comments>http://thenoisychannel.com/2009/08/20/uie-virtual-seminar-on-faceted-search-a-great-experience/#comments</comments>
		<pubDate>Fri, 21 Aug 2009 00:21:33 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2475</guid>
		<description><![CDATA[Pete Bell and I delivered the seminar today, and it was a blast! We had over 150 registered listeners&#8211;and I found out that at least one of those registrations corresponded to a roomful of 20 people at an online retailer that is a thought leader in web usability and design! Since we didn&#8217;t manage to [...]]]></description>
			<content:encoded><![CDATA[<p>Pete Bell and I delivered the <a href="http://www.uie.com/events/virtual_seminars/facets/">seminar</a> today, and it was a blast! We had over 150 registered listeners&#8211;and I found out that at least one of those registrations corresponded to a roomful of 20 people at an online retailer that is a thought leader in web usability and design!</p>
<p>Since we didn&#8217;t manage to get to all of the questions (over 40&#8211;possibly over 50 counting the activity on <a href="http://search.twitter.com/search?q=%23uievs">Twitter</a>!), we&#8217;re going to do a follow-up podcast that will be available even to people who didn&#8217;t attend the seminar. And, since even that might not be enough, I&#8217;m saving all of the questions as blog fodder.</p>
<p>To all who attended&#8211;and to Jared, Adam, and all the folks of UIE&#8211;thanks from me and Pete for giving us this great opportunity to connect with folks interested in faceted search and user experience.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/08/20/uie-virtual-seminar-on-faceted-search-a-great-experience/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>LinkedIn No Longer Allowing Invite Messages?</title>
		<link>http://thenoisychannel.com/2009/08/18/linkedin-no-longer-allowing-invite-messages/</link>
		<comments>http://thenoisychannel.com/2009/08/18/linkedin-no-longer-allowing-invite-messages/#comments</comments>
		<pubDate>Tue, 18 Aug 2009 19:30:13 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2461</guid>
		<description><![CDATA[I noticed recently that, when I sent out an invitation to connect to someone on LinkedIn, there wasn&#8217;t the usual slot for including a free-text note with the invitation. I thought it might be a glitch&#8211;and I even considered the possibility that this was only happening to my account because I&#8217;m a bit of a [...]]]></description>
			<content:encoded><![CDATA[<p>I noticed recently that, when I sent out an invitation to connect to someone on LinkedIn, there wasn&#8217;t the usual slot for including a free-text note with the invitation. I thought it might be a glitch&#8211;and I even considered the possibility that this was only happening to my account because I&#8217;m a bit of a networking junkie.</p>
<p>But I noticed <a href="http://twitter.com/Mr_Linkedin/statuses/3360784769">on Twitter today</a> that Mark Williams (aka <a href="http://twitter.com/Mr_Linkedin">@Mr_LinkedIn</a>) had noticed the same change and followed up on it with LinkedIn&#8217;s customer service department. I never assume any site behavior on a freely provided service is permanent, but it is starting to look like this is a deliberate decision and not a transient bug.</p>
<p>If so, it&#8217;s an annoying change, though I can see the merits. I&#8217;ve made heavy use of the connection message, especially when inviting someone I don&#8217;t know all that well&#8211;or don&#8217;t know at all. A personal message can be what distinguishes a welcome cold call from spam. But I&#8217;m guessing that others have abused that capability, filling it with spam or worse. Still, I feel like LinkedIn may be throwing the baby out with the bathwater. Will follow up if / when I hear more.</p>
<p><strong>UPDATE: Just saw this message on the <a href="http://linkedin.custhelp.com/cgi-bin/linkedin.cfg/php/enduser/std_adp.php?p_faqid=2162">LinkedIn</a> site via <a href="http://twitter.com/LinkedIn/status/3368903249">Twitter</a>:</strong></p>
<blockquote><p><strong>Unable to Personalize Invitation Message</strong></p>
<div id="questiontext">
<div id="desc">Why can&#8217;t I personalize the message in my Invitation?</div>
</div>
<p>We are aware of an issue preventing some members from customizing their Invitation messages. There is no need to contact Customer Service as our team is reviewing the issue to determine the best overall solution.</p>
<p>As a temporary workaround, the following message (with your name in the signature) is being sent when you click on the &#8216;Send Invitation&#8217; button: &#8216;I&#8217;d like to add you to my professional network on LinkedIn.&#8217;</p>
<p>As long as you approve of this message, you may continue to take advantage of this feature. If you prefer a more customized message to be sent, you may delay sending your Invitations until the functionality has been restored.</p></blockquote>
<p><strong>UPDATE #2: Looks like the problem is resolved.</strong></p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/08/18/linkedin-no-longer-allowing-invite-messages/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/08/18/linkedin-no-longer-allowing-invite-messages/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Why Does Google Hold Back On Faceted Search?</title>
		<link>http://thenoisychannel.com/2009/08/14/why-does-google-hold-back-on-faceted-search/</link>
		<comments>http://thenoisychannel.com/2009/08/14/why-does-google-hold-back-on-faceted-search/#comments</comments>
		<pubDate>Fri, 14 Aug 2009 21:24:00 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2446</guid>
		<description><![CDATA[Sometimes the response to a comment is worthy of an entire post, and this is one of those times. In response to my recent post about Able Grape, a wine search engine developed by Doug Cook (now Director of Twitter Search), Lee asked: Let&#8217;s say I know almost nothing about wines/digital cameras/cars and a search [...]]]></description>
			<content:encoded><![CDATA[<p>Sometimes the response to a comment is worthy of an entire post, and this is one of those times. In response to my recent <a href="http://thenoisychannel.com/2009/08/13/an-able-grape-at-the-helm-of-twitter-search/">post about Able Grape</a>, a wine search engine developed by Doug Cook (now Director of Twitter Search), <a href="http://thenoisychannel.com/2009/08/13/an-able-grape-at-the-helm-of-twitter-search/#comment-4174">Lee asked</a>:</p>
<blockquote><p>Let&#8217;s say I know almost nothing about wines/digital cameras/cars and a search site offers me &#8220;options&#8221; to drill down. However, I can&#8217;t use those effectively and eventually it comes down to availability and price for me. My questions are what are your thoughts on these kinds of situations and is there a scientific explanation/theory on this case?</p>
<p>This may be why Google does not endorse faceted search except for experimental projects.</p></blockquote>
<p>It&#8217;s a great question. There&#8217;s been a lot of research on how people make decisions when they have to manage trade-offs among multiple attributes. Thanks to the increasing interest in <a href="http://en.wikipedia.org/wiki/Behavioral_economics">behavioral economics</a> since <a href="http://en.wikipedia.org/wiki/Daniel_Kahneman">Daniel Kahneman</a> won the Nobel Prize in 2002, some of that research has even percolated into the mainstream through bestsellers like <a href="http://en.wikipedia.org/wiki/Freakonomics"><em>Freakonomics</em></a> and Dan Ariely&#8217;s <a href="http://www.predictablyirrational.com/"><em>Predictably Irrational</em></a>.</p>
<p>The short answer is that there&#8217;s no point in offering users options that they can&#8217;t (or won&#8217;t) use effectively. <a href="http://en.wikipedia.org/wiki/The_Paradox_of_Choice:_Why_More_Is_Less">Choice overload</a> is certainly a problem, and our reaction to it is to <a href="http://en.wikipedia.org/wiki/Satisficing">satisfice</a>, typically resorting to &#8220;<a href="http://fastandfrugal.com/">fast and frugal</a>&#8221; heuristics that throw out most of the potential decision criteria and instead focus on one or two attributes, e.g., price and availability.</p>
<p>But that&#8217;s no reason to dumb down the data we make available to decision makers. We make hard choices all the time, and fast and frugal can be horrendously suboptimal. We don&#8217;t hire employees based solely on their price and availability&#8211;or at least good employers don&#8217;t! For that matter, I don&#8217;t think most people pick wines that way, given that even Trader Joe&#8217;s has to diversify beyond &#8220;<a href="http://en.wikipedia.org/wiki/Charles_Shaw_wine">Two Buck Chuck</a>&#8221;. And, while there&#8217;s probably more of a market for cheap cameras and cars, I&#8217;m pretty sure you&#8217;re an extreme outlier if you completely ignore other criteria.</p>
<p>That said, there are some caveats about exposing options to users. <a href="http://en.wikipedia.org/wiki/Faceted_search">Faceted search</a> is hard, especially on the open web. Take it <a href="http://thenoisychannel.com/2008/11/18/faceted-search-for-the-web-a-grand-challenge/">from the folks at Microsoft Research</a>&#8211;but I&#8217;m sure Googlers would be the first to agree, especially given their experience with projects like <a href="http://thenoisychannel.com/2009/06/04/google-squared-a-great-first-step/">Google Squared</a> that, while promising, are nowhere near ready for prime time.</p>
<p>I appreciate that Google is conservative about embracing faceted search&#8211;and <a href="http://en.wikipedia.org/wiki/Human%E2%80%93computer_information_retrieval">HCIR</a> in general. I&#8217;m actually impressed by the steadily improving quality of their related terms for search queries&#8211;even if they do hide them behind two clicks (show options -&gt; related searches). Perhaps they&#8217;re <a href="http://thenoisychannel.com/2009/06/17/google-markets-itself/">feeling some pressure</a> from Bing. But I think they&#8217;re largely following the dictum of &#8220;if it ain&#8217;t broke, don&#8217;t fix it&#8221;. Google is an extremely successful company. And, as <a href="http://en.wikipedia.org/wiki/Clayton_M._Christensen">Clayton Christensen</a> argues, successful companies are great at incremental innovation and bad at <a href="http://en.wikipedia.org/wiki/Disruptive_technology">disruptive innovation</a>. As far as I can tell, faceted search is very disruptive to their model.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/08/14/why-does-google-hold-back-on-faceted-search/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/08/14/why-does-google-hold-back-on-faceted-search/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
		<item>
		<title>An Able Grape at the Helm of Twitter Search</title>
		<link>http://thenoisychannel.com/2009/08/13/an-able-grape-at-the-helm-of-twitter-search/</link>
		<comments>http://thenoisychannel.com/2009/08/13/an-able-grape-at-the-helm-of-twitter-search/#comments</comments>
		<pubDate>Thu, 13 Aug 2009 14:54:59 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2432</guid>
		<description><![CDATA[While I am an avid Twitter user (and apparently a tradeable commodity in a &#8220;Fantasy Twitter&#8221; game that some friends are playing), regular readers know that I&#8217;ve offered mixed reviews of Twitter Search. I&#8217;ve link-baited Summize founder and Twitter Chief Scientist Abdur Chowdhury here once or twice, but I understand that he&#8217;s no longer running [...]]]></description>
			<content:encoded><![CDATA[<p>While I am an avid Twitter user (and apparently a <a href="http://twitter.com/mmallin/statuses/3255811854">tradeable commodity</a> in a &#8220;Fantasy Twitter&#8221; game that some friends are playing), regular readers know that I&#8217;ve offered <a href="http://thenoisychannel.com/2009/05/09/the-twouble-with-twitter-search/">mixed reviews of Twitter Search</a>.</p>
<p>I&#8217;ve link-baited <a href="http://www.crunchbase.com/company/summize">Summize</a> founder and Twitter Chief Scientist <a href="http://twitter.com/abdur">Abdur Chowdhury</a> here <a href="http://thenoisychannel.com/2009/03/04/twitters-real-time-search-aint-that-hard/#comment-2215">once</a> or <a href="http://thenoisychannel.com/2009/03/15/a-scaling-challenge-for-twitter-search/#comment-2366">twice</a>, but I understand that he&#8217;s no longer running Twitter Search. They&#8217;ve got a new guy, <a href="http://www.linkedin.com/pub/doug-cook/3/a63/18b">Doug Cook</a>, as Director of Search.</p>
<p>This is great news, because Doug is someone who&#8217;s thought a lot about search and user experience. He was one of the early web search guys at <a href="http://en.wikipedia.org/wiki/Inktomi_Corporation">Inktomi</a> and also spent some time at Yahoo!, but what impresses me most is a project he&#8217;s pursued as a labor of love: <a href="http://ablegrape.com/">Able Grape</a>.</p>
<p>From their about page:</p>
<blockquote><p>We&#8217;re a wine search engine — not for comparison shopping, but for learning and research. We aim to be the world&#8217;s most comprehensive, up-to-date, and authoritative source for online wine information.</p></blockquote>
<p>Great, another vertical search engine, just what the world needs (unfortunately <a href="http://wordpress.org/development/2009/08/2-8-4-security-release/">WordPress 2.8.4</a> doesn&#8217;t support <a href="http://www.glennmcanally.com/sarcastic/">sarcastic font</a>). But seriously, Able Grape is worth a look, even if, like me, you are not a wine nerd. So wash your glasses and let&#8217;s have a quick tasting.</p>
<p>First off, Able Grape is not searching a proprietary document collection. Rather, it&#8217;s based on a focused crawl of &#8220;more than 38,000 sites and some 18 million pages.&#8221; In other words, Able Grape is in no position to ask anyone to add meta-data. Even at the site level, I doubt Doug had the time to customize the handling of content for each of 38,000 sites. In short, there&#8217;s enough scale here to make the problem interesting.</p>
<p>Now let&#8217;s look at some <a href="http://ablegrape.com/en/help.html#faq7">examples</a> of the site in action. I&#8217;m a fan of Spanish wines, so I&#8217;ll start with one of their example queries, <a href="http://ablegrape.com/search.jsp?query=tempranillo">tempranillo</a>. The first page of results looks relevant to the topic, but so far that doesn&#8217;t distinguish them from <a href="http://www.google.com/search?q=tempranillo">Google</a>, <a href="http://search.yahoo.com/search?p=tempranillo">Yahoo</a>, or <a href="http://www.bing.com/search?q=tempranillo">Bing</a>. What surprises me is that the &#8220;Filter by Region&#8221; offers regions outside of Spain&#8211;like <a href="http://ablegrape.com/search.jsp?encodedParams=11bb8312540771234eb15afc143fa50ed5c86d4bb507bcd5d0182b283a8a869ed172982ed5018da581e02a4c2daf62d87e2ff090237708946ae124bfed4e0f00dd3cd394d5dd86fe7ba0de271790e1c28b97950a43175760c554f6f2e6fec245">California</a> and even <a href="http://ablegrape.com/search.jsp?encodedParams=1bd4e11cf19e40387da63b079e805b184a5e6b53211a8f0a5baefead3fee1848d172982ed5018da581e02a4c2daf62d87e2ff090237708946ae124bfed4e0f00dd3cd394d5dd86fe7ba0de271790e1c2d827ef4bfa34d75cf8061087be0ce4cb">New York</a>! Yes, I might have learned some of that from <a href="http://en.wikipedia.org/wiki/Tempranillo#Regions">Wikipedia</a>&#8211;though it would not even have occurred to me to ask about non-Spanish Tempranillo. That&#8217;s <a href="http://en.wikipedia.org/wiki/Exploratory_search">exploratory search</a> and serendipitous discovery for you!</p>
<p>Let&#8217;s try a different example, this time not from their list. I like <a href="http://en.wikipedia.org/wiki/Malbec">Malbec</a> wines (which I associate with my maternal link to Argentina), but the only local wine region for me is the <a href="http://en.wikipedia.org/wiki/North_Fork_of_Long_Island_AVA">North Fork of Long Island</a>. So here&#8217;s a search for <a href="http://ablegrape.com/search.jsp?query=north+fork+malbec&amp;encodedParams=fec890854566bf87cca0dbb3dfec98a4257ada31b0790c5db6f420faa5cfff480978519eebc5d1279e87dea20af430cae6326aa4e72abe51bfded02e8ef5864b9ce543b455754affc3c94ccedc949651&amp;hreftarget=">north fork malbec</a>, filtered to <a href="http://ablegrape.com/search.jsp?encodedParams=1bd4e11cf19e40387da63b079e805b18e542a072be1b02058a08a910cbdf7317a97c585255ef0c68ce4cf96f9cdf70639cca6b31acf0bb0687a5ddd7ab47c4b0fbc61c7877df866ce347135fb09de21dc2a3032062caaaa6ab58eb3959ba369d3e6557b611cf02cd143424ffe41bfcfbcdc664e3f5b5fbbc49700065909078a080aaefa630a3eb32cfd3af6fe02feb3aa2ac8250704788e278ecabd086e12a2db53f9ea260ff99766d5d6eaf6db0b34a">Long Island</a>. It certainly gives me ideas of which wineries to check out on my next trip there. To be fair to the competition, <a href="http://www.google.com/search?q=north+fork+malbec">G</a>/<a href="http://search.yahoo.com/search?p=north+fork+malbec">Y</a>/<a href="http://www.bing.com/search?q=north+fork+malbec">B</a> all handle this query pretty well&#8211;though none of them offer refinement by region to disambiguate &#8220;north fork&#8221;.</p>
<p>Able Grape has lots of cool features, ranging from how they handle <a href="http://ablegrape.wordpress.com/2009/01/04/spiffy-new-language-features/">multilingual content</a> to clever use of constrained &#8220;wildcard&#8221; terms like <a href="http://ablegrape.wordpress.com/2008/09/10/fun-experimental-feature-variety-finder/">anyvariety</a> to match any wine variety (aka <a href="http://en.wikipedia.org/wiki/Varietal">varietal</a>). I suspect that there is much to learn from its design that applies to a broad variety (sorry!) of search applications.</p>
<p>I&#8217;m a wine dilettante, so it&#8217;s hard for me to spend too much time on this site without any deep-seated information needs to fulfill. But I&#8217;m a card-carrying member of searchaholics anonymous (well, maybe not so anonymous), and I&#8217;m impressed by what Doug&#8217;s done with this vertical.</p>
<p>Which brings us back to Twitter Search. Director of Search for Twitter is a high-profile, high-pressure job, even without Facebook <a href="http://www.google.com/hostednews/afp/article/ALeqM5g8dOOsKnnFqLqZHxoLc-g0dvZGNg">nipping at Twitter&#8217;s heels</a>. I&#8217;m sure Able Grape will ferment for a while as Doug devotes his creative energies to improving Twitter Search. I certainly hope he brings the same focus and sensitivity to his new endeavor and makes Twitter Search a <a href="http://en.wikipedia.org/wiki/Grand_cru">grand cru</a> of search engines.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/08/13/an-able-grape-at-the-helm-of-twitter-search/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/08/13/an-able-grape-at-the-helm-of-twitter-search/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Lots of Search News Today!</title>
		<link>http://thenoisychannel.com/2009/08/10/lots-of-search-news-today/</link>
		<comments>http://thenoisychannel.com/2009/08/10/lots-of-search-news-today/#comments</comments>
		<pubDate>Tue, 11 Aug 2009 01:40:17 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2423</guid>
		<description><![CDATA[I try not to write posts that are just cut-and-paste from Techmeme, but it&#8217;s hard to resist a trio like this: Facebook rolls out new version of search Facebook acquires FriendFeed Google testing new &#8220;caffeine&#8221; web search infrastructure OK, perhaps that last item isn&#8217;t strictly search news, but it may as well be, given that [...]]]></description>
			<content:encoded><![CDATA[<p>I try not to write posts that are just cut-and-paste from <a href="http://techmeme.com/">Techmeme</a>, but it&#8217;s hard to resist a trio like this:</p>
<ul>
<li><a href="http://blog.facebook.com/blog.php?post=115469877130">Facebook rolls out new version of search</a></li>
<li><a href="http://www.facebook.com/press/releases.php?p=116581">Facebook acquires FriendFeed</a></li>
<li><a href="http://googlewebmastercentral.blogspot.com/2009/08/help-test-some-next-generation.html">Google testing new &#8220;caffeine&#8221; web search infrastructure</a></li>
</ul>
<p>OK, perhaps that last item isn&#8217;t strictly search news, but it may as well be, given that the microblogging wars are in no small part about &#8220;real time&#8221; search.</p>
<p>I&#8217;m not a huge Facebook fan (as those of you who have looked at my spartan <a href="http://www.facebook.com/noisychannel">profile page</a> may have noticed), but I am curious about how they&#8217;re implementing search over their sprawling collection of content. I&#8217;m underwhelmed with my own search experience on the site, but that might be my own fault for not being an active Facebook participant. Perhaps folks here who are more active can share their own experiences.</p>
<p>As for the acquisition of FriendFeed, I&#8217;m surely in good company in assuming this was Facebook&#8217;s second choice after the attempt to acquire Twitter <a href="http://kara.allthingsd.com/20081124/when-twitter-met-facebook-the-acquisition-deal-that-fail-whaled/">fell through</a>. If, as has been <a href="http://kara.allthingsd.com/20090810/facebook-acquires-not-twitter-oops-friendfeed-plus-the-full-press-release/">reported</a>, Facebook only paid $50M for FriendFeed, then the acquisition was pocket change compared to the $500M they offered Twitter (granted, some or all of that was based on a controversial valuation of Facebook). Anyway, it should keep life interesting in the status-sphere.</p>
<p>And then there&#8217;s Google&#8217;s preview site, which you can try <a href="http://www2.sandbox.google.com/">here</a>. The only difference I see between it and the non-preview Google search is that the estimated result counts tend to be slightly higher. The top-ranked results seem almost identical, modulo tiny permutations for the queries I checked, as do related searches and any other features I tried. But apparently that&#8217;s the idea:</p>
<blockquote><p>The new infrastructure sits &#8220;under the hood&#8221; of Google&#8217;s search engine, which means that most users won&#8217;t notice a difference in search results. But web developers and power searchers might notice a few differences, so we&#8217;re opening up a web developer preview to collect feedback.</p></blockquote>
<p>Anyway, it&#8217;s more fun reading all of this stuff than hearing the CEO of one of the web&#8217;s great search brands proclaim that her company has <a href="http://bits.blogs.nytimes.com/2009/08/07/yahoo-ceo-we-have-never-been-a-search-company/">never been a search company</a>&#8211;or wondering where all the great search people I know there will land as Yahoo search is assimilated into Bing. Don&#8217;t get me wrong, I&#8217;m looking forward to the competition between Google and Microsoft&#8211;one that I think will finally be waged in earnest. But I&#8217;m still sad for Yahoo and its employees.</p>
<p>Which brings us to the last news item: <a href="http://en.wikipedia.org/wiki/Doug_Cutting">Doug Cutting</a> is <a href="http://blog.lucene.com/2009/08/10/joining-cloudera/">leaving Yahoo</a> for <a href="http://www.cloudera.com/">Cloudera</a>, where he&#8217;ll continue to work on <a href="http://hadoop.apache.org/">Hadoop</a>. According to his blog post about it, &#8220;This move will not fundamentally change my day-to-day activities.&#8221; It will certainly be interesting to see what comes next from someone who has been instrumental to so many major open-source packages associated with search.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/08/10/lots-of-search-news-today/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/08/10/lots-of-search-news-today/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Public Expression, Liability, and Anonymity</title>
		<link>http://thenoisychannel.com/2009/08/08/public-expression-liability-and-anonymity/</link>
		<comments>http://thenoisychannel.com/2009/08/08/public-expression-liability-and-anonymity/#comments</comments>
		<pubDate>Sat, 08 Aug 2009 23:10:11 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2412</guid>
		<description><![CDATA[A colleague just sent me a link to a story about a Twitter user being sued for a tweet. At least he&#8217;s not being sued in London. I&#8217;m strongly if not absolutely in favor of freedom of expression, so it&#8217;s hard not to find such cases depressing. Nonetheless, I don&#8217;t think the legal landscape has [...]]]></description>
			<content:encoded><![CDATA[<p>A colleague just sent me a link to a story about a <a href="http://insurance-expert.typepad.com/cliff/2009/07/twitter-user-being-sued-for-tweet-will-your-insurance-policy-respond.html">Twitter user being sued for a tweet</a>. At least he&#8217;s not being <a href="http://blogs.zdnet.com/Howlett/?p=1134">sued in London</a>.</p>
<p>I&#8217;m strongly if not absolutely in favor of freedom of expression, so it&#8217;s hard not to find such cases depressing. Nonetheless, I don&#8217;t think the legal landscape has changed.</p>
<p>Rather, what has changed (or accelerated) is that:</p>
<ul>
<li>It is easier for people to express themselves publicly&#8211;and hence far more people are doing it.</li>
<li>The detached nature of online communication releases people&#8217;s inhibitions. Moreover, people not only don&#8217;t self-censor, but in some cases are deliberately provocative to attract attention.</li>
<li>The speed and efficiency of distribution (especially through search / alerts) means that the people most likely to be or feel damaged by an act of public expression are far more likely to discover that act.</li>
</ul>
<p>So it&#8217;s not surprising that users are being sued for what they say online&#8211;it&#8217;s an expected consequence of the democratization of publishing, especially in the litigious English-speaking countries on both sides of the pond.</p>
<p>I&#8217;d personally like to see a higher bar for someone to initiate a defamation lawsuit&#8211;let alone win it&#8211;but I&#8217;m not holding my breath. Instead, I expect that we&#8217;ll see more anonymous expression by people who don&#8217;t feel the authenticity of disclosure justifies the risk of retaliation. Oh well.</p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/08/08/public-expression-liability-and-anonymity/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/08/08/public-expression-liability-and-anonymity/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>SIGIR 2009: Day 3, Industry Track: Nick Craswell</title>
		<link>http://thenoisychannel.com/2009/08/02/sigir-2009-day-3-industry-track-nick-craswell/</link>
		<comments>http://thenoisychannel.com/2009/08/02/sigir-2009-day-3-industry-track-nick-craswell/#comments</comments>
		<pubDate>Mon, 03 Aug 2009 03:11:59 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2386</guid>
		<description><![CDATA[One of the things I didn&#8217;t consider when I signed on to organize the SIGIR 2009 Industry Track was that I&#8217;d have to replace speakers and panelists on less than two weeks&#8217; notice. But what I couldn&#8217;t even have imagined was replacing a speaker on less than 24 hours&#8217; notice! Tuesday morning, the second day [...]]]></description>
			<content:encoded><![CDATA[<p>One of the things I didn&#8217;t consider when I signed on to organize the <a href="http://sigir2009.org/">SIGIR 2009</a> <a href="http://sigir2009.org/Program/industry">Industry Track</a> was that I&#8217;d have to replace speakers and panelists on less than two weeks&#8217; notice. But what I couldn&#8217;t even have imagined was replacing a speaker on less than 24 hours&#8217; notice!</p>
<p>Tuesday morning, the second day of the conference and the day before the Industry Track, I woke up to an email from Tip House of the <a href="http://www.oclc.org/">OCLC</a>, whom I&#8217;d planned to have speak about his experiences developing <a href="http://www.worldcat.org/">Worldcat.org</a>, the world&#8217;s largest bibliographic database. Unfortunately, he had fallen ill and would not be able to make it to the conference.</p>
<p>I was determined not to have a hole in the program. I immediately sent an email to the Director of Search at LinkedIn, whom I had just met at the poster session the previous evening, hoping he might have a presentation tucked away about LinkedIn&#8217;s recent launch of <a href="http://thenoisychannel.com/2009/07/15/linkedin-rolling-out-faceted-search/">faceted people search</a>. I also <a href="http://twitter.com/dtunkelang/status/2757507120">turned to Twitter</a>&#8211;which actually earned me a plausible suggestion.</p>
<p>But it was during the morning coffee break that serendipity struck. As I walked by the <a href="http://bing.com/">Bing</a> exhibitor table, I saw <a href="http://www.jopedersen.com/">Jan Pedersen</a>, Chief Scientist of Core Search at Microsoft, chatting with <a href="http://www.linkedin.com/pub/peter-bailey/0/4aa/b7">Peter Bailey</a>, an applied researcher on the Bing team. I turned to them and, in my most charming voice, asked if they might be interested in having someone on their team talk about Bing the next day. They took a few minutes to think it over, and then replied in the affirmative, producing <a href="http://research.microsoft.com/en-us/people/nickcr/">Nick Craswell</a>, also an applied researcher. Problem solved, and I can proudly say that I Binged for it!</p>
<p>Nick talked about query modeling, focusing on issues like query ambiguity, session context, and temporal query dynamics (particularly seasonality). He talked a bit about a technique that involved <a href="http://research.microsoft.com/apps/pubs/default.aspx?id=65235">random walks on click logs</a>&#8211;a technique I remember striking me when I first heard him talk about it at <a href="http://ecir2008.dcs.gla.ac.uk/industry.html">ECIR 2008</a>.</p>
<p>The talk was a bit raw&#8211;understandably so given the short notice. But it was great to see a major web search practitioner connecting information retrieval research to an actual product. Yes, there were the standard caveats about not revealing secret sauce, but the talk was open and substantive. Indeed, I hope Nick will be able to share the slides!</p>
<p><strong>UPDATE: Nick emailed me the <a href="http://thenoisychannel.com/wordpress/wp-content/uploads/2009/08/Bing-SIGIR09-Industry-Day.pdf">slides</a> and gave me permission to post them here.</strong></p>
<script type="text/javascript" src="http://platform.linkedin.com/in.js"></script><script type="in/share" data-url="http://thenoisychannel.com/2009/08/02/sigir-2009-day-3-industry-track-nick-craswell/"></script>]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/08/02/sigir-2009-day-3-industry-track-nick-craswell/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Are Academic Conferences Broken? Can We Fix Them?</title>
		<link>http://thenoisychannel.com/2009/08/02/are-academic-conferences-broken-can-we-fix-them/</link>
		<comments>http://thenoisychannel.com/2009/08/02/are-academic-conferences-broken-can-we-fix-them/#comments</comments>
		<pubDate>Sun, 02 Aug 2009 20:20:06 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2375</guid>
		<description><![CDATA[I&#8217;d hoped to get through all of the SIGIR 2009 Industry Track before blogging about anything else (such as Yahoo! search going bada-Bing), but clearly I&#8217;m taking too long. So I&#8217;m following Daniel Lemire&#8217;s suggestion that I post a recent comment on Lance Fortnow&#8217;s blog (actually a response to his CACM column entitled &#8220;Time for [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;d hoped to get through all of the <a href="http://sigir2009.org/">SIGIR 2009</a> <a href="http://sigir2009.org/Program/industry">Industry Track</a> before blogging about anything else (such as <a href="http://www.choicevalueinnovation.com/">Yahoo! search going bada-Bing</a>), but clearly I&#8217;m taking too long. So I&#8217;m following Daniel Lemire&#8217;s <a href="http://twitter.com/lemire/statuses/3090462230">suggestion</a> that I post a recent <a href="http://blog.computationalcomplexity.org/2009/07/why-go-to-conferences.html">comment on Lance Fortnow&#8217;s blog</a> (actually a response to his CACM column entitled &#8220;<a href="http://cacm.acm.org/magazines/2009/8/34492-time-for-computer-science-to-grow-up/">Time for Computer Science to Grow Up</a>&#8221;) here at The Noisy Channel.</p>
<p>It&#8217;s nice to see this piece joining a growing chorus questioning the way we conflate the distinct concerns of disseminating knowledge, establishing professional reputation, and building community. This problem is not unique to computer science, but we are certainly in a position to lead by example in addressing it.</p>
<p>In an age where distribution is nearly free, I agree that we should move the filtering role from content publishers to content consumers. There&#8217;s no economic reason today why scholarship (or purported scholarship) shouldn&#8217;t be published online. Of course, the ability to publish digital content for free (or close to free) does not imply anyone will (or should) read what you write. The blogosphere offers an instructive example: the overwhelming majority of blogs attract few (if any) readers. I suspect that the same holds true for <a rel="nofollow" href="http://arxiv.org/">arXiv.org</a>. Of course, peer-reviewed content may not fare that much better, particularly given the proliferation of peer-reviewed venues. Regardless, it makes no sense for publishers to act as filters in an age of nearly-free digital distribution.</p>
<p>That brings us to the question of how researchers should establish their professional reputation&#8211;and, in the case of academics, obtain tenure and promotion. Today, they have to publish in peer-reviewed journals and conferences. Even if we accept the weaknesses of the current peer-review regime, we should be able to separate content assessment from distribution. The peer-review process (and review processes in general) should serve to endorse content&#8211;and ideally even to improve it&#8211;rather than to filter it.</p>
<p>Finally, conferences should primarily serve to build community. I find the main value of conferences and workshops to be face-to-face interaction, and I&#8217;ve heard many people express similar sentiments. Part of the problem is that so few presenters at conferences invest in (or have the skills for) delivering strong presentations. But more fundamentally it&#8217;s not even clear that the presentations are the point of a conference&#8211;after all, an author&#8217;s main motive for submitting an article to a conference seems to be getting it into the proceedings.</p>
<p>Here are some questions I&#8217;d like to suggest we consider as a community:</p>
<p>What if presentation at a conference were optional, and an author&#8217;s decision to present had no effect on inclusion in the proceedings? Would there be significantly fewer presentations? Would those fewer presentations be of higher quality?</p>
<p>What if the process of peer-reviewing conference submissions required the submission of presentation materials rather than (or in addition to) a paper? Would the accepted presentations be of higher quality? Would researchers invest more in presentation skills? What would happen to strong researchers without such skills?</p>
<p>Can we update the traditional conference format to foster more productive interaction among researchers? For example, should we have more poster sessions and fewer paper presentations?</p>
<p>I&#8217;d love to see the computer science community take the lead in evolving what increasingly feel like dated procedures for disseminating knowledge, establishing professional reputation, and building community. I&#8217;ve tried to do my small part, co-organizing <a rel="nofollow" href="http://cuaslis.org/hcir2009/">workshops on Human-Computer Interaction and Information Retrieval (HCIR)</a> that emphasize face-to-face interaction and organizing the <a rel="nofollow" href="http://www.sigir2009.org/Program/industry">SIGIR 2009 Industry Track</a> as a series of invited talks and panels from strong presenters. But I&#8217;m encouraged to see &#8220;establishment&#8221; types like <a href="http://cacm.acm.org/magazines/2009/5/24632-conferences-vs-journals-in-computing-research/fulltext">Moshe Vardi</a> and Lance Fortnow leading the charge to question the status quo.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/08/02/are-academic-conferences-broken-can-we-fix-them/feed/</wfw:commentRss>
		<slash:comments>33</slash:comments>
		</item>
		<item>
		<title>SIGIR 2009: Day 3, Industry Track: Matt Cutts</title>
		<link>http://thenoisychannel.com/2009/07/29/sigir-2009-day-3-industry-track-matt-cutts/</link>
		<comments>http://thenoisychannel.com/2009/07/29/sigir-2009-day-3-industry-track-matt-cutts/#comments</comments>
		<pubDate>Wed, 29 Jul 2009 04:05:37 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2361</guid>
		<description><![CDATA[At last we arrive at the SIGIR 2009 Industry Track. Since I organized this track (which mainly involved coming up with a program and then actually producing the speakers), I&#8217;m not exactly an impartial observer. But hopefully the organizers of future industry tracks will benefit from my perspective as an organizer. Last December (New Year&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>At last we arrive at the <a href="http://sigir2009.org/">SIGIR 2009</a> <a href="http://sigir2009.org/Program/industry">Industry Track</a>. Since I <a href="http://sigir2009.org/about/organizers">organized</a> this track (which mainly involved coming up with a program and then actually producing the speakers), I&#8217;m not exactly an impartial observer. But hopefully the organizers of future industry tracks will benefit from my perspective as an organizer.</p>
<p>Last December (New Year&#8217;s Eve, to be precise), I started recruiting speakers. I started with a list of topics I wanted to see covered, and one of those topics was spam / <a href="http://en.wikipedia.org/wiki/Adversarial_information_retrieval">adversarial information retrieval</a>. My top two choices were <a href="http://www.mattcutts.com/blog/about-me/">Matt Cutts</a> and <a href="http://singhal.info/">Amit Singhal</a>, both members of the Search Quality group at Google. I&#8217;d heard Amit speak before: he delivered one of the <a href="http://ecir2008.dcs.gla.ac.uk/keynote_speakers.html">keynotes</a> at <a href="http://ecir2008.dcs.gla.ac.uk/">ECIR 2008</a> (and inspired <a href="http://thenoisychannel.com/2008/04/08/qa-with-amit-singhal-2/">one of my first blog posts</a>!). So I decided to aim for Matt Cutts, despite having no way to contact him (the head of Google&#8217;s Webspam team is understandably a bit protective of his personal email address). And, just two weeks later, I had Matt locked in to the program.</p>
<p>Matt was an incredible speaker, and he had the unenviable task of opening the Industry Track at 8:30 AM, the morning after the banquet. His title, &#8220;WebSpam and Adversarial IR: The Road Ahead&#8221;, gave him a fair amount of maneuvering room, and he used his 45 minutes to give the audience a peek into his world.</p>
<p>He opened the talk by inducing the audience to try to think like a spammer. He then gave examples of <a href="http://en.wikipedia.org/wiki/Social_engineering_%28security%29">social engineering</a> attacks, to put us in a &#8220;<a href="http://en.wikipedia.org/wiki/Black_hat">black hat</a>&#8221; mindset. He also pointed out the danger of punishing sites with spammy inlinks: people and companies would use this knowledge against their competitors / enemies (the practice has been called &#8220;<a href="http://seoblackhat.com/category/googlebowling/">Google bowling</a>&#8221;).</p>
<p>He then moved on to examples of spam techniques. He showed examples of pages whose spamminess is only detectable by parsing JavaScript, something I wasn&#8217;t aware that Google could do (though apparently this has been <a href="http://www.webmasterworld.com/forum3/12393.htm">public knowledge for a while</a>). The theoretical computer scientist in me wonders about using <a href="http://en.wikipedia.org/wiki/Random_self-reducibility">random self-reducibility</a> as obfuscation on steroids, but hopefully spammers aren&#8217;t quite that sophisticated yet!</p>
<p>He offered a common-sense framework for fighting spam: reduce the return on investment. Unfortunately, he sees a trend in spam where spammers are aiming for faster, higher payoffs by hacking sites and installing <a href="http://en.wikipedia.org/wiki/Malware">malware</a>. Indeed, the democratizing effect of social media means that a lot more people have pages that can serve spam, including their Twitter and Facebook pages. He invited the information retrieval community to invest effort in learning how to automatically detect that a page or server has been hacked.</p>
<p>My only quibble with the talk is that Matt did not discuss the inherent subjectivity of spam. Sure, there are many cases that are black and white, but ultimately spam (like relevance) is in the eye of the user. I&#8217;d love to see more use of techniques like <a href="http://www.itu.int/osg/spu/spam/contributions/Spam%20economics-faq.pdf">attention bond mechanisms</a> that accommodate a subjective definition of spam, e.g., &#8220;any email that you would rather have not received.&#8221;</p>
<p>But I quibble. Matt delivered an excellent talk to a packed audience, and it was a real privilege to have him kick off the Industry Track.</p>
<p>ps. You can also read <a href="http://www.searchenginecaffe.com/2009/07/matt-cutts-sigir-industry-day-web-spam.html">Jeff Dalton’s notes</a> on Matt&#8217;s presentation.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/07/29/sigir-2009-day-3-industry-track-matt-cutts/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>SIGIR 2009: Day 2, Morning Sessions (Anchor Text, Vertical Search)</title>
		<link>http://thenoisychannel.com/2009/07/25/sigir-2009-day-2-morning-sessions-anchor-text-vertical-search/</link>
		<comments>http://thenoisychannel.com/2009/07/25/sigir-2009-day-2-morning-sessions-anchor-text-vertical-search/#comments</comments>
		<pubDate>Sat, 25 Jul 2009 16:28:04 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2347</guid>
		<description><![CDATA[Sorry for the delay in postings. Not only was I super-busy the past week, but I had some connectivity challenges (both at SIGIR and at the apartment where I was staying) and mostly restricted my online activity to occasional tweets during talks. I meant to catch up on my blogging yesterday, but instead spent the [...]]]></description>
			<content:encoded><![CDATA[<p>Sorry for the delay in postings. Not only was I super-busy the past week, but I had some connectivity challenges (both at <a href="http://sigir2009.org/">SIGIR</a> and at the apartment where I was staying) and mostly restricted my online activity to occasional <a href="http://search.twitter.com/search?q=dtunkelang+sigir09">tweets</a> during talks. I meant to catch up on my blogging yesterday, but instead spent the day <a href="http://www.northfork.org/wineries.html">wine tasting in Long Island</a>. But enough apologizing, I&#8217;m refreshed and ready to blog up a storm!</p>
<p>The second day of SIGIR (Tuesday) started straight off with research talks. I went to the web retrieval session, which consisted of two talks about <a href="http://en.wikipedia.org/wiki/Anchor_text">anchor text</a> and one about privacy-preserving link analysis.</p>
<p>&#8220;<a href="http://research.yahoo.com/files/ataggregation.pdf">Building Enriched Document Representations using Aggregated Anchor Text</a>&#8221;, by <a href="http://research.yahoo.com/Don_Metzler">Don Metzler</a> and colleagues at Yahoo Labs. They address the challenge of anchor text sparsity (the distribution of in-links for web pages follows a <a href="http://en.wikipedia.org/wiki/Power_law">power law</a>) by enriching document representation through aggregation of anchor text along the web graph. Their technique is intuitive, and the authors demonstrate statistically significant improvements in retrieval effectiveness. Unfortunately, their results are not repeatable, since they used a proprietary test collection to obtain them.</p>
<p>The second talk of the session, &#8220;<a href="http://research.microsoft.com/pubs/80581/sigirfp185-dou.pdf">Using Anchor Texts with Their Hyperlink Structure for Web Search</a>&#8221;, was by a group of authors from Microsoft Research Asia. They address the opposite problem of the previous paper: how to handle too much, rather than too little, anchor text. Specifically, they model dependence among multiple anchor texts associated with the same target document. Like the Yahoo folks, they demonstrate statistically significant results on a proprietary test collection.</p>
<p>The third talk, &#8220;<a href="http://portal.acm.org/citation.cfm?doid=1571941.1571983">Link Analysis for Private Weighted Graphs</a>&#8221; (ACM DL subscribers only) by <a href="http://www.fe.dis.titech.ac.jp/member/jun/index.htm">Jun Sakuma</a> (University of Tsukuba) and Shigenobu Kobayashi (Tokyo Institute of Technology), was a bit of an outlier, if one can call a paper in a three-paper session an outlier. The authors offer privacy-preserving expansions of <a href="http://en.wikipedia.org/wiki/PageRank">PageRank</a> and <a href="http://en.wikipedia.org/wiki/HITS_algorithm">HITS</a>, the best-known link analysis methods associated with relevance and authority in web search. I&#8217;ve noticed an increasing number of papers like these that mix cryptography with information retrieval or database concerns. One of my frustrations in reading such papers is that I always suspect that people are re-inventing wheels because so few people are able to keep up with research in multiple disciplines.</p>
<p>Then I had the coffee break to solve my own research problem: how to fill the 11:30 slot in the Wednesday <a href="http://sigir2009.org/Program/industry">Industry Track</a>, since a speaker called in sick that morning. When I walked by the <a href="http://www.bing.com/">Bing</a> table, I saw <a href="http://www.jopedersen.com/">Jan Pedersen</a> (Chief Scientist for Core Search at Microsoft), and I begged him to help me out. I must have been a persuasive supplicant, because he procured me <a href="http://research.microsoft.com/en-us/people/nickcr/">Nick Craswell</a>, an applied researcher who works on Bing. Out of gratitude for this 11th-hour favor, I wore a Bing t-shirt all day yesterday as I went wine-tasting. Bing drinking, not binge drinking!</p>
<p>Anyway, that urgent problem resolved, I went back to enjoying the conference. For the second morning session, I went to the vertical search session.</p>
<p>As it turns out, that session kicked off with the SIGIR Best Paper winner: &#8220;<a href="http://www.cs.cmu.edu/~jaime/SIGIR09Arguello.pdf">Sources of Evidence for Vertical Selection</a>&#8221; by <a href="http://www.cs.cmu.edu/~jaime/">Jaime Arguello</a> (CMU), <a href="http://labs.yahoo.com/user/150">Fernando Diaz</a> (Yahoo), <a href="http://www.cs.cmu.edu/~callan/">Jamie Callan</a> (CMU), and <a href="http://labs.yahoo.com/user/164">Jean-François Crespo</a> (Yahoo). The authors do a lot of things I like: they apply <a href="http://comminfo.rutgers.edu/etc/mongrel/cronen-townsend-croft-hlt.pdf">query clarity</a> as a performance predictor, and they <a href="http://ciir.cs.umass.edu/pubfiles/ir-468.pdf">bootstrap on an external collection</a> (specifically Wikipedia). The test collection they use for evaluation is proprietary, but that seems to be the price (at least today) of doing this kind of work.</p>
<p>The second talk of the session was by a subset of the previous paper&#8217;s authors: &#8220;<a href="http://www.cs.cmu.edu/~jaime/SIGIR09Diaz.pdf">Adaptation of Offline Vertical Selection Predictions in the Presence of User Feedback</a>&#8221; by Fernando Diaz and Jaime Arguello. The authors creatively used simulation to evaluate their approach. They did a nice job, but I have to admit I&#8217;m skeptical of results about feedback that aren&#8217;t based on user studies.</p>
<p>Unfortunately, I missed the third talk of the session because I had to play organizer. But I must have earned some good karma, because I got to enjoy a delightful lunch with <a href="http://people.ischool.berkeley.edu/~hearst/">Marti Hearst</a> and <a href="http://www.ir.iit.edu/~dagr/">David Grossman</a>.</p>
<p>Stay tuned for more posts about the interactive search session, the keynote by <a href="http://www.physics.neu.edu/Department/Vtwo/faculty/barabasi.htm">Albert-László Barabási</a>, the banquet at the <a href="http://www.jfklibrary.org/">JFK Presidential Library and Museum</a>, and of course the <a href="http://sigir2009.org/Program/industry">Industry Track</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/07/25/sigir-2009-day-2-morning-sessions-anchor-text-vertical-search/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>SIGIR 2009: Day 1</title>
		<link>http://thenoisychannel.com/2009/07/21/sigir-2009-day-1/</link>
		<comments>http://thenoisychannel.com/2009/07/21/sigir-2009-day-1/#comments</comments>
		<pubDate>Tue, 21 Jul 2009 05:09:24 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2339</guid>
		<description><![CDATA[SIGIR &#8217;09 is in full swing! I arrived on Sunday evening, and the reception was like Cheers (&#8220;where everyone knows your name&#8221;)&#8211;only that, at least in my case, I was meeting many people face-to-face for the first time in years, and in some cases for the first time, period! I reconnected with some of the [...]]]></description>
			<content:encoded><![CDATA[<p>SIGIR &#8217;09 is in full swing!</p>
<p>I arrived on Sunday evening, and the reception was like <em>Cheers</em> (&#8220;<a href="http://en.wikipedia.org/wiki/Theme_from_Cheers_%28Where_Everybody_Knows_Your_Name%29">where everyone knows your name</a>&#8221;)&#8211;only that, at least in my case, I was meeting many people face-to-face for the first time in years, and in some cases for the first time, period! I reconnected with some of the SIGIR regulars whom I&#8217;d missed last year (<a href="http://www.sigir2008.org/">Singapore</a> was a bit far for me), finally met my editor, <a href="http://www.linkedin.com/pub/diane-cerra/4/67a/93b">Diane Cerra</a> from <a href="http://www.morganclaypool.com/">Morgan &amp; Claypool</a>, and even ran into someone who is evaluating my company&#8217;s technology. And that was just Day 0.</p>
<p>Day 1 started bright and early with the 7:00 am newcomer&#8217;s breakfast, which brings together newcomers and &#8220;old hands&#8221;. I believe my role as organizer qualified me as an &#8220;old hand&#8221;, even though this is only my third SIGIR. Which might explain why <a href="http://www.cs.mu.oz.au/~jz/">Justin Zobel</a>, a real old hand (and one of this year&#8217;s <a href="http://www.sigir2009.org/about/organizers">program chairs</a>) joined my table. Of course, he hadn&#8217;t read my <a href="http://thenoisychannel.com/2009/07/17/in-defense-of-recall/">post</a> about his recent <a href="http://www.sigir.org/forum/2009J/2009j-sigirforum-zobel.pdf">SIGIR Forum essay</a>, so we chatted a bit about recall. Not surprisingly, we mostly agreed, and I have to give credit to the essay for provoking that and other good discussions today.</p>
<p>Then the conference started in earnest, with <a href="http://ischool.syr.edu/FACSTAFF/member.aspx?id=37">Liz Liddy</a> bestowing the <a href="http://www.sigir.org/awards/awards.html#salton">Salton Award</a> on <a href="http://research.microsoft.com/en-us/um/people/sdumais/">Susan Dumais</a>. In the tradition of the award, Sue delivered a keynote recounting her personal journey through the space of information retrieval. I was thrilled that her recognition called out her work to bring together information retrieval and human-computer interaction. Of course, some of us were ahead of the curve by recruiting her as the keynote for <a href="http://research.microsoft.com/en-us/um/people/ryenw/hcir2008/">HCIR &#8217;08</a>. <img src='http://thenoisychannel.com/wordpress/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> Naturally, I asked her why transparency, which she called out as a reason that users in her <a href="http://research.microsoft.com/en-us/um/people/cutrell/siscore-sigir2003.pdf">Stuff I&#8217;ve Seen</a> work preferred to explicitly sort results by date rather than accept the system&#8217;s best-first relevance ranking, was so absent in web search. Her answer was interesting: she feels that transparency is most useful for re-finding, and least useful for discovery. I&#8217;m not sure I agree with that explanation, but I&#8217;ll at least think about it a bit before I commit to disagreeing with it.</p>
<p>Some coffee, and then off to the first session of research papers. The presentation that stood out for me in this session was &#8220;<a href="http://research.microsoft.com/en-us/um/people/pauben/papers/sigir-2009-refined-experts-bennett-nguyen.pdf">Refined Experts</a>&#8221;, presented by <a href="http://research.microsoft.com/en-us/um/people/pauben/">Paul Bennett</a>. The paper offers a nice technique for improving hierarchical classification (by addressing the problems of error propagation through the hierarchy and the inherent non-linearity of hierarchies), and Paul is an outstanding presenter.</p>
<p>Then Diane Cerra and <a href="http://ils.unc.edu/~march/">Gary Marchionini</a> took the Morgan &amp; Claypool authors (and a few authors-to-be) to lunch at <a href="http://www.brasseriejoboston.com/">Brasserie Jo</a>. Great food, and even better company. My only regret is that I missed one of the talks in the first session after lunch, &#8220;<a href="http://bradipo.net/mark/papers/carman_sigir2009.pdf">A Statistical Comparison of Tag and Query Logs</a>&#8221;. I did like <a href="http://domino.research.ibm.com/comm/research_people.nsf/pages/carmel.index.html">David Carmel</a>&#8217;s talk on &#8220;Enhancing Cluster Labeling Using Wikipedia&#8221; in that same session, though I&#8217;ll need to do some homework to figure out what distinguishes it from other work in this area, such as an <a href="http://www.icde2008.org/">ICDE 2008</a> paper by <a href="http://www.cs.columbia.edu/~wisam/">Wisam Dakka</a> and <a href="http://pages.stern.nyu.edu/~panos/">Panos Ipeirotis</a> on &#8220;<a href="http://pages.stern.nyu.edu/~panos/publications/icde2008.pdf">Automatic Extraction of Useful Facet Hierarchies from Text Databases</a>&#8221;.</p>
<p>In the following session, I attended a couple of the efficiency talks. The talks were well presented, but in both cases I wondered if they were addressing the right problems. I&#8217;ve felt this way before at SIGIR efficiency talks, so perhaps my tastes are just idiosyncratic.</p>
<p>Then came the <a href="http://sigir2009.org/Program/posters">poster</a> / <a href="http://sigir2009.org/Program/demonstrations">demo</a> reception. Even with three hours, there was far too much to take in&#8211;and of course that session is as much about networking as it is about the posters and demos. I enjoyed the three hours, but I&#8217;ll have to go back to the proceedings to learn more about what I saw&#8211;and what I missed.</p>
<p>Finally, I wrapped up by leading a crew to <a href="http://www.tapeo.com/">Tapeo</a> for dinner&#8211;apparently a popular choice for attendees, since another table of 6 arrived shortly afterward. It was a nice cap to a fantastic but exhausting day.</p>
<p>I can&#8217;t promise I&#8217;ll keep this up daily, but I will blog about the rest of the conference when I have the chance. Meanwhile, here are some other folks blogging about SIGIR &#8217;09:</p>
<ul>
<li><a href="http://groups.csail.mit.edu/haystack/blog/?s=SIGIR09">David Karger</a> is live blogging on the Haystack blog.</li>
<li><a href="http://windowoffice.tumblr.com/">Jon Elsas</a> is blogging and posting some pictures.</li>
<li><a href="http://www.searchenginecaffe.com/">Jeff Dalton</a> posted detailed notes about Sue&#8217;s keynote.</li>
<li><a href="http://blog.semantichacker.com/?tag=sigir09">Mary McKenna</a> is blogging at the SemanticHacker blog.</li>
<li><a href="http://battellemedia.com/archives/004960.php">John Battelle</a> and <a href="http://palblog.fxpal.com/?p=1423">Gene Golovchinsky</a> couldn&#8217;t attend, but have both blogged about the conference.</li>
</ul>
<p>And follow on Twitter. The preferred hashtag is <a href="http://search.twitter.com/search?q=%23sigir09">#sigir09</a>, but I follow <a href="http://search.twitter.com/search?q=sigir+OR+sigir09+OR+sigir2009">sigir OR sigir09</a> to be safe.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/07/21/sigir-2009-day-1/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Heading to SIGIR</title>
		<link>http://thenoisychannel.com/2009/07/19/heading-to-sigir/</link>
		<comments>http://thenoisychannel.com/2009/07/19/heading-to-sigir/#comments</comments>
		<pubDate>Sun, 19 Jul 2009 15:53:06 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2336</guid>
		<description><![CDATA[Hope to see lots of you at SIGIR! Sounds like there are already great tutorials underway. I&#8217;ll get there tonight for the reception, where they will announce the triennial Gerard Salton Award winner (who will deliver tomorrow&#8217;s opening keynote). I&#8217;m looking forward to the paper, poster, and demo presentations, and of course to the Industry [...]]]></description>
			<content:encoded><![CDATA[<p>Hope to see lots of you at <a href="http://sigir2009.org/">SIGIR</a>! Sounds like there are already great <a href="http://sigir2009.org/Program/tutorials">tutorials</a> underway. I&#8217;ll get there tonight for the reception, where they will announce the triennial <a href="http://www.sigir.org/awards/awards.html#salton">Gerard Salton Award</a> winner (who will deliver tomorrow&#8217;s opening keynote). I&#8217;m looking forward to the <a href="http://sigir2009.org/Program/papers">paper</a>, <a href="http://sigir2009.org/Program/posters">poster</a>, and <a href="http://sigir2009.org/Program/demonstrations">demo</a> presentations, and of course to the <a href="http://sigir2009.org/Program/industry">Industry Track</a> on Wednesday. Unfortunately, I have to return to my day job on Thursday, so I won&#8217;t be able to attend any of the <a href="http://sigir2009.org/Program/workshops">workshops</a>.</p>
<p>If you&#8217;re attending, I hope you&#8217;ll find me and say hi&#8211;after over a year of blogging, there are far too many people I&#8217;ve gotten to know but never met face to face! If you&#8217;re not attending, then I encourage you to follow the coverage on Twitter. Since there seems to be some confusion about which hashtag to use, I suggest you follow <a href="http://search.twitter.com/search?q=sigir+OR+sigir09+OR+sigir2009">sigir OR sigir09 OR sigir2009</a> (yes, there is sometimes value to <a href="http://thenoisychannel.com/2009/07/17/in-defense-of-recall/">favoring recall</a>). I promise to blog about it when I get back, but I hope you&#8217;ll forgive me if The Noisy Channel is a bit quiet over the next few days.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/07/19/heading-to-sigir/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>LinkedIn Rolling Out Faceted Search!</title>
		<link>http://thenoisychannel.com/2009/07/15/linkedin-rolling-out-faceted-search/</link>
		<comments>http://thenoisychannel.com/2009/07/15/linkedin-rolling-out-faceted-search/#comments</comments>
		<pubDate>Thu, 16 Jul 2009 02:18:19 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2325</guid>
		<description><![CDATA[I&#8217;m glad I have a Twitter alert for &#8220;faceted search&#8221;, since it alerted me (via @getzsch) to a post in TechCrunch announcing that LinkedIn now has a People Search beta that offers faceted search. I can disclose now that I&#8217;ve known about this project for a while&#8211;they&#8217;d reached out to me after I offered a [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m glad I have a Twitter alert for <a href="http://search.twitter.com/search?q=%22faceted+search%22">&#8220;faceted search&#8221;</a>, since it alerted me (via <a href="http://twitter.com/getzsch">@getzsch</a>) to a post in <a href="http://www.techcrunch.com/2009/07/15/linkedin-drills-down-into-people-search-with-new-beta/">TechCrunch</a> announcing that <a href="http://linkedin.com/">LinkedIn</a> now has a <a href="http://www.linkedin.com/search?optIn=">People Search beta</a> that offers <a href="http://en.wikipedia.org/wiki/Faceted_search">faceted search</a>. I can disclose now that I&#8217;ve known about this project for a while&#8211;they&#8217;d reached out to me after I offered a <a href="http://thenoisychannel.com/2009/03/31/why-linkedin-frustrates-me/">lukewarm review</a> of their search&#8211;but I was asked to be discreet about that knowledge.</p>
<p>In any case, I wish I&#8217;d known about the beta launch earlier today, when I was looking for Boston-area colleagues to help me publicize the <a href="http://thenoisychannel.com/2009/07/15/sigir-meet-the-whos-who-of-search-and-information-retrieval/">SIGIR Industry Track</a>! The current interface is much more supportive of exploration.</p>
<p>It&#8217;s a nice implementation. The interface lets you refine the text search results by location, relationship (1st degree, 2nd degree, group, and other), industry, current / past company, and school. For a facet with a large number of values, like company, the interface only displays the top 10 values, and then lets you use type-ahead to refine by other companies. Unfortunately, the type-ahead was a bit buggy for me&#8211;but hey, it is a beta.</p>
<p>The application is fairly responsive, even for my search for &#8220;software&#8221;, which returns 2.4M results, 120K of which are 2nd-degree connections. Other than at <a href="http://endeca.com/">Endeca</a>, I haven&#8217;t seen anyone else mix faceted search with social networks, and LinkedIn has done a nice job of it.</p>
<p>So, if anyone from LinkedIn is reading this, congratulations and welcome to the wonderful world of faceted search. Count me a delighted customer. I hope my enthusiasm today makes up for my <a href="http://thenoisychannel.com/2008/11/25/linkedins-new-search-platform-a-review/">past criticism</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/07/15/linkedin-rolling-out-faceted-search/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Catching Up On Last Week&#8217;s News</title>
		<link>http://thenoisychannel.com/2009/07/10/catching-up-on-last-weeks-news/</link>
		<comments>http://thenoisychannel.com/2009/07/10/catching-up-on-last-weeks-news/#comments</comments>
		<pubDate>Sat, 11 Jul 2009 01:52:47 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2309</guid>
		<description><![CDATA[I hope everyone had a great week! It looks like I missed some interesting / controversial stories in the tech news / blogosphere, the most notable being: No to SQL? Anti-database movement gains steam A Comparison of Open Source Search Engines Introducing the Google Chrome OS Quick reactions: Regarding the anti-SQL movement, I would have [...]]]></description>
			<content:encoded><![CDATA[<p>I hope everyone had a great week! It looks like I missed some interesting / controversial stories in the tech news / blogosphere, the most notable being:</p>
<ul>
<li> <a href="http://www.computerworld.com/s/article/9135086/No_to_SQL_Anti_database_movement_gains_steam_">No to SQL? Anti-database movement gains steam</a></li>
<li><a href="http://zooie.wordpress.com/2009/07/06/a-comparison-of-open-source-search-engines-and-indexing-twitter/">A Comparison of Open Source Search Engines</a></li>
<li><a href="http://googleblog.blogspot.com/2009/07/introducing-google-chrome-os.html">Introducing the Google Chrome OS</a></li>
</ul>
<p>Quick reactions:</p>
<p>Regarding the anti-SQL movement, I would have thought the main complaint would be that SQL is too arcane a language for ordinary users to ever use it directly. Instead, the article discusses developers&#8217; complaints about databases, and these are mostly about price, speed, and scale. Evidently even free, open-source databases like <a href="http://www.mysql.com/">MySQL</a> are losing favor relative to tools like <a href="http://hadoop.apache.org/">Hadoop</a> and <a href="http://www.hypertable.org/">Hypertable</a> that don&#8217;t offer support for SQL. Of course, this picture comes from a meetup of 150 people that might not be entirely representative of information technology workers.</p>
<p>I know first-hand from my experience at <a href="http://endeca.com/">Endeca</a> that, to quote <a href="http://databasecolumn.vertica.com/2007/09/one-size-fits-all.html">Michael Stonebraker</a>, the &#8220;one size fits all&#8221; approach to databases is an idea whose time has come and gone. At Endeca, we have built our own special-purpose database to address information needs ill-served by the available <a href="http://en.wikipedia.org/wiki/Online_transaction_processing">OLTP</a> and <a href="http://en.wikipedia.org/wiki/Online_analytical_processing">OLAP</a> technologies. Still, I think it&#8217;s premature to declare the death of SQL or of relational databases. But why let that stand in the way of a good story?</p>
<p>On to the open-source search engine comparison. I won&#8217;t rehash the critique of the study, which you can find in the <a href="http://zooie.wordpress.com/2009/07/06/a-comparison-of-open-source-search-engines-and-indexing-twitter/#comments">80+ comments</a> from folks like <a href="http://www.searchenginecaffe.com/">Jeff Dalton</a>, <a href="http://lingpipe-blog.com/">Bob Carpenter</a>, and <a href="http://www.sematext.com/">Otis Gospodnetic</a>. Perhaps the most salient point is that it&#8217;s not clear how much sense it makes to perform &#8220;out of the box&#8221; evaluations. In any case, my impression is that <a href="http://lucene.apache.org/">Lucene</a> is by far the dominant player in the open-source search space; if the study has any effect, it will only be to reinforce that dominance.</p>
<p>And finally, the big news from the big G: a Google Operating System. Even my mom (who couldn&#8217;t name an existing operating system) was asking me about it, so clearly this one has made it into the mainstream media. And yet I don&#8217;t see why this is such a big deal. We have netbooks, and we even have Linux-based netbooks. As far as I&#8217;ve heard, the latter are popular with geeks and cheapskates, but that&#8217;s about it&#8211;most people are willing to pony up the few extra dollars for Windows XP. Will Google launching a netbook-oriented OS significantly affect this market? I suspect the only route to success is if they meet non-technical users&#8217; needs (browsing, email, media, light document editing) while minimizing their overhead (maintenance, security, compatibility). Will they be a better <a href="http://en.wikipedia.org/wiki/Ubuntu">Ubuntu</a>? Perhaps, much in the way that <a href="http://www.google.com/chrome">Chrome</a> is trying to be a better <a href="http://www.mozilla.com/en-US/firefox/">Firefox</a>. Why Google chooses to build its own free, open-source products rather than contribute to mature open-source projects is a mystery to me, but it&#8217;s their money and time to spend.</p>
<p>I think that covers the week&#8217;s big stories&#8211;or at least those that matter most to Noisy Channel readers. Somehow I didn&#8217;t manage to come up with an IR / HCIR angle on the Michael Jackson story, or perhaps it&#8217;s just that <a href="http://searchengineland.com/google-thinks-michael-jackson-died-at-age-65-in-2007-21659">Danny Sullivan beat me to it</a>.</p>
<p>Anyway, I&#8217;m back in the saddle, and should soon be back to my normal posting volume. Thank you all for being patient.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/07/10/catching-up-on-last-weeks-news/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Looking for an IR / Data Mining Job?</title>
		<link>http://thenoisychannel.com/2009/06/30/looking-for-a-ir-data-mining-job/</link>
		<comments>http://thenoisychannel.com/2009/06/30/looking-for-a-ir-data-mining-job/#comments</comments>
		<pubDate>Wed, 01 Jul 2009 03:50:36 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2298</guid>
		<description><![CDATA[No, I&#8217;m not recruiting for my team&#8211;though I&#8217;m always open to research collaborations. But I wanted to call readers&#8217; attention to at least two places that are hiring folks with expertise in information retrieval. The first is Panjiva, a startup that I&#8217;m advising. You can read more about them here. They are looking for a [...]]]></description>
			<content:encoded><![CDATA[<p>No, I&#8217;m not recruiting for my team&#8211;though I&#8217;m always open to research collaborations. But I wanted to call readers&#8217; attention to at least two places that are hiring folks with expertise in information retrieval.</p>
<p>The first is <a href="http://panjiva.com/">Panjiva</a>, a startup that I&#8217;m advising. You can read more about them <a href="http://blog.panjiva.com/">here</a>. They are looking for a hands-on developer (yes, someone who can code) with background in information retrieval or data mining. The job is in Cambridge, MA, and they want someone local. Check out their <a href="http://panjiva.com/jobs">jobs page</a>.</p>
<p>The second is Twitter. Yes, you&#8217;ve heard of them. What you might not know is that they&#8217;re aggressively hiring in their search group. Apparently the company is growing&#8211;I&#8217;d thought they were at ~30 people, but I just did a reference call for someone and learned that they&#8217;ve doubled in the past few months. I have no stake in Twitter except as a user, but I&#8217;d love to see them improve their search capabilities. So, if you&#8217;re in or near San Francisco and looking for a search job on the bleeding edge, <a href="http://static.twitter.com/jobvite_frame.html?c=q8X9VfwT&amp;jvi=oyPbVfwd,Job">check it out</a>.</p>
<p>Other folks who are trying to hire people with search / information retrieval background: I encourage you to post opportunities in the comments!</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/06/30/looking-for-a-ir-data-mining-job/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Reports from HCOMP 2009</title>
		<link>http://thenoisychannel.com/2009/06/29/reports-from-hcomp-2009/</link>
		<comments>http://thenoisychannel.com/2009/06/29/reports-from-hcomp-2009/#comments</comments>
		<pubDate>Mon, 29 Jun 2009 15:43:00 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2290</guid>
		<description><![CDATA[Check out Panos&#8217;s extensive live blogging from what, as far as I know, is the first Human Computation Workshop (HCOMP  2009). You can also see the associated #hcomp Twitter activity. Evidently Luis von Ahn used his keynote to unveil MonoLingo, a human-powered system for translation, but only using people that know one language (no idea [...]]]></description>
			<content:encoded><![CDATA[<p>Check out Panos&#8217;s extensive <a href="http://behind-the-enemy-lines.blogspot.com/2009/06/liveblogging-from-hcomp-2009.html">live blogging</a> from what, as far as I know, is the first <a href="http://www.hcomp2009.org/">Human Computation Workshop</a> (HCOMP  2009). You can also see the associated <a href="http://search.twitter.com/search?q=%23hcomp">#hcomp</a> Twitter activity.</p>
<p>Evidently <a href="http://www.cs.cmu.edu/~biglou/">Luis von Ahn</a> used his keynote to unveil <a href="http://monolingo.com/">MonoLingo</a>, a human-powered system for translation, but only using people that know one language (no idea if he used the old <a href="http://everything2.com/title/What%2520do%2520you%2520call%2520a%2520person%2520who%2520speaks%2520three%2520languages%253F">joke</a>).</p>
<p>According to Panos:</p>
<blockquote><p>Monolingo relies on the fact that machine translation is pretty good at this point, but not perfect. So MonoLingo starts by by translating each word using a dictionary, giving multiple interpretations for each word. The human then (who is a native speaker of the target language) selects the translation for each word and forms the sentence that makes most sense.</p></blockquote>
<p>I&#8217;m curious to hear more: as of this writing, the site is password-protected with no further information.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/06/29/reports-from-hcomp-2009/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Are Spammers Taking Over Twitter?</title>
		<link>http://thenoisychannel.com/2009/06/27/are-spammers-taking-over-twitter/</link>
		<comments>http://thenoisychannel.com/2009/06/27/are-spammers-taking-over-twitter/#comments</comments>
		<pubDate>Sun, 28 Jun 2009 00:39:46 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2278</guid>
		<description><![CDATA[Until recently, I&#8217;ve noticed the occasional incident where a Twitter &#8220;trending topic&#8221; was socially engineered by a spammer, usually by an application which auto-tweets on sign-up. But the problem seems to be getting noticeably worse. Just a few days ago Habitat, a furniture store, used the trending topics as hashtags&#8211;including one associated with the disputed [...]]]></description>
			<content:encoded><![CDATA[<p>Until recently, I&#8217;ve noticed the occasional incident where a Twitter &#8220;trending topic&#8221; was socially engineered by a spammer, usually by an application which auto-tweets on sign-up. But the problem seems to be getting noticeably worse.</p>
<p>Just a few days ago Habitat, a furniture store, used the trending topics as hashtags&#8211;including one associated with the disputed Iran election&#8211;to pimp their &#8220;totally desirable Spring collection&#8221;. It made for a great case study in <a href="http://www.digitaltip.com.au/index.php/how-not-to-use-twitter-habitatuk-as-a-case-study/">how not to use Twitter</a>.</p>
<p>And today I see that the top two trending topics are <a href="http://search.twitter.com/search?q=%22What+McFLY+Song+Are%22">What McFLY Song Are</a> and <a href="http://search.twitter.com/search?q=%22TweetBoard+Alpha%22">TweetBoard Alpha</a>, both edging out <a href="http://search.twitter.com/search?q=%23iranelection">#iranelection</a>. The first spams through a quiz; the second through a request for invitations. It&#8217;s enough to make you want to scream, <a href="http://www.stoptwitterspam.com/">Stop Twitter Spam</a>!</p>
<p>Of course, the solution may be to ignore the trending topics, which we can now see are easily gamed. Even when they&#8217;re legitimate, the topics aren&#8217;t necessarily all that useful. In the Twitterquake of Michael Jackson&#8217;s death, nine of the top ten trending topics related to the Gloved One&#8211;one of them even <a href="http://www.businessinsider.com/twitter-confirms-that-people-cant-spell-michael-jackson-2009-6">misspelled as Micheal</a>. The tenth related to a <a href="http://www.thehollywoodgossip.com/2009/06/jeff-goldblum-dead-not-so-much/">hoax</a> that Jeff Goldblum had died.</p>
<p>As I&#8217;ve said <a href="http://thenoisychannel.com/2009/06/17/spam-in-the-twitterverse/">before</a>, I actually look forward to a spamageddon that forces us to confront the attention scarcity problem head-on. At this rate, perhaps I won&#8217;t have to wait much longer.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/06/27/are-spammers-taking-over-twitter/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Aardvark Burrows Out Of Beta</title>
		<link>http://thenoisychannel.com/2009/06/27/aardvark-burrows-out-of-beta/</link>
		<comments>http://thenoisychannel.com/2009/06/27/aardvark-burrows-out-of-beta/#comments</comments>
		<pubDate>Sun, 28 Jun 2009 00:16:16 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2275</guid>
		<description><![CDATA[I just received an email from Max Ventilla, CEO of social search startup Aardvark, to let me know that Aardvark is now open to anyone who wants to sign up. Well, anyone with a Facebook account&#8211;but I can&#8217;t imagine that there are many people who are curious to sign up for a service like Aardvark [...]]]></description>
			<content:encoded><![CDATA[<p>I just received an email from Max Ventilla, CEO of social search startup <a href="http://vark.com/">Aardvark</a>, to let me know that Aardvark is now open to anyone who wants to sign up. Well, anyone with a Facebook account&#8211;but I can&#8217;t imagine that there are many people who are curious to sign up for a service like Aardvark but don&#8217;t already have Facebook accounts!</p>
<p>Apparently <a href="http://www.techcrunch.com/2009/06/27/aardvark-open-for-business-via-facebook-connect/">Michael Arrington</a> got the email too. But Aardvark&#8217;s larger PR coup is a feature in the Business section of the Sunday New York Times, entitled &#8220;<a href="http://www.nytimes.com/2009/06/28/business/28digi.html">Now All Your Friends Are in the Answer Business</a>&#8221;.</p>
<p>I&#8217;ve blogged about Aardvark a bit&#8211;see my previous <a href="http://thenoisychannel.com/?s=aardvark+challenge">pair of posts</a> about the blog + Twitter vs. Aardvark challenge. I like the idea of expert-mediated information seeking, though I have at least two concerns with Aardvark.</p>
<p>The first is in how Aardvark routes questions to experts&#8211;I&#8217;ve had mixed results that I attribute to the inherent challenge of inferring a topic through natural language processing. I think Aardvark would do well to offer guidance to users both in volunteering their own areas of expertise and in specifying their query topics.</p>
<p>The second is that questions and answers are private. I&#8217;m a big fan of &#8220;<a href="http://thenoisychannel.com/2008/11/27/when-in-doubt-make-it-public/">when in doubt, make it public</a>&#8221;&#8211;and this is a clear-cut case where public is at least the right default. I&#8217;m curious how often people ask questions that someone else has already answered. Yes, there&#8217;s something to be said about getting an answer from someone in your own social network. But I don&#8217;t see any reason that the correspondence has to be private&#8211;especially for a question that you&#8217;re willing to have routed to a total stranger.</p>
<p>I hope Aardvark addresses both of these concerns, improving its routing and publishing question-answer pairs. As I&#8217;ve mentioned in recent posts, I think social search deserves a lot more attention than real-time search, and it&#8217;s great to see startups like Aardvark and <a href="http://hunch.com/">Hunch</a> working on it.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/06/27/aardvark-burrows-out-of-beta/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Can Real-Time Search Help Hedge Funds?</title>
		<link>http://thenoisychannel.com/2009/06/25/can-real-time-search-help-hedge-funds/</link>
		<comments>http://thenoisychannel.com/2009/06/25/can-real-time-search-help-hedge-funds/#comments</comments>
		<pubDate>Thu, 25 Jun 2009 13:33:05 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2259</guid>
		<description><![CDATA[I haven&#8217;t exactly been generous in my opinions about the widespread obsession with &#8220;real-time&#8221; search. But in today&#8217;s Telegraph there&#8217;s at least a story that makes sense in theory: &#8220;Hedge fund managers betting Twitter will give them an edge in rapid trading&#8221;. In practice, I&#8217;m pretty skeptical, as is Gwen Robinson at the Financial Times [...]]]></description>
			<content:encoded><![CDATA[<p>I haven&#8217;t exactly been generous in <a href="http://thenoisychannel.com/2009/06/18/real-time-but-not-ready-for-prime-time/">my opinions</a> about the widespread obsession with &#8220;real-time&#8221; search. But in today&#8217;s Telegraph there&#8217;s at least a story that makes sense in theory: &#8220;<a href="http://www.telegraph.co.uk/finance/newsbysector/mediatechnologyandtelecoms/digital-media/5614073/Hedge-fund-managers-betting-Twitter-will-give-them-an-edge-in-rapid-trading.html">Hedge fund managers betting Twitter will give them an edge in rapid trading</a>&#8221;.</p>
<p>In practice, I&#8217;m pretty skeptical, as is Gwen Robinson at the Financial Times <a href="http://ftalphaville.ft.com/blog/2009/06/24/58736/twitter-for-those-dirty-alpha-seeking-strategies/">Alphaville</a> blog. She writes:</p>
<blockquote><p>That’s very interesting, because several hedge fund managers we spoke to dismissed the idea variously as “all twatter” and “rubbish” &#8211; not least because Twitter has carved a reputation more for unfounded speculation and even sensational disinformation than for ground-breaking, market-moving alerts for alpha-hungry fund managers.</p></blockquote>
<p>I&#8217;ll concede that time really is money for hedge funds and other traders who need to make decisions before the rest of the market catches up. But I&#8217;m dubious that Twitter&#8211;let alone an automated processing of tweets&#8211;will enable traders to make better decisions. Moreover, any success would immediately be gamed, along the lines of <a href="http://en.wikipedia.org/wiki/Pump_and_dump">pump and dump</a> scams. I suppose that hasn&#8217;t put a damper on the popularity of <a href="http://stocktwits.com/">StockTwits</a>, but popularity does not necessarily translate to profitability for the traders. I hear that <a href="http://www.swoopo.com/">Swoopo</a> (a great example of exploiting <a href="http://en.wikipedia.org/wiki/Behavioral_economics">behavioral economics</a>) is popular too.</p>
<p>If real-time search is to be useful&#8211;and I think it really should be called alerting&#8211;then the information it provides has to have some sort of quality assurance, and not just freshness. There&#8217;s almost certainly a trade-off, since it usually takes time to vet  information for quality, even if the vetting is through <a href="http://en.wikipedia.org/wiki/Crowdsourcing">crowdsourcing</a>. But that reality doesn&#8217;t seem to have sunk in yet for the real-time advocates. I say, give it time.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/06/25/can-real-time-search-help-hedge-funds/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Marti Hearst&#8217;s Book: Now Available Online</title>
		<link>http://thenoisychannel.com/2009/06/25/marti-hearsts-book-now-available-online/</link>
		<comments>http://thenoisychannel.com/2009/06/25/marti-hearsts-book-now-available-online/#comments</comments>
		<pubDate>Thu, 25 Jun 2009 12:52:46 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2256</guid>
		<description><![CDATA[Check out Marti Hearst&#8217;s new book on Search User Interfaces! You can read my review here. Thanks to Christina for the heads up.]]></description>
			<content:encoded><![CDATA[<p>Check out Marti Hearst&#8217;s new book on <a href="http://searchuserinterfaces.com/">Search User Interfaces</a>! You can read my review <a href="http://thenoisychannel.com/2009/06/22/marti-hearsts-book-on-search-user-interfaces/">here</a>. Thanks to <a href="http://twitter.com/cpikas/statuses/2324840744">Christina</a> for the heads up.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/06/25/marti-hearsts-book-now-available-online/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>JCDL 2009 Proceedings now in ACM Digital Library</title>
		<link>http://thenoisychannel.com/2009/06/23/jcdl-2009-proceedings-now-in-acm-digital-library/</link>
		<comments>http://thenoisychannel.com/2009/06/23/jcdl-2009-proceedings-now-in-acm-digital-library/#comments</comments>
		<pubDate>Tue, 23 Jun 2009 20:37:34 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2246</guid>
		<description><![CDATA[Thanks to Gene for letting us know that the JCDL 2009 proceedings are now available to ACM Digital Library subscribers. Hopefully authors will make their papers available to those who don&#8217;t have DL subscriptions. For more information about the conference check out Gene&#8217;s posts at the FXPAL blog and Judie&#8217;s at Curious Judith.]]></description>
			<content:encoded><![CDATA[<p>Thanks to <a href="http://twitter.com/HCIR_GeneG/statuses/2298603780">Gene</a> for letting us know that the <a href="http://www.jcdl2009.org/">JCDL 2009</a> proceedings are now available to <a href="http://portal.acm.org/toc.cfm?id=1555400&amp;idx=SERIES492&amp;type=proceeding&amp;coll=ACM&amp;dl=ACM&amp;part=series&amp;WantType=Journals&amp;title=Proceedings%20of%20the%202009%20joint%20international%20conference%20on%20Digital%20libraries&amp;CFID=42244867&amp;CFTOKEN=14196073">ACM Digital Library</a> subscribers. Hopefully authors will make their papers available to those who don&#8217;t have DL subscriptions. For more information about the conference check out Gene&#8217;s posts at the <a href="http://palblog.fxpal.com/?tag=jcdl2009">FXPAL</a> blog and Judie&#8217;s at <a href="http://www.grey-cat.com/curious/?tag=jcdl2009">Curious Judith</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/06/23/jcdl-2009-proceedings-now-in-acm-digital-library/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Real-Time But Not Ready For Prime Time</title>
		<link>http://thenoisychannel.com/2009/06/18/real-time-but-not-ready-for-prime-time/</link>
		<comments>http://thenoisychannel.com/2009/06/18/real-time-but-not-ready-for-prime-time/#comments</comments>
		<pubDate>Thu, 18 Jun 2009 18:12:59 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2227</guid>
		<description><![CDATA[Extra, extra, read all about it&#8211;two new real-time search engines debuted today: CrowdEye and Collecta. I love the headlines from Techmeme: Mashable!: Collecta: True Real-Time Social Search paidContent: Startup Promises Best Real-Time Search Results Yet Tech Beat: Collecta Launches *Really* Real-Time Search Engine ReadWriteWeb: Collecta: Summize Backer Launches Broader Real-Time Search Search Engine Land: Collecta [...]]]></description>
			<content:encoded><![CDATA[<p>Extra, extra, read all about it&#8211;two new real-time search engines debuted today: <a href="http://crowdeye.com/home.aspx">CrowdEye</a> and <a href="http://collecta.com/">Collecta</a>.</p>
<p>I love the headlines from <a href="http://techmeme.com/">Techmeme</a>:</p>
<ul>
<li><a href="http://mashable.com/">Mashable!</a>: <a href="http://mashable.com/2009/06/18/collecta/">Collecta: True Real-Time Social Search</a></li>
<li><a href="http://paidcontent.org/">paidContent</a>: <a href="http://paidcontent.org/article/419-startup-promises-best-real-time-search-results-yet">Startup Promises Best Real-Time Search Results Yet</a></li>
<li><a href="http://www.businessweek.com/the_thread/techbeat/">Tech Beat</a>: <a href="http://www.businessweek.com/the_thread/techbeat/archives/2009/06/collecta_launch.html">Collecta Launches *Really* Real-Time Search Engine</a></li>
<li><a href="http://www.readwriteweb.com/">ReadWriteWeb</a>: <a href="http://www.readwriteweb.com/archives/collecta_summize_backer_launches_broader_real-time.php">Collecta: Summize Backer Launches Broader Real-Time Search</a></li>
<li><a href="http://searchengineland.com/" target="_self">Search Engine Land</a>: <a href="http://searchengineland.com/collecta-and-crowdeye-join-the-real-time-search-club-21231">Collecta And CrowdEye Join The “Real Time” Search Club</a></li>
<li><a href="http://www.theregister.co.uk/" target="_self">The Register</a>: <a href="http://www.theregister.co.uk/2009/06/18/collecta_launch/">Collecta &#8211; real-time search in real-time</a></li>
<li><a href="http://venturebeat.com/" target="_self">VentureBeat</a>: <a href="http://digital.venturebeat.com/2009/06/18/collecta-says-its-the-fastest-contender-in-real-time-search-race/">Collecta says it&#8217;s the fastest contender in the real-time search race</a></li>
<li><a href="http://www.techcrunch.com/" target="_self">TechCrunch</a>: <a href="http://www.techcrunch.com/2009/06/18/collecta-enters-the-real-time-search-wars/">Collecta Enters The Real Time Search Wars</a></li>
<li><a href="http://www.altsearchengines.com/" target="_self">AltSearchEngines</a>: <a href="http://www.altsearchengines.com/2009/06/18/top-story-collecta-launches-real-time-search-engine/">Top Story: Collecta launches real-time search engine!</a></li>
</ul>
<p>Yes, folks, it&#8217;s really, really, real-time! Of course <a href="http://www.searchenginejournal.com/twitter-search-gradually-becoming-a-real-time-search-engine/11143/">Twitter</a> and <a href="http://blogs.zdnet.com/BTL/?p=19833">Facebook</a> have their own real-time search offerings. And apparently Google, Yahoo, and Microsoft are <a href="http://online.wsj.com/article/BT-CO-20090615-712397.html">looking hard at real-time</a> too.</p>
<p>I concede that there&#8217;s something in this real-time mania. I&#8217;ve live-tweeted events, and I&#8217;ve followed others who were doing so. I certainly read current news and blogs&#8211;as they say, today&#8217;s newspaper wraps tomorrow&#8217;s fish (someone will have to translate the expression for folks who&#8217;ve never read an analog newspaper). But yes, recency / freshness is certainly a concern in information seeking.</p>
<p>But it&#8217;s not the only one, and I doubt it&#8217;s the dominant one. Moreover, the dismissal of web search engines as if their index contents were ancient history is preposterous. Search for <em>iran election</em> on <a href="http://www.google.com/search?q=iran+election">Google</a>, <a href="http://search.yahoo.com/search?p=iran+election">Yahoo</a>, or <a href="http://www.bing.com/search?q=iran+election&amp;form=OSDSRC">Bing</a>, and you see a lot of current news. I suppose <a href="http://search.twitter.com/search?q=iran+election">Twitter</a> offers more recently generated bits, but the main virtue there is not the immediacy&#8211;rather, it&#8217;s the social nature of the content. For example, a number of people are following <a href="http://twitter.com/persiankiwi">@persiankiwi</a> for a personal perspective. I&#8217;ll let you decide for yourselves whether <a href="http://www.collecta.com/#q=iran%20election">Collecta</a> or <a href="http://www.crowdeye.com/viewer.aspx?view=Iran+Election&amp;public=true">CrowdEye</a> offers something new or valuable&#8211;I&#8217;m still waiting for the former to show me anything at all!</p>
<p>I know that the technology press likes new buzzwords, and &#8220;real-time&#8221; search is surely the buzzword du jour, even giving &#8220;semantic&#8221; search a run for its money. And I understand how many in the blogosphere feel it is their moral duty to cheer on any start-up that makes a go at disrupting the current regime. But I wish these folks would evaluate the new entrants on their merits, rather than simply on the drama of the David vs. Goliath story.</p>
<p>I understand what it&#8217;s like on the startup side&#8211;it wasn&#8217;t that long ago that few people outside the Boston-area technology scene had heard of <a href="http://endeca.com/">Endeca</a>. For a long time, I was jealous of people whose companies had generated more buzz. But, in retrospect, I&#8217;m at least glad that my colleagues and I had a chance to build a robust product before the press noticed us. Overenthusiastic press isn&#8217;t necessarily a good thing, as I&#8217;m sure a line-up of prematurely crowned Google killers can attest.</p>
<p>In that spirit, I hope that CrowdEye and Collecta bring something interesting to the market. But I doubt that &#8220;real-time&#8221; search will cut it, especially if it&#8217;s not ready for prime time.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/06/18/real-time-but-not-ready-for-prime-time/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>JCDL 2009</title>
		<link>http://thenoisychannel.com/2009/06/17/jcdl-2009/</link>
		<comments>http://thenoisychannel.com/2009/06/17/jcdl-2009/#comments</comments>
		<pubDate>Wed, 17 Jun 2009 19:15:39 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2220</guid>
		<description><![CDATA[For the benefit of those of us not lucky enough to be attending this year&#8217;s Joint Conference on Digital Libraries (JCDL 2009), a number of attendees are live-tweeting the conference using the hashtag #jcdl2009. I&#8217;m sure there will be blog posts (like these), and I&#8217;ll try to round up what I can when the conference [...]]]></description>
			<content:encoded><![CDATA[<p>For the benefit of those of us not lucky enough to be attending this year&#8217;s Joint Conference on Digital Libraries (<a href="http://www.jcdl2009.org/">JCDL 2009</a>), a number of attendees are live-tweeting the conference using the hashtag <a href="http://search.twitter.com/search?q=%23jcdl2009">#jcdl2009</a>. I&#8217;m sure there will be blog posts (<a href="http://www.grey-cat.com/curious/?tag=jcdl2009">like</a> <a href="http://palblog.fxpal.com/?tag=jcdl2009">these</a>), and I&#8217;ll try to round up what I can when the conference wraps up. I also understand that papers will eventually be available in the <a href="http://portal.acm.org/dl.cfm">ACM Digital Library</a>, and that authors are being encouraged to post their own papers on their web sites&#8211;if / when that happens, I&#8217;ll try to assemble a list here, at least of the ones that particularly catch my attention.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/06/17/jcdl-2009/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Spam in the Twitterverse</title>
		<link>http://thenoisychannel.com/2009/06/17/spam-in-the-twitterverse/</link>
		<comments>http://thenoisychannel.com/2009/06/17/spam-in-the-twitterverse/#comments</comments>
		<pubDate>Wed, 17 Jun 2009 17:27:57 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2214</guid>
		<description><![CDATA[I&#8217;ve noted in the past that &#8220;real-time&#8221; alerting systems, in contrast to search engines that place less emphasis on immediacy, are particularly vulnerable to spamming. It&#8217;s a lot like telemarketing&#8211;you could avoid it entirely if you routed any questionable calls to voicemail, but then you would, at the very least, not be able to be [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve noted in the past that &#8220;real-time&#8221; alerting systems, in contrast to search engines that place less emphasis on immediacy, are particularly <a href="http://thenoisychannel.com/2008/10/13/alerting-push-or-pull/">vulnerable to spamming</a>. It&#8217;s a lot like telemarketing&#8211;you could avoid it entirely if you routed any questionable calls to voicemail, but then you would, at the very least, not be able to be reached in real time.</p>
<p>At first glance, Twitter seems immune from this sort of spamming, since you only see tweets from the users you follow. Yes, <a href="http://twitter.com/barackobama">Barack Obama</a> and <a href="http://twitter.com/guykawasaki">Guy Kawasaki</a> must spend a lot of time on Twitter! But, regardless of how many users you follow, you are the one in control.</p>
<p>At least that&#8217;s the theory. Of course, things tend to work a bit differently in practice. Like many Twitter users, I use Twitter Search to maintain a running <a href="http://search.twitter.com/search?lang=all&amp;q=dtunkelang+OR+tunkelang+OR+noisychannel+OR+%22noisy+channel%22+OR+tunkrank+OR+endeca">vanity query</a> for mentions of my user name, employer, blog, etc. As a result, a user I don&#8217;t follow can nonetheless get my attention by tweeting an &#8220;<a href="http://www.mariasguides.com/2009/03/13/twitter-primer-reply-vs-dm/">at reply</a>&#8221; to me. Twitter has <a href="http://blog.twitter.com/2009/05/small-settings-update.html">struggled</a> to figure out whether that is a good thing or a bad thing, but I suspect that my erring on the side of vanity is a common behavior.</p>
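<p>For the curious, a vanity query like mine is just a handful of terms joined with OR and URL-encoded. Here is a minimal Python sketch that rebuilds the link above. The function name <code>vanity_query_url</code> is my own illustrative choice, and the base URL is simply copied from that link, so treat it as an assumption rather than a documented API.</p>

```python
from urllib.parse import urlencode


def vanity_query_url(terms, base="http://search.twitter.com/search"):
    """Join the terms with OR and URL-encode them into a search link.

    The base URL is copied from the link in this post (an assumption,
    not a documented API).
    """
    query = " OR ".join(terms)
    return base + "?" + urlencode({"lang": "all", "q": query})


# The terms from my own vanity query; the quoted phrase keeps its quotes.
terms = ["dtunkelang", "tunkelang", "noisychannel",
         '"noisy channel"', "tunkrank", "endeca"]
url = vanity_query_url(terms)
print(url)
```

<p>Anyone who runs a query like this sees every tweet that mentions one of the terms, whether or not they follow the sender, which is exactly the opening that an unsolicited at reply exploits.</p>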
<p>But I do recognize that I&#8217;m opening myself up to alert spamming&#8211;perhaps not just in theory, but in practice. Today I read on <a href="http://mediamemo.allthingsd.com/20090617/another-twitter-business-that-doesnt-make-money-for-twitter-pay-per-twitterer/">All Things Digital</a> that:</p>
<blockquote><p>Pontiflex, a lead generation startup that hoovers up names and other info from users that visit its network of publishers, then sells the data to marketers. The Brooklyn-based company is rolling out a Twitter product that lets marketers compile a list of interested Twitter users.<br />
&#8230;<br />
Since the users aren’t actually signing up to “follow” any of the marketers, said marketers can’t send them direct messages. The marketers could try to “at reply” their leads — the equivalent of shouting out the name of someone you think might be at a loud cocktail party, but who you can’t actually see. But that’s about it.</p></blockquote>
<p>That&#8217;s about enough, if enough users are like me. Fortunately, I&#8217;m not enough of a celebrity to be particularly concerned about being singled out&#8211;at this stage. But I think the writing is on the wall, and spammers will innovate to embrace social media. I&#8217;ve already experienced <a href="http://thenoisychannel.com/2008/11/22/the-great-hatsby-and-other-im-bots/">a</a> <a href="http://thenoisychannel.com/2008/11/22/focused-comment-spamming/">few</a> <a href="http://thenoisychannel.com/2009/05/05/got-hate-tweets/">examples</a> of such innovation, and I&#8217;m sure that they are child&#8217;s play compared to what&#8217;s in store.</p>
<p>Personally, I look forward to this spamageddon. Why? Because I think we already have a problem managing attention scarcity in social media, but haven&#8217;t found sufficient motivation to confront the problem head on. A spam epidemic will certainly cause us to revisit our priorities, and I&#8217;m optimistic that we&#8217;ll innovate beyond the existing approaches used for email spam.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/06/17/spam-in-the-twitterverse/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Hunch Has Launched</title>
		<link>http://thenoisychannel.com/2009/06/15/hunch-has-launched/</link>
		<comments>http://thenoisychannel.com/2009/06/15/hunch-has-launched/#comments</comments>
		<pubDate>Mon, 15 Jun 2009 16:12:29 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2209</guid>
		<description><![CDATA[For anyone who has been waiting to try Hunch (which really is a &#8220;decision engine&#8221;) but didn&#8217;t manage to snarf an invite, today is your lucky day: Hunch has launched. They&#8217;ve added some new features too&#8211;for example, they offer a faceted navigation interface that lets you bypass their ordering of the questions in the decision [...]]]></description>
			<content:encoded><![CDATA[<p>For anyone who has been waiting to try <a href="http://hunch.com/">Hunch</a> (which really is a &#8220;<a href="http://blog.searchenginewatch.com/090615-101513">decision engine</a>&#8221;) but didn&#8217;t manage to snarf an invite, today is your lucky day: Hunch has <a href="http://blog.hunch.com/?p=3874">launched</a>. They&#8217;ve added some new features too&#8211;for example, they offer a <a href="http://en.wikipedia.org/wiki/Faceted_search">faceted navigation</a> interface that lets you bypass their ordering of the questions in the decision tree (e.g., for choosing a <a href="http://www.hunch.com/cocktails/all/">cocktail</a>).</p>
<p>But be warned, the site may be a bit sluggish. They&#8217;re certainly getting bombarded with traffic from the <a href="http://techmeme.com/search/query?q=hunch&amp;wm=false">blogosphere</a> and <a href="http://search.twitter.com/search?q=hunch">Twitterverse</a>!</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/06/15/hunch-has-launched/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Don&#8217;t believe everything you read in the New York Post</title>
		<link>http://thenoisychannel.com/2009/06/14/dont-believe-everything-you-read-in-the-new-york-post/</link>
		<comments>http://thenoisychannel.com/2009/06/14/dont-believe-everything-you-read-in-the-new-york-post/#comments</comments>
		<pubDate>Mon, 15 Jun 2009 03:50:45 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2204</guid>
		<description><![CDATA[Now this is the sort of publicity that even $100M can&#8217;t buy: the New York Post is reporting that, in response to Microsoft&#8217;s recent Bing launch, &#8220;FEAR GRIPS GOOGLE&#8221; (all caps in the original): Sergey Brin is so rattled by the launch of Microsoft&#8217;s rival search engine that he has assembled a team of top [...]]]></description>
			<content:encoded><![CDATA[<p>Now this is the sort of publicity that even <a href="http://www.nypost.com/seven/06042009/business/microsoft_bing_blitz_a_100m_long_shot_172437.htm">$100M</a> can&#8217;t buy: the New York Post is reporting that, in response to Microsoft&#8217;s recent <a href="http://www.bing.com/">Bing</a> launch, &#8220;<a href="http://www.nypost.com/seven/06142009/business/fear_grips_google_174235.htm">FEAR GRIPS GOOGLE</a>&#8221; (all caps in the original):</p>
<blockquote><p>Sergey Brin is so rattled by the launch of Microsoft&#8217;s rival search engine that he has assembled a team of top engineers to work on urgent upgrades to his Web service.</p></blockquote>
<p>I never imagined that anyone would get their technology news from the New York Post, but evidently it&#8217;s well read in the blogosphere. <a href="http://www.techmeme.com/090614/p14#a090614p14">Techmeme</a> reports the following articles as citing the New York Post article:</p>
<ul>
<li>Larry Dignan / <a href="http://blogs.zdnet.com/BTL">Between the Lines</a>: <a href="http://blogs.zdnet.com/BTL/?p=19715">Google missed a marketing turn with the ‘decision engine’ thing</a></li>
<li>Steven Musil / <a href="http://news.cnet.com/">CNET News</a>: <a href="http://news.cnet.com/8301-10805_3-10264417-75.html">Does Microsoft&#8217;s Bing have Google running scared?</a></li>
<li>Bob Caswell: <a href="http://bobcaswell.com/2009/06/14/its-official-i-now-use-bing-instead-of-google/">It&#8217;s official: I now use Bing instead of Google</a></li>
<li>Amit Chowdhry / <a href="http://pulse2.com/">Pulse2</a>: <a href="http://pulse2.com/2009/06/14/bing-increases-microsofts-market-share-upsets-googles-sergey-brin/">Bing Increases Microsoft&#8217;s Market Share; Upsets Google&#8217;s Sergey Brin</a></li>
<li>Ben Parr / <a href="http://mashable.com/">Mashable!</a>: <a href="http://mashable.com/2009/06/14/bing-google-sergey/">Google vs. Bing Battle Heating Up: Is Google Scared?</a></li>
<li>Tony Hung / <a href="http://www.deepjiveinterests.com/">Deep Jive Interests</a>: <a href="http://www.deepjiveinterests.com/2009/06/14/how-many-geeks-are-even-trying-bing/">How Many Geeks Are Even Trying Bing?</a></li>
</ul>
<p>I know that the press loves a good fight, and in technology it&#8217;s hard to ask for a better pairing than Google and Microsoft. Moreover, I do think that Google should be paying attention to Microsoft&#8217;s <a href="http://www.decisionengine.com/">positioning</a> of Bing, regardless of how well Microsoft has delivered on that positioning. In any case, it makes sense for Google to keep close tabs on its competitors. After all, even a fraction of a percent of web search market share translates into millions&#8211;more than enough revenue to justify a few full-time employees.</p>
<p>Still, to assert that Google is gripped with fear stretches credibility, even for a tabloid. I don&#8217;t mean to suggest that Google is so self-confident as to be fearless. Google may well have reacted with fear when it looked like Microsoft would acquire Yahoo&#8211;in fact, some have <a href="http://www.nytimes.com/2008/02/04/technology/04yahoo.html">suggested</a> that Google&#8217;s <a href="http://www.google.com/yahoogooglefacts/">proposed</a> (but ultimately <a href="http://googleblog.blogspot.com/2008/11/ending-our-agreement-with-yahoo.html">abandoned</a>) advertising deal with Yahoo was a <a href="http://en.wikipedia.org/wiki/Niccol%C3%B2_Machiavelli">Machiavellian</a> maneuver to scuttle the acquisition.</p>
<p>But, unless I&#8217;m missing something, Bing simply isn&#8217;t a threat to Google&#8217;s market dominance. If anyone should be concerned, it&#8217;s folks like <a href="http://kayak.com/">Kayak</a> who might lose some market share to Bing&#8217;s <a href="http://www.bing.com/travel/">travel search</a>&#8211;which seems to be generally acknowledged as Bing&#8217;s strongest vertical.</p>
<p>Personally, after being <a href="http://thenoisychannel.com/2009/06/01/banging-on-bing-a-bummer/">underwhelmed</a> by Bing, I decided to <a href="http://twitter.com/dtunkelang/status/2017503974">try it for 2 weeks</a>. I made it for about a week and a half, and you can see some of my commentary on <a href="http://search.twitter.com/search?q=from%3Adtunkelang+bing">Twitter</a>. I stand by my initial impression: it&#8217;s not bad, but it&#8217;s noticeably inferior to Google, and even <a href="http://thenoisychannel.com/2008/08/05/is-google-good-enough/">parity is not enough</a> to reverse the tide. Perhaps the tiny gain&#8211;or the slowdown in loss&#8211;that they will make in market share will justify their investment. But this is no revolution, and the Gevil Empire is not running scared.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/06/14/dont-believe-everything-you-read-in-the-new-york-post/feed/</wfw:commentRss>
		<slash:comments>36</slash:comments>
		</item>
		<item>
		<title>Google Wave or just a Blip?</title>
		<link>http://thenoisychannel.com/2009/06/11/google-wave-or-just-a-blip/</link>
		<comments>http://thenoisychannel.com/2009/06/11/google-wave-or-just-a-blip/#comments</comments>
		<pubDate>Thu, 11 Jun 2009 20:14:13 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2197</guid>
		<description><![CDATA[Yesterday, I was fortunate to attend a presentation from a Google Engineering Director about Google Wave, an online communication and collaboration tool that Google recently unveiled at the Google I/O developer conference. For those who, like me, were unable to attend I/O, Google has posted the entire 80-minute presentation on YouTube (embedded above). For those [...]]]></description>
			<content:encoded><![CDATA[<p><object width="286" height="174" data="http://www.youtube.com/v/v_UyVmITiYQ&amp;hl=en&amp;fs=1&amp;" type="application/x-shockwave-flash"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/v_UyVmITiYQ&amp;hl=en&amp;fs=1&amp;" /><param name="allowfullscreen" value="true" /></object></p>
<p>Yesterday, I was fortunate to attend a presentation from a Google Engineering Director about <a href="http://wave.google.com/">Google Wave</a>, an online communication and collaboration tool that Google recently unveiled at the <a href="http://code.google.com/events/io/">Google I/O</a> developer conference. For those who, like me, were unable to attend I/O, Google has posted the entire 80-minute presentation on YouTube (embedded above). For those of you without 80 minutes to spare, Gina Trapani has assembled a <a href="http://smarterware.org/1955/the-google-wave-highlight-reel">highlight reel</a>.</p>
<p>The pitch is that email, the most popular technology for online communication, is 40 years old and needs an overhaul to reflect the opportunities of an always-on world. They also emphasize that everything they&#8217;ve done works inside the browser.</p>
<p>The video is sexy, showing off both the real-time updating capabilities of Wave (blurring the lines between email and instant messaging) and the ability to support structure more cleanly than email (e.g., responding to only part of an email). The conversation model is also nice: for example, participants can bring someone new into a conversation, and that new person can access the evolution of a conversation (a sort of retroactive cc). Indeed, Wave looks more like <a href="http://www.basecamphq.com/">Basecamp</a> than like email.</p>
<p>Google is pitching Wave to developers&#8211;they even stole a page from Oprah and gave every Google I/O attendee a new Android phone in order to develop applications using their early-access Wave accounts. I haven&#8217;t studied the <a href="http://code.google.com/apis/wave/">APIs</a>, but the object model seems reasonable, ranging from a &#8220;blip&#8221; (a low-level event associated with content, possibly as fine-grained as someone typing a single character) to &#8220;wavelets&#8221; (the sub-conversations that comprise a wave) to, of course, the wave itself. And, given that the team is led by the folks who developed <a href="http://maps.google.com/">Google Maps</a>, I have no doubt that they understand how to play well with mash-ups.</p>
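<p>To make that hierarchy concrete, here is a rough Python sketch of how I picture it. Since I have not studied the APIs, every class, field, and method name below is hypothetical and mine, not Google's. The <code>add_participant</code> method is only meant to illustrate the retroactive cc idea: a newcomer receives the history that preceded them.</p>

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Blip:
    """A low-level content event, possibly as small as one typed character."""
    author: str
    content: str


@dataclass
class Wavelet:
    """A sub-conversation within a wave (hypothetical model, not the Wave API)."""
    participants: List[str] = field(default_factory=list)
    blips: List[Blip] = field(default_factory=list)

    def add_participant(self, name):
        """Add someone mid-conversation and hand back the history so far."""
        if name not in self.participants:
            self.participants.append(name)
        return list(self.blips)


@dataclass
class Wave:
    """The whole conversation: a collection of wavelets."""
    wavelets: List[Wavelet] = field(default_factory=list)


# Two people talk, then a third is brought in and can read what came before.
wavelet = Wavelet(participants=["alice", "bob"])
wavelet.blips.append(Blip("alice", "Lunch on Friday?"))
wavelet.blips.append(Blip("bob", "Sure, where?"))
wave = Wave(wavelets=[wavelet])

history = wavelet.add_participant("carol")
print(len(history), wavelet.participants)
```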
<p>But I&#8217;m left with two big questions.</p>
<p>The first is what it would feel like to access this rich structured history of conversation. The search interface feels a lot like Gmail&#8217;s&#8211;and I don&#8217;t mean that as a compliment. I use Gmail, but I curse every time I have to deal with managing search results that include large conversational threads. I think there will be a lot of challenges for managing search results, and I&#8217;m curious how Google, with its historically spartan approach to search interfaces, will address them.</p>
<p>The second is about interoperability. For all of the openness, I get the sense that everything can be brought into Wave and Waves can be embedded anywhere. That feels about as open as Facebook. What I&#8217;m missing is a sense of how (or even if!) Google Wave will interoperate with other communication platforms. They do show an example of building a Twitter client within Wave&#8211;perhaps that is representative of their interoperability strategy.</p>
<p>The Google Wave demo is impressive, and I have no doubt that developers will play with it and build cool demos of their own. But I believe the ultimate success of Google Wave will depend on how they address the above two questions. Time will tell.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/06/11/google-wave-or-just-a-blip/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Attending Endeca Discover</title>
		<link>http://thenoisychannel.com/2009/06/09/attending-endeca-discover/</link>
		<comments>http://thenoisychannel.com/2009/06/09/attending-endeca-discover/#comments</comments>
		<pubDate>Tue, 09 Jun 2009 15:37:24 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
		
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2191</guid>
		<description><![CDATA[Apologies for the unusual hiatus in posting&#8211;I&#8217;ve been attending Endeca Discover (an annual user conference) and haven&#8217;t managed to allocate time for blogging. I&#8217;ll make up for it by blogging about the conference tomorrow, when I&#8217;m back to what passes for normality. In the meantime, feel free to follow the conference on Twitter.]]></description>
			<content:encoded><![CDATA[<p>Apologies for the unusual hiatus in posting&#8211;I&#8217;ve been attending <a href="http://endeca.com/discover">Endeca Discover</a> (an annual user conference) and haven&#8217;t managed to allocate time for blogging. I&#8217;ll make up for it by blogging about the conference tomorrow, when I&#8217;m back to what passes for normality. In the meantime, feel free to follow the conference on <a href="http://search.twitter.com/search?q=%23discov+OR+endeca">Twitter</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/06/09/attending-endeca-discover/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Those Who Give Twitter *Get* Twitter</title>
		<link>http://thenoisychannel.com/2009/06/06/those-who-give-twitter-get-twitter/</link>
		<comments>http://thenoisychannel.com/2009/06/06/those-who-give-twitter-get-twitter/#comments</comments>
		<pubDate>Sat, 06 Jun 2009 18:50:23 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2187</guid>
		<description><![CDATA[Marshall Kirkpatrick at ReadWriteWeb wrote a post arguing that the people working at Twitter aren&#8217;t using the service the way its power users do, and that this bodes ill for Twitter. His main arguments: Twitter&#8217;s employees don&#8217;t twitter very much: an average of 2 to 3 tweets per person per day. Twitter employees don&#8217;t follow [...]]]></description>
			<content:encoded><![CDATA[<p>Marshall Kirkpatrick at ReadWriteWeb wrote a <a href="http://www.readwriteweb.com/archives/twitters_staff_may_not_use_twitter_like_you_do_tha.php">post</a> arguing that the people working at Twitter aren&#8217;t using the service the way its power users do, and that this bodes ill for Twitter. His main arguments:</p>
<ul>
<li>Twitter&#8217;s employees don&#8217;t twitter very much: an average of 2 to 3 tweets per person per day.</li>
<li>Twitter employees don&#8217;t follow very many other people: only 2 out of 49 Twitter team members follow more than 500 people and no one was over 1k.</li>
<li>Twitter staff members aren&#8217;t following top Twitter developers in the community.</li>
</ul>
<p>I can&#8217;t really address the third point, but the first two&#8211;and especially the second&#8211;are hardly helpful to Kirkpatrick&#8217;s case. To the contrary, they argue that the people who work at Twitter get it. And, to make sure Kirkpatrick got it, Twitter CEO <a href="http://twitter.com/ev">Ev Williams</a> even wrote him a letter, in which he said:</p>
<blockquote><p>Many people fall into the trap that you should follow all or most people back out of a sense of politeness or so-called engagement with the community&#8230; At a certain point, you&#8217;re not actually reading any more tweets by following more people &#8212; you&#8217;re just dipping into the stream somewhat randomly and missing a whole lot of what people say. That&#8217;s fine, but I believe people will generally get more value out of Twitter by dropping the symmetrical relationship expectation and simply curating their following list based on the information and people they want to tune in to.</p></blockquote>
<p>Amen! I&#8217;ve been hammering this point here in most of my posts about Twitter, but here is a handful of examples for newer readers:</p>
<ul>
<li><a title="October 10, 2008" rel="bookmark" href="../../2008/10/10/twitters-twist-on-the-attention-economy/">Twitter’s Twist on the Attention Economy</a></li>
<li><a title="January 2, 2009" rel="bookmark" href="../../2009/01/02/an-attention-ponzi-scheme/">An Attention Ponzi Scheme?</a></li>
<li><a title="December 25, 2008" rel="bookmark" href="../../2008/12/25/putting-the-social-back-in-social-networks/">Putting the Social back in Social Networks</a></li>
<li><a title="January 7, 2009" rel="bookmark" href="../../2009/01/07/the-real-twitter/">The Real Twitter</a></li>
<li><a title="April 6, 2009" rel="bookmark" href="../../2009/04/06/guy-kawasaki-ill-say-it/">Guy Kawasaki, I’ll Say It</a></li>
</ul>
<p>And of course the whole point of <a href="http://tunkrank.com/">TunkRank</a> is to discourage the vicious circle of reciprocity and fake following. That&#8217;s baked into the <a href="http://thenoisychannel.com/2009/01/13/a-twitter-analog-to-pagerank/">measure</a> which, like <a href="http://en.wikipedia.org/wiki/PageRank">PageRank</a>, divides the voting power by the number of out-links.</p>
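<p>Here is a toy Python sketch of just that vote-dividing idea. It is a deliberately simplified, single-pass illustration, not the full measure described in the linked post, and the function name and the sample graph are hypothetical. Each user has one unit of attention, split equally among the accounts they follow.</p>

```python
from collections import defaultdict


def attention_scores(following):
    """Split each user's single unit of attention among the accounts they follow.

    `following` maps a user name to the set of accounts that user follows.
    This is a one-pass simplification for illustration only.
    """
    scores = defaultdict(float)
    for follower, followed in following.items():
        if not followed:
            continue  # a user who follows no one casts no votes
        share = 1.0 / len(followed)
        for account in followed:
            scores[account] += share
    return dict(scores)


# Hypothetical toy graph: one selective follower, one who follows everybody.
following = {
    "selective": {"alice"},
    "follow_back": {"alice", "bob", "carol", "dave"},
}
scores = attention_scores(following)
print(scores["alice"])  # 1.0 from "selective" plus 0.25 from "follow_back"
print(scores["bob"])    # 0.25 from "follow_back" only
```

<p>The follower who follows four accounts contributes only a quarter of a vote to each, so piling up reciprocal follows dilutes the attention you have to give rather than adding to anyone's standing.</p>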
<p>The comments on Kirkpatrick&#8217;s post suggest that a lot of regular Twitter users also get it. I find that reassuring, especially given the hype around Twitter in the last several weeks. Twitter can be a useful tool, but it will help if people don&#8217;t undermine it by imposing cultural norms that devalue the social network. I&#8217;m glad the folks who have given us Twitter realize that.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/06/06/those-who-give-twitter-get-twitter/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Google Squared: A Great First Step</title>
		<link>http://thenoisychannel.com/2009/06/04/google-squared-a-great-first-step/</link>
		<comments>http://thenoisychannel.com/2009/06/04/google-squared-a-great-first-step/#comments</comments>
		<pubDate>Thu, 04 Jun 2009 04:08:06 +0000</pubDate>
		<dc:creator>Daniel Tunkelang</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://thenoisychannel.com/?p=2175</guid>
		<description><![CDATA[Regular readers know that I am not a Google fan boy, and that much of my commentary on Google focuses on their neglect of exploratory search. Nonetheless, when I saw the initial Youtubeware describing Google Squared a few weeks ago, my ears perked up. I decided to wait until it went live to assess it. [...]]]></description>
			<content:encoded><![CDATA[<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="474" height="290" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/__INtIXNLmI&amp;hl=en&amp;fs=1&amp;" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="474" height="290" src="http://www.youtube.com/v/__INtIXNLmI&amp;hl=en&amp;fs=1&amp;" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>Regular readers know that I am not a Google fan boy, and that much of my commentary on Google focuses on their neglect of <a href="http://en.wikipedia.org/wiki/Exploratory_search">exploratory search</a>. Nonetheless, when I saw the initial <a href="http://www.techcrunch.com/2009/05/12/what-is-google-squared-it-is-how-google-will-crush-wolfram-alpha-exclusive-video/">Youtubeware</a> describing <a href="http://www.google.com/squared">Google Squared</a> a few weeks ago, my ears perked up. I decided to wait until it went live to assess it. Well, it&#8217;s <a href="http://googleblog.blogspot.com/2009/06/square-your-search-results-with-google.html">live now</a>.</p>
<p>The idea of Google Squared is simple: it &#8220;collects facts from the web and presents them in an organized collection, similar to a spreadsheet.&#8221; The best way to understand it is to try it. For example, search for <a href="http://www.google.com/squared/search?q=hybrid+car">hybrid car</a>, and you&#8217;ll see a table of hybrids, with columns corresponding to image, description, type of transmission, year, and height. Add a price column if you&#8217;d like, and it will populate it for you. Very slick.</p>
<p>Of course, it is, as Google admits, &#8220;by no means perfect&#8221;. Most queries will show its warts, and some, like <a href="http://www.google.com/squared/search?q=information+scientists">information scientists</a>, are way off (it doesn&#8217;t even try to return results for <a href="http://www.google.com/squared/search?q=library+scientists">library scientists</a>). But it does pretty well when there is structured data out there, and it makes an admirable attempt to find it! I suspect the real trick here is that it does a decent job of identifying instances of the query category (perhaps a souped-up version of work they started discussing back in <a href="http://portal.acm.org/citation.cfm?id=1031171.1031194">2004</a>), and then mining structured content about those instances from repositories like <a href="http://www.freebase.com/">Freebase</a>.</p>
<p>I mean, look at these results:</p>
<ul>
<li><a href="http://www.google.com/squared/search?q=b-movies">b-movies</a></li>
<li><a href="http://www.google.com/squared/search?q=renaissance+choral+composers">renaissance choral composers</a></li>
<li><a href="http://www.google.com/squared/search?q=heroes+characters">heroes characters</a></li>
<li><a href="http://www.google.com/squared/search?q=sf+books">sf books</a></li>
</ul>
<p>To be clear, I picked these examples after a fair amount of trial and error&#8211;like <a href="http://wolframalpha.com/">Wolfram Alpha</a>, it is hit and miss, with more miss than hit. But, as <a href="http://sethgrimes.com/">Seth Grimes</a> said at the recent <a href="http://www.textanalyticsnews.com/usa/">Text Analytics Summit</a>, when Wolfram Alpha is good, it&#8217;s very very good, but when it&#8217;s bad, it&#8217;s <a href="http://www.cs.rice.edu/~ssiyer/minstrels/poems/835.html">horrid</a>. Google Squared doesn&#8217;t fail quite so spectacularly, and it gives you a lot more of a chance to interact with it.</p>
<p>This is, by far, the best step I&#8217;ve seen Google take towards <a href="http://en.wikipedia.org/wiki/Human_Computer_Information_Retrieval">HCIR</a>, and I&#8217;m impressed. It&#8217;s still a toy at this stage, but I think it has a future. My warmest congratulations to <a href="http://twitter.com/dulitz">Daniel Dulitz</a> and the rest of the <a href="http://www.google.com/squared/search?q=magpie+team">magpie team</a> that developed it; I&#8217;m looking forward to seeing it evolve.</p>
]]></content:encoded>
			<wfw:commentRss>http://thenoisychannel.com/2009/06/04/google-squared-a-great-first-step/feed/</wfw:commentRss>
		<slash:comments>23</slash:comments>
		</item>
	</channel>
</rss>

