# Rich Skrenta
Richard Skrenta (born 1967) is an American computer programmer and entrepreneur best known for authoring Elk Cloner, one of the first known self-replicating computer viruses to spread "in the wild", which he created in 1982 as a 15-year-old high school student in Pittsburgh, Pennsylvania.[1][2] Written for the Apple II, Elk Cloner spread via floppy disks and displayed a humorous poem on every 50th boot. It caused no system damage and is regarded as an early milestone in malware history.[2][3]
Skrenta earned a B.A. in computer science from Northwestern University in 1989 and began his professional career at Commodore Business Machines, followed by roles at Unix System Laboratories and Sun Microsystems, where he contributed to engineering efforts in operating systems and network technologies.[4][5] In the 1990s, he co-founded NewHoo, a community-driven web directory that was acquired by Netscape in 1998 and relaunched as the Open Directory Project (ODP), which became the largest human-edited web directory and powered search features for major platforms like Google and Yahoo.[5]
From the late 1990s through 2022, Skrenta held engineering leadership positions at AOL (overseeing Netscape Search and other products), IBM Watson, and Meta, focusing on web-scale data crawling, machine learning, and large-scale systems.[4] He also founded several startups: Topix (a news aggregation and community platform, where he served as CEO), Blekko (a search engine founded in 2007 that emphasized transparency and slashtags), and Tobiko (a restaurant recommendation service).[5][4] Since 2022, Skrenta has served as Executive Director of the Common Crawl Foundation, overseeing open-access web archives used for AI training and research.[4] He has also authored open-source projects, including early Usenet tools such as TASS (a precursor to the tin newsreader), and holds a patent in network security.[1][5]
## Early Life and Education
### Childhood and Early Computing Interests
Richard Skrenta was born in 1967 in Pittsburgh, Pennsylvania. He grew up in the suburb of Mt. Lebanon, a community known for its strong educational resources and middle-class family environment.[6][1]
During the late 1970s and early 1980s, Skrenta developed an early fascination with computers amid the rise of personal computing. He gained access to Apple II systems through his school and engaged in personal experimentation, exploring the capabilities of these machines that were becoming popular in educational settings.[7][6]
At Mt. Lebanon High School, Skrenta joined the computer club, where he began writing simple programs and playful pranks on floppy disks shared among peers. This involvement fostered his technical skills and interest in software manipulation within a collaborative group of young enthusiasts.[6][8] These early activities laid the groundwork for more ambitious programming projects in high school.
### Creation of Elk Cloner
At age 15, Rich Skrenta developed Elk Cloner on January 30, 1982, during a winter break from high school. He intended it as a practical joke aimed at friends' Apple II computers: they had stopped sharing floppy disks with him because of his habit of inserting taunting messages into games.[9][10] Written in assembly language and spanning about 400 lines of code, the program disguised itself as a boot loader to evade detection in an era without antivirus software.[9][3]
Elk Cloner functioned as a boot sector virus for Apple DOS 3.3. It stored its code in otherwise unused sectors of the floppy disk (such as the last 12 sectors of track 2) and copied itself onto any uninfected disk inserted into the system by hijacking DOS commands such as LOAD, BLOAD, and CATALOG; a signature byte written to each infected disk kept it from reinfecting the same disk.[3][11] Upon booting from an infected disk, it copied itself into memory and modified the operating system to ensure persistence. Its payload was benign: it inverted the screen text on the 15th boot and displayed a humorous poem on every 50th boot, without causing data loss or hardware damage.[3][11] The poem read:
```
Elk Cloner: The program with a personality
It will get on all your disks
It will infiltrate your chips
Yes it's Cloner!
It will stick to you like glue
It will modify RAM too
Send in the Cloner!
```
[](https://www.techtarget.com/searchsecurity/definition/Elk-Cloner)[](https://www.edn.com/1st-computer-virus-is-written-january-30-1982/)[](https://www.versus.com/en/news/elk-cloner-the-first-computer-virus)
Elk Cloner was one of the earliest self-replicating programs to spread "in the wild" on personal computers, predating both the formal coining of the term "[computer virus](/page/Computer_virus)" in 1983 and widespread [malware](/page/Malware) awareness. It highlighted vulnerabilities in [floppy disk](/page/Floppy_disk) sharing but stemmed from Skrenta's playful experimentation rather than malicious intent. It circulated primarily within his [Mount Lebanon](/page/Mount_Lebanon) High School circle in [Pittsburgh](/page/Pittsburgh) before fading as users became aware of the prank.[](https://www.techtarget.com/searchsecurity/definition/Elk-Cloner)[](https://www.digit.fyi/elk-cloner/)[](https://www.versus.com/en/news/elk-cloner-the-first-computer-virus)
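The marker check and boot-count triggers described in this section can be modelled as a small state machine. The sketch below is only a toy simulation in Python, not the original assembly code: it touches no real disks, and the names `Disk`, `copy_marker`, and `boot` are hypothetical, introduced purely for illustration. The thresholds 15 and 50 come from the description above; how the original resolved the two triggers internally is an assumption here.

```python
from dataclasses import dataclass

POEM_EVERY = 50   # per the text: poem shown on every 50th boot
INVERT_ON = 15    # per the text: screen text inverted on the 15th boot


@dataclass
class Disk:
    """Toy stand-in for a floppy: a marker flag plus a boot counter."""
    has_marker: bool = False   # models the "signature byte"
    boot_count: int = 0


def copy_marker(source: Disk, target: Disk) -> bool:
    """Mark the target once; return True only if this call changed it."""
    if not source.has_marker or target.has_marker:
        return False           # nothing to copy, or target already marked
    target.has_marker = True
    return True


def boot(disk: Disk) -> str:
    """Advance the boot counter and report which payload would trigger."""
    if not disk.has_marker:
        return "normal boot"
    disk.boot_count += 1
    if disk.boot_count % POEM_EVERY == 0:
        return "show poem"
    if disk.boot_count == INVERT_ON:
        return "invert text"
    return "normal boot"
```

Calling `copy_marker` twice on the same target changes it only the first time, which is the role the signature byte plays in the account above.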
### College Years at Northwestern
Rich Skrenta attended [Northwestern University](/page/Northwestern_University), earning a [Bachelor of Arts](/page/Bachelor_of_Arts) in [Computer Science](/page/Computer_science) in 1989.[](https://www.linkedin.com/in/skrenta)[](https://commoncrawl.org/team/rich-skrenta-director)[](https://www.pubcon.com/bios/rich_skrenta.htm) Building on his high school programming experiments, such as the creation of [Elk Cloner](/page/Elk_Cloner), Skrenta's studies provided structured training in [software development](/page/Software_development) and [computing](/page/Computing) fundamentals during a period when personal computing was rapidly evolving.[](https://www.theregister.com/2012/12/14/first_virus_elk_cloner_creator_interviewed/?page=2)
During his time at Northwestern, Skrenta had access to university mainframes, which offered significantly more memory and processing power than the [Apple II](/page/Apple_II) systems he had used earlier, enabling more complex programming tasks amid the rise of [MS-DOS](/page/MS-DOS) machines.[](https://www.theregister.com/2012/12/14/first_virus_elk_cloner_creator_interviewed/?page=2) The [computer science](/page/Computer_science) curriculum at the time emphasized core areas like programming languages, operating systems, and [systems design](/page/Systems_design), aligning with emerging technologies such as Unix, which were becoming integral to academic and professional computing environments. While specific student projects or internships from this period are not widely documented, Skrenta's academic experience honed his skills in [software engineering](/page/Software_engineering), preparing him for contributions to open-source initiatives later in his career.[](https://commoncrawl.org/team/rich-skrenta-director)
Upon earning his degree in 1989, Skrenta transitioned directly into the tech industry, leveraging his educational background to enter professional roles in software development.[](https://www.pubcon.com/bios/rich_skrenta.htm)[](https://www.fastcompany.com/user/rich-skrenta) This marked the bridge from his student-era explorations to a trajectory in Silicon Valley entrepreneurship and technology leadership.
## Career
### Early Professional Roles
After graduating from Northwestern University with a degree in computer science, Rich Skrenta began his professional career at Commodore Business Machines, where he worked from 1989 to 1991 in the Amiga UNIX Group.[](https://www.fastcompany.com/user/rich-skrenta) There, he contributed to the development of Amiga UNIX, a port of UNIX System V Release 4 to the Amiga hardware platform, including reviewing documentation such as the *Learning Amiga UNIX* manual.[](http://www.vintagebytes.de/downloads/LearningAmigaUnix.pdf) His efforts helped adapt UNIX features like the X Window System for Amiga systems, supporting multi-user and multitasking capabilities on the platform.[](https://www.pubcon.com/bios/rich_skrenta.htm)
In 1991, Skrenta joined Unix System Laboratories (USL), then a majority-owned subsidiary of [AT&T](/page/AT&T) responsible for UNIX System V development, where he remained until 1995.[](https://www.fastcompany.com/user/rich-skrenta) At USL, he took on development roles contributing to operating system enhancements, building on the System V lineage to improve portability, networking, and system utilities during a pivotal era for commercial UNIX adoption.[](https://www.pubcon.com/bios/rich_skrenta.htm) This work involved refining [kernel](/page/Kernel) components and tools that influenced enterprise-grade UNIX implementations.
From 1996 to 1998, Skrenta served as an engineering manager at [Sun Microsystems](/page/Sun_Microsystems), concentrating on [software engineering](/page/Software_engineering) tasks including IP-level [encryption](/page/Encryption) for secure network communications.[](https://www.fastcompany.com/user/rich-skrenta) His role emphasized integrating security features into Sun's [Solaris](/page/Solaris) operating system and related networking software, addressing growing needs for encrypted data transmission in early [internet](/page/Internet) [infrastructure](/page/Infrastructure).[](https://blog.codinghorror.com/shipping-isnt-enough/) Through these positions, Skrenta developed deep expertise in Unix-based systems, operating system [porting](/page/Porting), and foundational enterprise computing practices that shaped his later technical contributions.[](https://www.pubcon.com/bios/rich_skrenta.htm)
### Contributions to Internet Search and Directories
In the late [1990s](/page/1990s), Rich Skrenta co-founded the Open Directory Project (ODP) alongside Bob Truel, Chris Tolles, Bryn Dole, and Jeremy Wenokur. Originally launched as NewHoo in June 1998, it was conceived as a collaborative, open-source alternative to proprietary web directories like Yahoo!'s.[](http://odp.org/about.html)[](https://www.infotoday.com/online/ol2000/sherman7.html) The project aimed to leverage volunteer editors to build a comprehensive, human-curated catalog of websites, addressing frustrations with outdated and incomplete directories by enabling community-driven submissions and categorizations.[](https://www.infotoday.com/online/ol2000/sherman7.html) By October 1998, NewHoo had grown to index over 100,000 sites with thousands of editors, leading to its acquisition by [Netscape](/page/Netscape) Communications, after which it was renamed the Open Directory Project (also known as [DMOZ](/page/DMOZ), after its directory.mozilla.org domain) and integrated into Netscape's ecosystem while preserving its open data model.[](https://www.infotoday.com/online/ol2000/sherman7.html)[](https://www.latimes.com/archives/la-xpm-1999-oct-18-fi-23567-story.html)
Following [AOL](/page/AOL)'s acquisition of [Netscape](/page/Netscape), announced in November 1998 and completed in 1999, Skrenta took on engineering leadership roles within the company, heading development for [Netscape](/page/Netscape) Search, [AOL](/page/AOL) Music, and [AOL](/page/AOL) Shopping through the early 2000s.[](https://www.fastcompany.com/user/rich-skrenta)[](https://www.pubcon.com/bios/rich_skrenta.htm) In these positions, he contributed to enhancements in search functionalities, including refinements to algorithms for better result retrieval and user interfaces designed to improve navigation and personalization for early web users.[](https://www.fastcompany.com/user/rich-skrenta) DMOZ's categorized data became a foundational resource, powering directory services for [Netscape](/page/Netscape) Search and [AOL](/page/AOL) Search, as well as portals like [Alexa Internet](/page/Alexa_Internet), by providing structured, vetted links that supplemented automated crawling.[](https://www.infotoday.com/online/ol2000/sherman7.html)
Skrenta's innovations emphasized human oversight in web organization, introducing peer-reviewed editing processes and automated checks that kept dead-link rates below 0.5%, well below those of competitors.[](https://www.infotoday.com/online/ol2000/sherman7.html) This approach advanced [categorization](/page/Categorization) by allowing multi-level hierarchies and editor expertise to ensure topical [relevance](/page/Relevance), enabling more intuitive browsing in an era before dominant algorithmic search engines.[](https://www.infotoday.com/online/ol2000/sherman7.html) By making ODP's dataset freely available under an open license, Skrenta facilitated its adoption by engines like [Lycos](/page/Lycos), [AltaVista](/page/AltaVista), and later [Google](/page/Google) for enhancing result [relevance](/page/Relevance) through hybrid human-machine methods.[](https://www.infotoday.com/online/ol2000/sherman7.html)
These efforts marked Skrenta's shift from hands-on development to shaping the broader [web](/page/Web) infrastructure, influencing how directories supported scalable [internet](/page/Internet) navigation and search accuracy during the dot-com expansion.[](https://www.infotoday.com/online/ol2000/sherman7.html) By 2000, [DMOZ](/page/DMOZ) had expanded to over 1.5 million sites and 22,000 editors, underscoring its role as a pivotal community resource in the evolving search landscape.[](https://www.infotoday.com/online/ol2000/sherman7.html)
### Founding Topix
In 2004, Rich Skrenta co-founded [Topix](/page/TOPIX) LLC, a [hyperlocal](/page/Hyperlocal) news and discussion platform designed to aggregate and deliver community-specific content from thousands of online sources.[](https://www.latimes.com/archives/la-xpm-2005-mar-23-fi-topix23-story.html) Along with co-founders including Chris Tolles, Bryn Dole, Bob Truel, Tom Markson, and Mike Markson, Skrenta aimed to create a centralized hub for [local news](/page/Local_news), forums, and information, drawing on his prior experience in web directories and search technologies at [Netscape](/page/Netscape) and [AOL](/page/AOL).[](https://www.reuters.com/article/topix-ceo/topix-com-co-founder-chief-executive-departs-idUSN2636819120070626/) The platform initially focused on automating the collection of articles from over 10,000 newspapers and websites, enabling users to access tailored feeds for more than 32,000 U.S. locales.[](https://www.nytimes.com/2005/03/23/business/media/newspaper-giants-buy-web-news-monitor.html)
As the company's CEO and primary technical architect, Skrenta led the development of proprietary algorithms central to Topix's operations, including the NewsRank system for scoring and prioritizing local stories.[](https://www.poynter.org/reporting-editing/2004/topix-net-ups-ante-in-news-aggregation/) This algorithm combined automated analysis—such as frequency of mentions across sources and geographic relevance—with editorial oversight to rank content, ensuring that aggregated newspaper articles surfaced the most pertinent local developments without overwhelming users.[](https://www.poynter.org/reporting-editing/2004/topix-net-ups-ante-in-news-aggregation/) By emphasizing geo-sophisticated indexing, Topix differentiated itself from broader news aggregators, fostering community engagement through integrated discussion boards while maintaining a focus on verifiable, source-driven reporting.[](http://itc.conversationsnetwork.org/shows/detail3312.html)
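The source does not publish NewsRank's formula, so the following Python sketch is only an illustration of the general idea described above: blending mention frequency and geographic relevance, then applying an editorial adjustment. The `Story` fields, the weights, and the `editor_boost` multiplier are all assumptions made for this example and should not be read as Topix's actual method.

```python
from dataclasses import dataclass


@dataclass
class Story:
    title: str
    source_mentions: int        # distinct sources carrying the story
    geo_relevance: float        # 0.0 to 1.0 match to the reader's locale
    editor_boost: float = 1.0   # hypothetical manual multiplier


def score(story: Story, mention_weight: float = 1.0,
          geo_weight: float = 5.0) -> float:
    """Weighted sum of the automated signals, scaled by the editorial boost."""
    automated = (mention_weight * story.source_mentions
                 + geo_weight * story.geo_relevance)
    return automated * story.editor_boost


def rank(stories: list[Story]) -> list[Story]:
    """Return stories ordered from highest to lowest score."""
    return sorted(stories, key=score, reverse=True)
```

With these illustrative weights, a strongly local story carried by one source can outrank a loosely related story carried by three, which matches the emphasis on geographic relevance in the description above.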
The venture's growth accelerated in 2005 when Skrenta negotiated a deal selling a 75% stake to a [consortium](/page/Consortium) of major newspaper publishers—Tribune Company, Gannett Co., and Knight Ridder Inc.—each acquiring 25% ownership, while the founders retained the remaining 25%.[](https://www.nytimes.com/2005/03/23/business/media/newspaper-giants-buy-web-news-monitor.html) This infusion of capital and industry partnerships enabled rapid scaling, including expanded server infrastructure and content partnerships, propelling [Topix](/page/TOPIX) to profitability within months and solidifying its position as a leading [local news](/page/Local_news) destination.[](https://www.latimes.com/archives/la-xpm-2005-mar-23-fi-topix23-story.html) Skrenta continued as CEO, overseeing technical refinements and strategic direction until a leadership transition in 2007, when he stepped down but remained on the board as an advisor.[](https://www.reuters.com/article/topix-ceo/topix-com-co-founder-chief-executive-departs-idUSN2636819120070626/)
### Development and Sale of Blekko
In 2007, Rich Skrenta co-founded Blekko Inc., drawing on his prior experience in search technologies from ventures like [Topix](/page/TOPIX) and the Open Directory Project, to create an alternative [web](/page/Web) [search engine](/page/Search_engine) aimed at addressing perceived shortcomings in dominant players like [Google](/page/Google).[](https://www.crunchbase.com/organization/blekko) The company operated in [stealth mode](/page/Stealth_mode) initially, securing early funding from investors including [Marc Andreessen](/page/Marc_Andreessen), before emerging from beta testing with a public launch on November 1, 2010.[](https://www.nbcnews.com/id/wbna39955277)[](https://techcrunch.com/2008/05/14/stealth-search-engine-blekko-gets-money-from-marc-andreessen-softtech/)
Blekko differentiated itself through its "slashtag" system, which allowed users to append slashes to search queries (e.g., /news or /reviews) to filter results to specific high-quality sites or categories, promoting site-specific searches and user-curated refinements.[](https://www.computerworld.com/article/1542164/q-a-blekko-execs-explain-their-search-engine-strategy.html) This approach emphasized transparency by publicly displaying the filters applied to results and reducing ad clutter, with options like "No Ads" searches to prioritize clean, relevant outputs over monetized promotions.[](https://searchengineland.com/blekko-will-keep-user-data-48-hours-76871) Additionally, Blekko implemented robust spam filtering at the crawling and indexing stages, automatically curating results in spam-prone areas such as health or automotive queries, and incorporated open elements inspired by collaborative projects like Wikipedia to crowdsource result validation and combat content farms.[](https://www.computerworld.com/article/1542164/q-a-blekko-execs-explain-their-search-engine-strategy.html)[](https://lifehacker.com/blekko-de-spams-search-results-with-slashtags-5678321) These features positioned Blekko as a more trustworthy alternative, focusing on human moderation and algorithmic clarity to deliver fewer but higher-quality results.[](https://www.theatlantic.com/technology/2010/11/introducing-blekko-the-wikipedia-of-search-engines/343615/)
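As a rough illustration of how a slashtag query could restrict results to a curated site list, the Python sketch below splits a query into search terms and tags, then keeps only URLs whose host appears under every requested tag. The `SLASHTAGS` table, the hostnames, and both function names are hypothetical; Blekko's real implementation is not described in the source.

```python
from __future__ import annotations

from urllib.parse import urlparse

# Hypothetical slashtag table: tag name -> curated set of hostnames.
SLASHTAGS: dict[str, set[str]] = {
    "news": {"example-news.com", "example-wire.org"},
    "reviews": {"example-reviews.com"},
}


def parse_query(query: str) -> tuple[str, list[str]]:
    """Split 'climate change /news' into ('climate change', ['news'])."""
    terms: list[str] = []
    tags: list[str] = []
    for token in query.split():
        if token.startswith("/") and len(token) > 1:
            tags.append(token[1:].lower())
        else:
            terms.append(token)
    return " ".join(terms), tags


def filter_results(urls: list[str], tags: list[str]) -> list[str]:
    """Keep only URLs whose host is listed under every requested tag."""
    if not tags:
        return list(urls)
    # An unknown tag maps to an empty set, so it filters out everything.
    allowed = set.intersection(*(SLASHTAGS.get(t, set()) for t in tags))
    return [u for u in urls if urlparse(u).hostname in allowed]
```

For example, `parse_query("climate change /news")` yields the terms `"climate change"` and the tag list `["news"]`, and `filter_results` then drops any URL whose host is not in the `news` set.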
Despite innovative tools like slashtags and SEO analytics tabs for users, Blekko faced significant growth challenges in a search market dominated by [Google](/page/Google), which held over 90% share at the time, making it difficult to attract substantial user adoption or scale beyond niche audiences.[](https://techcrunch.com/2008/01/02/the-next-google-search-challenger-blekko/) The engine's emphasis on curation required ongoing [community](/page/Community) input, which limited rapid expansion compared to automated giants, and while it raised over $30 million in funding, including from [Yandex](/page/Yandex), it struggled against established ad-driven models.[](https://www.computerworld.com/article/1423747/search-engine-blekko-hauls-in-15m-russian-investment.html)[](https://247wallst.com/apps-software/2010/11/01/blekko-a-search-engine-for-idiots/)
In March 2015, IBM acquired Blekko's technology and team to enhance its [Watson](/page/Watson) AI platform, integrating the engine's advanced web-crawling, categorization, and intelligent filtering capabilities to improve data ingestion for [cognitive computing](/page/Cognitive_computing) applications.[](https://siliconangle.com/2015/03/30/ibm-acquires-technology-from-curated-search-engine-blekko-to-bolster-watson/) Following the acquisition, Skrenta joined [IBM](/page/IBM), contributing to the development of [Watson](/page/Watson) from 2015 to 2018.[](https://www.linkedin.com/in/skrenta) He subsequently founded [Tobiko](/page/Tobiko), a [restaurant](/page/Restaurant) recommendation service that emphasized expert reviews over [user-generated content](/page/User-generated_content), serving as its leader from 2018 to 2020,[](https://searchengineland.com/restaurant-app-tobiko-goes-old-school-by-shunning-user-reviews-323373) before joining [Meta](/page/Meta) as [Software Engineering](/page/Software_engineering) Director from 2020 to 2022.[](https://www.crunchbase.com/person/rich-skrenta) The acquisition marked the end of Blekko as an independent search service, with its assets repurposed for [enterprise](/page/Enterprise) AI rather than consumer search.[](https://venturebeat.com/ai/ibm-acquires-web-crawling-startup-blekko)
### Leadership at Common Crawl and Recent Activities
In 2022, Rich Skrenta was appointed as the [executive director](/page/Executive_director) of the [Common Crawl](/page/Common_Crawl) Foundation, a [nonprofit organization](/page/Nonprofit_organization) dedicated to archiving and providing [open access](/page/Open_access) to vast amounts of web data for research purposes.[](https://commoncrawl.org/team) Under his leadership, [Common Crawl](/page/Common_Crawl) has maintained and expanded its massive web archive, which spans multiple petabytes of crawled internet content dating back to 2008, enabling researchers, academics, and developers to analyze trends, build datasets, and support open-source projects without proprietary restrictions.
Skrenta's tenure has significantly advanced the use of this archive in [artificial intelligence](/page/Artificial_intelligence) and [machine learning](/page/Machine_learning) applications, positioning Common Crawl as a [primary source](/page/Primary_source) of training data for numerous large [language](/page/Language) models and [AI](/page/Ai) systems. The foundation's datasets, freely available through initiatives like the AWS [Open Data](/page/Open_data) program, have been utilized in the development of thousands of [AI](/page/Ai) models, fostering innovations in [natural language processing](/page/Natural_language_processing) and web-scale analysis while emphasizing ethical data accessibility for non-commercial research.[](https://www.theatlantic.com/technology/2025/11/common-crawl-ai-training-data/684567/)[](https://www.influencewatch.org/non-profit/common-crawl/)
In June 2025, Skrenta participated in the [United Nations](/page/United_Nations) Open Source Week in [New York](/page/New_York), where he delivered opening remarks on the role of [open data](/page/Open_data) in [AI](/page/Ai) development, highlighting Common Crawl's contributions to [provenance](/page/Provenance) and attribution in training datasets to promote [transparency](/page/Transparency) and global collaboration.[](https://commoncrawl.org/blog/common-crawl-at-un-open-source-week-june-2025) Later that year, amid growing controversies over [web archiving](/page/Web_archiving) practices, Skrenta publicly defended Common Crawl's approach in response to a November 4, 2025, article in *[The Atlantic](/page/The_Atlantic)* that accused the organization of inadvertently enabling access to paywalled [content](/page/Content) for [AI](/page/Ai) training. He argued that content published online should remain accessible for archival and research purposes, stating, "You shouldn't have put your content on the [internet](/page/Internet) if you didn't want it to be on the [internet](/page/Internet)."[](https://www.theatlantic.com/technology/2025/11/common-crawl-ai-training-data/684567/)
Skrenta has been a vocal [advocate](/page/Advocate) for AI models' rights to access publicly available [web content](/page/Web_content), particularly in addressing publisher objections and deletion requests that emerged in 2024. In response to efforts by [media](/page/Media) outlets to block or remove their material from archival datasets, he warned that such actions could undermine the open [internet](/page/Internet), describing them as an "affront to the [internet](/page/Internet) as we know it" and emphasizing the importance of comprehensive [data](/page/Data) for advancing AI [research](/page/Research) without favoring commercial gatekeepers.[](https://www.wired.com/story/the-fight-against-ai-comes-to-a-foundational-data-set/)[](https://www.theatlantic.com/technology/2025/11/common-crawl-ai-training-data/684567/) This stance has sparked debates on [data](/page/Data) [ethics](/page/Ethics), with Skrenta asserting that AI entities, like human researchers, should not be restricted from public resources, provided no paywalls are bypassed in the crawling process.[](https://mashable.com/article/common-crawl-accused-sharing-paywalled-content-ai-companies)
Elk Cloner: The program with a personality
It will get on all your disks
It will infiltrate your chips
Yes it's Cloner!
It will stick to you like glue
It will modify RAM too
Send in the Cloner!
```[](https://www.techtarget.com/searchsecurity/definition/Elk-Cloner)[](https://www.edn.com/1st-computer-virus-is-written-january-30-1982/)[](https://www.versus.com/en/news/elk-cloner-the-first-computer-virus)
As one of the earliest self-replicating programs to spread "in the wild" on personal computers—predating the formal coining of the term "[computer virus](/page/Computer_virus)" in 1983 and widespread [malware](/page/Malware) awareness—Elk Cloner highlighted vulnerabilities in [floppy disk](/page/Floppy_disk) sharing but stemmed from Skrenta's playful experimentation rather than malicious intent, circulating primarily within his [Mount Lebanon](/page/Mount_Lebanon) High School circle in [Pittsburgh](/page/Pittsburgh) before fading as users became aware of the prank.[](https://www.techtarget.com/searchsecurity/definition/Elk-Cloner)[](https://www.digit.fyi/elk-cloner/)[](https://www.versus.com/en/news/elk-cloner-the-first-computer-virus)
### College Years at Northwestern
Rich Skrenta attended [Northwestern University](/page/Northwestern_University), earning a [Bachelor of Arts](/page/Bachelor_of_Arts) in [Computer Science](/page/Computer_science) in 1989.[](https://www.linkedin.com/in/skrenta)[](https://commoncrawl.org/team/rich-skrenta-director)[](https://www.pubcon.com/bios/rich_skrenta.htm) Building on his high school programming experiments, such as the creation of [Elk Cloner](/page/Elk_Cloner), Skrenta's studies provided structured training in [software development](/page/Software_development) and [computing](/page/Computing) fundamentals during a period when personal computing was rapidly evolving.[](https://www.theregister.com/2012/12/14/first_virus_elk_cloner_creator_interviewed/?page=2)
During his time at Northwestern, Skrenta had access to university mainframes, which offered significantly more memory and processing power than the [Apple II](/page/Apple_II) systems he had used earlier, enabling more complex programming tasks amid the rise of [MS-DOS](/page/MS-DOS) machines.[](https://www.theregister.com/2012/12/14/first_virus_elk_cloner_creator_interviewed/?page=2) The [computer science](/page/Computer_science) curriculum at the time emphasized core areas like programming languages, operating systems, and [systems design](/page/Systems_design), aligning with emerging technologies such as Unix, which were becoming integral to academic and professional computing environments. While specific student projects or internships from this period are not widely documented, Skrenta's academic experience honed his skills in [software engineering](/page/Software_engineering), preparing him for contributions to open-source initiatives later in his career.[](https://commoncrawl.org/team/rich-skrenta-director)
Upon earning his degree in 1989, Skrenta transitioned directly into the tech industry, leveraging his educational background to enter professional roles in software development.[](https://www.pubcon.com/bios/rich_skrenta.htm)[](https://www.fastcompany.com/user/rich-skrenta) This marked the bridge from his student-era explorations to a trajectory in Silicon Valley entrepreneurship and technology leadership.
## Career
### Early Professional Roles
After graduating from Northwestern University with a degree in computer science, Rich Skrenta began his professional career at Commodore Business Machines, where he worked from 1989 to 1991 in the Amiga UNIX Group.[](https://www.fastcompany.com/user/rich-skrenta) There, he contributed to the development of Amiga UNIX, a port of UNIX System V Release 4 to the Amiga hardware platform, including reviewing documentation such as the *Learning Amiga UNIX* manual.[](http://www.vintagebytes.de/downloads/LearningAmigaUnix.pdf) His efforts helped adapt UNIX features like the X Window System for Amiga systems, supporting multi-user and multitasking capabilities on the platform.[](https://www.pubcon.com/bios/rich_skrenta.htm)
In 1991, Skrenta joined Unix System Laboratories (USL), a joint venture between [AT&T](/page/AT&T) and [Sun Microsystems](/page/Sun_Microsystems) focused on advancing UNIX standards, where he remained until 1995.[](https://www.fastcompany.com/user/rich-skrenta) At USL, he took on development roles contributing to operating system enhancements, building on the System V lineage to improve portability, networking, and system utilities during a pivotal era for commercial UNIX adoption.[](https://www.pubcon.com/bios/rich_skrenta.htm) This work involved refining [kernel](/page/Kernel) components and tools that influenced enterprise-grade UNIX implementations.
From 1996 to 1998, Skrenta served as an engineering manager at [Sun Microsystems](/page/Sun_Microsystems), concentrating on [software engineering](/page/Software_engineering) tasks including IP-level [encryption](/page/Encryption) for secure network communications.[](https://www.fastcompany.com/user/rich-skrenta) His role emphasized integrating security features into Sun's [Solaris](/page/Solaris) operating system and related networking software, addressing growing needs for encrypted data transmission in early [internet](/page/Internet) [infrastructure](/page/Infrastructure).[](https://blog.codinghorror.com/shipping-isnt-enough/) Through these positions, Skrenta developed deep expertise in Unix-based systems, operating system [porting](/page/Porting), and foundational enterprise computing practices that shaped his later technical contributions.[](https://www.pubcon.com/bios/rich_skrenta.htm)
### Contributions to Internet Search and Directories
In the late [1990s](/page/1990s), Rich Skrenta co-founded the Open Directory Project (ODP), originally launched as NewHoo in June 1998 alongside Bob Truel, Chris Tolles, Bryn Dole, and Jeremy Wenokur, as a collaborative, open-source alternative to proprietary web directories like Yahoo!'s.[](http://odp.org/about.html)[](https://www.infotoday.com/online/ol2000/sherman7.html) The project aimed to leverage volunteer editors to build a comprehensive, human-curated catalog of websites, addressing frustrations with outdated and incomplete directories by enabling community-driven submissions and categorizations.[](https://www.infotoday.com/online/ol2000/sherman7.html) By October 1998, NewHoo had grown to index over 100,000 sites with thousands of editors, leading to its acquisition by [Netscape](/page/Netscape) Communications, after which it was rebranded as [DMOZ](/page/DMOZ) and integrated into Netscape's ecosystem while preserving its open data model.[](https://www.infotoday.com/online/ol2000/sherman7.html)[](https://www.latimes.com/archives/la-xpm-1999-oct-18-fi-23567-story.html)
Following [Netscape](/page/Netscape)'s acquisition by [AOL](/page/AOL) in 1998, Skrenta took on engineering leadership roles within the company, heading development for [Netscape](/page/Netscape) Search, [AOL](/page/AOL) Music, and [AOL](/page/AOL) Shopping through the early 2000s.[](https://www.fastcompany.com/user/rich-skrenta)[](https://www.pubcon.com/bios/rich_skrenta.htm) In these positions, he contributed to enhancements in search functionalities, including refinements to algorithms for better result retrieval and user interfaces designed to improve navigation and personalization for early web users.[](https://www.fastcompany.com/user/rich-skrenta) DMOZ's categorized data became a foundational resource, powering directory services for [Netscape](/page/Netscape) Search and [AOL](/page/AOL) Search, as well as portals like [Alexa Internet](/page/Alexa_Internet), by providing structured, vetted links that supplemented automated crawling.[](https://www.infotoday.com/online/ol2000/sherman7.html)
Skrenta's innovations emphasized human oversight in web organization, introducing peer-reviewed editing processes and automated checks that maintained dead-link rates below 0.5%, far surpassing competitors.[](https://www.infotoday.com/online/ol2000/sherman7.html) This approach advanced [categorization](/page/Categorization) by allowing multi-level hierarchies and editor expertise to ensure topical [relevance](/page/Relevance), enabling more intuitive browsing in an era before dominant algorithmic search engines.[](https://www.infotoday.com/online/ol2000/sherman7.html) By making ODP's dataset freely available under an open license, Skrenta facilitated its adoption by engines like [Lycos](/page/Lycos), [AltaVista](/page/AltaVista), and later [Google](/page/Google) for enhancing result [relevance](/page/Relevance) through hybrid human-machine methods.[](https://www.infotoday.com/online/ol2000/sherman7.html)
These efforts marked Skrenta's shift from hands-on development to shaping the broader [web](/page/Web) infrastructure, influencing how directories supported scalable [internet](/page/Internet) navigation and search accuracy during the dot-com expansion.[](https://www.infotoday.com/online/ol2000/sherman7.html) By 2000, [DMOZ](/page/DMOZ) had expanded to over 1.5 million sites and 22,000 editors, underscoring its role as a pivotal community resource in the evolving search landscape.[](https://www.infotoday.com/online/ol2000/sherman7.html)
### Founding Topix
In 2004, Rich Skrenta co-founded Topix LLC, a [hyperlocal](/page/Hyperlocal) news and discussion platform designed to aggregate and deliver community-specific content from thousands of online sources.[](https://www.latimes.com/archives/la-xpm-2005-mar-23-fi-topix23-story.html) Along with co-founders including Chris Tolles, Bryn Dole, Bob Truel, Tom Markson, and Mike Markson, Skrenta aimed to create a centralized hub for [local news](/page/Local_news), forums, and information, drawing on his prior experience in web directories and search technologies at [Netscape](/page/Netscape) and [AOL](/page/AOL).[](https://www.reuters.com/article/topix-ceo/topix-com-co-founder-chief-executive-departs-idUSN2636819120070626/) The platform initially focused on automating the collection of articles from over 10,000 newspapers and websites, enabling users to access tailored feeds for more than 32,000 U.S. locales.[](https://www.nytimes.com/2005/03/23/business/media/newspaper-giants-buy-web-news-monitor.html)
As the company's CEO and primary technical architect, Skrenta led the development of proprietary algorithms central to Topix's operations, including the NewsRank system for scoring and prioritizing local stories.[](https://www.poynter.org/reporting-editing/2004/topix-net-ups-ante-in-news-aggregation/) This algorithm combined automated analysis (such as frequency of mentions across sources and geographic relevance) with editorial oversight to rank content, so that the most pertinent local developments surfaced from the aggregated newspaper articles without overwhelming users.[](https://www.poynter.org/reporting-editing/2004/topix-net-ups-ante-in-news-aggregation/) By emphasizing geo-sophisticated indexing, Topix differentiated itself from broader news aggregators, fostering community engagement through integrated discussion boards while maintaining a focus on verifiable, source-driven reporting.[](http://itc.conversationsnetwork.org/shows/detail3312.html)
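The combination of signals described above can be illustrated with a small sketch. It is not Topix's NewsRank implementation, whose details are not public here; the field names, the equal weights, the mention cap, and the `editor_boost` term standing in for editorial oversight are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Story:
    title: str
    source_mentions: int        # number of sources carrying the story
    geo_relevance: float        # 0.0 to 1.0, match to the target locale (assumed scale)
    editor_boost: float = 0.0   # hypothetical manual adjustment by an editor


def score(story, w_mentions=0.5, w_geo=0.5, mention_cap=10):
    """Weighted sum of normalized mention frequency and geographic relevance."""
    # Cap mentions so that heavily syndicated stories cannot dominate.
    mentions = min(story.source_mentions, mention_cap) / mention_cap
    return w_mentions * mentions + w_geo * story.geo_relevance + story.editor_boost


def rank(stories):
    """Return stories ordered from highest to lowest score."""
    return sorted(stories, key=score, reverse=True)
```

With these assumed weights, a story carried by few sources but closely tied to the locale can outrank a widely syndicated story with little local connection, which is the behavior the paragraph attributes to geographically aware ranking.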
The venture's growth accelerated in 2005 when Skrenta negotiated a deal selling a 75% stake to a [consortium](/page/Consortium) of major newspaper publishers (Tribune Company, Gannett Co., and Knight Ridder Inc.), each acquiring 25% ownership, while the founders retained the remaining 25%.[](https://www.nytimes.com/2005/03/23/business/media/newspaper-giants-buy-web-news-monitor.html) This infusion of capital and industry partnerships enabled rapid scaling, including expanded server infrastructure and content partnerships, propelling Topix to profitability within months and solidifying its position as a leading [local news](/page/Local_news) destination.[](https://www.latimes.com/archives/la-xpm-2005-mar-23-fi-topix23-story.html) Skrenta continued as CEO, overseeing technical refinements and strategic direction until a leadership transition in 2007, when he stepped down but remained on the board as an advisor.[](https://www.reuters.com/article/topix-ceo/topix-com-co-founder-chief-executive-departs-idUSN2636819120070626/)
### Development and Sale of Blekko
In 2007, Rich Skrenta co-founded Blekko Inc., drawing on his prior experience in search technologies from ventures like Topix and the Open Directory Project, to create an alternative [web](/page/Web) [search engine](/page/Search_engine) aimed at addressing perceived shortcomings in dominant players like [Google](/page/Google).[](https://www.crunchbase.com/organization/blekko) The company operated in [stealth mode](/page/Stealth_mode) initially, securing early funding from investors including [Marc Andreessen](/page/Marc_Andreessen), before emerging from beta testing with a public launch on November 1, 2010.[](https://www.nbcnews.com/id/wbna39955277)[](https://techcrunch.com/2008/05/14/stealth-search-engine-blekko-gets-money-from-marc-andreessen-softtech/)
Blekko differentiated itself through its "slashtag" system, which allowed users to append slashes to search queries (e.g., /news or /reviews) to filter results to specific high-quality sites or categories, promoting site-specific searches and user-curated refinements.[](https://www.computerworld.com/article/1542164/q-a-blekko-execs-explain-their-search-engine-strategy.html) This approach emphasized transparency by publicly displaying the filters applied to results and reducing ad clutter, with options like "No Ads" searches to prioritize clean, relevant outputs over monetized promotions.[](https://searchengineland.com/blekko-will-keep-user-data-48-hours-76871) Additionally, Blekko implemented robust spam filtering at the crawling and indexing stages, automatically curating results in spam-prone areas such as health or automotive queries, and incorporated open elements inspired by collaborative projects like Wikipedia to crowdsource result validation and combat content farms.[](https://www.computerworld.com/article/1542164/q-a-blekko-execs-explain-their-search-engine-strategy.html)[](https://lifehacker.com/blekko-de-spams-search-results-with-slashtags-5678321) These features positioned Blekko as a more trustworthy alternative, focusing on human moderation and algorithmic clarity to deliver fewer but higher-quality results.[](https://www.theatlantic.com/technology/2010/11/introducing-blekko-the-wikipedia-of-search-engines/343615/)
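The slashtag idea can be sketched in a few lines. This is an illustrative sketch, not Blekko's implementation: the `SLASHTAGS` table, its host lists, and the function names are hypothetical, and a real engine would apply such filters inside its index rather than to a list of result URLs.

```python
from urllib.parse import urlparse

# Hypothetical slashtag table: tag name -> set of curated hosts.
SLASHTAGS = {
    "news": {"reuters.com", "nytimes.com"},
    "reviews": {"consumerreports.org"},
}


def parse_query(query):
    """Split a query such as 'climate /news' into search terms and slashtags."""
    terms, tags = [], []
    for token in query.split():
        if token.startswith("/") and len(token) > 1:
            tags.append(token[1:].lower())
        else:
            terms.append(token)
    return terms, tags


def host_of(url):
    """Return the URL's host without a leading 'www.' prefix."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def apply_slashtags(query, results):
    """Keep only results whose host belongs to every slashtag in the query."""
    terms, tags = parse_query(query)
    allowed = None
    for tag in tags:
        hosts = SLASHTAGS.get(tag, set())
        allowed = hosts if allowed is None else allowed & hosts
    if allowed is None:
        # No slashtag in the query: results pass through unfiltered.
        return terms, list(results)
    return terms, [url for url in results if host_of(url) in allowed]
```

Because each filter is a named, inspectable list of sites, a user can see exactly why a result was kept or dropped, which is the transparency property the paragraph describes.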
Despite innovative tools like slashtags and SEO analytics tabs for users, Blekko faced significant growth challenges in a search market dominated by [Google](/page/Google), which held over 90% share at the time, making it difficult to attract substantial user adoption or scale beyond niche audiences.[](https://techcrunch.com/2008/01/02/the-next-google-search-challenger-blekko/) The engine's emphasis on curation required ongoing [community](/page/Community) input, which limited rapid expansion compared to automated giants, and while it raised over $30 million in funding, including from [Yandex](/page/Yandex), it struggled against established ad-driven models.[](https://www.computerworld.com/article/1423747/search-engine-blekko-hauls-in-15m-russian-investment.html)[](https://247wallst.com/apps-software/2010/11/01/blekko-a-search-engine-for-idiots/)
In March 2015, IBM acquired Blekko's technology and team to enhance its [Watson](/page/Watson) AI platform, integrating the engine's advanced web-crawling, categorization, and intelligent filtering capabilities to improve data ingestion for [cognitive computing](/page/Cognitive_computing) applications.[](https://siliconangle.com/2015/03/30/ibm-acquires-technology-from-curated-search-engine-blekko-to-bolster-watson/) The acquisition marked the end of Blekko as an independent search service, with its assets repurposed for [enterprise](/page/Enterprise) AI rather than consumer search.[](https://venturebeat.com/ai/ibm-acquires-web-crawling-startup-blekko) Following the acquisition, Skrenta joined [IBM](/page/IBM), contributing to the development of [Watson](/page/Watson) from 2015 to 2018.[](https://www.linkedin.com/in/skrenta) He subsequently founded Tobiko, a [restaurant](/page/Restaurant) recommendation service that emphasized expert reviews over [user-generated content](/page/User-generated_content), and led it from 2018 to 2020,[](https://searchengineland.com/restaurant-app-tobiko-goes-old-school-by-shunning-user-reviews-323373) before joining [Meta](/page/Meta) as [Software Engineering](/page/Software_engineering) Director from 2020 to 2022.[](https://www.crunchbase.com/person/rich-skrenta)
### Leadership at Common Crawl and Recent Activities
In 2022, Rich Skrenta was appointed as the [executive director](/page/Executive_director) of the [Common Crawl](/page/Common_Crawl) Foundation, a [nonprofit organization](/page/Nonprofit_organization) dedicated to archiving and providing [open access](/page/Open_access) to vast amounts of web data for research purposes.[](https://commoncrawl.org/team) Under his leadership, [Common Crawl](/page/Common_Crawl) has maintained and expanded its massive web archive, which spans multiple petabytes of crawled internet content dating back to 2008, enabling researchers, academics, and developers to analyze trends, build datasets, and support open-source projects without proprietary restrictions.
Skrenta's tenure has significantly advanced the use of this archive in [artificial intelligence](/page/Artificial_intelligence) and [machine learning](/page/Machine_learning) applications, positioning Common Crawl as a [primary source](/page/Primary_source) of training data for numerous large [language](/page/Language) models and [AI](/page/Ai) systems. The foundation's datasets, freely available through initiatives like the AWS [Open Data](/page/Open_data) program, have been utilized in the development of thousands of [AI](/page/Ai) models, fostering innovations in [natural language processing](/page/Natural_language_processing) and web-scale analysis while emphasizing ethical data accessibility for non-commercial research.[](https://www.theatlantic.com/technology/2025/11/common-crawl-ai-training-data/684567/)[](https://www.influencewatch.org/non-profit/common-crawl/)
In June 2025, Skrenta participated in the [United Nations](/page/United_Nations) Open Source Week in [New York](/page/New_York), where he delivered opening remarks on the role of [open data](/page/Open_data) in [AI](/page/Ai) development, highlighting Common Crawl's contributions to [provenance](/page/Provenance) and attribution in training datasets to promote [transparency](/page/Transparency) and global collaboration.[](https://commoncrawl.org/blog/common-crawl-at-un-open-source-week-june-2025) Later that year, amid growing controversies over [web archiving](/page/Web_archiving) practices, Skrenta publicly defended Common Crawl's approach in response to a November 4, 2025, article in *[The Atlantic](/page/The_Atlantic)* that accused the organization of inadvertently enabling access to paywalled [content](/page/Content) for [AI](/page/Ai) training. He argued that content published online should remain accessible for archival and research purposes, stating, "You shouldn't have put your content on the [internet](/page/Internet) if you didn't want it to be on the [internet](/page/Internet)."[](https://www.theatlantic.com/technology/2025/11/common-crawl-ai-training-data/684567/)
Skrenta has been a vocal [advocate](/page/Advocate) for AI models' rights to access publicly available [web content](/page/Web_content), particularly in addressing publisher objections and deletion requests that emerged in 2024. In response to efforts by [media](/page/Media) outlets to block or remove their material from archival datasets, he warned that such actions could undermine the open [internet](/page/Internet), describing them as an "affront to the [internet](/page/Internet) as we know it" and emphasizing the importance of comprehensive [data](/page/Data) for advancing AI [research](/page/Research) without favoring commercial gatekeepers.[](https://www.wired.com/story/the-fight-against-ai-comes-to-a-foundational-data-set/)[](https://www.theatlantic.com/technology/2025/11/common-crawl-ai-training-data/684567/) This stance has sparked debates on [data](/page/Data) [ethics](/page/Ethics), with Skrenta asserting that AI entities, like human researchers, should not be restricted from public resources, provided no paywalls are bypassed in the crawling process.[](https://mashable.com/article/common-crawl-accused-sharing-paywalled-content-ai-companies)