This week's "special report" discusses three internal documents from the Wikimedia Foundation that shed light on the history of the Knowledge Engine project. Here, we examine each one in depth.
Wikipedia, on the other hand, is characterised as follows:
“ | Wikipedia's Roots
No other search engines carry these ideals Wikipedia Search Originates
No other search engines carry these ideals Wikipedia Search is ... Trusted. Private. Open. Wikipedia Search Globally democratizes knowledge. |
” |
The presentation concludes with screen mock-ups of what a Wikipedia search engine could look like, highlighting content from Wikivoyage, Openmaps, Fox News, Wikipedia and Wikidata.
Marked "CONFIDENTIAL – DRAFT", this 11-page document addressed to the Knight Foundation has the headline "Knowledge Engine by Wikipedia: A Proposal from the Wikimedia Foundation".
After briefly describing the history and achievements of the Wikipedia project, the document states:
“ | The Wikimedia Foundation is embarking on a new global project that will once again change the way people access knowledge on the Internet. Knowledge Engine By Wikipedia is a federated knowledge engine that will give users the most reliable and most trustworthy public information channel on the web, applying fundamentals of transparent Wiki-based systems to surfacing the most relevant and important information. Knowledge Engine By Wikipedia will democratize the discovery of media, news and information – it will make the Internet’s most relevant information more accessible and openly curated, and it will create an open data engine that’s completely free of commercial interests. Our new site will be the Internet’s first transparent search engine, and the first one that carries the reputation of Wikipedia and the Wikimedia Foundation.
The Problem
The emergence of the Internet had promised massive democratization of content delivery. On the creation side, that promise has been largely fulfilled. Any person can easily add content to the enormous internet system. Simultaneously, as the availability of this information exploded, a few proprietary technologies began to consolidate channels of access to this data. This is accomplished through consolidation of access points into giant enterprises that today control user interfaces through device access, search, and media networks. The mechanisms by which the information on the internet is collected and displayed is largely obscured by proprietary algorithms.
An exception to this pattern is Wikipedia. As a nonprofit, ad-free and collaboratively built site it has no incentives leveled upon the commercial systems. It is fully transparent in what information takes precedence, and how it is produced. It does not use personal data to market or sell to users or to optimize for ad revenue, and it prioritizes personal information security to avoid undue bias or censorship. In other words, it is aligned with user needs for transparency, clarity and trust.
The Solution
Knowledge Engine By Wikipedia will differ from commercial search engines in key areas:
Knowledge Engine By Wikipedia will surface important noncommercial results that are:
How Is It Different?
The goal of today’s commercial engine is to give the user what they (or the interested party) think they want to know – the fact and data about a query: a medicine sold by a drug company, a movie ticket, or a most popular result. The knowledge engine of tomorrow will guide the user to discover what they need to know that is only available with a crowd-based knowledge engine: a new or alternative medicine producing better results at a lower price point, a book summary and source language and versions of the movies based on it, the most relevant result to the user’s area of exploration.
Current engines rely on indexing and interlinking as the primary method for identifying and highlighting relevant results. In a world where data proliferation is rapid and unabiding, Wikipedia has a few advantages:
Our Knowledge Engine Will Be:
Performance Based
We are building a knowledge engine that has speed, open data, and relevance at its core. A new entry point to the sum of all knowledge, Knowledge Engine By Wikipedia has the responsiveness of commercial search engines and the ethos of Wikipedia and the Wikimedia Foundation.
An Efficient Experience
Quality is more important than quantity. The user doesn’t always need 10 or 20 or 200 results – they need the right set or even one result that provides a sufficient amount of knowledge with the contextual discovery to dig deeper. Still, in most searches, our knowledge engine will uncover a multitude of quality results, which should encourage a “down the rabbit-hole” discovery experience. The engine’s speed will bring consistency across the user interface, configuration options that adapt to users’ preferences, and an ease of experience that lets the user concentrate on the discovery task rather than the interface. Speed is crucial for global enablement but also for getting things done. Quickness and quality will be hallmarks of Knowledge Engine By Wikipedia.
Openly Curated
We are building a unique engine that sets us apart from commercial engines. Our knowledge engine leverages open data sources and champions an open understanding of where and how the results are calculated and curated. We have the unique opportunity to merge open knowledge graphs and data sources in a federated landscape. By combining human and machine curation, we are forming a holistic, user-centered model to drive our knowledge engine.
A Multifaceted Tool
Knowledge Engine By Wikipedia is much more than a search input – it’s like a collection of powerful apps and portals rolled into a singular interface and input. We’re creating a tool where questions like “show me the progress of an event” display contextual maps and timelines, and where a query reveals multiple types of media and data displayed with charts and visualizations – all in a way that illustrates quicker and more completely than text alone. With Knowledge Engine By Wikipedia, the user instantly gets the context of a query in a larger perspective.
From an Open Community
We’re focused on creating resources and tools for an open knowledge-engine community, and building on the input of an advisory team. We will strengthen the Application Programming Interface and the resources around the knowledge engine to enable us and others to build, contribute to, and extend the engine. “Openness” – through curation, sourcing, and community – means everyone can contribute to Knowledge Engine By Wikipedia, and everyone can use the results and software without restrictions. It's what the Internet was meant to be and it’s what Wikipedia is, and what our knowledge engine will be, too. |
” |
This is followed by a set of screen mock-ups labeled "Trending", "Multimedia Content", "Smarter Answers" and "Nearby" and an outline of the four stages of the plan:
“ | The Plan in Four Stages
We anticipate each stage will take 16–18 months to develop and transition into the overlapping stages. The Discovery stage has already begun, and each stage has the potential to overlap with other stages.
|
” |
There follows a timeline graphic and a more detailed description of these four stages, each comprising an introductory paragraph followed by an average of half a dozen bullet points. The document concludes with the table of costs reproduced on page 9 of the Knowledge Engine grant agreement, appended to which is the following:
“ | If we see significant progress on the project during the first six months of the fiscal year (July–December 2015), we may petition the Wikimedia Foundation Board of Trustees for permission to seek and spend additional resources in support of the project.
Future Fiscal Years
We anticipate future years’ budgets to increase by 20% per year as we accelerate the growth of the program.
Projected future budgets
FY 16–17: $2,900,000
FY 17–18: $3,500,000
Request of the Knight Foundation
To support the project, we respectfully request $2 million per year for three fiscal years, which would make the Knight Foundation Knowledge Engine By Wikipedia's primary initial sponsor. The remaining initial support will come from the Wikimedia Foundation's general fund or from additional restricted grants. To identify other foundations that would support Knowledge Engine By Wikipedia, we welcome your suggestions and assistance. Thank you. |
” |
The formal grant application, requesting a much reduced $250,000 from the Knight Foundation, summarizes the proposal as follows:
“ | Knowledge Engine By Wikipedia is a federated knowledge engine that will give users the most reliable and most trustworthy public information channel on the web, applying fundamentals of transparent Wiki-based systems to surfacing the most relevant and important information.
The funds requested are in support of Stage One of this project. |
” |
The remainder of this document is largely reproduced on the latter pages of the grant agreement itself.
Discuss this story
If Fox News or TeleSUR have the slightest chance of appearing as data sources of this searching project, I will campaign to stop it. --NaBUru38 (talk) 14:04, 15 February 2016 (UTC)
Curation
Regarding "Establish curation process." When I see the WMF talk of "curation" I see them continuing to add more hamster wheels to a cage which already has in excess of a ten-to-one wheel-to-hamster ratio. Get a clue: we can only run on one wheel at a time. Tools which enable us to run more efficiently are what we need. How this "curation process" is likely to pan out: teams of low-paid "curators" in various third-world countries will work tirelessly to push the importance of their sponsors' favored articles and move them to the upper echelons of search results, overwhelming any efforts of independent curators. Either that, or it will only take 12 months to establish an 11-month "curation backlog". Wbm1058 (talk) 04:04, 16 February 2016 (UTC)
Asked and Answered
At User talk:Jimbo Wales#Basic question about the scope of the grant I asked the following question:
The reply I got was
I followed up with:
And the response was
- "Sure. We don't have, and won't have, the resources at our disposal to even contemplate a Google/Bing style search engine, and all the talk about that is just that - talk based on nothing. I can envision - but this is not current planned and isn't even in a serious brainstorm yet as far as I know..." --Jimbo Wales
I trust Jimbo, based upon ten years of experience dealing with him. If any WMF or Knight foundation documents appear to contradict the above, then either those documents are lying, someone is doing something without Jimbo's knowledge, or someone is reading too much into what are essentially marketing documents and not paying enough attention to the deliverables. --Guy Macon (talk) 01:21, 17 February 2016 (UTC)