Open Content Resources
Open Content is a general term for data (text, images, audio, video, maps, etc.) that is published with the intention of making the works available for anyone to copy, modify, or repurpose. (See http://en.wikipedia.org/wiki/Open_Content for a more formal definition and history.) Open content generally falls into one of three categories for copyright purposes:
- works that are in the public domain either intentionally, through expiration of copyright, or through an error in the filing of copyright.
- works released under the GNU General Public License, the Project Gutenberg License, one of the Creative Commons licenses, or another standardized license
- works released with an explicit statement of how they may be used, but without adherence to a specific standardized license
Table of Contents
- Using These Resources
- General Resources
- Learning Repositories
- Open Data
- Formats and Players
- About Licenses and Copyright
Open Content works will be in generally accessible formats. If the formats used are proprietary, the software necessary to read or play the files is readily available in a free version. Likewise, these works do not use Digital Rights Management (DRM) systems to limit copying and distribution. DRMs are software (or sometimes hardware) based systems that prevent or restrict reproduction of materials.
The movement is not new, dating back to at least 1971 (http://lwn.net/Articles/177602/), when Michael Hart tried to distribute the Declaration of Independence to everyone on ARPANET and when Abbie Hoffman published Steal This Book; however, it has taken on a new importance in recent years, as copyright laws have changed and companies have attempted to restrict use of their publications. Not all Open Content resources are free (in the monetary sense), but most are available for non-commercial uses.
Closely allied with the Open Content movement is the Open Access movement, which focuses on making scholarly and other articles freely available, though not necessarily for repurposing or modification.
This document is intended as an aid in finding and using open content for MU faculty and students. Note that some of the sites listed in this document (such as Google Books) contain a mix of open content, open access, and copyrighted materials.
Many of these sites provide different kinds of materials that can be inserted into papers, digital media projects, or combined in novel ways. For instance, portions of USGS maps from the Libre Map Project might be inserted into a term paper on watersheds in Missouri.
A term paper on Pearl Harbor might be expanded in several directions by including extracts from the various Federal Pearl Harbor investigation reports available on iBiblio, an embedded newsreel of Admiral Kimmel's testimony before Congress in 1946 from the Internet Archive, and links to Google Earth files that overlay contemporary aerial photographs of Pearl Harbor onto a three-dimensional representation of the harbor from the Google Earth Community. Any of these materials could, of course, be annotated by the student.
Still images, contemporary music, and early newsreels from the Library of Congress' American Memory site could be combined with student-created slides and narration in a digital media project on women's suffrage.
Information about copyright, licenses, and file formats may be found at the end of this document.
The Internet Archive (http://www.archive.org) The Internet Archive contains a diverse and huge collection of video, audio, and texts. These range from classic books to WWII newsreels and recent short audio and video. This is really a collection of several collections organized in a uniform manner. It is a good source of older materials as well as recent works. It is perhaps the best place to start looking for content. Mostly Public Domain and CC Licenses. License information usually appears on the left for CC-licensed works. See: http://www.archive.org/about/faqs.php for more specifics on particular parts of the archive.
iBiblio http://www.ibiblio.org Another digital archive, though more oriented to text and audio, is ibiblio: the public's library and digital archive at UNC. Public Domain unless otherwise specified in the material. See the "Read and Heed" section of http://www.ibiblio.org/collection.html.
Library of Congress http://www.loc.gov The Library of Congress, particularly the American Memory and Global Gateway sections, provides treasure troves of images and materials that are open or otherwise usable content. It is an excellent source of images for digital media projects. Materials are in various states of copyright. The location and amount of information on copyright varies by sub-collection.
Europeana (http://europeana.eu) is a repository of texts, images, audio, and video sponsored by the European Union. There is a very wide range of materials from throughout Europe covering prehistory to the present day. Copyright and usage guidelines vary with the source of the object. See: http://europeana.eu/portal/termsofservice.html.
Creative Commons http://creativecommons.org Creative Commons, in addition to its copyright licenses and information, contains links to a number of open content sites around the world. See especially:
- Audio: http://creativecommons.org/audio
- Video: http://creativecommons.org/video
- Images: http://creativecommons.org/image
These sites are useful for finding almost any kind of media that may be included in other projects; however, they are not very useful for older works. Creative Commons and GNU Licenses. License location varies by site.
Wikimedia Commons (http://www.wikimedia.org/wiki/Main_Page) Wikimedia Commons is home to a variety of free content, including audio, video, and still images. Public Domain and CC Licenses. License information appears below the item.
OAIster (http://www.oaister.org) OAIster is a union catalog of digital resources. As such, it is a search engine, rather than a repository. Copyright and licensing will vary across the collections it references.
Open Content Library Wiki (http://www.opencontentlibrary.org) Open Content Library Wiki is a listing of open content repositories. Several of the items on this page are derived from it. Copyright and licensing will vary across the sites it references.
Searching for images, video, etc. in Google can be made to return those in the public domain or released under CC licenses by adding the words "public domain" or "creative commons" after the search term. You will still need to look at any notices on the originating pages.
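The search tip above can be sketched in a few lines of Python. This is only an illustration: the function name `licensed_search_url` and the base URL are assumptions made for the example, not part of any official search API, and the results still need to be checked against the notices on the originating pages.

```python
from urllib.parse import urlencode


def licensed_search_url(term, license_phrase="creative commons"):
    """Build a search URL that adds a quoted license phrase after the term.

    The base URL below is an assumption for illustration only.
    """
    query = f'{term} "{license_phrase}"'
    # urlencode escapes spaces as "+" and double quotes as "%22".
    return "https://www.google.com/search?" + urlencode({"q": query})


print(licensed_search_url("Missouri watershed map", "public domain"))
```

Quoting the phrase asks the search engine to match the words together, but this only narrows the results and does not confirm the license of any item returned.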
Open Clip Art Library (http://openclipart.org/) Open Clip Art Library is an archive of public domain, user-contributed clip art in various formats. Public Domain and CC licensed. Symbols indicating the licensing status are the last field in each record.
Open Photo (http://openphoto.net) Open Photo is a searchable and browsable site of user-contributed, CC-licensed photographs. CC licensed. Details are in a sidebar to the right of the photo.
Flickr: Creative Commons (http://www.flickr.com/creativecommons) Flickr: Creative Commons allows searching through several million user-contributed photographs in Flickr by type of CC license. CC licensed. License information is in the "Additional Information" section in the sidebar to the right. Note that it uses small symbols in light gray, as well as text to indicate the license details.
Web Museum (http://www.ibiblio.org/wm) The Web Museum is one of the oldest collections of photographs and descriptions of art work on the Web, dating back to 1994. Resolution is excellent for online viewing and usable for small printed illustrations of the work. Although this is an iBiblio site, there are copyright considerations. Users have the option of following either CC BY-SA or GFDL (or GPL) licenses. See http://www.ibiblio.org/wm/about/license.html for the main copyright information, but see also http://www.ibiblio.org/wm/about/copyright-issues.html for a deeper discussion of reproducing artworks online.
CCMixter (http://www.ccmixter.org/) CCMixter is a Creative Commons project for producing, sharing, and distributing music remixes. This site is devoted to contemporary user-created content and is a useful source for background music. Unless the user is looking for a particular piece or artist, the Browse Tags feature is probably the best way to locate suitable clips. Click the speaker icons to preview clips without downloading. CC licensed. Specifics of the licenses are indicated by symbols at the bottom of each entry.
Freesound Project (http://freesound.iua.upf.edu/) Freesound Project is a browsable/searchable collaborative database of sounds (not songs) for use in digital projects. A great source for sound effects for digital media projects. Finding sounds can be done through several means from the "Search/Browse" section of the left-hand sidebar. Unless the user has a specific sound in mind, tags are probably the best way of browsing. The "Geotagged Samples" interface allows browsing some of the sounds by location. Uses Creative Commons Sampling Plus License.
Free Music Project (http://freemusic.freeculture.org) Free Music Project distributes user-created music. The site is similar in content and navigation to CCMixter, though at this time there are fewer tracks. Navigation is easiest through tags. CC licensed. Symbols indicating license details appear at the lower left of each entry.
The Internet Archive (see above in General Resources) probably provides the widest range of public domain or CC-licensed content available. The Creative Commons directories listed in the same section provide links to a number of small and sometimes highly focused video sites (e.g., EngageMedia deals with "social justice and environmental issues in Australia, Southeast Asia, and the Pacific").
Open Video (http://open-video.org/) Open Video is a repository of digitized video with more of an emphasis on items for digital libraries and researchers. The emphasis is on short documentaries, old educational video, and ephemera. Potentially good for short samples for use in digital media projects. There is some overlap with the Internet Archive and National Archives Video via Google Video. CC licensed. Certain NARA and NASA videos have ambiguous copyrights but are released on a fair use basis for academic use.
National Archives Video via Google Video (http://video.google.com/nara.html) This resource has three sections: NASA History of Space Flight, United Newsreel Motion Pictures (1942-1945), and Department of the Interior Motion Pictures. The United Newsreels form a nice complement to the Universal Newsreels available from the Internet Archive. If you wish to edit or sample these videos, it is recommended that you choose the "iPod/Sony PSP" option to download, as this downloads an MP4 file, rather than a proprietary Google Video format. No information is provided on copyright; however, US Government video should be public domain unless otherwise specified. It is unclear how Google's distribution may modify the permissions.
BlipTV is primarily oriented to distributing vlogs and other serialized video content. Content may be streamed, downloaded, and subscribed to. Much of the material uses various CC licenses. The Copyright/License information is listed in the lower right.
In working with video found online, please be careful about downloading and using videos from streaming sites like YouTube or Revver. Even though there are many services and programs for downloading YouTube content, YouTube's terms of service prohibit downloading (as opposed to streaming). Also, few YouTube videos are public domain or CC-licensed. Although Revver allows downloading and uses a Creative Commons license, that license does not permit remixing or any derivatives. Many of these sites (including some, like YouTube, that do not allow downloads) do make embedding video streams into web pages as easy as copying code into an HTML editor. This can be useful in Blackboard courses.
Project Gutenberg (http://www.gutenberg.org/wiki/Main_Page) Over 20,000 out-of-copyright books are available online, as well as some audio books and sheet music. An RSS feed of recently posted or updated books is available (http://www.gutenberg.org/feeds/today.rss). See http://www.gutenberg.org/wiki/Gutenberg:The_Project_Gutenberg_License or the copyright/license information in the header of each work.
Wikisource (http://en.wikisource.org/wiki/Main_Page) Wikisource, "the free library that anyone can edit," is part of Wikimedia. It is a repository of texts from different periods and many different languages. GFDL unless otherwise noted at the bottom of the page. Wikisource gives specifics of why a given work is in the public domain.
O'Reilly Open Books (http://www.oreilly.com/openbook/) O'Reilly Open Books are a joint project between O'Reilly, Creative Commons, and Internet Archive to provide free online access to a variety of programming and technology books. Mostly CC licensed, but location and specifics vary widely.
Google Books (http://books.google.com) contains more than 1.5 million scanned, public domain copies of books. (Some books, such as The Hound of the Baskervilles, are available from Google Books in different editions, and in some cases the same edition has been scanned at more than one library.) These books are in the public domain, but, given the expense of scanning and hosting the books, Google requests that these texts only be used for non-commercial purposes. Users should also be aware of problems in these scanned books. Some pages are badly scanned (for instance http://books.google.com/books?id=EEAJAAAAIAAJ&lr=&pg=PA64#v=onepage&q=&f=false, which is part of an 1845 edition of Lord Nelson's letters and dispatches), while in other cases old type fonts and diacriticals may result in errors in the OCR output. These books may be downloaded as PDFs or, in some cases, ePub files (an open standards format for ebooks). They are also available with special formatting for hand-held devices (e.g. smartphones, iPhones, and iPod Touches) at http://books.google.com/m. Many of the Google Books are also available through the Internet Archive or via Sony's eBook Library software. Another interesting feature is that Google provides code that allows a book or pages of a book to be embedded in a blog or other web page.
Mutopia (http://www.mutopiaproject.org/) Mutopia is a CC licensed repository of classical sheet music, browseable by composer or instrument. Mostly CC licensed, but some under older licenses. Click on the More Information link in the record for each piece. Licenses may also be found at the bottom of each page of music, with a more detailed colophon at the end of each piece.
Public Library of Science (http://www.plos.org/) The Public Library of Science "is a nonprofit organization of scientists and physicians committed to making the world's scientific and medical literature a freely available public resource." At this time, the journals available concentrate on medicine and biology. CC Attribution license unless otherwise noted.
For a wide range of open access, peer-reviewed, scholarly journals, consult the Directory of Open Access Journals (http://www.doaj.org/). Note that these are open access, not necessarily open content.
See also the entries in General Resources above.
More information on learning repositories may be found in the ET@MO handout "Finding Reusable Instructional Materials".
MERLOT (http://www.merlot.org/merlot/index.htm) MERLOT (Multimedia Educational Resource for Learning and Online Teaching) is one of the more established repositories for learning modules. Browse or search by topic or title in the Learning Materials tab. Not really an open content site, but the site has fairly liberal rules on usage, though not as liberal as a CC license. See http://taste.merlot.org/intellectualpolicy.html for details.
Connexions (http://cnx.org/) Connexions is a CC-licensed repository for course modules hosted at Rice University. Navigation is best done through the Content tab, where searching and browsing by topic are supported. The specific CC licenses are noted at the bottom of each entry.
Open Educational Resources Commons (http://www.oercommons.org) Open Educational Resources Commons provides course materials ranging from kindergarten through graduate school. There are many navigation options. Search (upper-right corner of main page), browse by subject or level (left sidebar) or tags (right sidebar) are all available. CC licenses are noted with specific icons and text near the bottom of each entry.
Open Learn (http://www.open.ac.uk/openlearn/home.php) Open Learn is a repository of course materials from the Open University in Britain. Try the "Browse topics" link at the top of the page to find materials. Most material is under a CC by-nc-sa 2.0 license but special restrictions may apply to specific materials (for instance photographs in units may be labeled with copyright information applying to the owner). In some cases, copyright is held by the Open University.
WikiEducator (http://www.wikieducator.org/Main_Page) WikiEducator is a learning repository of free content sponsored by the Commonwealth of Learning (an intergovernmental organization of the Commonwealth Countries). Navigation to resources depends largely on the search engine found in the middle of the left side bar. CC by-sa 3.0 license.
MIT Open Courseware (http://ocw.mit.edu)
MIT Open Courseware is MIT's repository that will eventually include
content from all of its courses, free, online. It has probably received
more press attention than any of the other sites. Search or browse by
topic through the navigation sidebar on the left. RSS feeds of new
courses by discipline are also available. CC by-nc-sa 2.5 license. See
also the legal notice at the bottom of the home page.
The Libre Map Project (http://libremap.org/data/) The Libre Map Project consists of USGS maps that have been digitized and placed online with a CC license. The digitized versions of these maps are released under a CC by-sa license.
USGS (http://www.usgs.gov/) A wide variety of US Geological Survey maps are available online. This site links to many others and requires some time to explore and learn the best ways to use. Public Domain unless otherwise noted.
David Rumsey Collection (http://www.davidrumsey.com) The David Rumsey Collection is a huge collection of historical maps and atlases available for reproduction and remixing for non-commercial purposes under a CC license. The tools on the site will require some effort to learn. CC by-nc-sa license. See the bottom of http://www.davidrumsey.com/index4.html for more detail.
Geocommons is a user-created repository of data sets presented in the
format of heat maps. It is based on Google Maps and can export data to
Google Earth and other programs capable of reading kml files. An
interesting aspect of the site is the ability to easily customize the
variables displayed through a drag-and-drop interface. It is a data
visualization and mashup tool, as well as a repository. Because data is
uploaded by users, be sure to check the sources. CC by-sa license. See http://help.geocommons.com/faq - d4 for more detail.
This category is a little different from the others. It includes guides to openly distributed and usable data, much of it produced by governmental bodies, but it also includes a new category of mashup tools designed to create interactive data visualizations. One of the characteristics of Web 2.0 is a blurring of the line between tools and data. Some of the sites in this section only aggregate publicly available data, while others provide tools for manipulating the data in different ways. Many of the latter also allow users to upload data sets themselves. These sites are useful, but, as with Wikipedia, users should double-check the sources of the data. (See also Geocommons and Google Earth Community in the Geographic section.)
- del.icio.us / judell /publicdata (http://del.icio.us/judell/publicdata for web version or http://del.icio.us/rss/judell/publicdata for rss version) A set of del.icio.us tags by columnist, blogger, and screencaster Jon Udell of Microsoft pointing to public data repositories and news about public data. Because it is a del.icio.us feed, this is not well organized, but many people are contributing links, so it provides a wide range of data. Items range from sources of local crime statistics to current and historical climate data. Udell's descriptions should be assumed to be copyrighted. Most of the data sources are either in the public domain or under forms of government copyright allowing general use.
- Liberate Government (http://wiki.oreillynet.com/foocamp07/index.cgi?LiberateGovernmentInfo) An attempt to catalog sources of public information and tools for manipulating it. Most of the data sources are either in the public domain or under forms of government copyright allowing general use. Some data sources are proprietary.
- Gapminder (http://tools.google.com/gapminder) Gapminder is one of a series of applications that allow users to view graphs (animated in time) of demographics and economic data for countries comparing up to four variables at once. For additional tools that visualize more specialized sets of data, see http://www.gapminder.org. The Gapminder Foundation holds copyright on the content and programming, except where copyright is held by the creators of the data sources. Users are encouraged to link to pages within the site.
- Swivel (http://www.swivel.com or http://www.swivel.com/start/rss for rss feed of featured content) The concept behind Swivel is to provide an online graphing tool that allows users to upload data (or use data already shared by other users) to create interactive graphs, data visualizations, and maps. This is a mixture of data visualization, data mining, and social networking. Because data is uploaded by users, check the sources. Content uploaded to Swivel is governed by a CC Attribution 2.5 license.
- Many Eyes (http://services.alphaworks.ibm.com/manyeyes/home) This site from IBM is similar to Swivel, though with a somewhat different feature set. Unlike Swivel, Many Eyes provides some types of data visualization useful for textual and other forms of analysis. It is highly dependent on Java and sensitive to different Java versions. Because data is uploaded by users, be sure to check the sources. Users uploading content agree to provide IBM with a license to the data and derivative works.
- Science Commons (http://sciencecommons.org) is a spin-off of Creative Commons. They are working to create standards and databases to provide open access to scholarly, scientific data sets.
Anyone posting material to the web or doing remixes and mashups needs to be aware of file formats. Most of the material available from the sites listed in this document is in commonly used file formats, such as PDF, JPEG, MP3, MP4, etc.
For the most part, you can probably rely on software you already have, such as Adobe Reader, iTunes, Windows Media, QuickTime, and Word, to view files you download. The open source programs OpenOffice (cross-platform) and NeoOffice (Macintosh) are excellent at opening a wide variety of proprietary formats. The Macintosh shareware program GraphicConverter is widely available and can open a vast variety of graphic formats.
A few of these sites, such as the wikimedia sites, will tend to use formats that come from the open source/open format community. The ones most frequently seen are Ogg Vorbis and FLAC for audio, Ogg Theora for video, and DJVU for texts. Some will also use more-or-less proprietary formats, such as KML for geographic information. The table below gives some suggestions for players and editing applications that can handle some of these formats that are outside the mainstream.
|Format||Players||Editing applications|
|Ogg Vorbis (audio)||MPlayer, VLC, Songbird||Audacity|
|FLAC (audio)||MPlayer, VLC, Songbird||Audacity|
|Ogg Theora (video)||MPlayer, VLC, Miro||Cinelerra (Unix only)|
|DJVU (texts)||DJVU Browser Plugin from Lizardtech|
|ePub (ebooks)||Stanza, Calibre, Sony eLibrary, various others||Calibre|
|KML (geographic)||Google Earth, Google Maps, NASA World Wind, ArcGIS Explorer, eventually MS Virtual Earth||Google Earth and a text editor. There are also specialized tools, such as GE Graph, available. Geocommons can generate KML files for use in these programs.|
|||Google Earth, NASA World Wind, and ArcGIS Explorer|
|SVG (graphics)||Most recent web browsers, Adobe SVG plugin, Rensis Player from Emia||Inkscape|
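Because KML is plain XML, the "text editor" route mentioned in the table is practical: a file containing a single placemark is only a few lines long. The following is a minimal sketch in Python, using only the standard library, of how such a file might be generated. The function name `placemark_kml` is hypothetical and the coordinates are approximate, chosen only for illustration. KML lists longitude before latitude.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the KML 2.2 standard.
KML_NS = "http://www.opengis.net/kml/2.2"


def placemark_kml(name, lon, lat):
    """Return a minimal KML document containing one named point."""
    # Serialize with KML as the default namespace (no prefix on tags).
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    # KML coordinate order is longitude,latitude.
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Approximate coordinates, for illustration only.
print(placemark_kml("Pearl Harbor", -157.95, 21.36))
```

Saving the printed text to a file with a .kml extension should give something that the players listed above can open, and that can then be annotated further by hand.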
Copyright and Intellectual Property are far from being the same thing; the former is a subset of the latter. The relationship between the two has been dynamic ever since the earliest attempts at licensing and copyrighting content appeared in Italy around 1500. The "information wants to be free" attitudes of the early internet have come into conflict with corporate lobbying over the past decade for increasingly restrictive copyright and intellectual property legislation. The result has been a series of important lawsuits, some legislation (the TEACH Act being the most notable for universities), and a great deal of activism. Perhaps the most important outcome of that activism has been the creation of "licenses" for content that allow content creators and publishers to modify the terms of copyright for their works. For the most part, these licenses open the work up to a wider variety of uses; however, some licenses, which may include DRM for audio and video, are more restrictive. It is important to understand the license of any work you use.
If a work is in the Public Domain, it may be used in any way desired; however, it must still be credited to avoid plagiarism. Note that most US Government publications are in the public domain, but some created under contract may not be. Materials from the Library of Congress are not necessarily in the public domain, as the Library makes some works available that are still under copyright. Works created by foreign governments are not necessarily in the public domain. The United Kingdom releases most of its material under Crown Copyright. This is only marginally more restrictive than the US policy, but does require accurate reproduction, that the work not be "used in a misleading context," that the source be identified, and that the Crown Copyright be acknowledged. (See: http://www.opsi.gov.uk/about/copyright-notice.htm.)
The GNU General Public License (GPL) and GNU Free Documentation License (GFDL) require that derivative works ensure freedom of copying, redistribution, and modification. The derivative may not contain any DRM technology, and a copy of the license must accompany it. The GPL is intended more for use with software, but is sometimes seen applied to content. For the GFDL, see http://www.gnu.org/copyleft/fdl.html. For the GPL, see http://www.gnu.org/copyleft/gpl.html.
The GPL and GFDL are largely being replaced for content, as opposed to software, by Creative Commons (CC) licenses. Creative Commons offers a variety of licenses allowing creators and publishers to fine-tune how their works are used. The most common are defined by their attributes (http://creativecommons.org/about/licenses/meet-the-licenses). These are:
- Attribution (by) - allows the user to reuse, redistribute, and modify works as long as attribution is provided. Commercial use may be permitted.
- Share Alike (sa) - requires that any derived work be released under the same license. Commercial use may be permitted.
- No Derivatives (nd) - the work must be used whole, but may be redistributed freely. Commercial use may be permitted.
- Non-Commercial (nc) - No commercial use permitted.
These are typically referred to by their abbreviations. Thus if a work is licensed as "by-nc-sa", you may redistribute, copy, remix, and incorporate the work so long as any distribution carries the same license, provides attribution, and is not used for commercial purposes.
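The way these abbreviations combine can be illustrated with a short Python sketch. The lookup table paraphrases the attribute list above; the names `CC_ATTRIBUTES` and `describe_cc_license` are made up for this example and do not come from Creative Commons. It is an aid to reading license codes, not a substitute for reading the license itself.

```python
# Attribute meanings paraphrased from the list above (illustrative only).
CC_ATTRIBUTES = {
    "by": "attribution must be provided",
    "sa": "derived works must be released under the same license",
    "nd": "no derivatives; the work must be used whole",
    "nc": "no commercial use",
}


def describe_cc_license(code):
    """Expand an abbreviation such as 'by-nc-sa' into its conditions."""
    parts = code.lower().split("-")
    unknown = [part for part in parts if part not in CC_ATTRIBUTES]
    if unknown:
        raise ValueError(f"unrecognized attribute(s): {', '.join(unknown)}")
    return {
        "conditions": [CC_ATTRIBUTES[part] for part in parts],
        # Commercial use may be permitted only when "nc" is absent.
        "commercial_use_allowed": "nc" not in parts,
        # Remixing may be permitted only when "nd" is absent.
        "derivatives_allowed": "nd" not in parts,
    }


for condition in describe_cc_license("by-nc-sa")["conditions"]:
    print("-", condition)
```

Under these assumptions, "by-nc-sa" expands to three conditions, with commercial use disallowed and derivatives allowed, which matches the reading given in the paragraph above.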
A variety of other, more specialized licenses are offered, including ones that make the Public Domain status of a work explicit, a Developing Nations license (allowing free use in developing nations, but retaining regular copyright in the rest of the world), and Founder's Copyright, which allows you to specify that the work is released under the copyright rules as they were framed in 1790 (14 years with one renewal of 14 years).
Works using CC Licenses will normally display some variation of the CC logo.
For more explanation of Creative Commons, see "Lawrence Lessig Explains Creative Commons Licensing" on YouTube. (http://www.youtube.com/watch?v=AWxyx5iYdvI)
Where the copyright or licensing information for a work appears will vary from site to site. It is likely to be at the bottom of a page or, if there are several images, videos, or sound recordings inline on one page, below each one. Sites using a multi-column layout are more likely to place it in the right-hand column. They may use words, symbols (sometimes quite small, as in the case of Flickr), or words and symbols. The major exception to this (and given the size of the archive it is a major exception) is the Internet Archive, which places the information on copyright and licenses prominently in the left-hand column, directly below the download links. Some popular sites, such as YouTube, do not have any consistent way of displaying licenses, so descriptions and tags must be read very carefully.
If the site provides no licensing or copyright information, nor any explicit written or recorded disclaimer, then you must assume that the work is protected by standard copyright and act accordingly. In some cases, Fair Use or the TEACH Act may apply.
When developing content with University resources, the copyright usually belongs to the Curators of the University of Missouri and the faculty member jointly. Therefore, it is possible that the inclusion of materials released under one of these open-content licenses may impinge upon the University's copyright rules for materials produced by faculty and staff. To the best of our knowledge these issues have not been defined. If this is a potential issue for your work, we recommend you contact your Campus Technology Transfer Office (http://otsp.missouri.edu/about/offices.asp).
Disclaimer: The content in this document is meant for informational purposes only. The author is not a lawyer and is not giving legal advice.