Project SRC: Time Machine
Partner Choice Due Friday, April 5th at 11:59 PM (EDT)
Part 1 Due Friday, April 12th at 6:00 PM (EDT)
All Parts Due Friday, April 19th at 6:00 PM (EDT)
Introduction
Each of us produces immense amounts of digital data today, and we all use software in doing so. In particular, if you use a note-taking app, that app might store your notes in its own proprietary file format, and the app itself will assume that it runs on a current operating system (e.g., iOS, Android, macOS, or Windows) and widely-used hardware architecture (e.g., x86-64).
But what happens if you, a few decades from now, decide to look at your old CS 300 notes to relive your college days? The app may be long since defunct, your laptop or smartphone from your college days may have moved on to greener pastures, and you’ll probably be using devices that run a different operating system on a different hardware architecture (e.g., ARM64 or one of its successors).
If your college notes aren’t exciting enough, consider NASA’s lost Apollo guidance software and their hunt for old processors on eBay: despite enabling some of humanity’s greatest technological achievements, NASA faced difficulty maintaining their space shuttles with antiquated software.
Nowadays, there is a massive amount of open source software that is critical for our increasingly digital society, and much of its source code is stored on GitHub (e.g. Linux, Docker, and Python, just to name a few). To safeguard the world's open source software for future generations, GitHub launched the GitHub Arctic Code Vault in 2020. This project preserves a significant portion of GitHub repositories by storing them in an archive inside an abandoned coal mine deep in the Arctic Circle, on the Norwegian archipelago of Svalbard. (If a repo of yours was public and had at least 1 star and 1 commit on 02/02/2020, it exists on a reel of film deep underground in Svalbard!)
The preservation of digital artifacts and the software needed to access them is already an acute challenge, and will grow in importance as more critical infrastructure comes to depend on software. In this assignment, you’ll explore some of the motivations behind software preservation, and do a hands-on exploration of the challenges of software preservation.
Partners: 🚨 Action Required 🚨
You will choose a partner for part 3 of the project. Alternatively, if you would prefer to be randomly assigned a partner, we will pair you with a classmate. To help us determine groups, everyone should fill out this form by 11:59pm on Friday, April 5! Within the form, you will either write down your chosen partner's CS login, or opt into random assignment.
Learning Objectives and Answer Expectations
This project will help you:
- Understand how maintaining old software is critical to digital data preservation.
- Develop hands-on experience with software emulators used to keep decades-old applications running on modern computers.
- Reason about the societal context of software preservation: who should engage in it, how we create the right incentive structures to preserve our digital heritage, and whether we should even preserve everything.
Answer Expectations: Strong answers to the written questions in this assignment will be characterized by the quality of your arguments. You can ensure you get full credit by making explicit claims and supporting them with evidence along clear lines of reasoning.
Partner Expectations: You will work with a partner for the written section of this assignment only. Please note that you should still submit your project individually! Feel free to discuss any of the written questions with your partner. However, you should only submit the same response for questions that you have been instructed to complete together.
Assignment installation
Ensure that your project repository has a handout remote. Type:
$ git remote show handout
If this reports an error, run:
$ git remote add handout https://github.com/csci0300/cs300-s24-projects.git
Then run:
$ git pull
$ git pull handout main
This will merge our timemachine folder with your repository.
Once you have a local working copy of the repository that is up to date with our stencils, you are good to proceed. You'll be handing in your work for this project to the timemachine directory in the working copy of your projects repository.
Part 1: Reading Old Files
You, a budding digital preservationist, start browsing digital collections (as one does), and stumble upon an archive with an interesting file:
Eager to make your mark on history, you take a peek at what it is… and are met with almost complete gibberish. What’s going on?
The problem you’re facing is one that is all too common in the digital preservation world. Until recently, much of the focus has been on data preservation; but preserving data alone can often miss key components of the environment that are necessary to interpret the data meaningfully. Be it notes or 3D modeling, data formats are often integrated with the software used to create, view, and manage them, rendering them useless without this necessary software. Digital preservationists have increased their focus on software to address this problem.
After some research, you discover that the file is actually a WordPerfect file. Since the file doesn't seem to be working with your current software, you deduce that it's from an earlier time: the DOS era (mid-1980s to mid-1990s).
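One way to go from "gibberish" to "this is a WordPerfect file" is to look at the first few bytes of the file. Files written by WordPerfect 5.x and later generally begin with the byte 0xFF followed by the ASCII letters "WPC". The sketch below checks only that four-byte signature; the function names are made up for this illustration, and a real format identifier would inspect more of the header.

```python
# Signature commonly found at the start of WordPerfect 5.x+ documents:
# the byte 0xFF followed by ASCII "WPC". This check is a heuristic only.
WP_MAGIC = b"\xffWPC"


def looks_like_wordperfect(data: bytes) -> bool:
    """Return True if the data starts with the WordPerfect signature."""
    return data[:4] == WP_MAGIC


def sniff_file(path: str) -> bool:
    """Read the first four bytes of a file and test them (hypothetical helper)."""
    with open(path, "rb") as f:
        return looks_like_wordperfect(f.read(4))
```

For instance, `sniff_file("CS300REF.WPD")` would report whether that file carries the signature, assuming the file is in the current directory.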
Assignment
How do we run a program that was written for MS-DOS in the 1980s on today's computers? First of all, we need a processor and hardware that can run the program. Fortunately, WordPerfect ran on x86 machines (amongst other architectures in use at the time), so as long as we can get a copy of a WordPerfect executable, we should be good, right? Not so fast.
The WordPerfect executable, like all programs, makes assumptions about the syscalls available and the kernel that runs underneath. So, we also need a DOS kernel and the ability to run it on modern hardware! In this instance, we'll emulate a DOS kernel, rather than actually installing an ancient operating system on your computer.
Emulators are one of the most common methods of interacting with legacy software. Essentially, an emulator recreates the original environment needed by the software, allowing it to run on a modern computer. In our case, we will use DOSBox to emulate a DOS environment. This will give us a platform to run WordPerfect and open our file.
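To make the idea of "emulating a DOS kernel" more concrete, here is a toy sketch. DOS programs request services from the kernel through the INT 21h interface, selecting a function with the AH register (for example, 02h writes a character, 09h writes a '$'-terminated string, and 4Ch exits). The class below answers those three requests using the host's own facilities. It is an illustration under heavy simplification, not how DOSBox is implemented: the `ToyDosKernel` class and the `text` parameter are invented for this example (real DOS receives a pointer to the string in DS:DX).

```python
import io


class ToyDosKernel:
    """Toy stand-in for a DOS kernel: handles a few INT 21h-style services."""

    def __init__(self, out=None):
        # Output is captured in memory so the example is easy to inspect.
        self.out = out if out is not None else io.StringIO()
        self.exit_code = None

    def int21(self, ah, dl=0, al=0, text=""):
        if ah == 0x02:      # write the character whose code is in DL
            self.out.write(chr(dl))
        elif ah == 0x09:    # write a string up to (not including) '$'
            self.out.write(text.split("$", 1)[0])
        elif ah == 0x4C:    # terminate with the return code in AL
            self.exit_code = al
        else:
            raise NotImplementedError(f"INT 21h function {ah:#04x}")


def legacy_program(kernel):
    """A pretend 1980s program that only knows how to talk to 'DOS'."""
    kernel.int21(0x09, text="Hello from 1989$")
    kernel.int21(0x02, dl=ord("!"))
    kernel.int21(0x4C, al=0)
```

The pretend program never touches the host operating system directly; it only issues requests that the emulated kernel translates, which is the property that lets old software run unmodified.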
Note: For this, and all following installation steps, you should not use the course container.
Download DOSBox
- Install DOSBox on your local machine: follow the installation instructions for your operating system (Mac OS X, Windows).
(If you want the full retro experience, press Alt-Enter (Opt-Enter on Mac) to enable fullscreen when running DOSBox.)
- Create a folder on your host computer (e.g. on Mac, ~/DOSBOX), then start DOSBox. You’ll see an MS-DOS command prompt (Z:\>). Mount the folder you created as the C: drive with Z:\> MOUNT C <FULL-PATH-TO-FOLDER> inside DOSBox. (For instance, Z:\> MOUNT C ~/cs300/DOSBOX.) Note that this mounting functionality is something the emulator provides; it wasn't available in the original DOS running on a physical computer.
- Now, any file within that folder will automatically appear in the DOSBox emulated environment. Switch to the C: drive with the command Z:\> C:.
- DOSBox rebinds the Ctrl-Fn shortcuts to special actions, such as Screenshot, Record Video/Audio, etc. These are helpful for DOSBox’s configurations, but unfortunately, they interfere with the WordPerfect program. You will need to remap all of the special actions so that you can use the Ctrl-Fn buttons freely. To start, open the DOSBox Keymapper with Ctrl-F1 (or Ctrl-Shift-F1 / Ctrl-Cmd-F1).
You should now see a screen like this!
Apple note
Macs have function keys disabled by default.
There are two ways to use the function keys with DOSBox:
- In System Settings, search for "function keys". You should see a screen like this:
Flip the switch on.
- Every time you want to use a function key, also hold down the fn key. (If you are on a Mac without physical function keys, you will have to hold down the fn key regardless.)
Help, my KeyMapper is not showing up!
Alternatively: Help, I messed up my KeyMapper!
- If you cannot open the KeyMapper using any of the three commands listed in Step 4 (and you have read through the previous box if on Mac), you likely have a problem with your KeyMapper. If you did manage to get to the KeyMapper, but changed something and now can't get back, this will also be helpful!
- We will be resetting the KeyMapper to its default values. Please follow the instructions that correspond to your OS.
- IF YOU ARE ON MAC: On your terminal, cd into the .dmg file containing your download of DOSBox. Then, type in cd dosbox.app, then cd Contents, and then cd MacOS. From here, you can reset your KeyMapper by entering ./DOSBox -resetmapper.
- IF YOU ARE ON WINDOWS: Locate the folder where DOSBox is installed (typically C:\Program Files (x86)\DOSBox-0.74-3). Then, run the Reset KeyMapper.bat script by double-clicking it.
- Now, your KeyMapper should be reset! Go ahead and try the three commands in Step 4 again.
- One by one, select each of the items in the bottom-right table (containing ShutDown, Cap Mouse, Fullscreen, etc.). For each item, except for Fullscreen and Mapper, you should ensure that both "mod1" and "mod2" are selected. Refer to the image below for an example of what this looks like.
Remapping Example
- Save and exit out of the Keymapper; your keys should behave normally now.
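The MOUNT step in the list above works because the emulator translates every DOS path into a path on your host computer. The sketch below shows one way such a translation could look; it is a guess at the general idea, not DOSBox's actual code. The `dos_to_host` function and the example host folder are hypothetical, and the sketch ignores details such as case-insensitive file names and ".." components that could escape the mounted folder.

```python
from pathlib import PurePosixPath


def dos_to_host(mounts: dict, dos_path: str) -> PurePosixPath:
    """Map a DOS path such as C:\\WP51\\WP.EXE onto a host folder.

    `mounts` maps a drive letter to a host directory, mimicking what
    `MOUNT C <folder>` records inside the emulator.
    """
    drive, sep, rest = dos_path.partition(":")
    if not sep or len(drive) != 1:
        raise ValueError(f"expected a path like C:\\DIR\\FILE, got {dos_path!r}")
    root = mounts[drive.upper()]  # raises KeyError if the drive is not mounted
    parts = [p for p in rest.split("\\") if p]  # drop empty components
    return PurePosixPath(root, *parts)


# Assumed example mount; substitute the folder you created.
mounts = {"C": "/home/me/DOSBOX"}
host_path = dos_to_host(mounts, "C:\\WP51\\WP.EXE")
```

Because the mapping is computed on every access, files you copy into the host folder can show up inside the emulated C: drive without being copied into a disk image first.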
Download WordPerfect
- Time to download WordPerfect 5.1! Since WordPerfect was developed in 1979, it would normally be installed from a physical floppy disk. Nowadays, we can cut some corners and download the software online as a virtual floppy disk "image" file (.img). Unfortunately, it's a bit tricky to interact with this file type, requiring yet more software. So, instead, we're providing you with the pre-extracted installation files.
- Make a directory titled "WP" inside your mounted DOSBOX folder, and copy all of the installation files to that directory. Additionally, move the secret-files you downloaded earlier into your DOSBOX (not your DOSBOX/WP) folder.
- DOSBOX does not reflect changes to your mounted folder immediately in some situations. Restart DOSBOX, mounting the folder again.
- Inside DOSBOX, navigate to your WP directory, and run INSTALL. This will start the WordPerfect installation process. Press y or <Enter> when prompted. You only need to install the core program and the help files. Make sure you respond y to features described as essential (including help/utility files), and n to all questions about installing extra features (printer drivers, graphics, etc).
- During installation, a new directory called WP51 should be created in your DOSBOX folder. This is your WordPerfect directory! Copy CS300REF.WPD into this folder.
Success! WordPerfect should now be installed.
- Within DOSBOX, change into your WordPerfect directory, WP51, using cd WP51 from within your mounted folder. You can run WordPerfect on a file by typing WP <file-name> at the C:\WP51> prompt.
- Familiarize yourself with WordPerfect’s interface. Then, complete the tasks listed in the CS300REF.WPD document.
Your instructions are inside the document. You should copy any files produced into the timemachine directory of your CS 300 projects repo. Enjoy!
Reflection
Wahoo! We did it; the old documents were saved from disappearing forever into the void, and you now have a working DOS environment to experiment with more legacy software. (If you’re interested, websites like DOSGames and WinWorld contain numerous DOS-era software and games; try one out, and experience what technology felt like in the 90s.) Now, if someone discovers a stash of George R. R. Martin’s unfinished novels, you’ll be adequately equipped to handle his WordStar files.
The process you went through, albeit very simplified, is a real problem digital archivists face every day when handling decades-old data formats. As we explore additional challenges to and alternative methods of software preservation, keep in mind some of the difficulties you faced while setting up preservation software and interacting with legacy applications.
Part 2: Personal Throwback
Pictures of young Malte at the computer!
As we all know, technology moves fast! Within just this year, an immeasurable amount of digital content has been lost. Some of that lost content has been archived, rewritten, or emulated, and some has been lost to the void. Even beyond "content," the way we interact with technology is constantly changing! (How many of you still use a physical alarm clock?) While we all like to reminisce about our pasts at some points, it may be difficult or even impossible to relive those past experiences for any number of reasons.
One such example for Nick and Malte is LAN Parties. In the late 90s to early 2000s, groups of people – often students – would pack up their home desktop computers or game consoles and gather together to set up local area networks (LANs) and play multiplayer video games. Playing games over the internet wasn't feasible yet, and because there was no ubiquitous WiFi infrastructure, they had to physically connect their computers to each other with ethernet cables via switches and routers. Because of the substantial overhead involved, people would get together and play video games for many hours or even days, hunkering down with snacks and energy drinks. But trying to recreate that experience has proven difficult. There is something lost in bringing a laptop and connecting to WiFi to play games together.
In this part of the assignment, we ask you to think back on experiences with technology from your childhood that may have become inaccessible.
You can consider a broad definition of "experiences": everything from an event like a LAN party to the look and feel of applications, lost websites, lost games, or anything else is fair game. In your weekly section, you will be given the chance to brainstorm experiences with a group, one of which you will explore further for the remainder of part 2.
Help, I can't think of an experience!
In your in-person section for week 6 (Virtual Memory and Pagetables) you will have the opportunity to brainstorm lost tech experiences with your classmates! We recommend that you wait until you've talked with others before beginning this part of the assignment.
Part 3 is independent of this discussion, so you can skip forward to it if you would like to continue working.
Task:
Try to recreate a technology experience from your past!
Broadly speaking, you have two options here:
- Recreate an experience yourself and explore what it takes to recreate that experience (e.g. hosting a LAN party with modern technology)
- Explore how others have preserved an experience that was meaningful to you (examples below)
If the experience is still available in its original form, we ask that you find another experience to write about.
(Note that while some experiences may appear unchanged, they may still be preserved in some way and thus are valid for this assignment! See: Adobe Flash Player Games)
What does an attempt to recreate an experience look like?
For examples you think of, explore the internet to see if these experiences are still accessible, if they have been emulated or archived elsewhere, or if they appear to no longer be available. Some ways to start could include performing a simple Google search or using the Wayback Machine to explore the Internet Archive.
Example 1: (Limited/No preservation)
Napster was a peer-to-peer (P2P) file-sharing service founded in 1999 that enabled users to share music as so-called "MP3" files with each other. It became extremely popular for allowing easy, free access to a vast collection of music but faced significant legal challenges due to copyright infringement, leading to its shutdown in 2001. There currently exists a music streaming platform that has the same name and branding as the original Napster but is not connected to it in any other way. There does not appear to be a rewritten version of Napster available online, and archived old versions of Napster available for download on the web are potentially unsafe and probably incompatible with modern operating systems.
If you were writing about Napster, we would be looking for you to find information like the above to determine why you cannot (at least safely) recreate your experience. You would then reflect on how Napster could have been preserved and the resources required to realize that preservation.
Example 2: (Moderate preservation)
Friendster was a popular social network in the early 2000s. Facing decline in June 2011, the company pivoted to being a social gaming site and discontinued its social network accounts. Friendster ultimately shut down all its services in June 2015. However, old snapshots of Friendster's homepage are accessible on the Internet Archive! See this snapshot from June 2007.
If you were writing about Friendster, we would be looking for you to discover this information about it and spend some time looking around at its modern incarnations as well as its preserved state (in this case, the Internet Archive). The site currently features a "Get early access" button and it looks like there may be the beginnings of an attempt to revive the service under the domain friendster.click as well.
On the Internet Archive, many features are not available. You cannot, for example, sign into an account, see specific friends, etc. We would want to see you determine what old features are and are not available in this archived form.
Example 3: (Extensive preservation)
Toontown Online was an early 2000s massively-multiplayer online role playing game published by the Walt Disney Company where players would play as cartoon animal avatars (Toons) and battle against corporate robots (Cogs) using whimsical slapstick-comedy-esque weaponry. Its servers were shut down in 2013. Since its closure, Toontown has seen significant preservation efforts through projects such as Toontown Rewritten, a fan-run, free-to-play revival of the game. Using original assets and publicly available material from Toontown Online, the game has been restored in private servers with near-identical functioning to its original state. Many of these servers go beyond the preservation of Toontown Online to include new features, expanded storylines, etc. For example, Toontown Rewritten includes new playable deer and crocodile characters on top of many other additions. Efforts have also been taken to preserve the history of Toontown Online beyond the playable game through the Toontown Preservation Project, a digital museum of old Toontown material such as artwork from game development or promotional materials.
If you were writing about Toontown, we would look for you to find information on revival efforts such as the ones named here and play the game to see how it differs from your original experience with it. If you were not comfortable downloading the game, you may also look to re-experience it second-hand, e.g., by watching someone play Toontown Rewritten on YouTube or reading blogs discussing someone's experience with it. Ideally, we would really like you to gain first-hand experience, but we also want to make sure you are safe in completing this assignment!
NOTE: DO NOT DOWNLOAD FILES FROM SHADY WEBSITES
Err on the side of caution to avoid malware. If you find an archived file of an old software application on the internet but are unsure if it's safe to download, you do not need to download it for this assignment.
Task: Answer the following questions in the file timemachine/README.md.
Q1. What experience are you focusing on for the following questions?
Q2. What difficulties did you have recreating your experience? (1-2 paragraphs)
Q3. Answer the applicable question (1-2 paragraphs):
- If you were able to recreate your experience: How did that experience differ from your original interactions with it (if at all)? What efforts have been taken/did you take to preserve it?
- If you were not able to recreate your experience: What kind of resources might be required to make it accessible (e.g. servers, personnel to manage a service, emulators, etc.)?
Q4. Do you think there is substantial benefit to the preservation of your experience? Do you think the costs for its preservation are warranted? (1-2 paragraphs)
Part 3: Social Context (Partner)
Answer Expectations
Partner Expectations
Society has plenty of experience with preserving valuable historical artifacts outside of the digital realm: whether art, architecture, film, or archaeological conservation, the importance and process of preserving our cultural heritage is well established. But preservation in the digital context is less well understood.
Task: First, familiarize yourselves with approaches to preserving our cultural heritage.
- Read this article about ethical considerations surrounding preservation; then, read this case study about modern historical preservation techniques, priorities, and cultural and financial shortcomings.
Next, consider the efforts and cultural implications in the digital domain.
- Read pages 12-22 of this Library of Congress report (An Executable Past: The Case for a National Software Registry). As you’re reading, consider the justifications for and approaches to software preservation, and compare them with traditional notions of preservation in other contexts.
In the assignment, you explored a common way emulation assists software preservation; while not the only option, emulation has seen the most success as a preservation tool. As interest grows and we continue to improve our toolbox, our ability to tackle technical challenges as a software preservation community increases. However, technical difficulties are just one aspect to consider when dealing with preservation.
Representation
As we reckon with our digital legacy, it’s important to consider what we choose to represent.
Task: With your partner, discuss the following and write up some key points from your conversation (again, in timemachine/README.md).
Q5: How does software preservation compare to other forms of preservation that society already engages in today? What standards should we apply, and how does this compare to the standards used for other types of preservation? For example, you could consider the preservation of art or architecture in your answer.
Legality
Another challenge in software preservation is legality. If you look into it, WordPerfect was commercial software at the time when it was produced; to use it, you must obtain a license and/or purchase the software. Yet, preservation efforts have made it and programs like it freely available on the internet. Digital archivists who host such artifacts face legal uncertainties every day.
Archivists frequently operate with abandonware — software that, while technically still proprietary and protected by copyright, has been ignored by a potentially defunct manufacturer. Some manufacturers (if they’re still around) actively help abandonware sites, or at least tolerate them. However, this is not always the case; some manufacturers, like Microsoft or Nintendo, have pursued legal challenges against digital archives, which resulted in some major sites shutting down.
Task: For the following question, coordinate with your partner to take on opposing, or at least conflicting views. Then, respond individually in timemachine/README.md. When you are done, come together and discuss your responses! Together, write up 3-4 bullet points from your conversation (also in timemachine/README.md). If you are able to come to a consensus, make sure to include your conclusion and explain how you reached it. If not, explain where your positions clashed.
Q6: Should digital libraries and preservationists receive legal protections, and if so, what might this look like? When supporting your answer, be sure to consider:
- WordPerfect 5.1 for DOS can cost hundreds of dollars, and other preserved materials like Atari ROM images (see optional section below) can be similarly expensive. Do you have an obligation to pay for legacy software and games?
- Where should we draw the line between piracy and preservation in efforts to keep legacy software available? What criteria might we draw on to distinguish the two?
Tying it Together
So far, companies and government institutions have been rather uninvolved in software preservation efforts. As software proliferates and grows in complexity, and more of it moves to web services that have proprietary server-side code, it will become increasingly difficult for individuals and non-profit organizations to preserve digital artifacts on their own.
Maintaining legacy software is an expensive, laborious, and ever-growing task, and the financial incentive for effective preservation is not always high. Furthermore, thoroughly maintaining all of our old software requires substantial use of online resources and programming ability. As such, it is worth discussing the true importance of software preservation, weighed against these high costs.
One way to look at this conversation is to define where software preservation falls between public good and expensive taste. In Michael J. Rushton's paper on Expensive Tastes and Public Funding for the Arts, he defines expensive taste as:
"…those held by a person who, compared with the general population, in order to achieve a given level of welfare, needs to have available for consumption a good (or a few goods) that is only available at a high price. Suppose, for example, George only enjoys an art form that is expensive to experience, when most of the population is satisfied by cultural offerings more cheaply obtained, and further assume that this art deeply matters to George in terms of his wellbeing and capability for enjoying a fully satisfying life."
Meanwhile, public goods are commodities that benefit all citizens, and should therefore be made publicly available. Services that qualify vary by country, and might include public education, national defense, and healthcare. Furthermore, once an item is a public good, it will be:
"… made available to all members of a society. Typically, these services are administered by governments and paid for collectively through taxation. …The two main criteria that distinguish a public good are that it must be non-rivalrous and non-excludable. Non-rivalrous means that the goods do not dwindle in supply as more people consume them; non-excludability means that the good is available to all citizens." (Investopedia)
Task: As with Question 6, coordinate with your partner to take on opposing, or at least conflicting views. After you respond individually in timemachine/README.md, come together and share what you wrote! You should then write a few sentences responding to your partner's position, and include that in your README.md as well.
Q7: Is legacy software an expensive taste or a public good? Should this impact our approach to software preservation? Be sure to explain your reasoning.
Congratulations! 🎉 You’ve completed the SRC project for CS 300. We hope you’ve developed an appreciation for the difficulty and importance of software preservation, as well as some common techniques and considerations, and how they relate to the technical operating systems and hardware topics we discuss in the course.
Handing In
Please hand in the files and answers for this assignment via Git in your cs300-s24-projects-YOURNAME repository. Put your answers into the README.md file in the timemachine/ subdirectory of your project repository, and also put all other files from this assignment into that directory. (You should have two .txt files and README.md.)
By 6:00 PM on Friday, April 19th, you must have filled in the file README.md in the timemachine directory in your projects repo, and pushed the files produced by Part 1 of this assignment.
Grading breakdown
This assignment is worth 3% of your total course grade.
THIS SECTION IS NOT REQUIRED
THERE IS NO EXTRA CREDIT ASSOCIATED WITH IT
That said, if you are interested in hardware emulation or Atari Games, you may be interested in this section
The emulator can be quite buggy and difficult to work with (think about what this means for hardware emulation as a whole). Because this section is not officially part of the assignment, course staff will likely NOT be available to help resolve issues.
In Part 1, we explored a software-only emulator, DOSBox. While DOSBox provides a DOS kernel emulation on an x86 machine, it is certainly possible to boot up DOS on modern machines, as DOS environments (e.g., MS-DOS or IBM PC DOS) are compatible with today's Intel x86 CPUs.
However, it is not always so straightforward. For example, Library of Congress archivists who worked to preserve the digital data of National Medal of Science laureate Nina Fedoroff faced some serious challenges: her data was created with the MacDraw Plus and HyperCard programs, which require an Apple Mac OS 9 environment. Macs that ran Mac OS 9 used the PowerPC instruction set, and software for them is incompatible with modern x86 or ARM64 computers. Indeed, with new hardware like Apple’s M1 chips and the accompanying transition to the ARM64 architecture, it is entirely possible that in a few years or decades, software applications written for our current x86 architectures will be incompatible with the hardware of the day. Apple silicon users are already acutely aware of this issue.
Of course, one option to access applications for specific architectures is to purchase a physical retro-computer; for instance, old PowerPC Macintoshes can be found on eBay. However, this is clearly neither durable nor feasible; physical hardware eventually breaks, and rare retro-hardware may be difficult for digital archivists to access (even NASA has had difficulty). What if, as with software, we could achieve some sort of emulation, but of different hardware components (i.e., different CPU architectures) instead?
Enter hardware emulation. In this part, we’ll explore one such example: Atari 2600! Released in 1977, the Atari 2600 quickly dominated the market, becoming synonymous with video games and sparking the growth of the entire industry; following its decline, its games have become favorites in retro gaming communities (and have even found use in a rather surprising modern application: deep reinforcement learning).
Atari Emulation
The Atari 2600 was built around the MOS Technology 6507, a cut-down variant of the 6502 that uses the 6502 instruction set. This CPU family has long since disappeared from mainstream computers; thus, we’ll use the Stella emulator.
Apple M1/ARM64 notes
The Stella emulator for this part of the assignment is not officially compatible with M1 machines, and the download page says that it is "Intel only".
However, our testing on M1 devices suggests that it works just fine. The reason for this is Rosetta 2, Apple's translation layer for running x86-64 executables on Apple silicon. Rosetta 2 translates the machine code of x86-64 executables into ARM64 machine code, either ahead of time or dynamically as they run (though this does come with some slowdown). So, if you have an M1, you're running one translation layer (Rosetta 2) to run an emulator (Stella), and there are no fewer than three architectures involved: ARM64 (hardware), x86-64 (translated by Rosetta 2), and MOS-6502 (emulated by Stella)!
Steps to play Atari Games!
- Download the emulator for your operating system.
- Acquire some ROMs for Atari 2600 systems (Stella provides some guidelines here, or you can choose from this list).
- If you are having trouble starting a game, don't be afraid to check the Stella documentation!
Hardware emulation provides a complete infrastructure for digital preservation, but it is also technically difficult to execute: all technical specifications and digital logic, down to precise clock cycles and analog elements, must be recreated by a hardware emulator. Moreover, hardware is constantly changing.
For example, Apple began switching from Intel processors to Apple Silicon in 2020. If you have an M1 or M2 MacBook, your computer uses this new architecture! This means that many emulators built for Intel hardware cannot run natively on your device and depend on a translation layer such as Rosetta 2.
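To give a feel for what recreating hardware "down to precise clock cycles" involves, here is a minimal Python sketch of an interpreter for four 6502 instructions that also tracks how many clock cycles the real chip would have spent. It is a toy under stated assumptions, not how Stella is implemented: it ignores status flags other than carry, decimal mode, and all of the Atari's graphics and sound hardware, and it treats BRK as "halt". The function name run and the sample program are hypothetical.

```python
# Opcode table for a tiny subset of the 6502 instruction set:
# opcode -> (mnemonic, instruction length in bytes, clock cycles).
OPCODES = {
    0xA9: ("LDA #imm", 2, 2),  # load accumulator with an immediate value
    0x69: ("ADC #imm", 2, 2),  # add immediate value plus carry to accumulator
    0x85: ("STA zp", 2, 3),    # store accumulator into a zero-page address
    0x00: ("BRK", 1, 7),       # treated here as "halt" (a simplification)
}


def run(program: bytes, memory_size: int = 256):
    """Interpret `program` and return (accumulator, memory, total_cycles)."""
    memory = bytearray(memory_size)
    a = 0       # accumulator register
    carry = 0   # carry flag
    pc = 0      # program counter (index into `program`)
    cycles = 0  # emulated clock cycles consumed so far

    while pc < len(program):
        opcode = program[pc]
        if opcode not in OPCODES:
            raise ValueError(f"unsupported opcode {opcode:#04x} at {pc}")
        _, length, cost = OPCODES[opcode]
        operand = program[pc + 1] if length == 2 else None
        pc += length
        cycles += cost

        if opcode == 0xA9:
            a = operand
        elif opcode == 0x69:
            total = a + operand + carry
            carry = 1 if total > 0xFF else 0
            a = total & 0xFF  # registers are 8 bits wide
        elif opcode == 0x85:
            memory[operand] = a
        elif opcode == 0x00:
            break

    return a, memory, cycles


if __name__ == "__main__":
    # LDA #$05 ; ADC #$03 ; STA $10 ; BRK
    program = bytes([0xA9, 0x05, 0x69, 0x03, 0x85, 0x10, 0x00])
    a, memory, cycles = run(program)
    print(a, memory[0x10], cycles)  # prints: 8 8 14
```

Even this toy has to carry a per-instruction cycle count; a real console emulator must additionally keep the CPU in step with the video and audio chips, which is where much of the difficulty lies.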
This assignment was created for CS 300.
Partners: 🚨 Action Required 🚨
You will choose a partner for part 3 of the project. Alternatively, if you would prefer to be randomly assigned a partner, we will pair you with a classmate. To help us determine groups, everyone should fill out this form by 11:59pm on Friday, April 5! Within the form, you will either write down your chosen partner's CS login, or opt into random assignment.
Learning Objectives and Answer Expectations
This project will help you:
Assignment installation
Ensure that your project repository has a handout remote. Type:
If this reports an error, run:
Then run:
This will merge our timemachine folder with your repository.
Once you have a local working copy of the repository that is up to date with our stencils, you are good to proceed. You'll be handing in your work for this project to the timemachine directory in the working copy of your projects repository.
Part 1: Reading Old Files
You, a budding digital preservationist, start browsing digital collections (as one does), and stumble upon an archive with an interesting file:
Task: Download the archive.
Eager to make your mark on history, you take a peek at what it is… and are met with almost complete gibberish. What’s going on?
The problem you’re facing is one that is all too common in the digital preservation world. Until recently, much of the focus has been on data preservation; but preserving data alone can often miss key components of the environment that are necessary to interpret the data meaningfully. Be it notes or 3D modeling, data formats are often integrated with the software used to create, view, and manage them, rendering them useless without this necessary software. Digital preservationists have increased their focus on software to address this problem.
After some research, you discover that the file is actually a WordPerfect file. Since the file doesn't seem to be working with your current software, you deduce that it's from an earlier time: the DOS era (mid-1980s to mid-1990s).
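One way to take some of the guesswork out of identifying a mystery file is to look at its first few bytes, which often carry a format signature. The Python sketch below checks for the four-byte signature that WordPerfect 5.x and later documents begin with: the byte 0xFF followed by the ASCII letters WPC. It assumes the file is in one of those versions of the format; the function names looks_like_wordperfect and sniff_file and the sample byte strings are hypothetical and not part of the assignment's tooling.

```python
# Signature found at the start of WordPerfect 5.x and later documents:
# the byte 0xFF followed by the ASCII letters "WPC".
WORDPERFECT_MAGIC = b"\xffWPC"


def looks_like_wordperfect(data: bytes) -> bool:
    """Return True if the given bytes start with the WordPerfect signature."""
    return data.startswith(WORDPERFECT_MAGIC)


def sniff_file(path: str) -> bool:
    """Read only the first few bytes of a file and check the signature."""
    with open(path, "rb") as f:
        return looks_like_wordperfect(f.read(len(WORDPERFECT_MAGIC)))


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the assignment archive.
    print(looks_like_wordperfect(b"\xffWPC\x10\x00\x00\x00"))  # True
    print(looks_like_wordperfect(b"Plain text notes"))          # False
```

A signature only tells you what the file claims to be; you still need software that understands the rest of the format to read it.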
Assignment
How do we run a program that was written for MS-DOS in the 1980s on today's computers? First of all, we need a processor and hardware that can run the program. Fortunately, WordPerfect ran on x86 machines (amongst other architectures in use at the time), so as long as we can get a copy of a WordPerfect executable, we should be good, right? Not so fast.
The WordPerfect executable, like all programs, makes assumptions about the syscalls available and the kernel that runs underneath. So, we also need a DOS kernel and the ability to run it on modern hardware! In this instance, we'll emulate a DOS kernel, rather than actually installing an ancient operating system on your computer.
Emulators are one of the most common methods of interacting with legacy software. Essentially, an emulator recreates the original environment needed by the software, allowing it to run on a modern computer. In our case, we will use DOSBox to emulate a DOS environment. This will give us a platform to run WordPerfect and open our file.
Note: For this, and all following installation steps, you should not use the course container.
Download DOSBox
- Download DOSBox. (Use Alt-Enter (Opt-Enter on Mac) to enable fullscreen when running DOSBox.)
- Create a folder on your computer (e.g., ~/DOSBOX), then start DOSBox. You’ll see an MS-DOS command prompt (Z:\>). Mount the folder you created as the C: drive with Z:\> MOUNT C <FULL-PATH-TO-FOLDER> inside DOSBox. (For instance, Z:\> MOUNT C ~/cs300/DOSBOX.) Note that this mounting functionality is something the emulator provides; it wasn't available in the original DOS running on a physical computer.
- Switch to the C: drive with the command Z:\> C:.
- Open the KeyMapper with Ctrl-F1 (or Ctrl-Shift-F1 / Ctrl-Cmd-F1).
You should now see a screen like this!
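If you find yourself retyping the MOUNT and C: commands every time you start DOSBox, one option is to pass them on the command line. The Python sketch below only builds such a command. It assumes a dosbox executable is on your PATH (on macOS or Windows the binary may live elsewhere) and relies on DOSBox's -c option, which runs a command at startup. The function name build_dosbox_command and the example folder are hypothetical.

```python
import os
from typing import List


def build_dosbox_command(folder: str, executable: str = "dosbox") -> List[str]:
    """Build an argument list that starts DOSBox with `folder` mounted as C:."""
    # Expand "~" and make the path absolute so it does not depend on
    # the directory DOSBox happens to be started from.
    full_path = os.path.abspath(os.path.expanduser(folder))
    return [
        executable,
        "-c", f'MOUNT C "{full_path}"',  # same as typing MOUNT C <path> yourself
        "-c", "C:",                      # then switch to the C: drive
    ]


if __name__ == "__main__":
    # Hypothetical folder; replace it with the folder you created.
    command = build_dosbox_command("~/DOSBOX")
    print(command)
    # To actually launch DOSBox, pass the list to subprocess.run(command)
    # after adding `import subprocess`.
```

Building the argument list as a Python list (rather than one long string) avoids shell-quoting problems if your folder path contains spaces.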
Apple note
Macs have function keys disabled by default.
There are two ways to use the function keys with DOSBox:
- Flip the switch on.
- Hold down the fn key. (If you are on a Mac without physical function keys, you will have to hold down the fn key regardless.)
Help, my KeyMapper is not showing up!
Alternatively: Help, I messed up my KeyMapper!
- On macOS: cd into the .dmg file containing your download of DOSBox. Then, type in cd dosbox.app, then cd Contents, and then cd MacOS. From here, you can reset your KeyMapper by entering ./DOSBox -resetmapper.
- On Windows: open the folder where DOSBox is installed (e.g., C:\Program Files (x86)\DOSBox-0.74-3). Then, run the Reset KeyMapper.bat script by double-clicking it.
Remapping Example
Download WordPerfect
WordPerfect is distributed as a disk image (.img). Unfortunately, it's a bit tricky to interact with this file type, requiring yet more software. So, instead, we're providing you with the pre-extracted installation files.
Task: Download ALL of the installation files.
- Move the secret-files you downloaded earlier into your DOSBOX (not your DOSBOX/WP) folder.
- Run INSTALL. This will start the WordPerfect installation process. Press y or <Enter> when prompted. You only need to install the core program and the help files. Make sure you respond y to features described as essential (including help/utility files), and n to all questions about installing extra features (printer drivers, graphics, etc.).
- Copy CS300REF.WPD into this folder.
Success! WordPerfect should now be installed.
- Change into the WP51 directory, using cd WP51 from within your mounted folder. You can run WordPerfect on a file by typing WP <file-name> at the C:\WP51> prompt.
- Open the CS300REF.WPD document.
Your instructions are inside the document. You should copy any files produced into the timemachine directory of your CS 300 projects repo. Enjoy!
Reflection
Wahoo! We did it; the old documents were saved from disappearing forever into the void, and you now have a working DOS environment to experiment with more legacy software. (If you’re interested, websites like DOSGames and WinWorld host plenty of DOS-era software and games; try one out, and experience what technology felt like in the 90s[1].) Now, if someone discovers a stash of George R. R. Martin’s unfinished novels, you’ll be adequately equipped to handle his WordStar files[2].
The process you went through, albeit very simplified, is a real problem digital archivists face every day when handling decades-old data formats. As we explore additional challenges to and alternative methods of software preservation, keep in mind some of the difficulties you faced while setting up preservation software and interacting with legacy applications.
Part 2: Personal Throwback
Pictures of young Malte at the computer!
As we all know, technology moves fast! Within just this year, an immeasurable amount of digital content has been lost. Some of that content has been archived, rewritten, or emulated, and some has been lost to the void. Even beyond "content," the way we interact with technology is constantly changing! (How many of you still use a physical alarm clock?) While we all like to reminisce about our pasts from time to time, it may be difficult or even impossible to relive those past experiences for any number of reasons.
One such example for Nick and Malte is LAN Parties. In the late 90s to early 2000s, groups of people (often students) would pack up their home desktop computers or game consoles and gather together to set up local area networks (LANs) and play multiplayer video games. Playing games over the internet wasn't feasible yet, and because there was no ubiquitous WiFi infrastructure, they had to physically connect their computers to each other with ethernet cables via switches and routers. Because of the substantial overhead involved, people would get together and play video games for many hours or even days, hunkering down with snacks and energy drinks. But trying to recreate that experience has proven difficult. Something is lost when you simply bring a laptop and connect to WiFi to play games together.
In this part of the assignment, we ask you to think back on experiences with technology from your childhood that may have become inaccessible.
You can consider a broad definition of "experiences": everything from an event like a LAN party to the look and feel of applications, lost websites, lost games, or anything else is fair game. In your weekly section, you will be given the chance to brainstorm experiences with a group, one of which you will explore further for the remainder of part 2.
Help, I can't think of an experience!
In your in-person section for week 6 (Virtual Memory and Pagetables) you will have the opportunity to brainstorm lost tech experiences with your classmates! We recommend that you wait until you've talked with others before beginning this part of the assignment.
Part 3 is independent of this discussion, so you can skip forward to it if you would like to continue working.
Task: Try to recreate a technology experience from your past!
Broadly speaking, you have two options here:
If the experience is still available in its original form, we ask that you find another experience to write about.
(Note that while some experiences may appear unchanged, they may still be preserved in some way and thus are valid for this assignment! See: Adobe Flash Player Games)
What does an attempt to recreate an experience look like?
For each example you think of, explore the internet to see if the experience is still accessible, if it has been emulated or archived elsewhere, or if it appears to no longer be available. Some ways to start could include performing a simple Google search or using the Wayback Machine to explore the Internet Archive.
Example 1: (Limited/No preservation)
Napster was a peer-to-peer (P2P) file-sharing service founded in 1999 that enabled users to share music as so-called "MP3" files with each other. It became extremely popular for allowing easy, free access to a vast collection of music but faced significant legal challenges due to copyright infringement, leading to its shutdown in 2001. There currently exists a music streaming platform that has the same name and branding as the original Napster but is not connected to it in any other way. There does not appear to be a rewritten version of Napster available online, and archived old versions of Napster available for download on the web are potentially unsafe and probably incompatible with modern operating systems.
If you were writing about Napster, we would be looking for you to find information like the above to determine why you cannot (at least safely) recreate your experience. You would then reflect on how Napster could have been preserved and the resources required to realize that preservation.
Example 2: (Moderate preservation)
Friendster was a popular social network in the early 2000s. Facing decline in June 2011, the company pivoted to being a social gaming site and discontinued its social network accounts. Friendster ultimately shut down all its services in June 2015. However, old snapshots of Friendster's homepage are accessible on the Internet Archive! See this snapshot from June 2007.
If you were writing about Friendster, we would be looking for you to discover this information about it and spend some time looking around at its modern incarnations as well as its preserved state (in this case, the Internet Archive). The site currently features a "Get early access" button and it looks like there may be the beginnings of an attempt to revive the service under the domain friendster.click as well.
On the Internet Archive, many features are not available. You cannot, for example, sign into an account, see specific friends, etc. We would want to see you determine what old features are and are not available in this archived form.
Example 3: (Extensive preservation)
Toontown Online was an early 2000s massively-multiplayer online role-playing game published by the Walt Disney Company where players would play as cartoon animal avatars (Toons) and battle against corporate robots (Cogs) using whimsical slapstick-comedy-esque weaponry. Its servers were shut down in 2013. Since its closure, Toontown has seen significant preservation efforts through projects such as Toontown Rewritten, a fan-run, free-to-play revival of the game. Using original assets and publicly available material from Toontown Online, the game has been restored on private servers that function nearly identically to the original. Many of these servers go beyond the preservation of Toontown Online to include new features, expanded storylines, etc. For example, Toontown Rewritten includes new playable deer and crocodile characters on top of many other additions. Efforts have also been made to preserve the history of Toontown Online beyond the playable game through the Toontown Preservation Project, a digital museum of old Toontown material such as artwork from game development or promotional materials.
If you were writing about Toontown, we would look for you to find information on revival efforts such as the ones named here and play the game to see how it differs from your original experience with it. If you were not comfortable downloading the game, you may also look to re-experience it second-hand, e.g., by watching someone play Toontown Rewritten on YouTube or reading blogs discussing someone's experience with it. Ideally, we would really like you to gain first-hand experience, but we also want to make sure you are safe in completing this assignment!
NOTE: DO NOT DOWNLOAD FILES FROM SHADY WEBSITES
Err on the side of caution to avoid malware. If you find an archived file of an old software application on the internet but are unsure if it's safe to download, you do not need to download it for this assignment.
Task: Answer the following questions in the file timemachine/README.md.
Q1. What experience are you focusing on for the following questions?
Q2. What difficulties did you have recreating your experience? (1-2 paragraphs)
Q3. Answer the applicable question (1-2 paragraphs):
Q4. Do you think there is substantial benefit to the preservation of your experience? Do you think the costs for its preservation are warranted? (1-2 paragraphs)
Part 3: Social Context (Partner)
Answer Expectations
Partner Expectations
Society has plenty of experience with preserving valuable historical artifacts outside of the digital realm: whether art, architecture, film, or archaeological conservation, the importance and process of preserving our cultural heritage is well established. But preservation in the digital context is less well understood.
Task: First, familiarize yourselves with approaches to preserving our cultural heritage.
Next, consider the efforts and cultural implications in the digital domain.
In the assignment, you explored a common way emulation assists software preservation; while not the only option, emulation has seen the most success as a preservation tool. As interest grows and we continue to improve our toolbox, our ability to tackle technical challenges as a software preservation community increases. However, technical difficulties are just one aspect to consider when dealing with preservation.
Representation
As we reckon with our digital legacy, it’s important to consider what we choose to represent.
Task: With your partner, discuss the following and write up some key points from your conversation (again, in timemachine/README.md).
Q5: How does software preservation compare to other forms of preservation that society already engages in today? What standards should we apply, and how does this compare to the standards used for other types of preservation? For example, you could consider the preservation of art or architecture in your answer.
Legality
Another challenge in software preservation is legality. If you look into it, WordPerfect was commercial software at the time when it was produced; to use it, you must obtain a license and/or purchase the software. Yet, preservation efforts have made it and programs like it freely available on the internet. Digital archivists who host such artifacts face legal uncertainties every day.
Archivists frequently operate with abandonware — software that, while technically still proprietary and protected by copyright, has been ignored by a potentially defunct manufacturer. Some manufacturers (if they’re still around) actively help abandonware sites, or at least tolerate them. However, this is not always the case; some manufacturers, like Microsoft or Nintendo, have pursued legal challenges against digital archives, which resulted in some major sites shutting down.
Task: For the following question, coordinate with your partner to take on opposing, or at least conflicting views. Then, respond individually in timemachine/README.md. When you are done, come together and discuss your responses! Together, write up 3-4 bullet points from your conversation (also in timemachine/README.md). If you are able to come to a consensus, make sure to include your conclusion and explain how you reached it. If not, explain where your positions clashed.
Q6: Should digital libraries and preservationists receive legal protections, and if so, what might this look like? When supporting your answer, be sure to consider:
Tying it Together
So far, companies and government institutions have been rather uninvolved in software preservation efforts. As software proliferates and grows in complexity, and more of it moves to web services that have proprietary server-side code, it will become increasingly difficult for individuals and non-profit organizations to preserve digital artifacts on their own.
Maintaining legacy software is an expensive, laborious, and ever-growing task, and the financial incentive for effective preservation is not always high. Furthermore, thoroughly maintaining all of our old software requires substantial use of online resources and programming ability. As such, it is worth discussing the true importance of software preservation, weighed against these high costs.
One way to look at this conversation is to define where software preservation falls between public good and expensive taste. In Michael J. Rushton's paper on Expensive Tastes and Public Funding for the Arts, he defines expensive taste as:
Meanwhile, public goods are commodities that benefit all citizens, and should therefore be made publicly available. Services that qualify vary by country, and might include public education, national defense, and healthcare. Furthermore, once an item is a public good, it will be:
Task: As with Question 6, coordinate with your partner to take on opposing, or at least conflicting views. After you respond individually in timemachine/README.md, come together and share what you wrote! You should then write a few sentences responding to your partner's position, and include that in your README.md as well.
Q7: Is legacy software an expensive taste or a public good? Should this impact our approach to software preservation? Be sure to explain your reasoning.
Final Remarks
Congratulations! 🎉 You’ve completed the SRC project for CS 300. We hope you’ve developed an appreciation for the difficulty and importance of software preservation, as well as some common techniques and considerations, and how they relate to the technical operating systems and hardware topics we discuss in the course.
Handing In
Please hand in the files and answers for this assignment via Git in your cs300-s24-projects-YOURNAME repository. Put your answers into the README.md file in the timemachine/ subdirectory of your project repository, and also put all other files from this assignment into that directory. (You should have two .txt files and README.md.)
By 6:00 PM on Friday, April 19th, you must have filled in the file README.md in the timemachine directory in your projects repo, and pushed the files produced by Part 1 of this assignment.
Grading breakdown
This assignment is worth 3% of your total course grade.
Fun Extra Section: Hardware Emulation
THIS SECTION IS NOT REQUIRED
THERE IS NO EXTRA CREDIT ASSOCIATED WITH IT
That said, if you are interested in hardware emulation or Atari games, you may enjoy this section.
The emulator can be quite buggy and difficult to work with (think about what this means for hardware emulation as a whole). Because this section is not officially part of the assignment, course staff will likely NOT be available to help resolve issues.
In Part 1, we explored a purely software emulator, DOSBox. While DOSBox emulates a DOS kernel on an x86 machine, it is certainly possible to boot up DOS on modern machines, as DOS environments (e.g., MS-DOS or IBM PC DOS) are compatible with today's Intel x86 CPUs.
However, it is not always so straightforward. For example, Library of Congress archivists who worked to preserve the digital data of molecular biologist Nina Fedoroff faced some serious challenges: her data was created with the MacDraw Plus and HyperCard programs, which require an Apple Mac OS 9 environment. Mac OS 9 ran on Macs that used the PowerPC instruction set, and software for them is incompatible with modern x86 or ARM64 computers. Indeed, with new hardware like Apple’s M1 chips and the accompanying transition to the ARM64 architecture, it is entirely possible that in a few years or decades, software applications written for our current x86 architectures will be incompatible with the hardware of the day. Apple silicon users are already acutely aware of this issue[3].
[1] One application that we found particularly interesting was Sid Meier’s Civilization 1; even in the 90s, games were truly quite sophisticated!
[2] WordStar was the first word processor that offered textual WYSIWYG functionality; it preceded WordPerfect, and dominated the market until WordPerfect eventually took over.
[3] Indeed, our course Docker container originated in part to support the new M1 Macs; the old virtualization software, VirtualBox, worked only on x86 machines, and was incompatible with ARM.