I'm working with my university press on new models for DH scholarship... and as I started typing up a description of using Jekyll and GitHub for a proposed DH book project, I ended up discussing not just Jekyll and GitHub but also defining static vs. dynamic sites, git and versioning, and GitHub Pages.

Thought I'd share this as a starting point for others! (Caveat emptor: this is one of those emails that turned into a huge email that turned into a blog post.) If you want to be superfancy, you can click this link to annotate the page with questions and requests for clarification, comments, and suggestions using the magic of Hypothesis, and I'll work on refining this into a more exhaustive FAQ as time allows.

What are... dynamic versus static websites?

Dynamic sites (like those built with Drupal or WordPress) pull information from a database to populate a page. When you search for some words on Amazon.com, for example, the search results page you are shown didn’t already exist as a full HTML page: Amazon.com has a template for search results pages with the things they all share, like the main menu and logo, and it queries the database to insert the results of the search you initiated. Jekyll is a “static site generator”: it takes page templates (those things like main menus and footers, shared across all the web pages) and other files with specific information (e.g. a file for each blog post on the site) and combines these into full HTML pages for site visitors to see. In other words, it generates a static site, aka a folder of HTML files, and these pages are already put together and ready to serve when you visit the site. That is, nothing like a database query needs to happen when you visit a page; the pages are already fully formed, and Jekyll only regenerates them when the site is rebuilt after a change. (For someone who took more time to think through a metaphor for static site generation, check out this post!)
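If it helps to see the idea in miniature, here is a rough Python sketch of what a static site generator does. This is not Jekyll's actual code (Jekyll is written in Ruby and does far more); the layout, the chapter names, and the build_site function are all made up for illustration.

```python
from pathlib import Path
from string import Template
import tempfile

# Hypothetical shared layout: the menu and footer every page has in common.
LAYOUT = Template(
    "<html><body>\n"
    "<nav>Home | About</nav>\n"
    "<h1>$title</h1>\n"
    "<p>$body</p>\n"
    "<footer>My DH Book</footer>\n"
    "</body></html>\n"
)

# Hypothetical content; in a real generator each entry would be its own file.
POSTS = {
    "chapter-1": {"title": "Chapter 1", "body": "Text of the first chapter."},
    "chapter-2": {"title": "Chapter 2", "body": "Text of the second chapter."},
}


def build_site(posts, out_dir):
    """Combine the layout with each post and write one HTML file per post."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, post in posts.items():
        page = LAYOUT.substitute(title=post["title"], body=post["body"])
        path = out_dir / f"{slug}.html"
        path.write_text(page, encoding="utf-8")
        written.append(path)
    return written


if __name__ == "__main__":
    # Build into a throwaway folder and list the finished pages.
    site_dir = tempfile.mkdtemp()
    for path in build_site(POSTS, site_dir):
        print(path)
```

The point to notice is that all the work happens once, at build time. What ends up on the web server is just the folder of finished HTML files.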

How is Jekyll like Drupal/WordPress/Omeka/[other CMS]?

Jekyll is like content management systems (CMSs) such as Drupal or WordPress in that it’s a set of code that makes a website run and handles certain repeated tasks like displaying a logo and menubar on every page, creating a searchable archive of blog posts, etc. Unlike CMSs, though, Jekyll doesn’t have a web “dashboard” (that shiny UI you use to administer the website: moderating comments, writing a blog post from within the live site...). And! Jekyll does not use a database (the source of a lot of code and security headaches in CMSs like Drupal). If you don’t need the power and possibilities behind these CMSs, it’s often better for effort, security, and preservation not to use them. Or if you need some of the things that Jekyll particularly offers...

Why/when Jekyll?

Versioning! Security (no database hacking) and less sysadminy hassle in general. Speed of loading pages. Free and easy to set up and host with GitHub Pages, and it's then linked into the GitHub.com ecosystem of code versioning, sharing, and reuse.

Jekyll doesn’t work as well if you have a site like Amazon.com that basically lets visitors create a huge number of custom pages by making complex searches from a huge set of things, since it’s easier for Amazon.com to create search result pages on the fly from their database than to store an HTML page for each possible search result combination. But for a site where you can largely imagine now what all the pages will look like (e.g. a digital book), Jekyll is faster, lighter, and better for long-term security and preservation.
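To get a feel for why pre-building every search page is impractical, here is a back-of-the-envelope Python sketch. The numbers are invented for illustration (a 1,000-word vocabulary and searches of up to three words), and the possible_search_pages function is my own; real search engines are far more complicated than this.

```python
from math import comb


def possible_search_pages(vocabulary_size, max_words):
    """Count distinct unordered word combinations of 1..max_words words."""
    return sum(comb(vocabulary_size, k) for k in range(1, max_words + 1))


if __name__ == "__main__":
    # Hypothetical numbers: 1,000 searchable words, up to 3 words per search.
    print(possible_search_pages(1000, 3))
```

Even with those toy numbers the count runs past 166 million possible pages, versus the few dozen pages a digital book might need.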

GitHub.com, git, and repos

GitHub.com is a site where people share code they’ve written so others can reuse it, build on it, and report bugs and feature requests. It’s based on git, which is software for versioning code: keeping track of the various changes in code over time as people add, subtract, and otherwise edit that text. People use GitHub.com for things other than code including blogging and professional writing, since it supports keeping track of multiple versions of a text and is friendly to collaboratively authored text. (Other sites besides GitHub use git, and you can use git without GitHub.com. GitHub.com is just a wildly popular public platform for sharing and working on versioned code.)
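As a tiny illustration of what "keeping track of changes" means, here is a Python sketch that compares two drafts of a text using the standard library's difflib module. This is not how git works internally; the draft text and the describe_change function are made up for the example.

```python
import difflib

# Two hypothetical drafts of the same short text.
draft_v1 = ["Jekyll is a static site generator.", "It needs a database."]
draft_v2 = ["Jekyll is a static site generator.", "It does not need a database."]


def describe_change(old_lines, new_lines):
    """Return a unified diff: removed lines start with '-', added with '+'."""
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="version 1",
            tofile="version 2",
            lineterm="",
        )
    )


if __name__ == "__main__":
    for line in describe_change(draft_v1, draft_v2):
        print(line)
```

Git records this kind of line-by-line history for every saved change, along with who made it and when, which is what makes it friendly to collaborative writing.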

GitHub “repos” (repositories) are just collections of code; for example, I have one repo with all the code involved in my dissertation, and another repo that shares an old Omeka maintenance-page hack I wrote.

GitHub Pages

GitHub Pages sites can be hosted for free at a GitHub subdomain such as amandavisconti.github.io/SGAPedagogyPage (where amandavisconti is my GitHub username and SGAPedagogyPage is the name of the GitHub repository holding its web files). You can also purchase your own domain name elsewhere and use it with the same site (as I did for LiteratureGeek.com). Custom domain names aren’t permanently bought, so you need to renew your registration from time to time (you can do this once a year or buy a bunch of years at a time; the type of domain name a DHer might want to acquire usually costs around $10-20 per year... except if you want DH.center, they will overcharge you for that!).
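The address pattern for a repository's free GitHub Pages site is simple enough to write as a one-line Python function. The function name pages_url is my own invention, and I've assumed the https:// prefix.

```python
def pages_url(username, repo):
    """Build the default GitHub Pages address for a project repository."""
    return f"https://{username}.github.io/{repo}"


if __name__ == "__main__":
    # Example from this post: my username and one of my repositories.
    print(pages_url("amandavisconti", "SGAPedagogyPage"))
```

A custom domain name replaces this address for visitors, but the files still live in the same repository.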

You don’t need to use GitHub or GitHub Pages to use Jekyll; you could, for example, use commercial or university web hosting. I like GitHub Pages hosting because it’s free, uses git (which manages multiple versions and drafts of collaboratively authored/edited texts nicely), is set up to make Jekyll use easier, and because GitHub is already the most popular place for DH data sharing, code and text versioning, etc.

Can I use these for projects that might have a commercial aspect (e.g. a book series with digital components)?

I think so. Jekyll is free and open-source. Its code can be used for commercial projects (its MIT License file with its attribution to the Jekyll creator should be kept with the rest of the files).

GitHub Pages are webpages hosted for free on GitHub.com. GitHub Pages can run Jekyll. My LiteratureGeek.com research blog is an example of a site using Jekyll to run the site and GitHub Pages to host the site. According to this page, commercial use of GitHub Pages is allowed, though because GitHub retains the right to alter that, anyone using it in a book project should have a backup plan in place (something very simple: time required, who to contact, cost of transferring to university or commercial web hosting).

If you do begin to use GitHub for multiple DH projects, you’ll want to look into their various plans, as the default free plans are set up for individual use and public repositories (e.g. you might want multiple private repositories so sites can be developed there before going public). There are various options including free plans for educational and non-profit uses.

There you have it, an only slightly-cleaned-up email turned into a blog post! It isn't meant to be a universal resource, but feel free to comment/suggest using Hypothesis via this link.