
This is what happens when you use react-router with GitHub Pages

October 20th, 2024 7:39 PM

gh-pages
react-router
hash routing
giscus
debugging

Image Courtesy: Praveen Thirumurugan

Hey, before you start reading this blog, you might wanna read this. In the context of my project, this blog is largely obsolete. Kinda sad, especially cause I had anticipated this might happen from the very first day I wrote this. Still, I believe it could hold some value for others, so I'll leave it here. Feel free to take a look, maybe you'll find something useful! :) For the new updated version, check out "Here's what I did not tell you about my Portfolio Website".

Hi again!

I'm back to discuss some technical aspects of my portfolio website. One of the major issues I faced was with React Routing, which has been particularly challenging.

Introduction

For those who are new to this, here is basically how most browsers work. Every time you type a url into the browser's address bar, an HTTP GET request is fired. Static web hosting services like GitHub Pages provide a minimal server as a service, which responds to HTTP requests, as the name suggests, with static assets.

For example, visiting https://example.com/xyz.png serves the image xyz.png, while visiting https://example.com typically serves the index.html file automatically, without needing to explicitly include the filename in the url.

However, the key limitation of static hosting is that it expects a static asset for each url. If a static asset is not present, you get the classic 404 'resource not found' error.

Another way to set off an HTTP GET request is by clicking a hyperlink in an html file. Clicking a hyperlink is tantamount to pasting the link in the anchor tag's href attribute into the browser's address bar and pressing the enter key.

In more advanced serving strategies, like Static Site Generation or Server Side Rendering, these assets are usually generated dynamically at build time or run time respectively, and then served. Either way, everything you see on the page is the rendered product of what was once a mere HTTP response to the url in that page's address bar.

But there is an exception to this rule. And that exception is client-side routing in Client Side Rendering (CSR) apps.

There is a way to programmatically change the url in the browser's address bar without firing an HTTP request. And that way is the History API.

Most CSR tools have their own ecosystem and at least one router library that uses the History API under the hood, including React's react-router-dom v6.

Now the problem with Client Side Rendering solutions like React is that when you serve the build files through a static web hosting service, you do not have views or static html files ready to be served at each route defined in the React app. When you serve these build files, a minimal index.html is served at your base url with a reference to a bundled JS script. Once that script is received and executed, all the router-related code becomes available. It works perfectly fine when you navigate between routes from the page you received at the base url, but guess what happens if you refresh the page or manually enter a route?

The request triggered by refreshing the page or manually typing the url goes to that route's path, not to the base url, so the index.html that references the routing script is never received.

For example, if you enter https://example.com/about manually in your browser, the browser sends a GET request for that specific path. Since GitHub Pages and other static hosting services don't have an about.html, a 404 error is triggered, even if a route corresponding to that URL is defined in your React script.

If you had control over the server, you could serve index.html for all routes. With index.html sent on every route, the script referenced in it is executed and only the component meant to be rendered at that particular route is mounted. But unfortunately we do not have that option in most static web hosting services like GitHub Pages. I am gonna go through my journey of tackling this problem, so that you can learn from it the next time you are deploying your static website!
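To make the difference concrete, here is a minimal sketch of the two serving behaviours as plain functions. It is only a model: the file list and the function names (builtFiles, resolveStatic, resolveWithFallback) are made up for illustration and are not how GitHub Pages is actually implemented.

```javascript
// Hypothetical build output: the only files the static host actually has.
const builtFiles = new Set(['/index.html', '/static/js/main.js']);

// Plain static hosting: a url maps to a file, or to a 404.
function resolveStatic(pathname) {
  const file = pathname.endsWith('/') ? pathname + 'index.html' : pathname;
  return builtFiles.has(file) ? { status: 200, file } : { status: 404, file: null };
}

// A server you control can fall back to index.html for unknown paths,
// so the bundled router gets a chance to render the matching route.
function resolveWithFallback(pathname) {
  const result = resolveStatic(pathname);
  return result.status === 200 ? result : { status: 200, file: '/index.html' };
}

console.log(resolveStatic('/'));            // served from /index.html
console.log(resolveStatic('/about'));       // 404 on a plain static host
console.log(resolveWithFallback('/about')); // index.html, the router takes over
```

The fallback version never returns a 404 for a route the React app knows about, which is exactly the option a plain static host does not give you.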

Traditionally, the code for all the routes is bundled into a single JS file at build time. It isn't difficult to see how this approach can be significantly suboptimal: users have to wait for the entire bundle to load, which may include components unnecessary for the route they initially visited, especially if there are a lot of routes and components. There are strategies like splitting the JS bundle into smaller chunks by lazy-loading components. Lazy-loading essentially just defers loading a component's code until it is rendered for the first time.
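The idea behind lazy-loading can be sketched without any framework. The helper below is a simplified stand-in, not React's own implementation (in a real app you would more likely reach for React.lazy with a dynamic import); the names lazy, loadBlogPage and loadCount are made up for this example.

```javascript
// Wrap a loader so the expensive work happens once, on first use.
function lazy(loader) {
  let cached;
  let loaded = false;
  return () => {
    if (!loaded) {
      cached = loader();
      loaded = true;
    }
    return cached;
  };
}

// Hypothetical stand-in for fetching a route's chunk over the network.
let loadCount = 0;
const loadBlogPage = lazy(() => {
  loadCount += 1;
  return { name: 'BlogPage' };
});

// Nothing is loaded until the route is rendered for the first time.
console.log(loadCount);           // 0
console.log(loadBlogPage().name); // 'BlogPage'
loadBlogPage();                   // second render reuses the cached result
console.log(loadCount);           // 1
```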

Definitions

Before that I am just gonna quickly establish how I am going to use some jargon to avoid ambiguous interpretations:

  • origin or hostname: domain name and the top-level domain. E.g., example.com.
  • pathname: anything that follows the hostname but precedes the query string and the hash. E.g., in example.com/users/blogs/hello-world?theme=dark&spacing=2#title, /users/blogs/hello-world would be the pathname in my book.
  • query string: begins with '?'; key-value pairs. E.g., ?theme=dark&spacing=2 in the previous url.
  • anchor: the string that follows the #. E.g., title in the previous url.

I consider all the terms I have introduced so far to be parts of a url.

  • base url: the url at which the index.html is served in a static hosting. In package.json it is called homepage.
  • basename: the base path at which index.html is served. That would be base url without the origin.
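If you want to see these parts for yourself, the standard URL class splits a url along the same lines. This is just a quick sketch with a placeholder url; note that in a well-formed url the query string comes before the hash.

```javascript
// Illustrative url only; it does not point at a real page.
const url = new URL('https://example.com/users/blogs/hello-world?theme=dark&spacing=2#title');

const parts = {
  hostname: url.hostname, // what I call origin or hostname
  pathname: url.pathname,
  query: url.search,      // the query string, including the leading '?'
  anchor: url.hash,       // the anchor, including the leading '#'
};

console.log(parts);
```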

Problem Formulation in my Project's Context

So conventionally, the default base url at which GitHub Pages serves static files is <username>.github.io/<repository_name>.

BrowserRouter from react-router usually assumes the base url to be the origin part of the url.

To set a url containing a pathname as the base url, you must pass that pathname as the basename prop of BrowserRouter.
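Going by the definitions above, the basename is just the base url with the origin stripped off. A rough sketch of that relationship, where basenameOf is a hypothetical helper and not something react-router or the build tooling exposes:

```javascript
// Derive the basename (base path) from a base url by dropping the origin.
function basenameOf(baseUrl) {
  const { pathname } = new URL(baseUrl);
  // Drop trailing slashes, but keep '/' for a site served at the root.
  return pathname.replace(/\/+$/, '') || '/';
}

console.log(basenameOf('https://karunika.github.io/portfolio/')); // '/portfolio'
console.log(basenameOf('https://example.com/'));                  // '/'
```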

My initial draft of the Router component I wrote looked something akin to the following:

```jsx
import { BrowserRouter, Routes, Route } from 'react-router-dom'

const Router = () => {
  return (
    <BrowserRouter basename={process.env.PUBLIC_URL}>
      <Routes>
        <Route path='/' element={<Layout />}>
          <Route index element={<Home />} />
          <Route path='blog' element={<BlogList />} />
          <Route path='blog/:id' element={<Blog />} />
        </Route>
      </Routes>
    </BrowserRouter>
  )
}
```

where process.env.PUBLIC_URL automatically grabs the basename from package.json's "homepage" property.

```json
"homepage": "https://karunika.github.io/portfolio/",
```

That worked very well but, unsurprisingly, served an error whenever I manually visited https://karunika.github.io/portfolio/blog.

404 github error

Well this poses an undoubtedly huge problem.

Replacing BrowserRouter with HashRouter

Here's how anchors work: they are never sent to the server. Changing only the part after the # does not trigger a new HTTP GET request, and when a url containing a # is loaded from scratch, the request is made for the url without the anchor, as though the anchor doesn't exist. As a result, the server always returns index.html for a request to the base url followed by a hash followed by anything.

So a strategy people often use is to append a # to the base url and manage all the routing after that programmatically. Simple, right? No :)

Here's what I did:
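A small sketch of what the server actually gets to see. requestTarget is a hypothetical helper that mimics what ends up in the request line of a GET request: the pathname and the query string, never the anchor.

```javascript
// What the browser puts in the request line for a given url.
function requestTarget(href) {
  const url = new URL(href);
  return url.pathname + url.search; // url.hash is deliberately left out
}

// With a hash route, the server only ever sees the base path.
console.log(requestTarget('https://karunika.github.io/portfolio/#/blog')); // '/portfolio/'

// Without the hash, the route itself is requested from the server.
console.log(requestTarget('https://karunika.github.io/portfolio/blog'));   // '/portfolio/blog'
```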

```jsx
import { HashRouter, Routes, Route } from 'react-router-dom'

const Router = () => {
  return (
    <HashRouter basename={process.env.PUBLIC_URL}>
      <Routes>
        <Route path='/' element={<Layout />}>
          <Route index element={<Home />} />
          <Route path='blog' element={<BlogList />} />
          <Route path='blog/:id' element={<Blog />} />
        </Route>
      </Routes>
    </HashRouter>
  )
}
```

I simply replaced BrowserRouter with HashRouter and quickly realised that this didn't work as expected.

Let's take this example from the react router docs.

```jsx
function App() {
  return (
    <HashRouter basename="/app">
      <Routes>
        <Route path="/" /> {/* 👈 Renders at /#/app/ */}
      </Routes>
    </HashRouter>
  );
}
```

So the hash isn't appended after the base name, it is prepended to it!

Well that wouldn't have helped my case at all, cause I cannot make GitHub pages serve at https://karunika.github.io/#/portfolio/ instead of https://karunika.github.io/portfolio/.
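Here is a simplified model of where the basename ends up with each router. It only mimics the shape of the resulting urls as described in the docs example above; joinPaths, browserRouterUrl and hashRouterUrl are made-up helpers, not react-router internals.

```javascript
// Join a basename and a route path without doubling or dropping slashes.
function joinPaths(basename, route) {
  const joined = [basename, route].join('/').replace(/\/{2,}/g, '/');
  return joined.startsWith('/') ? joined : '/' + joined;
}

// BrowserRouter: the basename is part of the real pathname.
function browserRouterUrl(origin, basename, route) {
  return origin + joinPaths(basename, route);
}

// HashRouter: the basename lives after the '#', together with the route.
function hashRouterUrl(pageUrl, basename, route) {
  return pageUrl + '#' + joinPaths(basename, route);
}

console.log(browserRouterUrl('https://karunika.github.io', '/portfolio', '/blog'));
// 'https://karunika.github.io/portfolio/blog'

console.log(hashRouterUrl('https://example.com/', '/app', '/'));
// 'https://example.com/#/app/'
```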

So I made some changes here and there, researched it on internet forums, Stack Overflow and whatnot, and found myself in the same position I had begun with.

Here's an exercise for you :) Tell me in the comments why this won't work either:

```jsx
import { HashRouter, Routes, Route } from 'react-router-dom'

const Router = () => {
  return (
    <HashRouter basename={process.env.PUBLIC_URL}>
      <Routes>
        <Route path='/' element={<Layout />}>
          <Route index element={<Home />} />
          <Route path='blog' element={<BlogList />} />
          <Route path='blog/:id' element={<Blog />} />
        </Route>
      </Routes>
    </HashRouter>
  )
}
```

Workarounds that didn't work for me

Some workarounds that didn't work for me, but might work for you.

If your goal is simply to avoid GitHub's 404, you can write a custom 404.html in the public folder, but anything you see there won't be a part of your React Single Page Application (SPA). And the only way to enter your SPA from this page would be by manually navigating to the base URL provided by GitHub.

That said, one way to route users back into your React SPA is by redirecting them to the index route using a script in your 404.html file, like so:

```js
// repositoryLink is assumed to hold the basename of the deployed app, e.g. '/portfolio/'
window.location.href = window.location.origin + repositoryLink
```

Just for the record, this triggers a full HTTP request to the new route, rather than just programmatically updating the url as would happen if you used window.history methods for navigation.

Using this solution, if I were to enter https://karunika.github.io/portfolio/blog manually in my address bar, I'll be redirected to https://karunika.github.io/portfolio/. If your site has many routes, this isn’t a great user experience since users would need to navigate back to their desired page every time they reload.

For me, I wanted the ability to share links to my blogs, so this was definitely not it.

My last unsuccessful try

Had my base url just been https://karunika.github.io/, the HashRouter would have worked perfectly fine. Why?

It wouldn't require a basename. Simple. Just a simple append of # after the link.

So here's what I tried. I soft redirected myself to https://karunika.github.io/ in my root component, and then used a HashRouter with no basename.

But to my disappointment, that didn't work either, cause on reloads the React app wasn't served by GitHub Pages at that route, so my poor server was clueless about where to find the resources to display.

I bought a domain name

Well, I bought a domain from Cloudflare, configured my GitHub Pages to use my custom domain, and now I didn't have to worry about basenames at all.

The index.html is served at the base '/' and using HashRouter on top of it worked perfectly.

At the end, I am pretty happy about actually buying the domain name, cause it even looks cooler.

Other solutions

Some people on the Reactiflux Discord server recommended alternatives like Netlify or Heroku, and I've used both before. They're great options. However, I personally preferred the GitHub domain name, so I was hesitant to switch away from GitHub.

Eventually I bought my own domain, which worked seamlessly with GitHub, so I didn't bother to switch to a different hosting service.

Incorporating giscus comment section

There is still more to the story than meets the eye.

The authentication callback URLs used by giscus included a trailing #comments anchor. In a HashRouter, #comments is treated as the /comments route. Since none of my routes matched that path, I was constantly getting router warnings, though they were disabled in production mode. Given that most of my routing issues were caused by GitHub Pages in the past, I was mostly testing my app in production mode (lol), so I was pretty oblivious to what was going on for a couple of hours.

This warning ended up being the reason for the authentication failure, although I’m still unsure of the exact cause. What made it even more perplexing was that my cookies were still being set. If you have any insights on why the authentication was failing, feel free to share in the comments!
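The way a hash router reads #comments as a route can be modelled in a few lines. This is a simplified sketch, not react-router's actual code, and the route patterns only loosely mirror the routes in this post; routeFromHash, knownRoutes and isKnownRoute are hypothetical names.

```javascript
// Simplified model of how a hash router derives its location from the url hash.
function routeFromHash(hash) {
  const path = hash.replace(/^#/, '') || '/';
  return path.startsWith('/') ? path : '/' + path;
}

// Hypothetical route table: '/', '/blog' and '/blog/:id'.
const knownRoutes = [/^\/$/, /^\/blog$/, /^\/blog\/[^/]+$/];
const isKnownRoute = (path) => knownRoutes.some((pattern) => pattern.test(path));

console.log(routeFromHash('#/blog'));                  // '/blog'
console.log(routeFromHash('#comments'));               // '/comments'
console.log(isKnownRoute(routeFromHash('#comments'))); // false, hence 'no routes matched'
```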

Eventually adding this line was a neat little fix:

```jsx
<Route path='comments' element={<Navigate to='../blog' replace={true} />} />
```

There are still some issues with the current setup—for instance, after signing in with GitHub, you're not redirected back to the original page where you clicked 'Sign in with GitHub'.

I don't believe this can be easily fixed, and here's why.

With client side routing, you have the option to pass along a piece of information or "state" to the route you're navigating to. In React, for example, you can pass state like this:

```jsx
<Navigate to='xyz' state={{ data: 'hello world' }} />
```

Then, in the component rendered at /xyz, you can access that state using:

```jsx
import { useLocation } from 'react-router-dom'

const XyzPage = () => {
  const { state } = useLocation()

  return <div>{state.data}</div>
}
```

In vanilla JavaScript, the same can be achieved with the History API.

To navigate,

```js
window.history.pushState({ data: 'hello world' }, '', '/xyz')
```

and to access the state,

```js
console.log(window.history.state) // { data: 'hello world' }
```

Using this approach, you could kind of send the current path to the new route when navigating, allowing you to track where the user came from.

However, there's no way to pass state during a "hard" navigation (a full page reload). Authentication redirects are hard navigations, which means they don’t preserve state. There’s no way to access the navigation history via standard browser APIs after such a redirection.

Giscus Page-Discussion mapping

Everything was going well, until I hit one final issue related to HashRouter.

giscus page-discussion mapping options

As you can see, I chose the default option: 'Discussion title contains page pathname'. This means that whenever someone comments on a blog post, the comment is linked to a discussion with a title matching the pathname of the url where the Giscus comment section is embedded.

Oof.

So, the Giscus comment section authentication was working. But the comment POST request returned a bad request exception.

Any guesses why?

Anything after # is treated as an 'anchor', not a path name. And that was it.
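To see why that breaks a pathname-based mapping, here is a quick check with the standard URL class. The urls are placeholders, and I'm assuming the mapping is driven by something equivalent to the page's pathname.

```javascript
// Hypothetical post urls under hash routing; example.com is a placeholder.
const postUrls = [
  'https://example.com/#/blog/first-post',
  'https://example.com/#/blog/second-post',
];

// The route lives in the anchor, so every post reports the same pathname.
const pathnames = postUrls.map((href) => new URL(href).pathname);

console.log(pathnames); // [ '/', '/' ]
```

With every post sharing the pathname '/', there is nothing in the pathname to tell one blog post's discussion from another.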

So it was time for me to choose another option from that list.

Now that I think about it, using the full url would've been a better choice. At the time, I dismissed this option because it would have caused redundancy by including the origin part of the url in every discussion title. As for the next option, I didn't want my Blog page tab to be displaying an alphanumeric jibber-jabber of an ID generated by Contentful, so that eliminated the title content option for me really quick too. In the end, I opted for using meta tags, but that came at the cost of side effects.

I also realized, in hindsight, that I had given my blog page component a rather unhelpful name—Calc. Not the most descriptive, and I really hope my recruiter never stumbles upon it.

```jsx
import { useEffect } from 'react'
import { useParams } from 'react-router-dom'

const Calc = () => {
  // ...
  let { id } = useParams();

  useEffect(() => {
    // Nothing to add (or clean up) until the route provides an id.
    if (!id) return

    const meta = document.createElement('meta')
    meta.setAttribute('property', 'og:title')
    meta.setAttribute('content', id)
    document.head.appendChild(meta)

    // Remove the tag when the id changes or the component unmounts.
    return () => {
      document.head.removeChild(meta)
    }
  }, [id])

  // ...
}
```

This was my somewhat hacky solution. And after a day of debugging, I really could not think of anything better.

If you have any cool suggestions to fix any of the bugs I have mentioned so far, feel free to share!

Footnote links in Markdown

Congratulations for making it this far! This is the last bug I needed to fix before my website was finally complete (except for code refactoring, particularly making it typesafe for easier maintenance in the future).

This problem was somewhat similar to the Giscus authentication callback url issue, so I’ll do my best to keep it short.

Background

I’m using the react-markdown library to render markdown, along with the remark-gfm plugin for supporting autolink literals, footnotes, strikethroughs, tables, and task lists.

The issue arose with footnotes.

Here’s a simple markdown example for adding footnotes to a blog:

```markdown
A note[^1]

[^1]: Big note.
```

This example comes directly from the remark-gfm README.md by the way.

The corresponding HTML output looks like this:

```html
<p>A note
  <sup>
    <a
      href="#user-content-fn-1"
      id="user-content-fnref-1"
      data-footnote-ref
      aria-describedby="footnote-label"
    >
      1
    </a>
  </sup>
</p>
```

And here’s the footnotes section generated at the end of the blog:

```html
<section data-footnotes class="footnotes">
  <h2 class="sr-only" id="footnote-label">Footnotes</h2>
  <ol>
    <li id="user-content-fn-1">
      <p>Big note.
        <a
          href="#user-content-fnref-1"
          data-footnote-backref
          class="data-footnote-backref"
          aria-label="Back to content"
        >
          ↩
        </a>
      </p>
    </li>
  </ol>
</section>
```

The Problem

The issue was with the footnote links. A URL can have only one anchor, so clicking on these links would replace the entire hash route in the URL with the href of the clicked element (e.g., #user-content-fn-1). This caused the same error as the giscus authentication url, i.e., 'no routes matched'.
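You can reproduce the replacement outside the browser by resolving the footnote href against a hash-routed url, which is roughly what the browser does on click. The blog url here is a placeholder.

```javascript
// Hypothetical hash-routed blog url; example.com is a placeholder.
const currentUrl = 'https://example.com/#/blog/my-post';

// Resolve the footnote's href relative to the current page.
const afterClick = new URL('#user-content-fn-1', currentUrl);

console.log(afterClick.href); // 'https://example.com/#user-content-fn-1'
console.log(afterClick.hash); // '#user-content-fn-1', the '/blog/my-post' route is gone
```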

The Solution

I solved it by using side effects again. I am gonna just explain it in short and not overwhelm you with the code.

  • Selecting Footnote Anchors: I used the Web API’s querySelector and querySelectorAll to select all relevant anchor elements.
  • Disabling Anchor String Replacement: I removed the href attributes from the footnote links to prevent the hash route from being altered.
  • Custom Scrolling Logic: I registered event listeners on these anchors and used the react-scroll library to programmatically scroll to the target IDs when clicked.
  • Cleanup: Of course, I made sure to clean up the event listeners to avoid memory leaks.

Actually never mind. Here's the code, just in case you are curious lol.

```tsx
import { useEffect } from 'react';
import { scroller } from 'react-scroll';

const useFootnotes = (loading: boolean, blogId: string | undefined) => {
  useEffect(() => {
    const footnote = document.querySelector('section.footnotes');
    const as_ = footnote?.querySelectorAll('span.data-footnote-backref > a');
    const refs = document.querySelectorAll('[data-footnote-ref] > a');

    // Remember each registered listener so the cleanup removes the very same function.
    const registered: Array<[Element, () => void]> = [];

    const cb = (id: string) => {
      return () => scroller.scrollTo('user-content-' + id.replace('#', ''), {
        duration: 1000,
        smooth: true,
      });
    };

    // Swap the link's href for a click listener that scrolls to the target.
    const attach = (link: Element) => {
      const targetId = link.getAttribute('href');
      if (targetId) {
        const listener = cb(targetId);
        link.removeAttribute('href');
        link.addEventListener('click', listener);
        registered.push([link, listener]);
      }
    };

    if (!loading && as_) {
      as_.forEach(attach);
      refs.forEach(attach);
    }

    return () => {
      registered.forEach(([link, listener]) => link.removeEventListener('click', listener));
    };
  }, [loading, blogId]);
};
```

Conclusion

Sigh

Finally, all routing bugs are fixed. Well, at least as far as my knowledge goes.

Overall, debugging these routing issues was a fun and rewarding experience, and I learned a lot along the way. It’s always interesting to dive deep into problems like this, as they challenge you to think creatively and explore new solutions. Hopefully, my experience has given you some useful insights and perhaps even saved you from a few headaches down the road!

This was a pretty long blog, longer than anticipated to say the least. Thanks for following along, and I hope you picked up something new. If you have any thoughts or solutions, feel free to share. Until next time, happy coding!


Copyright © 2024 Karunika