The Modern Web: More Is Not Always Better

Originally this was a very long post that was basically just complaining about everything wrong with the modern web. And to be sure, there’s still plenty of that. But I’ve realized two things while writing this: one, the average person is more aware of these issues than I might have guessed, and I’d largely be preaching to the choir, and two, complaining about things doesn’t actually fix anything. So instead of just piling on yet more negativity with jaded apathy, the second half of this post will focus on the ways in which I hope to solve problems, instead of just describing the problems themselves.

But first, the complaining.

The Complaining


Bloat

As I have personally discovered, having a slow internet connection will make you painfully aware of how bloated many modern websites are. A lot of major websites need to download megabytes of JavaScript just to be functional. Obviously, plenty of web apps rely heavily on JavaScript for their functionality, and this is basically unavoidable. But a lot of websites really don't. As a quick test, I reloaded a Quora page I had open from research for this diatribe and used the network monitor to check how much JavaScript it downloaded. The result was 46 separate files totaling around 6.4 megabytes - and that's with adblock and Firefox's tracking prevention enabled. Far from the worst I've seen, but this is for a website that's based around what is mostly static text. For reference, the actual content of the article you're reading right now is about 40 kB, or 0.04 megabytes.

And data transfer isn't the only problem with these sites: a lot of those unnecessary scripts run constantly, polling for interactions, phoning home with analytics data, and so on. Firefox is currently using over 6 gigabytes of memory on my computer, and around 20 percent of my CPU. I'm not interacting with it, it's not playing any media, it's not even visible on my monitors right now. This is ridiculous. And of course this doesn't even touch on how annoying these scripts can be to deal with. Pointless JavaScript often breaks totally basic functions of a website, like clicking on links. Clicking on links! I'm sure anyone reading this has at some point tried to right-click or ctrl-click or middle-click a link, any of which should let you open the link in a new tab, and had it do exactly nothing, because it's not a link: it's a plain <div> element with a JavaScript event handler on it listening for click events so it can redirect you to another page. Browsers already have links! Please stop replacing them with your own worse ones!
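To illustrate, here's what that anti-pattern looks like next to the real thing (hypothetical markup, not taken from any particular site):

```html
<!-- The fake link: middle-click, ctrl-click, and "open in new tab" all do
     nothing, and screen readers won't announce it as a link. -->
<div class="link" onclick="window.location.href = '/some-page'">Read more</div>

<!-- The real link: the browser handles new tabs, history, keyboard focus,
     and accessibility for free. -->
<a href="/some-page">Read more</a>
```

The second version is less code and does more.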

And here's another thing. It's an important one, but maybe not one you've ever thought about before. How hard do you think it would be to write your own web browser? Not a fully-featured, fully-compliant browser with all the bells and whistles, just one able to load and render an average webpage. The answer is: way more work than one person could ever be expected to do. The endless features and anti-features that keep getting tacked onto browsers and added to web standards make web browsers exceptionally complicated programs. And at this point, about 92% of users use one of three browsers[1]: Chrome, Firefox, or Safari (plus Edge, which now uses the Chromium engine under the hood, so it's essentially just Chrome with a different paint job). And for something as indispensable to the average person as a web browser, something most of us use for important tasks every single day, don't you think that lack of variety is troublesome?

I mentioned tacked-on features and I want to come back to that. By my count, CSS alone - just counting the bits that are at least a working draft and therefore might be implemented in browsers - has 37 data types, 60 pseudo-classes, 15 pseudo-elements, 35 at-rules, 150 functions, and a whopping 549 properties! And keep in mind that most of these are in various states of implementation in different browsers, sometimes with unique quirks in specific browser implementations. And all of this is just for styling web pages. I've used a lot of modern CSS features, and using CSS nowadays is definitely nicer than it used to be. But do we really need over 500 unique properties? Again, think about how that limits implementations of compliant CSS parsers and renderers basically to professional development teams - it's so far beyond the scope of what a hobbyist can do. And since so many websites rely on this massive pile of features, any web browser that only implements a modest subset will be unable to correctly render a lot of websites. And JavaScript is a mess I could write another whole post about, but suffice it to say it's gotten quite bloated and funky too.


Tracking

It's not just client-side JS and CSS: the HTTP protocol itself has had features tacked onto it over the years as well. The complexity of the web has provided plenty of tools for tracking users and collecting information on them. For example, a JavaScript API named "evercookie" is capable of storing persistent information in a user's browser using no less than sixteen different methods, including regular cookies, abusing resource caching, storage space provided by Flash, Java, and Silverlight, and even storing data in browsing history[2]. All of these methods store persistent data that can be read back later, and evercookie automatically recreates all of the ones you try to delete if it detects even one remaining data source, making it extremely difficult to fully remove. And there are other methods of violating user privacy, such as, obviously, services collecting information linked to logged-in users, and less obviously, browser "fingerprinting". This is where a site abuses the fact that web standards are such an inconsistently-implemented mess in order to identify - sometimes down to the specific version - what browser you are using, by testing for the particular set of features your browser implements, which in some cases can make it possible to link activity back to you. And since this kind of fingerprinting is capability-based, sites can still fingerprint you even if you take measures to hide your browser's identity, such as spoofing[3] your user agent[4].
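To make capability-based fingerprinting concrete, here's a minimal sketch. The four probes below are invented for illustration; real fingerprinting scripts test hundreds of features, plus things like canvas rendering and font metrics:

```javascript
// Each probe tests whether one feature exists in the current environment.
// These four are arbitrary examples chosen for this sketch.
const probes = {
  webgl: () => typeof WebGLRenderingContext !== 'undefined',
  serviceWorker: () => typeof navigator !== 'undefined' && 'serviceWorker' in navigator,
  intlSegmenter: () => typeof Intl !== 'undefined' && 'Segmenter' in Intl,
  sharedArrayBuffer: () => typeof SharedArrayBuffer !== 'undefined',
};

// Concatenate one bit per probe. Every additional probe narrows down the
// browser and version further, no user agent string required.
function fingerprint() {
  return Object.values(probes)
    .map((probe) => { try { return probe() ? '1' : '0'; } catch { return '0'; } })
    .join('');
}

console.log(fingerprint()); // a different bit pattern in each browser/version
```

Spoofing your user agent changes none of these bits, which is exactly why this technique defeats it.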


Ads

Ads are a common annoyance on the web, to the point that many users (such as myself) use browser extensions specifically designed to prevent any and all ads from loading, at least to the extent that is reasonably possible. This practice receives a lot of criticism on the grounds that blocking ads denies web publishers ad revenue. And yes, this is certainly true. But it misses the larger picture about ads, which is that they're not just annoying: they're security risks. Ads have always presented a number of security issues. Many advertising platforms are minimally filtered, if at all, so malicious actors can easily submit malicious ads to a service and have them served on hundreds or thousands of websites. And even if the ad platform itself filters submitted ads sufficiently, if the platform is compromised, hackers can inject their own malicious ads to be served across the whole platform.

Many of these ad systems will send a user through several intermediate links after clicking on an ad, before reaching the destination website. These intermediate sites are used for tracking and analytics purposes. If any one of those intermediate sites is compromised, that's another attack vector. Occasionally browser exploits are found that use specially constructed media to compromise a browser, such as a specially-designed image that causes a buffer overflow in a vulnerable browser renderer. These exploits are usually found and patched fairly quickly, but despite developers' best efforts, they do keep popping up from time to time, and such exploits make any image or video being pulled from an unknown, untrusted third party a potential security risk. And perhaps the most dangerous kind of "malvertising" is the simplest of all - tricking users via ad design. Frequently, malicious advertisers will create ads that are designed to look legitimate, like perhaps a download button, appearing conveniently in a banner ad right before the actual download button. These are especially devious since sometimes you will be trying to download an executable, like perhaps a program installer. So the malicious ad redirects you to a site that serves you a "setup.exe". If you missed any subtle clues that this link was illegitimate, you'd better hope your antivirus catches that file before you run it.

These are threats that can affect even tech-savvy users, especially the media exploits. And since these days everyone is an internet user, a huge portion of regular internet users are non-tech-savvy folks who are much more vulnerable to malicious ads. And yet, even after all of this, sites that rely on ad revenue will still put insulting messages behind ad elements to shame you for using an ad blocker, even as they try to push untrusted content onto your computer.

But security isn't the only problem with advertising. A distressing amount of the web is designed around it: companies redesign their sites to maximize ad exposure, use analytics data to optimize ad clicks, and algorithmically filter content to try to serve you more, or more strongly personalized, ads. If you take a look at Twitter's quarterly results[5] you'll find that the primary statistic Twitter uses to measure user activity isn't total users, or active users; it's monetizable daily active usage. Twitter measures its success or failure based on how much monetization it is able to perform. And from a business standpoint, that makes sense. They're a business, and they're trying to make money. But from a personal standpoint, what does this mean? It means that if you use an ad blocker, or you post explicit content, or you tweet with blacklisted words that disqualify you from certain advertising, Twitter does not care about you or your experience on Twitter. The only users that matter are the ones that are marketable. And basically all other ad-focused, for-profit platforms follow that same methodology. If you are someone who actively tries to protect your privacy and security by blocking tracking methods and advertisements, you are persona non grata to these platforms.


Accessibility

Accessibility is an important but often-overlooked aspect of human-computer interaction. It's easy to forget that there are people with poor eyesight, or poor color vision, or who are blind, or deaf, or have a physical disability, or have any other accessibility need, when you're perfectly abled yourself. Modern HTML has some built-in accessibility features, like the use of <nav> to point out navigation links, and the use of alt attributes to describe images for users with screen readers. But accessible design goes much further than that, and many web devs have bad habits that make websites less usable for people with disabilities. Things like using elements for the wrong purpose (like the link that's actually a scripted <div> I complained about above), or using tables (which are meant to display tabulated data) to lay out pages, or abusing HTML's leniency to write poorly-structured text that screen readers struggle with. The primary issue here is that HTML was designed in the '90s, when accessibility simply was not a concern. No one could have predicted that the web would become as ubiquitous as it is today, and so all of the accessibility features we have now are awkwardly tacked onto a markup language that, to maintain backwards-compatibility, has barely changed in 30 years. Now the responsibility for making websites accessible falls squarely on web designers, and time has shown that most of them won't even bother.

A different - but still very important - type of accessibility issue is the difficulty of creating content on the web. The mess of HTML, JS, and CSS, all of which are horrible incrementally-updated backwards-compatible nightmares, makes it harder than it needs to be for new users to create content online. Modern tools do exist for beginner-friendly web content creation, with varying levels of usefulness and bloat, but in the end they still need to transform their output into the same formats described above, sometimes introducing more issues in the process. The success of Flash content online can be at least partially attributed to the fact that it was fairly easy for non-tech-savvy folks to create complex interactive content that could be published online and appear and function exactly like they designed it. Since then, Flash has become a whole different mess on its own, but it's a useful lesson in usability.

And speaking of problems with web publishing, it's even more difficult to publish content online in a way that works correctly and is discoverable by users, unless you go with one of the website-designer… websites, that let you publish content on a subdomain or sometimes even with custom domain support. But most of these services are paid, or the free version is horribly limited, or they're full of ads, or they work poorly and make designing sites miserable, or some combination of these problems. And ultimately they all limit the user's creative power by only letting them use whatever point-and-click functionality they happen to implement, whatever premade widgets they happen to provide, whatever arbitrary limitations they put on the design. And of course there's no interoperability between these services, so if you want to switch to a different service you have to start your design over from scratch. Like most of the things I'm complaining about in this post, these things have gotten better over time, but they're still indicative of some serious underlying accessibility issues with publishing on the web. For something so fundamental to our daily lives, and communications with others, and content creation, do we still need to be using this hulking 30-year-old amalgamation of a platform? Wouldn't it be nice to have a platform that lowered the barrier to content creation and publishing? (and yes, I know there are some, I'm addressing that in a bit)


Centralization

The most obvious case of centralization on the web is in the form of the most popular web services. Want to have control over your own emails? Too bad. The centralization of email services means that it's functionally impossible to reliably self-host email in 2022[6]. Want to host easily-accessible video content that's discoverable by users but don't like YouTube? Too bad. No other video hosting platform comes close to YouTube's viewership or content availability. YouTube's search is the third-most-used search engine in the world, behind Google and Google Images[7]. Want a simple microblogging service like Twitter but don't want to use Twitter? Well actually, Mastodon has been doing pretty well in that regard[8], and it's a big inspiration for my own work. But it still doesn't come close to Twitter[9]. And as many users of social media or other web apps are already aware, many of these centralized services offer little to no interoperability amongst themselves. Being competitors in a capitalist economy, these sites are actively discouraged from letting their users share their data with other apps. I was going to add another aside acknowledging the recent collapse of Twitter, but actually this highlights another problem with centralization: when the monolith falls, the small competitors previously starved of use need to race to meet that same multi-billion dollar quality of service (and there's a lot of catching up to do), so users often feel stranded with nowhere to go.

Another problem with centralized services - one any Twitter or YouTube user should be familiar with - is that they’re impossible to moderate effectively. Massive platforms like YouTube and Twitter with users in the hundreds of millions to billions have very few actual humans performing moderation, and have instead largely turned to automated systems to do their moderation. The moderation scene has gotten bad enough that on YouTube, videos are being copyright claimed for using white noise or cricket sounds[10], and on Twitter, accounts are being banned due to tweets with just a single flagged word in them (in one case, the word was “Memphis”[11]).

There are also serious security implications posed by the centralization of internet infrastructure. DNS servers - the servers that resolve a human-readable domain name into a numeric IP address so your computer knows what address to connect to - have become increasingly centralized, to the point that in some places, like New Zealand and the Netherlands, around one-third of all DNS traffic goes through just a handful of DNS providers[12]. Why is this a problem? Well, the more centralized DNS services become, the more they become targets for both criminals and government surveillance. In 2016, a denial-of-service attack on just one DNS provider, Dyn, brought down several major sites, including Twitter, Netflix, and Reddit, for most of a day[13]. Attacks like this will only become more serious as fewer and fewer services handle the majority of DNS traffic. And the threats posed by centralization extend beyond being unable to watch online movies for a day - in 2017, an exploit in Cloudflare's content delivery services, which at the time were in use by over 5.5 million websites[14], exposed sensitive user information that was supposed to be protected by TLS, including "full https requests, client IP addresses, full responses, cookies, passwords, keys, data, everything"[15]. According to a report from Cloudflare, this leaking of data likely happened more than 18 million times before the exploit was fixed, with sensitive information also being cached by search engines[16]. The affected sites even included the two-factor authentication app Authy, meaning even accounts protected by 2FA could have been compromised[17].


Identity

Perhaps the most obvious issue with identity on the modern web is how often it is tightly coupled to one's real-life identity. How many sites or services have you signed up for that asked for your first and last name, even though there was absolutely no need for it? Many websites also demand a valid phone number, which is even more tightly coupled to a real-life identity, since a valid phone number is registered to one real person, and that person's identity can often be obtained from the number alone. These requirements also make it difficult, if not impossible, to maintain more than one identity online (unless you're willing to pay for multiple phone plans).

Another well-known problem with web-based authentication is how it requires you to let a third party manage your security. The main reason why security breaches have been such a high-profile problem lately is that modern web systems require you to store your information - often information that can be used to identify you - on someone else's web servers, and just hope that they have competent security. Well, that hasn't been going so well, has it? This is also the reason why you need a separate password for every account you use: authenticating users with a username and password, and then storing those usernames and passwords in association with that service, means that a security breach can potentially give the attacker everything they need to access and control your account, and if you've used the same username and password elsewhere, those accounts too!

This is also the reason why you need to make new accounts everywhere you go (and use separate passwords for all of them). The way authentication is currently done on the web necessitates this kind of needless inconvenience and security risk. And single sign-on (SSO) services don't fix the problem either. Like to log in everywhere with your Gmail or Facebook account? Well now, if your Gmail or Facebook account is compromised, everywhere else that you've used that account for SSO is also compromised. We're back to square one. And of course, Gmail, Facebook, and similar services demand your full name and phone number to create an account, so those are linked as well, making these SSO credentials an even more attractive target for hackers.

Another issue with modern web identity is that almost no services expose facilities for cryptographic verification. Got an email that appears to be from a family member but was actually a phishing attempt? Seen accounts impersonate someone by using a username that looks the same, maybe with an I instead of an l? Worried (justifiably) that someone might be intercepting, blocking, or altering messages to or from you? All of these problems would be solved (at least mostly) if services provided the ability to easily cryptographically verify the author of the content.

Where are the alternatives?

So, as someone who isn’t satisfied with the current state of the web, what am I to do? Are there answers to these questions? Solutions to these problems? Find out next time on–

Ahem. The answer is yes, there are solutions, but they aren’t enough. I’m going to use two examples of systems that take steps in the right direction - Mastodon and Gemini - and highlight what they did right and what they still have problems with.


Mastodon

Mastodon claims to be an alternative to social media sites like Twitter or Tumblr, one that is decentralized and decommercialized. And it does solve a lot of the problems I talked about above. It is decentralized - made up of thousands of individual "instances" that all communicate with each other using a shared protocol to form a large, (mostly-)contiguous network. And the "mostly" qualifier there is a good thing; it's not fully contiguous because instance administrators can choose which other instances they want to "federate" with, that is, exchange content with. This is a powerful tool to completely cut off harmful communities, in a way that each instance can control. Mastodon also encourages building communities in healthy ways (instead of monetizable ways) and providing better privacy and moderation. That last point deserves emphasis: moderation on Mastodon (on some instances, anyway) is orders of magnitude better than the automated keyword police on Twitter. Why? Because since the individual instances are small, they can be reasonably moderated by a few people. Decentralization solves the moderation scaling problem.

Mostly. One problem highlighted by the recent Twitter exodus is that Mastodon is only partially decentralized. Some instances have become quite large - the current largest has over 832,000 users at the time of writing - and when instances grow that large, they encounter many of the same problems that completely-centralized services have, and struggle to keep up with moderation as they scale. And with so many new users fleeing Twitter, some servers have had technical issues with scaling as well, sometimes resulting in downtime and multi-hour delays in content federation[18].

Also, Mastodon by design cannot address the issues with the underlying web technology - HTTP, HTML/CSS/JavaScript - because it’s still built atop the same stuff. However, there are other systems out there that attempt to tackle this problem in their own way, such as…


Gemini

Gemini is a lightweight application-level protocol (and alternative to HTTP) that focuses on minimalism, privacy, and self-publishing. It solves a lot of the problems with the modern web by simply… not implementing them. Freed from the burden of compatibility with HTTP, Gemini doesn't need to include features, such as cookies and caching, that can be used to track users and collect data on them. This makes browsing Gemini sites a very safe and comfortable experience - you know you won't be tracked or advertised to, because the platform literally does not support it. Gemini is also designed with one kind of accessibility I mentioned earlier in mind: ease of implementation and publishing. Gemini is designed so that someone could write a reasonably usable client application for it - a web browser, more or less - in around 200 lines of code. Compare that to the roughly 35 million lines of code in the Chromium browser engine[19].
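The protocol really is that small: a request is just the URL followed by CRLF, and a response begins with a two-digit status code, a space, and a "meta" string. A sketch of both ends of that exchange (the TLS transport a real client needs is omitted here):

```javascript
// Build a Gemini request: the absolute URL plus CRLF.
// The spec caps the URL at 1024 bytes.
function buildRequest(url) {
  if (Buffer.byteLength(url, 'utf8') > 1024) throw new Error('URL too long');
  return url + '\r\n';
}

// Parse a Gemini response header: "<STATUS><SPACE><META><CR><LF>".
// A status in the 20s means success, and META is the MIME type of the body.
function parseHeader(header) {
  const match = /^(\d\d) (.*)\r\n$/.exec(header);
  if (!match) throw new Error('malformed response header');
  return { status: Number(match[1]), meta: match[2] };
}

console.log(buildRequest('gemini://example.org/'));
console.log(parseHeader('20 text/gemini\r\n')); // { status: 20, meta: 'text/gemini' }
```

That's essentially the whole request/response layer; the rest of a client is TLS, a URL bar, and a renderer for the (equally tiny) text/gemini markup.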

Gemini’s lightweight implementation also means that Gemini sites can load much, much faster on slow internet connections, since the total amount of data needed to display a page is in the kilobytes instead of megabytes. The lack of heavy scripting and styling also means that Gemini clients run much more smoothly on low-end systems with limited computing resources. The decision not to include an equivalent to CSS for styling was also made in favor of user freedom: without a strict styling regimen dictated by the target website, clients are free to style pages however they want. No more dealing with sites that don’t have a dark theme - when your browser chooses the styling, everything can have a dark theme.

But Gemini is also very limited compared to the web. The same lack of styling that promotes user freedom for visitors also stifles creative expression for publishers, who can't control the presentation of their own content. The lack of scripting means that the functionality of Gemini sites is severely limited. These points aren't such big issues when you consider that Gemini is primarily designed for serving plain text content - blog posts, stories, poetry, and so on. But it does significantly limit the scope of what Gemini can be used for.

Where do we go from here?

Mastodon and Gemini are just two examples of projects that attempt to solve some of the issues we face using the modern web. There are many more out there, and I do hope that they gain more traction soon and spread awareness of the fact that we don’t need to be stuck with the One True Internet, or One True Social Media Platform. But none of the options available today (at least that I have found) effectively address all of the problems I am concerned with.

And that is why I am proud to announce that I am adding yet another obscure alternative to the list, to make things even more confusing!

In all seriousness, I am currently developing an entire network technology stack - communication protocols, networking systems, and yes, social media - that hopes to go even further than the options that currently exist. Fully open-source, fully decentralized, fully decommercialized. In particular, there are a few pitfalls I want to avoid with this project, which I will outline below.

The Social Media Problem

There is a problem with nearly all social media systems: no one wants to use a platform until other people already use it. This is a nasty catch-22 that must be overcome to get any social media platform off the ground; users need a good reason to switch to a different platform, and generally that means their friends need to be on it too. The way I hope to overcome this is via a unified framework for content publishing and social interaction that integrates as fully as possible with existing platforms. "Come to Not-Twitter! There's no one here yet but I promise it will be cool!" becomes "Come to Not-Twitter! It's better and it also has Twitter!", which is a much more compelling argument. A layer that integrates seamlessly with existing and new platforms alike sidesteps the "social media problem" by allowing a smooth transition from one service - from one social network - to another, with all the shades of grey you want in between.

A Unified Platform

As any artist will tell you, no art hosting platform is perfect. DeviantArt? Tumblr? Twitter? ArtStation? All of these have significant problems, and moreover, different problems for different users. Many have tried to create a perfect platform, and all have failed, because no one can create an environment that fits everyone's uses perfectly. Instead, the users themselves should define the environment they want. Imagine a universal system where users can publish stories, post art, show off their music, write tiny shitposts, and do whatever else one might want to do online, where every publisher gets to choose how their content is displayed to new viewers, and every viewer gets to decide how to display that content, alongside that of all the other creators in their own personal feed.

The basic idea is that multimedia content will exist independently of a specific interface, and that content can be displayed in any number of ways depending on user needs. When you visit someone else's personal page, they can showcase their content in whatever way they like. Maybe that's showing off their latest album, or displaying long-form writing, or arranging a video player with space below for comments, or showcasing their latest artwork while also clearly showing their commission availability and prices. Meanwhile, when users "follow" or "subscribe" to (or whatever else you want to call it) other users, that content gets aggregated and presented in a way that is, again, fully customizable. Users could set up multiple feeds: one displaying all the microposts from Twitter, Mastodon, or any other platform; one showing new videos from creators you're subscribed to on YouTube, as well as videos independently published on decentralized systems like PeerTube; one displaying art from artists you follow; and so on. And each of these feeds can be customized to display content in exactly the way the viewer wants, in whatever layout suits it best.

Sound impossible? Probably. But I’m dumb and stubborn enough to try anyway.

Robust and Usable P2P Networking

Peer-to-peer networks have plenty of unique strengths, and unique weaknesses. The biggest weakness for a social system is discovery: how do you make connections with new people amongst a giant, amorphous mesh of millions? For the solution to this problem, I take inspiration from Mastodon's mostly-decentralized architecture. Although the network is fully decentralized and works perfectly without any centralized authority, there's nothing preventing centralized "hubs" or "relays" from existing for various topics, so that people can more easily find others with similar interests. If you like listening to heavy metal and talking about woodworking, you can check out a heavy-metal-focused relay to find new musicians to follow (or if you make music, publish your own work to that relay for others to find), and you can subscribe to, and publish to, a woodworking-focused relay to exchange Twitter-esque microposts and engage in live chats with folks about woodworking. And if those relays go down? Maybe the server owner can't afford to keep it running? Or the apartment they host it from burns down? You still know about all of those people you found, and you can track them down through the P2P network to find them somewhere else. The network is distributed and fault-tolerant while still being user-friendly and promoting discovery.

And speaking of fault-tolerance, this network can also be asynchronous. This means you could write some blog posts, reply to some comments, and do whatever else with your locally-cached content while offline; whenever you connect to the network again, those actions are synchronized throughout the network, you can fetch the latest new stuff, and then go offline again. This is especially important if you're interested in off-grid networking, need to evade surveillance, have an unreliable network connection, or just travel a lot.
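One way to sketch that synchronization, hypothetically: each device keeps an append-only log of actions, and on reconnect the logs are merged, keeping the newest version of each entry. This last-write-wins policy is the simplest possible approach; a real system would want vector clocks or CRDTs to handle genuinely concurrent edits.

```javascript
// Merge any number of action logs. Entries share an id; the entry with the
// newest timestamp wins. Returns a single log ordered by timestamp.
function mergeLogs(...logs) {
  const byId = new Map();
  for (const entry of logs.flat()) {
    const existing = byId.get(entry.id);
    if (!existing || entry.timestamp > existing.timestamp) byId.set(entry.id, entry);
  }
  return [...byId.values()].sort((a, b) => a.timestamp - b.timestamp);
}

// A post drafted offline on one device and edited again elsewhere:
const local = [{ id: 'post-1', timestamp: 5, body: 'draft written offline' }];
const remote = [
  { id: 'post-1', timestamp: 9, body: 'edited on another device' },
  { id: 'post-2', timestamp: 7, body: 'fetched from a peer' },
];
console.log(mergeLogs(local, remote)); // post-1 keeps the newer edit
```

Because the merge is deterministic and order-independent, every peer converges on the same log no matter when, or through whom, it reconnects.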


Protocol Agnosticism

How do we address the problems with web technology? JavaScript bloat? Fingerprinting and HTTP tracking methods? This is another case where jumping ship entirely wouldn't work, because you need users for a network to… work. Instead, the system will exist as a generic, protocol-agnostic abstraction layer which can communicate over many different protocols. You can host fancy web pages over HTTP, publish blog posts on Gemini, distribute your music over IPFS, send dank memes to your friends over ham radio. Similar to the unified, abstracted content presentation philosophy above, this system aims to work with both existing and new network protocols, to provide as much flexibility and support as many use cases as possible, while still connecting users with different environments, disparate tastes, and unique use cases.

Decentralized Identity

Fair warning, this one’s a bit complicated.

You may have noticed that Mastodon, despite decentralization being a strong focus, still uses centralized authentication. That is, you still sign up with account credentials that are stored on a web server somewhere - a more and more enticing target for hackers as the platform grows - and you still need to remember passwords and worry about grabbing the username you use everywhere else. This was an intentional compromise made when designing Mastodon, since the alternative - fully-decentralized peer-to-peer identity - is much more complex, both in terms of implementation and use. However, decentralized identity doesn’t need to be complicated. The solution lies in the “web of trust” model, which users of PGP will likely be familiar with.

The overall concept is that you generate a cryptographic key pair - one public key and one private key - and use the public key itself to represent your “identity”. Your private key lets you encrypt and sign communications, which anyone can verify as authentic using the public key you pass around - the same key that identifies you uniquely to the whole world. No usernames, no passwords. Instead, users can endorse someone as “being the real so-and-so” by signing that person’s public key with their own private key. This signature is cryptographically verifiable, so if you trust A, and A trusts B, and B trusts C, you can be reasonably sure that C is who they claim to be, even if you’ve never met them and they’re not a part of your social circles. And since key pairs don’t have real-life information like names or phone numbers attached to them, they’re also anonymous. Your online identity can be totally coherent and consistent across all spaces, while also being decoupled from your real-life legal identity. In fact, you can have multiple key pairs that represent multiple different identities, and use those identities however you want to achieve whatever level of privacy you desire.
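A toy Python sketch of the idea (everything here is a stand-in: the key material is random bytes and the signature field is abstract - a real system would use an actual signature scheme like Ed25519):

```python
import hashlib
import secrets

def fingerprint(public_key: bytes) -> str:
    """A short, human-checkable digest of a public key. The full key is
    the identity; the fingerprint is what you'd display or compare."""
    return hashlib.sha256(public_key).hexdigest()[:32]

# Stand-in key material: a real system would generate a signing key
# pair; here we just model the public half as random bytes.
alice_public = secrets.token_bytes(32)

def make_endorsement(subject_key: bytes, name: str, level: int) -> dict:
    """An endorsement is a statement 'key X belongs to name N at trust
    level L', signed by the endorser's private key (left abstract)."""
    return {"subject": fingerprint(subject_key), "name": name,
            "level": level, "signature": "<signed-by-endorser>"}
```

Note that nothing about the identity itself involves a server, a username, or a password - it is just the key.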

A common argument against the web-of-trust model is one similar to the social media problem: you need people you trust already in the network before the network can tell you whom to trust. However, this can be alleviated in a few ways. One is through the use of variable levels of trust. You trust a close friend much more than a casual acquaintance, and so their endorsement means more. Someone you’ve known for a long time can be given a strong trust value, and therefore be a useful tool in your verification toolbelt, while other people can be trusted at a much weaker level, possibly as little as “I know that they exist and claim to be someone”. Then, over time, the people you get to know better can be assigned higher levels of trust to strengthen the web of trust you’re building.
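Variable trust composes naturally if you model endorsements as a weighted graph. Here’s a minimal, illustrative sketch (the multiply-along-the-chain rule and the numbers are my assumptions, not a spec):

```python
class WebOfTrust:
    """Endorsements as a weighted directed graph: an edge A -> B with
    weight w means A vouches for B with confidence w (0.0 to 1.0).
    Chained trust multiplies, so long chains of weak links decay fast."""
    def __init__(self):
        self.edges = {}  # endorser -> {subject: weight}

    def endorse(self, endorser, subject, weight):
        self.edges.setdefault(endorser, {})[subject] = weight

    def trust(self, me, target, max_hops=4):
        """Best confidence reachable from `me` to `target` within
        max_hops endorsements (simple DFS; fine for a sketch)."""
        if me == target:
            return 1.0
        best = 0.0
        def walk(node, acc, hops, seen):
            nonlocal best
            if hops == 0:
                return
            for nxt, w in self.edges.get(node, {}).items():
                if nxt in seen:
                    continue
                score = acc * w
                if nxt == target:
                    best = max(best, score)
                walk(nxt, score, hops - 1, seen | {nxt})
        walk(me, 1.0, max_hops, {me})
        return best
```

Raising a trust value as you get to know someone is then just re-issuing the endorsement with a higher weight.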

Another way to solve this problem is what I call “reasonable verification” through side-channels. The only way to be absolutely certain that a public key belongs to the person (or persona) it claims to is to physically meet them and obtain their public key from them directly. This is the only way to be 100% sure that no one has tampered with anything along the way. But that doesn’t mean the system is unusable otherwise. Yes, someone could intercept a public key sent over an unencrypted channel and tamper with it, so you can’t be totally sure. But every key has a fingerprint - a cryptographic hash, much shorter than the key itself - which can be used to verify the authenticity of the key, and you can easily communicate that short fingerprint over other channels. For example, maybe you email your friend a quarter of the fingerprint, send them another quarter over Telegram, speak another quarter over the phone, and send the final quarter by snail mail; that way an attacker would need to compromise four different communication channels simultaneously in order to fudge something. Are any of those channels 100% secure? Is this method 100% airtight? No. Is it more than good enough for nearly all cases? Yeah.
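The quarter-splitting trick is easy to sketch. Assuming a SHA-256 fingerprint (the function names here are just illustrative):

```python
import hashlib

def key_fingerprint(public_key: bytes) -> str:
    """SHA-256 digest of the key: the short value you'd read out loud."""
    return hashlib.sha256(public_key).hexdigest()

def split_for_channels(fp: str, channels: int = 4) -> list:
    """Divide a fingerprint into pieces to send over separate
    side-channels (email, Telegram, phone, snail mail, ...)."""
    size = -(-len(fp) // channels)  # ceiling division
    return [fp[i:i + size] for i in range(0, len(fp), size)]

def verify_received(public_key: bytes, pieces: list) -> bool:
    """Recipient reassembles the pieces and checks them against the
    fingerprint of the key they actually received."""
    return key_fingerprint(public_key) == "".join(pieces)
```

An attacker who swaps the key in transit now has to also forge all four pieces, on four different channels, at the same time.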

More importantly though, how often do you actually need 100% certainty that someone is who they claim to be? Maybe for financial exchanges or formal contract agreements, sure, but for sharing a picture of your cat with internet strangers? Who cares? The web of trust model can be used to verify a key’s authenticity when needed, but the system can work fine for most purposes without it. The strongest defense against impersonation is simply keeping a copy of someone’s public key associated with their name, so you’ll know if someone with a different key claims to be them - the same “trust on first use” approach SSH takes - and this can all be done in software without any user intervention or understanding of cryptography needed.
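That “remember the key you saw first” defense is trivial to implement - here’s an illustrative sketch:

```python
class KeyPinStore:
    """Trust-on-first-use: remember the key first seen for each name,
    and flag any later claim under that name with a different key."""
    def __init__(self):
        self.pins = {}  # name -> public key bytes

    def check(self, name: str, public_key: bytes) -> str:
        pinned = self.pins.get(name)
        if pinned is None:
            self.pins[name] = public_key
            return "new"        # first contact: pin the key
        if pinned == public_key:
            return "known"      # same key as before
        return "mismatch"       # possible impersonation!
```

The software can do all of this silently and only bother the user when a “mismatch” actually happens.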

Managing Complexity For Users

Everything I’ve mentioned up until now may sound a bit complicated. But don’t worry! In reality it’s much worse.

This massive omni-network project will necessarily contain a ton of complexity, and if blog posts complaining about not understanding Mastodon are any indication, that complexity can push users away before they get the chance to see the benefits of a system. This is why I’m going to put extra effort and attention towards making the transition to this system from other platforms and networks as seamless and foolproof as possible. Coming from Twitter and just want a Twitter-like experience someplace else? Click the “just give me Twitter” button during setup and your interface, feeds, default networking settings, and so on will automatically be configured to mimic Twitter. And since the platform integrates with Twitter, you can pull in tweets from, and post tweets to, the existing Twitterverse with minimal hassle. Ideally, switching to this system should be no more complicated than opening up TweetDeck, but with added knobs under the hood that users can tweak to customize their experience as much (or as little) as they want.

There’s still plenty more to cover, and certainly things I’ve gotten wrong or forgotten to mention, but this post is already over 6400 words and several days in the making and I’d like to get this thing out the door, so I’m calling it here. If you want to hear me complain even more, you might have a problem. But it’s one you can indulge with my Mastodon account, where I will probably complain a lot more about things. If you’re interested in this project, let me know! I sure as hell ain’t gonna pull this off myself - I’m already working with Kasran (Twitter, Mastodon) - and help or discussion is always welcome. You can contact me at the Mastodon account above, or on Twitter at @trashbyte, or email me at I’ll also be publishing updates here, but I’ll be most active on Mastodon.

To anyone who actually read all of this: thank you and/or I’m sorry. Really, your interest means a lot to me <3.

