Taking A Mile

The risk of the open internet is that someone will exploit your well-intentioned openness thoughtlessly. That’s how the internet slowly stops being open.

By Ernie Smith

I have been on Hacker News for a while, and my pieces occasionally go viral over there, often without my direct influence. That has given me a unique perspective on the point where tech and content meet.

To give you an idea, a piece on GeoWorks that I originally wrote in 2016 and updated in 2019 was a hit there just last week.

Plus, I comment a lot. And complain. And that, combined with the fact that I have been publishing on the internet largely as a writer, has given me an interesting vantage point: I get to see what tech folks think about content, and then I get to parse that thinking into my own content-addled mind.

Recently, I was struck by a post about a tool that allows end users to essentially turn any website into a full-text RSS feed. Now, I am someone who has never not offered a full-text RSS feed on my site. I want people to access my content, and if it leaves money on the table, so be it. But I worry about our thought process behind building such stuff.

The creator, intelligence threat analyst David Greenwood, had a perfectly cromulent reason for doing what he was doing: He wanted to research historic information about a security vulnerability, and the fact that some of the sites he was researching did not have a full-text RSS feed (or a feed at all) made it difficult to do. So he created a tool to scrape the entire content history of a website.

To be clear, old data still holds value. I have one reader who periodically emails me after reading a piece I emailed him years ago. I know having old data is valuable. This guy is trying to extract value from old data. But his way of doing so could all too easily be misused by people with less noble intentions than he has.

But I think a lot about the choice of the publisher in the matter. I am someone who wants to leave his content open to as many people as possible, to leave open the possibility of clever new ideas. Essentially, Greenwood is taking the choice of whether a website has a full-text RSS feed out of the publisher’s hands. His reasons are noble; the reasons of the next person to use the tool may not be.

And I think there is a real, genuine risk, one made more obvious in the AI era, that people will take advantage of the mandate—freely available content on the internet—to do something that the mandate never intended: Take all of it, thoughtlessly. It is common to see sites like mine straight-up scraped by bots via their RSS feed, which is a big reason why the full-text RSS feed is a relative rarity. You could see someone with less chill than Greenwood take his work and use it to scrape wholesale.

To some degree, I get it. We spent 15 or 20 years essentially training end users to take content freely, but never discussed whether there is a realistic point where the free ride goes too far. It’s kind of like sampling a song—if you want to sample a short passage from the tune and make it your own, that’s fine, as long as the creator gets paid. But at which point does a sample thoughtlessly cross a line, where it’s out of exploitation, not deference?

How much open internet is too much open internet? (Mark Wilkinson Hughes/Unsplash)

Earlier this year, my pals at 404 Media took a principled stance against this kind of scraping, putting up a regwall on their content to prevent it. But they also worked on ways to make the full-text RSS feed accessible to subscribers, to work in the spirit of the open internet.

I don’t want to put up a regwall, or a paywall, and I think that tech that takes thoughtlessly creates a real threat. The result: People who want to be on the open internet feel compelled to put up that regwall.

As a creator, I want to give an inch with what I create, because too much content is already hidden behind the shadow of the closed door. But in the current climate, so many people ignore the inch and take the mile.

Recently, it came out that a number of creators had their content scraped by the video generator service Runway. (Unrelatedly, 404 Media uncovered that. It led to an array of reactions from the affected people.) MKBHD, one of the people affected, found himself having to defend his discomfort against commenters who seemed convinced that this was normal. “I’d love to know how this is different from me watching a bunch of your videos, learning from them, and making something similar,” one stated.

He was caught reacting to the news, rather than getting the chance to be proactive and on top of it. If Runway had asked to use his videos, he might have said yes! (He might have also asked for payment, as is his right.) I think so much of this isn’t even about pulling out pocketbooks, but simply asking ahead of time.

The FOSS space is also dealing with a degree of this as large companies increasingly struggle with how openness can be exploited without guardrails, in varying degrees of user hostility. As I noted recently, there has been a big discussion in that space around FUTO, an organization that thinks there needs to be a balance between “open-source” and “reasonably supporting the project.” In many ways, I think the point I’m making here is somewhat on the same wavelength.

So, to devs and fellow tech-heads, I think a good middle ground here is to actually talk to at least some of the people whose content you want to use. If asking all of them is too much work, at least get someone who has influence in the space that others listen to. Get them on your side, or at least hear out their concerns, so that they’re not being forced to react to something they do not like.

The risk if people don’t do that? A whole lot less content openly available on the internet. Hope you like regwalls.

Open Links

Brian Stelter has had a couple bad takes of late, but he ultimately was a journalist, a pretty good media analyst at that, done wrong by bad leadership when he was fired by CNN two years ago. CNN corrected its mistake and hired him back.

Been listening to a little Nick Lowe lately, an artist with an unmatched biography. He has been a gray-haired elder statesman of rock for a quarter-century now, but there was a five-year period in the late-’70s and early-’80s where he was making very good mainstream pop music. (He also came out with perhaps the album with the best title in rock history during this period, Jesus of Cool, which was stupidly retitled for the American market.) Here’s one of his best songs from that period, “I Love The Sound of Breaking Glass,” his biggest UK solo hit.

I don’t see myself buying a reMarkable tablet, but they sure make it look attractive, don’t they?

--

Find this one an interesting read? Share it with a pal! And be sure to check out today’s sponsor, Abbey, if you’re looking to write, learn, and research more effectively.

Your time was just wasted by Ernie Smith

Ernie Smith is the editor of Tedium, and an active internet snarker. Between his many internet side projects, he finds time to hang out with his wife Cat, who's funnier than he is.